| Model Type | Large language model, bilingual, auto-regressive, transformer-based, decoder-only |
|
| Use Cases | |
| --- | --- |
| Areas | |
| Applications | Chat applications, sentiment analysis, document summarization |
| Primary Use Cases | Arabic natural language processing, developing chat assistants, mechanistic interpretability analyses |
| Limitations | Limited to responses in Arabic and English; potential for generating biased or incorrect information |
|
|
| Supported Languages | Primary: Arabic; other: English |
|
| Training Details | |
| --- | --- |
| Data Sources | Web pages, Wikipedia articles, news articles, social network content, code data, books, scientific papers, synthetic translation data |
| Data Volume | |
| Methodology | Pre-trained either from scratch or adaptively from Llama-2 |
| Context Length | |
| Hardware Used | Condor Galaxy (CG) supercomputer platform with Cerebras CS-2 Wafer-Scale Engines |
| Model Architecture | Transformer-based, decoder-only; SwiGLU activations and ALiBi positional biases for Jais models; RoPE with grouped-query attention for adapted models |
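The architecture row above names SwiGLU and ALiBi for the Jais models. The following is a minimal NumPy sketch of both components, purely illustrative: the head count, dimensions, and weight shapes are hypothetical, and this is not the models' actual implementation.

```python
import numpy as np

def swiglu(x, W_gate, W_up):
    """SwiGLU feed-forward gate: SiLU(x @ W_gate) * (x @ W_up).

    W_gate and W_up are illustrative weight matrices; a real block
    would follow this with a down-projection back to the model dim.
    """
    gate = x @ W_gate
    silu = gate * (1.0 / (1.0 + np.exp(-gate)))  # SiLU (Swish) activation
    return silu * (x @ W_up)

def alibi_bias(num_heads, seq_len):
    """Per-head linear attention biases added to causal attention scores.

    Each head h gets a slope m_h from a geometric series; the bias for
    query position i attending to key position j (j <= i) is m_h * (j - i),
    i.e. a penalty that grows linearly with distance.
    """
    # Geometric slopes 2^(-8*1/n), 2^(-8*2/n), ... (assumes num_heads is a power of two)
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    rel = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]  # j - i
    rel = np.tril(rel)  # keep only causal (j <= i) entries, all <= 0
    return slopes[:, None, None] * rel[None, :, :]  # shape (heads, seq, seq)
```

Because the bias depends only on relative distance, ALiBi needs no learned position embeddings, which is one reason it extrapolates to sequence lengths longer than those seen in training.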
|
|
| Responsible AI Considerations | |
| --- | --- |
| Mitigation Strategies | Multiple techniques applied to reduce bias, though some bias and errors remain likely |
|
|
| Input/Output | |
| --- | --- |
| Input Format | |
| Output Format | |
|