Model Type | Large Language Model, Bilingual, Causal Language Model, Decoder |
|
Use Cases |
Areas: | |
Applications: | Research, Natural language understanding and generation, Mechanistic interpretability analyses, Quantitative studies of Arabic cultural phenomena, Development of chat apps, Sentiment analysis, Summarization |
|
Limitations: | Bilingual model optimized for Arabic and English only; not intended for other languages, Must not be used for illegal activities, Not suitable for handling sensitive information |
|
|
Additional Notes | The models aim to advance research and commercial applications of Arabic NLP. The methodology focuses on improving context handling and extending language capabilities. |
|
Supported Languages | Arabic (Proficient), English (Proficient) |
|
Training Details |
Data Sources: | Web (publicly available web pages, Wikipedia articles, news articles, and social network content), Code (various programming languages), Books (publicly available Arabic and English books), Scientific (subset of arXiv papers), Synthetic data (translated from English to Arabic) |
|
Data Volume: | |
Methodology: | Auto-regressive language modeling. Jais-family models add SwiGLU activations and ALiBi positional encoding; Jais-adapted models use RoPE and Grouped Query Attention. |
|
Context Length: | |
Model Architecture: | Transformer-based, decoder-only (GPT-3 architecture with advancements for better context handling) |
|
|
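The methodology row names ALiBi as the positional scheme for the Jais-family models. As a rough illustration of the idea, the sketch below builds the per-head linear attention bias in plain Python. It follows the general ALiBi formulation (geometric slopes for a power-of-two head count) and is not taken from the Jais code; the function names `alibi_slopes` and `alibi_bias` and the head count and sequence length used here are assumptions chosen for illustration.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... for a power-of-two head count n."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads: int, seq_len: int) -> list[list[list[float]]]:
    """bias[h][i][j] = -slope_h * (i - j) for j <= i, and -inf for future positions."""
    neg_inf = float("-inf")  # causal mask: a query never attends to later keys
    return [
        [
            [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for slope in alibi_slopes(num_heads)
    ]


if __name__ == "__main__":
    # Hypothetical sizes: 8 heads, 4 tokens.
    bias = alibi_bias(num_heads=8, seq_len=4)
    print(bias[0][3])  # penalties seen by the last query in head 0
```

The bias is added to the attention scores before the softmax, so more distant keys are penalized linearly. Because the penalty depends only on the distance between query and key, no learned position embeddings are needed.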
Release Notes |
Version: | |
Notes: | 590M parameters, context length 2048 |
|
Version: | |
Notes: | 1.3B parameters, context length 2048 |
|
Version: | |
Notes: | 2.7B parameters, context length 2048 |
|
Version: | |
Notes: | 6.7B parameters, context length 2048 |
|
Version: | |
Notes: | 13B parameters, context length 2048 |
|
Version: | |
Notes: | 30B parameters, context length 8192 |
|
Version: | |
Notes: | 30B parameters, context length 16384 |
|
|
|
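The release notes list a context length for each model size. A minimal sketch of how one might check which listed sizes can hold a given request is shown below. The `RELEASES` table copies only the parameter counts and context lengths stated above; the helper name `variants_that_fit` and the token counts in the example are hypothetical, and the version identifiers are left out because the card does not give them.

```python
# (parameter count, context length in tokens) pairs as listed in the release notes.
RELEASES = [
    ("590M", 2048),
    ("1.3B", 2048),
    ("2.7B", 2048),
    ("6.7B", 2048),
    ("13B", 2048),
    ("30B", 8192),
    ("30B", 16384),
]


def variants_that_fit(prompt_tokens: int, max_new_tokens: int) -> list[tuple[str, int]]:
    """Return the listed releases whose context window covers prompt plus generation."""
    needed = prompt_tokens + max_new_tokens
    return [(size, ctx) for size, ctx in RELEASES if ctx >= needed]


if __name__ == "__main__":
    # Hypothetical request: a 3000-token prompt with up to 512 generated tokens.
    for size, ctx in variants_that_fit(3000, 512):
        print(f"{size} parameters, context length {ctx}")
```

Token counts depend on the tokenizer, so the prompt length should be measured with the tokenizer of the model actually being used.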