| Model Type | | bilingual, large language model, text generation |
|
| Use Cases |
| Areas: | |
| Applications: | | Natural language understanding and generation, Mechanistic interpretability analyses, Chat assistants, Sentiment analysis, Document summarization |
|
| Primary Use Cases: | | Arabic and English language tasks |
|
| Limitations: | | Not suited to handling or generating personal, confidential, or sensitive information, or to high-stakes decisions without human oversight |
|
| Considerations: | | The model should not be used outside its supported languages or to make critical decisions without human involvement. |
|
|
| Additional Notes | | The model unlocks numerous use cases in Arabic NLP, with strategies extensible to other low- and medium-resource languages. |
|
| Supported Languages | | Arabic (MSA), English; optimized for Arabic, strong in English |
|
| Training Details |
| Data Sources: | | Web, Code, Books, Scientific, Synthetic, ArXiv papers |
|
| Data Volume: | |
| Methodology: | | Instruction fine-tuned for dialog |
|
| Context Length: | |
| Hardware Used: | | Cerebras CS-2 Wafer-Scale Engines |
|
| Model Architecture: | | Transformer-based, decoder-only architecture. Jais models are trained from scratch, while Jais adapted models are built on Llama-2. |
|
|
| Safety Evaluation |
| Methodologies: | | Bias and misinformation assessments |
|
| Risk Categories: | |
| Ethical Considerations: | | Use for generating harmful, misleading, or inappropriate content is prohibited. |
|
|
| Responsible AI Considerations |
| Fairness: | | Efforts made to minimize biases, but biases may still be present. |
|
| Transparency: | | The training and tuning processes are documented. |
|
| Accountability: | | Users must ensure the model is used ethically and legally. |
|
| Mitigation Strategies: | | Fine-tuning incorporated diverse Arabic-English prompt-response pairs. |
|
|
| Input Output |
| Input Format: | | Text prompts in either Arabic or English |
|
| Accepted Modalities: | |
| Output Format: | |
| Performance Tips: | | Model is optimized for bilingual tasks; ensure prompts are framed within supported languages. |
|
|
| Release Notes |
| Version: | |
| Date: | |
| Notes: | | Introduction of 20 models across various sizes, featuring improved context handling and precision. |
|
|
|
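
The performance tip above asks callers to keep prompts within the supported languages (Arabic and English). A minimal sketch of a pre-flight check is shown below. It is not part of the model or its tooling: the function names `script_share` and `is_supported_prompt` and the `max_other` threshold are hypothetical, and the check inspects Unicode scripts only, so it cannot tell English from other Latin-script languages or MSA from other Arabic-script text.

```python
import unicodedata


def script_share(prompt: str) -> dict:
    """Return the share of alphabetic characters that are Arabic, Latin, or other."""
    counts = {"arabic": 0, "latin": 0, "other": 0}
    for ch in prompt:
        if not ch.isalpha():
            continue  # skip digits, punctuation, whitespace, combining marks
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            counts["arabic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}


def is_supported_prompt(prompt: str, max_other: float = 0.1) -> bool:
    """Heuristic: accept prompts written mostly in Arabic or Latin script.

    max_other is an assumed tolerance for characters from other scripts.
    """
    shares = script_share(prompt)
    has_supported = shares["arabic"] + shares["latin"] > 0
    return has_supported and shares["other"] <= max_other


if __name__ == "__main__":
    for text in ["Summarize this document.", "لخص هذا المستند", "これを要約してください"]:
        print(repr(text), is_supported_prompt(text))
```

A check like this is best treated as a cheap filter before sending a prompt to the model; a proper language-identification step would be needed to enforce the Arabic and English restriction strictly.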