| | |
|---|---|
| Model Type | Text generation, instruction following |

| Use Cases | |
|---|---|
| Areas | |
| Applications | General-purpose AI systems; memory/compute-constrained environments; latency-bound scenarios; strong reasoning (code, math, logic) |
| Primary Use Cases | Intended for use in broad commercial and research settings |
| Limitations | Not designed for all downstream purposes; limited language support outside English |
| Considerations | Evaluate and mitigate for accuracy, safety, and fairness before deployment |
|
|
| | |
|---|---|
| Supported Languages | English (primary); other languages (~10% multilingual data) |
|
| Training Details | |
|---|---|
| Data Sources | Publicly available documents; newly created synthetic data; high-quality chat-format supervised data |
| Data Volume | |
| Methodology | Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) |
| Context Length | |
| Training Time | |
| Hardware Used | |
| Model Architecture | Dense decoder-only Transformer |
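The card names SFT and DPO but gives no further detail. As a rough illustration only (the function name, β value, and example log-probabilities below are my own, not from the card), the per-pair DPO objective scores how much more the policy prefers the chosen response over the rejected one, relative to a frozen reference model:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen / rejected
    responses under the policy and the frozen reference model.
    """
    # Implicit reward margins: how much more likely the policy makes
    # each response compared with the reference model.
    chosen_margin = logp_chosen - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), computed stably as log1p(exp(-logits))
    return math.log1p(math.exp(-logits))

# Policy already favors the chosen response -> loss below log(2)
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

When the policy and reference agree exactly, the loss is log 2; it shrinks as the policy's preference for the chosen response grows.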
|
|
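The card states only that the model is a dense decoder-only Transformer. The defining property of a decoder-only architecture is causal (left-to-right) attention: each token may attend to itself and earlier positions, never to future ones. A minimal sketch of that mask:

```python
def causal_mask(n):
    """Attention mask for a decoder-only Transformer:
    entry [i][j] is True iff position i may attend to position j,
    i.e. j <= i (no attention to future tokens)."""
    return [[j <= i for j in range(n)] for i in range(n)]

# Visualize: "x" = allowed, "." = masked out
for row in causal_mask(4):
    print("".join("x" if ok else "." for ok in row))
```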
| Responsible AI Considerations | |
|---|---|
| Fairness | Performance may vary across language varieties; risk of representational harms and reinforcement of stereotypes |
| Transparency | Disclosures on potential model behaviors |
| Accountability | Developers are responsible for downstream uses |
| Mitigation Strategies | Follow best practices and implement additional mitigations for sensitive deployment contexts |
|
|
| Input / Output | |
|---|---|
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | Include the model-specific special tokens in prompts for improved reliability |
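The card does not list which tokens it means, so the tokens below (`<|system|>`, `<|user|>`, `<|assistant|>`, `<|end|>`) are purely hypothetical placeholders; a sketch of assembling a prompt around whatever special tokens a given model expects:

```python
# Hypothetical special tokens -- the actual tokens are model-specific
# and are not given in this card; consult the model's chat template.
SYSTEM, USER, ASSISTANT, END = (
    "<|system|>", "<|user|>", "<|assistant|>", "<|end|>"
)

def build_prompt(system_message, user_message):
    """Wrap messages in the special tokens the model was trained on,
    ending with the assistant tag so the model continues from there."""
    return (f"{SYSTEM}{system_message}{END}"
            f"{USER}{user_message}{END}"
            f"{ASSISTANT}")

print(build_prompt("You are a helpful assistant.",
                   "Explain DPO in one sentence."))
```

Omitting or mangling these tokens typically degrades instruction following, which is why chat-tuned models document a fixed template.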
|
|