| Model Type | | Dense decoder-only Transformer |
|
| Use Cases |
| Areas: | |
| Applications: | | Memory/compute-constrained environments, latency-bound scenarios, strong reasoning (code, math, and logic) |
|
| Primary Use Cases: | | Commercial applications and research use in English |
|
| Limitations: | | Not evaluated for all downstream purposes; primarily supports English |
|
| Considerations: | | Developers should evaluate for accuracy, safety, and fairness, especially in high-risk scenarios. |
|
|
| Additional Notes | | Integrated in the transformers development version 4.40.0, which uses flash attention by default. Optimized for multiple generations of GPU and CPU hardware. |
|
| Supported Languages | |
| Training Details |
| Data Sources: | | Publicly available documents, synthetic data, high-quality educational data, code |
|
| Data Volume: | |
| Methodology: | | Supervised fine-tuning and Direct Preference Optimization |
|
| Context Length: | |
| Training Time: | |
| Hardware Used: | |
| Model Architecture: | | Dense decoder-only Transformer |
|
|
| Safety Evaluation |
| Risk Categories: | | Misinformation, bias, offensive content |
|
| Ethical Considerations: | | Developers should ensure the model complies with relevant laws and regulations. |
|
|
| Responsible AI Considerations |
| Fairness: | | Evaluated for instruction following and safety measures. |
|
| Transparency: | | Developers should inform users they are interacting with an AI system. |
|
| Accountability: | | Developers are responsible for ensuring that their specific use cases comply with relevant laws. |
|
| Mitigation Strategies: | | Use available safety classifiers or custom solutions. |
|
|
| Input and Output |
| Input Format: | | Chat format with <|user|> and <|assistant|> tags |
|
| Accepted Modalities: | |
| Output Format: | | Generated text in response to input prompts |
|
| Performance Tips: | | On NVIDIA V100 or earlier GPUs, use attn_implementation="eager" |
|
|
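The input format and performance tip in the rows above can be sketched as two small helpers. This is a minimal illustration rather than the model's official template: the function names `build_prompt` and `choose_attn_implementation` are hypothetical, the `<|end|>` turn terminator is an assumption (the card names only the `<|user|>` and `<|assistant|>` tags), and the compute-capability threshold assumes that the V100 reports major version 7 and that flash attention is used only on GPUs reporting 8 or higher.

```python
def build_prompt(user_message: str, end_tag: str = "<|end|>") -> str:
    """Wrap a single user turn in the <|user|> / <|assistant|> chat tags."""
    # ASSUMPTION: end_tag is not stated in the card; confirm it against the
    # chat template shipped with the model's tokenizer before relying on it.
    return f"<|user|>\n{user_message}{end_tag}\n<|assistant|>\n"


def choose_attn_implementation(cc_major: int) -> str:
    """Map a GPU compute-capability major version to an attn_implementation value."""
    # ASSUMPTION: V100 is compute capability 7.0, so anything below 8 is
    # treated as "V100 or earlier" and falls back to the eager attention path.
    # With PyTorch installed, the major version can be read from
    # torch.cuda.get_device_capability()[0].
    return "eager" if cc_major < 8 else "flash_attention_2"
```

The string returned by `choose_attn_implementation` is meant to be passed as the `attn_implementation` keyword when loading the model with `from_pretrained` in transformers, and the string from `build_prompt` is what gets tokenized. Where the tokenizer provides a chat template, `tokenizer.apply_chat_template` is the safer way to produce the prompt, since it encodes the exact tags the model was trained with.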