| Field | Details |
|---|---|
| Model Type | |
| **Use Cases** | |
| Areas | |
| Applications | Assistant-like chat; natural language generation tasks |
| Primary Use Cases | English-language research and applications |
| Limitations | Use in languages other than English |
| Considerations | Developers must comply with the Acceptable Use Policy and the Llama 3 Community License. |
| Additional Notes | Optimized for handling very long contexts with minimal training adjustments. |
|
| Field | Details |
|---|---|
| Supported Languages | |
| **Training Details** | |
| Data Sources | SlimPajama dataset; UltraChat chat dataset |
| Data Volume | |
| Methodology | |
| Context Length | |
| Training Time | |
| Hardware Used | Crusoe Energy high-performance L40S cluster |
| Model Architecture | Auto-regressive, optimized transformer with RoPE (rotary position embeddings) |
|
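The architecture row above mentions RoPE (rotary position embeddings). As a rough illustration of the mechanism only, not the model's actual implementation, here is a minimal NumPy sketch that rotates channel pairs by position-dependent angles so attention scores depend on relative position:

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings to a (seq, dim) array.

    Uses half-split (GPT-NeoX-style) channel pairing: channel i is
    paired with channel i + dim/2, and each pair is rotated by an
    angle proportional to the token position.
    """
    seq, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies; lower channels rotate faster.
    inv_freq = base ** (-np.arange(half) / half)
    angles = np.outer(positions, inv_freq)   # shape (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Standard 2D rotation applied to each (x1, x2) channel pair.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

q = np.ones((4, 8))
q_rot = rope(q, np.arange(4))
print(q_rot.shape)  # (4, 8)
```

Note that a vector at position 0 is left unchanged (all rotation angles are zero), which is a quick sanity check for any RoPE implementation.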
|
| Field | Details |
|---|---|
| **Safety Evaluation** | |
| Methodologies | Red teaming; adversarial evaluations |
| Risk Categories | Cybersecurity; child safety |
| Ethical Considerations | Residual risks and trade-offs between helpfulness and alignment are noted. |
|
|
| Field | Details |
|---|---|
| **Responsible AI Considerations** | |
| Fairness | Efforts to reduce biases and ensure model safety. |
| Transparency | Documentation and methodologies are publicly available. |
| Accountability | Users are responsible for ensuring their applications comply with use policies. |
| Mitigation Strategies | Llama Guard and Code Shield safeguards for safe deployments. |
|
|
| Field | Details |
|---|---|
| **Input / Output** | |
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | Use RoPE scaling and appropriate hardware for long-context handling. |
|
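The "RoPE scaling" tip above refers to remapping token positions so the rotary angles stay within the range the model saw during training. A minimal sketch of one common approach, linear position interpolation, is shown below; the window size 8192 and scale factor 8.0 are illustrative assumptions, not this model's actual settings:

```python
import numpy as np

ORIGINAL_WINDOW = 8192   # assumed original training context length
SCALE = 8.0              # assumed interpolation factor (illustrative)

def interpolate_positions(positions, scale=SCALE):
    """Compress extended positions back into the trained range.

    Dividing positions by a constant factor keeps the resulting
    rotary angles within the interval seen during training, at the
    cost of finer-grained position resolution.
    """
    return np.asarray(positions, dtype=float) / scale

# Positions from an extended 64K context map back under 8192.
pos = np.arange(0, 65536, 8192)
scaled = interpolate_positions(pos)
print(scaled.max() < ORIGINAL_WINDOW)  # True
```

In practice, long-context releases may instead raise the RoPE base frequency or combine both techniques; consult the model's configuration for the exact settings used.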
|
| Field | Details |
|---|---|
| **Release Notes** | |
| Version | |
| Date | |
| Notes | Initial release of Llama-3 70B Instruct Gradient 262K. |
|
|
|