| Model Type |
|---|
| text generation, instruction-tuned |

| Use Cases | |
|---|---|
| Areas | commercial applications, research |
| Applications | |
| Primary Use Cases | Natural language generation tasks |
| Limitations | Use only in accordance with applicable laws and regulations |
| Considerations | Safety testing and tuning recommended before deployment |

| Additional Notes |
|---|
| Uses the proprietary EasyContext Blockwise RingAttention library for long-context training. |
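The blockwise RingAttention approach noted above processes keys and values one block at a time, accumulating results with an online softmax so the full attention matrix for a long context is never materialized. A minimal single-device NumPy sketch of that idea (function names and block size are illustrative, not taken from the EasyContext API):

```python
import numpy as np

def full_attention(q, k, v):
    # Standard softmax attention over the whole sequence (reference).
    s = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def blockwise_attention(q, k, v, block=4):
    # Visit K/V in blocks, keeping a running row-max, normalizer, and
    # unnormalized output -- the online-softmax trick behind blockwise
    # and ring attention. Memory per step is O(block), not O(seq_len).
    d = q.shape[-1]
    m = np.full(q.shape[0], -np.inf)   # running row max of scores
    l = np.zeros(q.shape[0])           # running softmax normalizer
    o = np.zeros_like(q)               # running unnormalized output
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)
        m_new = np.maximum(m, s.max(axis=-1))
        p = np.exp(s - m_new[:, None])
        scale = np.exp(m - m_new)      # rescale old accumulators
        l = l * scale + p.sum(axis=-1)
        o = o * scale[:, None] + p @ vb
        m = m_new
    return o / l[:, None]

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
k = rng.standard_normal((32, 16))
v = rng.standard_normal((32, 16))
# Blockwise accumulation is exact, not an approximation.
assert np.allclose(full_attention(q, k, v), blockwise_attention(q, k, v))
```

In the distributed ring variant, each device holds one K/V block and the blocks rotate around the ring, but the per-block accumulation is the same as above.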
|
| Supported Languages |
|---|
| |

| Training Details | |
|---|---|
| Data Sources | SlimPajama, UltraChat, publicly available instruction datasets |
| Data Volume | |
| Methodology | |
| Context Length | |
| Training Time | 100-516 minutes per context length |
| Hardware Used | |
| Model Architecture | auto-regressive transformer |
|
| Safety Evaluation | |
|---|---|
| Methodologies | red-teaming, adversarial evaluations, CyberSecEval |
| Findings | residual risks remain; over-refusal reduced |
| Risk Categories | CBRNE, Cyber Security, Child Safety |
| Ethical Considerations | Open approach intended to yield better, safer products |
|
| Responsible AI Considerations | |
|---|---|
| Fairness | Intended to be accessible to people from many different backgrounds and perspectives |
| Transparency | |
| Accountability | Developers are responsible for the safe use of their deployments |
| Mitigation Strategies | Meta Llama Guard and Code Shield safeguards |
|
| Input / Output | |
|---|---|
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | |
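The card leaves the input format unspecified. Instruction-tuned Llama 3 checkpoints conventionally use the publicly documented Llama 3 chat template with special header tokens; assuming this checkpoint follows it, a prompt can be assembled by hand like so (the helper name is illustrative):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a Llama 3 style chat prompt.

    Assumes the standard Llama 3 instruct template; verify against the
    tokenizer's own chat template before relying on it.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    "Summarize blockwise ring attention in one sentence.",
)
```

In practice, prefer the tokenizer's built-in chat template (e.g. `apply_chat_template` in Hugging Face Transformers) over manual string assembly, since it stays in sync with the model's training format.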
|
| Release Notes | |
|---|---|
| Version | |
| Date | |
| Notes | Initial release of the Llama 3 model |
|
|
|