| Model Type | Auto-regressive language model, Text Generation, Dialogue |
| Additional Notes | English language model with potential for fine-tuning for other languages under the license conditions. |
|
| Training Details | |
| Data Sources | Publicly available online data |
| Data Volume | |
| Methodology | Supervised fine-tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) |
| Context Length | |
| Hardware Used | Meta's Research SuperCluster, H100-80GB GPUs |
| Model Architecture | Optimized transformer architecture |
|
|
| Safety Evaluation | |
| Methodologies | Red-teaming, adversarial evaluations |
| Findings | Llama 3 is significantly less likely to falsely refuse to answer prompts than Llama 2. |
| Risk Categories | Misinformation, insecure coding |
| Ethical Considerations | Iterative testing was done to assess safety related to CBRNE threats. |
|
|
| Responsible AI Considerations | |
| Fairness | The model is designed to be inclusive and helpful across a wide range of use cases. |
| Transparency | Efforts are made to maintain transparency through open community contributions. |
| Accountability | Meta encourages developers to take responsibility for customizing safety for their use case. |
| Mitigation Strategies | Meta Llama Guard 2 and Code Shield for safety. |
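
Llama Guard 2 is a separate classifier model that screens prompts and responses, so a deployment typically checks both sides of an exchange. A minimal sketch of that flow (the `generate` and `is_safe` callables below are hypothetical placeholders standing in for the base model and a Llama Guard-style classifier, not a real API):

```python
from typing import Callable

def guarded_chat(
    prompt: str,
    generate: Callable[[str], str],   # hypothetical: the base LLM
    is_safe: Callable[[str], bool],   # hypothetical: Llama Guard-style classifier
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    # Screen the user prompt before it reaches the model.
    if not is_safe(prompt):
        return refusal
    response = generate(prompt)
    # Screen the model's response before it reaches the user.
    if not is_safe(response):
        return refusal
    return response
```

The same shape applies to Code Shield, with the output-side check replaced by an insecure-code classifier.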
|
|
| Input / Output | |
| Input Format | |
| Accepted Modalities | |
| Output Format | |
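
The format cells above are left unspecified here. As an illustration, Llama 3's publicly documented chat format wraps each turn in header and end-of-turn special tokens; a minimal sketch in Python (the token strings follow Meta's published Llama 3 prompt format, while the helper name is ours):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Build a single-turn Llama 3 chat prompt (illustrative helper,
    using Meta's published special tokens)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("You are a helpful assistant.", "Hello!")
```

In practice the tokenizer's chat template applies this formatting for you; the sketch only shows what the resulting string looks like.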
|
| Release Notes | |
| Version | |
| Date | |
| Notes | Additional parameters and context length. |
| Version | |
| Date | |
| Notes | Initial release with Grouped-Query Attention. |
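
Grouped-Query Attention (GQA), noted above, shares each key/value head across a group of query heads, shrinking the KV cache relative to standard multi-head attention. A minimal numpy sketch of the head-sharing step (all sizes are illustrative, not the model's actual dimensions):

```python
import numpy as np

n_q_heads, n_kv_heads, seq, d = 8, 2, 4, 16   # illustrative sizes
group = n_q_heads // n_kv_heads               # query heads per KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d))
k = rng.standard_normal((n_kv_heads, seq, d))
v = rng.standard_normal((n_kv_heads, seq, d))

# Each KV head serves `group` query heads: repeat along the head axis.
k_shared = np.repeat(k, group, axis=0)        # (n_q_heads, seq, d)
v_shared = np.repeat(v, group, axis=0)

# Standard scaled dot-product attention per query head.
scores = q @ k_shared.transpose(0, 2, 1) / np.sqrt(d)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
out = weights @ v_shared                      # (n_q_heads, seq, d)
```

Only `k` and `v` need to be cached during generation, which is where the memory saving comes from.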
|
|
|