| Model Type | auto-regressive, generative text model |
|
| Use Cases |
| Areas: | Commercial and research applications in English |
|
| Applications: | Natural language generation tasks, assistant-like chat |
|
| Primary Use Cases: | Pretrained models can be adapted for various tasks (see the loading sketch at the end of this section) |
|
| Limitations: | Use in languages other than English; any use that violates applicable laws or regulations |
|
| Considerations: | Chat versions require a specific prompt format to get the expected features and performance (see the template sketch at the end of this section) |
|
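As a minimal, hedged sketch of adapting the pretrained model for generation, the snippet below uses the Hugging Face `transformers` API; the checkpoint ID is a placeholder, not a name taken from this card.

```python
# Minimal sketch: load a pretrained causal LM and generate text.
# "org/model-7b" is a hypothetical checkpoint ID; substitute the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "org/model-7b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```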
|
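For the chat-formatting consideration above, here is a sketch of one widely published `[INST]`/`<<SYS>>` template convention; treat it as an assumption and verify it against the official documentation for the exact model version.

```python
# Sketch of an [INST] / <<SYS>> chat prompt template (assumed convention;
# confirm against the official docs before relying on it).
def build_chat_prompt(system: str, user: str) -> str:
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system}\n"
        "<</SYS>>\n\n"
        f"{user} [/INST]"
    )

print(build_chat_prompt(
    system="You are a helpful assistant.",
    user="Summarize this model card in one sentence.",
))
```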
| Additional Notes | Carbon footprint of pretraining is offset by Meta's program. |
|
| Supported Languages | English |
|
| Training Details |
| Data Sources: | A new mix of publicly available online data |
|
| Data Volume: | |
| Methodology: | Pretrained with an auto-regressive objective, then fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF); a sketch of the pretraining objective follows this section |
|
| Context Length: | |
| Training Time: | |
| Hardware Used: | Meta's Research Super Cluster, third-party cloud compute |
|
| Model Architecture: | Optimized transformer architecture |
|
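To make the methodology row concrete, here is a minimal PyTorch sketch of the auto-regressive pretraining objective: position t predicts token t+1 under a cross-entropy loss. The tiny embedding-plus-linear model stands in for the transformer and is illustrative only.

```python
# Minimal sketch of the auto-regressive pretraining objective:
# predict token t+1 from tokens <= t, scored with cross-entropy.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 16, 2
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Stand-in for the transformer: any module mapping tokens -> logits.
embed = torch.nn.Embedding(vocab_size, 32)
head = torch.nn.Linear(32, vocab_size)
logits = head(embed(tokens))  # (batch, seq_len, vocab_size)

# Shift so position t predicts token t+1.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
print(loss.item())
```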
|
| Safety Evaluation |
| Methodologies: | Evaluation on standard academic benchmarks |
|
| Findings: | Outperforms open-source chat models on most benchmarks tested; on par with some closed-source models |
|
| Ethical Considerations: | Before deploying applications, developers should perform safety testing tailored to their specific application. |
|
|
| Responsible AI Considerations |
| Fairness: | Testing to date covers English-language scenarios and cannot predict or cover all possible scenarios. |
|
| Accountability: | Developers should perform safety testing tailored to their specific applications. |
|
| Mitigation Strategies: | Tuned with reinforcement learning from human feedback (RLHF) for alignment. |
|
|
| Input / Output |
| Input Format: | Text |
| Accepted Modalities: | text |
| Output Format: | Text |
| Performance Tips: | The larger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability (see the toy sketch below). |
|
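To illustrate the Grouped-Query Attention note, here is a toy sketch in which groups of query heads share a single key/value head, shrinking the key/value cache at inference time; head counts and shapes are illustrative, not the model's actual configuration.

```python
# Toy sketch of Grouped-Query Attention (GQA): several query heads
# share one key/value head, reducing the KV cache at inference time.
import torch
import torch.nn.functional as F

batch, seq, d_head = 1, 8, 16
n_q_heads, n_kv_heads = 8, 2          # 4 query heads per KV head
group = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq, d_head)
k = torch.randn(batch, n_kv_heads, seq, d_head)
v = torch.randn(batch, n_kv_heads, seq, d_head)

# Broadcast each KV head across its group of query heads.
k = k.repeat_interleave(group, dim=1)  # (batch, n_q_heads, seq, d_head)
v = v.repeat_interleave(group, dim=1)

attn = F.softmax(q @ k.transpose(-2, -1) / d_head**0.5, dim=-1)
out = attn @ v                         # (batch, n_q_heads, seq, d_head)
print(out.shape)
```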
|