| Model Type | | Large Language Model, Instruction Following |
|
| Use Cases |
| Areas: | |
| Applications: | | Language generation, Chat assistance, Instruction following |
|
| Primary Use Cases: | | Instruction following, Multi-turn dialogue (see the prompt-formatting sketch below) |
|
| Limitations: | | Potential to generate offensive or unethical content under adversarial conditions |
|
| Considerations: | | Continuous improvement; responsible usage is encouraged. |
|
|
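The multi-turn dialogue use case above assumes a chat-style prompt. Below is a minimal sketch of formatting a conversation with the Hugging Face `transformers` chat template; the checkpoint name is a placeholder for illustration, not the actual identifier of this checkpoint.

```python
from transformers import AutoTokenizer

# Placeholder checkpoint name; substitute the actual unofficial checkpoint.
tokenizer = AutoTokenizer.from_pretrained("org/iterative-dpo-checkpoint")

# A short multi-turn dialogue in the standard chat-message format.
messages = [
    {"role": "user", "content": "Summarize the benefits of unit testing."},
    {"role": "assistant", "content": "Unit tests catch regressions early and document intended behavior."},
    {"role": "user", "content": "Give one concrete example."},
]

# Render the conversation with the tokenizer's chat template, appending the
# assistant turn marker so the model continues the dialogue.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```
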
| Additional Notes | | Unofficial checkpoint for research purposes. |
|
| Supported Languages | |
| Training Details |
| Data Sources: | | Preference data mix, Prompt collection for RLHF training |
|
| Data Volume: | |
| Methodology: | | Iterative DPO with online RLHF (see the loss sketch after this section) |
|
| Training Time: | |
| Model Architecture: | | Iterative DPO-based training |
|
|
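The iterative DPO methodology above is only named, not specified, in this card. The snippet below is a minimal sketch of the standard DPO loss on a batch of preference pairs, not the authors' implementation; the log-probability inputs and the `beta` value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over (chosen, rejected) response pairs.

    Each argument is the summed log-probability of a response under the
    policy or the frozen reference model. In an online/iterative setup,
    fresh preference pairs are collected with the current policy and the
    reference model is periodically updated before the next round.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy example with made-up log-probabilities for two pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.0, -10.5]))
print(loss.item())
```
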
| Safety Evaluation |
| Methodologies: | |
| Findings: | | Potential for offensive content under adversarial conditions |
|
| Risk Categories: | | Offensive content, Ethical considerations |
|
| Ethical Considerations: | | Safety and ethical considerations are integral to the alignment process. |
|
|
| Responsible AI Considerations |
| Fairness: | |
| Transparency: | | Technical report available |
|
| Accountability: | | Developers and affiliated institution |
|
| Mitigation Strategies: | | Continuous improvement in model safety |
|
|
| Input Output |
| Input Format: | |
| Accepted Modalities: | |
| Output Format: | |
| Performance Tips: | | Optimal performance on CUDA-enabled devices (see the usage sketch below) |
|
|
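The performance tip above only notes that CUDA-enabled devices are preferred. The sketch below places the model on a GPU when one is available and runs a short generation; the checkpoint name, dtype choice, and generation settings are placeholders, not values taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/iterative-dpo-checkpoint"  # placeholder, not the real identifier
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

messages = [{"role": "user", "content": "Write a haiku about reinforcement learning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

# Short greedy decode; tune max_new_tokens and sampling parameters for real use.
output_ids = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```
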
| Release Notes |
| Version: | |
| Notes: | | Initial release of the unofficial checkpoint showcasing online iterative RLHF. |
|
|
|