| Model Type | |
| --- | --- |

| Use Cases | |
| --- | --- |
| Areas | |
| Applications | Research, Text Generation |
| Primary Use Cases | |
| Limitations | May produce hallucinations or unreliable outputs |
| Considerations | Manual checks required for safety |

| Additional Notes |
| --- |
| Developed with grants from Andreessen Horowitz (a16z) |

| Supported Languages |
| --- |
| en (general), zh (general) |

| Training Details | |
| --- | --- |
| Data Sources | JosephusCheung/GuanacoDataset, Open-Orca/OpenOrca, stingning/ultrachat, meta-math/MetaMathQA, liuhaotian/LLaVA-Instruct-150K, jondurbin/airoboros-3.1, WizardLM/WizardLM_evol_instruct_V2_196k, RyokoAI/ShareGPT52K, RyokoAI/Fandom23K, milashkaarshif/MoeGirlPedia_wikitext_raw_archive, wikipedia, wiki_lingua, fnlp/moss-003-sft-data, garage-bAInd/Open-Platypus, LDJnr/Puffin, openbmb/llava_zh, BAAI/COIG, TigerResearch/tigerbot-zhihu-zh-10k, liwu/MNBVC, teknium/openhermes |
| Data Volume | |
| Methodology | Identical structure to LLaMA2, using synthetic data |
| Model Architecture | LLaMA2 architecture without scaling of RoPE (see the sketch after this table) |

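The architecture row above can be verified against a checkpoint's configuration. Below is a minimal sketch assuming the Hugging Face `transformers` library; `your-org/your-model` is a placeholder, since this card does not name the hosted repository. For a LLaMA2-architecture model with unscaled RoPE, `rope_scaling` should be `None`.

```python
# Minimal sketch: confirm a LLaMA2-style checkpoint uses unscaled RoPE.
# "your-org/your-model" is a placeholder; this card does not name the
# actual repository.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("your-org/your-model")

print(config.model_type)    # "llama" for the LLaMA2 architecture
print(config.rope_scaling)  # None => RoPE without scaling, as stated above
```
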
| Safety Evaluation | |
| --- | --- |
| Risk Categories | misinformation, bias, objectionable content, pornography, violence, offensive language |
| Ethical Considerations | Model trained on unfiltered internet data |

| Responsible AI Considerations | |
| --- | --- |
| Fairness | Synthetic data was used for some language variants |
| Accountability | The developers have not vetted all content |
| Mitigation Strategies | Users are advised to filter certain keywords (a minimal filtering sketch follows this table) |

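The following is a minimal sketch of the keyword filtering advised above, assuming a Python post-processing step; the keyword list is a hypothetical placeholder, as the card does not enumerate which terms to block.

```python
# Minimal keyword filter for model output, per the mitigation advice above.
# BLOCKED_KEYWORDS is a hypothetical placeholder list; the card does not
# specify which keywords to filter.
BLOCKED_KEYWORDS = ["placeholder-term-1", "placeholder-term-2"]

def contains_blocked_keyword(text: str) -> bool:
    """Return True if any blocked keyword appears in the text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)

def filter_output(text: str, replacement: str = "[filtered]") -> str:
    """Mask outputs that contain a blocked keyword."""
    return replacement if contains_blocked_keyword(text) else text
```
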
| Input / Output | |
| --- | --- |
| Input Format | |
| Accepted Modalities | |
| Output Format | |