| Section | Field | Details |
|---|---|---|
| Model Type | | |
| Use Cases | Areas | Research, Commercial applications |
| | Applications | Natural language processing, Coding, Mathematics, Chatbots |
| | Primary Use Cases | Generating long texts, Understanding structured data, Multilingual text processing |
| Supported Languages | Languages Supported | 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more |
| | Proficiency | Multilingual support |
| Training Details | Data Sources | Various sources mentioned in the technical report |
| | Methodology | Pretraining & post-training |
| | Context Length | |
| | Model Architecture | Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias |
| Input/Output | Input Format | Chat-based structured prompts |
| | Accepted Modalities | |
| | Output Format | Generated text following the prompt schema, up to 8,192 tokens |
| | Performance Tips | Use vLLM for processing long texts; ensure RoPE scaling is configured correctly for long contexts |
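The RoPE-scaling tip above typically corresponds to a `rope_scaling` entry in the checkpoint's `config.json`. A hypothetical fragment is shown below; the `type`, `factor`, and `original_max_position_embeddings` values are placeholders (the table leaves the context length unspecified), so consult the technical report for the actual numbers before serving long contexts with vLLM:

```json
{
  "rope_scaling": {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```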
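The "chat-based structured prompts" input format above is a list of role/content messages flattened into a single prompt string by the tokenizer's chat template. A minimal sketch of that structure, assuming a ChatML-style template (the `<|im_start|>`/`<|im_end|>` markers and `build_prompt` helper are illustrative, not taken from the model card; in practice the tokenizer's own `apply_chat_template` should be used):

```python
def build_prompt(messages):
    """Flatten a list of {'role', 'content'} messages into one prompt
    string, ending with an open assistant turn to cue generation.
    The ChatML-style markers below are an assumption for illustration."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # model generates from here
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Translate 'hello' into French."},
]
prompt = build_prompt(messages)
```

The generated continuation would then be decoded and truncated at the 8,192-token output cap noted in the table.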