| Model Type | Mixture-of-Experts (MoE) code language model |
| --- | --- |

| Use Cases | |
| --- | --- |
| Areas | Code-specific tasks, math and reasoning, expanded programming language support |
| Applications | AI code assistance, software development, research in code intelligence |
| Primary Use Cases | Code completion, code insertion, chatbot assistance for coding queries (see the usage sketch below) |
| Limitations | Optimal performance requires the specified hardware; compatibility with the supported inference frameworks is required |

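To make the chatbot-assistance and code-completion use cases concrete, here is a minimal sketch using Hugging Face Transformers. The repository ID, prompt, and generation settings are illustrative assumptions rather than values taken from this card.

```python
# Minimal sketch of chatbot-style code assistance via Hugging Face Transformers.
# Assumptions: the instruct checkpoint ships a chat template, and the repository
# ID below is a placeholder chosen for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Coder-V2-Instruct"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, in line with the hardware notes below
    device_map="auto",           # shard the weights across available GPUs
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```
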
| Additional Notes | Supported languages expanded from 86 to 338. Commercial use is permitted. |
| --- | --- |

| Supported Languages | |
| --- | --- |
| Programming Languages | 338 (extended from 86) |
| Competence Level | High proficiency in code-specific tasks |

| Training Details | |
| --- | --- |
| Data Sources | Intermediate checkpoint of DeepSeek-V2, further trained within the DeepSeekMoE framework |
| Data Volume | Additional 6 trillion tokens |
| Methodology | Mixture-of-Experts mechanism for enhanced coding and reasoning |
| Context Length | |
| Hardware Used | BF16 inference requires 8×80 GB GPUs |
| Model Architecture | Mixture-of-Experts; only a subset of parameters is active per token (see the routing sketch below) |

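For readers unfamiliar with the "active parameters" wording, the toy sketch below illustrates the general top-k Mixture-of-Experts routing pattern: a router picks a few experts per token, so only those experts' parameters participate in that token's forward pass. The layer sizes, expert count, and top-k value are made up for illustration and do not reflect the model's actual configuration.

```python
# Toy illustration of top-k MoE routing: only the selected experts' parameters
# are "active" for a given token. All dimensions here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: [tokens, dim]
        scores = F.softmax(self.router(x), dim=-1)           # [tokens, num_experts]
        weights, indices = scores.topk(self.top_k, dim=-1)   # chosen experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():  # only selected experts run for these tokens
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```
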
| Input Output | |
| --- | --- |
| Input Format | |
| Accepted Modalities | |
| Output Format | Model-generated text responses |
| Performance Tips | Use the specified Hugging Face (HF) Transformers or vLLM frameworks for optimal inference (see the inference sketch below) |

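As a companion to the performance tip above, here is a hedged vLLM sketch. The repository ID, prompt, and sampling settings are assumptions for illustration, and the tensor-parallel degree simply mirrors the 8×80 GB GPU note for BF16 inference.

```python
# Minimal vLLM inference sketch. The model ID and prompt are illustrative
# placeholders; tensor_parallel_size=8 mirrors the 8x80 GB GPU note above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-Coder-V2-Instruct",  # assumed repository ID
    tensor_parallel_size=8,
    dtype="bfloat16",
    trust_remote_code=True,
)

sampling = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(
    ["# Write a quicksort implementation in Python\ndef quicksort(arr):"], sampling
)
print(outputs[0].outputs[0].text)
```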