**Model Type:** transformer-based, decoder-only large language model, instruction fine-tuned

**Use Cases**

- Areas: research, commercial applications
- Applications: natural language tasks, coding tasks
- Primary Use Cases: few-turn question answering (see the usage sketch after this list)
- Limitations: not suitable for non-English languages; does not support native code execution
- Considerations: review the associated risks and follow the acceptable use policy
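
To make the primary use case concrete, the following is a minimal sketch of few-turn question answering with the Hugging Face `transformers` chat API. The repo id `databricks/dbrx-instruct`, the example turns, and the generation settings are assumptions for illustration, not part of this card, and will need to be adapted to your environment.

```python
# Minimal few-turn question-answering sketch using Hugging Face transformers.
# The repo id and generation settings below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A short, few-turn conversation: the earlier turns give context for the follow-up question.
messages = [
    {"role": "user", "content": "What is a Mixture-of-Experts layer?"},
    {"role": "assistant", "content": "It routes each token to a small subset of expert networks."},
    {"role": "user", "content": "Why does that reduce the compute needed per token?"},
]

# apply_chat_template formats the turns with the model's chat markup.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
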
**Supported Languages:** English (primary language; proficiency level: High)

**Training Details**

- Data Sources:
- Data Volume:
- Methodology:
- Context Length:
- Model Architecture: Mixture-of-Experts (MoE) transformer with rotary position encodings, gated linear units, and grouped-query attention; 16 experts, of which 4 are chosen per input (see the sketches after this list)
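
The Model Architecture entry above is the most technical part of this card, so here is a minimal PyTorch sketch of the top-4-of-16 routing pattern it describes, with each expert written as a gated linear unit. The class names, dimensions, and the dense loop over experts are illustrative assumptions, not DBRX's actual implementation.

```python
# Illustrative top-4-of-16 Mixture-of-Experts layer in PyTorch.
# Dimensions, module names, and the dense per-expert loop are assumptions chosen
# for readability; they do not reproduce DBRX's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GLUExpert(nn.Module):
    """Feed-forward expert built from a gated linear unit."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_hidden, bias=False)
        self.up = nn.Linear(d_model, d_hidden, bias=False)
        self.down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class TopKMoE(nn.Module):
    """Scores all experts per token, keeps the top k, and mixes their outputs."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 16, k: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(GLUExpert(d_model, d_hidden) for _ in range(n_experts))
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.router(x)                             # (batch, seq, n_experts)
        top_scores, top_idx = scores.topk(self.k, dim=-1)   # keep 4 of the 16 experts per token
        gates = F.softmax(top_scores, dim=-1)                # mixing weights for the chosen experts

        # Scatter the top-k gates back into a dense map; unchosen experts keep weight zero.
        dense_gates = torch.zeros_like(scores).scatter_(-1, top_idx, gates)

        # Every expert is evaluated here for readability; an efficient implementation
        # only runs each expert on the tokens actually routed to it.
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            out = out + dense_gates[..., e:e + 1] * expert(x)
        return out


moe = TopKMoE(d_model=64, d_hidden=256)
print(moe(torch.randn(2, 5, 64)).shape)  # torch.Size([2, 5, 64])
```

Because only 4 of the 16 experts receive nonzero gate weights for a given input, only a fraction of the expert parameters participates in each forward pass.
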
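
Grouped-query attention, also named in the architecture entry, shares each key/value head across a group of query heads, which shrinks the key/value cache relative to full multi-head attention. The sketch below shows only the head-grouping arithmetic (no causal mask, no rotary position encodings); the head counts and dimensions are assumptions chosen for readability.

```python
# Illustrative grouped-query attention in PyTorch: many query heads share a
# smaller set of key/value heads. Head counts and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupedQueryAttention(nn.Module):
    def __init__(self, d_model: int = 64, n_q_heads: int = 8, n_kv_heads: int = 2):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.d_head = d_model // n_q_heads
        self.wq = nn.Linear(d_model, n_q_heads * self.d_head, bias=False)
        self.wk = nn.Linear(d_model, n_kv_heads * self.d_head, bias=False)
        self.wv = nn.Linear(d_model, n_kv_heads * self.d_head, bias=False)
        self.wo = nn.Linear(n_q_heads * self.d_head, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.wq(x).view(b, t, self.n_q, self.d_head).transpose(1, 2)
        k = self.wk(x).view(b, t, self.n_kv, self.d_head).transpose(1, 2)
        v = self.wv(x).view(b, t, self.n_kv, self.d_head).transpose(1, 2)

        # Each key/value head serves a group of query heads: repeat K and V so
        # their head dimension matches the number of query heads.
        group = self.n_q // self.n_kv
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)

        attn = F.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.wo(out)


gqa = GroupedQueryAttention()
print(gqa(torch.randn(2, 5, 64)).shape)  # torch.Size([2, 5, 64])
```
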
**Input / Output**

- Input Format:
- Accepted Modalities:
- Output Format:

**Release Notes**

- Version:
- Notes: Initial release of DBRX Instruct, the instruction fine-tuned model.