| Model Type | transformer-based, large language model, mixture-of-experts |
| --- | --- |

| Use Cases | |
| --- | --- |
| Areas | commercial applications, research |
| Primary Use Cases | text completion, coding tasks (see the completion sketch after this table) |
| Limitations | Designed primarily for English; not intended for non-English use. Does not support native code execution or function calling. |
| Considerations | Use with caution even for general English-language and coding tasks; additional safety testing is recommended. |
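
The text-completion use case can be exercised with standard Hugging Face `transformers` calls. The snippet below is a minimal sketch, not an official recipe: the `databricks/dbrx-base` repo id, the prompt, and the generation settings are illustrative assumptions.

```python
# Minimal text-completion sketch; repo id and settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-base"  # assumption: upstream base-model repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)

# A base model continues text rather than following instructions,
# so prompt it with a prefix to be completed.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```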
| Additional Notes | DBRX Base is a mixture-of-experts (MoE) large language model. It takes a fine-grained approach to MoE, using 16 experts and activating 4 of them per token, and it uses rotary position encodings (RoPE). It is distributed under an open model license. A toy routing sketch follows this table. |
| --- | --- |
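
To make "fine-grained mixture-of-experts" concrete, here is a toy top-4-of-16 routing layer. This is an illustrative sketch with made-up dimensions, not DBRX's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy MoE feed-forward layer: each token is routed to the top 4 of 16 experts."""

    def __init__(self, d_model=64, d_ff=128, n_experts=16, top_k=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):
        # x: (tokens, d_model)
        scores, idx = self.router(x).topk(self.top_k, dim=-1)  # choose 4 experts per token
        weights = F.softmax(scores, dim=-1)  # mixture weights over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```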
| Supported Languages | English (high proficiency) |
| --- | --- |
| Training Details | |
| --- | --- |
| Data Sources | |
| Data Volume | |
| Methodology | |
| Context Length | |
| Hardware Used | |
| Model Architecture | decoder-only transformer trained with next-token prediction, using rotary position encodings, gated linear units, and grouped query attention (see the attention sketch after this table) |
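
Of the architecture choices above, grouped query attention is easy to miniaturize: several query heads share each key/value head, which shrinks the KV cache. The sketch below is illustrative; the head counts and dimensions are made up and are not DBRX's.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    """Toy grouped query attention.

    q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim),
    where n_q_heads is a multiple of n_kv_heads. Each key/value head serves a
    whole group of query heads.
    """
    group_size = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(group_size, dim=1)  # expand KV heads to match Q heads
    v = v.repeat_interleave(group_size, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

q = torch.randn(1, 8, 16, 32)  # 8 query heads (made-up sizes)
k = torch.randn(1, 2, 16, 32)  # 2 shared key/value heads
v = torch.randn(1, 2, 16, 32)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 8, 16, 32])
```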
| Input Output | |
| --- | --- |
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | Use PR https://github.com/AutoGPTQ/AutoGPTQ/pull/625 together with the combine_tensors.sh script (see the loading sketch after this table) |
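
Once the AutoGPTQ branch from the PR above is installed and the shards have been merged with combine_tensors.sh, loading can follow the usual AutoGPTQ pattern. This is a hedged sketch: the local directory is a placeholder and the options are illustrative, not instructions from this repo.

```python
# Sketch of loading the quantized checkpoint with AutoGPTQ.
# Assumes the PR branch is installed and combine_tensors.sh has been run;
# the model directory is a placeholder.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./dbrx-base-gptq"  # placeholder: directory with the combined tensors

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```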