| Attribute | Details |
|---|---|
| Model Type | Multilingual large language model; decoder-only Transformer |
| Use Cases | |
| Areas | Research, commercial applications |
| Limitations | Like all LLMs, may produce inaccurate, biased, or offensive content |
| Considerations | Developers should conduct safety testing and tuning of the model before deployment |
| Supported Languages | Chinese (excellent), English (excellent); more than 40 other languages supported |
| Training Details | |
| Data Volume | 3.2 trillion tokens after continual pre-training (see Release Notes) |
| Methodology | |
| Context Length | 16K tokens |
| Model Architecture | Decoder-only Transformer (see the loading sketch below) |
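To make the architecture and context-length rows concrete, here is a minimal loading sketch using Hugging Face transformers. The repo id `xverse/XVERSE-65B`, the `bfloat16`/`device_map` settings, and the `trust_remote_code` flag are illustrative assumptions, not details stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id assumed from the model name; adjust to the actual
# Hugging Face repository. trust_remote_code is assumed here, as is common
# for custom decoder-only architectures.
MODEL_ID = "xverse/XVERSE-65B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision; a 65B model still spans multiple GPUs
    device_map="auto",           # shard layers across the available devices
    trust_remote_code=True,
)

# Generate from a short prompt; inputs may be up to the 16K-token
# context length noted above.
inputs = tokenizer(
    "An overview of decoder-only transformers:", return_tensors="pt"
).to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```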
Release Notes

| Version | Date | Notes |
|---|---|---|
| XVERSE-65B-Chat-GPTQ-Int4 | | Released the GPTQ Int4 quantized model with vLLM inference support (see the inference sketch below this table) |
| | | Continual pre-training to 3.2 trillion tokens; improved mathematical and coding abilities |
| | | Initial release of the base model |
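Since the first release note states vLLM inference support for the quantized chat model, here is a minimal sketch of what that could look like. The repo id `xverse/XVERSE-65B-Chat-GPTQ-Int4`, the `trust_remote_code` flag, and the sampling settings are assumptions for illustration; only the vLLM support itself is stated above.

```python
from vllm import LLM, SamplingParams

# Hypothetical repo id taken from the version string above.
# quantization="gptq" asks vLLM to load the Int4 GPTQ weights.
llm = LLM(
    model="xverse/XVERSE-65B-Chat-GPTQ-Int4",
    quantization="gptq",
    trust_remote_code=True,  # assumed, as for other custom architectures
)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["Briefly introduce the XVERSE-65B model."], params)
print(outputs[0].outputs[0].text)
```

Int4 GPTQ weights take roughly a quarter of the memory of the fp16 model, which is what makes serving a 65B-parameter model on a modest GPU budget practical.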