XVERSE 65B By xverse: Benchmarks, Features and Detailed Analysis. Insights on XVERSE 65B.

Tags: arXiv:2005.14165, arXiv:2112.11446, arXiv:2201.11990, arXiv:2203.15556, arXiv:2204.02311, arXiv:2211.05100, arXiv:2302.13971, AutoTrain compatible, Custom code, PyTorch, Region: US, Sharded, XVERSE

Model Card on HF 🤗: https://huggingface.co/xverse/XVERSE-65B

XVERSE 65B Benchmarks

LLME Score: 0.12083


XVERSE 65B Parameters and Internals

Model Type

multilingual, large language model

Use Cases

Areas:

academic research, commercial use

Applications:

multilingual tasks, text generation, dialogue, summarization

Primary Use Cases:

Chinese question answering, English question answering, language comprehension, commonsense question answering, logical reasoning, math problem solving, coding

Limitations:

may produce inaccurate, biased, or offensive content

Considerations:

Developers should conduct safety tests before deployment.

Supported Languages

en (54.91), zh (31.09), ru (3.15), ja (3.22), de (1.52), es (0.91), fr (0.73), pl (0.48), it (0.36), pt (0.34), nl (0.20), cs (0.27), sv (0.15), ko (0.18), fi (0.14), ar (0.12), ro (0.11), bg (0.10), th (0.10), da (0.09), hu (0.19), no (0.07), hi (0.07), iw (0.06), fa (0.07), sl (0.05), et (0.04), lv (0.03), sk (0.08), ms (0.05), ca (0.06), sr (0.03), tr (0.23), uk (0.24), id (0.13), mr (0.08), lt (0.05), kk (0.02), ta (0.03)
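
The figures in parentheses are not labeled on this page. A quick check, sketched below in Python, suggests they are percentage shares: the script parses the line above and sums the values. The variable names are illustrative, and reading the numbers as percentages of the training mix is an assumption rather than something the page states.

```python
import re

# The "Supported Languages" line, copied verbatim from this page.
LANGUAGES = (
    "en (54.91), zh (31.09), ru (3.15), ja (3.22), de (1.52), es (0.91), "
    "fr (0.73), pl (0.48), it (0.36), pt (0.34), nl (0.20), cs (0.27), "
    "sv (0.15), ko (0.18), fi (0.14), ar (0.12), ro (0.11), bg (0.10), "
    "th (0.10), da (0.09), hu (0.19), no (0.07), hi (0.07), iw (0.06), "
    "fa (0.07), sl (0.05), et (0.04), lv (0.03), sk (0.08), ms (0.05), "
    "ca (0.06), sr (0.03), tr (0.23), uk (0.24), id (0.13), mr (0.08), "
    "lt (0.05), kk (0.02), ta (0.03)"
)

# Map each language code to its listed value.
shares = {
    code: float(value)
    for code, value in re.findall(r"(\w+) \(([\d.]+)\)", LANGUAGES)
}

total = sum(shares.values())
english_plus_chinese = shares["en"] + shares["zh"]

print(f"{len(shares)} languages, total {total:.2f}")
print(f"en + zh: {english_plus_chinese:.2f}")
```

The total comes out just under 100, which is consistent with the percentage reading, and English and Chinese together account for most of it.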

Training Details

Data Sources:

web pages, code, encyclopedia, books, academic papers, QA, other

Data Volume:

2.6 trillion tokens

Methodology:

FlashAttention2, 3D parallelism with virtual pipeline

Context Length:

16,384 tokens (16K)

Hardware Used:

A800 80G GPU, 1500GB memory for training

Model Architecture:

Decoder-only Transformer

Input Output

Input Format:

tokenized input using BPE with vocabulary size 100,534

Accepted Modalities:

text

Output Format:

text

Performance Tips:

Use bfloat16 for better fine-tuning performance
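
A minimal sketch of loading the model in bfloat16 with Hugging Face Transformers follows. It assumes `torch`, `transformers`, and `accelerate` are installed and that enough GPU memory is available; the helper names `build_load_kwargs` and `load_model` are hypothetical, and `device_map="auto"` is an assumption, not a setting taken from this page. `trust_remote_code=True` is passed because the repository is tagged "Custom code".

```python
MODEL_ID = "xverse/XVERSE-65B"  # repository named on this page


def build_load_kwargs(dtype):
    """Keyword arguments for AutoModelForCausalLM.from_pretrained."""
    return {
        "trust_remote_code": True,  # the repo ships its own model class
        "torch_dtype": dtype,       # bfloat16 matches the listed Torch Data Type
        "device_map": "auto",       # assumption: needs the accelerate package
    }


def load_model():
    """Download and load the tokenizer and model (large download)."""
    # Imports are deferred so build_load_kwargs can be used without
    # torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, **build_load_kwargs(torch.bfloat16)
    )
    return tokenizer, model.eval()
```

Calling `load_model()` fetches all weight shards, so it is only practical on hardware sized for the model.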

Release Notes

Version:

2023/11/29

Date:

2023-11-29

Notes:

Updated the model architecture information and added further pre-training data details.

Version:

2023/11/24

Date:

2023-11-24

Notes:

Updated the information on pre-training data.

Version:

2023/11/06

Date:

2023-11-06

Notes:

Released the XVERSE-65B base model.

LLM Name	XVERSE 65B
Repository 🤗	https://huggingface.co/xverse/XVERSE-65B
Model Size	65B
Required VRAM	133.9 GB
Updated	2025-09-23
Maintainer	xverse
Model Type	xverse
Model Files	28 shards: 3.8 GB (1-of-28), 4.9 GB each (2-of-28 through 27-of-28), 2.7 GB (28-of-28)
Model Architecture	XverseForCausalLM
License	apache-2.0
Context Length	16384
Model Max Length	16384
Transformers Version	4.30.2
Tokenizer Class	PreTrainedTokenizerFast
Vocabulary Size	100534
Torch Data Type	bfloat16
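
The Required VRAM figure in the table can be cross-checked against the shard sizes listed under Model Files. The sketch below sums the shards and compares the result with a rough estimate of 2 bytes per parameter for bfloat16. The nominal count of 65e9 parameters is an assumption based on the model name, and the estimate covers weights only, not activations or the KV cache.

```python
# Shard sizes in GB as listed under "Model Files":
# the first shard, 26 middle shards, and the last shard.
shard_sizes_gb = [3.8] + [4.9] * 26 + [2.7]

total_gb = round(sum(shard_sizes_gb), 1)

# Assumption: a nominal 65e9 parameters stored at 2 bytes each (bfloat16).
PARAMS = 65e9
BYTES_PER_PARAM = 2
nominal_gb = PARAMS * BYTES_PER_PARAM / 1e9

print(f"shards: {len(shard_sizes_gb)}, total on disk: {total_gb} GB")
print(f"nominal bfloat16 weights: {nominal_gb} GB")
```

The shard total matches the 133.9 GB listed as Required VRAM, which suggests that figure is simply the size of the weight files.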

Best Alternatives to XVERSE 65B

Best Alternatives	Context / RAM	Downloads	Likes
XVERSE 65B Chat	16K / 132.8 GB	127	13
XVERSE 65B 2	16K / 134.6 GB	16	10
XVERSE 65B Chat GPTQ Int4	8K / 37 GB	17	1


Original data from HuggingFace, OpenCompass and various public git repos.
