Tess 2.0 Mixtral 8x22B by migtissera


Tags: Merged Model · Autotrain compatible · Endpoints compatible · Mixtral · MoE · Region: us · Safetensors · Sharded · Tensorflow

Tess 2.0 Mixtral 8x22B Benchmarks

Benchmark scores show how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").
Tess 2.0 Mixtral 8x22B (migtissera/Tess-2.0-Mixtral-8x22B)

Tess 2.0 Mixtral 8x22B Parameters and Internals

Model Type: General Purpose Large Language Model
Additional Notes: The model is uncensored and aims to follow instructions effectively. Caution is advised, as it may produce inappropriate or biased content.
Training Details:
Data Sources: Tess-2.0 dataset
Data Volume: ~25K high-quality code and general training samples
Methodology: LIMA ("Less Is More for Alignment") principles
Training Time: 1-epoch fine-tuning
Input Output:
Input Format: SYSTEM: USER: ASSISTANT: (a formatting sketch follows this list)
Accepted Modalities: text
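
The template above is a plain single-turn text format. Below is a minimal Python sketch of assembling it; the system message and example question are illustrative assumptions, not values published with the model.

# Minimal sketch of the SYSTEM/USER/ASSISTANT prompt format described above.
def build_prompt(system: str, user: str) -> str:
    # Single-turn prompt; the model's reply is generated after "ASSISTANT:".
    return f"SYSTEM: {system}\nUSER: {user}\nASSISTANT:"

prompt = build_prompt(
    "You are Tess, a helpful assistant.",             # assumed system message
    "Summarize what a mixture-of-experts model is.",  # example user turn
)
print(prompt)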
LLM Name: Tess 2.0 Mixtral 8x22B
Repository: https://huggingface.co/migtissera/Tess-2.0-Mixtral-8x22B
Merged Model: Yes
Model Size: 140.6b
Required VRAM: 216.8 GB
Updated: 2025-09-14
Maintainer: migtissera
Model Type: mixtral
Model Files: 59 sharded safetensors files (listing truncated at shard 45-of-59); shard 1-of-59 is 5.0 GB, shards 24-of-59 through 27-of-59 are 4.9–5.0 GB, and the remaining shards are 4.8 GB each.
Model Architecture: MixtralForCausalLM
License: apache-2.0
Context Length: 65536
Model Max Length: 65536
Transformers Version: 4.40.0.dev0
Vocabulary Size: 32000
Torch Data Type: float16
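
Given the repository name, float16 weights, 65536-token context, and ~217 GB footprint listed above, a hedged loading sketch with the Hugging Face transformers library could look like the following; device placement and generation settings are assumptions, not values prescribed by this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "migtissera/Tess-2.0-Mixtral-8x22B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the card's float16 weights
    device_map="auto",          # spread the ~217 GB of shards across available GPUs
)

prompt = "SYSTEM: You are Tess, a helpful assistant.\nUSER: Hello!\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))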

Best Alternatives to Tess 2.0 Mixtral 8x22B

Best Alternatives | Context / RAM | Downloads / Likes
Zephyr Orpo 141B A35b V0.1 | 64K / 207.2 GB | 32269
Mixtral 8x22B Instruct V0.1 | 64K / 221.4 GB | 10718733
WizardLM 2 8x22B | 64K / 216.8 GB | 9721405
Mixtral 8x22B V0.1 | 64K / 221.6 GB | 4961229
Mixtral 8x22B V0.1 | 64K / 212 GB | 1185672
Mixtral 8x22B V0.3 | 64K / 221.4 GB | 383
...ixtral 8x22B Instruct V0.1 FP8 | 64K / 140.9 GB | 1820
Dolphin 2.9.2 Mixtral 8x22b | 64K / 207.2 GB | 924941
Dolphin 2.9.2 Mixtral 8x22b | 64K / 207.2 GB | 931140
XLAM 8x22b R | 64K / 211.8 GB | 107545

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124