Llama3 8B Chinese Chat by shenzhi-wang


Tags: autotrain-compatible · base model (finetune): meta-llama/Meta-Llama-3-8B-Instruct · conversational · DOI: 10.57967/hf/2316 · endpoints-compatible · instruct · llama · llama-factory · ORPO · region: us · safetensors · sharded · TensorFlow · languages: en, zh


Llama3 8B Chinese Chat Parameters and Internals

Model Type 
text generation
Additional Notes 
Fine-tuned primarily for Chinese and English users, with capabilities such as roleplay and tool use. Note that the model's identity was deliberately not fine-tuned.
Supported Languages 
Chinese (high), English (medium)
Training Details 
Data Sources:
mixed Chinese-English dataset
Data Volume:
~100K preference pairs
Methodology:
ORPO (Odds Ratio Preference Optimization: reference-model-free, monolithic preference optimization)
Context Length:
8192
Model Architecture:
Meta-Llama-3
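The odds-ratio term that gives ORPO its name can be sketched in a few lines. This is an illustrative toy only (the sequence probabilities below are made-up scalars, not real model likelihoods), not the training code used for this model:

```python
import math

def odds(p):
    # Odds of generating a sequence with (average token) probability p.
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    """Relative-ratio term of the ORPO loss:
    -log sigmoid(log odds(chosen) - log odds(rejected)).
    In ORPO this term is added to the usual NLL loss on the chosen
    response, so no separate reference model is needed."""
    log_or = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_or)))  # -log sigmoid(log_or)

# The penalty shrinks as the model prefers the chosen response more strongly:
print(orpo_penalty(0.6, 0.4))  # larger penalty (weak preference)
print(orpo_penalty(0.9, 0.1))  # smaller penalty (strong preference)
```

Minimizing this term pushes the odds of the chosen response above the odds of the rejected one, which is how the ~100K preference pairs are used.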
Input Output 
Input Format:
instruction-based prompts
Accepted Modalities:
text
Output Format:
text
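The instruction format is the Llama-3 instruct template inherited from the base model (note the `<|eot_id|>` turn terminator, which this repo also uses as the padding token). A minimal sketch of the prompt assembly; in practice `tokenizer.apply_chat_template` does this for you:

```python
def format_llama3_prompt(messages):
    """Assemble a Llama-3 instruct prompt from a list of
    {"role": ..., "content": ...} messages, ending with an open
    assistant header so the model generates the reply."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_prompt([{"role": "user", "content": "你好"}])
print(prompt)
```

Generation should stop at `<|eot_id|>`; when loading through `transformers`, the bundled tokenizer config already handles this.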
Release Notes 
Version:
v2.1
Date:
May 6, 2024
Notes:
Training dataset is 5x larger (~100K preference pairs). Enhancements in roleplay, function calling, math. Less prone to including English words in Chinese responses.
Version:
v2
Date:
Apr. 29, 2024
Notes:
Training data increased from 20K to 100K preference pairs; improved performance in roleplay, tool use, and math.
Version:
v1
Notes:
Significantly reduces issues of 'Chinese questions with English answers' and the mixing of Chinese and English in responses.
LLM Name: Llama3 8B Chinese Chat
Repository 🤗: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
Base Model(s): Meta Llama 3 8B Instruct (meta-llama/Meta-Llama-3-8B-Instruct)
Model Size: 8B
Required VRAM: 16.1 GB
Updated: 2025-06-09
Maintainer: shenzhi-wang
Model Type: llama
Instruction-Based: Yes
Model Files: 5.0 GB (1-of-4), 5.0 GB (2-of-4), 4.9 GB (3-of-4), 1.2 GB (4-of-4)
Supported Languages: en, zh
Model Architecture: LlamaForCausalLM
License: llama3
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.40.0
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: <|eot_id|>
Vocabulary Size: 128256
Torch Data Type: bfloat16
Llama3 8B Chinese Chat (shenzhi-wang/Llama3-8B-Chinese-Chat)
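The 16.1 GB VRAM figure is consistent with bfloat16 storage: two bytes per weight over roughly 8.03B parameters (an approximate count for Llama-3-8B). A quick check:

```python
params = 8.03e9          # approx. parameter count of Llama-3-8B (assumption)
bytes_per_param = 2      # bfloat16 uses 2 bytes per weight
gb = params * bytes_per_param / 1e9
print(f"{gb:.1f} GB")    # 16.1 GB, matching the summed shard sizes above
```

Actual inference needs somewhat more than this, since activations and the KV cache come on top of the weights.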

Quantized Models of the Llama3 8B Chinese Chat

Model                                Likes   Downloads   VRAM
... Chinese Chat AWQ 4bit Smashed    0       20          5 GB
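By the same arithmetic, a 4-bit quantization (such as AWQ) stores the quantized weights in about a quarter of the bfloat16 footprint; real checkpoints land somewhat higher because some tensors typically stay in 16-bit and quantization scales add overhead. A rough sketch:

```python
params = 8.03e9                 # approx. Llama-3-8B parameter count (assumption)
gb_4bit = params * 0.5 / 1e9    # 4 bits = 0.5 bytes per weight
print(f"{gb_4bit:.1f} GB")      # 4.0 GB for quantized weights alone
```

This is consistent with the ~5 GB figure listed for the AWQ 4-bit checkpoint above once overhead is included.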

Best Alternatives to Llama3 8B Chinese Chat

Best Alternatives                     Context / RAM      Downloads / Likes
...otron 8B UltraLong 4M Instruct     4192K / 32.1 GB    3284108
UltraLong Thinking                    4192K / 16.1 GB    3672
...a 3.1 8B UltraLong 4M Instruct     4192K / 32.1 GB    17624
...a 3.1 8B UltraLong 2M Instruct     2096K / 32.1 GB    8759
...otron 8B UltraLong 2M Instruct     2096K / 32.1 GB    52615
Zero Llama 3.1 8B Beta6               1048K / 16.1 GB    9581
...otron 8B UltraLong 1M Instruct     1048K / 32.1 GB    180845
...a 3.1 8B UltraLong 1M Instruct     1048K / 32.1 GB    138729
....1 1million Ctx Dark Planet 8B     1048K / 32.3 GB    902
...dger Nu Llama 3.1 8B UltraLong     1048K / 16.2 GB    303
Note: a green score (e.g., "73.2") means that the model is better than shenzhi-wang/Llama3-8B-Chinese-Chat.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124