Yi 34B 200K DARE Merge V5 AWQ by TheBloke


Tags: Merged Model · 4-bit · Autotrain compatible · AWQ · Base model: brucethemoose/yi-34... · Base model (quantized): brucethem... · en · Llama · Quantized · Region: us · Safetensors · Sharded · Tensorflow

Yi 34B 200K DARE Merge V5 AWQ Benchmarks

nn.n% indicates how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Yi 34B 200K DARE Merge V5 AWQ (TheBloke/Yi-34B-200K-DARE-merge-v5-AWQ)

Yi 34B 200K DARE Merge V5 AWQ Parameters and Internals

Model Type 
yi, text-generation
Additional Notes 
Yi tends to run "hot" by default, and it really needs MinP sampling to cull its huge vocabulary. 24 GB GPUs can run Yi-34B-200K models at 45K-75K context with exllamav2.
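For readers unfamiliar with MinP, here is a minimal NumPy sketch of what min-p filtering does (the 0.05 threshold and the helper name are illustrative, not values recommended by this card): tokens less likely than a fixed fraction of the top token's probability are dropped before sampling, which trims the long tail of Yi's 64K-entry vocabulary.

```python
import numpy as np

def min_p_sample(logits: np.ndarray, min_p: float = 0.05, temperature: float = 1.0) -> int:
    """Sample one token id from raw logits with min-p filtering (illustrative values)."""
    z = logits / max(temperature, 1e-6)   # apply temperature
    z = z - z.max()                       # numerical stability
    probs = np.exp(z)
    probs /= probs.sum()

    # Drop every token less likely than min_p * (probability of the top token).
    keep = probs >= min_p * probs.max()
    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()

    return int(np.random.choice(len(probs), p=probs))

# Toy example over a 5-token "vocabulary".
print(min_p_sample(np.array([4.0, 3.5, 1.0, -2.0, -5.0])))
```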
Input Output 
Input Format:
SYSTEM: {system_message}
USER: {prompt}
ASSISTANT:
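As a usage illustration, a small helper that fills the SYSTEM/USER/ASSISTANT template above (the default system message is a placeholder, not one prescribed by the card):

```python
def build_prompt(user_prompt: str, system_message: str = "You are a helpful assistant.") -> str:
    """Fill the SYSTEM/USER/ASSISTANT template shown above."""
    return f"SYSTEM: {system_message}\nUSER: {user_prompt}\nASSISTANT:"

print(build_prompt("Summarize the DARE TIES merge approach in two sentences."))
```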
Release Notes 
Notes:
Various densities were tested with perplexity measurements and long-context prompts. Relatively high densities seem to perform better, contrary to the findings of the "Super Mario" (DARE) paper. This particular version is merged with more than the "recommended" maximum density of 0.5, which seems to yield even better perplexity, though it is unclear whether that translates to better output. Weights that sum to 1 seem to be optimal. DARE TIES also produces seemingly better, lower-perplexity merges than a regular TIES merge, task arithmetic, or a SLERP merge. SUS-Chat is not a 200K model, so it was merged at a very low density to preserve Yi 200K's long-context performance while keeping some of SUS-Chat's parameters.
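To make the merge terminology concrete, below is a hedged sketch of what a DARE TIES recipe looks like in mergekit's YAML format. It is not the actual recipe for this model: the non-SUS component models, densities, and weights are placeholders chosen only to mirror the notes above (densities above the nominal 0.5 maximum, weights summing to 1, SUS-Chat merged at a very low density).

```python
# Hypothetical DARE TIES recipe in mergekit's YAML format (not this model's
# actual recipe). Densities above 0.5, weights summing to 1, and a very low
# density for SUS-Chat mirror the release notes above.
merge_config = """\
merge_method: dare_ties
base_model: 01-ai/Yi-34B-200K
models:
  - model: example-org/yi-34b-200k-finetune-a   # hypothetical component
    parameters:
      density: 0.6
      weight: 0.5
  - model: example-org/yi-34b-200k-finetune-b   # hypothetical component
    parameters:
      density: 0.6
      weight: 0.4
  - model: SUSTech/SUS-Chat-34B                 # kept at a very low density
    parameters:
      density: 0.1
      weight: 0.1
dtype: bfloat16
"""

with open("dare-ties-sketch.yml", "w", encoding="utf-8") as f:
    f.write(merge_config)

# The merge would then be run with mergekit, e.g.:
#   mergekit-yaml dare-ties-sketch.yml ./merged
```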
LLM Name: Yi 34B 200K DARE Merge V5 AWQ
Repository: https://huggingface.co/TheBloke/Yi-34B-200K-DARE-merge-v5-AWQ
Model Name: Yi 34B 200K DARE Merge v5
Model Creator: brucethemoose
Base Model(s): Yi 34B 200K DARE Merge V5 (brucethemoose/Yi-34B-200K-DARE-merge-v5)
Merged Model: Yes
Model Size: 34B
Required VRAM: 19.3 GB
Updated: 2025-09-21
Maintainer: TheBloke
Model Type: llama
Model Files: 10.0 GB (1 of 2), 9.3 GB (2 of 2)
Supported Languages: en
AWQ Quantization: Yes
Quantization Type: awq
Model Architecture: LlamaForCausalLM
License: other
Context Length: 200000
Model Max Length: 200000
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 64000
Torch Data Type: float16
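Given the figures above (AWQ 4-bit, roughly 19.3 GB of weights, LlamaForCausalLM, 200K max length), a minimal loading sketch with transformers is shown below. It assumes the autoawq package is installed and that there is enough GPU memory for the weights plus the KV cache of whatever context you actually use; the sampling values are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Yi-34B-200K-DARE-merge-v5-AWQ"

# AWQ checkpoints load through transformers when the `autoawq` package is installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "SYSTEM: You are a helpful assistant.\nUSER: What is a DARE TIES merge?\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Illustrative sampling settings; recent transformers releases also accept a
# min_p argument, in line with the MinP note above.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```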

Best Alternatives to Yi 34B 200K DARE Merge V5 AWQ

Best Alternatives                        Context / RAM      Downloads / Likes
Opus V1 34B AWQ                          195K / 19.2 GB     71
Smaug 34B V0.1 AWQ                       195K / 19.2 GB     62
Yi 34B 200K RPMerge AWQ                  195K / 19.2 GB     71
Tess 34B V1.5B AWQ                       195K / 19.3 GB     83
...34B 200K DARE Megamerge V8 AWQ        195K / 19.3 GB     62
...ey 34B 200K Chat Evaluator AWQ        195K / 19.3 GB     65
Deepmoney 34B 200K Base AWQ              195K / 19.3 GB     61
Nous Capybara Limarpv3 34B AWQ           195K / 19.3 GB     61
Bagel DPO 34B V0.2 AWQ                   195K / 19.3 GB     88
Bagel 34B V0.2 AWQ                       195K / 19.3 GB     52
Note: a green score (e.g. "73.2") means that the alternative performs better than TheBloke/Yi-34B-200K-DARE-merge-v5-AWQ.

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124