Baize V2 13B SuperHOT 8K Fp16 by TheBloke


  Arxiv:2304.01196   Autotrain compatible   Custom code   Ext 8k   Fp16   Llama   Pytorch   Quantized   Region:us   Sharded


Baize V2 13B SuperHOT 8K Fp16 Parameters and Internals

Model Type 
text-generation, chatbot
Use Cases 
Limitations:
Model declines to engage with topics, questions and instructions related to unethical, controversial, or sensitive issues.
Training Details 
Methodology:
Supervised fine-tuning and self-distillation with feedback (SDF)
Context Length:
8192
Input Output 
Input Format:
Conversations are expected to use turn markers: human statements start with [|Human|] and AI assistant statements start with [|AI|].
Output Format:
Responses in Markdown format
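
The marker convention described under Input Format can be sketched as a small prompt builder. This is a minimal illustration rather than the model's official template: the function name `build_prompt`, the newline separator between turns, the optional `system` preamble, and the trailing open `[|AI|]` cue are all assumptions. Only the two markers come from the listing.

```python
HUMAN = "[|Human|]"  # marker from the listing
AI = "[|AI|]"        # marker from the listing


def build_prompt(turns, system=""):
    """Join (speaker, text) turns using the [|Human|] / [|AI|] markers.

    speaker must be "human" or "ai". The newline separator and the
    trailing open [|AI|] cue are assumptions, not taken from the listing.
    """
    markers = {"human": HUMAN, "ai": AI}
    lines = [system] if system else []
    for speaker, text in turns:
        lines.append(f"{markers[speaker]}{text}")
    lines.append(AI)  # leave the AI marker open so the model continues from here
    return "\n".join(lines)


prompt = build_prompt(
    [
        ("human", "Hello!"),
        ("ai", "Hi, how can I help?"),
        ("human", "Summarize fp16."),
    ]
)
print(prompt)
```

Keeping the markers in named constants makes it easy to adjust the separator or preamble if the upstream model card specifies a different layout.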
LLM Name: Baize V2 13B SuperHOT 8K Fp16
Repository: https://huggingface.co/TheBloke/Baize-v2-13B-SuperHOT-8K-fp16
Model Size: 13b
Required VRAM: 39.1 GB
Updated: 2025-08-18
Maintainer: TheBloke
Model Type: llama
Model Files: 14.7 GB: 1-of-3   14.7 GB: 2-of-3   9.7 GB: 3-of-3
Quantization Type: fp16
Model Architecture: LlamaForCausalLM
License: other
Context Length: 8192 (8k)
Model Max Length: 8192
Transformers Version: 4.30.0.dev0
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 32000
Torch Data Type: float16
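
The Required VRAM figure appears to be the sum of the three listed shard sizes. The short check below reproduces that arithmetic and, for comparison, gives a rough weights-only estimate for a 13B-parameter model stored in float16. The parameter count of exactly 13e9 is a nominal assumption taken from the "13b" label, and the variable names are illustrative.

```python
# Shard sizes in GB as listed under Model Files.
shard_sizes_gb = [14.7, 14.7, 9.7]
total_gb = round(sum(shard_sizes_gb), 1)
print(f"Sum of shards: {total_gb} GB")

# Rough weights-only estimate: nominal parameter count (assumed 13e9)
# times 2 bytes per float16 value, expressed in decimal GB.
nominal_params = 13e9
bytes_per_param = 2
weights_gb = nominal_params * bytes_per_param / 1e9
print(f"Nominal fp16 weights: {weights_gb} GB")
```

The shard total matches the listed 39.1 GB, while the nominal fp16 estimate is lower, so the listed figure is best read as the on-disk checkpoint size rather than a strict minimum.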

Best Alternatives to Baize V2 13B SuperHOT 8K Fp16

Best Alternatives   Context / RAM   Downloads/Likes (digits run together)
Llama13b 32K Illumeet Finetune   32K / 26 GB   50
...Maid V3 13B 32K 6.0bpw H6 EXL2   32K / 10 GB   51
...Maid V3 13B 32K 8.0bpw H8 EXL2   32K / 13.2 GB   51
WhiteRabbitNeo 13B V1   16K / 26 GB   2630425
CodeLlama 13B Python Fp16   16K / 26 GB   309425
CodeLlama 13B Instruct Fp16   16K / 26 GB   309828
...Llama 13B Instruct Hf 4bit MLX   16K / 7.8 GB   11702
CodeLlama 13B Fp16   16K / 26 GB   1166
Codellama 13B Bnb 4bit   16K / 7.2 GB   965
Airophin 13B Pntk 16K Fp16   16K / 26 GB   15164

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124