Llama 3.2 Kapusta JapanChibi 3B V1 by Khetterman


Tags: Merged Model, 3b, Autotrain compatible, Base model:aellm/llama-3.2-chi..., Base model:axcxept/ezo-llama-3..., Base model:khetterman/llama-3...., Bfloat16, Chat, Conversational, Creative, En, Endpoints compatible, Instruct, Llama, Llama-3, Llama-3.2, Not-for-all-audiences, Region:us, Ru, Safetensors, Sharded, Tensorflow


Llama 3.2 Kapusta JapanChibi 3B V1 Parameters and Internals

LLM Name: Llama 3.2 Kapusta JapanChibi 3B V1
Repository: https://huggingface.co/Khetterman/Llama-3.2-Kapusta-JapanChibi-3B-v1
Base Model(s): Khetterman/Llama-3.2-Kapusta-3B-v8, AELLM/Llama-3.2-Chibi-3B, AXCXEPT/EZO-Llama-3.2-3B-Instruct-dpoE
Merged Model: Yes
Model Size: 3b
Required VRAM: 7.2 GB
Updated: 2025-06-09
Maintainer: Khetterman
Model Type: llama
Instruction-Based: Yes
Model Files: 5.0 GB (1 of 2), 2.2 GB (2 of 2)
Supported Languages: en, ru
Model Architecture: LlamaForCausalLM
Context Length: 131072
Model Max Length: 131072
Transformers Version: 4.45.2
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 128256
Torch Data Type: bfloat16
Llama 3.2 Kapusta JapanChibi 3B V1 (Khetterman/Llama-3.2-Kapusta-JapanChibi-3B-v1)
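
The listed Required VRAM appears to equal the combined size of the two weight shards. A minimal sketch of that arithmetic follows. The function names are hypothetical, and it assumes that 1 GB means 10^9 bytes and that every stored value is a 2-byte bfloat16, neither of which the listing states.

```python
# Values copied from the listing above.
SHARD_SIZES_GB = [5.0, 2.2]   # "Model Files": shard 1 of 2, shard 2 of 2
LISTED_VRAM_GB = 7.2          # "Required VRAM"
BYTES_PER_BF16 = 2            # bfloat16 is a 16-bit format


def total_weights_gb(shard_sizes_gb):
    """Sum the shard sizes, rounded to the listing's one-decimal precision."""
    return round(sum(shard_sizes_gb), 1)


def implied_values_billions(total_gb, bytes_per_value=BYTES_PER_BF16):
    """Billions of stored values implied by the file size.

    Assumption: 1 GB = 1e9 bytes and all values share one dtype.
    """
    return total_gb / bytes_per_value


if __name__ == "__main__":
    total = total_weights_gb(SHARD_SIZES_GB)
    print(f"Total weight files: {total} GB")
    print(f"Matches listed VRAM: {total == LISTED_VRAM_GB}")
    print(f"Implied stored values: {implied_values_billions(total):.1f} billion")
```

The figure covers the weights only. Activations and the key/value cache, which grows with the context in use, need memory on top of it.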

Best Alternatives to Llama 3.2 Kapusta JapanChibi 3B V1

Best Alternatives                    Context / RAM    Downloads, Likes (unseparated)
Llama 3.2 3B Instruct                128K / 6.5 GB    15180021505
DeepSeek R1 Distill Llama 3B         128K / 6.5 GB    94713
Orpheus 3B 0.1 Pretrained            128K / 6.6 GB    104510
Llama 3.2 3B Instruct                128K / 6.5 GB    21505466
Llama 3.2 3B RP Toxic Fuse           128K / 6.4 GB    92
Llama 3.2 3B Bespoke Thought         128K / 6.4 GB    6613
Zeitgeist 3B V1                      128K / 6.5 GB    1055
...lama 3.2 Rabbit Ko 3B Instruct    128K / 6.5 GB    16388
ReasoningCore 3B T1 1                128K / 6.5 GB    331
... 3.2 3B Math Instruct RE1 ORPO    128K / 6.5 GB    480


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124