TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2 by LoneStriker


TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2 is an open-source language model by LoneStriker. Features: 1.1b LLM, VRAM: 1.2GB, Context: 32K, Quantized, Instruction-Based, LLM Explorer Score: 0.12.

Tags: Conversational, Dataset:doctor-shotgun/capybar..., Dataset:doctor-shotgun/no-robo..., Dataset:huggingfaceh4/no robot..., Dataset:jondurbin/airoboros-3...., Dataset:ldjnr/capybara, Dataset:ldjnr/verified-camel, Dataset:unalignment/toxic-dpo-..., En, Exl2, Instruct, Llama, Quantized, Region:us

TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2 Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2 (LoneStriker/TinyLlama-1.1B-32k-Instruct-8.0bpw-h8-exl2)

TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2 Parameters and Internals

Model Type 
text generation
Training Details 
Data Sources:
LDJnr/Capybara, jondurbin/airoboros-3.2, unalignment/toxic-dpo-v0.1, LDJnr/Verified-Camel, HuggingFaceH4/no_robots, Doctor-Shotgun/no-robots-sharegpt, Doctor-Shotgun/capybara-sharegpt
Methodology:
The model was trained as a full finetune for 3 epochs using a single A100 GPU for around 3.5 hours.
Training Time:
3.5 hours
Hardware Used:
A100 GPU
Input Output 
Input Format:
A modified multi-turn Alpaca instruction format: ### Instruction: {system prompt} ### Input: {user message} ### Response: {model response} ### Input: {user message} ### Response: {model response} (etc.)
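The input format above can be assembled programmatically. The following is a minimal sketch, not the maintainer's own code: the function name `build_prompt`, the newline after each `###` header, and the blank line between blocks are assumptions, since the listing only gives the order of the sections and not the exact whitespace.

```python
from typing import List, Tuple


def build_prompt(
    system_prompt: str,
    turns: List[Tuple[str, str]],
    next_user_message: str,
) -> str:
    """Build a multi-turn prompt in the modified Alpaca layout.

    `turns` holds completed (user message, model response) pairs.
    The whitespace between sections is an assumption, not taken
    from the listing.
    """
    parts = [f"### Instruction:\n{system_prompt}"]
    for user_message, model_response in turns:
        parts.append(f"### Input:\n{user_message}")
        parts.append(f"### Response:\n{model_response}")
    # The newest user message, followed by an open Response header
    # for the model to complete.
    parts.append(f"### Input:\n{next_user_message}")
    parts.append("### Response:\n")
    return "\n\n".join(parts)


prompt = build_prompt(
    "You are a helpful assistant.",
    [("Hello!", "Hi, how can I help?")],
    "Summarize the Alpaca format in one sentence.",
)
print(prompt)
```

Ending the prompt on an open `### Response:` header leaves the next model turn for the generator to fill in; it may be worth checking the exact spacing against the upstream model card before relying on it.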
LLM Name: TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2
Repository: https://huggingface.co/LoneStriker/TinyLlama-1.1B-32k-Instruct-8.0bpw-h8-exl2
Model Size: 1.1b
Required VRAM: 1.2 GB
Updated: 2026-03-29
Maintainer: LoneStriker
Model Type: llama
Instruction-Based: Yes
Model Files: 1.2 GB
Supported Languages: en
Quantization Type: exl2
Model Architecture: LlamaForCausalLM
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.37.0.dev0
Tokenizer Class: LlamaTokenizer
Padding Token: </s>
Vocabulary Size: 32000
Torch Data Type: bfloat16
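As a rough sanity check on the size figures, the weight footprint can be estimated from the parameter count and the bits per weight. This is a back-of-the-envelope sketch that assumes a nominal 1.1e9 parameters and a uniform bit width across all weights. It ignores the KV cache, activations, and quantization metadata, so the listed values (1.2 GB here, and the 0.5 GB and 2.2 GB shown for other variants on this page) are expected to sit at or slightly above these estimates.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter count; an assumption based on the "1.1b" label.
N_PARAMS = 1.1e9

for label, bpw in [
    ("8.0bpw EXL2", 8.0),
    ("3.0bpw EXL2", 3.0),
    ("bfloat16 (unquantized)", 16.0),
]:
    print(f"{label}: ~{weight_size_gb(N_PARAMS, bpw):.2f} GB")
```

Under these assumptions the estimates come out to about 1.10 GB, 0.41 GB, and 2.20 GB respectively.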

Best Alternatives to TinyLlama 1.1B 32K Instruct 8.0bpw H8 EXL2

Best Alternatives | Context / RAM | Downloads / Likes
...1B 32K Instruct 3.0bpw H6 EXL2 | 32K / 0.5 GB | 50
...nish English Asistant 16bit V2 | 2K / 2.2 GB | 50
TinyKiller NSFW DPO 1.1B | 32K / 2.2 GB | 120
...llama 1.1B 16K Instructions V4 | 32K / 2.2 GB | 80
TinyLlama 1.1B 32K Instruct | 32K / 2.2 GB | 55713
Palmer 002 32K | 32K / 2.2 GB | 70
...lama 1.1B 16K Instructions Rag | 32K / 2.2 GB | 250
...diate 1.5T PTBR Instruct V3 8K | 8K / 2.2 GB | 98
Tinyllama Coder Py 4bit V10 | 4K / 0.7 GB | 20080
TinyLlama 1.1B Instruct 3T | 4K / 2.2 GB | 560


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20260328a