Llama 2 13B GPTQ is a GPTQ-quantized release of Meta's open Llama 2 13B model, published by TheBloke. Features: 13B parameters, VRAM: 7.3 GB, Context: 4K, License: llama2, Quantized, HF Score: 53.3, LLM Explorer Score: 0.12, ARC: 59.1, HellaSwag: 81.5, MMLU: 54.5, TruthfulQA: 37.1, WinoGrande: 76.2, GSM8K: 11.3.
Llama 2 13B GPTQ Benchmarks
Benchmark scores are reported as percentages relative to reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Llama 2 13B GPTQ Parameters and Internals
Model Type: llama (decoder-only causal language model)
Use Cases
Areas: Commercial and research use in English
Applications: assistant-like chat, natural language generation
Limitations: English-focused and not tested in all languages; outputs may be unpredictable.
Considerations: Follow specific input formatting to align with intended use cases.
Additional Notes: Compatible with AutoGPTQ and major GPTQ clients. Choose quantization parameters based on hardware needs.
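As a sketch of that compatibility, the checkpoint can be loaded through the Transformers GPTQ integration (the model card lists Transformers 4.32.0.dev0, the first line with built-in GPTQ support, and a GPTQ backend such as auto-gptq is required). The function name and settings below are illustrative assumptions, not values from the model card.

```python
def load_llama2_gptq(repo_id="TheBloke/Llama-2-13B-GPTQ"):
    """Load the GPTQ-quantized checkpoint with Transformers.

    Requires transformers>=4.32 plus a GPTQ backend (e.g. auto-gptq).
    Imports are deferred so the sketch can be inspected without
    those packages installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        device_map="auto",   # place quantized layers on available GPU(s)
        torch_dtype="auto",  # float16, per the model card
    )
    return model, tokenizer

# Example usage (requires a GPU with roughly 7.3 GB of free VRAM):
# model, tokenizer = load_llama2_gptq()
# inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```

Note that this is the base (non-chat) model, so prompts are plain text continuations rather than a chat template.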
Supported Languages: English (en)
Training Details
Data Sources: Publicly available online data
Data Volume: 2 trillion tokens
Methodology: Pretrained, then fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF)
Context Length: 4096 tokens
Training Period: January 2023 to July 2023
Hardware Used: NVIDIA A100 80GB GPUs
Model Architecture: Optimized transformer architecture.
Safety Evaluation
Methodologies: Internal evaluations library
Findings: May produce inaccurate, biased or objectionable responses; testing primarily in English.
Risk Categories: Truthfulness, toxicity, bias
Ethical Considerations: Before deploying applications, perform safety testing tailored to your use case.
Responsible AI Considerations
Fairness: Testing primarily in English, does not guarantee unbiased outputs in all languages.
Transparency: Evaluation data and results are disclosed.
Accountability: Developers responsible for application-specific safety testing.
Mitigation Strategies: Community feedback and iterative improvements.
Input Output
Input Format: Plain text (base model; no chat prompt template)
Accepted Modalities: Text
Output Format: Text
Performance Tips: Select appropriate quantization parameters for VRAM efficiency and accuracy.
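One rough way to compare quantization parameters is to estimate weight memory from parameter count and bit width. The helper below is a back-of-the-envelope sketch (the function name is ours, and it deliberately excludes KV-cache, activations, and quantization metadata, which is why the real figure listed here, 7.3 GB, is higher).

```python
def gptq_weight_vram_gb(n_params: float, bits: int) -> float:
    """Approximate VRAM (GB) needed just to hold quantized weights."""
    return n_params * bits / 8 / 1e9

# 13B parameters at 4-bit works out to about 6.5 GB of raw weights,
# in the same ballpark as the 7.3 GB listed for this model; the gap
# is quantization metadata, embeddings, and loader overhead.
print(round(gptq_weight_vram_gb(13e9, 4), 1))
```

The same formula suggests an 8-bit variant would need roughly twice the weight memory, which is the trade-off the performance tip above refers to.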
Release Notes
Notes: Pretrained on 2 trillion tokens with fine-tuning using RLHF for dialog applications.
LLM Name: Llama 2 13B GPTQ
Repository: 🤗 https://huggingface.co/TheBloke/Llama-2-13B-GPTQ
Model Name: Llama 2 13B
Model Creator: Meta
Base Model(s): Llama 2 13B Hf (meta-llama/Llama-2-13b-hf)
Model Size: 13b
Required VRAM: 7.3 GB
Updated: 2026-04-08
Maintainer: TheBloke
Model Type: llama
Model Files: 7.3 GB
Supported Languages: en
GPTQ Quantization: Yes
Quantization Type: gptq
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.32.0.dev0
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 32000
Torch Data Type: float16