Opt 125M is an open-source language model by Facebook (Meta AI). Key figures: 125M parameters, 0.3 GB required VRAM, 2K context length, "other" license, HF Score 29.2, LLM Explorer Score 0.2. Benchmark scores: ARC 22.9, HellaSwag 31.5, MMLU 26.0, TruthfulQA 42.9, WinoGrande 51.6, GSM8K 0.1.
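For quick orientation, here is a minimal sketch of running the model with the Hugging Face `transformers` pipeline API (assuming `transformers` and `torch` are installed; the prompt is illustrative):

```python
# Minimal sketch: run facebook/opt-125m with the transformers pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="facebook/opt-125m")
result = generator("Hello, I am a language model,", max_new_tokens=30)
print(result[0]["generated_text"])
```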
Opt 125M Benchmarks
Benchmark scores are shown as nn.n% — how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").
Opt 125M Parameters and Internals
Model Type: opt (decoder-only causal language model)
Use Cases
Areas: Research, Commercial Applications
Applications: Text generation, Prompting for downstream tasks
Primary Use Cases: Text generation, Prompting for evaluation of downstream tasks
Limitations: Bias in the training data; quality issues with generation diversity and hallucination.
Considerations: Bias in training data can affect fine-tuned versions.
Additional Notes: The official model card discusses ethical considerations related to model biases stemming from the nature of the training data.
Training Details
Data Sources: BookCorpus, CC-Stories, The Pile, Pushshift.io Reddit, CCNewsV2
Data Volume:
Methodology: Causal language modeling (CLM) using the GPT-2 byte-level BPE tokenizer (see the sketch after this list).
Context Length: 2048 tokens
Training Time: 33 days of continuous training
Hardware Used:
Model Architecture: Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers.
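To illustrate the CLM objective named above, here is a minimal sketch using `transformers`, with the model's GPT-2-style byte-level BPE tokenizer (the example sentence is an assumption for demonstration):

```python
# Sketch: computing the causal language modeling (CLM) loss for OPT-125M.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")  # GPT-2 byte-level BPE, vocab size 50272
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    # Passing labels == input_ids makes the model return the shifted
    # next-token prediction loss, i.e. the CLM training objective.
    outputs = model(**inputs, labels=inputs["input_ids"])
print(f"CLM loss: {outputs.loss.item():.3f}")
```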
Safety Evaluation
Methodologies: Evaluation using prompts similar to those used for GPT-3.
Findings: The model exhibits strong biases and can have quality issues such as hallucination.
Risk Categories:
Ethical Considerations: The training data contains unfiltered content from the internet, leading to biases.
Responsible AI Considerations
Fairness: Acknowledges bias in training data.
Transparency: Bias and safety acknowledged in official model card.
Accountability: Encouraging responsible AI research by making models available for study.
Mitigation Strategies: Sharing models to allow broader study and understanding of biases.
Input / Output
Input Format: plain-text prompts
Accepted Modalities: text
Output Format: generated text
Performance Tips: Use top-k sampling by setting `do_sample=True` for non-deterministic generation (see the example below).
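A minimal sketch of that tip with `model.generate` (the prompt and the `top_k` value of 50 are illustrative assumptions):

```python
# Non-deterministic generation via top-k sampling (do_sample=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Hello, I am a language model,", return_tensors="pt")
output_ids = model.generate(**inputs, do_sample=True, top_k=50, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```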
LLM Name: Opt 125M
Repository: https://huggingface.co/facebook/opt-125m
Model Size: 125M
Required VRAM: 0.3 GB
Updated: 2026-04-02
Maintainer: Facebook
Model Type: opt
Model Files: 0.3 GB
Supported Languages: en
Model Architecture: OPTForCausalLM
License: other
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.21.0.dev0
Beginning of Sentence Token: </s>
End of Sentence Token: </s>
Unk Token: </s>
Vocabulary Size: 50272
Torch Data Type: float16
Activation Function: relu
Errors: replace
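The configuration values listed above can also be checked programmatically; a small sketch using `AutoConfig` (the comments reflect the values in the list above):

```python
# Sketch: read the listed configuration fields straight from the hub config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/opt-125m")
print(config.model_type)               # "opt"
print(config.max_position_embeddings)  # 2048 (context length / model max length)
print(config.vocab_size)               # 50272
print(config.activation_function)      # "relu"
print(config.torch_dtype)              # float16 checkpoint dtype
```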