Cerebras GPT 111M by cerebras


Cerebras GPT 111M is an open-source language model by Cerebras. Features: 111M parameters, VRAM: 0.5 GB, License: apache-2.0, HF Score: 27.8, LLM Explorer Score: 0.13, ARC: 20.2, HellaSwag: 26.7, MMLU: 25.5, TruthfulQA: 46.3, WinoGrande: 47.8.
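The 0.5 GB VRAM figure is consistent with back-of-the-envelope arithmetic: 111M parameters stored in fp32 (4 bytes each) occupy roughly 0.41 GiB before activations and runtime overhead. A minimal sketch (the helper name is ours, not from the model card):

```python
def weight_memory_gib(n_params: int, bytes_per_param: int = 4) -> float:
    """Estimate memory for the model weights alone (no activations or KV cache)."""
    return n_params * bytes_per_param / 1024**3

# 111M parameters: fp32 ≈ 0.41 GiB (matching the ~0.5 GB figure), fp16 ≈ 0.21 GiB
fp32 = weight_memory_gib(111_000_000)
fp16 = weight_memory_gib(111_000_000, bytes_per_param=2)
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB")
```

The listed 0.5 GB includes some overhead beyond the raw weights, so the estimate is a lower bound.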

  Arxiv:2101.00027   Arxiv:2203.15556   Arxiv:2304.03208   Dataset: The Pile   English   Endpoints compatible   GPT-2   PyTorch   Region: US

Cerebras GPT 111M Benchmarks

Cerebras GPT 111M (cerebras/Cerebras-GPT-111M)

Cerebras GPT 111M Parameters and Internals

Model Type 
Transformer-based Language Model, Text Generation, Causal LM
Use Cases 
Areas:
Research, NLP Applications, Ethics and Alignment Research
Applications:
Foundation model for NLP research, Reference implementations
Primary Use Cases:
Further research into large language models
Limitations:
Not suitable for machine translation tasks, Not tuned for human-facing dialog applications
Considerations:
Further safety testing and mitigations should be applied before production use.
Additional Notes 
Trained and evaluated following the approaches described in the Cerebras-GPT paper.
Training Details 
Data Sources:
The Pile
Data Volume:
371B tokens
Methodology:
GPT-3 style architecture with full attention and weight streaming technology
Context Length:
2048
Hardware Used:
16 CS-2 wafer scale systems
Model Architecture:
GPT-3 style
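"Full attention" in the training details means dense causal self-attention: within the 2048-token context, every position attends to itself and all earlier positions (no sparse or banded patterns). A minimal sketch of such a mask in plain Python:

```python
def causal_mask(seq_len: int) -> list[list[int]]:
    """Dense (full) causal attention mask: position i may attend to every j <= i."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]

# For a 4-token sequence, row i contains i+1 ones: each token sees itself
# and everything before it, and nothing after it.
for row in causal_mask(4):
    print(row)
```

At the model's full context length, the last position attends to all 2048 tokens.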
Responsible AI Considerations 
Fairness:
The Pile dataset used for training has been analyzed for various biases and from ethical standpoints.
Mitigation Strategies:
Mitigations applied are limited to standard Pile dataset pre-processing.
Input Output 
Input Format:
Tokenized text input
Accepted Modalities:
Text
Output Format:
Generated Text
LLM Name: Cerebras GPT 111M
Repository 🤗: https://huggingface.co/cerebras/Cerebras-GPT-111M
Model Size: 111M
Required VRAM: 0.5 GB
Updated: 2025-10-10
Maintainer: cerebras
Model Type: gpt2
Model Files: 0.5 GB
Supported Languages: en
Model Architecture: AutoModel
License: apache-2.0
Vocabulary Size: 50257
Activation Function: gelu
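Since the card lists a GPT-2 model type and Hugging Face endpoint compatibility, the model should load with the standard `transformers` causal-LM classes. A hedged sketch (assumes `transformers` and `torch` are installed; the function name and generation settings are our illustration, not card-specified defaults):

```python
MODEL_ID = "cerebras/Cerebras-GPT-111M"  # repository listed on the card above

def generate_sample(prompt: str = "Generative AI is", max_new_tokens: int = 20) -> str:
    """Download the ~0.5 GB weights on first use and greedily generate a continuation."""
    # Imports kept inside the function so defining it has no heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer  # pip install transformers torch

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate_sample()` fetches the weights from the Hub; per the card's limitations, expect raw base-model completions rather than dialog-tuned responses.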


Original data from HuggingFace, OpenCompass and various public git repos.