Mega Ar 126M 4K by BEE-spoke-data

Mega Ar 126M 4K is an open-source language model by BEE-spoke-data. Features: 126M parameters, VRAM: 0.5 GB, License: apache-2.0, LLM Explorer Score: 0.12.

Tags: Arxiv:2209.10655, Dataset:bee-spoke-data/knowled..., Dataset:bee-spoke-data/wikiped..., Dataset:jeankaddour/minipile, En, Endpoints compatible, Mega, Region:us, Safetensors

Mega Ar 126M 4K Parameters and Internals

Model Type 
text generation
Additional Notes 
This model is notable because, although it is a language model, it is not technically a transformer: it uses the Mega architecture (Moving Average Equipped Gated Attention, arXiv:2209.10655).
Supported Languages 
en (High proficiency)
Training Details 
Data Sources:
JeanKaddour/minipile, BEE-spoke-data/wikipedia-20230901.en-deduped, BEE-spoke-data/knowledge-inoc-concat-v1
Methodology:
train-from-scratch
Context Length:
4096
Model Architecture:
mega
Input Output 
Input Format:
text
Accepted Modalities:
text
Output Format:
text
Performance Tips:
Given the model's small size and architecture, it is best to use its 4096-token context for longer inputs ('see more') rather than for longer outputs ('generate more').
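The tip above can be sketched as a token budget: reserve a small number of new tokens and give the rest of the 4096-token window to the prompt. This is a minimal sketch, not the maintainer's method. The helper names `fit_prompt` and `generate`, the default of 64 new tokens, and the choice to drop the oldest tokens first are assumptions for illustration. The `generate` function is untested here and assumes `torch`, network access, and a `transformers` release that still ships the Mega model (the page lists 4.36.2).

```python
CONTEXT_LENGTH = 4096  # context length listed on this page


def fit_prompt(token_ids, max_new_tokens=64, context_length=CONTEXT_LENGTH):
    """Keep the most recent prompt tokens so prompt + output fits the window.

    Hypothetical helper: truncating from the left is one possible policy.
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens  # tokens available for the prompt
    return list(token_ids)[-budget:]


def generate(prompt, max_new_tokens=64, repo="BEE-spoke-data/mega-ar-126m-4k"):
    """Long input, short output. Assumes torch and transformers are installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)

    # Tokenize, then trim the prompt to the budget left after reserving output.
    ids = fit_prompt(tokenizer(prompt)["input_ids"], max_new_tokens)
    input_ids = torch.tensor([ids])

    with torch.no_grad():
        out = model.generate(input_ids, max_new_tokens=max_new_tokens)

    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][len(ids):], skip_special_tokens=True)
```

With these defaults, up to 4032 tokens of the window go to input context and 64 to generated text, which matches the 'see more' emphasis of the tip.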
LLM Name: Mega Ar 126M 4K
Repository: https://huggingface.co/BEE-spoke-data/mega-ar-126m-4k
Model Size: 126M parameters
Required VRAM: 0.5 GB
Updated: 2026-05-21
Maintainer: BEE-spoke-data
Model Type: mega
Model Files: 0.5 GB
Supported Languages: en
Model Architecture: MegaForCausalLM
License: apache-2.0
Transformers Version: 4.36.2
Tokenizer Class: GPTNeoXTokenizer
Vocabulary Size: 50304
Torch Data Type: float32
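
The 0.5 GB figure is consistent with a simple estimate from the listed model size and data type: about 126 million parameters at 4 bytes each in float32. A rough sketch follows, assuming the rounded parameter count of 126M and counting weights only (activations and any generation cache are excluded). The function name `weight_memory_gb` is hypothetical.

```python
# Bytes per parameter for common torch dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(n_params, dtype="float32"):
    """Approximate size of the model weights in decimal gigabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# 126M is the rounded size listed on this page, not an exact count.
print(round(weight_memory_gb(126_000_000), 3))             # 0.504
print(round(weight_memory_gb(126_000_000, "float16"), 3))  # 0.252
```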

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20260328a