Mpt 1B Redpajama 200B by anas-awadalla


Tags: arXiv:2108.12409, arXiv:2205.14135, arXiv:2302.13971, Autotrain compatible, Custom code, Dataset:togethercomputer/redpa..., Mosaic gpt, PyTorch, Region:us


Mpt 1B Redpajama 200B Parameters and Internals

Model Type 
decoder-only transformer
Additional Notes 
This model requires `trust_remote_code=True` because it uses a custom model architecture, `MosaicGPT`, that is not part of the `transformers` package. The architecture includes training-efficiency features such as FlashAttention and ALiBi.
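As an illustration of the note above, here is a minimal loading sketch. It assumes the `transformers` and `torch` packages are installed and that the repository id listed for this model is reachable; the helper names `load_kwargs` and `load_model` are hypothetical and exist only for this example.

```python
MODEL_ID = "anas-awadalla/mpt-1b-redpajama-200b"  # repository id listed for this model


def load_kwargs():
    """Keyword arguments needed because MosaicGPT is custom code in the model repo."""
    return {"trust_remote_code": True}


def load_model():
    """Download and instantiate the model (hypothetical helper, needs network access)."""
    import transformers  # imported lazily so the module can be inspected without it

    return transformers.AutoModelForCausalLM.from_pretrained(MODEL_ID, **load_kwargs())
```

Calling `load_model()` executes Python code shipped in the model repository, so it is worth reviewing that code before enabling `trust_remote_code`.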
Training Details 
Data Sources:
RedPajama Common Crawl, C4, RedPajama GitHub, RedPajama Wikipedia, RedPajama Books, RedPajama Arxiv, RedPajama StackExchange
Data Volume:
200B tokens
Context Length:
2048
Training Time:
~ half a day
Hardware Used:
440 A100 GPUs (40 GB each)
Model Architecture:
Modified decoder-only transformer with 24 layers, 16 attention heads, and width 2048. It uses ALiBi and QK LayerNorm, and has no biases.
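To make the ALiBi component concrete, the sketch below computes the per-head slopes and the causal attention bias as described in the ALiBi paper (arXiv:2108.12409). It is not taken from the MosaicGPT source, whose implementation may differ in detail; the function names are hypothetical, and the closed-form slope formula assumes the head count is a power of two, as it is for 16 heads.

```python
def alibi_slopes(n_heads):
    """Geometric slope sequence from the ALiBi paper (n_heads assumed a power of two)."""
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (h + 1) for h in range(n_heads)]


def alibi_bias(n_heads, seq_len):
    """bias[h][i][j] = -slope_h * (i - j) for j <= i; future positions are masked."""
    slopes = alibi_slopes(n_heads)
    return [
        [
            [-m * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]


slopes = alibi_slopes(16)   # 16 attention heads, as listed above
bias = alibi_bias(16, 4)    # tiny sequence length for illustration
```

The bias is added to the attention scores before the softmax, so more distant keys are penalized linearly and no learned positional embeddings are needed.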
Input Output 
Performance Tips:
Use `attn_impl='triton'` with `bfloat16` for optimized performance.
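The tip above could be applied roughly as follows. This is a sketch under assumptions: it presumes a CUDA GPU, the `torch`, `transformers`, and `triton` packages, and that the model's custom code accepts `attn_impl` as a keyword argument to `from_pretrained`; the helper names are hypothetical.

```python
MODEL_ID = "anas-awadalla/mpt-1b-redpajama-200b"  # repository id listed for this model


def triton_load_kwargs():
    """Loading arguments that select the Triton attention implementation."""
    return {"trust_remote_code": True, "attn_impl": "triton"}


def load_model_bf16(device="cuda:0"):
    """Load the model, then move it to the GPU in bfloat16 (hypothetical helper)."""
    import torch
    import transformers

    model = transformers.AutoModelForCausalLM.from_pretrained(
        MODEL_ID, **triton_load_kwargs()
    )
    model.to(device=device, dtype=torch.bfloat16)
    return model
```

Casting to `bfloat16` also halves the weight memory relative to the `float32` checkpoint.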
LLM Name: Mpt 1B Redpajama 200B
Repository: https://huggingface.co/anas-awadalla/mpt-1b-redpajama-200b
Model Size: 1b
Required VRAM: 5.2 GB
Updated: 2025-09-15
Maintainer: anas-awadalla
Model Type: mosaic_gpt
Model Files: 5.2 GB
Model Architecture: MosaicGPT
License: apache-2.0
Model Max Length: 2048
Transformers Version: 4.27.4
Tokenizer Class: GPTNeoXTokenizer
Vocabulary Size: 50432
Torch Data Type: float32
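The 5.2 GB figure can be sanity-checked from the listed architecture. The estimate below is a rough sketch, not the model's actual parameter accounting: it assumes an MLP expansion ratio of 4 and input/output embeddings that share weights (neither is stated here), and it ignores the small LayerNorm parameters. The function name is hypothetical.

```python
def estimate_params(n_layers, d_model, vocab_size, mlp_ratio=4):
    """Approximate parameter count for a bias-free decoder-only transformer."""
    attn = 4 * d_model * d_model              # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d_model * d_model   # up and down projections (ratio assumed)
    embed = vocab_size * d_model              # token embeddings, assumed tied to the output head
    return n_layers * (attn + mlp) + embed


params = estimate_params(n_layers=24, d_model=2048, vocab_size=50432)
bytes_fp32 = params * 4  # float32 stores 4 bytes per parameter
print(f"{params / 1e9:.2f}B parameters, {bytes_fp32 / 1e9:.2f} GB in float32")
# prints "1.31B parameters, 5.24 GB in float32"
```

Under these assumptions the estimate matches the listed file size and VRAM requirement for `float32` weights. Activations and any generation cache need additional memory on top of this.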

Best Alternatives to Mpt 1B Redpajama 200B

Best Alternatives | Context / RAM | Downloads, Likes
Mpt 1B Redpajama 200B | 0K / 5.2 GB | 25292
Mpt 1B Redpajama 200B Dolly | 0K / 5.2 GB | 13177

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124