Mpt 7B 8K by mosaicml


  Arxiv:1909.08053   Arxiv:2010.04245   Arxiv:2108.12409   Arxiv:2205.14135   Arxiv:2302.06675   Arxiv:2302.13971   Autotrain compatible   Composer   Custom code   Dataset:allenai/s2orc   Dataset:bigcode/the-stack   Dataset:c4   Dataset:mc4 Dataset:togethercomputer/redpa...   Ext 8k   Llm-foundry   Mosaicml   Mpt   Pytorch   Region:us   Sharded   Streamingdatasets
Model Card on HF 🤗: https://huggingface.co/mosaicml/mpt-7b-8k

Mpt 7B 8K Benchmarks


Mpt 7B 8K Parameters and Internals

Model Type 
decoder-style transformer, LLM (text-only)
Use Cases 
Areas:
research, commercial applications
Applications:
text generation, long-form instruction following, dialogue generation
Primary Use Cases:
finetuning for specific applications
Limitations:
not intended for deployment without finetuning, can produce factually incorrect output
Considerations:
Efforts were made to clean the pretraining data; however, outputs may still be offensive or biased.
Additional Notes 
This model builds on MPT-7B, adding longer sequence handling and significant efficiency improvements.
Supported Languages 
English (proficient)
Training Details 
Data Sources:
mc4, c4, togethercomputer/RedPajama-Data-1T, bigcode/the-stack, allenai/s2orc
Data Volume:
1.5T tokens
Methodology:
MPT-7B-8k uses a modified transformer architecture optimized for efficient training and inference, with ALiBi (Attention with Linear Biases) in place of positional embeddings so the model can handle long inputs (see the load-time sketch below).
Context Length:
8192
Training Time:
9.5 days
Hardware Used:
440 A100-40GB GPUs
Model Architecture:
Decoder-only transformer with modifications such as FlashAttention, ALiBi, and the elimination of positional embeddings.
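
Because ALiBi biases attention scores instead of relying on learned positional embeddings, the usable context window can in principle be stretched beyond the 8192-token training length by overriding the sequence limit at load time. A minimal sketch, assuming the max_seq_len field exposed by the custom MPT config (as described in the MPT model cards); the quality of extrapolation past 8k is not guaranteed.

import transformers

name = "mosaicml/mpt-7b-8k"

# Override the maximum sequence length before loading; ALiBi lets the
# model attend over positions it never saw during training (assumed
# `max_seq_len` field from the custom MPT config).
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 16384

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    trust_remote_code=True,
)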
Safety Evaluation 
Ethical Considerations:
MPT-7B-8k can produce factually incorrect, lewd, biased or offensive outputs. It should not be used for human-facing interactions without further guardrails and user consent.
Responsible Ai Considerations 
Fairness:
Model may have biases inherited from training data.
Transparency:
Pretraining data was openly available, preprocessed to remove unsuitable content.
Accountability:
Responsibility of MosaicML.
Mitigation Strategies:
Guardrails recommended before deployment.
Input Output 
Input Format:
Text sequences, up to 8k tokens
Accepted Modalities:
text
Output Format:
Generated text
Performance Tips:
Use an optimized attention implementation such as FlashAttention (Triton kernels) and run the model in bfloat16 precision on GPUs.
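
As a concrete illustration of these tips, the sketch below loads the model with the Triton FlashAttention kernels and bfloat16 weights. The attn_config['attn_impl'] and init_device fields follow the MPT model cards and should be treated as assumptions; the Triton path also needs a CUDA GPU plus the triton/flash-attn dependencies from llm-foundry.

import torch
import transformers

name = "mosaicml/mpt-7b-8k"

config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_config["attn_impl"] = "triton"  # FlashAttention-style Triton kernels (assumed field)
config.init_device = "cuda:0"               # materialize weights directly on the GPU

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,  # bfloat16 precision, as recommended above
    trust_remote_code=True,
)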
Release Notes 
Version:
1.0.0
Date:
2023-07-18
Notes:
Initial release of MPT-7B-8k.
LLM Name: Mpt 7B 8K
Repository 🤗: https://huggingface.co/mosaicml/mpt-7b-8k
Model Size: 7b
Required VRAM: 13.3 GB
Updated: 2025-06-09
Maintainer: mosaicml
Model Type: mpt
Model Files: 9.9 GB (1 of 2), 3.4 GB (2 of 2)
Context Length: 8k
Model Architecture: MPTForCausalLM
License: apache-2.0
Model Max Length: 8192
Transformers Version: 4.30.2
Tokenizer Class: GPTNeoXTokenizer
Vocabulary Size: 50432
Torch Data Type: bfloat16
Mpt 7B 8K (mosaicml/mpt-7b-8k)
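
The tokenizer class and torch dtype listed above translate directly into a standard Hugging Face generation call. Below is a minimal sketch assuming the usual transformers AutoTokenizer / AutoModelForCausalLM entry points and a single CUDA GPU; the prompt and sampling settings are illustrative only.

import torch
import transformers

name = "mosaicml/mpt-7b-8k"

tokenizer = transformers.AutoTokenizer.from_pretrained(name)  # GPT-NeoX tokenizer, 50432-token vocab
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

inputs = tokenizer("MosaicML is", return_tensors="pt").to("cuda")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))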

Quantized Models of the Mpt 7B 8K

Model | Likes | Downloads | VRAM
Mpt 7B Q8 | 1 | 12 | 6 GB
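
An 8-bit load can also be approximated on the fly with bitsandbytes instead of downloading a pre-quantized artifact. This is a sketch under the assumption that bitsandbytes and accelerate are installed; it is not the recipe behind the Q8 file listed above.

import transformers

# Sketch: load Mpt 7B 8K with 8-bit weights via bitsandbytes,
# roughly halving VRAM versus the 13.3 GB bfloat16 checkpoint.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-8k",
    load_in_8bit=True,       # requires bitsandbytes
    device_map="auto",       # requires accelerate
    trust_remote_code=True,
)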

Best Alternatives to Mpt 7B 8K

Best Alternatives | Context / RAM | Downloads | Likes
Mpt 7B Chat | 0K / 13.3 GB | 84546 | 514
Mpt 7B | 0K / 13.3 GB | 30363 | 1170
Mpt 7B Instruct | 0K / 13.3 GB | 8695 | 470
Mpt 7B Storywriter | 0K / 13.3 GB | 1829 | 836
Mpt 7B Int8 Ov | 0K / 0 GB | 20 | 0
Shears Mpt 7B 50 Base | 0K / 13.3 GB | 113 | 2
Mpt 7B | 0K / 26.5 GB | 3188 | 1
Sea Lion 7B Instruct | 0K / 15 GB | 208 | 23
SEA LION V1 7B IT | 0K / 15 GB | 111 | 23
Mpt 7B 8K Instruct | 0K / 13.3 GB | 615 | 26


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124