Csmpt7b by BUT-FIT


Csmpt7b is an open-source language model by BUT-FIT. Features: 6.7B-parameter LLM, required VRAM: 26.7 GB, license: apache-2.0, quantized versions available, LLM Explorer Score: 0.13.

Tags: Arxiv:2304.09871, Arxiv:2412.17933, cs, custom code, Dataset:but-fit/adult content ..., Dataset:but-fit/but-lcc, endpoints compatible, GGUF, gguf-my-repo, llama-cpp, MPT, q8, quantized, region:us, Safetensors, sharded, TensorFlow


Csmpt7b Parameters and Internals

Model Type: text generation
Use Cases:
  Areas: Czech language processing
  Applications: academic research, commercial applications
  Primary Use Cases: text generation for Czech-language content
Additional Notes: a vocabulary swap was used to improve knowledge transfer from English to Czech.
Supported Languages: Czech (native)
Training Details:
  Data Sources: BUT-FIT/BUT-LCC, BUT-FIT/adult_content_classifier_dataset
  Data Volume: 272 billion training tokens, with ~67 billion tokens specific to Czech
  Methodology: vocabulary swap method aligning English tokens to Czech equivalents
  Context Length: 2048
  Hardware Used: Karolina cluster
  Model Architecture: continuation of MPT-7B with adaptations for the Czech language
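The vocabulary-swap methodology listed above is detailed in the linked papers. As a rough, illustrative sketch only (the toy vocabularies, initialization scale, and helper function here are assumptions, not BUT-FIT's exact procedure), the core idea of carrying embeddings over for tokens shared between the English and Czech tokenizers can be expressed as:

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim = 8

# Toy vocabularies standing in for the original (English) and new (Czech)
# tokenizers; real vocabularies here would have ~64k entries.
english_vocab = {"the": 0, "dog": 1, "run": 2, "ing": 3}
czech_vocab = {"pes": 0, "run": 1, "ing": 2, "ova": 3}

english_embeddings = rng.normal(size=(len(english_vocab), embed_dim))

def vocab_swap(old_vocab, old_emb, new_vocab):
    """Copy embeddings for tokens present in both vocabularies;
    initialize the remaining rows with small random noise."""
    new_emb = rng.normal(scale=0.02, size=(len(new_vocab), old_emb.shape[1]))
    shared = 0
    for token, new_id in new_vocab.items():
        if token in old_vocab:
            new_emb[new_id] = old_emb[old_vocab[token]]
            shared += 1
    return new_emb, shared

czech_embeddings, n_shared = vocab_swap(english_vocab, english_embeddings, czech_vocab)
print(n_shared)  # → 2 ("run" and "ing" transfer directly)
```

Tokens that exist in both tokenizers start from their pretrained English embeddings, which is what makes continued pretraining on Czech data cheaper than training from scratch.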
Input Output:
  Input Format: Czech textual prompts
  Accepted Modalities: text
  Output Format: generated Czech text
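A minimal sketch of exercising this input/output contract with the Hugging Face `transformers` library (standard `from_pretrained` usage; the prompt, generation settings, and function name are illustrative, and `trust_remote_code=True` is assumed to be required since the page lists a "custom code" tag for the MPT architecture):

```python
MODEL_ID = "BUT-FIT/csmpt7b"  # repository listed on this page

def generate_czech(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate Czech text from a Czech prompt.

    Imports are deferred so the sketch can be read and tested without
    the model downloaded; loading the float32 weights needs ~26.7 GB.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, `generate_czech("Nejznámějším českým spisovatelem je")` would return a Czech continuation of the prompt.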
Release Notes:
  18/04/2024 (version N/A): release of training checkpoints
  06/05/2024 (version N/A): release of a manually annotated dataset for adult-content classification
LLM Name: Csmpt7b
Repository: https://huggingface.co/BUT-FIT/csmpt7b
Model Size: 6.7b
Required VRAM: 26.7 GB
Updated: 2026-04-19
Maintainer: BUT-FIT
Model Type: mpt
Model Files: 13.4 GB; 26.8 GB; 7.1 GB; sharded: 4.8 GB (shards 1-of-6 through 5-of-6), 2.7 GB (shard 6-of-6)
Supported Languages: cs
GGUF Quantization: yes
Quantization Type: gguf|q8
Model Architecture: MPTForCausalLM
License: apache-2.0
Model Max Length: 2048
Transformers Version: 4.37.0.dev0
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: [EOS]
Vocabulary Size: 64002
Torch Data Type: float32
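The required-VRAM figure is consistent with the parameter count and dtype listed above. A quick back-of-envelope check (weights only, ignoring activations and KV cache):

```python
# 6.7B parameters stored as float32, i.e. 4 bytes per parameter.
params = 6.7e9
bytes_per_param = 4  # torch dtype float32
weight_gb = params * bytes_per_param / 1e9
print(f"{weight_gb:.1f} GB")  # → 26.8 GB, matching the ~26.7 GB listed above
```

This also explains the 13.4 GB file (the same weights in a 2-byte format such as fp16) and why the q8 GGUF is roughly a quarter of the float32 size.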



Original data from HuggingFace, OpenCompass and various public git repos.