Gemma 2B 10M by mustafaaljadery


Tags: arXiv:1901.02860 (Transformer-XL) · arXiv:2404.07143 (Infini-attention) · Endpoints compatible · Region: us · Safetensors · Sharded · Tensorflow

Gemma 2B 10M Benchmarks

nn.n% — how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Gemma 2B 10M (mustafaaljadery/gemma-2B-10M)

Gemma 2B 10M Parameters and Internals

Model Type 
causal language model
Additional Notes 
This is a very early checkpoint of the model, trained for only 200 steps. The implementation features native inference optimized for CUDA.
Training Details 
Methodology:
Our approach splits attention into local attention blocks, as outlined in the Infini-attention paper. We then apply recurrence across these local blocks to arrive at global attention over the full 10M-token context. Much of the inspiration for this design comes from the Transformer-XL paper. (A minimal sketch of the idea follows below.)
Context Length:
10,000,000 tokens
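
The recurrence step can be pictured with a short sketch. The following is a minimal, self-contained Python illustration of segment-level recurrence over local attention blocks in the spirit of Infini-attention and Transformer-XL. The tensor shapes, the (elu+1) feature map, and the memory update rule are simplifying assumptions for illustration, not the repository's actual implementation.

import torch
import torch.nn.functional as F

def local_attention(q, k, v):
    # Standard scaled dot-product attention within one local block.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

def recurrent_global_attention(q, k, v, segment_len=2048):
    # q, k, v: (seq_len, d). Process the sequence in local blocks,
    # carrying a compressive memory (linear-attention style) between
    # blocks so information propagates across the whole context.
    d = q.size(-1)
    memory = torch.zeros(d, d)   # accumulated key-value associations
    norm = torch.zeros(d)        # running normalizer for memory reads
    outputs = []
    for start in range(0, q.size(0), segment_len):
        qs, ks, vs = (t[start:start + segment_len] for t in (q, k, v))
        local = local_attention(qs, ks, vs)
        # Read from memory with (elu+1) feature-mapped queries.
        sigma_q = F.elu(qs) + 1.0
        mem_out = (sigma_q @ memory) / (sigma_q @ norm + 1e-6).unsqueeze(-1)
        outputs.append(local + mem_out)  # combine local and recurrent paths
        # Write this block's keys/values into memory for later blocks.
        sigma_k = F.elu(ks) + 1.0
        memory = memory + sigma_k.transpose(0, 1) @ vs
        norm = norm + sigma_k.sum(dim=0)
    return torch.cat(outputs, dim=0)

if __name__ == "__main__":
    q, k, v = (torch.randn(8192, 64) for _ in range(3))
    out = recurrent_global_attention(q, k, v)
    print(out.shape)  # torch.Size([8192, 64])

Because the memory has a fixed (d, d) size, each block costs the same regardless of how far into the sequence it sits, which is what makes very long contexts tractable.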
Input Output 
Input Format:
The prompt is set directly in the repository's main.py (see the loading sketch after the details table below).
Accepted Modalities:
text
Output Format:
Text
LLM Name: Gemma 2B 10M
Repository: 🤗 https://huggingface.co/mustafaaljadery/gemma-2B-10M
Model Size: 2b
Required VRAM: 10 GB
Updated: 2025-08-18
Maintainer: mustafaaljadery
Model Type: gemma
Model Files: 4.9 GB (1-of-3), 5.0 GB (2-of-3), 0.1 GB (3-of-3)
Model Architecture: GemmaForCausalLM
License: mit
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.40.0.dev0
Tokenizer Class: GemmaTokenizer
Padding Token: <pad>
Vocabulary Size: 256000
Torch Data Type: float32
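
Given the details above, here is a minimal loading sketch using the standard Hugging Face Transformers API. Note that plain Transformers loading exercises only the standard 8192-token Gemma path recorded in the config; the 10M-token recurrence lives in the repository's own main.py. The prompt string and generation settings below are illustrative, not taken from the repo.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "mustafaaljadery/gemma-2B-10M"
tokenizer = AutoTokenizer.from_pretrained(repo)  # resolves to GemmaTokenizer
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float32,  # matches the float32 weights (~10 GB VRAM)
)

prompt = "Summarize the plot of Hamlet in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))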

Best Alternatives to Gemma 2B 10M

Best Alternatives | Context / RAM | Downloads | Likes
Gemma 1.1 2B It | 8K / 5.1 GB | 189555 | 165
Pandas Tutor Gemma 2B | 8K / 5.1 GB | 56 | 1
Caca Tinny 2B V3 | 8K / 5.1 GB | 30 | 1
Codegemma 2B | 8K / 5.1 GB | 9852 | 84
Gemma Qlora Customer Support | 8K / 5.1 GB | 7 | 0
Gemma Ko 1.1 2B It | 8K / 5.1 GB | 1039 | 1
... 2B Finetuned Sft Navarasa 2.0 | 8K / 10 GB | 1028 | 26
EMO 2B | 8K / 5.1 GB | 2049 | 2
Gemma 2B Orpo | 8K / 5.1 GB | 480 | 28
Octopus V2 | 8K / 5.1 GB | 1387 | 885
Note: a green score (e.g. "73.2") means the model outperforms mustafaaljadery/gemma-2B-10M.



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124