Mistral 1L Tiny by nilq


Tags: Arxiv:2305.07759, Autotrain compatible, Dataset:roneneldan/tinystories, Endpoints compatible, Generated from trainer, Mistral, Model-index, Region:us, Safetensors
Model Card on HF: https://huggingface.co/nilq/mistral-1L-tiny


Mistral 1L Tiny Parameters and Internals

Model Type 
Causal Language Modeling, text-generation
Use Cases 
Primary Use Cases:
Analysis of feature dynamics and emergence in real-world language models.
Additional Notes 
Trained on the roneneldan/TinyStories dataset. Consistent English text generation observed.
Training Details 
Data Sources:
roneneldan/TinyStories
Methodology:
Inspired by the 21M-parameter one-layer GPT-Neo of the TinyStories paper (arXiv:2305.07759). Trained to reproduce its results and to collect high-frequency checkpoints for further analysis.
Training Time:
~2 hours on a single H100
Hardware Used:
single H100
Model Architecture:
Single-layer Mistral model with hidden size 512 and MLP intermediate size 1024.
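The architecture above can be checked against the listed model size with a rough parameter count. This is a minimal sketch, not the author's code: the listing does not give the number of attention heads or key/value heads, so the key/value projection width (`kv_dim=256`) is a hypothetical value chosen because it reproduces the listed 35.1M. The sketch also assumes untied input and output embeddings, bias-free linear layers, and a query projection of width equal to the hidden size.

```python
def mistral_param_count(vocab_size, hidden_size, intermediate_size,
                        num_layers, kv_dim, tied_embeddings=False):
    """Approximate parameter count for a Mistral-style decoder (no biases)."""
    embed = vocab_size * hidden_size
    # Assumption: the output head is a separate matrix unless embeddings are tied.
    lm_head = 0 if tied_embeddings else vocab_size * hidden_size
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # Gated MLP: gate_proj, up_proj and down_proj.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer, plus one final norm.
    norms = 2 * hidden_size
    final_norm = hidden_size
    return embed + lm_head + num_layers * (attn + mlp + norms) + final_norm


# kv_dim=256 is an assumed value, not taken from the model card.
total = mistral_param_count(32000, 512, 1024, num_layers=1, kv_dim=256)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the two embedding matrices account for about 32.8M of the total, so the single transformer layer itself holds only around 2.4M parameters.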
LLM Name: Mistral 1L Tiny
Repository: https://huggingface.co/nilq/mistral-1L-tiny
Model Size: 35.1M
Required VRAM: 0.1 GB
Updated: 2025-06-09
Maintainer: nilq
Model Type: mistral
Model Files: 0.1 GB, 0.0 GB
Model Architecture: MistralForCausalLM
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.38.1
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 32000
Torch Data Type: float32
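The listed VRAM figure follows from the model size and data type. The sketch below is a back-of-the-envelope estimate that counts weights only (no activations or KV cache) and assumes decimal gigabytes; the function name and the dtype table are illustrative, not part of any library.

```python
# Bytes used per parameter for a few common torch dtypes.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype="float32"):
    """Memory needed to hold the weights alone, in decimal GB."""
    return num_params * BYTES_PER_DTYPE[dtype] / 1e9


size_gb = weight_memory_gb(35.1e6, "float32")
print(f"{size_gb:.2f} GB")  # about 0.14 GB, listed as 0.1 GB after rounding
```

Loading the weights in a 16-bit type would roughly halve this figure.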
Mistral 1L Tiny (nilq/mistral-1L-tiny)
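The repository id can be used to load the model for text generation with the Hugging Face `transformers` library. The following is a minimal sketch, assuming `transformers` and `torch` are installed and the repository is reachable; the helper name `generate_story`, the prompt, and the sampling settings are illustrative choices, not taken from the model card.

```python
MODEL_ID = "nilq/mistral-1L-tiny"


def generate_story(prompt, max_new_tokens=64):
    """Download the checkpoint (on first use) and continue the prompt."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    # Sampling is an assumed setting; greedy decoding would also work.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                do_sample=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_story("Once upon a time")` should return a short continuation in the style of the TinyStories training data, and the whole model fits comfortably on a CPU.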

Best Alternatives to Mistral 1L Tiny

Best Alternatives   Context / RAM   Downloads   Likes
...Mistral 1L Tiny TinyStories Ft   2K / 0.1 GB   211


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124