Mixsmol 4x400M V0.1 Epoch3 is an open-source mixture-of-experts (MoE) language model by vilm. Features: 1.8B parameters, VRAM: 3.5GB, Context: 4K, License: apache-2.0, LLM Explorer Score: 0.12.
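The listing does not say how the 3.5GB VRAM figure was derived. As a rough sanity check, the sketch below estimates the memory needed for the weights alone at a few common precisions, assuming 1.8 billion parameters and ignoring activations, KV cache, and framework overhead. The helper name `weight_memory_gb` is hypothetical and used only for illustration.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate storage for the weights only, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Parameter count taken from the listing; everything else is an assumption.
PARAMS = 1.8e9

for precision, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{precision}: ~{weight_memory_gb(PARAMS, nbytes):.1f} GB")
```

Under these assumptions, 16-bit weights come to about 3.6 GB, which is in the same range as the listed 3.5GB, so the figure is plausibly a half-precision estimate.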
Mixsmol 4x400M V0.1 Epoch3 Parameters and Internals
Model Type
multimodal, crosslingual
Additional Notes
Note that this is an experimental model run focusing on data mixing.
Training Details
Data Sources:
Synthetic Textbooks, RefinedWeb, RedPajama-v2, MathPile, ThePile, GoodWiki, The Stack Smol XL, The Vault: train_small split, Instruction Pretraining
Data Volume:
50B tokens
Methodology:
An experiment in data mixing. It targets reasoning capabilities through synthetic textbook data, and crosslingual understanding through pretraining on machine translation and multilingual tasks.
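Data mixing generally means drawing training examples from several sources in chosen proportions. The sketch below illustrates the idea with weighted sampling over a few of the sources listed above. The weights are hypothetical and chosen only for illustration; the source does not publish the actual mixture, and `sample_sources` is not part of any vilm tooling.

```python
import random
from collections import Counter

# Hypothetical mixture weights (NOT the proportions used for this model).
MIXTURE = {
    "Synthetic Textbooks": 0.4,
    "RefinedWeb": 0.3,
    "The Stack Smol XL": 0.2,
    "MathPile": 0.1,
}


def sample_sources(mixture: dict, n: int, seed: int = 0) -> list:
    """Pick the source of each of n training examples according to the weights."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


NUM_SAMPLES = 10_000
counts = Counter(sample_sources(MIXTURE, NUM_SAMPLES))
for name, count in counts.most_common():
    print(f"{name}: {count / NUM_SAMPLES:.1%}")
```

With enough samples, the observed shares approach the configured weights. Changing the weights is the knob such an experiment would vary, for example to give synthetic textbook data more influence.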
Release Notes
Version:
Epoch 3
Notes:
This version was trained on 50B tokens to test reasoning capabilities and crosslingual understanding. Future runs will use more data and compute to maximize capabilities.