Llama 3.1 Swallow 8B Instruct V0.1 is an open-source language model by tokyotech-llm. Features: 8b LLM, VRAM: 16.1GB, Context: 8K, License: llama3.1|gemma, Instruction-Based, LLM Explorer Score: 0.16.
Llama 3.1 Swallow 8B Instruct V0.1 Parameters and Internals
Model Type
text-generation
Additional Notes
Swallow models enhanced Japanese capabilities while retaining English capabilities; Developed under various projects and supports; Continual pre-training and instruction-tuning based fine-tuning involved.
Supported Languages
Japanese (enhanced), English (retained)
Training Details
Data Sources:
Large Japanese web corpus, Japanese and English Wikipedia, Mathematical and coding contents
Data Volume:
200 billion tokens
Methodology:
Continual pre-training, Supervised fine-tuning on synthetic
data
Hardware Used:
Megatron-LM
Model Architecture:
Please refer to Llama 3.1 MODEL_CARD for details on the model architecture.
Note: green Score (e.g. "73.2") means that the model is better than tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1.
Rank the Llama 3.1 Swallow 8B Instruct V0.1 Capabilities
๐ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐
Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation
What open-source LLMs or SLMs are you in search of? 52721 in total.