ALMA 13B Pretrain is an open-source language model by haoranxu. Features: 13B LLM, VRAM: 52.1GB, Context: 4K, License: MIT, HF Score: 51.7, LLM Explorer Score: 0.16, ARC: 56.9, HellaSwag: 80.2, MMLU: 50.3, TruthfulQA: 37.4, WinoGrande: 76.4, GSM8K: 8.9.
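The headline numbers above can be sanity-checked with a little arithmetic. The sketch below rests on two assumptions that this page does not state: that the HF Score is the unweighted mean of the six listed benchmarks, and that the VRAM figure corresponds to storing roughly 13 billion parameters at 4 bytes each (fp32). The parameter count used here is a round approximation, not an exact value.

```python
# Benchmark scores exactly as listed in the summary line.
scores = {
    "ARC": 56.9,
    "HellaSwag": 80.2,
    "MMLU": 50.3,
    "TruthfulQA": 37.4,
    "WinoGrande": 76.4,
    "GSM8K": 8.9,
}


def mean_score(benchmarks):
    """Unweighted mean of the benchmark scores (assumed HF Score formula)."""
    return sum(benchmarks.values()) / len(benchmarks)


def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


APPROX_PARAMS = 13e9  # assumption: "13B" taken as a round 13 billion

hf_score = round(mean_score(scores), 1)
fp32_gb = weight_memory_gb(APPROX_PARAMS, 4)  # 4 bytes per fp32 weight
fp16_gb = weight_memory_gb(APPROX_PARAMS, 2)  # 2 bytes per fp16 weight

print(f"mean benchmark score: {hf_score}")
print(f"fp32 weights: ~{fp32_gb:.1f} GB, fp16 weights: ~{fp16_gb:.1f} GB")
```

Under these assumptions the mean rounds to 51.7, which matches the listed HF Score, and the fp32 estimate of about 52 GB is close to the listed 52.1GB. Loading in half precision would need roughly half that for the weights, before activations and the KV cache are counted.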
ALMA 13B Pretrain Benchmarks
ALMA 13B Pretrain Parameters and Internals
Model Type
Use Cases
Areas:
Applications:
Primary Use Cases: Translating Chinese to English
Considerations: Intended for use with LoRA models.
Additional Notes: ALMA-R incorporates Contrastive Preference Optimization (CPO) for improved performance.
Supported Languages: Monolingual Data (multilingual), Parallel Data (high-quality parallel data)
Training Details
Data Sources: 20B monolingual tokens, high-quality parallel data, triplet preference data
Data Volume: 20B tokens for 7B model and 12B tokens for 13B model
Methodology: Two-step fine-tuning, first on monolingual data and then on parallel data; further optimized with Contrastive Preference Optimization (CPO)
Context Length:
Model Architecture: LLaMA based, fine-tuned with LoRA
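The CPO step mentioned under Methodology can be illustrated with a small numeric sketch. It assumes the commonly cited form of the objective: a preference term that pushes the log-probability of the preferred translation above that of the dispreferred one, plus a negative log-likelihood term on the preferred translation. The function name `cpo_loss`, the default `beta=0.1`, and the use of plain sequence log-probabilities are illustrative assumptions, not details taken from this page or from the authors' training code.

```python
import math


def neg_log_sigmoid(z):
    """Numerically stable -log(sigmoid(z)), i.e. softplus(-z)."""
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def cpo_loss(logp_preferred, logp_dispreferred, beta=0.1):
    """Sketch of a CPO-style loss for one preference pair.

    logp_preferred / logp_dispreferred are the model's sequence
    log-probabilities of the better and worse translation.
    beta scales the preference margin (hypothetical default).
    """
    # Preference term: small when the preferred output is more likely.
    prefer = neg_log_sigmoid(beta * (logp_preferred - logp_dispreferred))
    # NLL term: keeps probability mass on the preferred output itself.
    nll = -logp_preferred
    return prefer + nll


# The loss shrinks as the dispreferred translation becomes less likely.
print(cpo_loss(-12.0, -12.0))
print(cpo_loss(-12.0, -20.0))
```

Unlike objectives that compare against a frozen reference model, this form needs only the policy's own log-probabilities, which is why the sketch takes two numbers per pair.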
Input Output
Input Format:
Accepted Modalities:
Output Format:
Performance Tips: Use the LoRA models together with their base models to get the intended performance.
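A minimal sketch of that tip, using the Hugging Face `transformers` and `peft` libraries (with `accelerate` installed for `device_map="auto"`). The repository IDs `haoranxu/ALMA-13B-Pretrain` and `haoranxu/ALMA-13B-Pretrain-LoRA`, the prompt template, and the generation settings are assumptions for illustration; this page lists none of them, so check them against the model's own repository before relying on them.

```python
BASE_ID = "haoranxu/ALMA-13B-Pretrain"       # assumed base model repo ID
LORA_ID = "haoranxu/ALMA-13B-Pretrain-LoRA"  # assumed LoRA adapter repo ID


def build_prompt(source_text, src_lang="Chinese", tgt_lang="English"):
    """Build a translation prompt (assumed template, not from this page)."""
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {source_text}\n"
        f"{tgt_lang}:"
    )


def translate(source_text, base_id=BASE_ID, lora_id=LORA_ID):
    """Load the base model, attach the LoRA adapter, and translate one sentence."""
    # Heavy imports are kept local so build_prompt works without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id, padding_side="left")
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    # The adapter weights are applied on top of the base model.
    model = PeftModel.from_pretrained(base, lora_id)
    model.eval()

    inputs = tokenizer(build_prompt(source_text), return_tensors="pt").to(base.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, num_beams=5, max_new_tokens=256, do_sample=False
        )
    # Drop the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


if __name__ == "__main__":
    print(build_prompt("我爱机器翻译。"))
```

Calling `translate("我爱机器翻译。")` downloads both repositories and needs a GPU with enough memory for a 13B model in half precision, so the example above only prints the prompt.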
Release Notes
Version:
Date:
Notes: Full-weight fine-tune of LLaMA-2-13B on 12B monolingual tokens, followed by LoRA fine-tuning on human-written parallel data
Version:
Date:
Notes: Further LoRA fine-tuning on top of ALMA-13B-LoRA with Contrastive Preference Optimization (CPO)
Quantized Models of the ALMA 13B Pretrain
Best Alternatives to ALMA 13B Pretrain