Model Type

Use Cases
Areas:
Applications:
Primary Use Cases: Translating Chinese to English
Considerations: Intended for use with LoRA models; apply the LoRA weights to the matching base model.
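Given the primary use case above, a minimal sketch of a zero-shot translation prompt for an ALMA-style model. The exact prompt wording below is an assumption for illustration, not something this card specifies:

```python
def alma_prompt(source_text: str, src_lang: str = "Chinese", tgt_lang: str = "English") -> str:
    """Build a zero-shot translation prompt.

    Mirrors the prompt style commonly used with ALMA-family models;
    the exact wording here is an assumption, not the card's spec.
    """
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {source_text}\n"
        f"{tgt_lang}:"
    )
```

The model is expected to continue the text after the final `English:` line, so generation should stop at a newline or end-of-sequence token.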
Additional Notes
ALMA-R incorporates Contrastive Preference Optimization (CPO) for improved performance.

Supported Languages
Monolingual data (multilingual); parallel data (high-quality parallel data)
Training Details
Data Sources: 20B monolingual tokens, high-quality parallel data, triplet preference data
Data Volume: 20B monolingual tokens for the 7B model and 12B for the 13B model
Methodology: Two-step fine-tuning: first on monolingual data, then on high-quality parallel data; further optimized with Contrastive Preference Optimization (CPO)
Context Length:
Model Architecture: LLaMA-based, fine-tuned with LoRA
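The CPO step named in the methodology above can be sketched per example. This follows the objective described in the CPO paper — a DPO-style preference term without a frozen reference model, plus a negative log-likelihood term on the preferred translation — with the beta value and exact formulation treated as assumptions here:

```python
import math

def cpo_loss(logp_chosen: float, logp_rejected: float, beta: float = 0.1) -> float:
    """Per-example CPO objective (a sketch, not the authors' code).

    L = -log sigmoid(beta * (logp_chosen - logp_rejected)) - logp_chosen

    logp_chosen / logp_rejected are sequence log-probabilities of the
    preferred and dispreferred translations under the current policy
    (non-positive numbers); beta is a hyperparameter (the default here
    is an assumption).
    """
    margin = beta * (logp_chosen - logp_rejected)
    prefer_term = -math.log(1.0 / (1.0 + math.exp(-margin)))  # preference loss
    nll_term = -logp_chosen                                   # NLL on preferred output
    return prefer_term + nll_term
```

The NLL term keeps the policy anchored to the preferred translations, which is what lets CPO drop the reference model that DPO requires.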
Input Output
Input Format:
Accepted Modalities:
Output Format:
Performance Tips: Use LoRA models together with their base models to obtain the intended performance.
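The performance tip above can be made concrete with the Hugging Face transformers and peft libraries. The function below is a sketch under that assumption; the actual base/adapter repo IDs are left as parameters, to be taken from the model's hub pages:

```python
def load_lora_translator(base_model_id: str, lora_adapter_id: str):
    """Attach a LoRA adapter to its base model (a sketch).

    LoRA weights are low-rank deltas: loaded alone they do not give the
    intended performance, so the adapter must be applied on top of the
    matching base checkpoint.
    """
    # Imports are local so merely defining this function does not
    # require transformers/peft to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base, lora_adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    return model, tokenizer
```

Loading the base checkpoint first and then wrapping it with `PeftModel.from_pretrained` is the standard peft pattern for applying adapter weights.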
Release Notes
Version:
Date:
Notes: Full-weight fine-tuning of LLaMA-2-7B on 12B monolingual tokens, followed by LoRA fine-tuning on human-written parallel data

Version:
Date:
Notes: Further LoRA fine-tuning of ALMA-13B-LoRA with Contrastive Preference Optimization (CPO)