What are the hardware requirements for MoMo 72B Lora 1.8.7 DPO?

MoMo 72B Lora 1.8.7 DPO requires approximately 208.5 GB of VRAM and supports a context window of 32K tokens. Quantized variants may run on less VRAM; see the Quantized Models section on this page.

Who developed MoMo 72B Lora 1.8.7 DPO and how large is it?

MoMo 72B Lora 1.8.7 DPO is developed by moreh, a model with 72b parameters. The model is published as open weights on Hugging Face and indexed on LLM Explorer with full benchmark history.

How does MoMo 72B Lora 1.8.7 DPO perform on standard benchmarks?

MoMo 72B Lora 1.8.7 DPO has the following published scores: MMLU 77.13. Compare against reference models on this page or on the LLM Explorer leaderboards.

MoMo 72B Lora 1.8.7 DPO by moreh — VRAM 208.5GB, 32K context

Name: MoMo 72B Lora 1.8.7 DPO
Author: moreh

MoMo 72B Lora 1.8.7 DPO is an open-source language model by moreh. Features: 72b LLM, VRAM: 208.5GB, Context: 32K, License: mit, LLM Explorer Score: 0.2, Arc: 70.8, HellaSwag: 86, MMLU: 77.1, GSM8K: 78.6.

Arxiv:2106.09685 Arxiv:2305.18290 En Endpoints compatible Llama Lora Region:us Safetensors Sharded Tensorflow

Model Card on HF 🤗: https://huggingface.co/moreh/MoMo-72B-lora-1.8.7-DPO

MoMo 72B Lora 1.8.7 DPO Benchmarks

ARC: 70.82 vs 96.7 (so35)^-26.8%

HellaSwag: 85.96 vs 95.3 (gpt4)^-9.8%

MMLU: 77.13 vs 88.3 (so35)^-12.7%

GSM8K: 78.62 vs 96.4 (so35)^-18.4%

LLME Score: 0.20427

^nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

What is the LLM Explorer Rank (Score)

MoMo 72B Lora 1.8.7 DPO (moreh/MoMo-72B-lora-1.8.7-DPO)

🌟 Advertise your project 🚀

MoMo 72B Lora 1.8.7 DPO Parameters and Internals

Additional Notes

For more information on the MoAI platform and its capabilities, refer to https://moreh.io/product.

Supported Languages

languages (en), proficiency ()

Training Details

Data Sources:

Open-Orca/SlimOrca, https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1, https://huggingface.co/datasets/Intel/orca_dpo_pairs

Methodology:

Direct Preference Optimization (DPO), Supervised Fine-Tuning (SFT) using LoRA.

Hardware Used:

AMD MI250

Release Notes

Version:

V1.8.7

Date:

24/04/05

Notes:

Model was trained with optimizations in hyperparameters without exploiting weight merge. Adjustments for compatibility with Llama for leaderboard submission.

LLM Name	MoMo 72B Lora 1.8.7 DPO
Repository 🤗	https://huggingface.co/moreh/MoMo-72B-lora-1.8.7-DPO
Model Size	72b
Required VRAM	208.5 GB
Updated	2026-06-05
Maintainer	moreh
Model Type	llama
Model Files	5.0 GB: 1-of-63 4.6 GB: 2-of-63 4.3 GB: 3-of-63 4.3 GB: 4-of-63 4.8 GB: 5-of-63 4.8 GB: 6-of-63 4.3 GB: 7-of-63 4.8 GB: 8-of-63 4.8 GB: 9-of-63 4.3 GB: 10-of-63 4.8 GB: 11-of-63 4.8 GB: 12-of-63 4.3 GB: 13-of-63 4.8 GB: 14-of-63 4.8 GB: 15-of-63 4.3 GB: 16-of-63 4.8 GB: 17-of-63 4.8 GB: 18-of-63 4.3 GB: 19-of-63 4.8 GB: 20-of-63 4.8 GB: 21-of-63 4.3 GB: 22-of-63 4.8 GB: 23-of-63 4.8 GB: 24-of-63 4.3 GB: 25-of-63 4.8 GB: 26-of-63 4.8 GB: 27-of-63 4.3 GB: 28-of-63 4.8 GB: 29-of-63 4.8 GB: 30-of-63 4.3 GB: 31-of-63 4.8 GB: 32-of-63 4.8 GB: 33-of-63 4.3 GB: 34-of-63 4.8 GB: 35-of-63 4.8 GB: 36-of-63 4.3 GB: 37-of-63 4.8 GB: 38-of-63 4.8 GB: 39-of-63 4.3 GB: 40-of-63 4.8 GB: 41-of-63 4.8 GB: 42-of-63 4.3 GB: 43-of-63 4.8 GB: 44-of-63 4.8 GB: 45-of-63
Supported Languages	en
Model Architecture	LlamaForCausalLM
License	mit
Context Length	32768
Model Max Length	32768
Transformers Version	4.36.0
Vocabulary Size	152064
LoRA Model	Yes
Torch Data Type	float32

Quantized Models of the MoMo 72B Lora 1.8.7 DPO

Model	Likes	Downloads	VRAM
Smaug 72B V0.1 2.4bpw H6 EXL2	1	5	24 GB
Smaug 72B V0.1 AWQ	9	5	41 GB
Smaug 72B V0.1 GPTQ	8	8	41 GB
MoMo 72B Lora 1.8.7 DPO GPTQ	7	3	41 GB

Best Alternatives to MoMo 72B Lora 1.8.7 DPO

Best Alternatives	Context / RAM	Downloads	Likes
2 Pro Math	128K / 141.9 GB	9	0
TW3 JRGL V2	32K / 79.7 GB	16209	0
Le Triomphant ECE TW3	32K / 79.7 GB	16554	5
Smaug 72B V0.1	32K / 144.5 GB	8710	468
ECE TW3 JRGL V5	32K / 159.6 GB	8350	1
Rhea 72B V0.5	32K / 144.5 GB	8194	134
JuliusCesar 72B BeyonderV.0	32K / 74.2 GB	70	0
Caigun Model 72B KGI	32K / 144.6 GB	11	0
...Preview Llamafied Qwen Llamafy	32K / 144.5 GB	291	69
Rhea 125 V0.5	32K / 249 GB	0	1

Note: green Score (e.g. "73.2") means that the model is better than moreh/MoMo-72B-lora-1.8.7-DPO.

Rank the MoMo 72B Lora 1.8.7 DPO Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation

What open-source LLMs or SLMs are you in search of? 54964 in total.

Email us: info@extractum.io. Our Privacy Policy | Terms and Conditions | Suggest an improvement.

Our Social Media →

Original data from HuggingFace, OpenCompass and various public git repos.

Check out Ag3ntum — our secure, self-hosted AI agent for server management.

Release v20260328a

Support LLM Explorer

MoMo 72B Lora 1.8.7 DPO by moreh