Polyglot-Ko may not always return the most factual or accurate response.
Considerations:
A human-curation or automated filtering mechanism is recommended to screen generated output for sensitive content.
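For illustration, a minimal Python sketch of one possible post-generation filtering step; the blocklist entries and function names are hypothetical placeholders, and a production deployment would pair something like this with human review or a trained safety classifier.

```python
# Minimal post-generation keyword filter. BLOCKLIST entries and the
# function names are hypothetical placeholders for illustration only.
BLOCKLIST = {"blocked_term_1", "blocked_term_2"}

def is_acceptable(text: str) -> bool:
    """Return False if the generated text contains a blocked term."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

def filter_generations(candidates: list[str]) -> list[str]:
    """Keep only generations that pass the keyword check."""
    return [text for text in candidates if is_acceptable(text)]
```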
Additional Notes
Polyglot-Ko may produce socially unacceptable or offensive content.
Supported Languages
Korean (high)
Training Details
Data Sources:
Korean blog posts, Korean news dataset, Modu corpus, Korean patent dataset, Korean Q&A dataset, KcBERT dataset, Korean fiction dataset, Korean online comments, Korean Wikipedia, ClovaCall, Naver sentiment movie corpus, Korean hate speech dataset, OpenSubtitles, AIHub various-task datasets, Standard Korean Language Dictionary
Data Volume:
863 GB (1.2 TB before processing)
Methodology:
Trained with a cross-entropy loss to maximize the likelihood of predicting the next token, using the EleutherAI GPT-NeoX framework.
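As an illustration of this objective, here is a minimal PyTorch sketch of next-token cross-entropy loss; `model` is a stand-in for any causal language model that returns per-position vocabulary logits, not the actual GPT-NeoX training loop.

```python
import torch
import torch.nn.functional as F

def language_modeling_loss(model, input_ids: torch.Tensor) -> torch.Tensor:
    """Next-token cross-entropy over a batch of token ID sequences."""
    logits = model(input_ids)                      # (batch, seq_len, vocab)
    shift_logits = logits[:, :-1, :].contiguous()  # position t predicts t+1
    shift_labels = input_ids[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```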
Context Length:
2,048 tokens
Training Steps:
301,000
Hardware Used:
256 A100 GPUs
Model Architecture:
40 transformer layers, model dimension 5,120, feedforward dimension 20,480, 40 attention heads with head dimension 128, and Rotary Position Embedding (RoPE) applied to 64 dimensions of each head. Tokenizer vocabulary of 30,003.
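For reference, a short sketch of loading the published checkpoint with the Hugging Face transformers library; the dtype and generation settings are assumptions to adjust for available hardware (a 12.8B-parameter model needs roughly 26 GB of memory in fp16).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/polyglot-ko-12.8b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/polyglot-ko-12.8b",
    torch_dtype="auto",  # assumption: infer dtype from the checkpoint
)

prompt = "한국어로 자기소개를 해 주세요."  # "Please introduce yourself in Korean."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```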