EleutherAI Polyglot Ko 12.8B 4bits by RichardErkhov


Tags: arXiv:2104.09864, arXiv:2204.04541, arXiv:2306.02254, 4-bit, AutoTrain compatible, bitsandbytes, Endpoints compatible, gpt_neox, PyTorch, Region: us, Safetensors, Sharded, TensorFlow


EleutherAI Polyglot Ko 12.8B 4bits Parameters and Internals

Model Type 
Autoregressive language model
Use Cases 
Areas:
Research, Commercial Applications
Applications:
Text generation, Language comprehension, Model evaluation
Primary Use Cases:
Next token prediction in Korean
Limitations:
The model may produce responses that are factually inaccurate, and it can generate offensive content.
Considerations:
Use with appropriate filtering mechanisms for sensitive content.
Supported Languages 
ko (Full)
Training Details 
Data Sources:
Korean blog posts, Korean news dataset, Modu corpus, Korean patent dataset, Korean Q&A dataset, KcBERT dataset, Korean fiction dataset, Korean online comments, Korean Wikipedia, Clova call, Naver sentiment movie corpus, Korean hate speech dataset, Open subtitles, AI Hub datasets for various tasks, Standard Korean language dictionary
Data Volume:
863 GB (1.2 TB before processing)
Methodology:
Trained on 167 billion tokens over 301,000 steps using the GPT-NeoX framework, with a cross-entropy loss.
Context Length:
2048 tokens
Hardware Used:
256 A100 GPUs
Model Architecture:
40 transformer layers with a model dimension of 5120 and a feedforward dimension of 20480; 40 attention heads, each of dimension 128; Rotary Position Embedding (RoPE) applied to 64 dimensions of each head.
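The training and architecture figures above can be sanity-checked with a little arithmetic. The sketch below is only an estimate under stated assumptions: it assumes a standard GPT-NeoX block (biased QKV and output projections, a two-layer MLP, two LayerNorms per block), untied input and output embeddings, and the vocabulary size of 30080 given elsewhere in this listing. The class and function names are illustrative, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeoXShape:
    """Figures taken from this listing; the layer layout itself is an assumption."""
    layers: int = 40
    d_model: int = 5120
    d_ff: int = 20480
    heads: int = 40
    head_dim: int = 128
    vocab: int = 30080


def params_per_layer(s: NeoXShape) -> int:
    # Attention: fused QKV (3 * d^2) plus output projection (d^2), each with biases.
    attn = 4 * s.d_model * s.d_model + 4 * s.d_model
    # MLP: up-projection and down-projection, each with a bias vector.
    mlp = 2 * s.d_model * s.d_ff + s.d_ff + s.d_model
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * 2 * s.d_model
    return attn + mlp + norms


def total_params(s: NeoXShape) -> int:
    # Assumption: input embedding and output head are separate matrices.
    embeddings = 2 * s.vocab * s.d_model
    final_norm = 2 * s.d_model
    return s.layers * params_per_layer(s) + embeddings + final_norm


def tokens_per_step(total_tokens: float = 167e9, steps: int = 301_000) -> float:
    # Average tokens consumed per optimizer step, from the Methodology figures.
    return total_tokens / steps


shape = NeoXShape()

if __name__ == "__main__":
    # 40 heads of dimension 128 should tile the model dimension exactly.
    print("heads x head_dim == d_model:", shape.heads * shape.head_dim == shape.d_model)
    print(f"estimated parameters: {total_params(shape) / 1e9:.2f}B")
    print(f"tokens per step: {tokens_per_step():,.0f}")
    print(f"sequences of 2048 tokens per step: {tokens_per_step() / 2048:.1f}")
```

Under these assumptions the estimate lands close to the advertised 12.8B, and the head count times the head dimension equals the model dimension, so the listed figures are mutually consistent.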
Responsible AI Considerations 
Fairness:
Polyglot-Ko may produce socially unacceptable or offensive content.
Transparency:
Open-source release with citation information provided.
Accountability:
Human curation recommended to filter sensitive content.
Mitigation Strategies:
Masking of personally identifiable information (PII) in the pre-processing stage.
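The listing does not say how the PII masking was implemented. As a rough illustration of the idea only, the sketch below replaces a few common Korean PII shapes with placeholder tokens using regular expressions. The patterns, the placeholder strings (`[EMAIL]`, `[RRN]`, `[PHONE]`) and the function name are all hypothetical and are not the method used to build the training corpus.

```python
import re

# Hypothetical patterns, for illustration only. Digit lookarounds are used
# instead of \b because Hangul particles often attach directly to numbers,
# and Hangul characters count as word characters in Python's re module.
PII_PATTERNS = [
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    # Resident registration number shape: 6 digits, optional hyphen, 7 digits.
    ("[RRN]", re.compile(r"(?<!\d)\d{6}-?[1-4]\d{6}(?!\d)")),
    # Mobile phone shape: 010-1234-5678 and close variants.
    ("[PHONE]", re.compile(r"(?<!\d)01[016789]-?\d{3,4}-?\d{4}(?!\d)")),
]


def mask_pii(text: str) -> str:
    """Replace each matched span with its placeholder, in the order listed above."""
    for placeholder, pattern in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    print(mask_pii("문의: 010-1234-5678로 연락 주세요"))
```

Regular expressions like these miss many real-world formats and can over-match, so a filter of this kind would complement, not replace, the human curation recommended above.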
Input Output 
Input Format:
Text prompt in Korean
Accepted Modalities:
text
Output Format:
Text generation
Performance Tips:
Run on hardware with enough memory for a 12.8B-parameter model; this listing reports 7.7 GB of required VRAM for the 4-bit weights.
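A minimal sketch of the text-in, text-out flow described above, using the Hugging Face `transformers` API. It assumes `transformers`, `torch`, `accelerate` and `bitsandbytes` are installed and that a CUDA GPU is available; the sampling values are arbitrary illustrative choices, and the helper names are hypothetical. The heavy imports sit inside the function so the token-budget helper can be used without them.

```python
REPO_ID = "RichardErkhov/EleutherAI_-_polyglot-ko-12.8b-4bits"
CONTEXT_LENGTH = 2048  # context length given in this listing


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Limit generation so prompt plus new tokens fits in the context window."""
    return max(0, min(requested, context_length - prompt_tokens))


def generate_korean(prompt: str, max_new_tokens: int = 64) -> str:
    # Imported lazily: these pull in large dependencies and need a GPU setup.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    # Assumption: the checkpoint carries its own 4-bit quantization config,
    # so no explicit quantization arguments are passed here.
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, device_map="auto")

    # token_type_ids are dropped because generate() can reject them for
    # GPT-NeoX style models.
    inputs = tokenizer(prompt, return_tensors="pt",
                       return_token_type_ids=False).to(model.device)
    budget = clamp_new_tokens(inputs["input_ids"].shape[1], max_new_tokens)
    if budget == 0:
        raise ValueError("Prompt already fills the 2048-token context window.")

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=budget,
            do_sample=True,
            temperature=0.8,   # illustrative sampling settings
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate_korean("한국어 언어 모델은")` would download the checkpoint on first use and return the prompt followed by its sampled continuation. Because this is a base next-token model rather than an instruction-tuned one, prompts phrased as the beginning of a passage tend to suit it better than direct questions.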
LLM Name: EleutherAI Polyglot Ko 12.8B 4bits
Repository: https://huggingface.co/RichardErkhov/EleutherAI_-_polyglot-ko-12.8b-4bits
Model Size: 12.8B
Required VRAM: 7.7 GB
Updated: 2025-08-17
Maintainer: RichardErkhov
Model Type: gpt_neox
Model Files: 5.0 GB (1 of 2), 2.7 GB (2 of 2)
Supported Languages: ko
Model Architecture: GPTNeoXForCausalLM
License: apache-2.0
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.39.3
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: <|endoftext|>
Vocabulary Size: 30080
Torch Data Type: float16
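The memory figures in this listing can be put in context with a back-of-the-envelope calculation. The sketch below estimates the storage needed for the weights alone at different precisions, treating the model as exactly 12.8 billion parameters (an approximation) and using decimal gigabytes. It ignores activations, the KV cache and framework overhead, so it is a lower bound rather than a VRAM requirement.

```python
def weight_gb(n_params: float, bits_per_weight: int) -> float:
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 12.8e9        # approximation of the advertised model size
SHARDS_GB = [5.0, 2.7]   # the two model files listed above

estimates = {bits: weight_gb(N_PARAMS, bits) for bits in (4, 16, 32)}
checkpoint_gb = round(sum(SHARDS_GB), 1)

if __name__ == "__main__":
    for bits, gb in estimates.items():
        print(f"{bits:>2}-bit weights: about {gb:.1f} GB")
    print(f"listed checkpoint files: {checkpoint_gb} GB")
```

The two listed files add up to the 7.7 GB reported as required VRAM, which is somewhat more than the pure 4-bit estimate. One plausible explanation, offered here as an assumption, is that some tensors stay in higher precision and that the quantized format stores extra scaling metadata.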

Best Alternatives to EleutherAI Polyglot Ko 12.8B 4bits

Model | Context / RAM | Downloads, Likes
...pen Platypus Polyglot Ko 12.8B | 2K / 51.4 GB | 50
Polyglot Ko 12.8B Instruct | 2K / 25.9 GB | 31873
Polyglot Ko 12.8B Inst All | 2K / 51.4 GB | 8801
Polyglot Ko 12.8B Inst | 2K / 51.4 GB | 8881
KoRnDAlpaca RAG Polyglot 12.8B | 2K / 51.4 GB | 70
Koquality Polyglot 12.8B | 2K / 51.4 GB | 6720
Ppo2 | 2K / 25.9 GB | 8860
Gollm 12.8B Instruct V2.3 | 2K / 25.9 GB | 5650
Kullm Polyglot 12.8B V3 | 2K / 25.9 GB | 75
...12.8B Orca Chat QLoRA Merge V2 | 2K / 25.9 GB | 8990

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124