Bilingual GPT Neox 4B Instruction Ppo by rinna: Benchmarks, Features and Detailed Analysis

Tags: Arxiv:1707.06347, Arxiv:2203.02155, Arxiv:2404.01657, Autotrain compatible, Base model:finetune:rinna/bili..., Base model:rinna/bilingual-gpt..., Dataset:anthropic/hh-rlhf, En, Gpt neox, Instruct, Ja, Pytorch, Region:us, Safetensors

Model Card on HF 🤗: https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-ppo

Bilingual GPT Neox 4B Instruction Ppo Benchmarks

Benchmark	Score	Reference Score	Reference Model	Difference
ARC	28.24	96.7	so35	-70.8%
HellaSwag	47.9	95.3	gpt4	-49.7%
MMLU	23.12	88.3	so35	-73.8%
TruthfulQA	43.5	59	gpt4	-26.3%
WinoGrande	52.25	87.5	gpt4	-40.3%

LLME Score: 0.1583

Difference: how the model compares to the reference models, which are Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
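The percentages listed with each benchmark are consistent with a simple relative difference between the model's score and the reference score. The sketch below reproduces them under that assumption; the formula is inferred from the numbers on this page rather than documented by it, and the names `BENCHMARKS` and `relative_gap` are illustrative.

```python
# (model score, reference score) pairs copied from the benchmark list on this page.
BENCHMARKS = {
    "ARC": (28.24, 96.7),
    "HellaSwag": (47.9, 95.3),
    "MMLU": (23.12, 88.3),
    "TruthfulQA": (43.5, 59.0),
    "WinoGrande": (52.25, 87.5),
}


def relative_gap(score: float, reference: float) -> float:
    """Percentage difference of `score` relative to `reference` (assumed formula)."""
    return round((score / reference - 1.0) * 100.0, 1)


if __name__ == "__main__":
    for name, (score, reference) in BENCHMARKS.items():
        print(f"{name}: {relative_gap(score, reference):+.1f}%")
```

A negative value means the model scores below the reference model on that benchmark.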


Bilingual GPT Neox 4B Instruction Ppo Parameters and Internals

Model Type: text generation

Use Cases

Areas: research, commercial applications
Applications: instruction-following conversational agent
Primary Use Cases: bilingual text generation
Limitations: sensitive to decoding hyper-parameters.
Considerations: decoding hyper-parameters should be carefully chosen.
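The page does not say which decoding hyper-parameters the model is sensitive to. As an illustration only, the sketch below assumes two common ones, temperature and nucleus (top-p) sampling, and shows how they reshape a next-token distribution. The logits are made-up toy values and the function names are hypothetical; nothing here is taken from the model itself.

```python
import math
from typing import Dict, List


def apply_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits divided by `temperature` (lower means sharper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs: List[float], top_p: float) -> Dict[int, float]:
    """Keep the smallest set of most likely tokens whose mass reaches `top_p`,
    then renormalize. Returns a mapping of token index to probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # hypothetical values for three candidate tokens
    for temperature in (0.5, 1.0, 1.5):
        probs = apply_temperature(toy_logits, temperature)
        print(temperature, [round(p, 3) for p in probs])
    print(top_p_filter(apply_temperature(toy_logits, 1.0), 0.9))
```

Small changes to either setting can noticeably change which tokens remain eligible, which is one plausible reason a model's output quality depends on careful tuning.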

Additional Notes

The model uses a SentencePiece-based tokenizer with a vocabulary size of 65,536.

Supported Languages

Japanese (ja) and English (en), both listed at full proficiency

Training Details

Data Sources: Anthropic/hh-rlhf
Methodology: Supervised Fine-Tuning (SFT) and PPO-based Reinforcement Learning (RL)
Model Architecture: 36-layer, 2816-hidden-size transformer-based language model

Input Output

Input Format: a special conversation format between 'ユーザー' (user) and 'システム' (system), ending with 'システム: '.
Accepted Modalities: text
Output Format: text response in the chosen language (Japanese or English)
Performance Tips: adjust decoding hyper-parameters for optimal performance.
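The input format described above can be sketched as a small prompt builder. This is a minimal illustration, not the maintainer's code: the speaker labels and the trailing 'システム: ' come from this page, while the `<NL>` turn separator is an assumption that should be checked against the model card on Hugging Face. The function name `build_prompt` is hypothetical.

```python
from typing import Dict, List

USER = "ユーザー"      # user speaker label, as given on this page
SYSTEM = "システム"    # system speaker label, as given on this page
TURN_SEPARATOR = "<NL>"  # ASSUMPTION: not stated on this page; verify on the model card


def build_prompt(turns: List[Dict[str, str]]) -> str:
    """Join conversation turns and end with 'システム: ' so the model answers next."""
    lines = []
    for turn in turns:
        if turn["speaker"] not in (USER, SYSTEM):
            raise ValueError(f"unknown speaker: {turn['speaker']!r}")
        lines.append(f"{turn['speaker']}: {turn['text']}")
    lines.append(f"{SYSTEM}: ")
    return TURN_SEPARATOR.join(lines)


if __name__ == "__main__":
    conversation = [
        {"speaker": USER, "text": "Hello, can you help me translate a sentence?"},
        {"speaker": SYSTEM, "text": "Of course. Which sentence?"},
        {"speaker": USER, "text": "Good morning."},
    ]
    print(build_prompt(conversation))
```

The resulting string would then be tokenized and passed to the model; the prompt always ends with the system label so that generation continues as the system's reply.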

LLM Name	Bilingual GPT Neox 4B Instruction Ppo
Repository 🤗	https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-ppo
Base Model(s)	Bilingual GPT Neox 4B (rinna/bilingual-gpt-neox-4b)
Model Size	4B
Required VRAM	7.7 GB
Updated	2025-09-23
Maintainer	rinna
Model Type	gpt_neox
Instruction-Based	Yes
Model Files	7.7 GB, 7.8 GB
Supported Languages	ja, en
Model Architecture	GPTNeoXForCausalLM
License	MIT
Context Length	2048
Model Max Length	2048
Tokenizer Class	T5Tokenizer
Padding Token	[PAD]
Vocabulary Size	65536
Torch Data Type	float16
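The listed model size and VRAM figure can be sanity-checked from the architecture numbers on this page (36 layers, hidden size 2816, vocabulary 65,536, float16). The sketch below is a rough estimate under stated assumptions: it uses the common approximation of 12 × hidden² weights per transformer layer (which presumes a 4× feed-forward expansion and ignores biases and layer norms) and assumes separate input and output embedding matrices. Neither assumption is confirmed by this page.

```python
NUM_LAYERS = 36       # from "36-layer" on this page
HIDDEN_SIZE = 2816    # from "2816-hidden-size" on this page
VOCAB_SIZE = 65536    # from "Vocabulary Size" on this page
BYTES_PER_PARAM = 2   # float16 stores each weight in 2 bytes


def estimate_parameters(num_layers: int, hidden_size: int, vocab_size: int,
                        tied_embeddings: bool = False) -> int:
    """Approximate parameter count; ignores biases and layer-norm weights."""
    per_layer = 12 * hidden_size ** 2  # attention (4h^2) + feed-forward (8h^2), assumed
    embedding_matrices = 1 if tied_embeddings else 2  # untied is an assumption
    embeddings = embedding_matrices * vocab_size * hidden_size
    return num_layers * per_layer + embeddings


params = estimate_parameters(NUM_LAYERS, HIDDEN_SIZE, VOCAB_SIZE)
weight_gb = params * BYTES_PER_PARAM / 1e9

if __name__ == "__main__":
    print(f"{params / 1e9:.2f}B parameters, about {weight_gb:.1f} GB of float16 weights")
```

Under these assumptions the estimate comes to about 3.8 billion parameters and roughly 7.6 GB of weights, which is in line with the "4B" size and the 7.7 GB figure listed above. Activations and the key-value cache during generation would need additional memory.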

Best Alternatives to Bilingual GPT Neox 4B Instruction Ppo

Best Alternatives	Context / RAM	Downloads	Likes
...al GPT Neox 4B Instruction Sft	2K / 7.6 GB	1436	17
Tora 4B	2K / 7.6 GB	5	2
...x 4B Instruction Sft En Ja 84K	2K / 7.6 GB	6	1


Original data from HuggingFace, OpenCompass and various public git repos.
