The model requires transformers version 4.31.0 or later to load and run correctly. Decoding hyper-parameters also deserve attention for optimal performance (see Performance Tips below).
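A minimal loading sketch is shown below. It assumes the rinna/bilingual-gpt-neox-4b-8k checkpoint on the Hugging Face Hub and transformers >= 4.31.0; the use_fast=False and dtype/device settings are illustrative choices, not requirements stated here, so check the official model card for your setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rinna/bilingual-gpt-neox-4b-8k"

# use_fast=False is commonly recommended for rinna tokenizers; verify against
# the official model card for your transformers version.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)

# float16 + device_map="auto" (requires accelerate) is just one reasonable
# single-GPU configuration, not the only supported one.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```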
Supported Languages
English (proficient), Japanese (proficient)
Training Details
Data Sources:
Japanese CC-100, Japanese C4, The Pile, RedPajama, Wikipedia
Data Volume:
1.5 billion tokens
Methodology:
fine-tuning using RoPE positional interpolation (see the sketch after this list)
Context Length:
8192 tokens
Model Architecture:
A 36-layer, 2816-hidden-size transformer-based language model
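The RoPE positional-interpolation step listed above amounts to compressing position indices so that the extended context window maps back onto the position range seen during pretraining. The sketch below assumes a pretrained window of 2048 tokens extended to 8192 (a scaling factor of 4) and a generic rotary head dimension; these values are illustrative, not taken from this page.

```python
import torch

def interpolated_rope_angles(positions: torch.Tensor,
                             dim: int = 128,
                             base: float = 10000.0,
                             pretrained_len: int = 2048,
                             extended_len: int = 8192) -> torch.Tensor:
    """Rotary angles with linear position interpolation.

    Positions in [0, extended_len) are scaled by pretrained_len / extended_len
    so they never exceed the range the base model saw during pretraining.
    """
    scale = pretrained_len / extended_len                      # 0.25 for 2048 -> 8192
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled_pos = positions.float() * scale                     # position interpolation
    return torch.outer(scaled_pos, inv_freq)                   # (seq_len, dim / 2)

# Angles for an 8192-token sequence; without interpolation the later positions
# would fall outside the pretrained range.
angles = interpolated_rope_angles(torch.arange(8192))
```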
Performance Tips:
The model is sensitive to decoding hyper-parameters (e.g., temperature, top_p, top_k, repetition_penalty), so explore the settings that work best for your task.
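For example, a sampling call exposing these knobs might look like the sketch below (continuing from the loading example above). The values are generic starting points for a hyper-parameter sweep, not settings recommended on this page.

```python
prompt = "The future of bilingual language models is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Starting-point values only; sweep these for your task.
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    repetition_penalty=1.05,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```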