MindLLM 1b3 Chat Zh V2.0 by bit-dny


Tags: arxiv:2310.15777 · autotrain-compatible · conversational · en · endpoints-compatible · gpt_neo · pytorch · region:us · zh


MindLLM 1b3 Chat Zh V2.0 Parameters and Internals

Model Type: Pretrained causal language model
Additional Notes: Intended to provide an unrestricted small model for exploring safety challenges and domain-specific applications.
Supported Languages: en (high proficiency), zh (high proficiency)
Training Details:
  Data Sources: Pile, Wudao, CBooks, self-collected data from filtered websites
  Data Volume: 241 billion English tokens and 82 billion Chinese tokens
  Methodology: Two-stage training strategy using cross-entropy loss; fine-tuned on 4 million Chinese instruction samples
  Model Architecture: Transformer
LLM Name: MindLLM 1b3 Chat Zh V2.0
Repository 🤗: https://huggingface.co/bit-dny/MindLLM-1b3-chat-zh-v2.0
Required VRAM: 3 GB
Updated: 2025-09-23
Maintainer: bit-dny
Model Type: gpt_neo
Model Files: 3.0 GB
Supported Languages: en, zh
Model Architecture: GPTNeoForCausalLM
License: apache-2.0
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.34.1
Tokenizer Class: GPT2Tokenizer
Padding Token: [PAD]
Vocabulary Size: 75170
Torch Data Type: bfloat16
Activation Function: gelu_new
Errors: replace
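The metadata above (repo id, GPTNeoForCausalLM architecture, bfloat16 weights, 2048-token context) is enough to load the model with Hugging Face transformers. The sketch below is a minimal, hedged example: the repo id, dtype, and context length come from the table, while the plain-text prompt format is an assumption, since the card does not document a chat template.

```python
# Minimal sketch of loading and querying MindLLM-1b3-chat-zh-v2.0 with
# Hugging Face transformers. Values below are taken from the spec table;
# the prompt format is an assumption (no chat template is documented).

REPO_ID = "bit-dny/MindLLM-1b3-chat-zh-v2.0"  # from the card's Repository field
CONTEXT_LENGTH = 2048  # from the card: Context Length / Model Max Length


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Lazily load the model (~3 GB download) and generate a completion."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID,
        torch_dtype=torch.bfloat16,  # the card lists bfloat16 weights
    )
    # Truncate to the model's maximum context of 2048 tokens.
    inputs = tokenizer(
        prompt, return_tensors="pt", truncation=True, max_length=CONTEXT_LENGTH
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("你好，请介绍一下你自己。"))
```

Loading in bfloat16 keeps memory within the 3 GB VRAM figure the card lists; on hardware without bfloat16 support, `torch.float32` roughly doubles that.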

Best Alternatives to MindLLM 1b3 Chat Zh V2.0

Best Alternatives                     | Context / RAM | Downloads | Likes
Fiction Story Generator               | 2K / 0.6 GB   | 436       | 5
Calliope Legacy                       | 2K / 10.7 GB  | 83        | 0
Domain Interpretation Model V2        | 2K / 1.4 GB   | 87        | 2
Got Neo Var Ppo                       | 2K / 0.5 GB   | 6         | 0
...c PatternDetection GTP Neo1.3B     | 2K / 1.4 GB   | 85        | 1
Sft 1                                 | 2K / 0.5 GB   | 6         | 0
...c Entityextraction GPT Neo1.3B     | 2K / 1.4 GB   | 14        | 0
GPT Neo350 TURING                     | 2K / 1.5 GB   | 7         | 0
GPT Neo350 EvilUltimate               | 2K / 1.5 GB   | 6         | 3
GPT Neo Br Instruction                | 2K / 0.6 GB   | 5         | 1
Note: a green score (e.g. "73.2") means the model is better than bit-dny/MindLLM-1b3-chat-zh-v2.0.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124