Ling 2.6 Flash MLX 9bit is an open-source large language model (LLM) published by inferencerlabs. It is a 9-bit quantized build that requires 117.2 GB of VRAM and supports a 128K-token context (131072 tokens).
| Property | Value |
|---|---|
| LLM Name | Ling 2.6 Flash MLX 9bit |
| Repository 🤗 | https://huggingface.co/inferencerlabs/Ling-2.6-flash-MLX-9bit |
| Base Model(s) | |
| Required VRAM | 117.2 GB |
| Updated | 2026-05-03 |
| Maintainer | inferencerlabs |
| Model Type | bailing_hybrid |
| Model Files | |
| Supported Languages | en |
| Quantization Type | 9bit |
| Model Architecture | BailingMoeV2_5ForCausalLM |
| Context Length | 131072 tokens |
| Model Max Length | 131072 tokens |
| Transformers Version | 4.56.2 |
| Tokenizer Class | TokenizersBackend |
| Padding Token | `<\|endoftext\|>` |
| Vocabulary Size | 157184 |
| Torch Data Type | bfloat16 |
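
The figures in the table can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check, not anything published by the maintainer: it converts the context length to "K" tokens and derives a rough upper bound on the weight count implied by the VRAM figure. It assumes that "9bit" means about 9 bits per weight on average, that 1 GB is 10^9 bytes, and that the whole 117.2 GB is weights (in practice some of it would be quantization metadata and runtime overhead). The function names are hypothetical helpers introduced for illustration.

```python
# Figures copied from the table above.
CONTEXT_LENGTH_TOKENS = 131072
REQUIRED_VRAM_GB = 117.2

# Assumption: "9bit" is treated as ~9 bits per weight on average.
BITS_PER_WEIGHT = 9


def context_in_k(tokens: int) -> float:
    """Express a token count in units of 1024 tokens ("K")."""
    return tokens / 1024


def approx_weights_billion(vram_gb: float, bits_per_weight: float) -> float:
    """Rough upper bound on the weight count, in billions.

    Assumptions: 1 GB = 10**9 bytes, and all of the memory holds weights.
    """
    total_bits = vram_gb * 1e9 * 8
    return total_bits / bits_per_weight / 1e9


if __name__ == "__main__":
    print(context_in_k(CONTEXT_LENGTH_TOKENS))  # 128.0
    print(round(approx_weights_billion(REQUIRED_VRAM_GB, BITS_PER_WEIGHT), 1))
```

The first result confirms that 131072 tokens is exactly the "128K" context quoted in the summary. The second is an estimate only and should not be read as the model's official parameter count.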