Norobara ZLoss 8x7B by Doctor-Shotgun


Tags: autotrain-compatible, conversational, en, mixtral, moe, region:us, safetensors, sharded, tensorflow
Datasets: Doctor-Shotgun/capybara-sharegpt, Doctor-Shotgun/no-robots-sharegpt, HuggingFaceH4/no_robots, LDJnr/Capybara, LDJnr/Verified-Camel, unalignment/toxic-dpo-v0.1


Norobara ZLoss 8x7B Parameters and Internals

Model Type 
text-generation
Use Cases 
Limitations:
Model may generate toxic or harmful outputs.
Considerations:
Generate at your own risk.
Additional Notes 
This is an experimental instruct-tuned model to test various loss implementations.
Training Details 
Data Sources:
LDJnr/Capybara, unalignment/toxic-dpo-v0.1, LDJnr/Verified-Camel, HuggingFaceH4/no_robots, Doctor-Shotgun/no-robots-sharegpt, Doctor-Shotgun/capybara-sharegpt
Methodology:
Trained using ZLoss and a Megablocks-based fork of transformers (a z-loss sketch follows this section).
Training Time:
3 epochs, 13 hours on a single H100 GPU
Hardware Used:
single H100 GPU
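
The card does not spell out the loss implementation. In MoE training, "z-loss" usually refers to the router z-loss from ST-MoE (Zoph et al., 2022), which penalizes large router logits; a minimal PyTorch sketch under that assumption:

    import torch

    def router_z_loss(router_logits: torch.Tensor) -> torch.Tensor:
        # Assumed ST-MoE-style router z-loss: mean over tokens of the
        # squared log-sum-exp of the per-expert router logits.
        # router_logits: (num_tokens, num_experts)
        z = torch.logsumexp(router_logits, dim=-1)
        return (z ** 2).mean()

    # Typically added to the language-modeling loss with a small weight,
    # e.g. total_loss = lm_loss + 1e-3 * router_z_loss(router_logits)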
Safety Evaluation 
Risk Categories:
bias, toxic output
Ethical Considerations:
Model may generate toxic or harmful outputs.
Input Output 
Input Format:
Modified multi-turn Alpaca instruction format
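
The exact template is not reproduced in the card; a plausible multi-turn Alpaca-style layout (an assumption, not confirmed here) repeats Input/Response turns after an initial Instruction block:

    ### Instruction:
    {system prompt}

    ### Input:
    {user message}

    ### Response:
    {model reply}

    ### Input:
    {next user message}

    ### Response:
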
LLM Name: Norobara ZLoss 8x7B
Repository: https://huggingface.co/Doctor-Shotgun/Norobara-ZLoss-8x7B
Model Size: 46.7B
Required VRAM: 93.6 GB
Updated: 2025-10-11
Maintainer: Doctor-Shotgun
Model Type: mixtral
Model Files: 4.9 GB (1-of-19), 5.0 GB (2-of-19), 5.0 GB (3-of-19), 4.9 GB (4-of-19), 5.0 GB (5-of-19), 5.0 GB (6-of-19), 4.9 GB (7-of-19), 5.0 GB (8-of-19), 5.0 GB (9-of-19), 4.9 GB (10-of-19), 5.0 GB (11-of-19), 5.0 GB (12-of-19), 5.0 GB (13-of-19), 4.9 GB (14-of-19), 5.0 GB (15-of-19), 5.0 GB (16-of-19), 4.9 GB (17-of-19), 5.0 GB (18-of-19), 4.2 GB (19-of-19)
Supported Languages: en
Model Architecture: MixtralForCausalLM
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.36.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: bfloat16
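
Given the figures above (MixtralForCausalLM weights in bfloat16, ~93.6 GB across 19 safetensors shards, Transformers 4.36.2), a minimal loading sketch with Hugging Face transformers, assuming enough combined GPU memory for device_map="auto" to shard across:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Doctor-Shotgun/Norobara-ZLoss-8x7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)  # LlamaTokenizer, 32000-token vocab per the card
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the card's torch data type
        device_map="auto",           # shards the ~93.6 GB of weights across available devices
    )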

Quantized Models of the Norobara ZLoss 8x7B

Model | Likes | Downloads | VRAM
Norobara ZLoss 8x7B GGUF | 2 | 224 | 15 GB
Norobara ZLoss 8x7B AWQ | 3 | 7 | 24 GB
Norobara ZLoss 8x7B GPTQ | 2 | 4 | 23 GB
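
To run a quantized build on more modest hardware, one option is llama-cpp-python with the GGUF files; a sketch in which the local file name is hypothetical and the context size follows the card:

    from llama_cpp import Llama

    llm = Llama(
        model_path="norobara-zloss-8x7b.Q2_K.gguf",  # hypothetical local file name
        n_ctx=32768,      # context length from the card
        n_gpu_layers=-1,  # offload all layers to GPU when possible
    )
    out = llm("### Instruction:\nWrite a haiku about mixtures of experts.\n\n### Response:\n",
              max_tokens=128)
    print(out["choices"][0]["text"])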

Best Alternatives to Norobara ZLoss 8x7B

Best Alternatives | Context / RAM | Downloads | Likes
Mixtral 8x7B Instruct V0.1 | 32K / 93.6 GB | 481112 | 4637
Nous Hermes 2 Mixtral 8x7B DPO | 32K / 93.6 GB | 14686 | 451
Mixtral 8x7B V0.1 | 32K / 93.6 GB | 61735 | 1788
Sensualize Mixtral Bf16 | 32K / 93.6 GB | 0 | 0
Skadi Mixtral V1 | 32K / 93.5 GB | 0 | 0
Franziska Mixtral V1 | 32K / 93.5 GB | 0 | 0
Typhon Mixtral V1 | 32K / 93.4 GB | 0 | 0
GritLM 8x7B KTO | 32K / 93.6 GB | 8289 | 3
Smaug Mixtral V0.1 | 32K / 187.7 GB | 8548 | 12
NatureLM 8x7B | 32K / 0.3 GB | 72 | 18


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124