OpenBezoar HH RLHF SFT by SurgeGlobal


Tags: Arxiv:2306.02707, Arxiv:2404.12195, Autotrain compatible, Base model:finetune:surgegloba..., Base model:surgeglobal/openbez..., Dataset:anthropic/hh-rlhf, En, Endpoints compatible, Llama, Pytorch, Region:us, Safetensors

OpenBezoar HH RLHF SFT Parameters and Internals

Model Type: text generation
Use Cases 
Limitations:
The model might not consistently show improved abilities to follow instructions, and it could respond inappropriately or get stuck in loops.
This model is not aligned to human preferences and therefore it may generate harmful and uncensored content.
Caution is urged against relying on this model for production or adjacent use-cases.
Supported Languages: en
Training Details
Data Sources: Anthropic HH-RLHF Dataset
Data Volume: First 100K examples
Methodology: Supervised Fine-Tuning (SFT)
Model Architecture: OpenLLaMA 3B v2
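
The listing only says that the first 100K examples of the Anthropic HH-RLHF dataset were used for supervised fine-tuning; it does not describe the preprocessing. As a rough illustration, and not the maintainers' actual pipeline, the sketch below assumes each record has a `chosen` transcript made of `\n\nHuman:` and `\n\nAssistant:` turns (the published dataset layout) and splits it into a context and a final assistant reply. The helper names `split_transcript` and `first_n` are hypothetical.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, Tuple

# Turn marker used in HH-RLHF transcripts.
ASSISTANT = "\n\nAssistant:"


def split_transcript(chosen: str) -> Tuple[str, str]:
    """Split a transcript into (conversation so far, final assistant reply)."""
    head, sep, tail = chosen.rpartition(ASSISTANT)
    if not sep:
        raise ValueError("transcript has no Assistant turn")
    return head.strip(), tail.strip()


def first_n(examples: Iterable[Dict[str, str]], n: int = 100_000) -> Iterator[Tuple[str, str]]:
    """Yield (context, reply) pairs for at most the first n records, in order."""
    for example in islice(examples, n):
        yield split_transcript(example["chosen"])


if __name__ == "__main__":
    record = {"chosen": "\n\nHuman: Hi\n\nAssistant: Hello\n\nHuman: Bye\n\nAssistant: Goodbye"}
    for context, reply in first_n([record], n=1):
        print(repr(context))
        print(repr(reply))
```

Splitting at the last assistant marker keeps multi-turn history in the context, which is one common choice for SFT targets; other pipelines train on every assistant turn instead.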
Input Output
Input Format: Alpaca prompt template
Performance Tips: Use the Alpaca prompt template to obtain the best responses on instruction-related tasks.
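
Since the listing names the Alpaca prompt template but does not reproduce it, here is a minimal sketch of a prompt builder. It assumes the standard Stanford Alpaca wording; the exact template used for this model should be checked against the model repository. The function name `build_alpaca_prompt` is hypothetical.

```python
# Standard Stanford Alpaca templates (an assumption: verify the exact
# wording against the model repository before relying on it).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Format an instruction (and optional context) as an Alpaca-style prompt."""
    instruction = instruction.strip()
    context = context.strip()
    if context:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    return PROMPT_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the text.", "OpenLLaMA is an open reproduction of LLaMA."))
```

The prompt ends with the `### Response:` header so that the model's continuation is the answer itself.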
LLM Name: OpenBezoar HH RLHF SFT
Repository: https://huggingface.co/SurgeGlobal/OpenBezoar-HH-RLHF-SFT
Base Model(s): OpenBezoar SFT (SurgeGlobal/OpenBezoar-SFT)
Model Size: 3B
Required VRAM: 6.8 GB
Updated: 2025-07-24
Maintainer: SurgeGlobal
Model Type: llama
Model Files: 6.8 GB, 6.8 GB
Supported Languages: en
Model Architecture: LlamaForCausalLM
License: cc-by-nc-4.0
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.33.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: float16
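
The 6.8 GB figure is consistent with storing the weights in float16. The sketch below shows the arithmetic. Only the vocabulary size (32000) and the float16 data type come from the listing; the hidden size, layer count, and MLP width are assumptions taken from the public OpenLLaMA 3B configuration, and the function `llama_param_count` is a hypothetical helper.

```python
VOCAB = 32_000          # from the listing
BYTES_PER_PARAM = 2     # float16, from the listing
HIDDEN = 3_200          # assumed (OpenLLaMA 3B config)
LAYERS = 26             # assumed (OpenLLaMA 3B config)
INTERMEDIATE = 8_640    # assumed (OpenLLaMA 3B config)


def llama_param_count(vocab: int, hidden: int, layers: int, intermediate: int) -> int:
    """Count parameters of a LLaMA-style decoder with untied embeddings."""
    embed = vocab * hidden              # input token embeddings
    lm_head = vocab * hidden            # output projection
    attn = 4 * hidden * hidden          # q, k, v, o projections
    mlp = 3 * hidden * intermediate     # gate, up, down projections
    norms = 2 * hidden                  # two RMSNorm weights per layer
    final_norm = hidden
    return embed + lm_head + layers * (attn + mlp + norms) + final_norm


params = llama_param_count(VOCAB, HIDDEN, LAYERS, INTERMEDIATE)
weight_gb = params * BYTES_PER_PARAM / 1e9

if __name__ == "__main__":
    print(f"{params:,} parameters, about {weight_gb:.2f} GB of float16 weights")
```

This counts weights only. Activations and the key/value cache for a 2048-token context need additional memory on top of it.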

Best Alternatives to OpenBezoar HH RLHF SFT

Best Alternatives | Context / RAM | Downloads / Likes
ISA 03 Mini 3B Hybrid Preview | 256K / 6.5 GB | 9334
Llama 3.2 3B Instruct | 128K / 6.5 GB | 16957091611
Llama 3.2 3B | 128K / 6.5 GB | 275293606
Hermes 3 Llama 3.2 3B | 128K / 6.5 GB | 15494164
DeepSeek R1 Distill Llama 3B | 128K / 6.5 GB | 178215
Llama 3.2 3B RP Toxic Fuse | 128K / 6.4 GB | 1532
Orpheus 3B 0.1 Ft | 128K / 6.6 GB | 280275
Calme 3.1 Llamaloi 3B | 128K / 10.6 GB | 23551
Cogito V1 Preview Llama 3B | 128K / 7.2 GB | 123196
Jajuka 3B | 128K / 6.4 GB | 262

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124