Snorkel Mistral PairRM DPO 8.0bpw H8 EXL2 is an open-source language model published by LoneStriker: an 8.0 bits-per-weight (H8) EXL2 quantization of Snorkel-Mistral-PairRM-DPO. Features: LLM, VRAM: 7.4GB, Context: 32K, License: apache-2.0, Quantized, LLM Explorer Score: 0.12.
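Because the weights are in EXL2 format, the model is normally run with the exllamav2 backend (or a front end that wraps it, such as text-generation-webui). Below is a minimal load-and-generate sketch, assuming exllamav2's basic generator API and a local copy of the weight directory; the path, prompt, and sampling settings are placeholders, not recommended values.

from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path: a local download of LoneStriker/Snorkel-Mistral-PairRM-DPO-8.0bpw-h8-exl2
model_dir = "./Snorkel-Mistral-PairRM-DPO-8.0bpw-h8-exl2"

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # the weights alone need roughly the listed 7.4 GB of VRAM
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

# Mistral-Instruct-style prompt template, assumed from the base model's chat format.
prompt = "[INST] Summarize the idea behind Direct Preference Optimization. [/INST]"
output = generator.generate_simple(prompt, settings, 256)  # 256 = max new tokens
print(output)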
The underlying Snorkel-Mistral-PairRM-DPO model was trained with the following iterative pipeline:
1. Generate five response variations for each prompt from a subset of 20,000 prompts using the LLM; the first iteration started from Mistral-7B-Instruct-v0.2.
2. Apply PairRM for response reranking.
3. Update the LLM by applying Direct Preference Optimization (DPO) on the top (chosen) and bottom (rejected) responses.
4. Use this LLM as the base model for the next iteration, repeating three times in total.
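The loop above maps directly onto off-the-shelf tooling. The following is a minimal sketch of a single iteration, assuming the llm-blender package for PairRM reranking and trl's DPOTrainer for the preference-optimization step; the prompt list, sampling parameters, and trainer arguments are illustrative placeholders rather than Snorkel's actual configuration.

import llm_blender
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, pipeline
from trl import DPOTrainer

base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # starting point for iteration 1
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer has no pad token by default

# 1. Generate five candidate responses per prompt (20,000 prompts in the original setup).
generator = pipeline("text-generation", model=base_model_id, tokenizer=tokenizer, device_map="auto")
prompts = ["Explain what Direct Preference Optimization does."]  # placeholder prompt subset
candidates = []
for prompt in prompts:
    outs = generator(prompt, do_sample=True, num_return_sequences=5,
                     max_new_tokens=512, return_full_text=False)
    candidates.append([o["generated_text"] for o in outs])

# 2. Rerank the candidates with PairRM (rank 1 = most preferred).
blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM")
ranks = blender.rank(prompts, candidates)

# 3. Keep the best-ranked response as "chosen" and the worst-ranked as "rejected".
pairs = {"prompt": [], "chosen": [], "rejected": []}
for prompt, cands, rank in zip(prompts, candidates, ranks):
    pairs["prompt"].append(prompt)
    pairs["chosen"].append(cands[int(rank.argmin())])
    pairs["rejected"].append(cands[int(rank.argmax())])
dpo_dataset = Dataset.from_dict(pairs)

# 4. Run DPO on the pairs; the updated model becomes the base for the next iteration.
model = AutoModelForCausalLM.from_pretrained(base_model_id)
training_args = TrainingArguments(output_dir="dpo-iter-1", per_device_train_batch_size=1,
                                  num_train_epochs=1, learning_rate=5e-7)
trainer = DPOTrainer(model=model, ref_model=None, args=training_args, beta=0.1,
                     train_dataset=dpo_dataset, tokenizer=tokenizer,
                     max_length=1024, max_prompt_length=512)
# Note: newer trl releases move beta/max_length into DPOConfig and rename tokenizer to processing_class.
trainer.train()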
Rank the Snorkel Mistral PairRM DPO 8.0bpw H8 EXL2 Capabilities
Have you tried this model? Rate its performance. This feedback would greatly assist the ML community in identifying the most suitable model for their needs. Your contribution really does make a difference!
Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation