| Use Cases |
| Areas: | | General-purpose chat, Assistance with writing and coding, Live chat agent, In-game NPC interactions |
|
| Applications: | | Conversational agents, Programming assistance, Customer support, Gaming |
|
| Primary Use Cases: | | Interactive dialogue, Text generation |
|
| Limitations: | | Potential for biased output, Context window limited to 8192 tokens, May produce factual errors |
|
| Considerations: | | Users should verify critical information and be aware of potential biases. |
|
|
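
The 8192-token context window listed under Limitations means long conversations have to be trimmed before they are sent to the model. The following is a minimal sketch of one way to do that, not something the model card prescribes. `approx_token_count` is a hypothetical whitespace-based stand-in for the model's real tokenizer, and the `reserve_for_reply` default of 512 is an arbitrary assumption.

```python
from typing import Callable, List

CONTEXT_WINDOW = 8192  # token limit stated in the Limitations row


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words.

    Real tokenizers usually produce more tokens than this, so treat it as a rough guide.
    """
    return len(text.split())


def trim_history(
    messages: List[str],
    reserve_for_reply: int = 512,  # assumed headroom for the generated answer
    count_tokens: Callable[[str], int] = approx_token_count,
    limit: int = CONTEXT_WINDOW,
) -> List[str]:
    """Keep the most recent messages that fit in the window, leaving room for the reply."""
    budget = limit - reserve_for_reply
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent turns are kept first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

In practice the `count_tokens` argument would be replaced with a function backed by the model's own tokenizer, since word counts underestimate token counts.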
| Training Details |
| Data Sources: | | darkcloudai-smallmodel-frontieredition, darkcloudai-webdriver-redditcrawl-2023, darkcloudai-unalignment-truthfulness, darkcloudai-generaldpo, ai2_arc, allenai/ultrafeedback_binarized_cleaned, argilla/distilabel-intel-orca-dpo-pairs, jondurbin/airoboros-3.2, codeparrot/apps, facebook/belebele, bluemoon-fandom-1-1-rp-cleaned, boolq, camel-ai/biology, camel-ai/chemistry, camel-ai/math, camel-ai/physics, jondurbin/contextual-dpo-v0.1, jondurbin/gutenberg-dpo-v0.1, jondurbin/py-dpo-v0.1, jondurbin/truthy-dpo-v0.1, LDJnr/Capybara, jondurbin/cinematika-v0.1, WizardLM/WizardLM_evol_instruct_70k, glaiveai/glaive-function-calling-v2, grimulkan/LimaRP-augmented, lmsys/lmsys-chat-1m, ParisNeo/lollms_aware_dataset, TIGER-Lab/MathInstruct, Muennighoff/natural-instructions, openbookqa, kingbri/PIPPA-shareGPT, piqa, Vezora/Tested-22k-Python-Alpaca, ropes, cakiki/rosetta-code, Open-Orca/SlimOrca, b-mc2/sql-create-context, squad_v2, mattpscott/airoboros-summarization, migtissera/Synthia-v1.3, unalignment/toxic-dpo-v0.2, WhiteRabbitNeo/WRN-Chapter-1, WhiteRabbitNeo/WRN-Chapter-2, winogrande |
|
| Methodology: | | Supervised Fine-Tuning (SFT), combined with Direct Preference Optimization (DPO) and Reinforcement Learning from AI Feedback (RLAIF) |
|
| Context Length: | | 8192 tokens |
| Model Architecture: | |
|
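
The Methodology row names Direct Preference Optimization. As a rough illustration of what that objective computes for a single preference pair, here is the standard DPO loss in plain Python. The card does not state the loss variant, the `beta` value, or the training framework used, so the `beta=0.1` default and the function name `dpo_loss` are assumptions for illustration only.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the card does not give one
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy favors each response than the frozen reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and the reference agree exactly, the loss equals `log(2)`; it drops below that as the policy shifts probability toward the chosen response relative to the reference, and rises above it in the opposite case.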