| Model Type | |
| Use Cases |
| Considerations: | | Use a context window of at least 16k tokens, because responses tend to be long. |
|
| Additional Notes | | The model produces long, verbose reasoning responses with detailed step-by-step explanations, for example of math proofs. |
|
| Supported Languages | | en (English), de (German), fr (French), it (Italian), pt (Portuguese), hi (Hindi), es (Spanish), th (Thai) |
|
| Training Details |
| Data Sources: | | leafspark/DetailedReflection-Claude-v3_5-Sonnet |
|
| Data Volume: | | 81 examples, each approximately 3000 tokens |
|
| Methodology: | | Fine-tuned with Unsloth using LoRA (rank 128), packing enabled, batch size 2, gradient accumulation steps 4, 3 epochs, 30 steps. |
|
| Context Length: | |
| Training Time: | |
| Hardware Used: | |
|
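
The training figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the `TrainingSetup` class and its field names are hypothetical, and the step estimate assumes one training sequence per example with the incomplete final batch dropped. Packing concatenates examples, so the real sequence count depends on the training sequence length, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as listed in the table (class and field names are hypothetical)."""

    num_examples: int = 81
    tokens_per_example: int = 3000  # approximate, per the Data Volume row
    lora_rank: int = 128
    packing: bool = True
    batch_size: int = 2
    gradient_accumulation_steps: int = 4
    epochs: int = 3
    reported_steps: int = 30

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer step.
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def approx_total_tokens(self) -> int:
        # Rough size of the training set in tokens.
        return self.num_examples * self.tokens_per_example

    def estimated_steps(self) -> int:
        # Assumption: one sequence per example, incomplete final batch dropped.
        steps_per_epoch = self.num_examples // self.effective_batch_size
        return steps_per_epoch * self.epochs


setup = TrainingSetup()
print(setup.effective_batch_size)  # 8
print(setup.approx_total_tokens)   # 243000
print(setup.estimated_steps())     # 30
```

Under those assumptions the estimate matches the reported 30 steps, but this is a consistency check and not a description of the actual training script.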
| Input Output |
| Input Format: | | Prompts should be formatted to begin with and utilize nested XML tags for the reasoning process and response generation. |
|
| Accepted Modalities: | |
| Output Format: | |
| Performance Tips: | | For coherent responses, use the recommended sampling parameters: Temperature: 0.15, Min-P: 0.2, Top-K: 50, Top-P: 1, Frequency Penalty: 0.5, Presence Penalty: 0.1. |
|
|
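
The recommended sampling parameters can be kept in one place and merged into each generation request. This is a minimal sketch: the key names (`min_p`, `top_k`, and so on) follow a common naming convention but are an assumption here, as are the `build_payload` helper and the `prompt` field, so check what your inference server actually expects.

```python
import json

# Values from the Performance Tips row; the key names are an assumption.
RECOMMENDED_SAMPLING = {
    "temperature": 0.15,
    "min_p": 0.2,
    "top_k": 50,
    "top_p": 1.0,
    "frequency_penalty": 0.5,
    "presence_penalty": 0.1,
}


def build_payload(prompt: str, **overrides) -> dict:
    """Merge the recommended sampling values into a generic request body."""
    unknown = set(overrides) - set(RECOMMENDED_SAMPLING)
    if unknown:
        # Reject misspelled keys instead of silently sending them along.
        raise ValueError(f"unknown sampling parameters: {sorted(unknown)}")
    return {"prompt": prompt, **RECOMMENDED_SAMPLING, **overrides}


payload = build_payload("Prove that the square root of 2 is irrational.")
print(json.dumps(payload, indent=2))
```

Individual values can be overridden per request, for example `build_payload(prompt, top_k=40)`, while the remaining parameters keep their recommended defaults.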