Model Type | text generation, chat, roleplay, storywriting |
|
Use Cases |
Areas: | research, commercial applications |
|
Applications: | chat applications, storywriting, roleplay |
|
Primary Use Cases: | extended context chat, long-form text generation |
|
Limitations: | may generate biased or incomplete information
|
Considerations: | Suitable for tasks requiring extended context. |
|
|
Additional Notes | The model is merged with Kaio Ken's SuperHOT to provide an extended token context.
|
Supported Languages | English (fluent), 20 other languages (basic) |
|
Training Details |
Data Sources: | |
Data Volume: | |
Methodology: | LoRA training merged with Kaio Ken's SuperHOT to extend the context window (see the merge sketch below)
|
Context Length: | |
Training Time: | |
Hardware Used: | |
Model Architecture: | Transformer-based architecture. |
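As a rough illustration of the methodology above, here is a minimal sketch of merging a SuperHOT-style LoRA into a base model with the peft library; both repository IDs below are placeholders, not the actual checkpoints.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder identifiers; substitute the actual base model and SuperHOT LoRA.
base = AutoModelForCausalLM.from_pretrained("base-llama-model")
model = PeftModel.from_pretrained(base, "superhot-8k-lora")

# Fold the LoRA deltas into the base weights so the merged model
# can be saved and loaded as a standalone checkpoint.
model = model.merge_and_unload()
model.save_pretrained("merged-superhot-8k")
```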
|
|
Responsible AI Considerations |
Fairness: | Considerations related to fairness and bias are not provided. |
|
Transparency: | Transparency notes are limited to how sequence lengths are set and contexts are extended.
|
Accountability: | Accountability is not explicitly outlined in the model card. |
|
Mitigation Strategies: | Use the Alpaca data format for optimal performance.
|
|
Input Output |
Input Format: | text following the Alpaca instruction format (see the template sketch below)
|
Accepted Modalities: | text
Output Format: | text
Performance Tips: | Load the model with 'trust_remote_code=True' so its custom code can adjust sequence lengths and scaling factors (see the loading sketch below).
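As a reference for the input format above, this is the standard Alpaca instruction template; confirm the exact wording against the model's own examples.

```python
# Standard Alpaca instruction template; verify details against the model's docs.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    instruction="Summarize the plot of Hamlet in two sentences."
)
```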
|
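And a minimal loading sketch for the performance tip above, assuming a Transformers-compatible checkpoint; the model ID is a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/model-superhot-8k"  # placeholder; use the real repository ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # lets the repo's patched code set sequence length and scaling
)

prompt = "### Instruction:\nName three uses of extended context.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```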
|
Release Notes |
Notes: | Merged with Kaio Ken's SuperHOT 8K to achieve 8K context during inference. |
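The 8K extension rests on position interpolation: rotary position indices are compressed by a constant factor so that positions beyond the original training window stay within the range the model saw during training. A minimal numerical sketch, assuming the usual 2,048-token base window:

```python
import torch

ORIGINAL_CTX = 2048   # assumed original training window
EXTENDED_CTX = 8192   # extended inference window
scale = ORIGINAL_CTX / EXTENDED_CTX  # 0.25 compression factor

# Compress position indices so the rotary embeddings never see an
# index beyond the original training window.
position_ids = torch.arange(EXTENDED_CTX, dtype=torch.float32)
scaled_positions = position_ids * scale
assert scaled_positions.max() < ORIGINAL_CTX
```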
|
|
|