Model Type: |

Use Cases
Areas: |
Primary Use Cases: | Chat models, instruct models
Limitations: | Limited capabilities in languages other than those listed under Supported Languages
Considerations: | NTK-YaRN is implemented for extended context capability.

Additional Notes: | Trained with 3D parallelism and ZeRO on AWS SageMaker.
Supported Languages | English (High), German (High), Spanish (High), French (High), Italian (Limited), Portuguese (Limited), Polish (Limited), Dutch (Limited), Romanian (Limited), Czech (Limited), Swedish (Limited) |

Training Details
Data Sources: | OpenAssistant/oasst1, ehartford/dolphin, tau/sled, tiiuae/falcon-refinedweb, internal, internal-long-context
Data Volume: |
Methodology: | Supervised fine-tuning with a custom NTK-YaRN method for context-length extension
Context Length: |
Hardware Used: |
Model Architecture: |
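The methodology above names NTK-YaRN for context-length extension. As a minimal sketch of the NTK-aware part of such schemes, the snippet below scales the RoPE base so low-frequency rotary components stretch to cover longer contexts while high-frequency components stay nearly intact. The function name and the `scale` factor are illustrative assumptions; the model card does not state the exact method internals or extension factor.

```python
def ntk_scaled_rope_freqs(dim: int, base: float = 10000.0, scale: float = 4.0) -> list[float]:
    """RoPE inverse frequencies with NTK-aware base scaling (sketch).

    Raising the base by scale**(dim / (dim - 2)) stretches the
    low-frequency rotary components (enabling longer contexts) while the
    high-frequency components, which encode local token order, change
    very little. `scale` is an assumed context-extension factor.
    """
    ntk_base = base * scale ** (dim / (dim - 2))
    return [1.0 / ntk_base ** (2 * i / dim) for i in range(dim // 2)]
```

With `scale=1.0` this reduces exactly to the standard RoPE frequency schedule, which is why the scheme needs no retraining from scratch.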
Input Output
Input Format: | Prompts with integrated chat tokens for instruct and chat modes
Accepted Modalities: |
Output Format: | Generated text based on the input query
Performance Tips: | Ensure chat tokens are correctly integrated into prompts for optimal performance.
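The input format and performance tip both hinge on weaving chat tokens into the prompt correctly. A minimal sketch of that assembly step follows; the special tokens shown (`<|user|>`, `<|assistant|>`, `<|endoftext|>`) are placeholders for illustration only, since the model card does not list the actual tokens.

```python
# Placeholder chat tokens -- NOT the model's real special tokens.
USER_TOKEN = "<|user|>"
ASSISTANT_TOKEN = "<|assistant|>"
END_TOKEN = "<|endoftext|>"

def build_chat_prompt(turns: list[tuple[str, str]]) -> str:
    """Serialize (role, text) turns into a single prompt string."""
    parts = []
    for role, text in turns:
        token = USER_TOKEN if role == "user" else ASSISTANT_TOKEN
        parts.append(f"{token}{text}{END_TOKEN}")
    # End with the assistant token so generation continues as the reply.
    parts.append(ASSISTANT_TOKEN)
    return "".join(parts)
```

Omitting or misplacing these tokens is the usual cause of degraded chat-mode output, which is what the performance tip warns about.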