| Model Type | decoder-only transformer, text generation |
|
**Use Cases**

| Areas | research, commercial applications |
| Applications | text generation, natural language understanding |
| Primary Use Cases | translation, code generation (see the usage sketch below) |
| Limitations | limited proficiency outside the supported languages |
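Since text generation is the primary application, a minimal usage sketch may help. This is a sketch only: `org/model-name` is a hypothetical placeholder (the card does not give a checkpoint id), and it assumes the model is published in Hugging Face `transformers` format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # hypothetical id, not taken from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A base model continues text rather than following instructions,
# so prompt it with a passage to complete.
inputs = tokenizer("Helsinki is the capital of", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```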
|
|
| Additional Notes | This is a base model; it requires further fine-tuning for specific use cases (see the sketch below). |
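Because the card flags the checkpoint as a base model, a bare-bones fine-tuning loop illustrates what "further fine-tuning" involves. Everything here is an assumption for illustration: `org/model-name` is a placeholder id, and `texts` stands in for a real task-specific dataset.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # hypothetical id, not taken from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
texts = ["<task-specific example 1>", "<task-specific example 2>"]  # placeholder data

for epoch in range(1):
    for text in texts:
        batch = tokenizer(text, return_tensors="pt", truncation=True)
        # For causal-LM fine-tuning, the labels are the input ids themselves.
        outputs = model(**batch, labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```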
|
| Supported Languages | fi (fluent), en (fluent), da (fluent), sv (fluent), no (fluent), nn (fluent), is (fluent) |
|
**Training Details**

| Data Sources | cerebras/SlimPajama-627B, bigcode/starcoderdata, mc4 (see the streaming sketch below) |
| Data Volume | |
| Context Length | |
| Training Time | |
| Hardware Used | |
| Model Architecture | GPT-like, with rotary positional embeddings and flash attention |
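The architecture row mentions rotary positional embeddings (RoPE). As a rough illustration of the technique, the sketch below rotates each channel pair of a per-token vector by a position-dependent angle (the rotate-half formulation); it is a minimal reference sketch, not the model's actual code.

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary positional embeddings to x of shape (seq_len, dim)."""
    seq_len, dim = x.shape
    half = dim // 2
    # One frequency per channel pair, decaying geometrically.
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    # Angle grows linearly with token position, so relative offsets
    # between tokens are encoded in the dot products of rotated vectors.
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Example: 8 tokens, 16-dim attention head.
q = torch.randn(8, 16)
q_rot = rotary_embed(q)
```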
|
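The data sources are large public corpora, so streaming is the practical way to inspect them. The sketch below uses the Hugging Face `datasets` library and assumes the corpus exposes a `text` field, which cerebras/SlimPajama-627B does; the same pattern applies to bigcode/starcoderdata and mc4.

```python
from datasets import load_dataset

# Stream examples without downloading the full 627B-token corpus.
ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example["text"][:200])  # first 200 characters of each document
    if i >= 2:
        break
```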
|
**Responsible AI Considerations**

| Fairness | May produce outputs that are inaccurate, biased, or controversial, reflecting patterns in its training data. |
| Mitigation Strategies | Users should perform additional evaluation and customize the model (e.g., via fine-tuning) for their use case. |
|
|
**Input / Output**

| Input Format | |
| Accepted Modalities | |
| Output Format | |
|
**Release Notes**

| Version | |
| Date | |
| Notes | Initial model release, trained on a partial subset of the full training data. |
|
|
|