| Model Type | | Decoder-only transformer language model |
|
| Use Cases |
| Areas: | | Research, Evaluation of Large Language Models in Nordic languages |
|
| Limitations: | | Bias and safety limitations; possibly inaccurate or irrelevant content; limited generation diversity; potential for generating offensive or inappropriate content |
|
| Considerations: | | The diversity of the training data is a concern, and a feedback mechanism is required for affected individuals. |
|
|
| Supported Languages | | Danish (da), Swedish (sv), Norwegian (no), English (en), Icelandic (is); proficiency level: fluent |
|
| Training Details |
| Data Sources: | | Books from Litteraturbanken, The Pile, Articles from Diva, The Pile: PubMed, The Pile: ArXiv, Code from Code Parrot: Github, Pushshift.io Reddit dataset, English Math dataset, Swedish Math dataset, Summarization data, OPUS, Movie scripts, Natural Instructions, P3, The Norwegian Colossal Corpus, Danish Gigaword, Icelandic Gigaword, The Pile: Stack Exchange, Web Common Crawl, MC4, OSCAR, Open Web Text, Miscellaneous public Swedish websites, Familjeliv Articles, Public Swedish Job Ads, Wikipedia |
|
| Data Volume: | |
| Methodology: | | Pretrained using a causal language modeling objective |
|
| Model Architecture: | |
|
| Responsible AI Considerations |
| Fairness: | | The model has limitations regarding bias and safety. |
|
| Transparency: | | Communication and transparency around usage is encouraged. |
|
| Mitigation Strategies: | | Controlled pre-release; feedback collection from Nordic NLP ecosystem. |
|
|
| Release Notes | |
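
The causal language modeling objective named under Training Details can be illustrated with a minimal sketch: the loss is the average negative log-likelihood of each token given only the tokens before it. This is not the model's training code. The names `causal_lm_loss`, `next_token_probs`, and `uniform_model` are hypothetical, and a toy uniform distribution stands in for a real decoder.

```python
import math


def causal_lm_loss(token_ids, next_token_probs):
    """Average negative log-likelihood of each token given its prefix.

    `next_token_probs` is a hypothetical callable: it takes the prefix
    (a list of token ids) and returns a dict {token_id: probability}
    for the next token. A real decoder-only model would play this role.
    """
    if len(token_ids) < 2:
        raise ValueError("need at least two tokens to form a prediction")
    total = 0.0
    # Position t is predicted from tokens 0..t-1 only (left-to-right).
    for t in range(1, len(token_ids)):
        probs = next_token_probs(token_ids[:t])
        total += -math.log(probs[token_ids[t]])
    return total / (len(token_ids) - 1)


def uniform_model(prefix, vocab_size=4):
    """Toy stand-in (assumption): every token is equally likely."""
    return {tok: 1.0 / vocab_size for tok in range(vocab_size)}


loss = causal_lm_loss([0, 2, 1, 3], uniform_model)
print(f"loss = {loss:.4f}")
```

With the uniform stand-in, the loss equals the natural log of the vocabulary size, which is the baseline a trained model is expected to improve on.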