| Model Type | | text-generation, causal language model |
|
| Use Cases |
| Areas: | |
| Primary Use Cases: | |
|
| Additional Notes | | The model is intended for general-purpose use and is available under a permissive license. It may exhibit biases, and users who deploy the model should take steps to mitigate the associated risks. |
|
| Supported Languages | | en (English), es (Spanish), ca (Catalan) |
|
| Training Details |
| Data Sources: | | Wikipedia, C4, Biomedical, Legal, Gutenberg, RacoCatalà Noticias, RacoCatalà Forums, CaWaC, Vilaweb |
|
| Data Volume: | |
| Methodology: | | Language adaptation using a BPE tokenizer and updated embeddings |
|
| Model Architecture: | |
|