Model Type | text-generation, causal language model |
|
Use Cases |
Areas: | |
Primary Use Cases: | |
|
Additional Notes | The model is intended for general-purpose use and is available under a permissive license. It may exhibit biases, and users who deploy it should take steps to mitigate the associated risks. |
|
Supported Languages | en (English), es (Spanish), ca (Catalan) |
|
Training Details |
Data Sources: | Wikipedia, C4, Biomedical, Legal, Gutenberg, RacoCatalà Noticias, RacoCatalà Forums, CaWaC, Vilaweb |
|
Data Volume: | |
Methodology: | Language adaptation with a BPE tokenizer and an embeddings update |
|
Model Architecture: | |
|
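
The Methodology row names an embeddings update as part of language adaptation, but the card does not say how it was done. The sketch below shows one common approach, assumed here purely for illustration: tokens shared by the old and new vocabularies keep their trained vectors, and tokens that are new start from the mean of the old vectors plus a little noise. The function `adapt_embeddings`, the toy vocabularies, and the noise scale of 0.02 are all hypothetical and are not taken from the model's actual training code.

```python
import random


def adapt_embeddings(old_vocab, old_emb, new_vocab, seed=0):
    """Build an embedding table for new_vocab from an existing table.

    Assumption (not from the model card): shared tokens copy their old
    vector, unseen tokens start near the mean of all old vectors.
    new_vocab ids are expected to be contiguous, starting at 0.
    """
    dim = len(old_emb[0])
    # Per-dimension mean of the old embedding table.
    mean = [sum(row[d] for row in old_emb) / len(old_emb) for d in range(dim)]
    rng = random.Random(seed)

    new_emb = [None] * len(new_vocab)
    for token, new_id in new_vocab.items():
        if token in old_vocab:
            # Token exists in both vocabularies: reuse the trained vector.
            new_emb[new_id] = list(old_emb[old_vocab[token]])
        else:
            # Token is new: mean vector plus small Gaussian noise.
            new_emb[new_id] = [m + rng.gauss(0.0, 0.02) for m in mean]
    return new_emb


# Toy data: a 3-token source vocabulary with 2-dimensional embeddings.
old_vocab = {"the": 0, "model": 1, "data": 2}
old_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Hypothetical target vocabulary after training a new BPE tokenizer.
new_vocab = {"model": 0, "dades": 1, "the": 2}

new_emb = adapt_embeddings(old_vocab, old_emb, new_vocab)
```

In this toy run, `model` and `the` carry over their original vectors under their new ids, while `dades` receives a freshly initialized vector that would then be learned during continued training on the target-language data.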