Model Type: causal language model, text generation

Use Cases
Areas: research, commercial applications
Primary Use Cases: causal language modeling, text-generation tasks (see the usage sketch below)
Limitations: Inherits biases present in the training data; no bias or toxicity estimates are currently available.
Considerations: The model is trained on data that may contain biases and should be used with caution.
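As a causal language model, it can be run with standard text-generation tooling. A minimal sketch using the Hugging Face `transformers` library; the checkpoint name `your-org/your-model` is a placeholder, since the card does not pin an identifier:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model identifier; substitute the actual checkpoint name.
model_id = "your-org/your-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Plain-text prompt in, generated continuation out (causal language modeling).
inputs = tokenizer("El mercat del barri és", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```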
|
|
Additional Notes: A small amount of English data was retained during training to prevent catastrophic forgetting.
|
Supported Languages: English (en, high), Spanish (es, high), Catalan (ca, high)
|
Training Details
Data Sources: Wikipedia, C4_es, Biomedical, Legal, Gutenberg, C4_ca, RacoCatalà Noticias, RacoCatalà Forums, CaWaC, Vilaweb
Data Volume: not specified
Methodology: Adapted from a pretrained base model by swapping in a new tokenizer and adjusting the embedding layer to match its vocabulary (see the sketch after this section).
Training Time: not specified
Hardware Used: 8 NVIDIA H100 GPUs with 80 GB of memory each
Model Architecture: Uses a Byte-Pair Encoding (BPE) tokenizer with a vocabulary of 50,257 tokens.
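
The tokenizer-swap methodology can be sketched in code. The following is illustrative only, using `transformers` with hypothetical checkpoint and tokenizer names; the exact embedding-initialization scheme used for this model is not documented in this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical identifiers: the actual base checkpoint and the new
# BPE tokenizer (50,257-token vocabulary) are not named in this card.
base_model_id = "base-org/base-model"
new_tokenizer_id = "target-org/new-bpe-tokenizer"

model = AutoModelForCausalLM.from_pretrained(base_model_id)
new_tokenizer = AutoTokenizer.from_pretrained(new_tokenizer_id)

# Adjust the embedding layer (and any tied output head) to the size
# of the swapped-in vocabulary.
model.resize_token_embeddings(len(new_tokenizer))

# Continued pretraining on the target-language corpora listed above
# would then let the resized embeddings adapt to the new vocabulary.
```

Note that `resize_token_embeddings` only reshapes the embedding matrix; any newly added rows are freshly initialized and must be learned during continued pretraining.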
|
|
Responsible AI Considerations
Fairness: No measures have been taken to estimate bias and toxicity at the time of submission.
Transparency: The model is provided with documentation of its creation and intended use.
Accountability: Accountability lies with the users deploying the model.
Mitigation Strategies: Users of the model should aim to mitigate the risks associated with bias and toxicity.
|
|
Input Output
Input Format: plain-text prompts
Accepted Modalities: text
Output Format: generated text (continuations of the input prompt)
|
Release Notes: not specified