| Model Type | | text-generation, trilingual |
|
| Use Cases |
| Areas: | | research, multilingual applications |
|
| Limitations: | | max_seq_length = 2048, float16, vocab size: 150 016 |
|
|
| Supported Languages | | hu (native), en (native), zh (native) |
|
| Training Details |
| Data Sources: | | Hungarian corpus, English corpus, GitHub, Chinese corpus |
|
| Data Volume: | | Hungarian: 41.5B words (314 GB), English: 61.9B words (391 GB), Chinese: 98.7B Chinese characters (340 GB), 6 million GitHub documents (33 GB) |
|
| Methodology: | | Trained with EleutherAI's GPT-NeoX |
|
| Context Length: | |
| Model Architecture: | |
|
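
The limits listed above (`max_seq_length = 2048`, vocab size 150 016) constrain how long a prompt can be once room is reserved for generated tokens. The following is a minimal sketch of that bookkeeping in plain Python, not code from the model's repository: the helper name `fit_prompt` and the `max_new_tokens` parameter are hypothetical, and the token ids are assumed to come from whatever tokenizer ships with the model.

```python
# Values taken from the "Limitations" row of the table above.
MAX_SEQ_LENGTH = 2048
VOCAB_SIZE = 150_016


def fit_prompt(token_ids, max_new_tokens):
    """Hypothetical helper: trim a prompt so prompt + generation fits the window.

    Keeps the most recent tokens, since the end of the prompt is usually
    the part the model has to continue from.
    """
    if not 0 < max_new_tokens < MAX_SEQ_LENGTH:
        raise ValueError("max_new_tokens must be between 1 and MAX_SEQ_LENGTH - 1")
    if any(not 0 <= t < VOCAB_SIZE for t in token_ids):
        raise ValueError("token id outside the vocabulary")
    budget = MAX_SEQ_LENGTH - max_new_tokens  # tokens left for the prompt
    return list(token_ids[-budget:])
```

For example, a 3000-token prompt with 48 tokens reserved for generation is cut to its last 2000 tokens, while a prompt that already fits is returned unchanged.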