Use Cases |
Areas: | academic research, commercial use |
|
Applications: | multilingual tasks, text generation, dialogue, summarization |
|
Primary Use Cases: | Chinese question answering, English question answering, language comprehension, common-sense question answering, logical reasoning, math problem solving, coding |
|
Limitations: | may produce inaccurate, biased, or offensive content |
|
Considerations: | Developers should conduct safety tests before deployment. |
|
|
Supported Languages | en (54.91), zh (31.09), ru (3.15), ja (3.22), de (1.52), es (0.91), fr (0.73), pl (0.48), it (0.36), pt (0.34), nl (0.20), cs (0.27), sv (0.15), ko (0.18), fi (0.14), ar (0.12), ro (0.11), bg (0.10), th (0.10), da (0.09), hu (0.19), no (0.07), hi (0.07), iw (0.06), fa (0.07), sl (0.05), et (0.04), lv (0.03), sk (0.08), ms (0.05), ca (0.06), sr (0.03), tr (0.23), uk (0.24), id (0.13), mr (0.08), lt (0.05), kk (0.02), ta (0.03) |
|
Training Details |
Data Sources: | web pages, code, encyclopedia, books, academic papers, QA, other |
|
Data Volume: | |
Methodology: | FlashAttention2, 3D parallelism with virtual pipeline |
|
Context Length: | |
Hardware Used: | A800 80GB GPU, 1500GB memory for training |
|
Model Architecture: | |
|