| Field | Value |
| --- | --- |
| Model Type | Transformer-based Language Model, text-generation |
| Use Cases | |
| Areas: | research, commercial applications |
| Limitations: | The models have not been tuned to ensure outputs align with human intent and safety considerations. |
| Considerations: | The models are still in the early stages of development. |
| Supported Languages | |
| Training Details | |
| Data Sources: | Japanese Wikipedia, Common Crawl, English Wikipedia, The Pile, The Stack |
| Data Volume: | |
| Context Length: | |
| Hardware Used: | 128 A100 40GB GPUs, 8 A100 40GB GPUs |
| Model Architecture: | |
| Input Output | |
| Input Format: | |
| Accepted Modalities: | |
| Output Format: | |
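
Since the card describes a transformer-based text-generation model, the sketch below illustrates how such a model is typically loaded and prompted with the Hugging Face `transformers` library. This is a minimal example under assumptions: the repository ID `org/model-name` is a hypothetical placeholder (the card above does not specify one), and the generation parameters are illustrative defaults, not values taken from this card.

```python
# Minimal text-generation sketch using the Hugging Face transformers API.
# "org/model-name" is a hypothetical placeholder; substitute the model's
# actual repository ID, which this card does not specify.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Input: a plain-text prompt; output: a generated text continuation.
prompt = "自然言語処理とは"  # "Natural language processing is ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # illustrative length cap
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```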