**Model Type:** decoder-only transformer

| Use Cases | |
|---|---|
| Areas | research, commercial applications |
| Limitations | unreliable outputs, unsafe behaviors, offensive content |
| Considerations | The model may exhibit undesirable behaviors and should be fine-tuned and evaluated for the intended application before deployment. |

**Additional Notes:** For questions and comments about the model, email lm@stability.ai.

**Supported Languages:** en (fluent), de (fluent), es (fluent), fr (fluent), it (fluent), nl (fluent), pt (fluent)
|
| Training Details | |
|---|---|
| Data Sources | tiiuae/falcon-refinedweb, togethercomputer/RedPajama-Data-1T, uonlp/CulturaX, CarperAI/pilev2-dev, bigcode/starcoderdata, DataProvenanceInitiative/Commercially-Verified-Licenses (see the streaming example below) |
| Data Volume | |
| Context Length | |
| Hardware Used | 384 NVIDIA H100 GPUs (AWS P5 instances) |
| Model Architecture | Decoder-only transformer with 12.1 billion parameters, a hidden size of 5120, 40 layers, and 32 attention heads (a rough parameter estimate is sketched below) |
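
The data sources listed above are Hugging Face Hub dataset identifiers. As a minimal sketch of how to inspect one of them (assuming the `datasets` library is installed and that `tiiuae/falcon-refinedweb` exposes a default `train` split), streaming avoids downloading the corpus in full:

```python
from datasets import load_dataset

# Stream the corpus so only the first few records are fetched
# (RefinedWeb alone is on the order of terabytes).
ds = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

for record in ds.take(3):
    # RefinedWeb stores document text under "content"; the other
    # corpora in the list use their own field names (e.g. "text").
    print(record["content"][:200])
```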
|
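The architecture row pins down the transformer skeleton but not every sizing choice. A back-of-the-envelope parameter count under generic assumptions (a 4x MLP expansion, untied embeddings, and a ~100k-token vocabulary, none of which this card states) shows how the pieces add up:

```python
# Rough parameter estimate from the Model Architecture row.
hidden_size = 5120
num_layers = 40
num_heads = 32
head_dim = hidden_size // num_heads        # 5120 / 32 = 160

attention = 4 * hidden_size**2             # Q, K, V and output projections
mlp = 2 * hidden_size * (4 * hidden_size)  # assumed 4x up/down projection
per_layer = attention + mlp                # ~12 * hidden_size^2

vocab_size = 100_000                       # assumed; not given on the card
embeddings = 2 * vocab_size * hidden_size  # assumed untied input/output

total = num_layers * per_layer + embeddings
print(f"~{total / 1e9:.1f}B parameters")   # ~13.6B under these assumptions
```

The estimate overshoots the reported 12.1 billion, which suggests a narrower MLP, tied embeddings, or a smaller vocabulary than assumed here; the card does not specify which.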
|
| Input Output | |
|---|---|
| Input Format | text |
| Accepted Modalities | text |
| Output Format | generated text from the model |
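
A minimal text-in, text-out sketch using the `transformers` library. The checkpoint identifier below is a placeholder (the card does not name the repository), and the generation settings are illustrative rather than recommended defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "stabilityai/<model-name>"  # placeholder; substitute the real repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Input: plain text, the only accepted modality.
inputs = tokenizer("The weather in Lisbon today is", return_tensors="pt").to(model.device)

# Output: generated text continuing the prompt.
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```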
|
|