| Model Type: | text generation, causal-lm |
|
| Use Cases |
| Areas: | |
| Primary Use Cases: | Research into LLMs; foundation model for NLP applications, ethics, and alignment research. |
|
|
| Supported Languages: | |
| Training Details |
| Data Sources: | |
| Data Volume: | |
| Methodology: | GPT-3 style architecture, full attention |
| Context Length: | |
| Hardware Used: | Cerebras Andromeda AI supercomputer (16 CS-2 wafer-scale systems) |
| Model Architecture: | Transformer-based, GPT-3 style |
|
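The Methodology row describes a GPT-3 style architecture with full (dense, non-sparse) attention applied under a causal mask, meaning each position can attend only to itself and earlier positions. A minimal dependency-free sketch of such a mask (illustration only, not the model's actual implementation):

```python
def causal_mask(seq_len):
    """Build a causal attention mask for a sequence of length seq_len.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j <= i. This lower-triangular pattern is what makes a
    "full attention" Transformer causal (autoregressive).
    """
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

mask = causal_mask(4)
# Row 0 attends only to position 0; row 3 attends to positions 0..3.
```

In practice this mask is applied by adding a large negative value to disallowed attention scores before the softmax, so future tokens contribute zero weight.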
|
| Safety Evaluation |
| Ethical Considerations: | The model was trained on the Pile dataset, which was analyzed for ethical issues such as toxicity and bias. |
|
|
| Responsible AI Considerations |
| Fairness: | Potential for distributional bias from the Pile dataset. |
|
| Accountability: | Developers are accountable for the model's outputs when using it in production. |
|
| Mitigation Strategies: | Standard Pile dataset preprocessing mitigations were employed. |
|
|
| Input Output: | |
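As a causal text-generation model, input and output are both text: the model repeatedly predicts the next token and feeds it back in. A toy sketch of that autoregressive loop, using a hypothetical fixed bigram table in place of the real Transformer (the table, vocabulary, and `greedy_generate` helper are illustrative, not part of this model):

```python
# Toy vocabulary and a stand-in "model": row i of BIGRAM_LOGITS gives
# scores for the token that follows token id i. A real causal LM would
# compute these logits with its Transformer stack.
VOCAB = ["<bos>", "hello", "world", "<eos>"]
BIGRAM_LOGITS = [
    [0.0, 2.0, 0.0, 0.0],  # after <bos>, prefer "hello"
    [0.0, 0.0, 2.0, 0.0],  # after "hello", prefer "world"
    [0.0, 0.0, 0.0, 2.0],  # after "world", prefer <eos>
    [2.0, 0.0, 0.0, 0.0],
]

def greedy_generate(start_id, max_new_tokens=8, eos_id=3):
    """Greedy autoregressive decoding: append the argmax token each step."""
    ids = [start_id]
    for _ in range(max_new_tokens):
        logits = BIGRAM_LOGITS[ids[-1]]          # next-token scores
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:                    # stop at end-of-sequence
            break
    return ids

tokens = [VOCAB[i] for i in greedy_generate(0)]
```

Sampling strategies (temperature, top-k, nucleus) replace the argmax step but leave the feed-back loop unchanged.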