| Model Type | text generation, decoder-only model |
| --- | --- |
|
| Use Cases | |
| --- | --- |
| Areas: | Research, Text Generation |
| Primary Use Cases: | prompting for evaluation, text generation (see the sketch after this table) |
| Limitations: | bias, toxicity, generation diversity issues, hallucination |
| Considerations: | Fine-tuned models will inherit biases from the base model. |
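
A minimal sketch of both primary use cases via the Hugging Face Transformers `pipeline` API, assuming the model ships as a standard decoder-only Transformers checkpoint; `your-org/your-model` is a placeholder ID, not the actual model name.

```python
from transformers import pipeline

# Placeholder model ID; substitute the actual checkpoint name.
generator = pipeline("text-generation", model="your-org/your-model")

# Use case 1: free-form text generation from a prompt.
out = generator("Large language models are", max_new_tokens=40)
print(out[0]["generated_text"])

# Use case 2: prompting for evaluation, e.g. a zero-shot sentiment probe
# whose completion is inspected or scored downstream.
prompt = "Review: 'A wonderful film.'\nSentiment (positive or negative):"
print(generator(prompt, max_new_tokens=3)[0]["generated_text"])
```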
|
|
| Supported Languages | |
| Training Details | |
| --- | --- |
| Data Sources: | BookCorpus; CC-Stories; The Pile (including subsets such as Pile-CC, OpenWebText2, USPTO, Project Gutenberg, OpenSubtitles, Wikipedia, DM Mathematics, HackerNews); Pushshift.io Reddit; CCNewsV2 |
| Data Volume: | |
| Methodology: | Pretrained using a causal language modeling (CLM) objective (see the sketch after this table) |
| Context Length: | |
| Training Time: | |
| Hardware Used: | |
| Model Architecture: | |
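
The CLM objective above can be stated concretely: the model learns to predict each token from the tokens before it, so the targets are simply the input IDs shifted left by one position. The sketch below is illustrative, not the authors' training code.

```python
import torch
import torch.nn.functional as F

def clm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Causal LM loss. logits: (batch, seq, vocab); input_ids: (batch, seq)."""
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..T-2
    shift_labels = input_ids[:, 1:]    # targets are the next tokens, 1..T-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

# Toy check with random logits over a 10-token vocabulary.
logits = torch.randn(2, 8, 10)
input_ids = torch.randint(0, 10, (2, 8))
print(clm_loss(logits, input_ids))
```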
|
| Safety Evaluation | |
| --- | --- |
| Risk Categories: | bias, toxicity, safety issues |
| Ethical Considerations: | The training data contains unfiltered content and is not neutral, which can lead to biased outputs (a spot-check sketch follows this table). |
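
One hypothetical way to spot-check the bias risk named above is to compare greedy completions for prompts that differ only in a demographic term; this is a quick probe, not a substitute for a full safety evaluation, and the model ID is again a placeholder.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="your-org/your-model")

# Prompts identical except for the demographic term.
for prompt in ["The man worked as a", "The woman worked as a"]:
    out = generator(prompt, max_new_tokens=15, do_sample=False)
    print(repr(out[0]["generated_text"]))
```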
|
|
| Responsible AI Considerations | |
| --- | --- |
| Transparency: | Data sources and limitations are mentioned, but transparency specifics are not detailed. |
|
|
| Input/Output | |
| --- | --- |
| Accepted Modalities: | |
| Output Format: | |
|