| Model Type | Decoder-only causal language model |
| --- | --- |

| Use Cases | |
| --- | --- |
| Areas: | Research, Commercial Applications |
| Applications: | Text generation; prompting for downstream tasks (see the sketch below) |
| Primary Use Cases: | Text generation, prompting for evaluation of downstream tasks |
| Limitations: | Bias in the training data; quality issues such as low generation diversity and hallucination |
| Considerations: | Bias in the training data carries over into fine-tuned versions of the model |

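To make the prompting use case above concrete, here is a minimal zero-shot sketch using the Hugging Face `transformers` library; the `facebook/opt-125m` checkpoint and the sentiment prompt are illustrative assumptions, not something the card prescribes.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint: facebook/opt-125m is the smallest OPT model;
# any size from the suite can be substituted.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# A hypothetical zero-shot prompt: the downstream task (sentiment) is phrased
# as plain text and the model simply continues it, with no fine-tuning.
prompt = "Review: 'A touching, beautifully shot film.'\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=3)  # greedy decoding by default

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
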
| Additional Notes | The model card discusses ethical considerations related to model biases arising from the nature of the training data. |
| --- | --- |

| Training Details | |
| --- | --- |
| Data Sources: | BookCorpus, CC-Stories, The Pile, Pushshift.io Reddit, CCNewsV2 |
| Data Volume: | ~180B tokens (roughly 800GB of data) |
| Methodology: | Causal language modeling (CLM); text is tokenized with the GPT-2 byte-level BPE tokenizer (see the sketch below) |
| Context Length: | 2048 tokens |
| Training Time: | ~33 days of continuous training (OPT-175B) |
| Hardware Used: | 992 80GB A100 GPUs (OPT-175B) |
| Model Architecture: | Open Pretrained Transformers (OPT), a suite of decoder-only pre-trained transformers |

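As a minimal sketch of the tokenization step named above: the GPT-2 byte-level BPE vocabulary can be inspected through any OPT checkpoint on Hugging Face; `facebook/opt-125m` is assumed here purely because it is the smallest.

```python
from transformers import AutoTokenizer

# All OPT checkpoints share the same GPT-2 byte-level BPE vocabulary, so the
# smallest one (facebook/opt-125m, an illustrative choice) suffices here.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

encoding = tokenizer("Hello world")
print(encoding.input_ids)   # token ids; OPT prepends </s> (id 2) as a BOS marker
print(tokenizer.convert_ids_to_tokens(encoding.input_ids))  # BPE pieces, e.g. 'Hello', 'Ġworld'
```
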
| Safety Evaluation | |
| --- | --- |
| Methodologies: | Evaluation using prompt sets similar to those used for GPT-3 |
| Findings: | The model is strongly biased and can exhibit quality issues such as hallucination |
| Risk Categories: | Bias; hallucination; low generation diversity |
| Ethical Considerations: | The training data contains unfiltered content from the internet, which introduces biases |

| Responsible AI Considerations | |
| --- | --- |
| Fairness: | Bias in the training data is acknowledged |
| Transparency: | Bias and safety issues are acknowledged in the official model card |
| Accountability: | Responsible AI research is encouraged by making the models available for study |
| Mitigation Strategies: | Models are shared so that the broader community can study and understand their biases |

| Input/Output | |
| --- | --- |
| Input Format: | Plain text (prompt strings) |
| Accepted Modalities: | Text |
| Output Format: | Generated text continuing the prompt |
| Performance Tips: | Use top-k sampling by setting `do_sample` to `True` for non-deterministic generation; decoding is greedy otherwise (see the sketch below) |

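A minimal sketch of the sampling tip above, following the usage pattern shown in the official model card; the checkpoint, seed, `top_k=50`, and `max_new_tokens=30` are illustrative choices.

```python
from transformers import pipeline, set_seed

set_seed(32)  # fix the RNG so the sampled continuations are reproducible

# do_sample=True switches from deterministic greedy decoding to sampling;
# top_k=50 restricts each step to the 50 most likely next tokens.
generator = pipeline(
    "text-generation",
    model="facebook/opt-125m",  # illustrative; any OPT checkpoint works the same way
    do_sample=True,
    top_k=50,
    max_new_tokens=30,
)
print(generator("Hello, I am conscious and"))
```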