| Model Type | | text-generation, decoder-only |
|
| Use Cases |
| Areas: | | Research into large language models |
| Applications: | | Text Generation, Prompt-based Evaluation |
|
| Primary Use Cases: | | Text generation using the CLM objective (see the pipeline sketch below) |
|
| Limitations: | | High likelihood of bias, as well as quality issues such as hallucination and lack of generation diversity |
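
As a quick illustration of the text-generation and prompt-based use cases above, the sketch below uses the Hugging Face `transformers` text-generation pipeline. It is a minimal sketch: the checkpoint name (`facebook/opt-125m`), prompt, and sampling settings are illustrative assumptions, not recommendations.

```python
# Minimal sketch of prompt-based text generation with an OPT checkpoint.
# The checkpoint name, prompt, and sampling settings are illustrative assumptions.
from transformers import pipeline, set_seed

set_seed(32)  # make the sampled continuation reproducible
generator = pipeline("text-generation", model="facebook/opt-125m")

# do_sample=True helps avoid the repetitive continuations noted under Limitations.
outputs = generator(
    "Large language models can be used to",
    do_sample=True,
    max_new_tokens=30,
)
print(outputs[0]["generated_text"])
```

Larger OPT checkpoints expose the same pipeline interface, although they need correspondingly more memory.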
|
|
| Additional Notes | | OPT models aim to enable reproducible and responsible research. |
|
| Supported Languages | | Predominantly English, with a small amount of non-English text present in the training corpus |
|
| Training Details |
| Data Sources: | | BookCorpus, CC-Stories, The Pile, Pushshift.io Reddit dataset, CCNewsV2 |
|
| Data Volume: | | Roughly 180B tokens |
| Methodology: | | Causal Language Modeling (CLM); see the objective sketch after this section |
|
| Context Length: | | 2048 tokens |
| Training Time: | |
| Hardware Used: | | 992 80GB A100 GPUs (for the 175B model) |
| Model Architecture: | | Decoder-only, similar to GPT-3 |
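
To make the CLM methodology above concrete, the sketch below computes the next-token prediction loss that the objective optimizes. It uses a pretrained OPT checkpoint purely for illustration; the checkpoint name and example sentence are assumptions, and this is not the original training code.

```python
# Minimal sketch of the causal language modeling (CLM) objective:
# predict token t from tokens < t and minimize the cross-entropy.
# Checkpoint name and example sentence are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

batch = tokenizer("Open pre-trained transformer language models.", return_tensors="pt")

# For causal LMs, passing labels equal to the input ids makes the library
# shift them internally and compute next-token cross-entropy.
loss = model(**batch, labels=batch["input_ids"]).loss
print(f"average per-token negative log-likelihood: {loss.item():.3f}")
```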
|
|
| Responsible AI Considerations |
| Mitigation Strategies: | | The training data contains unfiltered internet content, so the model may reflect its biases; evaluating and mitigating bias is recommended before downstream use. |
|
|
| Input Output |
| Input Format: | | Sequences of 2048 consecutive tokens, tokenized with the GPT-2 byte-level BPE tokenizer (vocabulary size 50,272). |
|
| Accepted Modalities: | | Text |
| Output Format: | | Generated text continuing the input prompt |
| Performance Tips: | | Use the `generate()` method directly for better performance with large models (see the sketch after this section). |
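
Tying the input format and performance tip together, the sketch below tokenizes a prompt with the model's GPT-2 BPE tokenizer and calls `generate()` directly. The checkpoint name, half-precision/GPU placement, and generation settings are illustrative assumptions; very large checkpoints typically also require sharding the weights across devices.

```python
# Minimal sketch: tokenize a prompt and call generate() directly.
# Checkpoint, dtype, device, and generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)  # GPT-2 byte-level BPE, vocab size 50,272
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16).to("cuda")

# The prompt plus generated tokens must fit in the 2048-token context window.
inputs = tokenizer("A short note on responsible model release:", return_tensors="pt").to("cuda")

output_ids = model.generate(**inputs, do_sample=True, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```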
|
|
| Release Notes |
| Version: | |
| Date: | |
| Notes: | | Initial release with sizes from 125M to 175B parameters. |
|
|
|