Model Type | Transformer-based Language Model |
Use Cases |
Areas: | Scientific Research, Interpretability Research |
Applications: | Research on the behavior, functionality, and limitations of large language models |
Primary Use Cases: | Controlled scientific experiments (see the checkpoint-loading sketch below) |
Limitations: | Not suitable for human-facing interactions; English-only, and unsuitable for generating text in other languages; not fine-tuned for genre prose or commercial chatbots |
Considerations: | Conduct a risk and bias assessment when fine-tuning; evaluate risks before deployment |
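
To make the controlled-experiment use case concrete, here is a minimal sketch of loading an intermediate training checkpoint, assuming the EleutherAI Pythia releases on the Hugging Face Hub, where saved training steps are exposed as branch revisions; the model name and step number are illustrative, not prescribed by this card.

```python
# Sketch: load Pythia-160M at an intermediate training step for a
# controlled experiment. Assumes the EleutherAI checkpoints on the
# Hugging Face Hub, where each saved step is a branch revision.
from transformers import AutoTokenizer, GPTNeoXForCausalLM

checkpoint = "EleutherAI/pythia-160m"
revision = "step3000"  # illustrative intermediate step; omit for the final model

model = GPTNeoXForCausalLM.from_pretrained(checkpoint, revision=revision)
tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision=revision)
```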
|
|
Additional Notes | Pythia model suite renamed in January 2023 for clarity |
|
Supported Languages | English |
Training Details |
Data Sources: | The Pile, a collection of 22 diverse sources including arXiv, CommonCrawl, Project Gutenberg, YouTube subtitles, and GitHub |
Data Volume: | |
Model Architecture: | |
|
Responsible AI Considerations |
Fairness: | Documented biases with regard to gender, religion, and race (as reported in the Pile paper) |
|
|
Input/Output |
Input Format: | String of text for next-token prediction |
|
Accepted Modalities: | Text |
Output Format: | String (one token at a time) |
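
As an illustration of this input/output contract, the sketch below feeds a string to the model and greedily decodes a single next token; the Hub model name is an assumption, and sampling strategies are deliberately left out.

```python
# Sketch: one step of next-token prediction. A string goes in; the model
# returns logits over the vocabulary, and we decode the single most
# likely next token. Model name assumes the Hugging Face Hub release.
import torch
from transformers import AutoTokenizer, GPTNeoXForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-160m")
model.eval()

inputs = tokenizer("The Pile is a dataset of", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # (batch, seq_len, vocab_size)
next_token_id = int(logits[0, -1].argmax())    # greedy choice of next token
print(tokenizer.decode([next_token_id]))
```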
|
|
Release Notes |
Version: | |
Date: | |
Notes: | Pythia-160M retrained to address hyperparameter discrepancies |
|
|
|