| Model Type | text-to-text, decoder-only, large language model |

| Use Cases |
| Areas | Content Creation and Communication, Research and Education |
| Primary Use Cases | Text Generation, Chatbots and Conversational AI, Text Summarization |
| Limitations | Biases in training data, Context complexity, Language ambiguity, Factual inaccuracies |
| Considerations | Perform continuous monitoring using evaluation metrics and human review, and apply de-biasing techniques. |

| Supported Languages | |

| Training Details |
| Data Sources | Web Documents, Code, Mathematics |
| Data Volume | The 27B model was trained on 13 trillion tokens |
| Methodology | Trained with JAX and ML Pathways |
| Hardware Used | |
| Model Architecture | |

| Safety Evaluation |
| Methodologies | Red-teaming, human evaluation, benchmarking against relevant academic datasets |
| Risk Categories | Text-to-Text Content Safety, Text-to-Text Representational Harms, Memorization, Large-scale harm |
| Ethical Considerations | Evaluation results were within acceptable thresholds for meeting internal policies in categories such as child safety, content safety, representational harms, memorization, and large-scale harms. |

| Responsible AI Considerations |
| Fairness | LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. |
| Transparency | This model card summarizes details of the models' architecture, capabilities, limitations, and evaluation processes. |
| Accountability | Developers should monitor outputs for harmful content and biases. |
| Mitigation Strategies | Guidelines for responsible use, content safety mechanisms, and adherence to privacy regulations. |

| Input Output |
| Input Format | Text string, such as a question, a prompt, or a document to be summarized. |
| Accepted Modalities | |
| Output Format | Generated English-language text in response to the input. |
| Performance Tips | Use a dtype appropriate for your hardware, and try Flash Attention 2 for faster inference (see the sketch below). |

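As a hedged illustration of the input/output format and the performance tips above, the sketch below loads a Transformers-compatible checkpoint with a hardware-appropriate dtype and Flash Attention 2 enabled, then generates text from a plain prompt string. The placeholder checkpoint id, the Hugging Face `transformers` API, and the `flash-attn` dependency are assumptions made for illustration, not details taken from this card.

```python
# Minimal sketch, not an official quickstart: load a causal LM with a
# hardware-appropriate dtype and Flash Attention 2, then generate text.
# Assumes a Transformers-compatible checkpoint and the flash-attn package;
# the checkpoint id below is a placeholder, not taken from this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-checkpoint"  # placeholder checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # fall back to float16/float32 if unsupported
    attn_implementation="flash_attention_2",  # requires flash-attn and a supported GPU
    device_map="auto",
)

# Input: a plain text string (a question, a prompt, or a document to summarize).
prompt = "Summarize the key limitations of large language models in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Output: generated English-language text in response to the input.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

bfloat16 keeps float32's dynamic range at roughly half the memory cost, which is why it is a common default on recent accelerators; where Flash Attention 2 is unavailable, `attn_implementation="sdpa"` or the default attention can be used instead.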