| Section | Details |
| --- | --- |
| Model Type | Text-to-text, decoder-only, language model |
| Use Cases: Areas | Content Creation and Communication; Research and Education |
| Use Cases: Applications | Text Generation; Chatbots and Conversational AI; Text Summarization; NLP Research; Language Learning Tools; Knowledge Exploration |
| Use Cases: Limitations | Training data quality and scope; context and task complexity; language ambiguity and nuance; factual accuracy; common sense |
| Use Cases: Considerations | |
| Supported Languages | |
| Training Details: Data Sources | Web documents, code, mathematics |
| Training Details: Data Volume | 8 trillion tokens for the 9B model |
| Training Details: Methodology | |
| Training Details: Hardware Used | |
| Safety Evaluation: Methodologies | Structured evaluations; internal red-teaming |
| Safety Evaluation: Findings | Acceptable thresholds for internal policies |
| Safety Evaluation: Risk Categories | Text-to-text content safety; text-to-text representational harms; memorization; large-scale harm |
| Responsible AI: Fairness | LLMs trained on large-scale, real-world text data can reflect socio-cultural biases. |
| Responsible AI: Transparency | Model card present, with details on architecture, capabilities, limitations, and evaluation processes. |
| Responsible AI: Mitigation Strategies | Mechanisms and guidelines for content safety provided for developers. |
| Input/Output: Input Format | |
| Input/Output: Accepted Modalities | |
| Input/Output: Output Format | Generated English-language text |