| Section | Field | Details |
|---|---|---|
| Model Type | | Text-to-text, decoder-only, large language model |
| Use Cases | Areas | Research, Commercial Applications |
| | Applications | Text Generation, Chatbots and Conversational AI, Text Summarization |
| | Primary Use Cases | Content Creation and Communication, Research and Education |
| | Limitations | Context and Task Complexity, Language Ambiguity, Factual Inaccuracies |
| | Considerations | Adhering to privacy regulations, continuous monitoring for bias, content safety mechanisms |
| Additional Notes | | Training used multilingual, diverse data including code and mathematics |
| Supported Languages | | |
| Training Details | Data Sources | Web Documents, Code, Mathematics |
| | Data Volume | 2 trillion tokens for the 2B model |
| | Methodology | Text-to-text, decoder-only |
| | Hardware Used | |
| | Model Architecture | Lightweight state-of-the-art open model |
| Safety Evaluation | Methodologies | Red-teaming, Human evaluation, Automated evaluation |
| | Risk Categories | Misinformation, Bias, Dangerous capabilities, Memorization |
| | Ethical Considerations | Bias and fairness concerns, misinformation risks, transparency and accountability |
| Responsible AI Considerations | Fairness | Models evaluated for socio-cultural biases |
| | Transparency | Summary details on architecture, capabilities, and limitations provided |
| | Accountability | Guidelines provided for responsible use |
| | Mitigation Strategies | Evaluations and automated techniques to filter sensitive data from training sets |
| Input / Output | Input Format | |
| | Accepted Modalities | |
| | Output Format | Generated English-language text |
| | Performance Tips | Better performance with clear prompts and instructions |
| Release Notes | Version | |
| | Notes | Lightweight open model, trained on 2 trillion tokens |
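The performance tip above notes that clear prompts and explicit instructions improve output quality. A minimal sketch of an instruction-first prompt builder; the template, section labels, and function name are illustrative assumptions, not a format prescribed by this model card:

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Assemble a clear, instruction-first prompt for a text-to-text model.

    The template below is a generic illustration; the model card does not
    specify a required prompt format.
    """
    # Lead with the task so the model sees the instruction before any context.
    parts = [f"Instruction: {instruction.strip()}"]
    if context:
        # Separate supporting material from the instruction itself.
        parts.append(f"Context: {context.strip()}")
    # End with an explicit cue for where generation should begin.
    parts.append("Response:")
    return "\n\n".join(parts)

# Example: a summarization prompt, one of the listed applications.
prompt = build_prompt(
    "Summarize the following passage in one sentence.",
    "Large language models are trained on web documents, code, and mathematics.",
)
```

Keeping the instruction, context, and response cue visually separated in this way is one common technique for reducing the language-ambiguity failures listed under Limitations.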