| Model Type | text-to-text, text-to-code, decoder-only |
|
| Use Cases |
| Areas: | Research, Commercial applications |
|
| Applications: | Code Completion, Code Generation, Code Conversation, Code Education |
|
| Primary Use Cases: | Code completion with IDE extensions, Interactive code-learning experiences |
|
| Limitations: | Limitations of LLMs based on their training data; potential representational harms. |
|
| Considerations: | See the Gemma model card for a comprehensive discussion. |
|
|
| Additional Notes | The model is built for Responsible AI development, with a focus on open code applications. |
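
To make the code-generation use case concrete, here is a minimal sketch using the Hugging Face transformers API. The checkpoint ID is an assumption for illustration only; substitute the model ID actually released with this card.

```python
# Minimal code-generation sketch. The checkpoint ID below is an
# assumption for illustration; substitute the released model ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A bare code prefix is enough for the pretrained, decoder-only model.
prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding keeps the example deterministic.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```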
|
| Supported Languages | |
| Training Details |
| Data Sources: | Publicly available code repositories, Open-source mathematics datasets, Synthetically generated code |
|
| Data Volume: | |
| Methodology: | |
| Hardware Used: | |
| Model Architecture: | |
|
| Safety Evaluation |
| Methodologies: | Internal red-teaming, Structured evaluations |
|
| Risk Categories: | Human safety, Representational harms, Cyber-offence capabilities |
|
| Ethical Considerations: | Testing for autonomous hacking capabilities and ensuring potential harms are limited. |
|
|
| Responsible AI Considerations |
| Fairness: | Human evaluation on prompts covering content safety and representational harms. |
|
| Transparency: | Discussions and evaluations are detailed in the Gemma model card. |
|
| Accountability: | Developed by Google, which is accountable for model outputs under its AI Principles. |
|
| Mitigation Strategies: | Risks are mitigated through structured evaluations and internal red-teaming. |
|
|
| Input / Output |
| Input Format: | For the pretrained model: a code prefix and/or suffix for code completion and generation. |
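
As an illustration of the prefix/suffix input format, here is a minimal fill-in-the-middle (FIM) sketch. The special-token strings and checkpoint ID below are assumptions based on the CodeGemma convention; verify both against the released tokenizer, since other FIM models use different markers.

```python
# Minimal fill-in-the-middle (FIM) sketch for the pretrained variant.
# Token strings follow the CodeGemma convention; confirm them against
# the released tokenizer before relying on this.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-2b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prefix = "def is_even(n):\n    return "
suffix = "\n\nprint(is_even(4))"

# Prefix-suffix-middle layout: the model generates the code that
# belongs between the given prefix and suffix.
fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

inputs = tokenizer(fim_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```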
|
| Accepted Modalities: | |
| Output Format: | For the instruction-tuned model: code and natural language. |
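
For the instruction-tuned variant, a natural-language prompt yields code plus prose. The sketch below assumes a hypothetical instruction-tuned checkpoint ID and relies on whatever chat template the released tokenizer ships; adjust both to the actual artifacts.

```python
# Minimal sketch of prompting the instruction-tuned variant with
# natural language. The checkpoint ID is an assumption; the chat
# template comes from the released tokenizer.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-7b-it"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Write a Python function that reverses a string."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```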
|
| Performance Tips: | Ensure correct usage of FIM tokens in prompts; see the fill-in-the-middle sketch above. |
|
|