| Model Type | Text generation, auto-regressive |
|
| Use Cases |
| Areas: | |
| Applications: | Multitask language understanding, code generation |
|
| Limitations: | Contains biases from internet-crawled training data; may emit toxic or biased content |
|
| Considerations: | Developers should verify that the model meets their use case requirements and put safeguards in place against misuse |
|
|
| Additional Notes | Performance improvements achieved with fewer training tokens per model. |
|
| Supported Languages | English and multilingual text, including code |
|
| Training Details |
| Data Sources: | Continued pre-training data corpus from Nemotron-4 15B |
| Data Volume: | |
| Methodology: | Pruning, knowledge distillation, continued training |
| Training Time: | |
| Hardware Used: | |
| Model Architecture: | |
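The methodology row lists pruning, knowledge distillation, and continued training. As a rough illustration of the distillation step only, the sketch below computes a temperature-scaled KL distillation loss on toy logits in pure Python. Function names, the temperature value, and the toy numbers are illustrative assumptions; the actual training recipe operates on full-vocabulary logits at every token position and is not specified in this card.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling; higher T softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Illustrative sketch only, not the exact Minitron/Nemotron objective.
    """
    p = softmax(teacher_logits, temperature)  # teacher = target distribution
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures
    return temperature ** 2 * kl

# Identical logits give zero loss; divergent logits give a positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))       # 0.0
print(distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)   # True
```

During distillation the student's weights are updated to minimize this loss, pulling its output distribution toward the larger teacher's.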
|
| Responsible AI Considerations |
| Fairness: | Developers must ensure the model meets their industry and use case requirements for fairness. |
| Accountability: | NVIDIA promotes shared responsibility between itself and developers. |
| Mitigation Strategies: | Establish policies and practices to address potential product misuse. |
|
|
| Input Output |
| Input Format: | |
| Output Format: | |
|
| Release Notes |
| Version: | |
| Date: | February 2024 to June 2024 |
| Notes: | Pruned and distilled from Nemotron-4 15B, achieving compute cost savings and improved performance. |
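The notes row states the model was pruned from Nemotron-4 15B. As a toy illustration of structured (width) pruning, the sketch below ranks neurons by an importance score and keeps only the top fraction. The function name, the importance scoring, and the toy data are assumptions for illustration; the card does not specify the actual pruning criterion.

```python
def prune_by_importance(weights, importance, keep_ratio=0.5):
    """Keep the top-k neurons of a layer ranked by importance.

    'weights' is a list of per-neuron weight vectors and 'importance'
    holds one score per neuron (e.g. mean activation magnitude).
    Illustrative sketch only, not the exact pruning criterion used.
    """
    k = max(1, int(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:k])  # preserve the original neuron ordering
    return [weights[i] for i in kept]

neurons = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.3], [0.7, 0.6]]
scores = [0.05, 0.95, 0.30, 0.60]
print(prune_by_importance(neurons, scores, keep_ratio=0.5))
# Keeps neurons 1 and 3, which have the highest scores
```

After pruning, the smaller network is typically retrained (here via distillation and continued training) to recover accuracy lost when neurons were removed.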
|
|
|