| Model Type | Transformer Decoder (auto-regressive language model) |
|
| Use Cases |
| Primary Use Cases: | Roleplaying, retrieval-augmented generation, function calling |
|
| Limitations: | The model may amplify societal biases, return toxic responses, or produce inaccurate or otherwise unacceptable text. |
|
| Considerations: | Validate that imported packages come from a trusted source to ensure end-to-end security. |
|
|
| Training Details |
| Methodology: | Multi-stage supervised fine-tuning (SFT) and preference-based alignment with NeMo Aligner |
|
| Context Length: | |
| Model Architecture: | Transformer decoder with grouped-query attention (GQA) and rotary position embeddings (RoPE); 40 layers, 32 attention heads |
|
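The GQA scheme named above can be sketched in a few lines. This is a minimal shape-level illustration, not the model's implementation: the card gives 32 attention (query) heads, but the number of KV heads (8 here) and the head dimension (128 here) are illustrative assumptions.

```python
import numpy as np

N_Q_HEADS = 32   # from the card
N_KV_HEADS = 8   # assumption: each KV head serves 32 / 8 = 4 query heads
HEAD_DIM = 128   # assumption
GROUP = N_Q_HEADS // N_KV_HEADS

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).

    GQA shrinks the KV cache by storing fewer K/V heads and sharing
    each of them across a group of query heads.
    """
    # Repeat each KV head so it is shared by GROUP query heads.
    k = np.repeat(k, GROUP, axis=0)
    v = np.repeat(v, GROUP, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v

seq = 4
q = np.random.randn(N_Q_HEADS, seq, HEAD_DIM)
k = np.random.randn(N_KV_HEADS, seq, HEAD_DIM)
v = np.random.randn(N_KV_HEADS, seq, HEAD_DIM)
out = grouped_query_attention(q, k, v)
print(out.shape)  # (32, 4, 128)
```

Note that only 8 K/V heads are materialized per token, a 4x reduction in KV-cache memory versus standard multi-head attention with 32 KV heads.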
|
| Safety Evaluation |
| Methodologies: | Garak automated LLM vulnerability scanner, AEGIS content safety evaluation, and human content red teaming |
|
| Ethical Considerations: | NVIDIA encourages working with the internal model team to ensure the model meets requirements for the specific industry and use case. |
|
|
| Input / Output |
| Input Format: | `System {system prompt} User {prompt} Assistant\n` |
|
| Performance Tips: | Use the recommended prompt template; the model may not perform optimally without it. |
|
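A minimal helper for assembling the input format above can be sketched as follows. The card does not specify the exact delimiter tokens between the `System`, `User`, and `Assistant` turns, so plain newlines are assumed here; consult the model's official tokenizer or chat template for the authoritative formatting.

```python
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble a single-turn prompt in the card's
    'System ... User ... Assistant' layout.

    Assumption: turns are separated by newlines; the real template
    may use special delimiter tokens instead.
    """
    return (
        f"System\n{system_prompt}\n"
        f"User\n{user_prompt}\n"
        "Assistant\n"   # trailing marker cues the model to respond
    )

prompt = build_prompt("You are a helpful assistant.", "What is GQA?")
print(prompt)
```

The trailing `Assistant\n` is kept at the very end so generation begins at the assistant turn, matching the `Assistant\n` terminator in the Input Format row.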
|