| Model Type |  | 
| Use Cases |  | 
| Additional Notes | | Pretrained base model with no moderation mechanisms. Mistral Nemo requires lower sampling temperatures; the recommended temperature is 0.3. | 
| Supported Languages | | en (yes), fr (yes), de (yes), es (yes), it (yes), pt (yes), ru (yes), zh (yes), ja (yes) | 
| Training Details | 
| Data Sources: | | multilingual and code data | 
| Context Length: |  | 
| Model Architecture: | | Transformer - 40 layers, 5,120 dimensions, 32 heads, 128 head dim, 14,336 hidden dim, SwiGLU activation, 128k vocabulary, rotary embeddings | 
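
The architecture row lists the head dimension separately from the model dimension, and a short sketch may help show why. The snippet below is only an illustration: `NemoArchitecture`, `attention_width`, and `sampling_params` are hypothetical names chosen here, not part of any Mistral or third-party API. It records the figures from the table and checks that 32 heads of dimension 128 give an attention width of 4,096, which is not the same as the 5,120 model dimension. It also carries the recommended temperature of 0.3 as a default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NemoArchitecture:
    """Hypothetical container for the figures listed in the table."""
    n_layers: int = 40
    dim: int = 5120       # model (embedding) dimension
    n_heads: int = 32
    head_dim: int = 128

    @property
    def attention_width(self) -> int:
        # Total width of the concatenated attention heads.
        return self.n_heads * self.head_dim


# Recommended sampling temperature from the Additional Notes row.
RECOMMENDED_TEMPERATURE = 0.3


def sampling_params(temperature: float = RECOMMENDED_TEMPERATURE) -> dict:
    """Build a plain dict of sampling settings (illustrative helper)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {"temperature": temperature}


arch = NemoArchitecture()
print(arch.attention_width)              # 4096
print(arch.attention_width == arch.dim)  # False
print(sampling_params())                 # {'temperature': 0.3}
```

Because the attention width (4,096) differs from the model dimension (5,120), the head dimension cannot be derived by dividing the model dimension by the number of heads, which is presumably why the table states it explicitly.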