| Model Type | |
| Use Cases | |
| Areas: | |
| Applications: | Code generation, assisting developers |
| Primary Use Cases: | Text generation from programming-language input |
| Limitations: | May not respond well to instruction-style prompts; generated code may contain bugs and is not guaranteed to work as intended |
| Additional Notes | Using code generated by the model requires proper attribution because of licensing. |
| Supported Languages | 600+ programming languages |
| Training Details | |
| Data Sources: | The Stack v2, arXiv, Wikipedia |
| Data Volume: | |
| Methodology: | Grouped Query Attention, Sliding Window Attention, Fill-in-the-Middle objective |
| Context Length: | |
| Hardware Used: | NVIDIA DGX H100, 1024 x H100 GPUs |
| Model Architecture: | |
| Input Output | |
| Input Format: | |
| Output Format: | |
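
The Fill-in-the-Middle objective listed under Methodology trains a model to generate a missing span given the code before and after it. Below is a minimal sketch of how such a prompt is commonly assembled. It assumes prefix-suffix-middle ordering and uses placeholder sentinel strings (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`). This card does not state the actual sentinel tokens or ordering, so check the model's tokenizer before relying on them. `build_fim_prompt` and `split_at_cursor` are hypothetical helpers, not part of any library.

```python
# Placeholder sentinel strings (assumption); the real tokens depend on the
# model's tokenizer and are not given in this card.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code around a gap in prefix-suffix-middle order.

    The model is expected to continue after FIM_MIDDLE with the missing span.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def split_at_cursor(source: str, cursor: int) -> tuple[str, str]:
    """Split source text into the parts before and after a cursor offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the source text")
    return source[:cursor], source[cursor:]


# Example: ask for the expression that belongs after "return ".
code = "def add(a, b):\n    return \n"
cursor = code.index("return ") + len("return ")
prefix, suffix = split_at_cursor(code, cursor)
prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

Whatever the model generates after the final sentinel is its proposal for the gap. As noted under Limitations, that output may contain bugs and should be reviewed before use.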