| Model Type | Instruction-tuned code generation model (decoder-only transformer) |
|---|---|

| Use Cases | |
|---|---|
| Areas | Software development, code generation |
| Applications | Programming assistance, coding instruction |
| Primary Use Cases | Assisting programmers in writing code; providing coding solutions based on instructions |
| Limitations | May not provide optimal solutions for complex problems; performance depends on the quality of the input prompt |
| Considerations | Preface the input with "Question:" and finish it with "Answer:" (see the prompt sketch below) |

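The "Question:"/"Answer:" convention above is plain string templating. Below is a minimal sketch of one way to build such a prompt; the `build_prompt` helper is illustrative rather than part of any published API, and the exact whitespace between the two markers is an assumption:

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the prompt format this card describes.

    The 'Question: ... Answer:' markers come from the card; the blank line
    between them is an assumption, not a documented requirement.
    """
    return f"Question: {instruction}\n\nAnswer:"


print(build_prompt("Write a Python function that reverses a string."))
```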

| Supported Languages | |
|---|---|
| Programming Languages | 80+ programming languages |

| Training Details | |
|---|---|
| Data Sources | bigcode/commitpackft, bigcode/oasst-octopack (loading sketch below) |
| Data Volume | 1 trillion pretraining tokens; 2 million instruction-tuning tokens |
| Methodology | Instruction tuning on CommitPackFT and OASST |
| Training Time | Pretraining: 24 days; instruction tuning: 4 hours |
| Hardware Used | Pretraining: 512 Tesla A100 GPUs; instruction tuning: 8 Tesla A100 GPUs |
| Model Architecture | GPT-2 architecture with multi-query attention and a Fill-in-the-Middle objective (attention sketch below) |

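The two instruction-tuning sources listed under Data Sources are public Hugging Face datasets. A minimal loading sketch using the `datasets` library; the "python" configuration name for CommitPackFT is an assumption (the dataset is organized per programming language), and split names may differ:

```python
from datasets import load_dataset

# CommitPackFT: filtered commit messages paired with the corresponding code
# changes. The "python" configuration name is an assumption; the dataset is
# organized into per-language subsets.
commitpackft = load_dataset("bigcode/commitpackft", "python", split="train")

# OASST conversations prepared for OctoPack-style instruction tuning.
oasst = load_dataset("bigcode/oasst-octopack", split="train")

print(commitpackft.column_names)
print(len(oasst))
```
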
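Multi-query attention, named in the Model Architecture row, gives each head its own query projection but shares a single key/value projection across all heads, which keeps the key/value cache small during generation. The sketch below illustrates the idea in PyTorch with random weights; it is a conceptual toy, not OctoCoder's actual implementation:

```python
import torch
import torch.nn.functional as F


def multi_query_attention(x, w_q, w_kv, n_heads):
    """Toy multi-query attention: n_heads query heads, one shared K/V head."""
    B, T, D = x.shape
    head_dim = D // n_heads

    q = (x @ w_q).view(B, T, n_heads, head_dim).transpose(1, 2)  # (B, H, T, d)
    k, v = (x @ w_kv).split(head_dim, dim=-1)                    # one head each
    k, v = k.unsqueeze(1), v.unsqueeze(1)                        # broadcast over H

    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5           # (B, H, T, T)
    causal = torch.tril(torch.ones(T, T, dtype=torch.bool))
    attn = F.softmax(scores.masked_fill(~causal, float("-inf")), dim=-1)
    return (attn @ v).transpose(1, 2).reshape(B, T, D)


# Tiny usage example with random inputs and weights.
B, T, D, H = 2, 8, 64, 4
x = torch.randn(B, T, D)
w_q = torch.randn(D, D) / D ** 0.5
w_kv = torch.randn(D, 2 * (D // H)) / D ** 0.5
print(multi_query_attention(x, w_q, w_kv, H).shape)  # torch.Size([2, 8, 64])
```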

| Input Output | |
|---|---|
| Input Format | Preface the input with "Question:" and finish it with "Answer:" |
| Accepted Modalities | Text |
| Output Format | Text (generated code) |
| Performance Tips | Use clear, specific prompts to improve output relevance (see the generation sketch below) |

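A hedged end-to-end example of this input format, assuming the instruction-tuned checkpoint described in this card is published on the Hugging Face Hub as `bigcode/octocoder` (substitute the actual model ID if it differs); the generation settings are illustrative:

```python
from transformers import pipeline

# "bigcode/octocoder" is assumed to be the Hub ID of this checkpoint.
# The underlying model is large, so expect to need a GPU with ample memory.
generator = pipeline("text-generation", model="bigcode/octocoder")

prompt = (
    "Question: Write a Python function that checks whether a number is prime."
    "\n\nAnswer:"
)

result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
```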

| Release Notes | |
|---|---|
| Version | |
| Notes | Introduction of OctoCoder with instruction tuning based on StarCoder. |