| Model Type | text generation, code generation |

| Use Cases |
| Areas: | |
| Applications: | |
| Primary Use Cases: | QA format, chat format, code format |

| Limitations: | generation of inaccurate code and facts; limited scope for code; unreliable responses to instructions; language limitations; potential societal biases; toxicity; verbosity |

| Additional Notes | Phi-2 is intended for QA, chat, and code use. Model-generated text and code should be treated as a starting point rather than a finished product, and users should exercise caution when deploying the model in applications. |
|
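The three prompt styles named above (QA, chat, and code) can be sketched as plain string templates. This is a minimal illustration with hypothetical helper names; the `Instruct:/Output:` and named-speaker templates follow the public Phi-2 model card and should be verified against it before use.

```python
# Hypothetical helpers illustrating the three Phi-2 prompt styles.
# The templates follow the public Phi-2 model card; verify against
# the card before relying on them.

def qa_prompt(question: str) -> str:
    """QA format: an instruction followed by an empty Output: line."""
    return f"Instruct: {question}\nOutput:"

def chat_prompt(turns: list[tuple[str, str]], next_speaker: str) -> str:
    """Chat format: named speaker turns, ending with the next speaker's tag."""
    lines = [f"{name}: {text}" for name, text in turns]
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)

def code_prompt(signature: str, docstring: str) -> str:
    """Code format: a function signature plus a docstring to complete."""
    return f'{signature}\n    """{docstring}"""\n'

print(qa_prompt("Why is the sky blue?"))
```

The generated completion would then be appended after the trailing `Output:`, speaker tag, or docstring, respectively.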
| Supported Languages | |
| Training Details |
| Data Sources: | the Phi-1.5 data sources, augmented with NLP synthetic texts and filtered websites |

| Data Volume: | |
| Methodology: | next-word prediction objective |

| Context Length: | |
| Training Time: | |
| Hardware Used: | |
| Model Architecture: | Transformer-based model |
|
|
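The next-word prediction objective listed above can be illustrated with a toy example (not Phi-2's actual training code): each position in a token sequence is trained to predict the token that follows it, so a sequence of n tokens yields n-1 (context, target) pairs.

```python
# Toy illustration of the next-word prediction (causal LM) objective.
# Each pair asks the model to predict the next token from its prefix.

def next_word_pairs(tokens: list[str]) -> list[tuple[list[str], str]]:
    """Return (context, next-token) training pairs for a causal LM."""
    return [(tokens[: i + 1], tokens[i + 1]) for i in range(len(tokens) - 1)]

for context, target in next_word_pairs(["the", "sky", "is", "blue"]):
    print(context, "->", target)
```

In real training, the model's predicted distribution at each position is scored against the target token with cross-entropy loss.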
| Input Output |
| Accepted Modalities: | |
| Performance Tips: | Phi-2 has a known attention overflow issue when running in FP16. If you encounter it, enable or disable autocast on the PhiAttention.forward() function. |
|
|
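The enable/disable-autocast workaround above amounts to wrapping the attention forward pass in a `torch.autocast` context. Below is a minimal CPU sketch of that pattern using a tiny `Linear` layer as a stand-in for `PhiAttention.forward()`; it shows the mechanism only and is not Phi-2's actual attention code.

```python
import torch

# Minimal sketch of the enable/disable-autocast pattern. A tiny Linear
# stands in for PhiAttention.forward(); this illustrates the mechanism
# only and is not Phi-2's actual attention code.

layer = torch.nn.Linear(8, 8)
x = torch.randn(2, 8)

# Autocast enabled: eligible ops inside the block run in lower precision.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16, enabled=True):
    y_low = layer(x)

# Autocast disabled: the same call keeps the default float32 precision.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16, enabled=False):
    y_full = layer(x)

print(y_low.dtype, y_full.dtype)  # torch.bfloat16 torch.float32
```

For the actual workaround, the same context manager (with `enabled=True` or `enabled=False`, whichever resolves the overflow in your setup) would wrap the call into the model's attention forward pass.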