| Model Type | | Auto-regressive language model |
|
| Use Cases |
| Areas: | |
| Primary Use Cases: | | Zero-shot common sense reasoning tasks |
|
|
| Additional Notes | | MobileLLM integrates several key techniques: the SwiGLU activation function, a deep and thin architecture, embedding sharing, and grouped-query attention. |
|
| Training Details |
| Data Sources: | | Publicly available online data. |
|
| Data Volume: | |
| Context Length: | |
| Hardware Used: | |
| Model Architecture: | | MobileLLM is an auto-regressive language model built on an optimized transformer architecture that combines the SwiGLU activation function, a deep and thin layout, embedding sharing, and grouped-query attention. |
|
|
| Input Output |
| Input Format: | |
| Accepted Modalities: | |
| Output Format: | |
|
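
The architecture techniques named in the table (SwiGLU, embedding sharing, grouped-query attention) can be illustrated with a minimal pure-Python sketch. This is not MobileLLM's implementation: every function name, weight shape, and head count below is a hypothetical choice made for illustration, and the code shows only the general form of each technique.

```python
import math


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(x, w):
    """Multiply a vector x (length d_in) by a d_in x d_out matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: (silu(x W_gate) * (x W_up)) W_down."""
    gate = [silu(g) for g in matvec(x, w_gate)]
    up = matvec(x, w_up)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(hidden, w_down)


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: several query heads share one key/value head."""
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def tied_logits(hidden, embedding):
    """Embedding sharing: reuse the input embedding matrix (vocab x d_model)
    as the output projection instead of learning a separate one."""
    return [sum(h * e for h, e in zip(hidden, row)) for row in embedding]


if __name__ == "__main__":
    # Illustrative sizes only, not MobileLLM's configuration.
    print(kv_head_for_query_head(5, n_q_heads=8, n_kv_heads=2))
    print(tied_logits([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

In this sketch, embedding sharing removes one vocab-by-d_model weight matrix, and grouped-query attention reduces the number of key/value heads relative to query heads. Both are parameter-saving choices, which matters most for small models.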