| Attribute | Value |
|---|---|
| LLM Name | PowerMoE 3B |
| Repository 🤗 | https://huggingface.co/ibm-research/PowerMoE-3b |
| Model Size | 3B |
| Required VRAM | 13.5 GB | 
| Updated | 2025-09-23 | 
| Maintainer | ibm-research | 
| Model Type | granitemoe | 
| Model Files | |
| Model Architecture | GraniteMoeForCausalLM | 
| License | apache-2.0 | 
| Context Length | 4096 | 
| Model Max Length | 4096 | 
| Transformers Version | 4.44.0.dev0 | 
| Tokenizer Class | GPT2Tokenizer | 
| Padding Token | <|endoftext|> | 
| Vocabulary Size | 49152 | 
| Torch Data Type | float32 | 
| Activation Function | silu | 
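
For reference, below is a minimal loading sketch using the Hugging Face transformers library, assuming a build recent enough to include GraniteMoe support (the configuration above was produced with 4.44.0.dev0). The prompt text and generation settings are illustrative only.

```python
# Minimal sketch: load PowerMoE 3B from the repository listed above.
# Assumes a transformers build with GraniteMoe support and enough memory
# for float32 weights (~13.5 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-research/PowerMoE-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)   # GPT2Tokenizer, vocab size 49152
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # matches the checkpoint dtype listed above
)

prompt = "Tomorrow is a new day, and"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # context length is capped at 4096 tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in bfloat16 instead of float32 roughly halves the memory footprint, at a small cost in numerical precision.
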
| Best Alternatives | Context / VRAM | Downloads | Likes | 
|---|---|---|---|
| Granite 3.1 3B A800m Instruct | 128K / 6.6 GB | 13462 | 27 | 
| Granite 3.1 3B A800m Base | 128K / 6.6 GB | 3483 | 8 | 
| Granite Guardian 3.2 3B A800m | 128K / 6.6 GB | 2582 | 5 | 
| Granite 3.0 3B A800m Instruct | 4K / 6.8 GB | 1678 | 20 | 
| Granite 3.0 3B A800m Base | 4K / 13.5 GB | 2475 | 5 | 
| PowerMoE 3B | 4K / 13.5 GB | 15341 | 11 | 