| Model Type | | language model, instruction tuned |
| Use Cases |
| Limitations: | | May produce inaccurate, harmful, biased, or objectionable outputs |
| Considerations: | | Users should undertake thorough safety testing and implement filtering mechanisms. |
| Additional Notes | | Provides full training, fine-tuning, and evaluation procedures with checkpoints. |
| Training Details |
| Data Sources: | | RefinedWeb, deduplicated PILE, RedPajama (subset), Dolma v1.6 (subset) |
| Data Volume: | |
| Methodology: | | Layer-wise scaling strategy |
| Model Architecture: | |
| Input Output |
| Performance Tips: | | Use a smaller model as the assistant (draft) model for speculative generation. |
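
The performance tip about speculative generation can be illustrated with a minimal, library-free sketch. It assumes greedy decoding and represents each model as a plain function from a token list to the next token; `speculative_generate`, `toy_target`, and `toy_draft` are hypothetical names made up for this illustration, not part of any model's API. Some libraries expose the same idea directly (for example, the `assistant_model` argument to `generate` in Hugging Face Transformers), but whether that applies to this model is not stated here.

```python
from typing import Callable, List

# Assumption: a "model" is any function mapping a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_generate(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    lookahead: int = 4,
) -> List[int]:
    """Greedy speculative decoding; the result equals plain greedy decoding with `target`."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The cheap draft model proposes up to `lookahead` tokens.
        proposal: List[int] = []
        for _ in range(min(lookahead, limit - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks the proposal position by position.
        #    A real implementation scores all positions in one batched forward
        #    pass, which is where the speed-up comes from; this loop is sequential.
        for guess in proposal:
            expected = target(tokens)
            tokens.append(expected)  # always keep the target's own choice
            if expected != guess:
                break  # first mismatch: discard the rest of the draft
    return tokens


# Hypothetical toy models: the draft agrees with the target except after token 3.
def toy_target(tokens: List[int]) -> int:
    return (tokens[-1] * 2 + 1) % 7


def toy_draft(tokens: List[int]) -> int:
    return 5 if tokens[-1] == 3 else toy_target(tokens)


if __name__ == "__main__":
    print(speculative_generate(toy_target, toy_draft, prompt=[1], max_new_tokens=6))
```

Because every emitted token is the target model's own greedy choice, the draft model affects only how many target steps can be verified together, not the output. The more often the draft agrees with the target, the more tokens are accepted per verification round.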