| Model Type | GPT, pre-trained, instruction-based |

| Use Cases | |
| --- | --- |
| Areas | Research, Commercial Applications |
| Applications | General Language Modeling, Domain-Specific Instruction Modeling |
| Primary Use Cases | Domain adaptation in finance and biomedicine; synthesizing instruction-response pairs |
| Limitations | No specific finance data due to ethical concerns |

| Additional Notes | Demonstrates the effectiveness of supervised multitask pre-training using instruction-response pairs. |
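
The supervised multitask pre-training noted above augments raw corpus passages with instruction-response pairs synthesized from those passages. As a rough illustration only, the sketch below shows one way such an augmented training example could be assembled; the `build_training_sequence` helper, the `Instruction:`/`Response:` template, and the sample passage are illustrative assumptions, not the exact format used in the released corpora.

```python
# Illustrative sketch: assembling an instruction-augmented pre-training example.
# The concatenation template and separators are assumptions, not the authors' format.

def build_training_sequence(raw_text: str, qa_pairs: list[tuple[str, str]]) -> str:
    """Concatenate a raw corpus passage with synthesized instruction-response pairs."""
    parts = [raw_text.strip()]
    for instruction, response in qa_pairs:
        parts.append(f"Instruction: {instruction}\nResponse: {response}")
    return "\n\n".join(parts)


example = build_training_sequence(
    "Falcon RefinedWeb is a large-scale, filtered web corpus used for pre-training.",
    [
        ("What is Falcon RefinedWeb?", "A large-scale, filtered web corpus."),
        ("What is it used for?", "Pre-training language models."),
    ],
)
print(example)
```

In practice the template and separators would need to match whatever format the released datasets (e.g. instruction-pretrain/general-instruction-augmented-corpora) actually use.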

| Supported Languages | |

| Training Details | |
| --- | --- |
| Data Sources | tiiuae/falcon-refinedweb, instruction-pretrain/ft-instruction-synthesizer-collection, instruction-pretrain/general-instruction-augmented-corpora |
| Data Volume | |
| Methodology | Supervised multitask pre-training using instruction-response pairs |
| Model Architecture | Instruction-based GPT model |
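
Since the model is an instruction-based GPT checkpoint, a minimal inference sketch with Hugging Face `transformers` may help; the model ID below is a placeholder assumption (substitute the checkpoint you intend to use), and the plain `Instruction:`/`Response:` prompt is illustrative rather than a documented prompt format.

```python
# Minimal inference sketch with Hugging Face transformers.
# The model ID is a placeholder assumption; the prompt template is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "instruction-pretrain/InstructLM-500M"  # placeholder / assumption
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Instruction: Summarize the key idea of instruction pre-training.\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```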

| Release Notes |

| Version | Date | Notes |
| --- | --- | --- |
| | | Paper accepted at the EMNLP 2024 main conference. |
| | | Updated FAQ on continual pre-training from Llama3. |
| | | Updated guidelines on domain-specific task evaluation. |
| | | Scaled up pre-trained tokens to 250B, with 500M instruction-response pairs. |
| | | Released paper, code, and resources. |