| Model Type | | text-generation, instruction-tuned |
|
| Use Cases |
| Areas: | |
| Applications: | | Cross-Lingual Adaptation, Instruction Following |
|
| Primary Use Cases: | | Text Generation, Language Translation |
|
| Limitations: | | Not fine-tuned to align with specific human intent and safety considerations |
|
|
| Additional Notes | | Developed by multiple team members from TokyoTech-LLM, with acknowledgements to Meta Research for Llama 2. |
|
| Supported Languages | | Japanese (Proficient), English (Proficient) |
|
| Training Details |
| Data Sources: | | OpenAssistant Conversations Dataset EN top-1 thread, OpenAssistant Conversations Dataset |
|
| Methodology: | | Supervised fine-tuning (SFT) |
|
| Model Architecture: | | Please refer to the LLaMA-2 technical report for details on the model architecture. |
|
|
| Input / Output |
| Input Format: | | [INST] <<SYS>>
{SYSTEM_PROMPT}
<</SYS>>

{USER_MESSAGE} [/INST] |
|
| Accepted Modalities: | | Text |
| Output Format: | | Text |
| Performance Tips: | | Adhere strictly to instruction format to maintain performance. |
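For concreteness, the following is a minimal sketch of assembling the documented instruction format and running generation with Hugging Face `transformers`. The model id, prompts, and generation settings below are illustrative assumptions rather than values taken from this card; substitute the checkpoint you actually use.

```python
# Minimal sketch: build the documented [INST] prompt and generate a reply.
# MODEL_ID is an assumed placeholder, not a value specified by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tokyotech-llm/Swallow-7b-instruct-hf"  # assumption; adjust as needed


def build_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble the prompt exactly as the Input Format above specifies."""
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )


tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

prompt = build_prompt(
    system_prompt="You are a helpful, honest assistant.",
    user_message="Summarize the difference between pretraining and supervised fine-tuning.",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Drop the prompt tokens so only the newly generated reply is decoded.
reply = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

As the Performance Tips row notes, deviating from this template (for example, omitting the `<<SYS>>` block) can degrade output quality, so it helps to centralize prompt construction in a single helper like the one sketched above.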
|
|
| Release Notes |
| Version: | |
| Date: | |
| Notes: | | Release of enhanced instruction-tuned models as preview versions. |
|
| Version: | |
| Date: | |
| Notes: | | Trained with approximately twice as many Japanese tokens. |
|
| Version: | |
| Date: | |
| Notes: | | Model release with no vocabulary expansion. |
|
| Version: | |
| Date: | |
| Notes: | | Release of various instruct-hf models as well as no-vocabulary-expansion models. |
|
| Version: | |
| Date: | |
| Notes: | | Initial release of Swallow 7b, 13b, and 70b in instruct-hf variants. |
|
|
|