| Field | Details |
|:--|:--|
| Model Type |  |
| Use Cases: Areas | Research, commercial applications |
| Use Cases: Limitations | The models are not tuned to ensure outputs align with human intent and safety considerations. |
| Additional Notes | The models are continually pre-trained and instruction-tuned, with an emphasis on Japanese language capabilities. |
| Supported Languages | Japanese, English |
| Language Details | The Swallow models underwent continual pre-training with the addition of Japanese language data. |
| Training Details |  |
|:--|:--|
| Data Sources | Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile |
| Methodology | Supervised fine-tuning (SFT) and instruction tuning using Anthropic HH-RLHF, Databricks Dolly 15k, and the OpenAssistant Conversations Dataset (see the data-preparation sketch below). |
| Model Architecture | Refer to the LLaMA-2 technical report for details on the model architecture. |
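To make the SFT step concrete, here is a minimal sketch of flattening one of the instruction-tuning sources above (Databricks Dolly 15k) into prompt/response training strings. The dataset ID and field names match the public dataset on the Hugging Face Hub; the Japanese Alpaca-style prompt template is an illustrative assumption, not necessarily the exact template used to train Swallow.

```python
from datasets import load_dataset

# Japanese Alpaca-style template; an assumption for illustration only,
# not necessarily the exact template used for Swallow instruction tuning.
TEMPLATE = (
    "以下に、あるタスクを説明する指示があります。"
    "リクエストを適切に完了するための回答を記述してください。\n\n"
    "### 指示:\n{instruction}\n\n### 応答:\n"
)

def to_sft_example(record):
    # Map one Dolly record (fields: instruction, context, response, category)
    # to a single prompt + completion training string.
    prompt = TEMPLATE.format(instruction=record["instruction"])
    return {"text": prompt + record["response"]}

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
sft_data = dolly.map(to_sft_example, remove_columns=dolly.column_names)
print(sft_data[0]["text"][:200])
```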
| Input / Output |  |
|:--|:--|
| Accepted Modalities |  |
| Output Format |  |
| Performance Tips | The models employ a tokenizer with a vocabulary broadened using Japanese data, offering more efficient text representation and faster inference (see the comparison sketch below). |
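A hedged illustration of that performance tip: count the tokens the Swallow tokenizer and the base Llama 2 tokenizer produce for the same Japanese sentence. The model IDs are the public Hugging Face Hub checkpoints; the sample sentence is arbitrary.

```python
from transformers import AutoTokenizer

text = "東京工業大学の研究チームは日本語に強い大規模言語モデルを公開した。"

for model_id in ("tokyotech-llm/Swallow-7b-hf", "meta-llama/Llama-2-7b-hf"):
    # Note: meta-llama checkpoints are gated; an access token may be required.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tokenizer.encode(text, add_special_tokens=False))
    print(f"{model_id}: {n_tokens} tokens")
```

Fewer tokens per sentence means fewer decoding steps for the same text, which is where the faster-inference claim comes from.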
**Release Notes**

| Version | Date | Notes |
|:--|:--|:--|
|  |  | Release of Swallow-7b-instruct-v0.1, Swallow-13b-instruct-v0.1, and Swallow-70b-instruct-v0.1. |
|  |  | Release of Swallow-7b-plus-hf, trained with twice as many Japanese tokens as Swallow-7b-hf. |
|  |  | Release of Swallow-13b-NVE-hf. |
|  |  | Release of Swallow-7b-NVE-hf, Swallow-7b-NVE-instruct-hf, Swallow-70b-NVE-hf, and Swallow-70b-NVE-instruct-hf. |
|  |  | Release of Swallow-7b-hf, Swallow-7b-instruct-hf, Swallow-13b-hf, Swallow-13b-instruct-hf, Swallow-70b-hf, and Swallow-70b-instruct-hf. |
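For reference, a minimal sketch of loading and prompting one of the released checkpoints, assuming it is hosted under the tokyotech-llm organization on the Hugging Face Hub; the prompt and sampling parameters are illustrative defaults, not recommended settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub location of the base checkpoint.
model_id = "tokyotech-llm/Swallow-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Illustrative Japanese completion prompt.
prompt = "東京工業大学の主なキャンパスは、"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.95
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```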