**Model Type:** Large-scale pre-trained, text generation

**Use Cases**
- **Areas:** Research, non-commercial applications
- **Applications:** Translation, programming, text classification, information extraction, summarization, copywriting, common-sense Q&A, mathematics
- **Limitations:** Cannot be used for commercial purposes due to the LLaMA license

**Additional Notes:** Because of the LLaMA licensing restrictions, only the weight differences (deltas) are published; users must combine them with the original LLaMA weights themselves to obtain the complete model.

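A minimal sketch of what that merge step might look like with `transformers`, assuming the published deltas were produced as `finetuned - base` and that both checkpoints are available locally. The paths and the addition convention are placeholders; follow the model's own merge instructions for the actual procedure.

```python
# Hypothetical delta-merge sketch; paths and the "base + delta" convention are assumptions.
import torch
from transformers import LlamaForCausalLM

BASE_PATH = "path/to/original-llama"     # placeholder: original LLaMA weights
DELTA_PATH = "path/to/published-delta"   # placeholder: released delta weights
OUTPUT_DIR = "merged-model"

# Load both checkpoints on CPU in half precision to keep memory predictable.
base = LlamaForCausalLM.from_pretrained(BASE_PATH, torch_dtype=torch.float16)
delta = LlamaForCausalLM.from_pretrained(DELTA_PATH, torch_dtype=torch.float16)

base_state = base.state_dict()
merged_state = delta.state_dict()

# Add the base weights onto the deltas wherever the shapes match.
# Parameters that exist only in the delta (e.g. extended-vocabulary rows)
# are kept as released.
for name, tensor in merged_state.items():
    if name in base_state and base_state[name].shape == tensor.shape:
        merged_state[name] = tensor + base_state[name]

delta.load_state_dict(merged_state)
delta.save_pretrained(OUTPUT_DIR)
```
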
**Supported Languages:**

**Training Details**
- **Data Sources:** OpenWebText, books, Wikipedia, code, the cleaned Wudao dataset, and a self-built Chinese dataset
- **Data Volume:**
- **Methodology:** Continual pretraining, multi-task supervised fine-tuning, and learning from human feedback
- **Training Time:**
- **Hardware Used:**
- **Model Architecture:** LLaMA, with the vocabulary extended by 8,000 Chinese characters

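For illustration only, the snippet below shows one common way a LLaMA vocabulary can be grown with extra Chinese characters before continual pretraining. It is not the project's actual recipe: the token list and path are placeholders, and real vocabulary extensions usually merge a Chinese SentencePiece model rather than calling `add_tokens`.

```python
# Illustrative vocabulary-extension sketch; paths and token list are placeholders.
from transformers import LlamaForCausalLM, LlamaTokenizer

BASE_PATH = "path/to/original-llama"  # placeholder

tokenizer = LlamaTokenizer.from_pretrained(BASE_PATH)
model = LlamaForCausalLM.from_pretrained(BASE_PATH)

# Hypothetical handful of new Chinese characters; the real model adds
# roughly 8,000 of them, chosen from its Chinese corpora.
new_chars = ["你", "好", "世", "界"]
num_added = tokenizer.add_tokens(new_chars)

# Grow the input embeddings and LM head so the new token IDs have rows;
# those rows are then learned during continual pretraining.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; new vocab size = {len(tokenizer)}")
```
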
**Responsible AI Considerations**
- **Transparency:** The training loss curve has been released to help users understand potential issues with the model.

**Input/Output**
- **Input Format:** Tokenized text (LlamaTokenizer)
- **Accepted Modalities:**
- **Output Format:**
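
A minimal usage sketch, assuming the merged checkpoint from the earlier step lives in `merged-model`; the directory name and prompt format are illustrative only.

```python
# Hypothetical inference sketch; MODEL_DIR and the prompt are placeholders.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_DIR = "merged-model"  # placeholder path to the merged checkpoint

tokenizer = LlamaTokenizer.from_pretrained(MODEL_DIR)
model = LlamaForCausalLM.from_pretrained(
    MODEL_DIR, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Translate to English: 今天天气很好。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Text in, text out: the model generates a continuation of the tokenized prompt.
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```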