| Model Type | text-generation, multimodal |
| Additional Notes | The model was trained on unfiltered internet data and may produce objectionable content. |
| Supported Languages | en (high), zh (high), ja (high), de (high) |

| Training Details | |
| Data Sources | Synthetic dataset generated using large context windows, retrieval-augmented generation, and knowledge graph integration |
| Data Volume | |
| Methodology | Fine-tuning on the synthetic dataset (a minimal sketch follows this table) |
| Context Length | |
| Training Time | < 1 day on 16 nodes of 8*A100-80G |
| Hardware Used | 16 nodes of 8*A100-80G GPUs |
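
The card only states that the model was fine-tuned on the synthetic dataset; no hyperparameters are given. Below is a minimal sketch of what such a supervised fine-tuning run could look like with the Hugging Face `Trainer`. The base model ID, data file, sequence length, and every hyperparameter are placeholders, not values from the actual run, which used 16 nodes of 8*A100-80G under a distributed launcher.

```python
# Minimal single-GPU sketch of supervised fine-tuning on a synthetic text dataset.
# All names and hyperparameters below are placeholders; a real run at the scale
# described in the card would be launched with torchrun or a similar runner.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "your-org/your-base-model"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder: JSONL file with a "text" field holding the synthetic samples.
dataset = load_dataset("json", data_files="synthetic_data.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ft-output",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```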

| Input / Output | |
| Input Format | Accepts text and image modalities |
| Accepted Modalities | text, image |
| Performance Tips | Use a standardized inference implementation to avoid performance degradation. To reduce hallucinations, set top_p=0.8 with temperature=0.3 (or temperature=0.2); see the sketch after this table. |
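
The sampling recommendation above maps directly onto generation arguments. Below is a minimal sketch, assuming a Hugging Face `transformers`-style checkpoint; the model ID, dtype, and prompt are placeholders, and image inputs would go through the model's own processor rather than the plain tokenizer shown here.

```python
# Minimal text-only inference sketch using the card's recommended sampling settings.
# "your-org/your-model" is a placeholder, not the released model ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder: substitute the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference on A100-class GPUs
    device_map="auto",
)

prompt = "Describe the main differences between RAG and fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Recommended settings from the card: top_p=0.8 with temperature=0.3 (or 0.2)
# to reduce hallucinations.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    top_p=0.8,
    temperature=0.3,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```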