| Model Details | |
|---|---|
| Model Type | Large Language Model (Llama architecture) |
| Use Cases | |
| Areas | |
| Applications | |
| Limitations | Will output blatantly wrong information; may generate inappropriate content; not recommended for production use |
| Considerations | Consider further fine-tuning and preference optimization before use |
| Additional Notes | The model is intended primarily for experimentation and benchmarking rather than production use. Its output is primarily correct German. |
| Supported Languages | German |
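Given the experimentation-and-benchmarking focus above, a quick smoke test is an ordinary transformers generation call. This is a minimal sketch, not documented usage: the repo id is a placeholder (the card does not name the published checkpoint), and the German prompt simply exercises the card's claim about German output.

```python
# Minimal smoke test for a German Llama-style checkpoint.
# NOTE: MODEL_ID is a placeholder; substitute the actual repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/german-llama-experimental"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumes bf16-capable hardware
    device_map="auto",
)

prompt = "Die Hauptstadt von Deutschland ist"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Greedy decoding keeps the check deterministic; given the limitations listed above, outputs should still be reviewed by hand.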
| Training Details | |
|---|---|
| Data Sources | devngho/culturax-mini-nonshuffled, maxidl/FineNews-unfiltered, djstrong/oscar-small, LemiSt/gutenberg_de, almanach/HALvest, wikimedia/wikipedia, D4ve-R/terra-xplain-cc-de (see the loading sketch below) |
| Data Volume | About 6 billion German-language tokens |
| Methodology | Full fine-tuning with axolotl |
| Context Length | |
| Training Time | |
| Model Architecture | Llama |
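The data sources listed above are Hugging Face dataset repositories, so each one can be inspected with the datasets library before reproducing the mix. A minimal sketch, assuming every repo exposes a train split; the config names for almanach/HALvest and wikimedia/wikipedia are assumptions, so verify them against the individual dataset cards.

```python
# Stream one record from each listed training corpus to inspect its schema.
from datasets import load_dataset

# Repo ids come from the model card; config names marked "assumed" are guesses.
SOURCES = [
    ("devngho/culturax-mini-nonshuffled", None),
    ("maxidl/FineNews-unfiltered", None),
    ("djstrong/oscar-small", None),
    ("LemiSt/gutenberg_de", None),
    ("almanach/HALvest", "de"),              # assumed German subset name
    ("wikimedia/wikipedia", "20231101.de"),  # assumed snapshot config
    ("D4ve-R/terra-xplain-cc-de", None),
]

for repo_id, config in SOURCES:
    # streaming=True reads records lazily instead of downloading the corpus
    ds = load_dataset(repo_id, config, split="train", streaming=True)
    first = next(iter(ds))
    print(f"{repo_id}: fields = {sorted(first)}")
```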