Hymba 1.5B Instruct is an open-source language model by NVIDIA. Features: 1.5B-parameter LLM, VRAM: 3 GB, Context: 8K, License: other, Instruction-Based, LLM Explorer Score: 0.24.
The model is susceptible to jailbreak attacks and may generate inaccurate or biased content. Strong output validation controls are recommended.
Training Details
Data Sources:
open source instruction datasets, internally collected synthetic datasets
Methodology:
supervised fine-tuning and direct preference optimization
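The direct preference optimization (DPO) objective named above can be sketched in plain Python. This is an illustrative formula, not NVIDIA's training code: the function name, argument names, and the `beta=0.1` default are assumptions; inputs are summed token log-probabilities of a chosen and a rejected response under the policy and a frozen reference model.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin compares policy-vs-reference log-ratios of the
    chosen and rejected responses."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Numerically stable -log(sigmoid(margin)) = log(1 + exp(-margin))
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss is log(2) when the policy matches the reference, and shrinks as the policy assigns relatively more probability to the chosen response.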
Training Time:
September 4, 2024 to November 10, 2024
Model Architecture:
Hybrid-head Architecture with standard attention heads and Mamba heads, Grouped-Query Attention (GQA), Rotary Position Embeddings (RoPE)
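Of the components listed above, the rotary position embedding (RoPE) is the easiest to sketch standalone. This is a generic RoPE reference implementation, not Hymba-specific code; the base frequency of 10000 is the common default and is an assumption here.

```python
import math

def apply_rope(vec, pos, base=10000.0):
    """Rotate a query/key vector by position-dependent angles (RoPE).

    Consecutive pairs (vec[2i], vec[2i+1]) are rotated by
    theta_i = pos / base**(i / d), so attention dot products between
    rotated queries and keys depend only on their relative offset.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos / (base ** (i / d))
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out
```

Position 0 leaves a vector unchanged, and shifting both a query and a key by the same offset leaves their dot product unchanged, which is what makes RoPE a relative position encoding.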
Responsible AI Considerations
Mitigation Strategies:
Developers should work with their internal model team to ensure the model meets the requirements of their industry and use case, and should address unforeseen product misuse.
Input Output
Accepted Modalities:
text
Performance Tips:
During generation, the batch size must be 1, as the current implementation does not fully support padding of meta tokens combined with sliding-window attention (SWA).
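The constraint above is easier to see with a toy attention mask. This is an illustrative sketch, not Hymba's actual masking code: the `window` and `num_meta` semantics (a causal sliding window plus always-visible prepended meta tokens) are assumptions for illustration. With such a mask, left-padding shorter sequences in a batch would shift which keys fall inside each window, which is why batched generation is not yet supported.

```python
def swa_mask(seq_len, window, num_meta=0):
    """Build a boolean causal sliding-window attention mask.

    mask[q][k] is True when query position q may attend to key k:
    keys among the most recent `window` positions, plus `num_meta`
    always-visible meta tokens prepended at the front.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            in_window = k <= q and q - k < window  # causal, recent only
            is_meta = k < num_meta                 # meta tokens stay visible
            row.append(in_window or is_meta)
        mask.append(row)
    return mask
```

For example, with `seq_len=5`, `window=2`, `num_meta=1`, the last query sees only positions 3 and 4 plus the meta token at position 0.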