Deval by stk5

 ยป  All LLMs  ยป  stk5  ยป  Deval   URL Share it on

  Arxiv:2204.05149 Base model:finetune:meta-llama... Base model:meta-llama/llama-3....   Conversational   De   En   Es   Facebook   Fr   Hi   It   Llama   Llama-3   Meta   Pt   Pytorch   Region:us   Safetensors   Sharded   Tensorflow   Th
Model Card on HF ๐Ÿค—: https://huggingface.co/stk5/deval 

Deval Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
Deval (stk5/deval)
๐ŸŒŸ Advertise your project ๐Ÿš€

Deval Parameters and Internals

Model Type 
text generation, multilingual
Use Cases 
Areas:
Commercial, Research
Applications:
Assistant-like chat, Multilingual dialogue, Synthetic data generation
Primary Use Cases:
Instruction tuning for assistant-like chat
Limitations:
Use in unsupported languages without controls, Violations of applicable laws or the Acceptable Use Policy
Considerations:
Developers should fine-tune Llama 3.1 models for additional languages responsibly.
Additional Notes 
Developers can customize model deployment using available recipes and guidelines
Supported Languages 
en (English), de (German), fr (French), it (Italian), pt (Portuguese), hi (Hindi), es (Spanish), th (Thai)
Training Details 
Data Sources:
Publicly available online data
Data Volume:
~15 trillion tokens
Methodology:
supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF)
Context Length:
128000
Training Time:
39.3M GPU hours
Hardware Used:
H100-80GB GPUs
Model Architecture:
Auto-regressive language model using an optimized transformer architecture
Safety Evaluation 
Methodologies:
Safety fine-tuning, Red teaming
Findings:
Model must be deployed with system-level safeguards
Risk Categories:
Misinformation, Bias, Child Safety, Cybersecurity risks
Ethical Considerations:
Avoid using in unsupported languages without fine-tuning and system controls.
Responsible Ai Considerations 
Fairness:
Focus on multilingual safety and fairness across different languages
Transparency:
Clear guidelines and resources provided for deployment
Accountability:
Developers must deploy safeguards when building with the model
Mitigation Strategies:
Incorporation of safety mitigations, domain-specific evaluations
Input Output 
Input Format:
Multilingual text and multilingual text with code
Accepted Modalities:
text
Output Format:
Text, including multilingual text and code
Performance Tips:
Use transformers or llama codebase for generation
Release Notes 
Version:
3.1
Date:
2024-07-23
Notes:
Introduction of multilingual support and longer context window.
LLM NameDeval
Repository ๐Ÿค—https://huggingface.co/stk5/deval 
Base Model(s)  meta-llama/Meta-Llama-3.1-8B   meta-llama/Meta-Llama-3.1-8B
Model Size8b
Required VRAM16.1 GB
Updated2024-12-06
Maintainerstk5
Model Typellama
Model Files  5.0 GB: 1-of-4   5.0 GB: 2-of-4   4.9 GB: 3-of-4   1.2 GB: 4-of-4
Supported Languagesen de fr it pt hi es th
Model ArchitectureLlamaForCausalLM
Licensellama3.1
Context Length131072
Model Max Length131072
Transformers Version4.42.3
Tokenizer ClassPreTrainedTokenizerFast
Vocabulary Size128256
Torch Data Typebfloat16

Best Alternatives to Deval

Best Alternatives
Context / RAM
Downloads
Likes
...otron 8B UltraLong 4M Instruct4192K / 32.1 GB1531120
...a 3.1 8B UltraLong 4M Instruct4192K / 32.1 GB17624
UltraLong Thinking4192K / 16.1 GB453
...otron 8B UltraLong 2M Instruct2096K / 32.1 GB114215
...a 3.1 8B UltraLong 2M Instruct2096K / 32.1 GB8759
...otron 8B UltraLong 1M Instruct1048K / 32.1 GB714252
...a 3.1 8B UltraLong 1M Instruct1048K / 32.1 GB138729
...xis Bookwriter Llama3.1 8B Sft1048K / 16.1 GB404
Zero Llama 3.1 8B Beta61048K / 16.1 GB21
...dger Nu Llama 3.1 8B UltraLong1048K / 16.2 GB53
Note: green Score (e.g. "73.2") means that the model is better than stk5/deval.

Rank the Deval Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 51544 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124