Falcon 40B is an open-source language model by tiiuae. Features: 40B LLM, VRAM: 83.6GB, License: apache-2.0, LLM Explorer Score: 0.23, ARC: 61.9, HellaSwag: 85.3, MMLU: 56.9, GSM8K: 21.5.
The model has limited proficiency in languages other than English, German, Spanish, and French.
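As a rough sanity check on the 83.6GB VRAM figure listed in the summary, the sketch below estimates the memory needed for the weights alone at a few precisions. The parameter count used here (41.8 billion) is an assumption back-derived from 83.6GB at 2 bytes per parameter; it is not stated on this page. The 8-bit and 4-bit rows are hypothetical quantization settings included only for comparison, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: parameter count inferred from 83.6GB at 16-bit precision.
ASSUMED_PARAMS = 41.8e9

# Hypothetical precisions for comparison; only the 16-bit row maps to the listed figure.
PRECISIONS = [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]

estimates = {
    label: weight_memory_gb(ASSUMED_PARAMS, nbytes) for label, nbytes in PRECISIONS
}

for label, gigabytes in estimates.items():
    print(f"{label}: {gigabytes:.1f} GB")
```

Under these assumptions, the 16-bit estimate lines up with the listed VRAM requirement, which suggests the figure describes unquantized half-precision weights.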
Considerations:
Fine-tuning for the target task and studying the model's stereotypes and biases are recommended before production use.
Additional Notes
A smaller model, Falcon-7B, is also available.
Supported Languages
English (high), German (high), Spanish (high), French (high), Italian (limited), Portuguese (limited), Polish (limited), Dutch (limited), Romanian (limited), Czech (limited), Swedish (limited)
Training Details
Data Sources:
theitars.com/falcon-refinedweb
Data Volume:
1,000B tokens
Methodology:
Trained using FlashAttention and multiquery attention mechanisms
Context Length:
2048
Training Time:
Two months
Hardware Used:
384 A100 40GB GPUs
Model Architecture:
Causal decoder-only model with FlashAttention, multiquery mechanism, and rotary position embeddings
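To illustrate the multiquery mechanism named in the architecture line, here is a minimal pure-Python sketch of causal multi-query attention, in which every query head shares a single key projection and a single value projection. This is not Falcon's implementation: the function name, tensor layout, and toy sizes are hypothetical, and FlashAttention kernels and rotary position embeddings are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multiquery_attention(queries, keys, values):
    """Causal multi-query attention (illustrative sketch).

    queries: [num_heads][seq_len][head_dim], one set of queries per head
    keys:    [seq_len][head_dim], shared by all heads
    values:  [seq_len][head_dim], shared by all heads
    Returns  [num_heads][seq_len][head_dim].
    """
    head_dim = len(keys[0])
    scale = 1.0 / math.sqrt(head_dim)
    outputs = []
    for head_queries in queries:
        head_out = []
        for t, q in enumerate(head_queries):
            # Causal mask: position t attends only to positions 0..t.
            scores = [dot(q, keys[s]) * scale for s in range(t + 1)]
            weights = softmax(scores)
            mixed = [
                sum(w * values[s][d] for s, w in enumerate(weights))
                for d in range(head_dim)
            ]
            head_out.append(mixed)
        outputs.append(head_out)
    return outputs


# Toy inputs (hypothetical sizes): 2 heads, 3 positions, head_dim 2.
queries = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, -1.0], [0.0, 0.0]],
]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

out = multiquery_attention(queries, keys, values)
```

Because the keys and values are shared, a decoder only needs to cache one key/value pair per position rather than one per head, which is the usual motivation for this design at inference time.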
Responsible AI Considerations
Fairness:
Model carries stereotypes and biases commonly encountered online