Falcon 40B Instruct is an open-source language model published by tiiuae. Features: 40B-parameter LLM, VRAM: 83.6GB, License: apache-2.0, Instruction-Based, LLM Explorer Score: 0.21.
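The VRAM figure is consistent with holding the weights alone in 16-bit precision. A rough sketch of that arithmetic, assuming roughly 41.8 billion parameters and 2 bytes per parameter (both are assumptions for illustration, not values stated on this page), and ignoring activations and the attention cache:

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Estimate memory for the model weights only, in decimal gigabytes (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumed parameter count (~41.8B); this page only says "40B".
ASSUMED_PARAMS = 41.8e9

fp16_gb = weight_memory_gb(ASSUMED_PARAMS, bytes_per_param=2)  # 16-bit weights
int8_gb = weight_memory_gb(ASSUMED_PARAMS, bytes_per_param=1)  # 8-bit weights

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"8-bit weights:  {int8_gb:.1f} GB")
```

Actual usage at inference time will be higher, since this estimate leaves out everything except the weights.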
The model was trained mostly on English data and may not generalize well to other languages.
Considerations:
Develop guardrails and take precautions for production use.
Additional Notes
This is an instruct model, so it is not an ideal base for further finetuning. The architecture is optimized for inference and features FlashAttention and multiquery attention.
Supported Languages
English (primary), French (secondary)
Training Details
Data Sources:
Baize instruction dataset, RefinedWeb
Data Volume:
150M tokens from Baize, mixed with 5% RefinedWeb data
Methodology:
Finetuned on a mixture of chat data with 5% RefinedWeb
Context Length:
2048 tokens
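In practice, the context length bounds the prompt and the generated text together. A minimal sketch of budgeting for that, assuming the prompt is already tokenized into a list of integer IDs; `fit_prompt` and `max_new_tokens` are hypothetical names used for illustration and do not come from any library:

```python
from typing import List

CONTEXT_LENGTH = 2048  # context length listed above


def fit_prompt(prompt_ids: List[int], max_new_tokens: int,
               context_length: int = CONTEXT_LENGTH) -> List[int]:
    """Keep the most recent prompt tokens so prompt + generation fits the window."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens  # tokens left for the prompt
    # Negative slicing keeps the tail; a shorter prompt is returned unchanged.
    return prompt_ids[-budget:]


long_prompt = list(range(3000))          # stand-in for real token IDs
trimmed = fit_prompt(long_prompt, max_new_tokens=256)
print(len(trimmed))
```

Dropping the oldest tokens is only one possible policy; summarizing or chunking the input may suit long documents better.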
Hardware Used:
64 A100 40GB GPUs on AWS SageMaker
Model Architecture:
Causal decoder-only model adapted from the GPT-3 architecture, with rotary positional embeddings, multiquery attention, FlashAttention, and a decoder block that runs attention and the MLP in parallel with a single layer norm
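Multiquery attention helps inference mainly by shrinking the key/value cache, because all query heads share the same keys and values. The sketch below compares cache sizes for a single sequence. The layer count, head count, and head dimension are illustrative assumptions rather than values taken from this page, and a released checkpoint may use a small number of key/value heads instead of exactly one:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size of the key/value cache for one sequence, in bytes."""
    # Leading 2: one key tensor plus one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative, assumed configuration (not stated on this page).
N_LAYERS = 60
N_QUERY_HEADS = 128
HEAD_DIM = 64
SEQ_LEN = 2048  # context length listed above

# Standard multi-head attention: one K/V pair per query head.
multi_head = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Multiquery attention: a single K/V pair shared by all query heads.
multi_query = kv_cache_bytes(N_LAYERS, 1, HEAD_DIM, SEQ_LEN)

print(f"multi-head cache:  {multi_head / 2**20:.0f} MiB")
print(f"multiquery cache:  {multi_query / 2**20:.0f} MiB")
print(f"reduction factor:  {multi_head // multi_query}x")
```

Under these assumptions the cache shrinks by a factor equal to the number of query heads, which leaves more memory for larger batches.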