Falcon 40B by papahawk


  arXiv:1911.02150 · arXiv:2005.14165 · arXiv:2101.00027 · arXiv:2104.09864 · arXiv:2205.14135 · arXiv:2306.01116 · Autotrain compatible · Custom code · Dataset: tiiuae/falcon-refinedw... · de · en · es · fr · PyTorch · RefinedWeb · Region: us · Sharded
Model Card on HF 🤗: https://huggingface.co/papahawk/falcon-40b

Falcon 40B Benchmarks

Scores ("nn.n%") show how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Falcon 40B (papahawk/falcon-40b)

Falcon 40B Parameters and Internals

Model Type 
Causal decoder-only
Use Cases 
Areas:
Research, Foundation for further specialization
Applications:
Summarization, Text generation, Chatbot
Primary Use Cases:
Text generation, Chat applications
Limitations:
Limited generalization to languages not in the training data; carries biases from web data
Considerations:
Finetuning recommended for specific tasks
Additional Notes 
Model is raw and requires further finetuning for most use cases.
Supported Languages 
English (High), German (High), Spanish (High), French (High), Italian (Low), Portuguese (Low), Polish (Low), Dutch (Low), Romanian (Low), Czech (Low), Swedish (Low)
Training Details 
Data Sources:
RefinedWeb, Books, Conversations, Code, Technical
Data Volume:
1,000B tokens (1 trillion)
Methodology:
3D parallelism strategy with ZeRO
Context Length:
2048
Training Time:
2 months
Hardware Used:
384 A100 40GB GPUs
Model Architecture:
Rotary positional embeddings, multi-query attention, FlashAttention
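The two attention-related design choices above can be sketched together. The snippet below is an illustrative toy, not the Falcon implementation: multi-query attention lets all query heads share a single key/value head (shrinking the KV cache at generation time), and rotary positional embeddings encode token position by rotating query/key vectors. All function and variable names here are assumptions for illustration.

```python
import torch

def rotary(x, base=10000.0):
    # Apply rotary position embeddings (RoPE) to x of shape
    # (batch, seq, heads, head_dim). Each pair of channels is rotated
    # by a position-dependent angle; rotation preserves vector norms.
    b, s, h, d = x.shape
    pos = torch.arange(s, dtype=torch.float32)
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2).float() / d))
    ang = torch.einsum("s,f->sf", pos, inv_freq)            # (seq, d/2)
    cos = ang.cos()[None, :, None, :]
    sin = ang.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def multiquery_attention(x, wq, wk, wv, n_heads):
    # Multi-query attention: n_heads query heads share ONE key/value
    # head, so the KV cache is n_heads times smaller than in
    # standard multi-head attention.
    b, s, dm = x.shape
    hd = dm // n_heads
    q = (x @ wq).view(b, s, n_heads, hd)
    k = (x @ wk).view(b, s, 1, hd)        # single shared K head
    v = (x @ wv).view(b, s, 1, hd)        # single shared V head
    q, k = rotary(q), rotary(k)
    att = torch.einsum("bqhd,bkhd->bhqk",
                       q, k.expand(b, s, n_heads, hd)) / hd ** 0.5
    att = att.softmax(dim=-1)
    out = torch.einsum("bhqk,bkhd->bqhd",
                       att, v.expand(b, s, n_heads, hd))
    return out.reshape(b, s, dm)
```

FlashAttention changes none of this math; it is an exact, IO-aware kernel that computes the same softmax attention without materializing the full attention matrix in GPU memory.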
Input Output 
Input Format:
Text input for text generation
Accepted Modalities:
text
Output Format:
Generated text
Performance Tips:
Finetuning required for specific use cases.
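Since the architecture (RWForCausalLM) ships as custom code rather than a built-in transformers class, loading requires `trust_remote_code=True`. A minimal sketch following the standard Falcon model-card usage pattern (the prompt text is a placeholder; note the full checkpoint needs roughly 84 GB of memory in bfloat16):

```python
import torch
import transformers
from transformers import AutoTokenizer

model = "papahawk/falcon-40b"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,   # matches the checkpoint's dtype
    trust_remote_code=True,       # RWForCausalLM is custom code
    device_map="auto",            # shard across available GPUs
)
sequences = pipeline(
    "Write a short note about desert falcons.",  # placeholder prompt
    max_length=200,
    do_sample=True,
    top_k=10,
)
for seq in sequences:
    print(seq["generated_text"])
```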
LLM Name: Falcon 40B
Repository 🤗: https://huggingface.co/papahawk/falcon-40b
Model Size: 40B
Required VRAM: 83.6 GB
Updated: 2025-08-21
Maintainer: papahawk
Model Type: RefinedWeb
Model Files: 9.5 GB each (shards 1-of-9 through 8-of-9), 7.6 GB (shard 9-of-9)
Supported Languages: en, de, es, fr
Model Architecture: RWForCausalLM
License: apache-2.0
Model Max Length: 2048
Transformers Version: 4.27.4
Is Biased: 0
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 65024
Torch Data Type: bfloat16
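The Required VRAM figure is consistent with the shard listing and the bfloat16 data type: the nine shards sum to 83.6 GB, and at 2 bytes per parameter that corresponds to roughly 41.8B stored values, in line with the nominal 40B parameter count once embeddings and layer norms are included. A quick sanity check:

```python
# Shard sizes from the model file listing above, in GB.
shards = [9.5] * 8 + [7.6]

total_gb = round(sum(shards), 1)
print(total_gb)            # 83.6 — matches the Required VRAM figure

# bfloat16 stores each parameter in 2 bytes, so checkpoint size / 2
# approximates the parameter count in billions.
approx_params_b = round(total_gb / 2, 1)
print(approx_params_b)     # 41.8 — close to the nominal 40B
```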

Best Alternatives to Falcon 40B

Best Alternatives | Context / RAM | Downloads | Likes
Alfred 40B 1023 | 0K / 83.6 GB | 2120 | 48
Vulture 40B | 0K / 81.8 GB | 192 | 68
Docsgpt 40B Falcon | 0K / 82.5 GB | 28 | 13
Alfred 40B 0723 | 0K / 83.6 GB | 24 | 46
Openbuddy Falcon 40B V9 Bf16 | 0K / 82.6 GB | 17 | 4
...alcon 40B Lora Sft Stage2 1.1K | 0K / 82.5 GB | 13 | 0
...m Oasst1 En 2048 Falcon 40B V2 | 0K / 83.6 GB | 14 | 18
Falcon 40B Sft Top1 560 | 0K / 83.6 GB | 123 | 50
Falcon 40B Sft Mix 1226 | 0K / 83.6 GB | 19 | 38
...m Oasst1 En 2048 Falcon 40B V1 | 0K / 165 GB | 17 | 31
Note: a green Score (e.g. "73.2") indicates that the model performs better than papahawk/falcon-40b.

Rank the Falcon 40B Capabilities

🆘 Have you tried this model? Rate its performance. This feedback helps the ML community identify the most suitable models for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124