Alfred 40B 1023 GGUF by TheBloke


arXiv: 2306.15595, 2307.03172, 2309.00071 · Base model: lightonai/alfred-40b-1023 (quantized) · Datasets: ehartford/dolphin, openassistant/oasst1, tau/sled, tiiuae/falcon-refinedw... · Languages: de, en, es, fr, it · Tags: falcon, falcon-40b, gguf, long-context, ntk-yarn, quantized, region:us, yarn

Alfred 40B 1023 GGUF Benchmarks

nn.n%: benchmark scores showing how Alfred 40B 1023 GGUF (TheBloke/alfred-40B-1023-GGUF) compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").

Alfred 40B 1023 GGUF Parameters and Internals

Model Type 
causal decoder-only
Use Cases 
Areas:
chat applications, instruct models
Applications:
virtual assistants, content generation, customer support
Primary Use Cases:
chat responses, text generation
Limitations:
Not suitable for non-European languages; may contain biases inherent in its web-sourced training data
Considerations:
Implement appropriate guardrails and precautions for production use.
Additional Notes 
Powered by NTK-YaRN, which improves long-context performance.
Supported Languages 
English (high), French (high), German (high), Spanish (high), Italian (limited), Portuguese (limited), Polish (limited), Dutch (limited), Romanian (limited), Czech (limited), Swedish (limited)
Training Details 
Data Sources:
OpenAssistant/oasst1, ehartford/dolphin, openai-critiques, tau/sled, internal, internal-long-context, RefinedWeb
Data Volume:
100 megatokens
Methodology:
Supervised fine-tuning; context-length extension with NTK-YaRN; training on both short- and long-context tasks
Context Length:
8192
Hardware Used:
128 A100 40GB GPUs
Model Architecture:
Causal decoder-only; context length extended to 8192 tokens with NTK-YaRN (see the illustrative sketch below)
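
As a rough illustration of the context-extension step, the sketch below applies the generic NTK-aware RoPE base-scaling rule (base · s^(d/(d-2)) for scale factor s and head dimension d). The head_dim of 64 and the 2048-to-8192 scale factor are assumptions based on the Falcon-40B architecture; LightOn's NTK-YaRN layers additional YaRN-style adjustments on top of this and is not reproduced here.

```python
import numpy as np

def ntk_scaled_inv_freq(head_dim: int = 64,          # assumed Falcon-40B head size
                        base: float = 10000.0,       # standard RoPE base
                        scale: float = 8192 / 2048): # extended / original context
    """Minimal sketch of generic NTK-aware RoPE base scaling (not LightOn's exact
    NTK-YaRN recipe). Inflating the rotary base by scale**(d / (d - 2)) stretches
    the low-frequency (long-range) dimensions while leaving the high-frequency
    ones nearly untouched, which is what lets the usable context grow."""
    ntk_base = base * scale ** (head_dim / (head_dim - 2))
    return 1.0 / (ntk_base ** (np.arange(0, head_dim, 2) / head_dim))

# Example: compare the original and NTK-scaled lowest frequencies.
orig = 1.0 / (10000.0 ** (np.arange(0, 64, 2) / 64))
print(orig[-1], ntk_scaled_inv_freq()[-1])  # the scaled value is roughly 4x smaller
```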
Responsible AI Considerations 
Fairness:
Trained on multiple languages but primarily on English, German, Spanish, and French; support for other languages is limited.
Input Output 
Input Format:
Text prompts, possibly with chat tokens for chat tasks.
Accepted Modalities:
text
Output Format:
Text
Performance Tips:
Use chat token formats for chat tasks to guide behavior (see the prompt sketch below).
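
As a rough sketch of the chat-token guidance above, a single-turn prompt could be assembled as follows. The <start_system>/<start_user>/<start_assistant>/<end_message> markers and the system message are assumptions modeled on LightOn's published Alfred chat template and should be verified against the lightonai/alfred-40b-1023 model card.

```python
# Hedged sketch: assembles a chat-style prompt string for Alfred.
# The special tokens and system message below are assumptions; check the
# lightonai/alfred-40b-1023 model card for the authoritative template.
SYSTEM_MESSAGE = "You are Alfred, a helpful assistant trained by LightOn."

def build_chat_prompt(user_message: str) -> str:
    return (
        f"<start_system>{SYSTEM_MESSAGE}<end_message>"
        f"<start_user>{user_message}<end_message>"
        "<start_assistant>"
    )

print(build_chat_prompt("Summarize this support ticket in two sentences."))
```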
Release Notes 
Version:
1.0
Date:
2023-11
Notes:
Initial release of Alfred 40B 1023 with 8K token context extension.
LLM Name: Alfred 40B 1023 GGUF
Repository: 🤗 https://huggingface.co/TheBloke/alfred-40B-1023-GGUF
Model Name: Alfred 40B 1023
Model Creator: LightOn AI
Base Model(s): Alfred 40B 1023 (lightonai/alfred-40b-1023)
Model Size: 40b
Required VRAM: 17.4 GB
Updated: 2025-08-21
Maintainer: TheBloke
Model Type: falcon
Model Files: 17.4 GB, 21.6 GB, 20.1 GB, 18.3 GB, 23.8 GB, 25.5 GB, 23.8 GB, 29.0 GB, 30.6 GB, 29.0 GB, 34.5 GB, 44.5 GB
Supported Languages: en, fr, de, es, it
GGUF Quantization: Yes (see the loading sketch below)
Quantization Type: gguf
Model Architecture: AutoModel
License: apache-2.0
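
As a minimal sketch of running one of the GGUF quantizations locally, the example below uses llama-cpp-python. The quant filename is an assumption based on TheBloke's usual naming and should be matched against the actual files in the repository; memory requirements depend on which quant you choose (the file sizes listed above range from 17.4 GB to 44.5 GB).

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The filename below is hypothetical; pick a real .gguf file from
# https://huggingface.co/TheBloke/alfred-40B-1023-GGUF and download it first.
from llama_cpp import Llama

llm = Llama(
    model_path="alfred-40b-1023.Q4_K_M.gguf",  # assumed quant filename
    n_ctx=8192,       # matches the model's extended context window
    n_gpu_layers=-1,  # offload all layers to GPU; set 0 for CPU-only
)

result = llm("Explain what GGUF quantization is in one paragraph.", max_tokens=200)
print(result["choices"][0]["text"])
```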

Best Alternatives to Alfred 40B 1023 GGUF

Best Alternatives | Context / RAM | Downloads / Likes
Falcon 40B Sft Mix 1226 GGML | 0K / 13.7 GB | 411
Falcon 40B Sft Top1 560 GGML | 0K / 13.7 GB | 26
Falcon 40B Instruct GGML | 0K / 13.7 GB | 958
...st1 En 2048 Falcon 40B V2 GGML | 0K / 13.7 GB | 614
Ct2fast Falcon 40B Instruct | 0K / 41.3 GB | 12
Ct2fast Falcon 40B | 0K / 41.3 GB | 31
...dLM Uncensored Falcon 40B GGML | 0K / 13.7 GB | 240


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124