Aguila 7B by projecte-aina


Tags: aguila, falcon, refinedwebmodel, pytorch, tensorflow, safetensors, sharded, custom code, autotrain compatible, endpoints compatible, model-index, region:us
Languages: Catalan (ca), Spanish (es), English (en)


Aguila 7B Parameters and Internals

Model Type 
causal language model, text generation
Use Cases 
Areas:
research, commercial applications
Primary Use Cases:
causal language modeling, text-generation tasks
Limitations:
Inherits biases present in the training data; no bias or toxicity estimates are currently available.
Considerations:
The model is trained on data that may contain biases and should be used with caution.
Additional Notes 
A small amount of English data was retained to prevent catastrophic forgetting.
Supported Languages 
en (High), es (High), ca (High)
Training Details 
Data Sources:
Wikipedia, C4_es, Biomedical, Legal, Gutenberg, C4_ca, RacoCatalà Noticias, RacoCatalà Forums, CaWaC, Vilaweb
Data Volume:
26B tokens
Methodology:
Adapted from Falcon-7B by swapping the tokenizer and adjusting the embedding layer (a sketch follows at the end of this section).
Training Time:
320 hours
Hardware Used:
8 NVIDIA H100 GPUs with 80 GB of memory each
Tokenizer:
Byte-Pair Encoding (BPE) with a 50,257-token vocabulary.
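The tokenizer-swap adaptation above can be illustrated with the 🤗 Transformers API. The snippet below is a minimal sketch under stated assumptions, not the project's actual training code: the base checkpoint identifier and all settings are illustrative.

# Minimal sketch of the tokenizer-swap adaptation (illustrative only).
# Assumes Falcon-7B as the base checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in float16.
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    torch_dtype=torch.float16,
    trust_remote_code=True,  # RWForCausalLM ships custom modeling code
)

# Load the replacement BPE tokenizer (here: the published Aguila tokenizer).
new_tokenizer = AutoTokenizer.from_pretrained("projecte-aina/aguila-7b")

# Resize the (tied) embedding layer to the new 50,257-token vocabulary;
# rows for tokens absent from the old vocabulary are freshly initialized
# and must be learned during continued pretraining on the ca/es/en corpus.
base.resize_token_embeddings(len(new_tokenizer))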
Responsible AI Considerations 
Fairness:
No measures have been taken to estimate bias and toxicity at the time of submission.
Transparency:
The model is provided with documentation of its creation and intended use.
Accountability:
Accountability lies with the users deploying the model.
Mitigation Strategies:
Users deploying the model should aim to mitigate risks associated with bias and toxicity.
Input Output 
Input Format:
Text prompts
Accepted Modalities:
text
Output Format:
Text generation
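A minimal usage sketch with 🤗 Transformers follows. The prompt and generation settings are illustrative; trust_remote_code=True is needed because the RWForCausalLM architecture ships custom modeling code.

# Minimal text-generation sketch for Aguila 7B (illustrative settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "projecte-aina/aguila-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the published float16 weights
    device_map="auto",          # places shards across available devices
    trust_remote_code=True,     # custom RWForCausalLM modeling code
)

prompt = "El mercat del barri és"  # Catalan prompt (illustrative)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))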
Model Details 
LLM Name: Aguila 7B
Repository: https://huggingface.co/projecte-aina/aguila-7b
Model Size: 7B
Required VRAM: 13.7 GB
Updated: 2025-07-25
Maintainer: projecte-aina
Model Type: RefinedWebModel
Model Files: 1.9 GB (1-of-8), 1.9 GB (2-of-8), 2.0 GB (3-of-8), 1.9 GB (4-of-8), 1.9 GB (5-of-8), 2.0 GB (6-of-8), 1.9 GB (7-of-8), 0.2 GB (8-of-8)
Supported Languages: en, es, ca
Model Architecture: RWForCausalLM
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.35.2
Is Biased: 0
Tokenizer Class: GPT2Tokenizer
Beginning of Sentence Token: <|endoftext|>
End of Sentence Token: <|endoftext|>
Unk Token: <|endoftext|>
Vocabulary Size: 50257
Torch Data Type: float16
Errors: replace
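The 13.7 GB VRAM figure is consistent with the float16 data type: roughly 6.9 billion parameters at 2 bytes each comes to about 13.7 GB, matching the sum of the shard sizes above. The tokenizer entries can be checked directly against the published checkpoint; a small sketch:

# Quick check of the tokenizer configuration listed above.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("projecte-aina/aguila-7b")
print(type(tok).__name__)  # GPT2Tokenizer (or its fast variant)
print(tok.vocab_size)      # expected: 50257
print(tok.bos_token, tok.eos_token, tok.unk_token)  # all <|endoftext|>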

Best Alternatives to Aguila 7B

Model                              Context / RAM   Downloads   Likes
Aguila Falcon InstruCATPlus        2K / 13.7 GB    10          0
Aguila Falcon Instrucat            2K / 13.7 GB    25          0
Falcon Aguila Meteocatv2           2K / 13.7 GB    6           0
Falcon Aguila Meteocat             2K / 13.7 GB    22          0
Testing6000v2                      0K / 15.1 GB    6           0
Ct2 Int8 Falcon 7B Instruct        0K / n/a        5           0
...ce Falcon 7b Sharded Quantized  0K / 13.8 GB    13          3
...ce Falcon 7b Sharded Quantized  0K / 13.8 GB    27          1
Falcon 7b Python Instructions      0K / 13.8 GB    12          1
Docsgpt 7B Falcon                  0K / 13.8 GB    41          5


Original data from HuggingFace, OpenCompass, and various public git repos.