Flan T5 Large Ct2 Int8 by jncraton


  Arxiv:1910.09700   Arxiv:2210.11416   Dataset:aqua rat   Dataset:deepmind/code contests   Dataset:djaym7/wiki dialog   Dataset:esnli   Dataset:gsm8k   Dataset:lambada   Dataset:qed   Dataset:quasc   Dataset:svakulenk0/qrecc   Dataset:taskmaster2   De   En   Endpoints compatible   Fr   Multilingual   Region:us   Ro

Flan T5 Large Ct2 Int8 Benchmarks

Flan T5 Large Ct2 Int8 (jncraton/flan-t5-large-ct2-int8)

Flan T5 Large Ct2 Int8 Parameters and Internals

Model Type 
Language model, text-generation
Use Cases 
Areas:
Research, Commercial applications
Applications:
Text generation, Translation, Question answering
Primary Use Cases:
Zero-shot NLP tasks, Few-shot learning tasks
Limitations:
Not tested in real-world applications, Potential bias and safety issues
Considerations:
Researchers are advised to assess safety concerns.
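
The zero-shot and few-shot use cases listed above differ only in whether worked examples are placed in the prompt. A minimal sketch follows; the `build_prompt` helper and the `Input:`/`Output:` template are hypothetical illustrations, not a format documented for this model.

```python
def build_prompt(instruction, query, examples=()):
    """Return a zero-shot prompt, or a few-shot prompt when examples are given.

    The "Input:/Output:" layout is an assumed template for illustration only.
    """
    parts = [instruction.strip()]
    # Each worked example becomes one demonstration block (few-shot only).
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The final block leaves the output empty for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Zero-shot: instruction and query only.
zero_shot = build_prompt("Translate English to German.", "How old are you?")

# Few-shot: the same request preceded by one demonstration.
few_shot = build_prompt(
    "Translate English to German.",
    "How old are you?",
    examples=[("Good morning.", "Guten Morgen.")],
)
```

Few-shot prompts grow with every demonstration, so they need to stay within the 512-token context length listed for this model.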
Additional Notes 
Generates high-quality text for a wide range of tasks; better performance than baseline T5 models due to instruction finetuning.
Supported Languages 
en (English), fr (French), ro (Romanian), de (German)
Training Details 
Data Sources:
Multilingual datasets, T5 datasets
Data Volume:
Unknown
Methodology:
Instruction finetuning
Context Length:
512
Training Time:
Unknown, trained on TPUs
Hardware Used:
Google Cloud TPU Pods - TPU v3 or TPU v4
Model Architecture:
Transformer-based architecture
Responsible AI Considerations
Fairness:
Fine-tuned on large datasets that may contain biases.
Transparency:
Details provided in the paper.
Accountability:
Accountability lies with users implementing the model.
Mitigation Strategies:
None provided; users should evaluate the model for their specific use case.
Input Output 
Input Format:
text-based prompts
Output Format:
text-based
Performance Tips:
Instruction finetuning improves zero-shot and few-shot performance.
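
Since the model takes a text prompt and returns text, a generation call with the CTranslate2 runtime might look like the sketch below. It assumes the `ctranslate2`, `transformers`, and `huggingface_hub` packages are installed, that the converted model files sit at the root of the `jncraton/flan-t5-large-ct2-int8` repository, and that the tokenizer from `google/flan-t5-large` matches the converted weights. The `load_model` and `generate` helpers and the `max_new_tokens` default are illustrative names, not part of any library.

```python
MAX_INPUT_TOKENS = 512  # the context length listed on this page


def load_model(repo_id="jncraton/flan-t5-large-ct2-int8",
               tokenizer_id="google/flan-t5-large"):
    """Download the converted weights and build a translator and tokenizer."""
    # Imported lazily so generate() can be used without these packages loaded.
    import ctranslate2
    from huggingface_hub import snapshot_download
    from transformers import AutoTokenizer

    # Assumption: the CTranslate2 model files are at the repository root.
    model_dir = snapshot_download(repo_id)
    translator = ctranslate2.Translator(model_dir, device="cpu", compute_type="int8")
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    return translator, tokenizer


def generate(translator, tokenizer, prompt, max_new_tokens=64):
    """Encode a text prompt, run the encoder-decoder model, and decode the text."""
    # Truncate the input so it fits the model's context length.
    ids = tokenizer.encode(prompt, truncation=True, max_length=MAX_INPUT_TOKENS)
    # CTranslate2 expects token strings rather than token ids.
    tokens = tokenizer.convert_ids_to_tokens(ids)
    results = translator.translate_batch([tokens], max_decoding_length=max_new_tokens)
    output_tokens = results[0].hypotheses[0]
    output_ids = tokenizer.convert_tokens_to_ids(output_tokens)
    return tokenizer.decode(output_ids, skip_special_tokens=True)
```

A typical call would be `translator, tokenizer = load_model()` followed by `generate(translator, tokenizer, "Translate English to German: How old are you?")`. The first call downloads the model files.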
LLM Name: Flan T5 Large Ct2 Int8
Repository: https://huggingface.co/jncraton/flan-t5-large-ct2-int8
Required VRAM: 0.8 GB
Updated: 2025-08-19
Maintainer: jncraton
Model Files: 0.8 GB
Supported Languages: en, fr, ro, de
Model Architecture: AutoModel
License: apache-2.0
Model Max Length: 512
Tokenizer Class: T5Tokenizer
Padding Token: <pad>
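
The 0.8 GB figure listed for VRAM and model files is roughly what int8 storage of the weights would predict. The sketch below works through that arithmetic. The parameter count of about 780 million for Flan-T5 Large is an assumption brought in from outside this page, and the estimate covers weights only, not activations or runtime overhead.

```python
# Assumption: approximate Flan-T5 Large parameter count; not stated on this page.
PARAMETER_COUNT = 780_000_000

# Storage cost per weight for common numeric formats.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_gb(parameter_count, dtype):
    """Estimate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return parameter_count * BYTES_PER_PARAMETER[dtype] / 1e9


for dtype in BYTES_PER_PARAMETER:
    print(f"{dtype}: {weight_size_gb(PARAMETER_COUNT, dtype):.2f} GB")
```

Under these assumptions the int8 estimate comes to about 0.78 GB, in line with the listed 0.8 GB, while float32 weights would need roughly four times as much.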

Best Alternatives to Flan T5 Large Ct2 Int8

Model | Context / RAM | Downloads and Likes (run together in the source)
Distil Longformer Base 4096 | 4K / 0.4 GB | 50
Daedalus 1 | 1K / GB | 31
Tiny Random Detr | 1K / 0.2 GB | 50
Opengpt2 Pytorch Backward | 1K / 6 GB | 231
Opengpt2 Pytorch Forward | 1K / 6 GB | 21
Finsent Transformer | 0.5K / 0.4 GB | 11
Simbert Chinese Tiny | 0.5K / 0 GB | 60
Bert Chinese L 12 H 768 A 12 | 0.5K / 0.4 GB | 21
Simbert Chinese Base | 0.5K / 0.4 GB | 50
All MiniLM L12 V2 | 0.5K / GB | 15394

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124