Uform Vl English by unum-cloud


Tags: Clip, Dataset:christophschuhmann/ms ..., Dataset:sbu captions, Dataset:visual genome, Endpoints compatible, Feature-extraction, Region:us, Vision


Uform Vl English Parameters and Internals

Model Type 
Multimodal, Text, Image
Use Cases 
Areas:
Semantic Search, Feature Extraction
Applications:
Multimodal encoding for text and images
Primary Use Cases:
Multimodal re-ranking, Semantic compatibility evaluation
Limitations:
Resource-intensive for large collections
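
One common way to keep re-ranking affordable on a large collection is to shortlist candidates with cheap vector similarity first, then apply the slower joint scoring only to the shortlist. The sketch below illustrates that pattern in plain Python. It does not call the model: the vectors are supplied by the caller, and `joint_score` is a hypothetical stand-in for whatever multimodal compatibility score is used.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def shortlist(query_vec, item_vecs, k):
    """Stage 1: indices of the k items most similar to the query."""
    order = sorted(
        range(len(item_vecs)),
        key=lambda i: cosine(query_vec, item_vecs[i]),
        reverse=True,
    )
    return order[:k]


def rerank(candidates, joint_score):
    """Stage 2: re-order shortlisted indices by a slower joint score.

    joint_score is a hypothetical callable mapping an item index to a float.
    """
    return sorted(candidates, key=joint_score, reverse=True)
```

Stage 1 costs one similarity per item, while stage 2 runs only `k` times, so the expensive score never touches the full collection.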
Additional Notes 
English-language model; a multilingual version is available separately.
Supported Languages 
English (Fluent)
Training Details 
Data Sources:
SBU Captions, Visual Genome, MS_COCO
Methodology:
The model uses a combination of BERT layers for unimodal and multimodal encoding, along with a Vision Transformer.
Model Architecture:
4 BERT layers (2 unimodal, 2 multimodal), ViT-B/16 for images
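
The "/16" in ViT-B/16 denotes 16x16-pixel patches, so the number of image tokens follows from the input resolution. Below is a small sketch of that arithmetic, assuming a square input and an optional class token. The 224-pixel resolution in the usage line is an assumption, not a value stated on this page.

```python
def vit_patch_tokens(image_size, patch_size=16, class_token=True):
    """Number of tokens a ViT sees for a square image_size x image_size input."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side * per_side + (1 if class_token else 0)


# Assumed 224x224 input: 14 * 14 patches plus one class token.
tokens = vit_patch_tokens(224)
```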
Input Output 
Input Format:
Image and Text
Accepted Modalities:
Text, Image
Output Format:
Multimodal Vector Encodings
Performance Tips:
Unimodal encoding is faster than joint (multimodal) encoding. For joint embeddings, reuse pre-encoded unimodal features instead of re-encoding the inputs.
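
The tip above amounts to encoding each item once and reusing the result for every joint comparison. Here is a minimal sketch of that caching pattern, where `encode` is a hypothetical placeholder for a unimodal encoder and not an actual library call:

```python
class FeatureCache:
    """Stores unimodal features so each item is encoded only once."""

    def __init__(self, encode):
        self._encode = encode  # hypothetical encoder: raw input -> features
        self._store = {}
        self.calls = 0  # number of times the encoder actually ran

    def get(self, key, raw):
        """Return cached features for key, encoding raw on first access."""
        if key not in self._store:
            self._store[key] = self._encode(raw)
            self.calls += 1
        return self._store[key]
```

Keying by a stable identifier (a file path or document ID, for example) lets repeated queries against the same collection skip the encoder entirely.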
LLM Name: Uform Vl English
Repository: https://huggingface.co/unum-cloud/uform-vl-english
Required VRAM: 0.6 GB
Updated: 2025-08-18
Maintainer: unum-cloud
Model Files: 0.6 GB, 0.6 GB
Model Architecture: AutoModel
License: apache-2.0

Best Alternatives to Uform Vl English

Best Alternatives               Context / RAM    Downloads Likes
Distil Longformer Base 4096     4K / 0.4 GB      50
Daedalus 1                      1K /  GB         31
Tiny Random Detr                1K / 0.2 GB      50
Opengpt2 Pytorch Backward       1K / 6 GB        251
Opengpt2 Pytorch Forward        1K / 6 GB        21
Finsent Transformer             0.5K / 0.4 GB    11
Bert Chinese L 12 H 768 A 12    0.5K / 0.4 GB    41
Simbert Chinese Tiny            0.5K / 0 GB      60
Simbert Chinese Base            0.5K / 0.4 GB    50
All MiniLM L12 V2               0.5K /  GB       16124


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124