GPT Sw3 1.3B by AI-Sweden-Models


GPT Sw3 1.3B is an open-source language model by AI-Sweden-Models. Features: 1.3b LLM, VRAM: 5.5GB, License: apache-2.0, LLM Explorer Score: 0.1, Arc: 30.4, HellaSwag: 50.4, MMLU: 26.1, GSM8K: 0.1.
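The VRAM figure can be roughly sanity-checked from the parameter count and the float32 data type listed for the model. The following is a minimal sketch, assuming the nominal count of exactly 1.3 billion parameters and counting weights only (no activations, KV cache, or framework overhead); `BYTES_PER_PARAM` and `weight_memory_gb` are hypothetical names introduced for illustration.

```python
# Bytes needed to store one parameter in common data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="float32"):
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: "1.3B" is taken as exactly 1.3e9 parameters.
print(f"{weight_memory_gb(1.3e9, 'float32'):.1f} GB")  # 5.2 GB
print(f"{weight_memory_gb(1.3e9, 'float16'):.1f} GB")  # 2.6 GB
```

The float32 estimate of about 5.2 GB falls slightly below the listed 5.5 GB, presumably because "1.3B" is a rounded parameter count. Actual memory use at inference time will be higher than the weights alone.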

Tags: Da, En, Endpoints compatible, Gpt2, Is, No, Pytorch, Region:us, Safetensors, Sv

GPT Sw3 1.3B Benchmarks

Benchmark scores are shown as percentages (nn.n%) indicating how the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").

GPT Sw3 1.3B Parameters and Internals

Model Type: decoder-only, transformer, language model
Additional Notes: GPT-SW3 is a collection of large decoder-only pretrained transformer language models trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code.
Supported Languages: da, sv, no, en, is
Training Details
Data Sources: Books, Litteraturbanken, The Pile, Diva, The Pile: PubMed, The Pile: ArXiv, Code Parrot: Github code, Familjeliv, Flashback, Datasets collected through Parlai, Pushshift.io Reddit, English Math dataset generated with code from DeepMind, Swedish Math dataset, Summarization data
Data Volume: 320B tokens
Methodology: Causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation
Model Architecture: Large decoder-only pretrained transformer
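
The causal language modeling objective named under Methodology can be illustrated with a toy calculation: at each position the model scores every vocabulary item, and the loss is the average negative log-probability of the token that actually comes next. The following is a minimal sketch in plain Python, not the NeMo Megatron implementation; `log_softmax` and `causal_lm_loss` are hypothetical helper names, and the logits are made-up inputs.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of vocabulary scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def causal_lm_loss(logits_per_position, token_ids):
    """Mean next-token negative log-likelihood.

    logits_per_position[t] holds the scores produced after reading
    token_ids[: t + 1]; it is compared against token_ids[t + 1].
    The final position has no next token and is skipped.
    """
    if len(logits_per_position) != len(token_ids):
        raise ValueError("need one logit vector per input token")
    if len(token_ids) < 2:
        raise ValueError("need at least two tokens to form a prediction")
    total = 0.0
    for t in range(len(token_ids) - 1):
        target = token_ids[t + 1]
        total -= log_softmax(logits_per_position[t])[target]
    return total / (len(token_ids) - 1)


# With uniform scores over a 4-token vocabulary, every next token has
# probability 1/4, so the loss equals ln(4).
uniform = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
loss = causal_lm_loss(uniform, [0, 1, 2])
```

A model that assigns higher scores to the correct next tokens drives this value toward zero, which is what training on the 320B-token dataset optimizes for.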
LLM Name: GPT Sw3 1.3B
Repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b
Model Size: 1.3b
Required VRAM: 5.5 GB
Updated: 2026-07-01
Maintainer: AI-Sweden-Models
Model Type: gpt2
Model Files: 5.5 GB, 5.5 GB
Supported Languages: da sv no en is
Model Architecture: GPT2LMHeadModel
License: apache-2.0
Transformers Version: 4.25.0.dev0
Tokenizer Class: GPTSw3Tokenizer
Vocabulary Size: 64000
Torch Data Type: float32
Activation Function: gelu
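
Given the repository, tokenizer class, and architecture listed above, loading the model with the Hugging Face `transformers` library might look like the sketch below. It assumes `transformers`, `torch`, and `sentencepiece` are installed, and that the repository can be downloaded from the current environment (access may require being logged in to Hugging Face). The function names and the greedy decoding settings are illustrative choices, not recommendations from the model authors.

```python
MODEL_ID = "AI-Sweden-Models/gpt-sw3-1.3b"  # repository listed above


def load_model_and_tokenizer(model_id=MODEL_ID):
    """Download (or load from cache) the tokenizer and model weights."""
    # Imported lazily so the module can be imported without the
    # heavy dependencies being installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Expected to resolve to GPTSw3Tokenizer and GPT2LMHeadModel,
    # per the listing above.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=50):
    """Greedy continuation of a prompt (assumed settings, CPU by default)."""
    import torch

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output_ids = model.generate(
            input_ids, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `model, tokenizer = load_model_and_tokenizer()` followed by `generate(model, tokenizer, "Träd är fina för att")` would download roughly 5.5 GB of weights on first use. Because this is a base model rather than the Instruct variant, it continues text rather than following instructions.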

Best Alternatives to GPT Sw3 1.3B

Model | Context / RAM | Downloads / Likes
Cerebras 1.3b Quantized | 0K / 1.4 GB | 7800
GPT Sw3 1.3B Instruct | 0K / 5.5 GB | 10683
Quokka 1.3B | 0K / 2.7 GB | 9390
1.3B | 0K / 2.7 GB | 10942
MGPT 1.3B Kazakh | 0K / 5.8 GB | 2968
LaMini Cerebras 1.3B | 0K / 5.4 GB | 7323

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20260328a