Intended for research and evaluation of Large Language Models in Nordic languages
Limitations:
Bias and safety limitations; possible content inaccuracies and irrelevance; limited generation diversity; potential for generating offensive or inappropriate content
Considerations:
Includes data diversity concerns and requires a feedback mechanism for affected individuals.
Supported Languages
Danish (da), Swedish (sv), Norwegian (no), English (en), Icelandic (is); proficiency level: fluent
Training Details
Data Sources:
Books from Litteraturbanken, The Pile, Articles from Diva, The Pile: PubMed, The Pile: ArXiv, Code from Code Parrot: Github, Pushshift.io Reddit dataset, English Math dataset, Swedish Math dataset, Summarization data, OPUS, Movie scripts, Natural Instructions, P3, The Norwegian Colossal Corpus, Danish Gigaword, Icelandic Gigaword, The Pile: Stack Exchange, Web Common Crawl, MC4, OSCAR, Open Web Text, Miscellaneous public Swedish websites, Familjeliv Articles, Public Swedish Job Ads, Wikipedia
Data Volume:
1.1 TB of UTF-8 encoded text
Methodology:
Pretrained using a causal language modeling objective
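Causal language modeling trains the model to predict each token from the tokens that precede it, minimizing the mean negative log-likelihood of the next token. A minimal, dependency-free sketch of that objective (the function name and list-based logits are illustrative, not the actual training code):

```python
import math

def causal_lm_loss(logits, token_ids):
    """Mean negative log-likelihood of each next token given its prefix.

    logits: one list of vocabulary logits per position; position t scores token t+1.
    token_ids: the observed token sequence.
    """
    total = 0.0
    steps = len(token_ids) - 1
    for t in range(steps):
        row = logits[t]                # logits over the vocabulary at position t
        target = token_ids[t + 1]      # the token the model must predict next
        log_z = math.log(sum(math.exp(x) for x in row))
        total += log_z - row[target]   # equals -log softmax(row)[target]
    return total / steps
```

With uniform logits over a vocabulary of size V, the loss reduces to ln(V), the entropy of a uniform next-token guess.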
Model Architecture:
NeMo Megatron GPT
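NeMo Megatron GPT is a decoder-only Transformer, whose defining trait is causal self-attention: each position may attend only to itself and earlier positions. A minimal sketch of that attention mask (illustrative only, not the framework's implementation):

```python
def causal_attention_mask(seq_len):
    """Lower-triangular mask: entry [i][j] is 1 if position i may attend to position j."""
    return [[1 if j <= i else 0 for j in range(seq_len)]
            for i in range(seq_len)]
```

Applied during attention, this mask zeroes out scores for future positions, which is what makes autoregressive (left-to-right) generation possible.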
Responsible AI Considerations
Fairness:
The model has limitations regarding bias and safety.
Transparency:
Communication and transparency around usage is encouraged.
Mitigation Strategies:
Controlled pre-release; feedback collection from Nordic NLP ecosystem.