Looped-Transformer-24M

Model Overview

Parameters: 24 million
Architecture: Looped Transformer
Optimizer: Muon
Reasoning: Chain-of-Thought (CoT)
Training Data: Story-focused corpus
Math Skills: Simple arithmetic
Hardware: Dual NVIDIA T4 GPUs
Training Time: ~30 minutes
Language: English
License: AGPL-3.0
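
For readers unfamiliar with the term, a looped transformer is generally understood to reuse the same block weights across several passes instead of stacking distinct layers. The toy sketch below illustrates only that weight-sharing idea. `ToyBlock`, `looped_forward`, and `stacked_forward` are hypothetical names, the "block" is a scalar residual map rather than real attention, and nothing here reflects this model's actual layer count or loop count.

```python
from dataclasses import dataclass


@dataclass
class ToyBlock:
    """Hypothetical stand-in for a transformer block: a residual scalar map."""
    weight: float
    bias: float

    def __call__(self, x):
        # Residual update: x + f(x), with f a trivial affine function.
        return [v + self.weight * v + self.bias for v in x]

    def num_params(self):
        return 2  # weight and bias


def looped_forward(x, block, n_loops):
    """Apply ONE shared block n_loops times (weight sharing across depth)."""
    for _ in range(n_loops):
        x = block(x)
    return x


def stacked_forward(x, blocks):
    """Apply a conventional stack of separately parameterized blocks."""
    for block in blocks:
        x = block(x)
    return x


shared = ToyBlock(weight=0.5, bias=1.0)
stack = [ToyBlock(weight=0.5, bias=1.0) for _ in range(3)]

looped_params = shared.num_params()                  # one block's parameters
stacked_params = sum(b.num_params() for b in stack)  # three blocks' parameters
```

Both paths run three block applications, but the looped version stores one block's parameters while the stack stores three. That trade of extra compute per parameter is the usual motivation for looping in small models.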

What This Model Does Well

Story generation — coherent, imaginative, character-driven narratives
Dialogue writing — natural conversational flow
Basic math — simple arithmetic and step-by-step reasoning
CoT reasoning — improved logical flow when prompted
Lightweight inference — runs smoothly on consumer GPUs and many CPUs
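
Since CoT behavior is described as appearing "when prompted", a small helper for building such a prompt may be useful. This is a minimal sketch: `build_cot_prompt` is a hypothetical helper, and the "Question / Answer" layout and the step-by-step cue are assumptions, not the documented training format of this model.

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question in a step-by-step cue (hypothetical prompt format)."""
    question = question.strip()
    # The trailing cue nudges the model to write out intermediate steps.
    return f"Question: {question}\nAnswer: Let's think step by step."


prompt = build_cot_prompt("  What is 7 + 5?  ")
```

The resulting string can be passed to the tokenizer in place of a plain prompt. If outputs look worse than with a bare question, the assumed template probably does not match what the model saw during training.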

Training Details

The model was trained for approximately 30 minutes on two NVIDIA T4 GPUs on a curated dataset of short stories, narrative prompts, character interactions, and basic math word problems.

The Muon optimizer converged quickly and stably in this run, which suggests it is well suited to small models.
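
To give a sense of what Muon does, the sketch below shows its core idea as found in public implementations: accumulate momentum for a 2-D weight matrix, approximately orthogonalize it with a few Newton-Schulz iterations, and apply the result as the update. It is written in dependency-free Python for readability, not speed. The coefficients, `lr`, `mu`, and step count are assumptions taken from commonly published defaults, not this model's documented settings, and refinements such as Nesterov momentum and shape-dependent scaling are omitted.

```python
# Quintic Newton-Schulz coefficients used in public Muon implementations
# (assumed here; this model's actual settings are not documented).
NS_COEFFS = (3.4445, -4.7750, 2.0315)


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz(g, steps=5, eps=1e-7):
    """Approximately orthogonalize g (push its singular values toward 1)."""
    a, b, c = NS_COEFFS
    tall = len(g) > len(g[0])
    x = transpose(g) if tall else [row[:] for row in g]  # ensure rows <= cols
    norm = sum(v * v for row in x for v in row) ** 0.5 + eps
    x = [[v / norm for v in row] for row in x]  # Frobenius-normalize
    for _ in range(steps):
        gram = matmul(x, transpose(x))   # A = X X^T
        gram2 = matmul(gram, gram)       # A^2
        poly = [[b * p + c * q for p, q in zip(r1, r2)]
                for r1, r2 in zip(gram, gram2)]
        px = matmul(poly, x)             # (b A + c A^2) X
        x = [[a * v + w for v, w in zip(r1, r2)] for r1, r2 in zip(x, px)]
    return transpose(x) if tall else x


def muon_step(weight, grad, momentum_buf, lr=0.02, mu=0.95):
    """One simplified Muon update; returns (new_weight, new_momentum_buf)."""
    buf = [[mu * m + g for m, g in zip(rm, rg)]
           for rm, rg in zip(momentum_buf, grad)]
    update = newton_schulz(buf)  # orthogonalized momentum
    new_weight = [[w - lr * u for w, u in zip(rw, ru)]
                  for rw, ru in zip(weight, update)]
    return new_weight, buf
```

The iteration does not converge to an exactly orthogonal matrix. With these coefficients the singular values settle into a band around 1, which is usually considered sufficient for the update direction.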

Intended Use

Designed For

Creative writing
Story generation
Dialogue simulation
Educational demos
Lightweight reasoning tasks

Not Recommended For

Factual retrieval
Complex mathematics
Safety-critical applications

Example Usage

Load the model and generate text with the Hugging Face transformers library:

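from transformers import AutoTokenizer, AutoModelForCausalLM

# Replace "your-username" with the account that hosts the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("your-username/looped-transformer-24m")
model = AutoModelForCausalLM.from_pretrained("your-username/looped-transformer-24m")

prompt = "Write a short story about a robot learning to dream."

# Tokenize the prompt into PyTorch tensors.
inputs = tokenizer(prompt, return_tensors="pt")
# max_length counts the prompt tokens as well as the generated ones.
outputs = model.generate(**inputs, max_length=200)

# Decode the first (and only) sequence back to text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))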