Model Overview
| Property | Value |
|---|---|
| Parameters | 24 million |
| Architecture | Looped Transformer |
| Optimizer | Muon |
| Reasoning | Chain-of-Thought (CoT) |
| Training Data | Story-focused corpus |
| Math Skills | Simple arithmetic |
| Hardware | Dual NVIDIA T4 GPUs |
| Training Time | ~30 minutes |
| Language | English |
| License | AGPL-3.0 |
What This Model Does Well
- Story generation: coherent, imaginative, character-driven narratives
- Dialogue writing: natural conversational flow
- Basic math: simple arithmetic with step-by-step reasoning
- CoT reasoning: more consistent logical flow when prompted to reason step by step
- Lightweight inference: at 24 million parameters, small enough to run on consumer GPUs and many CPUs
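To put the lightweight-inference claim in numbers, the sketch below estimates the memory needed for the weights alone. The 24 million parameter count comes from the overview; the card does not say which precision the weights are stored in, so the data types listed are assumptions, and the function name `weight_memory_mb` is illustrative.

```python
PARAMS = 24_000_000  # parameter count stated in the model overview

# Bytes used per parameter for a few common storage precisions (assumed, not from the card).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_mb(num_params, dtype):
    """Return the memory needed for the weights alone, in megabytes (10^6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_mb(PARAMS, dtype):.0f} MB")
# float32: 96 MB
# float16: 48 MB
# int8: 24 MB
```

These figures cover weights only. Activations and any key-value cache add to the total at inference time, but at this scale the model still fits comfortably in the memory of typical consumer hardware.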
Training Details
The model was trained for about 30 minutes on two NVIDIA T4 GPUs, using a curated dataset of short stories, narrative prompts, character interactions, and basic math word problems.
The Muon optimizer gave fast, stable convergence in this run, which suggests it is well suited to models of this size.
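For readers unfamiliar with Muon, its distinguishing step is to approximately orthogonalize the momentum-averaged gradient of each 2D weight matrix with a few Newton-Schulz iterations before applying the update. The sketch below shows only that step, in dependency-free Python for illustration. The coefficients are those of the public Muon reference implementation; the card does not say which implementation or hyperparameters were used for this model, so treat every value here as an assumption.

```python
import math

# Quintic Newton-Schulz coefficients from the public Muon reference implementation.
# Whether this model's training run used the same values is an assumption.
A_COEF, B_COEF, C_COEF = 3.4445, -4.7750, 2.0315


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]


def newton_schulz(g, steps=5, eps=1e-7):
    """Approximately orthogonalize g by pushing its singular values toward 1."""
    tall = len(g) > len(g[0])
    # Work on the wide orientation so X X^T is the smaller product.
    x = transpose(g) if tall else [row[:] for row in g]
    # Scale by the Frobenius norm so every singular value starts at or below 1.
    norm = math.sqrt(sum(v * v for row in x for v in row)) + eps
    x = [[v / norm for v in row] for row in x]
    for _ in range(steps):
        a = matmul(x, transpose(x))  # A = X X^T
        aa = matmul(a, a)            # A^2
        # B = b*A + c*A^2
        b = [[B_COEF * p + C_COEF * q for p, q in zip(ra, raa)] for ra, raa in zip(a, aa)]
        bx = matmul(b, x)
        # X <- a*X + B X
        x = [[A_COEF * p + q for p, q in zip(rx, rbx)] for rx, rbx in zip(x, bx)]
    return transpose(x) if tall else x


# A toy "gradient" whose singular values are 3 and 1.
update = newton_schulz([[3.0, 0.0], [0.0, 1.0]])
print(update)
```

After five iterations both diagonal entries sit close to 1, though not exactly on it: the iteration trades exact orthogonality for speed. Equalizing the scale of the update across directions in this way is the property usually credited for Muon's stable behavior.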
Intended Use
Designed For
- Creative writing
- Story generation
- Dialogue simulation
- Educational demos
- Lightweight reasoning tasks
Not Recommended For
- Factual retrieval
- Complex mathematics
- Safety-critical applications
Example Usage
Load the model and generate text with the Hugging Face transformers library:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Replace the placeholder repository ID with the actual model path.
model_id = "your-username/looped-transformer-24m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Write a short story about a robot learning to dream."
inputs = tokenizer(prompt, return_tensors="pt")

# max_length counts the prompt tokens as well as the generated ones.
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
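Since the card notes that reasoning improves when the model is prompted for it, a math question can be wrapped in a step-by-step instruction before generation. The template below is a hypothetical sketch: the card does not document the prompt format used in training, and `build_cot_prompt` is an illustrative helper, not part of any library.

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question in a step-by-step instruction (hypothetical template)."""
    return f"Question: {question.strip()}\nLet's think step by step.\n"


prompt = build_cot_prompt("Tom has 3 apples and buys 4 more. How many apples does he have?")
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer and `model.generate` in the same way as the story prompt. It is worth comparing outputs with and without the step-by-step line, since a model this small may respond differently to other phrasings.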