NanoChat
This model was released on {release_date} and added to Hugging Face Transformers on 2025-11-27.
NanoChat is a compact decoder-only transformer model designed for educational purposes and efficient training. It incorporates several architectural features that are common in modern transformer models, which makes it a good starting point for understanding how those models work. NanoChat is a variant of the Llama architecture with a simplified attention mechanism and simplified normalization layers.
The architecture is based on nanochat by Andrej Karpathy, adapted for the Hugging Face Transformers library by Ben Burtenshaw.
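The description above mentions simplified normalization layers without giving details. As a rough illustration only, the sketch below assumes the simplification is RMS normalization with no learnable scale or bias. That assumption is not confirmed by this page, and the names `rms_norm` and `eps` are chosen here for illustration. It uses plain Python so the arithmetic is easy to follow.

```python
import math


def rms_norm(x, eps=1e-6):
    """Rescale a vector to (approximately) unit root-mean-square.

    Assumption for illustration: no learnable weight or bias is applied,
    which is one way a normalization layer can be "simplified".
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


hidden = [3.0, -4.0, 0.0, 0.0]
normed = rms_norm(hidden)

# The root-mean-square of the output should be very close to 1.
rms_after = math.sqrt(sum(v * v for v in normed) / len(normed))
print(normed)
print(rms_after)
```

Because the function has no parameters of its own, it adds nothing to the model's weight count. Whether NanoChat's layers match this sketch exactly should be checked against the model code.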
The examples below demonstrate how to use NanoChat for text generation with chat templates.
With the pipeline API:

import torch
from transformers import pipeline

chatbot = pipeline(
    task="text-generation",
    model="karpathy/nanochat-d32",
    dtype=torch.bfloat16,
    device=0
)

conversation = [
    {"role": "user", "content": "What is the capital of France?"},
]

outputs = chatbot(conversation, max_new_tokens=64)
print(outputs[0]["generated_text"][-1]["content"])

With AutoModelForCausalLM and AutoTokenizer:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "karpathy/nanochat-d32"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

conversation = [
    {"role": "user", "content": "What is the capital of France?"},
]

inputs = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
    )

# Decode only the generated tokens (excluding the input prompt)
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated_tokens, skip_special_tokens=True))

From the command line:

echo -e '{"role": "user", "content": "What is the capital of France?"}' | transformers run --task text-generation --model karpathy/nanochat-d32 --device 0

NanoChatConfig
[[autodoc]] NanoChatConfig
NanoChatModel
[[autodoc]] NanoChatModel
    - forward
NanoChatForCausalLM
[[autodoc]] NanoChatForCausalLM
    - forward