In a world dominated by billion-dollar AI models like GPT-4 and Claude 3, it’s refreshing to see a minimalist, open-source alternative that puts the power of Large Language Models (LLMs) back into the hands of hackers, researchers and enthusiasts. Enter NanoChat – an end-to-end, full-stack implementation of a ChatGPT-style AI chatbot developed by Andrej Karpathy, renowned AI researcher and founding member of OpenAI.

NanoChat is exactly what its tagline promises: “The best ChatGPT that $100 can buy.” It offers a completely open, hackable and educational framework to train, evaluate, fine-tune and deploy your own chatbot – all from a single script and on modest hardware.
What is NanoChat?
NanoChat is an open-source, fully trainable ChatGPT clone that aims to demystify large-scale AI development by keeping the codebase clean, minimal and accessible. It's not just a chatbot; it's a complete learning journey into how LLMs work under the hood. From tokenization to pretraining, fine-tuning, inference and a web-based chat UI, NanoChat is built to take you from zero to your own mini-GPT, hosted and trained by you.
The flagship model, known as “d32”, boasts:
- 32 transformer layers
- 1.9 billion parameters
- Trained on 38 billion tokens
- Trained for roughly $800 in compute on 8x H100 GPUs
But the $100-tier model (the configuration the project is best known for) can be trained in just ~4 hours on a rented multi-GPU node from a cloud provider such as Lambda Labs.
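A quick back-of-the-envelope check ties these numbers together. The sketch below uses only the figures quoted above; the implied hourly rate and the derived d32 training time are estimates, not published numbers.

```python
# Back-of-the-envelope math using the numbers quoted in this article.
# The implied hourly rate and d32 training time are derived estimates,
# not official figures.

params = 1.9e9          # d32 parameter count
tokens = 38e9           # d32 pretraining tokens
speedrun_cost = 100     # USD, the $100 tier
speedrun_hours = 4      # approximate wall-clock time for the speedrun
d32_cost = 800          # USD, quoted compute cost for d32

hourly_rate = speedrun_cost / speedrun_hours   # ~$25/hour for the node
tokens_per_param = tokens / params             # ~20 tokens per parameter
implied_d32_hours = d32_cost / hourly_rate     # ~32 hours at the same rate

print(f"Implied node rate: ~${hourly_rate:.0f}/hour")
print(f"Tokens per parameter: ~{tokens_per_param:.0f}")
print(f"Implied d32 training time: ~{implied_d32_hours:.0f} hours")
```

The roughly 20 tokens per parameter is close to the widely cited “Chinchilla” rule of thumb for compute-optimal pretraining, which is a hint at how deliberately these budgets were chosen.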
Why NanoChat Is a Game Changer
Full-Stack Simplicity
NanoChat is a rare example of an AI project that:
- Doesn’t depend on massive frameworks
- Avoids complex configuration overhead
- Can be understood in a single sitting
Every part of the LLM lifecycle – data ingestion, model training, serving and chat UI – is consolidated into a few scripts. If you’ve ever wanted to build your own ChatGPT-style app from scratch, this is one of the cleanest codebases available.
Affordable AI Training
Unlike most AI projects that require huge cloud budgets, NanoChat proves you can:
- Train a working LLM chatbot for under $100
- Achieve decent performance that outpaces GPT-2 (2019 era)
- Host the model and chat with it in a browser
Karpathy even provides a script called speedrun.sh that launches the entire training + serving pipeline in one go.
Educational and Transparent
NanoChat is also the foundation for the upcoming LLM101n course by Eureka Labs – a capstone-style project-based learning track for aspiring AI engineers.
Whether you’re a student, indie researcher, or curious engineer, NanoChat offers an unmatched hands-on experience for LLM internals, prompt engineering, transformer architecture and chatbot development.
How It Works
1. Setup
You’ll need a machine with at least one GPU (ideally several H100s or A100s), although a MacBook with an M1/M2 chip and Metal (MPS) acceleration can be used for light experimentation.
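Before kicking off a run, it’s worth confirming what accelerators PyTorch can actually see. The check below is generic PyTorch, not part of NanoChat itself.

```python
# Quick sanity check of available accelerators before training.
# This is plain PyTorch, not a NanoChat script.
import torch

if torch.cuda.is_available():
    n = torch.cuda.device_count()
    print(f"CUDA available with {n} GPU(s):")
    for i in range(n):
        print(f"  [{i}] {torch.cuda.get_device_name(i)}")
elif torch.backends.mps.is_available():
    print("Apple MPS (Metal) backend available - fine for small experiments.")
else:
    print("No GPU backend found; training will be impractically slow on CPU.")
```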
2. Training
Use the speedrun.sh script:
```bash
bash speedrun.sh
```
This runs the $100-tier training in roughly 4 hours, saving checkpoints along the way and generating a performance report (report.md).
3. Chat Interface
Once trained, launch the chat interface:
```bash
python -m scripts.chat_web
```
Access the web app via your server’s public IP. Now you can interact with your model like you would with ChatGPT.
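If you would rather talk to the model from code than from a browser tab, you can also hit the web server over HTTP. The snippet below is purely illustrative: the port, route and payload shape are assumptions made for this sketch, and the real interface is defined in scripts/chat_web.py, so check there before relying on it.

```python
# Hypothetical client for the NanoChat web server.
# The port (8000), route ("/chat") and JSON schema are assumptions for
# illustration only -- consult scripts/chat_web.py for the actual interface.
import requests

SERVER = "http://<your-server-ip>:8000"  # replace with your server's public IP

resp = requests.post(
    f"{SERVER}/chat",  # hypothetical endpoint
    json={"message": "Explain what a tokenizer does in one sentence."},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```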
Performance: Is It Any Good?
While the $100 NanoChat model won’t beat GPT-4, it’s surprisingly competent. It hallucinates, makes funny mistakes and answers like a curious child, but that’s also part of what makes it so charming and instructive.
The report card includes evaluations on:
- HumanEval
- GSM8K
- ARC-Challenge
- MMLU
- ChatCORE
Results show that it outperforms GPT-2 and can handle basic reasoning, arithmetic and code generation.
Fully Yours, Fully Hackable
NanoChat gives you complete control:
- Want to fine-tune on your personal dataset? You can.
- Want to tweak the transformer depth, layers, or tokenizer? Go ahead.
- Want to deploy it as an API or integrate it with your app? No problem.
Unlike proprietary LLMs, you own the weights, code and data. It’s a sandbox for LLM innovation, and wrapping the model behind a small API of your own, as sketched below, is a natural first step.
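Here is a minimal FastAPI wrapper to illustrate the “deploy it as an API” point. The load_model and generate functions are stub placeholders standing in for your own checkpoint-loading and sampling code; they are not NanoChat functions, and the checkpoint path is hypothetical.

```python
# Minimal API-wrapper sketch for serving a trained checkpoint.
# load_model() and generate() are stub placeholders for your own
# checkpoint-loading and sampling code; they are not part of NanoChat.
from fastapi import FastAPI
from pydantic import BaseModel


def load_model(path: str):
    # Placeholder: load your trained weights here (e.g. with torch.load).
    return {"path": path}


def generate(model, prompt: str, max_new_tokens: int = 256) -> str:
    # Placeholder: run your sampling loop here.
    return f"[echo from {model['path']}] {prompt}"


app = FastAPI()
model = load_model("checkpoints/latest.pt")  # hypothetical checkpoint path


class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 256


@app.post("/generate")
def generate_endpoint(req: Prompt):
    return {"completion": generate(model, req.text, req.max_new_tokens)}
```

Save it as server.py, run it with `uvicorn server:app`, and POST JSON to /generate; swapping the two stubs for real checkpoint loading and sampling is the only model-specific work.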
Future Roadmap
Karpathy hints at larger models in the works:
- d26 (~$300 tier) – slightly better than GPT-2
- d40+ (~$1000 tier) – more competitive capabilities
The goal is to push micro models that are:
- Cost-effective
- Cognitively manageable
- Research- and dev-friendly
Conclusion
NanoChat is more than just a toy model – it’s a statement about AI accessibility. It challenges the narrative that high-quality language models require massive funding, secret sauce or enterprise infrastructure. With under $100 and a few hours, you can train your own ChatGPT-like assistant, dig deep into transformer mechanics, and truly understand what makes LLMs tick. In the AI gold rush, NanoChat is your pickaxe and manual.