NanoChat: The Best ChatGPT That $100 Can Buy

In a world dominated by billion-dollar AI models like GPT-4 and Claude 3, it’s refreshing to see a minimalist, open-source alternative that puts the power of Large Language Models (LLMs) back into the hands of hackers, researchers and enthusiasts. Enter NanoChat – an end-to-end, full-stack implementation of a ChatGPT-style AI chatbot developed by Andrej Karpathy, renowned AI researcher and founding member of OpenAI.

NanoChat is exactly what its tagline promises: “The best ChatGPT that $100 can buy.” It offers a completely open, hackable and educational framework to train, evaluate, fine-tune and deploy your own chatbot – all from a single script and on modest hardware.

What is NanoChat?

NanoChat is an open-source, fully-trainable ChatGPT clone that aims to demystify large-scale AI development by keeping the codebase clean, minimal and accessible. It’s not just a chatbot – it’s a complete learning journey into how LLMs work under the hood. From tokenization to pretraining, fine-tuning, inference and web-based chat UI, NanoChat is built to take you from zero to your own mini-GPT, hosted and trained by you.

The flagship model – known as “d32” – boasts:

  • 32 transformer layers
  • 1.9 billion parameters
  • Trained on 38 billion tokens
  • Built for $800 in compute cost on 8x H100 GPUs

But the $100-tier model (the one the project is best known for) can be trained in just ~4 hours on a multi-GPU cloud node from a provider such as Lambda Labs.

Why NanoChat is a Game Changer

Full-Stack Simplicity

NanoChat is a rare example of an AI project that:

  • Doesn’t depend on massive frameworks
  • Avoids complex configuration overhead
  • Can be understood in a single sitting

Every part of the LLM lifecycle – data ingestion, model training, serving and chat UI – is consolidated into a few scripts. If you’ve ever wanted to build your own ChatGPT-style app from scratch, this is one of the cleanest codebases available.

Affordable AI Training

Unlike most AI projects that require huge cloud budgets, NanoChat proves you can:

  • Train a working LLM chatbot for under $100
  • Achieve decent performance in the territory of GPT-2 (2019 era)
  • Host the model and chat with it in a browser

Karpathy even provides a script called speedrun.sh that runs the entire training and evaluation pipeline in one go.

Educational and Transparent

NanoChat is also intended as the capstone project of the upcoming LLM101n course by Eureka Labs – a project-based learning track for aspiring AI engineers.

Whether you’re a student, indie researcher, or curious engineer, NanoChat offers an unmatched hands-on experience for LLM internals, prompt engineering, transformer architecture and chatbot development.
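
To make “transformer architecture” concrete, here is a minimal, self-contained PyTorch sketch of a decoder-style transformer block – illustrative only, not NanoChat’s actual code:

import torch
import torch.nn as nn

class Block(nn.Module):
    """One pre-norm decoder block: causal self-attention followed by an MLP."""
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        # Causal mask: True entries mark positions a token is NOT allowed to attend to.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                      # residual connection around attention
        x = x + self.mlp(self.ln2(x))         # residual connection around the MLP
        return x

x = torch.randn(2, 16, 768)                   # (batch, sequence, embedding)
print(Block()(x).shape)                       # torch.Size([2, 16, 768])

A model like d32 is essentially 32 such blocks stacked between a token embedding and an output head – which is exactly the kind of code you end up reading and modifying in a project like this.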

How It Works

1. Setup

You’ll need a machine with at least one GPU (ideally multiple H100s or A100s), although a MacBook M1/M2 with Metal acceleration (MPS) can be used for light experimentation.
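
Before paying for a long run, it’s worth a quick sanity check (this assumes PyTorch is already installed) that your machine actually exposes an accelerator:

import torch

if torch.cuda.is_available():
    # List every visible CUDA device.
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
elif torch.backends.mps.is_available():
    print("Apple Metal (MPS) available – fine for small-scale experimentation")
else:
    print("CPU only – training will be impractically slow")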

2. Training

Use the speedrun.sh script:

bash speedrun.sh

This trains the $100-tier model in about 4 hours on an 8x H100 node, saving checkpoints and generating a performance report (report.md); the larger 1.9B-parameter d32 model described above is the longer, ~$800 run.
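
If you’d rather drive the run from Python than from a bare terminal (for example, to keep a persistent log), a minimal sketch like this works – speedrun.sh is the script from the repo, everything else here is just one convenient wrapper:

import subprocess

# Launch the training script and mirror its output to the console and a log file.
with open("speedrun.log", "w") as log:
    proc = subprocess.Popen(
        ["bash", "speedrun.sh"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    for line in proc.stdout:
        print(line, end="")   # live progress
        log.write(line)       # copy for later inspection
    proc.wait()

print("exit code:", proc.returncode)

Running the same command inside screen or tmux achieves the same effect and survives SSH disconnects.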

3. Chat Interface

Once trained, launch the chat interface:

python -m scripts.chat_web

Access the web app via your server’s public IP. Now you can interact with your model like you would with ChatGPT.
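
And if you later want to script against the running server instead of using the browser, a client looks roughly like the sketch below. Note that the host, port, route and payload shape here are assumptions for illustration, not NanoChat’s documented API – inspect the scripts.chat_web module for the real endpoints:

import requests

SERVER = "http://YOUR_SERVER_IP:8000"    # hypothetical host and port
resp = requests.post(
    f"{SERVER}/chat",                    # hypothetical route – check scripts.chat_web
    json={"messages": [{"role": "user", "content": "Hello! Who are you?"}]},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())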

Performance: Is It Any Good?

While the $100 NanoChat model won’t beat GPT-4, it’s surprisingly competent. It hallucinates, makes funny mistakes and answers like a curious child, but that’s also what makes it so charming and instructive.

The report card includes evaluations on:

  • HumanEval
  • GSM8K
  • ARC-Challenge
  • MMLU
  • ChatCORE

The results place the model in GPT-2 territory – with the larger tiers surpassing it – and show it handling basic reasoning, arithmetic and code generation.

Fully Yours, Fully Hackable

NanoChat gives you complete control:

  • Want to fine-tune on your personal dataset? You can.
  • Want to tweak the transformer’s depth, width, or tokenizer? Go ahead.
  • Want to deploy it as an API or integrate it with your app? No problem.

Unlike proprietary LLMs, you own the weights, code and data. It’s a sandbox for LLM innovation.
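
Because the checkpoints are ordinary PyTorch artifacts sitting on your own disk, you can poke at them directly. A minimal sketch (the checkpoint path here is hypothetical – substitute whatever your run actually produced):

import torch

# Hypothetical path – point this at a checkpoint written by your own run.
state = torch.load("checkpoints/model_final.pt", map_location="cpu")
state_dict = state.get("model", state) if isinstance(state, dict) else state

# Count parameters and peek at the first few tensors.
n_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"{n_params / 1e6:.1f}M parameters")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))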

Future Roadmap

Karpathy hints at larger models in the works:

  • d26 (~$300 tier) – slightly better than GPT-2
  • d40+ (~$1000 tier) – more competitive capabilities

The goal is to push micro models that are:

  • Cost-effective
  • Cognitively manageable
  • Research- and dev-friendly

Conclusion

NanoChat is more than just a toy model – it’s a statement about AI accessibility. It challenges the narrative that high-quality language models require massive funding, secret sauce or enterprise infrastructure. With under $100 and a few hours, you can train your own ChatGPT-like assistant, dig deep into transformer mechanics, and truly understand what makes LLMs tick. In the AI gold rush, NanoChat is your pickaxe and manual.

Follow us for cutting-edge updates in AI & explore the world of LLMs, deep learning, NLP and AI agents with us.
