nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education.
Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8xA100 40GB node in about 4 days of training.
The code itself is plain and readable:
train.py is a ~300-line boilerplate training loop and
model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That’s it.
Because the code is so simple, it is very easy to hack to your needs, train new models from scratch, or finetune pretrained checkpoints (e.g. the biggest one currently available as a starting point is the GPT-2 1.5B model from OpenAI).
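The "boilerplate training loop" that train.py implements has a conventional shape: fetch a batch, run a forward pass to get a loss, backpropagate, and take an optimizer step. A minimal pure-Python sketch of that shape (the names and the toy one-parameter "model" here are illustrative, not nanoGPT's actual API; the real loop operates on PyTorch tensors and a GPT model):

```python
# Toy sketch of the training-loop shape a file like train.py follows:
# sample a batch, forward pass, backward pass, optimizer step.
# Pure-Python stand-in -- the real code uses PyTorch throughout.

def get_batch(step):
    # Hypothetical data loader: returns an (input, target) pair.
    x = float(step % 10)
    return x, 2.0 * x  # target follows y = 2x

def train(steps=200, lr=0.01):
    w = 0.0  # single scalar "parameter" standing in for model weights
    for step in range(steps):
        x, y = get_batch(step)
        loss = (w * x - y) ** 2        # forward pass: squared error
        grad = 2.0 * (w * x - y) * x   # backward pass: d(loss)/dw by hand
        w -= lr * grad                 # optimizer step (plain SGD)
    return w

weight = train()
print(round(weight, 2))  # converges toward the true slope, 2.0
```

In train.py the same skeleton is wrapped with checkpointing, logging, learning-rate scheduling, and (optionally) multi-GPU distributed data parallel, which is what accounts for its ~300 lines.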