CarperAI is releasing a series of diff models: models trained on millions of commits scraped from GitHub to predict a code diff. We are releasing three models of different sizes, all fine-tuned from Salesforce’s CodeGen code synthesis models.

The dataset of diffs we scraped to train these models will be released separately in the near future. We hope these models will be useful for suggesting intelligent changes to existing code, controllable through a specific commit message describing the change. We will continue to iterate on our diff models, so stay tuned for further releases.
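To make the idea concrete, here is a minimal sketch of the kind of example a diff model works with: a file before the change, a commit message describing the change, and the resulting diff. It uses only Python's standard `difflib` module. The file name, the function `make_diff`, and the dictionary layout are hypothetical and chosen for illustration; they are not the prompt format or dataset schema of the released models, which this post does not specify.

```python
import difflib


def make_diff(before: str, after: str, filename: str) -> str:
    """Return a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(lines)


# Hypothetical commit: the message describes the change, the diff encodes it.
before = "def add(a, b):\n    return a - b\n"
after = "def add(a, b):\n    return a + b\n"
commit_message = "Fix sign error in add()"

# Illustrative record layout only (an assumption, not the real dataset schema).
example = {
    "message": commit_message,
    "before": before,
    "diff": make_diff(before, after, "math_utils.py"),
}

print(example["diff"])
```

In this framing, a diff model would be given the existing code and the commit message and asked to produce the diff, so changing the message is what steers the suggested edit.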

