Version control Jupyter Notebooks with Git – practical workflows
Everyone says “use Git”, but Jupyter Notebooks are big JSON blobs – diffs are messy and merge conflicts are scary. In this guide, we’ll cover practical ways to version-control your notebooks so you can track changes, roll back, and collaborate without fear.
1. Start with a Git repo for the whole project
From your project root:
git init
git add notebook.ipynb data/ requirements.txt
git commit -m "Initial commit: baseline analysis"
Even this basic setup already gives you:
- a timeline of changes,
- the ability to undo bad experiments,
- a clean way to share your work (GitHub, GitLab, etc.).
2. Keep notebook diffs manageable
2.1 Clear huge outputs before committing (or use tools)
Large image outputs and logs make diffs noisy. A simple habit:
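- Restart the kernel and run all cells just before the final commit, so the saved outputs come from one clean top-to-bottom run, then clear any large outputs you don't need, or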
- Use tools like nbstripout or jupytext to keep repos cleaner.
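To make "clearing outputs" concrete, here is a minimal sketch that removes outputs from a notebook using only the standard library. It assumes the notebook is in the nbformat 4 JSON layout (a top-level "cells" list where code cells carry "outputs" and "execution_count"). The function names strip_outputs and strip_file are made up for this example; this is not how nbstripout is implemented, just an illustration of the idea.

```python
import copy
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat 4 notebook dict with code-cell outputs removed."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's dict untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop images, logs, tracebacks
            cell["execution_count"] = None  # drop counters that change on every run
    return cleaned


def strip_file(path: str) -> None:
    """Rewrite the notebook at `path` in place, without outputs."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    nb_path.write_text(
        json.dumps(strip_outputs(notebook), indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling strip_file("notebook.ipynb") before git add would leave only cell sources and metadata in the diff. In a real project a maintained tool is probably the safer choice, since it also handles details this sketch ignores.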
2.2 Convert to a .py script alongside the notebook
One common pattern:
jupyter nbconvert --to=script notebook.ipynb
Now you can version-control the .py file (more readable diffs) alongside the original notebook.
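To see why the script form diffs more cleanly, it can help to look at what such a conversion roughly does. The sketch below is a simplified stand-in, not nbconvert's actual implementation: it assumes nbformat 4 JSON, keeps only code cells, and uses hypothetical helper names (cell_source, notebook_to_script, export) and a made-up cell marker comment.

```python
import json
from pathlib import Path


def cell_source(cell: dict) -> str:
    """nbformat stores source as either one string or a list of lines."""
    source = cell.get("source", "")
    return source if isinstance(source, str) else "".join(source)


def notebook_to_script(notebook: dict) -> str:
    """Concatenate code cells into one script, marking each cell boundary."""
    chunks = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are skipped in this sketch
        chunks.append(f"# --- cell {index} ---\n{cell_source(cell).rstrip()}\n")
    return "\n".join(chunks)


def export(path: str) -> Path:
    """Write the script next to the notebook, e.g. notebook.ipynb -> notebook.py."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    script_path = nb_path.with_suffix(".py")
    script_path.write_text(notebook_to_script(notebook), encoding="utf-8")
    return script_path
```

The resulting file contains plain source lines only, so a one-line change in a cell shows up as a one-line diff instead of a change buried in JSON.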
3. Use branches for experiments
Instead of copying notebooks as notebook_v2.ipynb, use branches:
git checkout -b try-new-model
# edit notebook.ipynb
git commit -am "Try deeper CNN with batch norm"
If the experiment fails, you can always go back to main.
4. Tie in NoteCapsule for project-level snapshots
Git is great for source history, but it doesn’t capture your environment and data layout. That’s where a Capsule helps: it bundles notebook + dependencies snapshot + data manifest + metadata.
from notebookcapsule import create_capsule

create_capsule(
    name="after-model-tuning",
    notebook_path="notebook.ipynb",
    data_dirs=["./data"],
    base_dir=".",
)
Combine the two:
- Git → timeline of code & notebook edits.
- NoteCapsule → checkpoints of “known-good, runnable” project states.
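If "data manifest" sounds abstract, the general idea can be sketched in a few lines: record each file's relative path, size, and hash, so you can later check that the data matches what an experiment used. This is an illustration of the concept only, not NoteCapsule's format or implementation; build_manifest, file_sha256, and the "bytes"/"sha256" keys are names chosen for this example.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: str) -> dict:
    """Map each file's path (relative to data_dir) to its size and SHA-256."""
    root = Path(data_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = {
                "bytes": path.stat().st_size,
                "sha256": file_sha256(path),
            }
    return manifest


if __name__ == "__main__":
    # Small enough to commit to Git even when the data itself is not.
    print(json.dumps(build_manifest("./data"), indent=2))
```

A manifest like this is just text, so it diffs well in Git and makes it obvious when a data file was silently replaced.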
5. Minimal .gitignore for notebooks
Example .gitignore for a small ML project:
__pycache__/
*.pyc
*.pyo
*.pyd
.env
.venv
.ipynb_checkpoints/
data/raw/
data/interim/
capsules/**/*.zip
Keep big raw data and huge artifacts out of Git. Put them in data/ and track them via manifests or external storage instead.
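One way to catch oversized files before they reach a commit is a quick size scan of the project folder. The sketch below is an assumption-laden helper, not a Git feature: find_large_files and the 5 MB default threshold are invented for this example, and it does not read your .gitignore, so already-ignored files will still be listed.

```python
from pathlib import Path


def find_large_files(root: str, limit_bytes: int = 5 * 1024 * 1024) -> list:
    """Return (relative_path, size) pairs for files above limit_bytes, largest first."""
    root_path = Path(root)
    hits = []
    for path in root_path.rglob("*"):
        relative = path.relative_to(root_path)
        if ".git" in relative.parts:
            continue  # skip Git's own object store
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                hits.append((relative.as_posix(), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, size in find_large_files("."):
        print(f"{size / 1_048_576:8.1f} MB  {name}")
```

Anything this prints is a candidate for .gitignore or external storage rather than a commit.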
6. Workflow examples
Solo student / researcher
- Use one repo per project.
- Commit whenever you complete a logical chunk of work.
- Create a Capsule at major milestones (baseline model, best model, final report).
Small ML team
- Use feature branches for experiments.
- Use pull requests and code reviews for notebooks.
- Require a Capsule for any experiment that others need to reproduce.
Want Git history plus reproducible snapshots?
NoteCapsule complements Git by packaging your notebook, dependencies, and data layout into Capsules – so your repo tells the story, and Capsules guarantee things actually run.
Join NoteCapsule early access
Sign up from the homepage and we’ll send you example projects showing Git + Capsule workflows.