Make your Jupyter Notebook reproducible – a practical checklist
Most notebooks “work on my machine” once and then break later: the environment changed, data went missing, paths moved, random seeds were never set, or someone else tried to run them. This guide gives you a practical checklist to make your Jupyter Notebook reproducible – for future you, collaborators, or reviewers.
1. Lock down your environment
1.1 Use a dedicated environment
Don’t use “system Python + random pip”. Use one of:
- a conda env (conda create -n myenv python=3.11), or
- python -m venv .venv + pip install ...
1.2 Freeze your dependencies
From the environment you actually used to run the notebook:
pip freeze > requirements.txt
Commit this file with your notebook. In a fresh environment, someone can run:
pip install -r requirements.txt
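Alongside requirements.txt, it can help to record the interpreter and package versions from inside the notebook itself, so the output travels with the analysis. The following is a minimal sketch using only the standard library; environment_snapshot is a hypothetical helper name, not part of pip or Jupyter.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Return Python, platform, and installed package versions as a dict."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = environment_snapshot()
print(json.dumps({k: snapshot[k] for k in ("python", "platform")}, indent=2))
print(len(snapshot["packages"]), "packages installed")
```

Running this in the first cell gives readers a quick way to compare their setup with yours before they debug anything else.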
2. Fix your data paths
2.1 Avoid hard-coded desktop paths
Instead of:
df = pd.read_csv("C:/Users/you/Desktop/data/train.csv")
Use project-relative paths:
df = pd.read_csv("data/train.csv")
2.2 Print your working directory
Relative paths resolve against the kernel’s working directory, so print it at the top of the notebook:
import os
print("CWD:", os.getcwd())
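Relative paths still depend on where the kernel was started. One way around that is to locate the project root explicitly and build paths from it. The sketch below assumes the root is the nearest parent directory containing requirements.txt; find_project_root and the marker choice are illustrative assumptions, so adjust them to your layout.

```python
from pathlib import Path


def find_project_root(marker="requirements.txt", start=None):
    """Walk upward from `start` (default: CWD) to the first directory holding `marker`."""
    current = Path(start or Path.cwd()).resolve()
    for candidate in (current, *current.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {current}")


# Typical use in a notebook (left as comments so the sketch runs anywhere):
# PROJECT_ROOT = find_project_root()
# df = pd.read_csv(PROJECT_ROOT / "data" / "train.csv")
```

With this, the notebook behaves the same whether it is opened from the project root or from a subfolder.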
3. Control randomness
If you’re training models, set seeds for all the libraries you use:
import random
import numpy as np

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
For frameworks like PyTorch or TensorFlow, set framework-specific seeds too.
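A single helper keeps the seeding in one place. The sketch below seeds Python’s random module and then NumPy, PyTorch, and TensorFlow only if they are installed; seed_everything is a hypothetical name. Seeding alone may not give bit-for-bit identical results on every platform or GPU, so treat it as a baseline rather than a guarantee.

```python
import random


def seed_everything(seed=42):
    """Seed Python and any installed numeric/ML libraries; return what was seeded."""
    seeded = ["random"]
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy RNG
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        seeded.append("torch")
    except ImportError:
        pass
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
        seeded.append("tensorflow")
    except ImportError:
        pass
    return seeded


print("Seeded:", seed_everything(42))
```

Call it once near the top of the notebook, before any data splitting or model initialization.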
4. Keep outputs under control
Reproducible notebooks are easier to diff and version-control if you:
- Restart kernel + run all cells before final commit.
- Avoid printing thousands of lines.
- Clear large debug outputs before sharing.
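Clearing outputs can also be scripted. A notebook file is JSON, and in the version 4 format each code cell carries an outputs list and an execution_count. The sketch below blanks both using only the standard library; strip_outputs and strip_outputs_file are hypothetical helper names, and the code assumes an nbformat 4 notebook.

```python
import json


def strip_outputs(notebook):
    """Clear outputs and execution counts from a parsed nbformat-4 notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_outputs_file(src, dst):
    """Read the notebook at `src`, strip its outputs, and write the result to `dst`."""
    with open(src, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(strip_outputs(notebook), f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Writing to a separate dst keeps the executed copy intact in case you still need its outputs.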
5. Version control your notebook and project
Use Git for the whole project, not just for a single notebook:
git init
git add notebook.ipynb data/ requirements.txt
git commit -m "Baseline analysis"
A typical project layout:
project-name/
  notebook.ipynb
  data/
    raw/
    processed/
  requirements.txt
  README.md
  capsules/ # NoteCapsule snapshots (optional)
6. Capture a snapshot with NoteCapsule
On top of the basics above, you can create a Capsule – a folder that bundles notebook, dependencies snapshot, data manifest, and metadata – using NoteCapsule:
from notebookcapsule import create_capsule
create_capsule(
    name="baseline_analysis",
    notebook_path="notebook.ipynb",
    data_dirs=["./data"],
    base_dir=".", # project root
)
This creates:
- notebook.ipynb – your notebook copy
- requirements_suggested.txt – frozen environment snapshot
- data_manifest.json – list of data files
- capsule_meta.json – environment + paths metadata
- README_template.md – instructions for others
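To illustrate what a data manifest can capture, here is a minimal sketch that lists every file under a data directory with its size and SHA-256 digest. It is a generic illustration and not NoteCapsule’s actual implementation or file format; build_manifest and the field names are assumptions.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir):
    """List every file under data_dir with its relative path, size, and SHA-256."""
    root = Path(data_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": digest.hexdigest(),
        })
    return entries
```

Calling build_manifest("data") and saving the result as JSON lets a later run detect silently changed or missing input files by comparing digests.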
7. Reproducibility checklist
- Environment created in conda/venv, not system Python.
- requirements.txt (or similar) committed.
- Project uses relative paths, not desktop paths.
- Random seeds set for Python/NumPy and ML framework.
- Notebook runs top-to-bottom without manual edits.
- Project under Git with at least one clean baseline commit.
- At least one NoteCapsule Capsule created at a “known good” state.
Want a one-click reproducibility snapshot?
NoteCapsule helps you turn a working Jupyter project into a Capsule with notebook, dependencies, data manifest, and metadata – so you (and others) can rerun it months later without guesswork.
Join NoteCapsule early access
Drop your email on the homepage and we’ll send you a quickstart guide plus example Capsules you can copy.