Make your Jupyter Notebook reproducible – a practical checklist

Jupyter Notebook · Reproducibility · Approx. 9 min read

Most notebooks “work on my machine” once and then break later: the environment changed, data went missing, paths moved, random seeds were never set, or someone else tried to run them. This guide gives you a practical checklist to make your Jupyter Notebook reproducible – for future you, collaborators, or reviewers.

Here, a reproducible notebook means that someone else can clone your project, follow a short README, run the notebook, and get the same key results, up to small differences from any randomness you cannot pin down.

1. Lock down your environment

1.1 Use a dedicated environment

Don’t use “system Python + random pip”. Use a dedicated environment for each project, such as a conda environment or a venv.
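
To confirm that the notebook kernel really runs inside a dedicated environment, a small check can go in the first cell. This is a heuristic sketch, not an official test: the function name `in_isolated_environment` is made up for this example, and it only looks at `sys.prefix` (venv) and the `CONDA_PREFIX` variable (conda, which is also set for conda's base environment).

```python
import os
import sys


def in_isolated_environment() -> bool:
    """Return True when running inside a venv or an activated conda env."""
    # venv/virtualenv: sys.prefix points at the environment, while
    # sys.base_prefix still points at the base interpreter.
    in_venv = sys.prefix != sys.base_prefix
    # conda sets CONDA_PREFIX when an environment is activated
    # (this includes conda's own "base" environment).
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


print("Python executable:", sys.executable)
print("Isolated environment:", in_isolated_environment())
```

Printing `sys.executable` alongside the result makes it easy to spot a kernel that points at a different interpreter than the one you installed packages into.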

1.2 Freeze your dependencies

From the environment you actually used to run the notebook:

pip freeze > requirements.txt

Commit this file with your notebook. In a fresh environment, someone can run:

pip install -r requirements.txt
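
A terminal's `pip` does not always belong to the same environment as the notebook kernel, so it can help to list installed packages from inside the notebook as well. The sketch below uses the standard library's `importlib.metadata`; the helper name `pinned_requirements` is an assumption for this example, and the output only approximates `pip freeze` (editable and VCS installs are not handled).

```python
from importlib import metadata


def pinned_requirements() -> list[str]:
    """Return sorted 'name==version' lines for the kernel's installed packages."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


# Compare this listing with requirements.txt before committing.
for line in pinned_requirements():
    print(line)
```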

2. Fix your data paths

2.1 Avoid hard-coded desktop paths

Instead of:

df = pd.read_csv("C:/Users/you/Desktop/data/train.csv")

Use project-relative paths:

df = pd.read_csv("data/train.csv")
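
Relative paths still depend on where the notebook is started from. One way to make them sturdier is to locate the project root first and build paths from it. This is a minimal sketch: `find_project_root` is a hypothetical helper, and using `requirements.txt` as the marker file is an assumption that matches the layout in this guide.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "requirements.txt") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found in {start} or its parents")


# Typical use in a notebook (commented out so the block runs anywhere):
# ROOT = find_project_root(Path.cwd())
# df = pd.read_csv(ROOT / "data" / "train.csv")
```

With this in place, the notebook behaves the same whether it is launched from the project root or from a subfolder.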

2.2 Print your working directory

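Relative paths resolve against the current working directory, so print it at the top of the notebook to make path problems easy to diagnose: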

import os
print("CWD:", os.getcwd())

3. Control randomness

If you’re training models, set seeds for all the libraries you use:

import random
import numpy as np

SEED = 42
random.seed(SEED)
np.random.seed(SEED)

For frameworks like PyTorch or TensorFlow, set framework-specific seeds too.
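
A single helper can cover the standard library, NumPy, and whichever framework happens to be installed. The function name `seed_everything` is an assumption for this sketch; the framework calls are `torch.manual_seed` and `tf.random.set_seed`. Seeding alone may not make GPU results bit-for-bit identical, so treat this as a baseline rather than a guarantee.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed Python's RNG plus NumPy, PyTorch, and TensorFlow when available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed in this environment
    try:
        import torch
        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    except ImportError:
        pass  # PyTorch not installed
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # sets TensorFlow's global seed
    except ImportError:
        pass  # TensorFlow not installed


# Re-seeding reproduces the same draw from Python's generator.
seed_everything(42)
first_draw = random.random()
seed_everything(42)
assert random.random() == first_draw
```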

4. Keep outputs under control

Reproducible notebooks are easier to diff and version-control if you:
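
One common practice is to clear cell outputs before committing, so that diffs show code changes rather than re-rendered plots and execution counters. The sketch below edits the notebook's JSON directly, assuming the version 4 notebook format; `strip_outputs` and `strip_file` are hypothetical helper names, not part of Jupyter.

```python
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counts from all code cells, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_file(path: str) -> None:
    """Rewrite the .ipynb file at `path` with its outputs removed."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    cleaned = json.dumps(strip_outputs(notebook), indent=1, ensure_ascii=False)
    nb_path.write_text(cleaned + "\n", encoding="utf-8")


# Example (commented out so the block runs without a notebook present):
# strip_file("notebook.ipynb")
```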

5. Version control your notebook and project

Use Git for the whole project, not just for a single notebook:

git init
git add notebook.ipynb data/ requirements.txt
git commit -m "Baseline analysis"

Minimal reproducible project layout:

project-name/
  notebook.ipynb
  data/
    raw/
    processed/
  requirements.txt
  README.md
  capsules/   # NoteCapsule snapshots (optional)

6. Capture a snapshot with NoteCapsule

On top of the basics above, you can create a Capsule – a folder that bundles notebook, dependencies snapshot, data manifest, and metadata – using NoteCapsule:

from notebookcapsule import create_capsule

create_capsule(
    name="baseline_analysis",
    notebook_path="notebook.ipynb",
    data_dirs=["./data"],
    base_dir=".",    # project root
)

This creates a Capsule folder that bundles the notebook, a dependencies snapshot, a data manifest, and metadata.
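
To make the “data manifest” idea concrete, here is a generic sketch that records the size and SHA-256 hash of every file under a data folder. It is not NoteCapsule's implementation or file format: `sha256_of` and `build_manifest` are hypothetical names, and the manifest structure is an assumption chosen for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: str) -> dict:
    """Map each file's path (relative to data_dir) to its size and hash."""
    root = Path(data_dir)
    manifest = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        manifest[path.relative_to(root).as_posix()] = {
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        }
    return manifest


# Example (commented out so the block runs without a data folder):
# print(json.dumps(build_manifest("data"), indent=2))
```

Comparing two such manifests shows whether a dataset changed between runs, even when file names stayed the same.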

7. Reproducibility checklist

  • Environment created in conda/venv, not system Python.
  • requirements.txt (or similar) committed.
  • Project uses relative paths, not desktop paths.
  • Random seeds set for Python/NumPy and ML framework.
  • Notebook runs top-to-bottom without manual edits.
  • Project under Git with at least one clean baseline commit.
  • At least one NoteCapsule Capsule created at a “known good” state.
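
Some of these items can be checked automatically. The sketch below only tests for the presence of files from the layout shown earlier; `check_project` is a hypothetical helper, and it cannot tell whether seeds are set or whether the notebook actually runs top-to-bottom.

```python
from pathlib import Path


def check_project(root: str) -> dict:
    """Return a pass/fail flag for each file-based checklist item."""
    base = Path(root)
    return {
        "requirements.txt present": (base / "requirements.txt").is_file(),
        "README.md present": (base / "README.md").is_file(),
        "data/ directory present": (base / "data").is_dir(),
        "Git repository initialised": (base / ".git").exists(),
        "at least one notebook": any(base.glob("*.ipynb")),
    }


# Print a tick-box report for the current directory.
for item, ok in check_project(".").items():
    print(f"[{'x' if ok else ' '}] {item}")
```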

Want a one-click reproducibility snapshot?

NoteCapsule helps you turn a working Jupyter project into a Capsule with notebook, dependencies, data manifest, and metadata – so you (and others) can rerun it months later without guesswork.
