Make your Jupyter Notebook reproducible – a practical checklist


Most notebooks “work on my machine” once and then break later: the environment changes, data goes missing, paths move, random seeds were never set, or someone else tries to run them. This guide is a practical checklist for making your Jupyter Notebook reproducible for future you, collaborators, and reviewers.

A reproducible notebook means that someone else can clone your project, follow a short README, run the notebook, and get the same key results, up to the small run-to-run variation that randomness introduces.

1. Lock down your environment

1.1 Use a dedicated environment

Don’t install packages ad hoc into the system Python. Create one environment per project, using either:

a conda env (conda create -n myenv python=3.11), or
a virtual environment (python -m venv .venv) followed by pip install ...

1.2 Freeze your dependencies

From the environment you actually used to run the notebook:

pip freeze > requirements.txt

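Commit this file with your notebook. (In a conda environment, pip freeze can write local file:// references for conda-installed packages; conda env export > environment.yml is the conda-native equivalent.) In a fresh environment, someone can then run: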

pip install -r requirements.txt
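
If you also want a record of what was installed from inside the notebook itself, the standard library can list it. The following is a minimal sketch, not a replacement for pip freeze; the function name environment_snapshot and the keys of the returned dictionary are illustrative choices.

```python
import platform
from importlib import metadata


def environment_snapshot():
    """Return the Python version, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name:
            packages[name] = version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # Sort case-insensitively so the snapshot diffs cleanly between runs.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }
```

Printing the result in the last cell (for example with json.dumps(environment_snapshot(), indent=2)) leaves a version record next to the results it produced.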

2. Fix your data paths

2.1 Avoid hard-coded desktop paths

Instead of:

df = pd.read_csv("C:/Users/you/Desktop/data/train.csv")

Use project-relative paths:

df = pd.read_csv("data/train.csv")

2.2 Print your working directory

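Relative paths resolve against the current working directory, so print it at the top of the notebook to catch a kernel that was started from the wrong folder: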

import os
print("CWD:", os.getcwd())
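
One way to make relative paths independent of where the kernel was started is to locate the project root first and build every path from it. This is a sketch under one assumption: the project root is the nearest parent folder that contains requirements.txt. The helper name find_project_root and the marker argument are hypothetical.

```python
from pathlib import Path


def find_project_root(start=None, marker="requirements.txt"):
    """Walk upward from `start` until a folder containing `marker` is found."""
    start = Path(start or Path.cwd()).resolve()
    for folder in (start, *start.parents):
        if (folder / marker).exists():
            return folder
    raise FileNotFoundError(f"No {marker!r} found in {start} or its parents")


# Usage in the first notebook cell (assumes the layout used in this guide):
# PROJECT_ROOT = find_project_root()
# df = pd.read_csv(PROJECT_ROOT / "data" / "train.csv")
```

With this in place the notebook reads the same file whether it is opened from the project root or from a subfolder.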

3. Control randomness

If you’re training models, set seeds for all the libraries you use:

import random
import numpy as np

SEED = 42
random.seed(SEED)
np.random.seed(SEED)

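For frameworks like PyTorch or TensorFlow, set the framework-specific seed too, for example torch.manual_seed(SEED) or tf.random.set_seed(SEED). Some GPU operations can remain nondeterministic even with seeds set, so expect small differences between runs.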
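
The seeding calls can be gathered into one helper that you call in the first cell. The sketch below seeds whichever of these libraries happen to be installed and reports what it seeded; the name seed_everything is an illustrative choice, not part of any library.

```python
import random


def seed_everything(seed=42):
    """Seed Python's RNG, plus NumPy, PyTorch, and TensorFlow when installed."""
    seeded = ["random"]
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # seeds NumPy's legacy global generator
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        seeded.append("torch")
    except ImportError:
        pass
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
        seeded.append("tensorflow")
    except ImportError:
        pass
    return seeded


# Reseeding with the same value should reproduce the same draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
assert first == second
```

Note that code using np.random.default_rng() has its own generator and needs the seed passed to it directly.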

4. Keep outputs under control

Notebooks are easier to diff, review, and rerun if you:

Restart the kernel and run all cells before the final commit.
Avoid printing thousands of lines.
Clear large debug outputs before sharing.
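
If you prefer to commit notebooks with no outputs at all, they can be stripped with a few lines of standard-library Python, since an .ipynb file is JSON. This is a sketch that assumes the nbformat 4 layout (a top-level "cells" list); the function name clear_outputs is hypothetical. The command jupyter nbconvert --clear-output --inplace notebook.ipynb does the same job from the terminal.

```python
import json
from pathlib import Path


def clear_outputs(notebook_path):
    """Remove outputs and execution counts from every code cell, in place.

    Returns the number of code cells that had outputs before clearing.
    """
    path = Path(notebook_path)
    nb = json.loads(path.read_text(encoding="utf-8"))
    cleared = 0
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            if cell.get("outputs"):
                cleared += 1
            cell["outputs"] = []
            cell["execution_count"] = None
    # Jupyter itself writes notebooks with a one-space indent.
    text = json.dumps(nb, indent=1, ensure_ascii=False) + "\n"
    path.write_text(text, encoding="utf-8")
    return cleared
```

Run it on a copy first if the outputs are the only record of a long computation.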

5. Version control your notebook and project

Use Git for the whole project, not just for a single notebook:

git init
git add notebook.ipynb data/ requirements.txt
git commit -m "Baseline analysis"

Minimal reproducible project layout:

project-name/
  notebook.ipynb
  data/
    raw/
    processed/
  requirements.txt
  README.md
  capsules/   # NoteCapsule snapshots (optional)

6. Capture a snapshot with NoteCapsule

On top of the basics above, you can create a Capsule with NoteCapsule: a folder that bundles the notebook, a dependency snapshot, a data manifest, and metadata.

from notebookcapsule import create_capsule

create_capsule(
    name="baseline_analysis",
    notebook_path="notebook.ipynb",
    data_dirs=["./data"],
    base_dir=".",    # project root
)

This creates:

notebook.ipynb – your notebook copy
requirements_suggested.txt – frozen environment snapshot
data_manifest.json – list of data files
capsule_meta.json – environment + paths metadata
README_template.md – instructions for others
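
To make the idea of a data manifest concrete, here is how one could be built by hand with the standard library. This is only an illustration of the concept: the field names (path, bytes, sha256) and the function build_manifest are assumptions for this sketch and do not describe the actual contents of NoteCapsule's data_manifest.json.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir):
    """List every file under `data_dir` with its size and SHA-256 digest."""
    root = Path(data_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large data files do not fill memory.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": digest.hexdigest(),
        })
    return entries
```

Comparing two manifests taken months apart shows whether a data file was changed, renamed, or removed, which is often the first thing to check when results stop matching.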

7. Reproducibility checklist

Environment created in conda/venv, not system Python.
requirements.txt (or similar) committed.
Project uses relative paths, not desktop paths.
Random seeds set for Python, NumPy, and your ML framework.
Notebook runs top-to-bottom without manual edits.
Project under Git with at least one clean baseline commit.
At least one NoteCapsule Capsule created at a “known good” state.
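
The file-level items on this checklist can be verified automatically. The sketch below assumes the project layout shown in section 5; the names REQUIRED and check_project are hypothetical. It covers only what can be checked by looking at the folder, so seeds and a clean top-to-bottom run still need an actual execution.

```python
from pathlib import Path

# Files and folders the layout in this guide expects at the project root.
REQUIRED = ["requirements.txt", "README.md", "data", ".git"]


def check_project(project_root="."):
    """Return a list of problems found; an empty list means all checks passed."""
    root = Path(project_root)
    problems = [f"missing {name}" for name in REQUIRED if not (root / name).exists()]
    if not any(root.glob("*.ipynb")):
        problems.append("no notebook (*.ipynb) in the project root")
    return problems


# Usage from the project root:
# for problem in check_project():
#     print("FAIL:", problem)
```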

Want a one-click reproducibility snapshot?

NoteCapsule helps you turn a working Jupyter project into a Capsule with notebook, dependencies, data manifest, and metadata – so you (and others) can rerun it months later without guesswork.
