ML project folder structure for Jupyter & Google Colab


Most ML projects start as a single notebook and end as a mess of final2.ipynb, random CSVs, and checkpoint files scattered across Colab or your laptop. A simple, consistent project structure makes debugging, collaboration, and reproducibility easier.

1. A simple, reusable folder layout

my-ml-project/
  notebooks/
    exploration.ipynb
    training.ipynb
  data/
    raw/
    processed/
  models/
  reports/
    figures/
  capsules/
  requirements.txt
  README.md

This layout works for both local Jupyter and Google Colab (with Drive mounted).
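If you would rather not create these folders by hand, the layout can be scaffolded with a few lines of standard-library Python. This is a minimal sketch: the function name scaffold_project and the SUBDIRS list are illustrative choices, not part of any tool mentioned here.

```python
from pathlib import Path

# Folder names mirror the layout shown above; adjust them to taste.
SUBDIRS = [
    "notebooks",
    "data/raw",
    "data/processed",
    "models",
    "reports/figures",
    "capsules",
]


def scaffold_project(root):
    """Create the project skeleton under `root`. Safe to run more than once."""
    root = Path(root)
    for sub in SUBDIRS:
        # parents=True also creates `root` and intermediate folders.
        (root / sub).mkdir(parents=True, exist_ok=True)
    # touch() creates missing files and leaves existing content untouched.
    for name in ("requirements.txt", "README.md"):
        (root / name).touch(exist_ok=True)
    return root
```

Calling scaffold_project("my-ml-project") from a terminal session or a notebook cell should produce the tree above, minus the example notebooks.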

2. Keep notebooks in notebooks/, not project root

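Benefits: the project root stays uncluttered, and every notebook reaches data and models through the same relative paths, such as ../data and ../models.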

3. Make data folders meaningful

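Split data by purpose: data/raw/ holds the original, unmodified inputs, and data/processed/ holds the cleaned or transformed files that notebooks actually load.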

In notebooks, always refer to data via relative paths:

DATA_DIR = "../data"
train_path = f"{DATA_DIR}/processed/train.csv"
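A relative path like ../data only resolves correctly while the notebook's working directory is notebooks/. As a hedged alternative, the sketch below walks upward from the current directory until it finds a marker file. It assumes requirements.txt sits in the project root, as in the layout above; find_project_root and its marker parameter are hypothetical names introduced for illustration.

```python
from pathlib import Path


def find_project_root(start=None, marker="requirements.txt"):
    """Walk upward from `start` until a folder containing `marker` is found."""
    here = Path(start or Path.cwd()).resolve()
    # Check the starting folder first, then each parent in turn.
    for candidate in (here, *here.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {here}")
```

With this in place, find_project_root() / "data" / "processed" / "train.csv" points at the same file whether the notebook was started from notebooks/ or from the project root.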

4. Keep models and outputs separate

Save checkpoints and trained models under models/:

models/
  baseline_logreg.joblib
  cnn_epoch10.pt
  best_model.h5
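To keep file names like the ones above consistent, a small helper can build checkpoint paths for you. This is an illustrative sketch only: checkpoint_path and its parameters are hypothetical, and the naming scheme simply copies the examples listed.

```python
from pathlib import Path


def checkpoint_path(models_dir, name, epoch=None, ext=".pt"):
    """Return a path such as models/cnn_epoch10.pt, creating models/ if needed."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    # Append the epoch only when one is given, e.g. "cnn" -> "cnn_epoch10".
    stem = name if epoch is None else f"{name}_epoch{epoch}"
    return models_dir / f"{stem}{ext}"
```

The returned path can then be handed to whatever your framework uses to save a model, for example torch.save or joblib.dump.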

Similarly, store plots and exported results under reports/:

reports/
  figures/
    roc_curve.png
    confusion_matrix.png
  metrics_summary.csv
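One possible way to produce metrics_summary.csv uses only the standard library. The column names (model, metric, value) and the function name write_metrics_summary are assumptions made for this sketch; use whatever schema suits your project.

```python
import csv
from pathlib import Path


def write_metrics_summary(reports_dir, rows):
    """Write rows like {"model": ..., "metric": ..., "value": ...} to a CSV file."""
    reports_dir = Path(reports_dir)
    # Create reports/ and reports/figures/ so plots have a home as well.
    (reports_dir / "figures").mkdir(parents=True, exist_ok=True)
    out_path = reports_dir / "metrics_summary.csv"
    # newline="" lets the csv module control line endings itself.
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["model", "metric", "value"])
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Each call overwrites the previous summary, so pass in the full list of rows you want to keep.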

5. Using this structure in Google Colab

Mount Drive and point Colab to your project folder:

from google.colab import drive
drive.mount('/content/drive')

PROJECT_ROOT = "/content/drive/MyDrive/projects/my-ml-project"
%cd $PROJECT_ROOT/notebooks

From here, relative paths like ../data/processed/train.csv will work reliably, and your whole project lives in one Drive folder.
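If the same notebook has to run both locally and in Colab, the project root can be chosen at runtime. The sketch below assumes that being able to import google.colab means the code is running inside Colab; running_in_colab and project_root are hypothetical helper names, and Drive still needs to be mounted as shown above before the Colab path exists.

```python
from pathlib import Path


def running_in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (normally only present inside Colab)
        return True
    except ImportError:
        return False


def project_root(colab_root, local_root, in_colab=None):
    """Pick the Drive path in Colab and the local path everywhere else."""
    if in_colab is None:
        # Auto-detect unless the caller forces a value (handy for testing).
        in_colab = running_in_colab()
    return Path(colab_root if in_colab else local_root)
```

For example, project_root("/content/drive/MyDrive/projects/my-ml-project", "..") returns the Drive folder in Colab and the parent of notebooks/ on a laptop.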

6. Add NoteCapsule Capsules on top

Once your layout is in place, a Capsule becomes an easy way to preserve “snapshots” of the whole project at important milestones:

from notebookcapsule import create_capsule

create_capsule(
    name="after-eda-and-baseline",
    notebook_path="training.ipynb",
    data_dirs=["../data", "../models"],
    base_dir="..",   # project root
)

Each Capsule lives under capsules/ with its own metadata and manifests.

Want your ML project structure to survive deadlines?

NoteCapsule works best with a clean project layout. Once you have notebooks/, data/, and models/ in place, a single function call can capture a reproducible Capsule for safekeeping.


We’ll share templates and example repos using this structure in both Jupyter and Colab.