Read Write Compute

Anaconda is the recommended environment for following along with the book's instructions. However, I manage my native Python environment with uv, and don't want to complicate things with another set of Python binaries in my $PATH.

So, my goal is to get Anaconda (or a complete Python environment) into a container that I build and manage with Podman, accessible via a Jupyter notebook server running inside the container, with its ports exposed as host ports.

Installing Podman

Anaconda does provide Docker images under the continuumio organization. There is a short note about how to use the miniconda3 image; the much larger anaconda3 image works the same way.
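As a quick sanity check, the image can be pulled and run directly from the CLI. A hedged sketch, assuming Podman is already installed and using the image name above (the docker.io prefix is needed because Podman does not default to Docker Hub):

```shell
# Pull the smaller miniconda3 image and confirm conda works inside it
podman pull docker.io/continuumio/miniconda3
podman run --rm docker.io/continuumio/miniconda3 conda --version
```

The same commands work for continuumio/anaconda3, just with a much longer pull.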

I use Podman Desktop so I have a clear indicator in my status bar showing whether Podman and its Linux VMs are running. As I am working on a laptop, I will mostly be on AC power when I run machine learning experiments, and I want to know whether Podman is running when I need to be on the move and on battery -- containers cut the laptop's battery runtime roughly in half even when idle.

Installing Podman Desktop takes just one line in PowerShell. It's the setup afterwards that's going to be a headache.

When I run Podman Desktop for the first time, it prompts that the Podman CLI (the actual binaries doing the work) needs to be installed. By default it installs the version that uses WSLv2 as the virtualization layer, but WSLv2 caused problems for me last time, so this time I am using the Hyper-V version.

Now Podman Desktop notifies me that Podman tools are installed successfully.

PS C:\Users\user\pim> podman --version
podman version 5.7.1

Setting up Podman and anaconda3

If Podman Desktop did not set up and start the default VM, we need to go to Settings → Resources to create a new Podman Machine. By default, the Podman Machine claims all the CPU cores and memory the host has. Deleting and re-creating the machine seems to make Podman pick more reasonable defaults; there does not seem to be a GUI where I can customize these settings.

Next, as the anaconda3 image is hosted on Docker Hub, I have to configure authentication in the Settings → Registries section. Just enter the usual username and password.
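The same registry login can also be done from the CLI instead of the GUI. A sketch, assuming a Docker Hub account (Podman will prompt for the password interactively):

```shell
# Log in to Docker Hub so authenticated pulls work from this machine;
# credentials are stored for later podman pull/push commands
podman login docker.io --username <your-docker-hub-username>
```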

Then I go to the Images section and click the Pull button in the upper right corner. Paste in continuumio/anaconda3 or continuumio/miniconda3 and click Pull image. I had to restart the VM once to resolve a proxy issue.

Machine Learning for Dummies (ML4D hereafter) recommends using Jupyter Notebook, but Jupyter is not installed in the Anaconda images by default. So I write a Dockerfile that builds the extra dependencies on top of the official Anaconda image.
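A minimal Dockerfile sketch for this. The package choices, directory paths, and server flags are my own assumptions, not prescribed by the book or the image:

```dockerfile
FROM docker.io/continuumio/anaconda3

# Jupyter is not preinstalled in the Anaconda images; clean caches to keep the image smaller
RUN conda install -y jupyter && conda clean -afy

# Directories that the compose file will mount host paths onto
RUN mkdir -p /opt/notebooks /opt/data
WORKDIR /opt/notebooks

EXPOSE 8888
# Bind to all interfaces so the port mapping from the host works;
# --allow-root because the image runs as root by default
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
```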

Starting and using the container

Since I am using the GUI, typing long command lines each time I start and stop the containers makes little sense. Podman Desktop provides a Compose extension. I am going to use it to mount two host paths into the container, and map the container's port 8888 to the same port on the host. Later, I can add database containers for more serious ML exercises.

The Compose extension can be set up using a button in Settings → Resources.

At the moment, the compose file is quite simple. I use an AI agent to write it for me:

services:
  jupyter:
    build: .
    ports:
      - "8888:8888"
    volumes:
      - ./notebook:/opt/notebooks
      - ./data:/opt/data
    container_name: ml4d-jupyter

Then run the compose file as the official docs describe.
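With the Compose extension installed, these are the usual compose verbs, run from the directory containing the compose file (the service name jupyter comes from the file above):

```shell
# Build the image (on first run) and start the stack in the background
podman compose up -d --build

# Follow the Jupyter server output
podman compose logs -f jupyter

# Stop and remove the containers when done
podman compose down
```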

The notebook requires a login token. It can be found by clicking the 3-dot menu on the container's row in the Containers section of the GUI, or by double-clicking the container name; the token appears as a query parameter in the Jupyter notebook URL.
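If the GUI is not handy, the token can also be fished out of the container logs. A sketch, where ml4d-jupyter is the container_name set in the compose file above:

```shell
# Print the first log line containing a token URL from the Jupyter server
podman logs ml4d-jupyter 2>&1 | grep -m 1 'token='
```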