Setting Up a Multi-container Docker Application for Local Development

This project is going to set up Chroma in client/server mode and use FastAPI to create an API that acts as a Chroma client.

The project currently has the following structure:

.
├── app
│   └── main.py
├── docker-compose.yml
├── Dockerfile
├── .gitignore
├── .mise.toml
└── requirements.txt
  • app/main.py: the FastAPI application
  • docker-compose.yml: define and manage the Chroma and FastAPI Docker containers
  • Dockerfile: instructions for building the Docker image (https://docs.docker.com/build/concepts/dockerfile/)
  • requirements.txt: requirements for the Python app
  • .mise.toml: creates a virtual env for installing Python dependencies locally (so they’re available to Neovim)

The Dockerfile

# syntax=docker/dockerfile:1
FROM python:3.11-slim

WORKDIR /code

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY ./app /code/app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

The FROM instruction defines the base image (python:3.11-slim). python:<version>-slim only contains the minimal Debian packages needed to run Python. I think it makes sense for this case, as the project’s requirements are defined in requirements.txt and installed with the RUN instruction.

python:<version>-slim includes:

  • the base OS: Debian 12
  • Python 3.11
  • Basic Linux utilities (bash, apt, etc.)

The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, and ADD instructions.

COPY requirements.txt . copies requirements.txt into the container (/code).

RUN pip install --no-cache-dir -r requirements.txt then installs the requirements.

COPY ./app /code/app copies the local ./app directory to /code/app in the container.

CMD is the program that’s run when the container is started. It’s essentially running uvicorn app.main:app while setting the host and port command line options.

See https://uvicorn.dev/#quickstart for details about Uvicorn.

--host 0.0.0.0 means “listen on all network interfaces”. It allows connections from outside the container to reach the app. If instead it used --host 127.0.0.1, uvicorn would only accept connections from within the container.

--port 8000 is the default uvicorn port. It could be set to something different.
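
For quick experiments outside Docker, the same host and port options can be passed to uvicorn programmatically. A minimal sketch, assuming a hypothetical run.py next to the app directory:

# run.py (hypothetical helper for running the app outside the container)
import uvicorn

if __name__ == "__main__":
    # equivalent to: uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload
    uvicorn.run("app.main:app", host="127.0.0.1", port=8000, reload=True)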

docker-compose.yml

Docs: https://docs.docker.com/compose/

Docker Compose is a tool for defining and running multi-container applications. Run docker compose --help to see the available commands.

services:
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "8001:8000"
    volumes:
      - ./chroma-data:/data
    environment:
      - IS_PERSISTENT=TRUE

  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - chroma
    volumes:
      - ./app:/code/app
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload # reload on changes to /app
    environment:
      - CHROMA_HOST=chroma
      - CHROMA_PORT=8000

Defining Docker services

The compose file defines two services: chroma and api. The chroma service uses the public chromadb/chroma:latest image, pulled from Docker Hub. The api service is built from the Dockerfile.

Configuring ports for services

The ports keys use the pattern "host_port:container_port".

For the chroma service’s ports ("8001:8000"), a local request made to http://localhost:8001 will be routed by Docker to the chroma container’s port 8000.

That makes it possible to do things like:

# python file that's outside of the container
import chromadb

chroma_client = chromadb.HttpClient(host="localhost", port=8001)

collection = chroma_client.create_collection(name="my_collection")
collection.add(ids=["id1", "id2"], documents=["this is a test", "this is only a test"])
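
Querying works the same way from the host; a quick sketch continuing the snippet above (the query text is just an example):

results = collection.query(query_texts=["a test"], n_results=2)
print(results["documents"])  # nearest documents for the query, most similar first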

For the api service’s ports ("8000:8000"), a local request to http://localhost:8000 will be routed by Docker to the api container’s port 8000.

Note that port 8000 is used by both services, but each container has its own network namespace, so they can both use port 8000 without conflict.

Configuring dependencies

The api service’s depends_on directive (key) is used to ensure that the api service is not started before the chroma service. Note that on its own it only controls start order; it doesn’t wait for Chroma to actually be ready to accept connections, so a retry on the client side (sketched below) can help.
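
A minimal sketch of that kind of retry, assuming the api service connects to Chroma at startup (the function name and retry counts are my own):

import time

import chromadb


def wait_for_chroma(host: str, port: int, retries: int = 10, delay: float = 1.0):
    # depends_on only guarantees the chroma container was started first,
    # so keep retrying until the server actually responds
    for _ in range(retries):
        try:
            client = chromadb.HttpClient(host=host, port=port)
            client.heartbeat()  # raises if the server isn't reachable yet
            return client
        except Exception:
            time.sleep(delay)
    raise RuntimeError(f"Could not reach Chroma at {host}:{port}")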

Configuring volumes

The volumes keys are used to create “bind mounts”. Bind mounts in Docker allow files or directories from the host machine’s file system to be mounted directly in a running container. This creates a link between the specified host path and the specified container path. The pattern is (host_path:container_path).

Bind mounts allow for real-time synchronization between local and container files. I’m using that approach with the api service. They also allow for data persistence. I think the configuration is going to allow me to add records to a Chroma collection locally, then scp the local chroma-data directory to a production VPS. The goal is to generate embeddings from my local file system.
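
If that plan holds, the chroma-data directory should also be readable locally with Chroma’s persistent client once the containers are stopped. A sketch for sanity-checking the data; this assumes the server’s on-disk format in /data matches what PersistentClient expects, which is worth verifying:

import chromadb

# open the bind-mounted data directory directly (with the containers stopped)
client = chromadb.PersistentClient(path="./chroma-data")
print(client.list_collections())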

The Docker command directive

The command directive (uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload) overrides the Dockerfile’s CMD instruction. The --reload flag makes uvicorn restart the app whenever files in /code/app change. Note that I could also be using Compose’s watch directive to watch for file system changes.

The Docker environment directive

The environment directive is used to set environment variables.
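
In app/main.py those variables can be read with os.environ. A minimal sketch; the fallback values are my own assumption for running the app outside Docker:

import os

# set by docker-compose.yml inside the api container
CHROMA_HOST = os.environ.get("CHROMA_HOST", "localhost")
CHROMA_PORT = int(os.environ.get("CHROMA_PORT", "8001"))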

Configuring the project’s requirements

requirements.txt:

fastapi==0.123.5
uvicorn[standard]==0.38.0
chromadb==1.3.5
pydantic==2.12.5

I set the versions for the requirements file by going to https://pypi.org/ and finding the latest version for each package.
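
That lookup could also be scripted against PyPI’s JSON API; a small sketch (the package list mirrors requirements.txt):

import json
import urllib.request


def latest_version(package: str) -> str:
    # PyPI's JSON API reports the latest released version of a package
    with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return json.load(resp)["info"]["version"]


for pkg in ["fastapi", "uvicorn", "chromadb", "pydantic"]:
    print(pkg, latest_version(pkg))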

I want the requirements to be installed locally for my IDE:

pip install -r requirements.txt

Creating and starting the containers (docker compose up)

The containers can now be started with docker compose up --build:

❯ docker compose up --build
[+] Building 5.2s (14/14) FINISHED
 => [internal] load local bake definitions                                                                                 0.0s
 => => reading from stdin 587B                                                                                             0.0s
 => [internal] load build definition from Dockerfile                                                                       0.1s

...
chroma-1  | Saving data to: /data
chroma-1  | Connect to Chroma at: http://localhost:8000
chroma-1  | Getting started guide: https://docs.trychroma.com/docs/overview/getting-started
chroma-1  |
chroma-1  | ☁️ To deploy your DB - try Chroma Cloud!
chroma-1  | - Sign up: https://trychroma.com/signup
chroma-1  | - Docs: https://docs.trychroma.com/cloud/getting-started
chroma-1  | - Copy your data to Cloud: chroma copy --to-cloud --all
chroma-1  |
chroma-1  | OpenTelemetry is not enabled because it is missing from the config.
api-1     | INFO:     Will watch for changes in these directories: ['/code']
api-1     | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
api-1     | INFO:     Started reloader process [1] using WatchFiles
api-1     | ERROR:    Error loading ASGI app. Attribute "app" not found in module "app.main".

Unsurprisingly, I’m getting the error: Error loading ASGI app. Attribute "app" not found in module "app.main".

Useful Docker commands

The running containers can be viewed with docker ps:

❯ docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS         PORTS                                         NAMES
078f7307ea42   blogsearch-api           "uvicorn app.main:ap…"   6 minutes ago   Up 6 minutes   0.0.0.0:8000->8000/tcp, [::]:8000->8000/tcp   blogsearch-api-1
8307fc87500f   chromadb/chroma:latest   "dumb-init -- chroma…"   6 minutes ago   Up 6 minutes   0.0.0.0:8001->8000/tcp, [::]:8001->8000/tcp   blogsearch-chroma-1

docker exec is used to execute a command in a running container: docker exec [OPTIONS] CONTAINER COMMAND [ARG...]. To enter a container, use docker exec -it <container_name_or_id> bash:

❯  docker exec -it blogsearch-chroma-1 bash
root@8307fc87500f:/# ls
bin   chroma_config.yaml  data  etc   lib    media  opt   root  sbin  sys  usr
boot  config.yaml         dev   home  lib64  mnt    proc  run   srv   tmp  var

Or enter the blogsearch-api-1 container:

❯ docker exec -it blogsearch-api-1 bash
root@078f7307ea42:/code# ls
app  requirements.txt
root@078f7307ea42:/code# ls app/
__pycache__  main.py
root@078f7307ea42:/code#

It’s interesting that the two shells land in different places. The Chroma container drops me at the filesystem root (/), while the api container’s shell starts in /code because of the WORKDIR instruction in the Dockerfile; the rest of the Linux filesystem is still there in both containers (ls /).

Logs can be viewed with docker logs <container_name>. E.g. docker logs blogsearch-chroma-1. Logs can be tailed with the --follow (-f) flag: docker logs blogsearch-chroma-1 -f

Containers can be stopped and removed with docker compose down:

❯ docker compose down
[+] Running 3/3
 ✔ Container blogsearch-api-1     Removed                                                 0.2s
 ✔ Container blogsearch-chroma-1  Removed                                                 0.1s
 ✔ Network blogsearch_default     Removed

A minimal FastAPI app

There’s a lot that could be wrong with this, but it works:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
import chromadb
from pydantic import BaseModel

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:1313", "https://zalgorithm.com"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


class SearchQuery(BaseModel):
    query: str
    n_results: int = 5


@app.get("/")
def read_root():
    return {"status": "Semantic search API is running"}

Testing the API from the host:

❯ curl http://localhost:8000
{"status":"Semantic search API is running"}