Setting Up a Multi-container Docker Application for Local Development
This project sets up Chroma in client/server mode and uses FastAPI to create an API that acts as a Chroma client.
The project currently has the following structure:
```text
.
├── app
│   └── main.py
├── docker-compose.yml
├── Dockerfile
├── .gitignore
├── .mise.toml
└── requirements.txt
```
- app/main.py: the FastAPI application
- docker-compose.yml: defines and manages the Chroma and FastAPI Docker containers
- Dockerfile: instructions for building the Docker image (https://docs.docker.com/build/concepts/dockerfile/)
- requirements.txt: requirements for the Python app
- .mise.toml: creates a virtual env for installing Python dependencies locally (so they’re available to Neovim)
The Dockerfile
```Dockerfile
# syntax=docker/dockerfile:1
FROM python:3.11-slim

WORKDIR /code

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY ./app /code/app

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
The FROM instruction defines the base image (python:3.11-slim). python:<version>-slim only
contains the minimal Debian packages needed to run Python. I think it makes sense for this case, as
the project’s requirements are defined in requirements.txt and installed with the RUN
instruction.
python:<version>-slim includes:
- the base OS: Debian 12
- Python 3.11
- Basic Linux utilities (bash, apt, etc.)
The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, and
ADD instructions.
COPY requirements.txt . copies requirements.txt into the container (/code).
RUN pip install --no-cache-dir -r requirements.txt then installs the requirements.
COPY ./app /code/app copies the local app directory to /code/app in the container.
CMD is the program that’s run when the container starts. It’s essentially running uvicorn app.main:app while setting the host and port command line options.
See https://uvicorn.dev/#quickstart for details about Uvicorn.
--host 0.0.0.0 means “listen on all network interfaces”. It allows connections from outside the
container to reach the app. If instead it used --host 127.0.0.1, uvicorn would only accept
connections from within the container.
--port 8000 is the default uvicorn port. It could be set to something different.
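What --host controls can be sketched with Python’s standard socket module (run outside of Docker; port 0 just asks the OS for any free port):

```python
import socket

# uvicorn's --host flag controls which interface the server socket binds to.
# 127.0.0.1 accepts only loopback connections (from inside the same machine
# or container); 0.0.0.0 accepts connections arriving on any interface,
# e.g. traffic forwarded in through Docker's published port.
loopback = socket.socket()
loopback.bind(("127.0.0.1", 0))
loopback_addr = loopback.getsockname()

any_iface = socket.socket()
any_iface.bind(("0.0.0.0", 0))
any_addr = any_iface.getsockname()

print("loopback only:", loopback_addr, "all interfaces:", any_addr)
loopback.close()
any_iface.close()
```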
docker-compose.yml
Docs: https://docs.docker.com/compose/
Docker Compose is a tool for defining and running multi-container applications. Run docker compose --help
to see the available commands.
```yaml
services:
  chroma:
    image: chromadb/chroma:latest
    ports:
      - "8001:8000"
    volumes:
      - ./chroma-data:/data
    environment:
      - IS_PERSISTENT=TRUE
  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - chroma
    volumes:
      - ./app:/code/app
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload # reload on changes to /app
    environment:
      - CHROMA_HOST=chroma
      - CHROMA_PORT=8000
```
Defining Docker services
The compose file defines two services: chroma and api. The chroma service uses the public
chromadb/chroma:latest image, which is pulled from Docker Hub. The api service is built from the
Dockerfile.
Configuring ports for services
The ports keys use the pattern "host_port:container_port".
For the chroma service ports ("8001:8000") a local request made to http://localhost:8001
will be routed by Docker to the chroma container’s port 8000.
That makes it possible to do things like:
```python
# python file that's outside of the container
import chromadb

chroma_client = chromadb.HttpClient(host="localhost", port=8001)
collection = chroma_client.create_collection(name="my_collection")
collection.add(ids=["id1", "id2"], documents=["this is a test", "this is only a test"])
```
For the api service ports ("8000:8000") a local request to http://localhost:8000
will be routed by Docker to the api container’s port 8000.
Note that port 8000 is used by both services, but since each container has its own network
namespace, they can both use port 8000 without conflict.
Configuring dependencies
The api service’s depends_on directive (key) is used to ensure that the api service is not
started before the chroma service.
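One caveat: the short-form depends_on only guarantees start order, not readiness, so the api container can come up before Chroma is actually accepting connections. Compose’s long-form syntax can wait on a healthcheck instead. A sketch, assuming the chroma image includes curl and serves a v2 heartbeat endpoint (both assumptions worth verifying):

```yaml
services:
  chroma:
    image: chromadb/chroma:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v2/heartbeat"]
      interval: 5s
      timeout: 3s
      retries: 5
  api:
    build: .
    depends_on:
      chroma:
        condition: service_healthy
```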
Configuring volumes
The volumes keys are used to create “bind mounts”. Bind mounts in Docker allow files or
directories from the host machine’s file system to be mounted directly in a running container. This
creates a link between the specified host path and the specified container path. The pattern is
(host_path:container_path).
Bind mounts allow for real-time synchronization between local and container files. I’m using that
approach with the api service. They also allow for data persistence. I think the configuration
is going to allow me to add records to a Chroma collection locally, then scp the local
chroma-data directory to a production VPS. The goal is to generate embeddings from my local file
system.
The Docker command directive
The command directive (uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload) overrides the
Dockerfile's CMD instruction. It runs when the container starts; the --reload flag makes uvicorn
restart whenever files under /code/app change, which, thanks to the bind mount, includes edits to
the local ./app directory. Note that I could also be using Compose's develop.watch directive to
watch for file system changes.
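The watch alternative would look something like this (a sketch of Compose’s develop.watch syntax; with the sync action, Compose copies changed files into the container and uvicorn’s --reload picks them up, which could replace the bind mount):

```yaml
services:
  api:
    build: .
    develop:
      watch:
        - action: sync        # copy changed files into the container
          path: ./app
          target: /code/app
        - action: rebuild     # rebuild the image when dependencies change
          path: requirements.txt
```

It’s started with docker compose up --watch.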
The Docker environment directive
The environment directive is used to set environment variables in the container.
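The api container gets CHROMA_HOST=chroma and CHROMA_PORT=8000 from the compose file. A sketch of how app/main.py could read them, with local fallbacks for running outside Docker (the variable names are the ones defined in docker-compose.yml; the fallback values are my assumption):

```python
import os

# Inside the api container these resolve to "chroma" and 8000 (set by the
# compose file's environment directive); outside Docker they fall back to
# the host-side address of the published chroma port.
chroma_host = os.environ.get("CHROMA_HOST", "localhost")
chroma_port = int(os.environ.get("CHROMA_PORT", "8001"))

print(f"Chroma server: http://{chroma_host}:{chroma_port}")
```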
Configuring the project’s requirements
requirements.txt:
```text
fastapi==0.123.5
uvicorn[standard]==0.38.0
chromadb==1.3.5
pydantic==2.12.5
```
I set the versions for the requirements file by going to https://pypi.org/ and finding the latest version for each package.
I want the requirements to be installed locally for my IDE:
```shell
pip install -r requirements.txt
```
Creating and starting the containers (docker up)
The containers can now be built and started with docker compose up --build:
```text
❯ docker compose up --build
[+] Building 5.2s (14/14) FINISHED
 => [internal] load local bake definitions 0.0s
 => => reading from stdin 587B 0.0s
 => [internal] load build definition from Dockerfile 0.1s
...
chroma-1 | Saving data to: /data
chroma-1 | Connect to Chroma at: http://localhost:8000
chroma-1 | Getting started guide: https://docs.trychroma.com/docs/overview/getting-started
chroma-1 |
chroma-1 | ☁️ To deploy your DB - try Chroma Cloud!
chroma-1 | - Sign up: https://trychroma.com/signup
chroma-1 | - Docs: https://docs.trychroma.com/cloud/getting-started
chroma-1 | - Copy your data to Cloud: chroma copy --to-cloud --all
chroma-1 |
chroma-1 | OpenTelemetry is not enabled because it is missing from the config.
api-1 | INFO: Will watch for changes in these directories: ['/code']
api-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
api-1 | INFO: Started reloader process [1] using WatchFiles
api-1 | ERROR: Error loading ASGI app. Attribute "app" not found in module "app.main".
```
Unsurprisingly, I’m getting the error: Error loading ASGI app. Attribute "app" not found in module "app.main".
Useful Docker commands
The running containers can be viewed with docker ps:
```text
❯ docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS         PORTS                                         NAMES
078f7307ea42   blogsearch-api           "uvicorn app.main:ap…"   6 minutes ago   Up 6 minutes   0.0.0.0:8000->8000/tcp, [::]:8000->8000/tcp   blogsearch-api-1
8307fc87500f   chromadb/chroma:latest   "dumb-init -- chroma…"   6 minutes ago   Up 6 minutes   0.0.0.0:8001->8000/tcp, [::]:8001->8000/tcp   blogsearch-chroma-1
```
docker exec is used to execute a command on a running container: docker exec [OPTIONS] CONTAINER COMMAND [ARG...].
To enter a container, use docker exec -it <container_name_or_id> bash
```text
❯ docker exec -it blogsearch-chroma-1 bash
root@8307fc87500f:/# ls
bin   chroma_config.yaml  data  etc   lib    media  opt   root  sbin  sys  usr
boot  config.yaml         dev   home  lib64  mnt    proc  run   srv   tmp  var
```
Or enter the blogsearch-api-1 container:
```text
❯ docker exec -it blogsearch-api-1 bash
root@078f7307ea42:/code# ls
app  requirements.txt
root@078f7307ea42:/code# ls app/
__pycache__  main.py
root@078f7307ea42:/code#
```
At first it looks like the full Linux filesystem only exists in the chroma container, but that’s
just because the shells start in different directories: the api container’s shell starts in /code
(set by the Dockerfile’s WORKDIR instruction), while the chroma container’s shell starts at /.
Running ls / in the api container shows a similar set of top-level directories.
Logs can be viewed with docker logs <container_name>. E.g. docker logs blogsearch-chroma-1.
Logs can be tailed with the --follow (-f) flag: docker logs blogsearch-chroma-1 -f
Containers can be stopped and removed with docker compose down:
```text
❯ docker compose down
[+] Running 3/3
 ✔ Container blogsearch-api-1     Removed  0.2s
 ✔ Container blogsearch-chroma-1  Removed  0.1s
 ✔ Network blogsearch_default     Removed
```
A minimal FastAPI app
There’s a lot that could be wrong with this, but it works:
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
import chromadb
from pydantic import BaseModel

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:1313", "https://zalgorithm.com"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

class SearchQuery(BaseModel):
    query: str
    n_results: int = 5

@app.get("/")
def read_root():
    return {"status": "Semantic search API is running"}
```
```text
blogsearch on master [?] via v3.11.13 (.blogenv)
❯ curl http://localhost:8000
{"status":"Semantic search API is running"}
```