How I Set Up A Private Local PyPI Server Using Docker And Ansible


The Story

Recently, I worked on a Jira ticket that had been in the backlog for a while.

The story goes like this: we (my team at work) have a PyPI server (running on devpi) that hosts our packages. We saw a couple of issues as potential risks, namely:

  • The setup was not under config management, meaning we didn’t know how we would rebuild it if it died and, like many software projects, there wasn’t much detailed documentation on how to do so.
  • The Python packages did not have any backups, so if something were to happen it would be bye-bye to old packages, i.e. it would be tricky to test old system releases.
  • The server needed to be restarted occasionally to force a refresh of the packages, as our package index had grown over the past few years.

My initial approach to this was:

  • Research and evaluate the existing tools that the Python ecosystem has to offer, devpi and pypi-server being the most prominent ones.
  • Run the PyPI server in a container, preferably Docker (the current setup was running in a Proxmox LXC container).
  • Ensure that deployments are deterministic.
  • Ensure that the PyPI repository can be torn down and recreated ad hoc with a single command (preferably through Ansible).
  • Overall, ensure that there isn’t any significant downtime during the change-over, i.e. the client side shouldn’t have to make any changes.

In this post, I will try to detail how I set up a private local PyPI server using Docker and Ansible.

TL;DR

Deploy/destroy devpi server running in Docker container using a single command.

The How

After my initial research comparing devpi and pypi-server, devpi won, for various and obvious reasons!

I could have just Bash-scripted everything quickly before putting it all together, but then where is the fun in that? Also, this has to run in production. Hence I decided on an over-engineered approach, which would also be a good learning platform.

The Walk-through

The setup is divided into two sections: containerization and automation.

This walk-through mainly focuses on containerization. The automation is covered in a separate post.

Prerequisite

If you already have Docker and Docker Compose installed and configured, you can skip this step; otherwise, look up the installation method for your platform. The commands below are for an apt-based system.

sudo apt install docker.io
# The Docker service needs to be set up to run at startup.
sudo systemctl start docker
sudo systemctl enable docker
python3 -m pip install docker-compose
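
Before moving on, it can help to confirm that both tools are actually on the PATH. The snippet below is only a sketch and not part of the original setup; require_cmd is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original setup): succeeds only if the
# given command can be found on the PATH.
require_cmd() {
    command -v "$1" > /dev/null 2>&1
}

# Report every missing prerequisite instead of stopping at the first one.
missing=0
for tool in docker docker-compose; do
    if ! require_cmd "${tool}"; then
        echo "Missing prerequisite: ${tool}" >&2
        missing=$((missing + 1))
    fi
done
echo "Missing prerequisites: ${missing}"
```

A count of 0 means both docker and docker-compose were found.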

Containerization

Part of my solution was that the PyPI server runs in a container, preferably Docker (the current setup was running in a Proxmox LXC container). Using a container offers convenience and ensures that deployments are deterministic.

Directory Structure

In this section, I will go through each file in our pypi_server directory, which houses the configurations.

├── Makefile
├── pypi_server
│   ├── config.yml
│   ├── create_pypi_index.sh
│   ├── docker-compose-dev.yaml
│   ├── docker-compose-stable.yaml
│   ├── Dockerfile
│   ├── entrypoint.sh
│   └── README.md
└── README.md

Makefile

Below is a snippet from our Makefile, which makes it a lot easier for our CI system to lint, build, tag and push images to our local Docker registry. This means that instead of typing out the whole docker command to build, tag or push, we can run something like:

# Which will lint the Dockerfile, build, tag and push the image to our local registry
make push_pypi_server

You can also check out my over-engineered Makefile here.

cat >> Makefile << 'EOF'
# NOTE: recipe lines must be indented with a tab character, not spaces.
SHELL := /bin/bash -eo pipefail
# Define images here
IMAGES := pypi_server
.PHONY: $(IMAGES)
# Docker registry URL
REGISTRY := 
.DEFAULT_GOAL := help

define PRINT_HELP_PYSCRIPT
import re, sys
print("Please use `make <target>` where <target> is one of\n")
for line in sys.stdin:
    match = re.match(r'(^([a-zA-Z-]+).*?)## (.*)$$', line)
    if match:
        target, _, help = match.groups()
        if not any([target.startswith('--'), '%' in target, '$$' in target]):
            target = target.replace(':','')
            print(f'{target:40} {help}')
        if '%' in target:
            target = target.replace('_%:', '_{image_name}').split(' ')[0]
            print(f'{target:40} {help}')
        if '$$' in target:
            target = target[:target.find(':')]
            print(f'{target:40} {help}')

endef
export PRINT_HELP_PYSCRIPT

.PHONY: help
help:
    @python3 -c "$$PRINT_HELP_PYSCRIPT" < $(MAKEFILE_LIST)

pre_build_%: IMAGE = $(subst pre_build_,,$@)
pre_build_%:  ## Run Dockerfile linter (https://github.com/hadolint/hadolint)
    docker run --rm -i hadolint/hadolint < $(IMAGE)/Dockerfile

build_cached_%: IMAGE = $(subst build_cached_,,$@)
build_cached_%: pre_build_%  ## Build the docker image [Using cache when building].
    docker build -t "$(IMAGE):latest" "${IMAGE}"

build_%: IMAGE = $(subst build_,,$@)
build_%: pre_build_%  ## Build the docker image [Not using cache when building].
    docker build --no-cache -t "$(IMAGE):latest" "${IMAGE}"
    touch .$@

tag_%: IMAGE = $(subst tag_,,$@)
tag_%: pre_build_%  ## Tag a container before pushing to cam registry.
    if [ ! -f ".build_${IMAGE}" ]; then \
        echo "Rebuilding the image: ${IMAGE}"; \
        $(MAKE) build_$(IMAGE); \
    fi;
    docker tag "$(IMAGE):latest" "$(REGISTRY)/$(IMAGE):latest"

push_%: IMAGE = $(subst push_,,$@)
push_%: tag_%  ## Push tagged container to cam registry.
    docker push $(REGISTRY)/$(IMAGE):latest
    rm -rf ".build_$(IMAGE)"
EOF

Dockerfile and scripts

I created the following Dockerfile, which executes a script entrypoint.sh upon container startup and also copies a create_pypi_index.sh script which should be run once when the devpi-server is up. This script creates and configures the indices.

cat >> Dockerfile << 'EOF'
FROM python:3.7

RUN pip install --no-cache-dir \
    devpi-client==5.2.2 \
    devpi-server==5.5.1 \
    devpi-web==4.0.6

# Empty placeholder; supply the real password at runtime (for example with `docker run -e`).
ENV PYPI_PASSWORD=""
EXPOSE 3141
WORKDIR /root
VOLUME /root/.devpi

COPY create_pypi_index.sh /data/create_pypi_index.sh
RUN chmod a+x /data/create_pypi_index.sh

COPY entrypoint.sh /data/entrypoint.sh
ENTRYPOINT ["bash", "/data/entrypoint.sh"]

COPY config.yml /data/config.yml
CMD ["devpi-server", "-c", "/data/config.yml"]
EOF

entrypoint.sh

According to the docs:

When started afresh, devpi-server will not contain any users or indexes except for the root user and the root/pypi index (see using root/pypi index) which represents and caches https://pypi.org packages.

Note: The root/pypi index is a read-only cache of https://pypi.org, which is why a new index has to be created if you want to push packages as well.

cat >> entrypoint.sh << 'EOF'
#!/usr/bin/env bash
# Initialise the server directory on first start only.
if ! [ -d /root/.devpi/server ]; then
    devpi-init
fi

exec "$@"
EOF
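
The "initialise once, then hand over to the command" pattern used by the entrypoint can be tried out without devpi at all. In the sketch below, DATA_DIR and fake_init are stand-ins for /root/.devpi/server and devpi-init, and it assumes the server directory is a directory, which is why it tests with -d.

```shell
#!/usr/bin/env bash
# Stand-in for the entrypoint logic: initialise the data directory only when
# it does not exist yet, then run the command that was passed in.
# DATA_DIR and fake_init are placeholders; nothing here talks to devpi.
DATA_DIR="$(mktemp -d)/server"
init_count=0

fake_init() {
    mkdir -p "${DATA_DIR}"
    init_count=$((init_count + 1))
}

entrypoint() {
    if ! [ -d "${DATA_DIR}" ]; then
        fake_init
    fi
    # The real script uses `exec "$@"` so that the server replaces the shell
    # and receives signals directly; a plain call keeps this demo running.
    "$@"
}

entrypoint echo "first start"    # data directory is created here
entrypoint echo "second start"   # data directory already exists, init skipped
echo "init ran ${init_count} time(s)"
```

The second call skips initialisation, which is the behaviour wanted when the container restarts with a persisted volume.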

create_pypi_index.sh

Read more about index creation here.

cat >> create_pypi_index.sh << 'EOF'
#!/usr/bin/env bash

# Creates PyPI user and an index for uploading packages to.

devpi use http://localhost:3141
devpi login root --password=
devpi user -c pypi email= password=${PYPI_PASSWORD:-}
devpi user -l
devpi index -c pypi/stable bases=root/pypi volatile=True "mirror_whitelist=*"
EOF

Once the image has been built and a container is running, we can create an index by running the following:

PYPI_CONTAINER=$(docker ps --filter "name=pypi" --filter "status=running" --format "{{.Names}}")
docker exec -ti ${PYPI_CONTAINER} /bin/bash -c "/data/create_pypi_index.sh"
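
The index script only works once the server is accepting requests, so it may be worth waiting for it first. The sketch below is an assumption rather than part of the original setup: retry is a hypothetical helper, and the commented usage assumes curl is installed and that the server answers on its /+api endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds, at most <attempts>
# times, sleeping <delay> seconds between tries.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while ! "$@"; do
        if [ "${n}" -ge "${attempts}" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "${delay}"
    done
    return 0
}

# Assumed usage (needs curl and a running container, so it is left commented out):
# retry 30 1 curl -fsS -o /dev/null http://localhost:3141/+api \
#     && docker exec -ti "${PYPI_CONTAINER}" /bin/bash -c "/data/create_pypi_index.sh"
```

With 30 attempts and a one-second delay, the index creation is given up on after roughly half a minute instead of failing on the first refused connection.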

Devpi configuration

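This is the devpi-server configuration, in YAML. It binds the server to all interfaces on port 3141 and, through restrict-modify, allows only the root user to create or modify users and indexes.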

cat >> config.yml << EOF 
---
devpi-server:
  host: 0.0.0.0
  port: 3141
  restrict-modify: root
EOF

Compose file(s)

This is a development docker-compose file that builds the image locally instead of pulling it from the registry.

cat >> docker-compose-dev.yaml << 'EOF'
---
version: '3'
services:
  devpi:
    build:
      context: .
      dockerfile: ./Dockerfile
    ports:
      - "${DEVPI_PORT:-3141}:3141"
    volumes:
      - "${DEVPI_HOME:-./devpi}:/root/.devpi"
    tty: true
    stdin_open: true
EOF

The only difference between docker-compose-dev.yaml and docker-compose-stable.yaml is that the former has a build context while the latter defines an image to pull from the registry.
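
For illustration, the stable variant might look like the sketch below. It is an assumption, not a copy of our actual file: it presumes the image was pushed as pypi_server:latest under the registry used by the Makefile, and it reads that registry from a REGISTRY environment variable when docker-compose runs.

```shell
# Sketch of docker-compose-stable.yaml (assumed content). The quoted 'EOF'
# keeps the ${...} placeholders literal so that Compose, not the shell,
# substitutes them. Compose aborts with the given message if REGISTRY is unset.
cat > docker-compose-stable.yaml << 'EOF'
---
version: '3'
services:
  devpi:
    image: "${REGISTRY:?REGISTRY must be set}/pypi_server:latest"
    ports:
      - "${DEVPI_PORT:-3141}:3141"
    volumes:
      - "${DEVPI_HOME:-./devpi}:/root/.devpi"
    tty: true
    stdin_open: true
EOF
```

Everything except the image line mirrors the development file, so switching between the two only changes where the image comes from.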

Run the command below to build the image and run the container on localhost:

env DEVPI_HOME="${HOME}/.devpi" docker-compose -f docker-compose-dev.yaml up --build -d
# or 
# cat << EOF > .env
# DEVPI_HOME="${HOME}/.devpi"
# EOF
# docker-compose --env-file ./.env -f docker-compose-dev.yaml up --build -d
# --------------------------------------------------------------------------
# or native
# docker build -t pypi_server . 
# docker run -d -ti -v "${HOME}/.devpi:/root/.devpi" -p 3141:3141 pypi_server

The PyPI server is now available at http://localhost:3141. If all went well, you should see the devpi web interface when you open that URL in a browser.

Garbage Collection

To tear down the containers and remove the images, run the following command:

env DEVPI_HOME="${HOME}/.devpi" docker-compose -f docker-compose-dev.yaml down --volumes --rmi all
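
Since one of the original concerns was that the packages had no backups, it may be worth archiving the data directory before tearing anything down. The sketch below is an assumption and not part of our setup: backup_devpi_home is a hypothetical helper, and it is probably safest to run it while the container is stopped so the files are not changing underneath it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: archive a devpi data directory into a timestamped
# tarball and print the path of the archive.
backup_devpi_home() {
    src="$1"
    dest_dir="$2"
    archive="${dest_dir}/devpi-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "${dest_dir}"
    # -C makes the paths inside the archive relative to the parent directory.
    tar -czf "${archive}" -C "$(dirname "${src}")" "$(basename "${src}")" \
        && echo "${archive}"
}

# Assumed usage, with the same DEVPI_HOME as in the compose commands above:
# backup_devpi_home "${HOME}/.devpi" "${HOME}/devpi-backups"
```

Restoring would then be a matter of unpacking the archive back into place before starting the container again.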

Client: Permanent index configuration for pip

To avoid having to re-type index URLs with pip or easy_install, you can configure pip by setting the index-url entry in your $HOME/.pip/pip.conf (POSIX) or $HOME/pip/pip.ini (Windows). Let’s do it for the pypi/stable index created earlier, which also serves packages from root/pypi because that index is its base:

mkdir -p ~/.pip
cat >> ~/.pip/pip.conf << EOF 
[global]
no-cache-dir = false
timeout = 60
index-url = http://localhost:3141/pypi/stable/+simple/
[search]
index = http://localhost:3141/pypi/stable/
EOF

Alternatively, you can add a special environment variable to your shell settings (e.g. .bashrc):

cat >> ~/.bashrc << EOF 
export PIP_INDEX_URL=http://localhost:3141/pypi/stable/+simple/
EOF
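
If you would rather not hard-code the URL, it can be assembled from its parts. This is only a sketch: devpi_simple_url is a hypothetical helper, and the user and index names assume the pypi/stable index created by create_pypi_index.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the pip "simple" URL for a devpi index from the
# server URL, the user name and the index name.
devpi_simple_url() {
    base="${1%/}"   # drop a trailing slash from the server URL, if any
    printf '%s/%s/%s/+simple/\n' "${base}" "$2" "$3"
}

PIP_INDEX_URL="$(devpi_simple_url http://localhost:3141 pypi stable)"
export PIP_INDEX_URL
echo "${PIP_INDEX_URL}"

# Assumed one-off usage without touching pip.conf (needs pip and a running server):
# pip install --index-url "${PIP_INDEX_URL}" some-package
```

Exporting PIP_INDEX_URL in the current shell has the same effect as the .bashrc entry, but only for that session.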

Automation

I didn’t want the post to be too long.

Post continues here

Conclusion

Congratulations!!!

Assuming that everything was set up correctly, you now have a container running a local/private PyPI server, and you can download or upload packages (using twine or devpi-client).

We’ve been using this private package index for a few months, and have also noticed a significant improvement in the time it takes to download and install packages compared with before.

Note:

  • Uploading packages to the local PyPI server is beyond the scope of this post.
  • The purpose of this post was mainly to share the approach that worked well for us. You may use it to host your private package repository and index, adapting it to the cloud provider and web server of your choice.