Container registry SOP

Fedora uses the Docker Distribution container registry to host its container images.

Production instance: https://registry.fedoraproject.org

Contact information

Owner

Fedora Infrastructure Team

Contact

#fedora-admin

Persons

bowlofeggs cverna puiterwijk

Servers
  • oci-candidate-registry01.iad2.fedoraproject.org

  • oci-candidate-registry01.stg.iad2.fedoraproject.org

  • oci-registry01.iad2.fedoraproject.org

  • oci-registry01.stg.iad2.fedoraproject.org

  • oci-registry02.iad2.fedoraproject.org

Purpose

Serve Fedora’s container images

Configuring all nodes

Run this command from the ansible checkout to configure all nodes in production:

$ sudo rbac-playbook groups/oci-registry.yml

Upgrades

Fedora infrastructure uses the registry packaged and distributed with Fedora. Thus, there is no special upgrade procedure - a simple dnf update will do.

System architecture

The container registry is hosted in a fairly simple design. There are two hosts that run Docker Distribution to serve the registry API, and these hosts are behind a load balancer. These hosts will respond to all requests except for requests for blobs. Requests for blobs will receive a 302 redirect to https://cdn.registry.fedoraproject.org, which is a caching proxy hosted by CDN 77. The primary goal of serving the registry API ourselves is so that we can serve the container manifests over TLS so that users can be assured they are receiving the correct image blobs when they retrieve them. We do not rely on signatures since we do not have a Notary instance.

The two registry instances are configured not to cache their data, and use NFS to replicate their shared storage. This way, changes to one registry should appear in the other quickly.

Troubleshooting

Logs

You can monitor the registry via the systemd journal:

sudo journalctl -f -u docker-distribution

Running out of disk space

We have a niagos check that monitors the available disk space on /srv/registry. An ansible playbook is available to reclaim some disk space if needed:

sudo rbac-playbook manual/oci-registry-prune.yml

This will delete all the images that are older than 30 days on the candidate registries (prod and stg) and then run the garbage collection on the registries server.