MirrorManager Infrastructure SOP

MirrorManager manages mirrors for the Fedora distribution.

Contact Information

Owner

Fedora Infrastructure Team

Contact

#fedora-admin, sysadmin-main, sysadmin-web

Servers

Hosted in OpenShift

Mirrorlist Servers

Docker container on the proxy servers

Purpose

Manage mirrors for Fedora distribution

Description

MirrorManager handles our mirroring system. It keeps track of the list of valid mirrors and hands out metalink URLs that point end users to mirrors to download packages from.

Everything runs in OpenShift. There is a cron job to scan the master mirror (NFS mounted at /srv) using the mm2_update-master-directory-list script (umdl) for changes. Changed directories are detected by comparing the ctime to the value in the database.
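The ctime comparison can be illustrated with a few lines of shell. This is only a sketch of the idea, not the umdl implementation: dir_changed is a hypothetical helper, the stored value is passed as an argument instead of being read from the database, and GNU stat is assumed.

```shell
# Hypothetical helper, not part of MirrorManager.
# Succeeds (exit 0) when the directory's ctime differs from the stored value.
# Assumes GNU coreutils stat, where %Z is the ctime in seconds since the epoch.
dir_changed() {
    local dir stored_ctime current_ctime
    dir="$1"
    stored_ctime="$2"   # stands in for the value kept in the database
    current_ctime=$(stat -c %Z "$dir") || return 2
    [ "$current_ctime" != "$stored_ctime" ]
}
```

A call such as dir_changed /srv/pub/fedora 1700000000 would then report whether that directory needs to be rescanned; both the path and the timestamp here are made-up values.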

There are also jobs (the crawlers) that compare the content on the mirrors with the results from umdl using rsync, HTTP, or HTTPS. The crawler schedule can be viewed in the vars/apps/mirrormanager.yml file in Ansible.

If the content on a mirror is the same as on the master, that mirror is included in the dynamic metalink/mirrorlist.

An hourly job generates a binary file which contains the information about the state of each mirror. This file is used by the mirrorlist containers on the proxy servers to dynamically generate the metalink/mirrorlist for each client individually.

The frontend deployment runs the web interface to manipulate the mirrors. Each mirror-admin can only change the details of the associated mirror. Members of the FAS group sysadmin-web can see and change all existing mirrors.

The mirrorlist provided by the frontend is not actively consumed by clients and is therefore heavily cached (12h). It is only used to give an overview of existing mirrors.

Additionally the frontend provides

The frontend is also used for report_mirror check-ins. This is used by mirrors to report their status independent of the crawlers.

Release Preparation

MirrorManager should automatically detect the new release version, and will create a new Version() object in the database. This is visible on the Version page in the web UI, and on https://mirrormanager.fedoraproject.org/.

If the versioning scheme changes, it’s possible this will fail. If so, contact the Mirror Wrangler.

Move to Archive

Once the files of an EOL release have been copied to the archive directory tree and enough mirrors have picked the files up at the archive location, the paths in MirrorManager’s database need to be adapted. There is a playbook for this:

$ rbac-playbook -v /srv/web/infra/ansible/playbooks/manual/mirrormanager/move-to-archive.yml --extra-vars="product='EPEL' version='7'"

mirrorlist containers and mirrorlist servers

At :55 past each hour, a job generates a binary file with all the current MirrorManager information in it and syncs it to the proxies and mirrorlist servers. Each proxy accepts requests to mirrors.fedoraproject.org on Apache, then uses HAProxy to determine which backend will reply. There are 2 containers defined on each proxy: mirrorlist1 and mirrorlist2. HAProxy will look for those first, then fall back to any of the mirrorlist servers defined over the VPN.

At :15 after the hour, a script runs on all proxies: /usr/local/bin/restart-mirrorlist-containers. This script starts the mirrorlist2 container and makes sure it can process requests. If it can, the script restarts the mirrorlist1 container with the new pkl data; if not, mirrorlist1 keeps running with the old data. During this process at least one container is always processing requests (with the mirrorlist servers as backup), so users see no issues.
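The restart order described above can be sketched as follows. This is not the content of restart-mirrorlist-containers: rolling_restart is a hypothetical function, the health check is supplied by the caller because the real check is not documented here, and the SYSTEMCTL variable exists only so the sketch can be exercised without touching real units.

```shell
# Hypothetical sketch of the restart order; not the real script.
# SYSTEMCTL can be overridden for a dry run (an assumption for illustration).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# $1: a command that succeeds when mirrorlist2 can process requests.
rolling_restart() {
    local health_check
    health_check="$1"
    # Bring up mirrorlist2 first so it loads the new pkl data.
    "$SYSTEMCTL" start mirrorlist2 || return 1
    if "$health_check"; then
        # mirrorlist2 answers requests, so mirrorlist1 can safely be restarted.
        "$SYSTEMCTL" restart mirrorlist1
    else
        # Leave mirrorlist1 alone so it keeps serving the old data.
        echo "mirrorlist2 failed its check; mirrorlist1 left untouched" >&2
        return 1
    fi
}
```

The point of this ordering is that mirrorlist1 is only touched after mirrorlist2 has proven it can serve the new data, so a bad pkl file never takes down both containers.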

The mirrorlist containers log to /var/log/mirrormanager/mirrorlist{1|2}/ on the host proxy server.

Troubleshooting and Resolution

Regenerating the Publiclist

On os-control01:

oc -n mirrormanager create job --from=cj/update-mirrorlist-cache update-mirrorlist-cache-manual

This command generates a new mirrorlist file and transfers it to the proxies. The mirrorlist containers on the proxies are restarted 15 minutes after each full hour.

The mirrorlist generation can take up to 20 minutes. The logs can be viewed with:

oc -n mirrormanager logs -f job/update-mirrorlist-cache-manual

Once done, the job should be deleted from OpenShift with:

oc -n mirrormanager delete job/update-mirrorlist-cache-manual
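Because the proxies only reload the file at :15, a regenerated file does not go live immediately. A small sketch of the waiting time, assuming the file has already reached the proxy; minutes_until_restart is a hypothetical helper, not part of MirrorManager:

```shell
# Hypothetical helper: minutes from the given minute-of-hour until the
# next :15 container restart on the proxies.
minutes_until_restart() {
    local minute
    minute="${1#0}"        # drop one leading zero so "08" is not read as octal
    minute="${minute:-0}"  # a bare "0" becomes empty above; restore it
    echo $(( (15 - $minute + 60) % 60 ))
}

# Example: minutes_until_restart "$(date +%M)"
```

For the regular :55 run this gives 20 minutes, which matches the gap between cache generation and container restart described in this document.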

If a faster solution is required the mirrorlist file from the previous run is available at:

/var/lib/mirrormanager/old/mirrorlist_cache.pkl

Updating the mirrorlist containers

The container used for mirrorlists is the mirrormanager2-mirrorlist container in Fedora dist git: https://src.fedoraproject.org/cgit/container/mirrormanager2-mirrorlist.git/

The image in use is defined in an Ansible variable in roles/mirrormanager/mirrorlist_proxy/defaults/main.yml (TODO: This file no longer exists, find the new place where this is defined), which is in turn used in the systemd unit files for mirrorlist1 and mirrorlist2. To update the container used, update this variable, run the playbook, and then restart the mirrorlist1 and mirrorlist2 containers on each proxy. Note that this may take a while the first time, as the image has to be downloaded from our registry.

Debugging problems with mirrorlist container startup

Sometimes on boot some hosts won’t be properly serving mirrorlists. This is due to a container startup issue. As root, run docker ps -a to list all containers, including stopped ones. The failed container will usually show a status like 'exited(1)'. Record the container id and run docker rm --force <containerid>, then run docker ps -a again and confirm nothing shows. Finally, run systemctl start mirrorlist1, which should start mirrorlist1 correctly.
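The manual steps above could be wrapped in a small function. This is a sketch under assumptions, not an existing tool: recover_mirrorlist1 is a hypothetical name, it removes every exited container on the host rather than one recorded id, and the DOCKER and SYSTEMCTL variables exist only so the sequence can be tried with stand-in commands.

```shell
# Hypothetical wrapper around the manual recovery steps; run as root.
# DOCKER and SYSTEMCTL can be overridden for a dry run (illustration only).
DOCKER="${DOCKER:-docker}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

recover_mirrorlist1() {
    local ids id
    # -a includes stopped containers, -q prints IDs only;
    # the filter keeps only containers in the "exited" state.
    ids=$("$DOCKER" ps -aq --filter status=exited) || return 1
    for id in $ids; do
        "$DOCKER" rm --force "$id" || return 1
    done
    # With the stale containers gone, the unit can create a fresh one.
    "$SYSTEMCTL" start mirrorlist1
}
```

Checking docker ps -a by hand before and after is still worthwhile, since the function removes all exited containers and not just the mirrorlist ones.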

General debugging for mirrorlist containers

Docker commands like docker ps -a show a fair bit of information. Also, systemctl status mirrorlist1/2 or the journal should have information when a container is failing.