How To Host an Arch Linux Mirror

So, you want to host an Arch Mirror? Let’s do it.

Requirements

Before starting, make sure you have at least 100 GiB of free disk space (150 GiB is recommended) and a reliable network connection.
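
If you want to check the disk space requirement from a script, something like the following could work. It is only a sketch: has_free_gib is a made-up helper name, and /srv is assumed because the mirror is stored under /srv/mirrors later in this guide.

```shell
#!/bin/sh
# Succeeds when the filesystem holding PATH has at least MIN_GIB GiB available.
# has_free_gib is a hypothetical helper, not part of any Arch tooling.
has_free_gib() {
    path=$1
    min_gib=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kib=$(df -Pk "$path" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kib" ] && [ "$avail_kib" -ge $((min_gib * 1024 * 1024)) ]
}

# Example: check /srv (assumed parent of the mirror directory) against the 100 GiB minimum.
if has_free_gib /srv 100; then
    echo "enough space for the mirror"
else
    echo "less than 100 GiB available on /srv (or /srv does not exist)" >&2
fi
```

Change the second argument to 150 to test against the recommended size instead of the minimum.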

Finding a mirror

Here’s a list of all the Tier 1 mirrors for Arch Linux. Sort it by mirror score (lower is better) and pick a mirror close to you that supports rsync. Make sure its Completion is 100%, or the mirror might be unreliable.

A note about Mirror Tiering

  • Arch Linux uses a 2-tier mirroring scheme.
  • All Tier 1 mirrors sync from archlinux.org.
  • All Tier 2 mirrors sync from Tier 1 mirrors.
  • Users can choose either Tier 1 or Tier 2 mirrors.

Syncing to a mirror

mkdir -p /srv/mirrors
rsync -rlptH --safe-links \
    --delete-delay \
    --delay-updates \
    --info=progress2 \
    --exclude='archive' \
    --exclude='other' \
    --exclude='sources' \
    --exclude='pool/*-debug' \
    rsync://mirror-url \
    /srv/mirrors/archlinux

Command Breakdown

  • r: Recursive - Transfer directories and their contents.
  • l: Copy symlinks as symlinks.
  • p: Preserve permissions.
  • t: Preserve modification times.
  • H: Preserve hard links.
  • --safe-links: Ignore symlinks that point outside the copied tree.
  • --delete-delay: Delay file deletions until the transfer is complete.
  • --delay-updates: Create updated files in a temporary location before replacing existing ones.
  • --exclude: Exclude specific files or directories from syncing
  • --info=progress2: Show progress of the transfer, including the current file and overall progress.
  • Source: rsync://mirror-url (a placeholder for the rsync URL of the Tier 1 mirror you picked)
  • Destination: /srv/mirrors/archlinux
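
Before the first real sync, it can help to preview what rsync would do. Here is a minimal sketch that wraps the options from the breakdown in a function. sync_mirror is a hypothetical name, and rsync://mirror-url is still a placeholder. --dry-run and --itemize-changes are standard rsync options that list the planned changes without touching the destination.

```shell
#!/bin/sh
# Hypothetical wrapper around the sync command from this guide.
# Usage: sync_mirror SOURCE DEST [extra rsync options...]
sync_mirror() {
    src=$1
    dest=$2
    shift 2
    # Extra options such as --dry-run or --info=progress2 are passed through via "$@".
    rsync -rlptH --safe-links \
        --delete-delay \
        --delay-updates \
        --exclude='archive' \
        --exclude='other' \
        --exclude='sources' \
        --exclude='pool/*-debug' \
        "$@" "$src" "$dest"
}

# Preview only (nothing is written):
#   sync_mirror rsync://mirror-url /srv/mirrors/archlinux --dry-run --itemize-changes
# Real sync with a progress display:
#   sync_mirror rsync://mirror-url /srv/mirrors/archlinux --info=progress2
```

Keeping the options in one function means the manual sync and any scheduled sync cannot drift apart.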

Serving the mirror

To serve our new mirror over both HTTP and HTTPS, we will be using the Caddy web server. Why Caddy? It’s simple.

If you have a domain, I suggest using Cloudflare to manage it. If you don’t have one, you may be able to get a free domain through the GitHub Student Developer Pack.

Cloudflare Tunnel Token

Head over to Cloudflare Zero Trust and create a new tunnel.

  1. Go to Network > Tunnels > Create a tunnel, then copy the token and paste it into your .env file.
  2. Set the service to http://caddy:9001.
  3. Choose the domain where your mirror will be served.

compose.yaml

services:
  caddy:
    image: serfriz/caddy-cloudflare:2
    container_name: caddy
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - /srv:/srv
      - caddy-data:/data
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    environment:
      - MIRROR_ROOT=$MIRROR_ROOT
    restart: unless-stopped
  tunnel:
    container_name: cloudflared-tunnel
    image: cloudflare/cloudflared
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=$TUNNEL_TOKEN
    restart: unless-stopped
volumes:
  caddy-data:
    external: true

Caddyfile

:9001 {
    root * {env.MIRROR_ROOT}
    file_server browse
}

.env

TUNNEL_TOKEN=<your-tunnel-token>
MIRROR_ROOT=/srv/mirrors

Running the server

docker volume create caddy-data
docker compose up -d
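
Once the containers are up, a quick way to check the setup is to request a file that pacman would fetch. The sketch below builds the URL of a repository database. repo_db_url is a hypothetical helper and mirror.example.com stands in for whatever domain you chose for the tunnel. Because MIRROR_ROOT is /srv/mirrors, the synced tree should appear under /archlinux.

```shell
#!/bin/sh
# Build the URL of a repository database in the standard mirror layout:
#   <base>/<repo>/os/<arch>/<repo>.db
# repo_db_url is a hypothetical helper name.
repo_db_url() {
    base=${1%/}   # strip one trailing slash, if present
    repo=$2
    arch=$3
    printf '%s/%s/os/%s/%s.db\n' "$base" "$repo" "$arch" "$repo"
}

# mirror.example.com is a placeholder; replace it with your own domain.
url=$(repo_db_url "https://mirror.example.com/archlinux" core x86_64)
echo "$url"

# Uncomment to request only the response headers.
# -f makes curl exit non-zero on HTTP errors such as 404.
# curl -fsSI "$url"
```

If the curl request fails, check that the first sync has finished and that the tunnel service points at http://caddy:9001.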

Keeping in Sync

To qualify as a Tier 2 mirror, we need to sync at least once a day from an upstream Tier 1 mirror. We are also expected to sync on a random minute, so that requests from different mirrors are spread out.
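
One way to check the once-a-day requirement is to look at how old the mirror's timestamp file is. This sketch assumes the synced tree contains a lastsync file holding a single Unix timestamp, which Arch mirrors normally publish at their root. sync_age_seconds and is_fresh are hypothetical helper names.

```shell
#!/bin/sh
# Print how many seconds ago the timestamp stored in FILE was written.
# FILE is assumed to hold a single Unix timestamp.
sync_age_seconds() {
    file=$1
    stamp=$(cat "$file") || return 1
    now=$(date +%s)
    echo $((now - stamp))
}

# Succeeds when the timestamp in FILE is at most MAX_HOURS old.
is_fresh() {
    file=$1
    max_hours=$2
    age=$(sync_age_seconds "$file") || return 1
    [ "$age" -le $((max_hours * 3600)) ]
}

# Example (path follows the destination used earlier in this guide):
#   is_fresh /srv/mirrors/archlinux/lastsync 24 || echo "mirror is more than a day behind" >&2
```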

So, how do we stay in sync? Cron. Cron is a job scheduler that runs commands on a schedule you define. Your scheduled tasks are stored in a crontab. To edit it, run crontab -e and paste the following.

0 */2 * * * /bin/bash -c 'sleep $((RANDOM \% 3600))' && rsync -rlptH --safe-links --delete-delay --delay-updates --info=progress2 --exclude="archive" --exclude="other" --exclude="sources" --exclude="pool/*-debug" rsync://mirror-url /srv/mirrors/archlinux

What is this alien language, you ask? Let’s break it down.

Cron Schedule Breakdown

  • 0: The minute field. Here it is 0, which means the job starts at the top of the hour.
  • */2: The hour field. The job runs every second hour (00:00, 02:00, 04:00, and so on).
  • *: The day-of-month field. The wildcard means every day of the month.
  • *: The month field. The wildcard means every month.
  • *: The day-of-week field. The wildcard means every day of the week.
  • /bin/bash -c 'sleep $((RANDOM \% 3600))': Sleep for a random 0 to 3599 seconds (just under an hour), so the sync starts at a random offset instead of exactly on the hour. Bash is called explicitly because $RANDOM is a Bash feature, and the % is written as \% because cron treats an unescaped % as a newline.
  • And the last part is our sync command!
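
Since the job fires every two hours and a large sync can take longer than that, it may be worth making sure two syncs never overlap. The following is a minimal sketch using a lock directory. run_locked and the lock path are hypothetical, and the exit code 75 is an arbitrary choice to signal "skipped".

```shell
#!/bin/sh
# Run a command only if no other instance holds the lock directory.
# mkdir is atomic, so only one caller can create the directory.
# run_locked is a hypothetical helper name.
run_locked() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another sync is still running, skipping" >&2
        return 75
    fi
    status=0
    "$@" || status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example (placeholder URL and an assumed lock path):
#   run_locked /tmp/arch-mirror-sync.lock \
#       rsync -rlptH --safe-links --delete-delay --delay-updates \
#       rsync://mirror-url /srv/mirrors/archlinux
```

One limitation of this approach is that a killed sync leaves the lock directory behind, and it has to be removed by hand. The flock utility from util-linux is an alternative that releases the lock automatically when the process exits.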

Conclusion

Hosting a FOSS mirror is a great way to give back to the community, but it’s important to remember that Arch Linux is an established project with a wide network of high-speed mirrors worldwide. There are other, smaller projects that could benefit more from additional support. By hosting mirrors for niche distributions and projects, you can have a greater impact.

References