How To Host an Arch Linux Mirror
So, you want to host an Arch Mirror? Let’s do it.
Requirements
Before beginning this journey, make sure you have at least 100 GiB of available disk space and a reliable network connection. 150 GiB is recommended.
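If you would rather check the disk space from a script than by eye, something like the following could work. It is only a sketch: the function names are made up for this example, and /srv is assumed to be the filesystem where the mirror will live.

```shell
#!/bin/sh
# Sketch: check that a filesystem has enough free space for the mirror.
# available_gib and has_enough_space are hypothetical helper names.

# Print the available space, in whole GiB, on the filesystem holding "$1".
available_gib() {
    # -P forces one output line per filesystem; -k reports sizes in KiB.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed when the filesystem holding "$1" has at least "$2" GiB free.
has_enough_space() {
    [ "$(available_gib "$1")" -ge "$2" ]
}

# Example use (commented out so the file can be sourced safely):
# has_enough_space /srv 100 || echo "Less than 100 GiB free on /srv" >&2
```

The value is rounded down to whole GiB, so the check errs on the side of reporting too little space rather than too much.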
Finding a mirror
Here’s a list of all the Tier 1 mirrors for Arch Linux. Sort it by mirror score (lower is better) and pick the mirror closest to you. Make sure its Completion is 100%, or the mirror might be unreliable. Pick a mirror that supports rsync.
A note about Mirror Tiering
- Arch Linux uses a 2-tier mirroring scheme.
- All Tier 1 mirrors sync from archlinux.org.
- All Tier 2 mirrors sync from Tier 1 mirrors.
- Users can choose either Tier 1 or Tier 2 mirrors.
Syncing to a mirror

    mkdir /srv/mirrors
    rsync -rlptH --safe-links \
        --delete-delay \
        --delay-updates \
        --info=progress2 \
        --exclude='archive' \
        --exclude='other' \
        --exclude='sources' \
        --exclude='pool/*-debug' \
        rsync://mirror-url \
        /srv/mirrors/archlinux

Command Breakdown
- r: Recursive. Transfer directories and their contents.
- l: Copy symlinks as symlinks.
- p: Preserve permissions.
- t: Preserve modification times.
- H: Preserve hard links.
- --safe-links: Ignore symlinks that point outside the copied tree.
- --delete-delay: Work out which files to delete during the transfer, but delete them only once it is complete.
- --delay-updates: Put updated files in a temporary location and move them into place at the end of the transfer.
- --exclude: Exclude specific files or directories from syncing.
- --info=progress2: Show the progress of the transfer as a whole rather than per file.
- Source: rsync://mirror-url
- Destination: /srv/mirrors/archlinux
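Because the sync deletes files at the destination that are missing upstream, it can be worth previewing a run before letting it touch the disk. The sketch below wraps the same options in a function and adds rsync's --dry-run and --itemize-changes flags. The name preview_sync is hypothetical, and rsync://mirror-url is still a placeholder for the mirror you picked.

```shell
#!/bin/sh
# Sketch: preview a sync without changing anything on disk.
# preview_sync is a hypothetical helper name.

# Usage: preview_sync <source-url> <destination-dir>
preview_sync() {
    source_url=$1
    dest_dir=$2
    # Same options as the real sync (minus the progress display), plus:
    #   --dry-run          report what would happen, transfer and delete nothing
    #   --itemize-changes  print one line per item that would change
    rsync -rlptH --safe-links \
        --delete-delay \
        --delay-updates \
        --dry-run --itemize-changes \
        --exclude='archive' \
        --exclude='other' \
        --exclude='sources' \
        --exclude='pool/*-debug' \
        "$source_url" "$dest_dir"
}

# Example use (commented out so the file can be sourced safely):
# preview_sync rsync://mirror-url /srv/mirrors/archlinux
```

If the output lists deletions you did not expect, the destination path is probably wrong, and it is cheaper to find that out here than after a real run.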
Serving the mirror
To serve our new mirror over both HTTP and HTTPS, we will use the Caddy web server. Why Caddy? It’s simple.
If you have a domain, I suggest using Cloudflare to manage it. If you don’t have one, you could get a free one through the GitHub Student Developer Pack.
Cloudflare Tunnel Token
Head over to Cloudflare Zero Trust and create a new tunnel.
- Go to Network > Tunnels > Create a tunnel, then copy the token and paste it into your .env file
- Service = http://caddy:9001
- Choose a domain where your mirrors will be served
compose.yaml

    services:
      caddy:
        image: serfriz/caddy-cloudflare:2
        container_name: caddy
        volumes:
          - ./Caddyfile:/etc/caddy/Caddyfile
          - /srv:/srv
          - caddy-data:/data
        ports:
          - "80:80"
          - "443:443"
          - "443:443/udp"
        environment:
          - MIRROR_ROOT=$MIRROR_ROOT
        restart: unless-stopped

      tunnel:
        container_name: cloudflared-tunnel
        image: cloudflare/cloudflared
        command: tunnel run
        environment:
          - TUNNEL_TOKEN=$TUNNEL_TOKEN
        restart: unless-stopped

    volumes:
      caddy-data:
        external: true

Caddyfile

    :9001 {
        root * {env.MIRROR_ROOT}
        file_server browse
    }

.env

    TUNNEL_TOKEN=<your-tunnel-token>
    MIRROR_ROOT=/srv/mirrors

Running the server

    docker volume create caddy-data
    docker compose up -d

Keeping in Sync
To qualify as a Tier 2 mirror, we need to sync at least once a day with an upstream Tier 1 mirror. We are also expected to sync at a random minute, so that our requests are spaced out from those of other mirrors.
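One way to keep an eye on the once-a-day requirement is to look at the timestamp the sync brings down with it. The sketch below assumes the synced tree contains a lastsync file holding a Unix timestamp, as Arch mirrors commonly do. Treat that as an assumption to verify against your own upstream. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: report how stale a local mirror copy is.
# Assumption: the synced tree contains a "lastsync" file holding a Unix
# timestamp (seconds since the epoch).

# Print the age, in whole hours, of the timestamp stored in file "$1",
# measured against the time "$2" (defaults to the current time).
sync_age_hours() {
    last=$(cat "$1") || return 1
    now=${2:-$(date +%s)}
    echo $(( (now - last) / 3600 ))
}

# Succeed when the timestamp in file "$1" is at most "$2" hours old.
is_fresh() {
    age=$(sync_age_hours "$1") || return 1
    [ "$age" -le "$2" ]
}

# Example use (commented out so the file can be sourced safely):
# is_fresh /srv/mirrors/archlinux/lastsync 24 || echo "Mirror is more than a day behind" >&2
```

A check like this could run from its own scheduled job and send you a warning when the mirror falls behind.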
So, how do we stay in sync? CRON.
Cron is a job scheduler that runs commands on a schedule you specify. All your tasks are stored inside a crontab. To edit it, run crontab -e and paste the following.
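
    0 */2 * * * /bin/bash -c 'sleep $((RANDOM \% 3600))' && rsync -rlptH --safe-links --delete-delay --delay-updates --info=progress2 --exclude="archive" --exclude="other" --exclude="sources" --exclude="pool/*-debug" rsync://mirror-url /srv/mirrors/archlinux

Note the backslash before the % sign. Cron turns an unescaped % in a command into a newline and feeds the rest of the line to the command as input, so inside a crontab it has to be written as \%.
What is this alien language, you ask? Let’s break it down.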
Cron Schedule Breakdown
- 0: The minute field. The job runs at minute 0, the start of the hour.
- */2: The hour field. The job runs every 2nd hour, on the hour.
- *: The day-of-month field. The wildcard means every day of the month.
- *: The month field. The wildcard means every month.
- *: The day-of-week field. The wildcard means every day of the week.
- The /bin/bash -c 'sleep ...' part: Sleeps for a random 0 to 3599 seconds (RANDOM modulo 3600), so the sync starts at a random offset within the hour instead of exactly on the hour.
- And the last part is our sync command!
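With a job every two hours and a random delay of up to an hour, a slow sync could still be running when the next one starts. A small sketch of one way to guard against that uses flock(1) from util-linux, which is assumed to be installed. The helper name and the lock file path are hypothetical.

```shell
#!/bin/sh
# Sketch: run a command only if no other instance holds the lock.
# Assumption: flock(1) from util-linux is installed.

# Usage: run_exclusive <lock-file> <command> [arguments...]
# Runs the command while holding an exclusive lock on the lock file.
# If the lock is already held, the command is not run and flock's
# non-zero exit status is returned.
run_exclusive() {
    lock_file=$1
    shift
    # -n: fail immediately instead of waiting for the lock to be released.
    flock -n "$lock_file" "$@"
}

# Example use (commented out so the file can be sourced safely):
# run_exclusive /tmp/arch-mirror-sync.lock rsync -rlptH --safe-links rsync://mirror-url /srv/mirrors/archlinux
```

Used this way, a sync that is still going when the next one fires causes the new run to be skipped instead of doubling up on the same directory.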
Conclusion
Hosting a FOSS mirror is a great way to give back to the community, but it’s important to remember that Arch Linux is an established project with a wide network of high-speed mirrors worldwide. There are other, smaller projects that could benefit more from additional support. By hosting mirrors for niche distributions and projects, you can have a greater impact.