{ "uri": "at://did:plc:44ybard66vv44zksje25o7dz/com.whtwnd.blog.entry/3lo7a2a4qxg2l", "cid": "bafyreicoezwj5p4lh6xj37274xyjxwf5jzrfsyv345bedxir4ozjj3gejy", "value": { "$type": "com.whtwnd.blog.entry", "theme": "github-light", "title": "A Full-Network Relay for $34 a Month", "content": "This is an update to a [Summer 2024 blog post](https://whtwnd.com/bnewbold.net/3kwzl7tye6u2y). At the time, atproto relays required a cache of the full network on local disk to validate data structures. With the [Sync v1.1](https://github.com/bluesky-social/proposals/tree/main/0006-sync-iteration) updates, relays don't need all that disk I/O. What impact does that have on hosting setup and operating costs?\n\nTurns out the dev community beat me to the punch and folks like [@futur.blue](https://whtwnd.com/futur.blue/3lkubavdilf2m) and [@bad-example](https://bsky.app/profile/bad-example.com/post/3lnykcasepc2c) have been doing this for months on Rapsberry Pis and $19/month VPS servers in European jurisdiction.\n\nBut i'm still really psyched at how much simpler things have become, so I set up my own $34/month VPS demo instance, and this is the write-up.\n\nThe relay describe here is running at: <https://relay-vps.demo.bsky.dev>\n\n\n## Let's Go Shopping\n\nWhat kind of a server do we need? As of April 2025, the atproto network firehose throughput hits a peak of around 600 events/second on a random day. In the past we have seen sustained rates of 2000 events/sec, and it is nice to have some headroom.\n\nOn our (large) production servers, the relay has been running steady-state with a couple CPU cores utilized, and up to 12 GByte or so of RAM. Memory usage starts low and creeps up; it is mostly consumed by identity caching, which can be configured down.\n\nWe see about 30 Mbps of sustained bandwidth for a firehose WebSocket. On the ingest side, that is the sum of all the PDS connections; on the output side, it is a single unified socket (one per consumer). Some overhead is good, both for traffic and recovering from downtime, so i'd recommend looking for a server with 200 Mbps unmetered.\n\nDisk used to be the main consideration, but now it isn't. The \"backfill window\" or \"replay window\" uses the most storage, but can be configured to reduce size (eg, 1hr or 12hr instead of the 72hr default). Disk utilizaiton doesn't grow unbounded, it just scales as a function of network traffic in the past few hours or days. It is still nice to have a fast disk, but a few hundred GByte are enough today.\n\nI checked prices on a couple popular providers, and found a good deal on a VPS instance in the USA. It costs $30/month with no setup fee; with tax that came out to about $34/month. It has 8 vCPU, 16 GB RAM, 160 GB of disk, and an unmetered 1 Gbps network connection. This isn't even a bare-bones provider, and there are comparable options available in many regions and jurisdictions. 
## Provisioning and Setup

I did a basic install with Ubuntu 24.04, got my SSH keys and `sudo` set up, and configured DNS.

Some of the commands I ran:

```
apt update
apt upgrade
apt install ripgrep fd-find dstat htop iotop iftop pg-activity httpie caddy golang postgresql yarnpkg build-essential

# set up yarn command; could also have used nvm
ln -s /usr/bin/yarnpkg /usr/bin/yarn

# set the hostname
hostnamectl hostname relay-example.demo.bsky.dev

# punch holes in default firewall for HTTP/S
ufw allow 80/tcp
ufw allow 443/tcp

# create data directories
mkdir -p /data/relay
mkdir -p /data/relay/persist
chown ubuntu:ubuntu /data/relay/
chown ubuntu:ubuntu /data/relay/persist
```

You can create random passwords with OpenSSL: `openssl rand -base64 24`

Next I set up PostgreSQL (ran these commands after connecting with `sudo -u postgres psql`):

```
CREATE USER relay WITH PASSWORD 'CHANGEME';

CREATE DATABASE relay;
GRANT ALL PRIVILEGES ON DATABASE relay TO relay;

-- these are needed for newer versions of postgres
\c relay postgres
GRANT ALL ON SCHEMA public TO relay;
```

I didn't bother with any other PostgreSQL or system tuning, just left it all default.

Caddy is great as a simple reverse proxy. It does auto-TLS certificates and handles WebSockets by default. You can probably get this going with certbot and haproxy or nginx as well.

Create a system-wide Caddy config at `/etc/caddy/Caddyfile`, replacing any existing file. Substitute in your hostname:

```
relay-example.demo.bsky.dev {
    reverse_proxy 127.0.0.1:2470
}
```

Then I cloned and built the project in the home directory of the `ubuntu` user:

```
# pull source code and build. if you had patches or a working branch, you would apply them here
git clone https://github.com/bluesky-social/indigo
cd indigo
make build-relay-admin-ui build
```

A nifty thing with modern Go is that it will auto-download newer toolchains if needed, so you shouldn't need to bother with installing a specific version.

For a demo, I'm just going to run this thing directly in a `screen` session, though you could also poke around in the git repo and find a Dockerfile.


## Configuration and Bootstrap

You can see what configuration variables are available with `./relay -h` and `./relay serve -h`. I recommend creating a `.env` file to store the config.

Here is what I put in the `.env` (obviously changing the passwords as needed):

```
RELAY_ADMIN_PASSWORD=CHANGEME
DATABASE_URL=postgres://relay:CHANGEME@localhost:5432/relay
RELAY_PERSIST_DIR=/data/relay/persist
RELAY_TRUSTED_DOMAINS=*.host.bsky.network
ENVIRONMENT=demo

# configure a short 2 hour backfill/replay window
RELAY_REPLAY_WINDOW=2h

# keep this 'true' during the Sync v1.1 protocol transition, then remove (eg, late summer 2025)
RELAY_LENIENT_SYNC_VALIDATION=true

# this line is for 'goat'
ATP_RELAY_HOST=http://localhost:2470
```

Indigo comes with the `goat` command-line tool, which is helpful for administering and poking around. It can share the `.env` file as long as you run it from the same directory as the relay.

At this point, things should be ready to run! You can give it a try with either of:

```
# from built executable
./relay serve

# re-build on every run
go run ./cmd/relay serve
```
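If you'd rather have the process supervised across reboots than left in a `screen` session, a minimal systemd unit along these lines should work. This is a sketch of my own, not something shipped in the repo, and the paths assume the build directory and `.env` layout from above:

```
# /etc/systemd/system/relay.service -- minimal sketch, adjust paths to taste
[Unit]
Description=atproto relay
After=network-online.target postgresql.service
Wants=network-online.target

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/indigo
EnvironmentFile=/home/ubuntu/indigo/.env
ExecStart=/home/ubuntu/indigo/relay serve
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then enable it with `sudo systemctl daemon-reload && sudo systemctl enable --now relay`.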
Out of the box, the relay doesn't know about any hosts though! You'll probably want to bootstrap it from an existing relay. Shut down the relay (if it was running), and pull in a set of hosts (this could take a few minutes):

```
./relay pull-hosts --relay-host https://relay1.us-west.bsky.network
```

With a full host set, it might take a few minutes for the relay to ramp up after every restart. The main cause of this right now is identity lookups: fetching DID documents from the network. It should be possible soon to set up a local redis instance to make restarts smoother, but it isn't really needed for a demo like this.

That's it! The relay will start consuming from the current cursor offsets of all the hosts (PDS instances); it doesn't do any backfill. The output firehose should be ready to use pretty quickly.


## Exploring and Administering

The relay comes with a web UI, which you can reach at `https://<hostname>/dash` (you'll need to put in the admin password).

You can also use the `goat relay admin` commands to do account and host takedowns, ban entire domain suffixes, tweak limits and quotas, etc.

Here are some example commands for poking around:

```
goat relay host diff https://relay1.us-west.bsky.network https://relay-example.demo.bsky.dev

goat firehose

goat relay host add pds.example.com

goat relay admin host list

goat firehose | pv -l -i10 > /dev/null

goat firehose --verify-basic --verify-sig --verify-mst -q
```

The relay exports Prometheus metrics on a separate port:

```
http get :2471/metrics
```


## Resource Use

My demo relay isn't using much CPU:

```
# uptime

03:10:30 up 8 days, 21:39, 2 users, load average: 1.17, 1.11, 1.07
```

```
# dstat

----total-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
  5   2  92   0   0|   0  3510k|1323k   53k|   0     0 |8480    13k
  5   1  91   1   0|   0  3989k|1723k   66k|   0     0 |9132    14k
  6   2  90   0   0|   0  3048k|1958k   53k|   0     0 | 10k    16k
 15   3  79   1   0|   0  8927k|4516k   92k|   0     0 | 17k    26k
  7   2  89   1   0|   0  4392k|1973k   57k|   0     0 | 11k    18k
  8   2  88   1   0|   0  5193k|2376k   65k|   0     0 | 13k    21k
  7   1  89   1   0|   0  4199k|2113k   60k|   0     0 | 12k    19k
  6   2  89   1   0|   0  4688k|1943k   61k|   0     0 | 12k    19k
  7   2  90   1   0|   0  3968k|1932k   55k|   0     0 | 11k    17k
  6   2  89   1   0|   0  4702k|2047k   61k|   0     0 | 11k    18k
  7   2  88   1   0|   0  4552k|1987k   93k|   0     0 | 13k    20k
```

```
# pg_activity

PostgreSQL 16.8 - relay-vps.demo.bsky.dev - postgres@/var/run/postgresql:5432/postgres - Ref.: 2s
 * Global: 7 days, 19 hours and 19 minutes uptime, 1.73G dbs size - 0B/s growth, 95.18% cache hit ratio
   Sessions: 41/100 total, 1 active, 40 idle, 0 idle in txn, 0 idle in txn abrt, 0 waiting
   Activity: 766 tps, 0 insert/s, 536 update/s, 0 delete/s, 1316 tuples returned/s, 0 temp files, 0B tem
 * Worker processes: 0/8 total, 0/4 logical workers, 0/8 parallel workers
   Other processes & info: 0/3 autovacuum workers, 0/10 wal senders, 0 wal receivers, 0/10 repl. slots
 * Mem.: 15.25G total, 323.92M (2.07%) free, 2.06G (13.48%) used, 12.88G (84.44%) buff+cached
   Swap: 0B total, 0B (-) free, 0B (-) used
   IO: 0/s max iops, 0B/s - 0/s read, 0B/s - 0/s write
   Load average: 1.31 1.12 1.08
```

```
# sudo du -sh /data/relay /var/log /var/lib/postgresql/

21G   /data/relay
114M  /var/log
2.4G  /var/lib/postgresql/
```

```
# df -h /

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       154G   30G  125G  19% /
```
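One last sanity check is to point a consumer at the new relay's output firehose. `goat firehose` does this (as shown above), but it is also only a few lines by hand. Here is a minimal sketch in Go that connects to the standard `com.atproto.sync.subscribeRepos` endpoint and counts frames; it assumes the `github.com/gorilla/websocket` dependency and leaves the DAG-CBOR frames undecoded:

```
// Minimal firehose consumer sketch: count frames per second from the relay.
// Frames are raw DAG-CBOR (header + body); decoding them is out of scope here.
package main

import (
	"log"
	"time"

	"github.com/gorilla/websocket"
)

func main() {
	url := "wss://relay-vps.demo.bsky.dev/xrpc/com.atproto.sync.subscribeRepos"
	conn, _, err := websocket.DefaultDialer.Dial(url, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	count := 0
	start := time.Now()
	for {
		// read and discard one firehose frame
		if _, _, err := conn.ReadMessage(); err != nil {
			log.Fatal(err)
		}
		count++
		if count%1000 == 0 {
			rate := float64(count) / time.Since(start).Seconds()
			log.Printf("%d frames, %.0f/sec", count, rate)
		}
	}
}
```

The reported rate should hover around the event rates discussed at the top of this post.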