# initial setup
- install `pyinfra` with your favorite package manager

or

- install `pipx` with your favorite package manager
- add `~/.local/bin` to your `PATH`
- `pipx install pyinfra`
# before each use
- communicate your intent to make changes to your co-admins to prevent conflicting access
- run `git pull` to fetch the newest version
- run `pyinfra @local deploy.py` to install/update `0x90.ssh_config` trustmebro
- run `pyinfra --dry inventory.py deploy.py` and check that you are on the same state that is already deployed
# social practices
- maintainers: people who know (next to) everything and would be able to learn the rest
- adepts: people who are still learning about the infrastructure, but don't need to keep everything in mind
- associates: others, who just need to maintain a certain service

Discussions can happen:
- in person (gatherings), which should happen at least every 3-4 months, to discuss the big picture
- in person (coworking), while working on new services
- in issues and PRs for concrete proposals
- in online calls to fix emergencies
- in chat groups for exploring ideas and everything else
## structure of this repository
this repository documents the current state
of the infrastructure.
For each server/VM,
it contains a directory with
- a README.md file which gives an overview of the server
- a pyinfra inventory.py file
- a pyinfra deploy.py file which documents what's installed
- the configuration files pyinfra deploys
- optional: a deploy-restore.py file which can restore data from backup
- optional: other pyinfra deploy files which only manage certain services or tasks, like upgrades
The repository also contains a lib/ directory
with pyinfra packages we reuse across servers.
With pull requests we can propose changes
to the current infrastructure.
PRs need to be approved by at least one maintainer.
The pyinfra code in PRs can already be deployed,
if it is not destructive; decide responsibly.
## create a VM
To add a new VM for a service you want to manage,
0. Check out a new branch with `git checkout -b your-server-name`
1. Add your VM to inventory.py
2. Create a directory for the VM
3. Add your VM to ararat/deploy.py
4. Ask the core team to run `pyinfra ararat.0x90.space ararat/deploy.py`
to create your VM
5. Write your pyinfra deployment script in your-server-name/deploy.py
   (see the sketch after this list)
6. Deploy it; if it doesn't work, change it and repeat until the service works
7. Copy TEMPLATE.md to your-server-name/README.md and fill it out.
You can leave out parts which are obvious from your deploy.py file.
8. Commit your changes, push them to your branch,
open a pull request from your branch to the development branch,
and ask a maintainer to review and merge it
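
A minimal sketch of what steps 1 and 5 could look like;
the host name, file names, and packages below are made-up placeholders,
and the sketch assumes a Debian-based VM managed with systemd,
so the real files in this repository will differ.

```
# inventory.py (step 1): add the new VM as a pyinfra group
your_server_name = ["your-server-name.0x90.space"]
```

```
# your-server-name/deploy.py (step 5): a minimal example deployment
from pyinfra.operations import apt, files, systemd

apt.packages(
    name="Install nginx",
    packages=["nginx"],
    update=True,
)

files.put(
    name="Upload the nginx site config",
    src="your-server-name/nginx.conf",  # hypothetical config file in this repo
    dest="/etc/nginx/nginx.conf",
)

systemd.service(
    name="Enable and start nginx",
    service="nginx",
    running=True,
    enabled=True,
)
```

Deploy it with `pyinfra inventory.py your-server-name/deploy.py`,
or add `--dry` first to preview the changes.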
## tools we use
The hope is that you don't need to know all of these tools
to already do useful things,
but can systematically dive deeper into the infrastructure.
### pass
password manager to store passphrases and secrets.
The repository with our secrets
is at <https://git.0x90.space/missytake/0x90-secrets> for now.
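
As an illustration, a deploy.py can read a secret from pass at deploy time
and feed it into a template;
this is only a sketch, and the pass entry, template, and variable names are made up.

```
# sketch: read a secret from pass on the local machine during a deploy
import subprocess

from pyinfra.operations import files

# "0x90/example-server/db-password" is a made-up pass entry name
db_password = subprocess.run(
    ["pass", "show", "0x90/example-server/db-password"],
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

files.template(
    name="Render the app config with the secret",
    src="example-server/app.conf.j2",  # hypothetical template in this repo
    dest="/etc/app/app.conf",
    db_password=db_password,  # available as {{ db_password }} in the template
)
```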
### ssh
to connect to servers and VMs as root@,
no sudo.
root should have a password set,
but password authentication over SSH should be forbidden.
There should be no shared SSH keys:
one SSH key per person.
SSH private keys should be password-protected
and only stored on laptops
with hard disk encryption.
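
A sketch of how the "no password login over SSH" rule could be enforced with pyinfra,
assuming a host with systemd and a service unit called sshd
(on Debian the unit may be called ssh instead):

```
# sketch: forbid SSH password authentication and restart the daemon
from pyinfra.operations import files, systemd

files.line(
    name="Forbid SSH password login",
    path="/etc/ssh/sshd_config",
    line=r"^#?\s*PasswordAuthentication\b.*",
    replace="PasswordAuthentication no",
)

systemd.service(
    name="Restart sshd to apply the config",
    service="sshd",  # assumption: the unit is named sshd on this host
    restarted=True,
)
```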
### systemctl & journalctl
to look at status and log output of services.
systemd is a good way of keeping services running,
at least on Linux machines.
On OpenBSD we will use /etc/rc.d/ scripts.
### git
for updating the documentation,
pushing and pulling secrets,
and opening PRs to documentation/pyinfra repos.
to be discussed:
- Keep in mind that PRs can and will be deployed to servers. OR
- The main branch should always reflect the state of the machine.
### markdown + sembr
for documenting the infrastructure.
[Semantic line breaks](https://sembr.org/) are great
for formatting text files
which are managed in git.
### kvm + virsh
as a hypervisor
which we can use to create VMs
for specific services.
The hypervisor is a minimal Alpine Linux
with "boot to RAM";
the data partition for the VM images is encrypted.
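
As a rough illustration of how a VM could be created through pyinfra on the hypervisor
(this is not necessarily how ararat/deploy.py does it;
the VM name, sizes, bridge, and ISO path are all made up):

```
# sketch: create a VM with virt-install, wrapped in a pyinfra operation
from pyinfra.operations import server

server.shell(
    name="Create the your-server-name VM if it does not exist yet",
    commands=[
        "virsh dominfo your-server-name || "
        "virt-install --name your-server-name "
        "--memory 2048 --vcpus 2 "
        "--disk path=/var/lib/libvirt/images/your-server-name.qcow2,size=20 "
        "--cdrom /var/lib/libvirt/images/alpine-virt.iso "
        "--os-variant generic --network bridge=br0 --noautoconsole",
    ],
)
```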
### pyinfra
as a nice declarative config tool for deployment.
We can also maintain some of the things we need
in extra Python modules.
To be discussed: pyinfra vs. Ansible still needs investigation;
currently there is an Ansible setup on golem,
while pyinfra is used in deltachat and one ezra service.
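
A sketch of what a reusable module in lib/ could look like,
using pyinfra's `@deploy` decorator;
the module, function, and package names are made-up examples.

```
# lib/base.py -- sketch of a reusable deploy shared across servers
from pyinfra.api import deploy
from pyinfra.operations import apt, server


@deploy("Base setup shared by Debian VMs")
def base_setup():
    apt.packages(
        name="Install basic admin tools",
        packages=["vim", "htop", "curl"],
        update=True,
    )
    server.shell(
        name="Set the timezone to UTC",
        commands=["timedatectl set-timezone UTC"],
    )
```

A server's deploy.py can then import and call `base_setup()`,
provided lib/ is importable (e.g. on PYTHONPATH or installed as a package).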
### podman
to isolate services in rootless containers.
A podman container should be managed by a systemd unit.
it takes some practice to understand
how to run commands inside a container
or where the files are mounted.
But it goes well with pyinfra
if it's managed in systemd.
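
A sketch of that pattern, assuming a systemd host;
the unit file, container image, and names are made up,
and a real unit needs more options (user, volumes, environment):

```
# sketch: run a podman container as a systemd service, deployed via pyinfra
from pyinfra.operations import files, systemd

files.put(
    name="Upload the example-bot systemd unit",
    # hypothetical unit file in this repo; its ExecStart would contain
    # something like: /usr/bin/podman run --rm --name example-bot <image>
    src="example-server/example-bot.service",
    dest="/etc/systemd/system/example-bot.service",
)

systemd.service(
    name="Enable and start example-bot",
    service="example-bot",
    running=True,
    enabled=True,
    daemon_reload=True,  # pick up the new/changed unit file
)
```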
### nftables
as a declarative firewall
which can be managed in pyinfra.
### nginx
as an HTTPS reverse proxy,
passing traffic on to the podman containers.
### acmetool
as a tool to manage Let's Encrypt certificates,
which goes well with pyinfra
because of its declarative nature.
It also ships acmetool-redirector
which redirects HTTP traffic on port 80
to nginx on port 443.
There is a pyinfra package for it at
https://github.com/deltachat/pyinfra-acmetool/
On OpenBSD, <https://man.openbsd.org/acme-client> + <https://man.openbsd.org/relayd> are used instead.
### cron
to schedule recurring tasks,
like acmetool's certificate renewals
or the nightly borgbackup runs.
On OpenBSD, there is already a daily cron job that executes /etc/daily.local.
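
A sketch of how such a job could be declared with pyinfra on a Linux host;
the script path and schedule are made-up placeholders:

```
# sketch: nightly cron job managed by pyinfra
from pyinfra.operations import server

server.crontab(
    name="Run the nightly backup",
    command="/usr/local/bin/run-backup.sh",  # hypothetical backup wrapper script
    user="root",
    minute="30",
    hour="3",
)
```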
### borgbackup
can be used to back up application data
in a nightly cron job.
Backups need to be stored at an extra backup server.
There is a pyinfra package for it at
https://github.com/deltachat/pyinfra-borgbackup/
To be discussed: we might also look at restic, where append-only backups can be better restricted.
### wireguard
as a VPN to connect the backup server,
which can be at some private house,
with the production servers.
### prometheus
as a tool to measure service uptime
and track typical errors
from journalctl output.
It can expose metrics via HTTPS
behind basic auth.
### grafana
as a visual dashboard to show service uptime
and whether services throw errors.
It can also send out email alerts.
### team-bot
a deltachat bot to receive support requests
and email alerts from grafana.
# Set up Alpine on Hetzner
This was only tested with a cloud VPS so far.
Source: <https://gist.github.com/c0m4r/e38d41d0e31f6adda4b4c5a88ba0a453>
(but it's less of a hassle than described there)
To create an Alpine server on Hetzner,
you first need to create a Debian VPS or something similar.
Then you boot into the rescue system.
Get the download link of the latest VIRTUAL x86_64 alpine iso
from <https://alpinelinux.org/downloads/>.
Login to the rescue system via console or SSH,
and write the ISO to the disk:
```
ssh root@xxxx:xxxx:xxxx:xxxx::1
wipefs -a /dev/sda
wget https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-virt-3.20.3-x86_64.iso # or whatever link you got from alpine
dd if=alpine-virt-3.20.3-x86_64.iso of=/dev/sda
reboot
```
Then open the server console (SSH doesn't work),
login to root (no password required),
and proceed with:
```
# keep the installer's kernel modules and files available after unmounting,
# so that setup-alpine can overwrite /dev/sda
cp -r /.modloop /root
cp -r /media/sda /root
umount /.modloop /media/sda
rm /lib/modules
mv /root/.modloop/modules /lib
mv /root/sda /media
setup-alpine
```
Then select what you wish;
contrary to the guide above,
DHCP is actually fine.
The drive should be sda,
the installation type can be sys
(why go through the hassle).
Voilà! Reboot and log in.
The first SSH login will probably be via root password,
as copy-pasting your public SSH key into the console doesn't really work.
Make sure the SSH config allows this
(and turn password root access off afterwards).
## Encrypting /var/lib/libvirt partition
**Status: tested with Hetzner VPS, not deployed in production yet**
Messing with file systems and partitions
should not be done by automation scripts,
so I created the LUKS-encrypted /dev/sdb partition manually.
(So far, /dev/sdb was added via a Hetzner volume,
but it can be any partition actually)
To create a partition in the VPS volume
(which was formatted to ext4 originally),
- I ran `fdisk /dev/sdb`,
- entered `o` to create a DOS partition table,
- added `n` to add a new primary partition, using all available space,
- and `w` to save to disk and exit.
Then I ran `cryptsetup luksFormat /dev/sdb1`
and entered the passphrase from `pass 0x90/ararat/sdb-crypt`
to create a LUKS volume.
Now I could decrypt the new volume with
`cryptsetup luksOpen /dev/sdb1 sdb_crypt`
and entering the passphrase from `pass 0x90/ararat/sdb-crypt`.
Finally, I ran `mkfs.ext4 /dev/mapper/sdb_crypt`
to create an ext4 file system
in the encrypted partition.
# mount qcow2 VM disk images
This is a quick guide to mounting a qcow2 disk image on your host server. This is useful for resetting passwords,
editing files, or recovering data without the virtual machine running.
**Step 1 - Enable NBD on the Host**
```
modprobe nbd max_part=8
```
**Step 2 - Connect the QCOW2 as network block device**
```
qemu-nbd --connect=/dev/nbd0 /var/lib/vz/images/100/vm-100-disk-1.qcow2
```
**Step 3 - Find The Virtual Machine Partitions**
```
fdisk /dev/nbd0 -l
```
**Step 4 - Mount the partition from the VM**
```
mount /dev/nbd0p1 /mnt
```
**Step 5 - After you are done, unmount and disconnect**
```
umount /mnt
qemu-nbd --disconnect /dev/nbd0
rmmod nbd
```