# initial setup

- install `pyinfra` with your favorite package manager

or

- install `pipx` with your favorite package manager
- add `~/.local/bin` to your `PATH`
- `pipx install pyinfra`

# before each use

- communicate your intent to make changes to your co-admins, to prevent conflicting access
- run `git pull` to fetch the newest version
- run `pyinfra @local deploy.py` to install/update `0x90.ssh_config`
- run `pyinfra --dry inventory.py deploy.py` and check that your local state matches what is already deployed

# social practices

- maintainers: people who know (next to) everything and would be able to learn the rest
- adepts: people who are still learning about the infrastructure, but don't need to keep everything in mind
- associates: others, who just need to maintain a certain service

Discussions can happen:

- in person (gathering), at least every 3-4 months, to discuss the big picture
- in person (coworking), while working on new services
- in issues and PRs, for concrete proposals
- in online calls, to fix emergencies
- in chat groups, for exploring ideas and everything else

## structure of this repository

This repository documents the current state
of the infrastructure.

For each server/VM,
it contains a directory with:

- a README.md file which gives an overview of the server
- a pyinfra inventory.py file
- a pyinfra deploy.py file which documents what's installed
- the configuration files pyinfra deploys
- optional: a deploy-restore.py file which can restore data from backup
- optional: other pyinfra deploy files which only manage certain services or tasks, like upgrades

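Laid out on disk, a server directory might look like this (file names beyond the list above are only examples):

```
your-server-name/
├── README.md
├── inventory.py
├── deploy.py
├── deploy-restore.py   # optional
└── nginx.conf          # example: a config file deploy.py pushes
```
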
The repository also contains a lib/ directory
with pyinfra packages we reuse across servers.

With pull requests we can propose changes
to the current infrastructure.
PRs need to be approved by at least one maintainer.
The pyinfra code in PRs can already be deployed,
if it is not destructive - decide responsibly.

## create a VM

To add a new VM for a service you want to manage:

0. Check out a new branch with `git checkout -b your-server-name`
1. Add your VM to inventory.py
2. Create a directory for the VM
3. Add your VM to ararat/deploy.py
4. Ask the core team to run `pyinfra ararat.0x90.space ararat/deploy.py`
   to create your VM
5. Write your pyinfra deployment script in your-server-name/deploy.py
   (a sketch follows below this list)
6. Deploy it; if it doesn't work, change it, and repeat until the service works
7. Copy TEMPLATE.md to your-server-name/README.md and fill it out.
   You can leave out parts which are obvious from your deploy.py file.
8. Commit your changes, push them to your branch,
   open a pull request from your branch to the development branch,
   and ask a maintainer to review and merge it.

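For step 5, a minimal deploy.py might look like the sketch below; the package, file paths, and service name are placeholders for whatever your service actually needs:

```
# your-server-name/deploy.py -- hypothetical example, adapt to your service
from pyinfra.operations import apt, files, systemd

# install the packages the service needs (assuming a Debian-based VM)
apt.packages(
    name="Install nginx",
    packages=["nginx"],
    update=True,
)

# push a config file that lives next to this script in the repo
files.put(
    name="Deploy nginx config",
    src="your-server-name/nginx.conf",
    dest="/etc/nginx/nginx.conf",
)

# keep the service enabled and running
systemd.service(
    name="Start nginx",
    service="nginx",
    running=True,
    enabled=True,
)
```

Deploy it with `pyinfra your-server-name/inventory.py your-server-name/deploy.py`, or with `--dry` first to preview the changes.
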
## tools we use

The hope is that you don't need to know all of these tools
to already do useful things,
but can systematically dive deeper into the infrastructure.

### pass

password manager to store passphrases and secrets;
the repository with our secrets
is at <https://git.0x90.space/missytake/0x90-secrets> for now.

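Typical usage, with the store cloned to pass's default location (the second secret path is only an example):

```
git clone https://git.0x90.space/missytake/0x90-secrets ~/.password-store
pass show 0x90/ararat/sdb-crypt               # print a stored passphrase
pass insert 0x90/your-server-name/db-password # add a new secret
```
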
### ssh

to connect to servers and VMs as root@;
no sudo.
root should have a password set,
but password access via SSH should be forbidden.

There should be no shared SSH keys:
one SSH key per person.
SSH private keys should be password-protected
and only stored on laptops
with hard disk encryption.

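In sshd_config terms, that policy would look like:

```
# /etc/ssh/sshd_config -- root only with keys, no passwords over SSH
PermitRootLogin prohibit-password
PasswordAuthentication no
```
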
### systemctl & journalctl

to look at status and log output of services.
systemd is a good way of keeping services running,
at least on Linux machines.
On OpenBSD we will use /etc/rc.d/ scripts.

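Typical invocations, with nginx standing in for any service:

```
systemctl status nginx            # current state of the service
journalctl -u nginx -f            # follow its log output
journalctl -u nginx --since today # today's logs only
```
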
### git

for updating the documentation,
pushing and pulling secrets,
and opening PRs to documentation/pyinfra repos.

to be discussed:

- Keep in mind that PRs can and will be deployed to servers. OR
- The main branch should always reflect the state of the machine.

### markdown + sembr

for documenting the infrastructure.
[Semantic line breaks](https://sembr.org/) are great
for formatting text files
which are managed in git.

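For example, breaking after each clause means a one-word edit only touches one line in a git diff:

```
Semantic line breaks put each sentence,
and ideally each clause,
on its own line.
```
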
### kvm + virsh

as a hypervisor
which we can use to create VMs
for specific services.

The hypervisor is a minimal Alpine Linux
with "boot to RAM";
the data partition for the VM images is encrypted.

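Day-to-day VM management happens with virsh (the VM name is an example):

```
virsh list --all                # all defined VMs and their state
virsh start your-server-name    # boot a VM
virsh console your-server-name  # attach to its serial console
virsh shutdown your-server-name # ask it to shut down cleanly
```
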
### pyinfra

as a nice declarative config tool for deployment.
we can also maintain some of the things we need
in extra Python modules.

pyinfra vs. ansible? ~> needs investigation. currently an ansible setup on golem; pyinfra is used in deltachat and one ezra service.

### podman

to isolate services in root-less containers.
a podman container should run as a systemd service.
it takes some practice to understand
how to run commands inside a container
or where the files are mounted.
But it goes well with pyinfra
if it's managed in systemd.

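A few commands for exactly those two stumbling blocks (the container name is an example):

```
podman exec -it mycontainer sh        # get a shell inside a running container
podman inspect mycontainer            # shows mounts, among other things
podman generate systemd --new mycontainer # emit a systemd unit for the container
```
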
### nftables

as a declarative firewall
which can be managed in pyinfra.

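A minimal ruleset of the kind pyinfra could deploy to /etc/nftables.conf (the open ports are assumptions):

```
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept # keep existing connections
    iif "lo" accept                     # local traffic
    tcp dport { 22, 80, 443 } accept    # SSH, HTTP, HTTPS
    icmp type echo-request accept       # ping
  }
}
```
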
### nginx

as an HTTPS reverse proxy,
passing traffic on to the podman containers.

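A sketch of such a proxy block; the domain and upstream port are placeholders, and the certificate paths assume acmetool's default layout (see below):

```
server {
    listen 443 ssl;
    server_name example.0x90.space;

    ssl_certificate     /var/lib/acme/live/example.0x90.space/fullchain;
    ssl_certificate_key /var/lib/acme/live/example.0x90.space/privkey;

    location / {
        proxy_pass http://127.0.0.1:8080; # the container's published port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
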
### acmetool

as a tool to manage Let's Encrypt certificates,
which goes well with pyinfra
because of its declarative nature.

It also ships acmetool-redirector,
which redirects HTTP traffic on port 80
to nginx on port 443.

There is a pyinfra package for it at
<https://github.com/deltachat/pyinfra-acmetool/>.

On OpenBSD: <https://man.openbsd.org/acme-client> + <https://man.openbsd.org/relayd>

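Requesting a certificate is declarative too: you state which hostname you want, and acmetool keeps it renewed (hostname is an example):

```
acmetool quickstart              # one-time interactive setup
acmetool want example.0x90.space # request + keep renewing a certificate
```
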
### cron

to schedule recurring tasks,
like acmetool's certificate renewals
or the nightly borgbackup runs.

OpenBSD already ships a daily cron job which executes /etc/daily.local.

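A nightly crontab entry could look like this (the script path is an example):

```
# m h dom mon dow command -- run the borgbackup script at 03:00 every night
0 3 * * * /root/backup.sh
```
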
### borgbackup

can be used to back up application data
in a nightly cron job.

Backups need to be stored on an extra backup server.

There is a pyinfra package for it at
<https://github.com/deltachat/pyinfra-borgbackup/>.

might also look at restic ~> append-only backups can be restricted better

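The core of such a cron job might be the following (repository URL, data path, and passphrase location are all placeholders):

```
export BORG_PASSPHRASE="$(cat /root/.borg-passphrase)" # assumed secret location
borg create --stats --compression lz4 \
    ssh://borg@backup.0x90.space/./repo::'{hostname}-{now:%Y-%m-%d}' \
    /var/lib/myservice
borg prune --keep-daily 7 --keep-weekly 4 ssh://borg@backup.0x90.space/./repo
```
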
### wireguard

as a VPN to connect the backup server,
which can be in somebody's private house,
with the production servers.

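A minimal wg-quick config sketch for the backup server (all keys and addresses are placeholders):

```
# /etc/wireguard/wg0.conf on the backup server
[Interface]
PrivateKey = <backup-server-private-key>
Address = 10.0.90.2/24
ListenPort = 51820

[Peer]
# one production server
PublicKey = <production-server-public-key>
AllowedIPs = 10.0.90.1/32
```
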
### prometheus

as a tool to measure service uptime
and count typical errors
from journalctl output.
It can expose metrics via HTTPS
behind basic auth.

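A scrape job for such an HTTPS + basic auth endpoint might be configured like this (all names are placeholders):

```
scrape_configs:
  - job_name: "example"
    scheme: https
    basic_auth:
      username: prometheus
      password_file: /etc/prometheus/scrape-password
    static_configs:
      - targets: ["example.0x90.space"]
```
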
### grafana

as a visual dashboard to show service uptime
and whether services throw errors.
It can also send out email alerts.

### team-bot

a Delta Chat bot to receive support requests
and email alerts from grafana.

# Set up alpine on hetzner

This was only tested with a cloud VPS so far.
Source: <https://gist.github.com/c0m4r/e38d41d0e31f6adda4b4c5a88ba0a453>
(but it's less of a hassle than described there)

To create an Alpine server on Hetzner,
you first need to create a Debian VPS or something similar.

Then you boot into the rescue system.

Get the download link of the latest VIRTUAL x86_64 Alpine ISO
from <https://alpinelinux.org/downloads/>.

Log in to the rescue system via console or SSH,
and write the ISO to the disk:

```
ssh root@xxxx:xxxx:xxxx:xxxx::1
wipefs -a /dev/sda # wipe the existing partition signatures
wget https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-virt-3.20.3-x86_64.iso # or whatever link you got from alpine
dd if=alpine-virt-3.20.3-x86_64.iso of=/dev/sda # write the ISO over the disk
reboot
```

Then open the server console (SSH doesn't work),
log in as root (no password required),
and proceed with:

```
cp -r /.modloop /root       # save the kernel modules mounted from the ISO
cp -r /media/sda /root      # save the ISO contents
umount /.modloop /media/sda # free /dev/sda for the installer
rm /lib/modules             # was a symlink into /.modloop
mv /root/.modloop/modules /lib
mv /root/sda /media
setup-alpine
```

Then select what you wish;
contrary to the guide above,
DHCP is actually fine.
The drive should be sda,
and the installation type can be sys
(why go through the hassle).

Voilà! Reboot and log in.
The first SSH login will probably be via root password,
as copy-pasting your public SSH key into the console doesn't really work.
Make sure the SSH config allows this
(and turn off password-based root access afterwards).

## Encrypting /var/lib/libvirt partition

**Status: tested with Hetzner VPS, not deployed in production yet**

Messing with file systems and partitions
should not be done by automation scripts,
so I created the LUKS-encrypted /dev/sdb partition manually.

(So far, /dev/sdb was added via a Hetzner volume,
but it can actually be any partition.)

To create a partition on the VPS volume
(which was originally formatted to ext4),

- I ran `fdisk /dev/sdb`,
- entered `o` to create a DOS partition table,
- entered `n` to add a new primary partition, using all available space,
- and `w` to save to disk and exit.

Then I ran `cryptsetup luksFormat /dev/sdb1`
and entered the passphrase from `pass 0x90/ararat/sdb-crypt`
to create a LUKS volume.

Now I could decrypt the new volume with
`cryptsetup luksOpen /dev/sdb1 sdb_crypt`,
entering the passphrase from `pass 0x90/ararat/sdb-crypt`.

Finally, I ran `mkfs.ext4 /dev/mapper/sdb_crypt`
to create an ext4 file system
in the encrypted partition.

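Collected into one sequence (the mount point is assumed from this section's title):

```
fdisk /dev/sdb                  # o, n, w: one primary partition
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 sdb_crypt
mkfs.ext4 /dev/mapper/sdb_crypt
mount /dev/mapper/sdb_crypt /var/lib/libvirt # assumed mount point
```
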
# mount qcow2 VM disk images

This is a quick guide to mounting a qcow2 disk image on your host server.
This is useful to reset passwords,
edit files,
or recover something without the virtual machine running.

**Step 1 - Enable NBD on the host**

```
modprobe nbd max_part=8
```

**Step 2 - Connect the qcow2 as a network block device**

```
qemu-nbd --connect=/dev/nbd0 /var/lib/vz/images/100/vm-100-disk-1.qcow2
```

**Step 3 - Find the virtual machine's partitions**

```
fdisk -l /dev/nbd0
```

**Step 4 - Mount a partition from the VM**

```
mount /dev/nbd0p1 /mnt
```

**Step 5 - After you are done, unmount and disconnect**

```
umount /mnt
qemu-nbd --disconnect /dev/nbd0
rmmod nbd
```