doc: added social practices & common tools
parent 300381bd47
commit 607fd113a4
206 README.md
@ -12,3 +12,209 @@ or
- run `git pull` to fetch the newest version
- run `pyinfra @local deploy.py` to install/update `0x90.ssh_config` trustmebro
- run `pyinfra --dry inventory.py deploy.py` and check that you are on the same state that is already deployed

# social practices

- maintainers: people who know (almost) everything and would be able to learn the rest
- adepts: people who are still learning about the infrastructure, but don't need to keep everything in mind
- associates: others, who just need to maintain a certain service

Discussions can happen:
- in person (gathering), which should happen at least every 3-4 months, to discuss the big picture
- in person (coworking), while working on new services
- in issues and PRs for concrete proposals
- in online calls to fix emergencies
- in chat groups for exploring ideas and everything else

## structure of this repository

This repository documents the current state
of the infrastructure.

For each server/VM,
it contains a directory with

- a README.md file which gives an overview of the server
- a pyinfra inventory.py file
- a pyinfra deploy.py file which documents what's installed
- the configuration files pyinfra deploys
- optional: a deploy-restore.py file which can restore data from backup
- optional: other pyinfra deploy files which only manage certain services or tasks, like upgrades

The repository also contains a lib/ directory
with pyinfra packages we reuse across servers.

With pull requests we can propose changes
to the current infrastructure.
PRs need to be approved by at least one maintainer.
The pyinfra code in PRs can already be deployed
if it is not destructive; decide responsibly.

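The per-server layout described above can be checked mechanically.
The following is only a sketch, not part of the repository:
the function names are made up for illustration,
and it assumes that every top-level directory except lib/ and hidden ones is a server directory.

```python
from pathlib import Path

# Required per-server files, taken from the list above.
REQUIRED_FILES = ("README.md", "inventory.py", "deploy.py")


def missing_files(server_dir):
    """Return the required file names that are absent from server_dir."""
    root = Path(server_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def check_repository(repo_root, skip=("lib",)):
    """Map each incomplete server directory to its missing files."""
    problems = {}
    for entry in sorted(Path(repo_root).iterdir()):
        # Assumption: hidden directories (.git) and lib/ are not servers.
        if not entry.is_dir() or entry.name.startswith(".") or entry.name in skip:
            continue
        missing = missing_files(entry)
        if missing:
            problems[entry.name] = missing
    return problems
```

Calling `check_repository(".")` from the repository root
returns an empty dict when every server directory is complete.
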
## create a VM

To add a new VM for a service you want to manage:

0. Check out a new branch with `git checkout -b your-server-name`
1. Add your VM to inventory.py
2. Create a directory for the VM
3. Add your VM to ararat/deploy.py
4. Ask the core team to run `pyinfra ararat.0x90.space ararat/deploy.py`
   to create your VM
5. Write your pyinfra deployment script in your-server-name/deploy.py
6. Deploy it; if it doesn't work, change it, and repeat until the service works
7. Copy TEMPLATE.md to your-server-name/README.md and fill it out.
   You can leave out parts which are obvious from your deploy.py file.
8. Commit your changes, push them to your branch,
   open a pull request from your branch to the development branch,
   and ask a maintainer to review and merge it

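For step 1, a pyinfra inventory file is plain Python:
each module-level list defines a group,
and a host is either a name or a `(name, data)` tuple.
The sketch below is an assumption about how an entry could look;
the group name `vms` and the hostname are placeholders,
and the existing inventory.py in this repository may be organised differently.

```python
# inventory.py (sketch): every module-level list defines a pyinfra group.
# "your-server-name.0x90.space" is a placeholder, not a real host.
# ssh_user tells pyinfra's SSH connector which user to log in as.
vms = [
    ("your-server-name.0x90.space", {"ssh_user": "root"}),
]
```

With such an entry, `pyinfra --dry inventory.py your-server-name/deploy.py`
would show what a deploy would change without applying it.
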
## tools we use

The hope is that you don't need to know all of these tools
to do useful things right away,
but can dive deeper into the infrastructure step by step.

### pass

A password manager to store passphrases and secrets.
The repository with our secrets
is at <https://git.0x90.space/missytake/0x90-secrets> for now.

### ssh

to connect to servers and VMs as root@,
no sudo.
root should have a password set,
but password access via SSH should be forbidden.

There should be no shared SSH keys,
only one SSH key per person.
SSH private keys should be password-protected
and only stored on laptops
with hard disk encryption.

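The rule "root logs in with a key, never with a password" maps onto two standard sshd_config keywords,
`PasswordAuthentication no` and `PermitRootLogin prohibit-password`.
The sketch below checks a config text against them.
It is a simplification under stated assumptions:
it ignores `Match` blocks, `Include` files and the `keyword=value` spelling,
and the function names are made up for illustration.

```python
# Settings this sketch expects; both are standard sshd_config keywords.
EXPECTED = {
    "passwordauthentication": "no",
    "permitrootlogin": "prohibit-password",
}

# Upstream OpenSSH defaults that apply when a keyword is not set at all.
DEFAULTS = {
    "passwordauthentication": "yes",
    "permitrootlogin": "prohibit-password",
}


def effective_settings(config_text):
    """Parse sshd_config text; the first value seen for a keyword wins."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip().lower()
        if keyword == "match":
            break  # simplification: conditional blocks are not evaluated
        settings.setdefault(keyword, value)
    return settings


def policy_violations(config_text):
    """List every expected setting that the config does not satisfy."""
    settings = effective_settings(config_text)
    violations = []
    for keyword, wanted in EXPECTED.items():
        actual = settings.get(keyword, DEFAULTS[keyword])
        if actual != wanted:
            violations.append(f"{keyword} is {actual!r}, expected {wanted!r}")
    return violations
```

On a server, `sshd -T` prints the effective configuration
and is the more reliable source than reading the file by hand.
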
### systemctl & journalctl

to look at status and log output of services.
systemd is a good way of keeping services running,
at least on Linux machines.
On OpenBSD we will use /etc/rc.d/ scripts.

### git

for updating the documentation,
pushing and pulling secrets,
and opening PRs to documentation/pyinfra repos.

to be discussed:
- Keep in mind that PRs can and will be deployed to servers. OR
- The main branch should always reflect the state of the machine.

### markdown + sembr

for documenting the infrastructure.
[Semantic line breaks](https://sembr.org/) are great
for formatting text files
which are managed in git.

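Semantic line breaks are normally placed by hand,
but a rough first pass can be automated.
The sketch below only breaks after sentence-ending punctuation;
it is an approximation, not the sembr specification,
and it will split wrongly after abbreviations such as "e.g.".

```python
import re


def sembr(paragraph):
    """Put each sentence of a paragraph on its own line."""
    # Collapse existing line breaks and repeated spaces first.
    text = " ".join(paragraph.split())
    # Replace the whitespace that follows ., ! or ? with a newline.
    return re.sub(r"(?<=[.!?])\s+", "\n", text)
```

Because each sentence ends up on its own line,
a later edit to one sentence shows up as a one-line change in `git diff`.
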
### kvm + virsh

as a hypervisor
which we can use to create VMs
for specific services.

The hypervisor is a minimal Alpine Linux
with "boot to RAM";
the data partition for the VM images is encrypted.

### pyinfra

as a nice declarative config tool for deployment.
We can also maintain some of the things we need
in extra python modules.

pyinfra vs. ansible? ~> needs investigation.
Currently there is an ansible setup on golem;
pyinfra is used in deltachat and one ezra service.

### podman

to isolate services in root-less containers.
A podman container should run as a systemd service.
It takes some practice to understand
how to run commands inside a container
or where the files are mounted.
But it goes well with pyinfra
if it's managed in systemd.

### nftables

as a declarative firewall
which can be managed in pyinfra.

### nginx

as an HTTPS reverse proxy,
passing traffic on to the podman containers.

### acmetool

as a tool to manage Let's Encrypt certificates,
which goes well with pyinfra
because of its declarative nature.

It also ships acmetool-redirector
which redirects HTTP traffic on port 80
to nginx on port 443.

There is a pyinfra package for it at
<https://github.com/deltachat/pyinfra-acmetool/>

On OpenBSD: <https://man.openbsd.org/acme-client> + <https://man.openbsd.org/relayd>

### cron

to schedule recurring tasks,
like acmetool's certificate renewals
or the nightly borgbackup runs.

On OpenBSD there is already a daily cron job that executes /etc/daily.local.

### borgbackup

can be used to back up application data
in a nightly cron job.

Backups need to be stored on a separate backup server.

There is a pyinfra package for it at
<https://github.com/deltachat/pyinfra-borgbackup/>

We might also look at restic ~> append-only backups, better restricted

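To make the "nightly cron job" concrete,
here is a sketch that builds a `borg create` command and a matching crontab line.
The repository location and the backed-up path in the usage note are hypothetical,
and the pyinfra-borgbackup package may set things up differently.

```python
import shlex


def borg_create_command(repo, paths, prefix="{hostname}"):
    """Build the argv for one `borg create` run."""
    # {hostname} and {now} are placeholders that borg expands itself.
    archive = f"{repo}::{prefix}-{{now}}"
    return ["borg", "create", "--stats", archive, *paths]


def cron_line(minute, hour, argv):
    """Format a user-crontab entry that runs argv once per day."""
    # shlex.join quotes arguments so the braces reach borg unchanged.
    return f"{minute} {hour} * * * {shlex.join(argv)}"
```

For example, `cron_line(30, 2, borg_create_command("ssh://borg@backup.example/./repo", ["/srv/data"]))`
yields an entry that runs every night at 02:30.
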
### wireguard

as a VPN to connect the backup server,
which can be located in someone's private home,
with the production servers.

### prometheus

as a tool to measure service uptime
and track typical errors
from journalctl output.
It can expose metrics via HTTPS
behind basic auth.

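A client that reads metrics from behind basic auth has to send an `Authorization` header.
The sketch below builds such a request with the standard library only;
the function names are made up for illustration,
and the URL and credentials in the usage note are placeholders.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def metrics_request(url, user, password):
    """Build (but do not send) a request for a protected metrics endpoint."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    return request
```

Passing the result of `metrics_request("https://metrics.example/metrics", "user", "pass")`
to `urllib.request.urlopen` would fetch the metrics.
The credentials themselves belong in pass, not in the repository.
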
### grafana

as a visual dashboard to show service uptime
and whether services throw errors.
It can also send out email alerts.

### team-bot

a deltachat bot to receive support requests
and email alerts from grafana.