author | 2023-08-14 19:27:09 +0200 | |
---|---|---|
committer | 2023-08-16 11:38:44 +0200 | |
commit | 464a96e670cbfd774440bb44d81d2760cb1d0d44 (patch) | |
tree | e702d1b63e0e87db9f2e12b5ebf72083a3f430b7 /docs/onboarding | |
parent | Add further infomraiton to READMEs (diff) |
Integrate onboarding documents from Notion
Co-authored-by: Chris Lovering <[email protected]>
Diffstat (limited to 'docs/onboarding')
-rw-r--r-- | docs/onboarding/access.md | 18 |
-rw-r--r-- | docs/onboarding/resources.md | 30 |
-rw-r--r-- | docs/onboarding/rules.md | 15 |
-rw-r--r-- | docs/onboarding/tools.md | 50 |
4 files changed, 113 insertions, 0 deletions
diff --git a/docs/onboarding/access.md b/docs/onboarding/access.md
new file mode 100644
index 0000000..1bfe7cf
--- /dev/null
+++ b/docs/onboarding/access.md
@@ -0,0 +1,18 @@
+---
+title: Access table
+date: 2022-09-18
+description: |
+  Who has access to what.
+---
+
+
+| **Resource** | **Description** | **Keyholders** |
+|:------------:|:---------------:|:--------------:|
+| Linode Kubernetes Cluster | The primary cluster where all resources are deployed. | Hassan, Joe, Chris, Leon, Sebastiaan, Johannes |
+| Linode Dashboard | The online dashboard for managing and allocating resources from Linode. | Joe, Chris |
+| Netcup Dashboard | The dashboard for managing and allocating resources from Netcup. | Joe, Chris |
+| Netcup servers | Root servers provided by the Netcup partnership. | Joe, Chris, Hassan, Johannes |
+| Grafana | The primary aggregation dashboard for most resources. | Admins, Moderators, Core Developers and DevOps (with varying permissions) |
+| Prometheus Dashboard | The Prometheus query dashboard. Access is controlled via Cloudflare Access. | Hassan, Joe, Johannes, Chris |
+| Alertmanager Dashboard | The Alertmanager control dashboard. Access is controlled via Cloudflare Access. | Hassan, Joe, Johannes, Chris |
+| `git-crypt`ed files in infra repository | `git-crypt` is used to encrypt certain files within the repository. At the time of writing this is limited to Kubernetes secret files. | Chris, Joe, Hassan, Johannes, Xithrius |
diff --git a/docs/onboarding/resources.md b/docs/onboarding/resources.md
new file mode 100644
index 0000000..498e903
--- /dev/null
+++ b/docs/onboarding/resources.md
@@ -0,0 +1,30 @@
+---
+title: Resources
+date: 2022-09-18
+description: |
+  Important reference documents for the team.
+---
+
+The following is a collection of important reference documents for the DevOps
+team.
+
+## [Infra Repo](https://github.com/python-discord/infra)
+
+This GitHub repo contains most of the manifests and configuration applied to
+our cluster. It’s kept up to date manually and is considered a source of truth
+for what we should have in the cluster.
+
+It is mostly documented, but improvements for unclear or outdated aspects are
+always welcome.
+
+## [Knowledge base](https://python-discord.github.io/infra/)
+
+Deployed using GitHub Pages; the source is in the docs directory of the k8s
+repo.
+
+This includes:
+
+- Changelogs
+- Post-mortems
+- Common queries
+- Runbooks
diff --git a/docs/onboarding/rules.md b/docs/onboarding/rules.md
new file mode 100644
index 0000000..0cd42e4
--- /dev/null
+++ b/docs/onboarding/rules.md
@@ -0,0 +1,15 @@
+---
+title: Rules
+date: 2022-09-18
+description: |
+  The rules any DevOps team member must follow.
+---
+
+1. LMAO - **L**ogging, **M**onitoring, **A**lerting, **O**bservability
+2. Modmail is the greatest piece of software ever written
+3. Modmail needs at least 5 minutes to gather all its greatness at startup
+4. We never blame Chris, it's always <@233481908342882304>'s fault
+5. LKE isn’t bad, it’s your fault for not paying for the high availability
+   control plane
+6. Our software is never legacy, it's merely well-aged
+7. Ignore these rules (however maybe not 1, 1 seems important to remember)
diff --git a/docs/onboarding/tools.md b/docs/onboarding/tools.md
new file mode 100644
index 0000000..4fb4e4c
--- /dev/null
+++ b/docs/onboarding/tools.md
@@ -0,0 +1,50 @@
+---
+title: Tools
+date: 2022-09-18
+description: |
+  The tools that DevOps uses to run their shop.
+---
+
+We use a few tools to manage, monitor, and interact with our infrastructure.
+Some of these tools are not unique to the DevOps team, and may be shared by
+other teams.
+
+Most of these are gated behind a Cloudflare Access system, which is accessible
+to the [DevOps Team](https://github.com/orgs/python-discord/teams/devops) on
+GitHub. These are marked with the ☁️ emoji. If you don’t have access, please
+contact Chris or Joe.
+
+## [Grafana](https://grafana.pythondiscord.com/)
+
+Grafana provides access to some of the most important resources at your
+disposal. It acts as an aggregator and frontend for a large amount of data.
+The data ranges from metrics to logs to stats. Some of the most important
+dashboards are listed below:
+
+- Service Logs/All App Logs Dashboard
+
+  Service logs is a simple log viewer which gives you access to a large
+  majority of the applications deployed in the default namespace. The All App
+  logs dashboard is an expanded version of that which gives you access to all
+  apps in all namespaces, and allows some more in-depth querying.
+
+- Kubernetes Dashboard
+
+  This dashboard gives quick overviews of all the most important metrics of
+  the Kubernetes system. For more detailed information, check out other
+  dashboards such as Resource Usage, NGINX, and Redis.
+
+
+Accessed via a GitHub login, with permission for anyone in the dev-core or
+dev-ops team.
+
+## [Prometheus Dashboard](https://prometheus.pythondiscord.com/) (☁️)
+
+This provides access to the Prometheus query console. You may also enjoy the
+[Alertmanager Console](https://alertmanager.pythondiscord.com/).
+
+## [King Arthur](https://github.com/python-discord/king-arthur/)
+
+King Arthur is a Discord bot which provides information about, and access to,
+our cluster directly in Discord. Invoke its help command for more information
+(`M-x help`).