path: root/kubernetes
Commit message [Author, Age, Lines]
* Bump mongo mem requests and limit [Chris Lovering, 2024-07-10, -2/+2]
|
* Add summer code jam announcement channel id [Chris Lovering, 2024-07-05, -0/+1]
|
* Add YouTube API key to king-arthur [Chris Lovering, 2024-07-05, -0/+3]
|     This also documents secrets that were already present in the file.
* Update Sir Robin to CJ11 (#399) [Boris Muratov, 2024-07-03, -1/+1]
|
* Move noqa definition required in latest ruff version [Chris Lovering, 2024-07-01, -2/+2]
|
* Allow new kube-state-metrics image to watch ingresses [Joe Banks, 2024-07-01, -0/+1]
|
* Move away from vendored kube-state-metrics [Joe Banks, 2024-07-01, -1/+1]
|
* Add issuer for Vault certificates in tooling namespace [Joe Banks, 2024-06-27, -0/+5]
|     We will use this to deploy internal TLS certificates from a
|     self-signed CA that allows for TLS traffic within the cluster.
* Add deployment of Keycloak [Joe Banks, 2024-06-27, -0/+122]
|
* Scale AM back to 3 replicas [Chris Lovering, 2024-06-24, -1/+1]
|
* Add ff-bot deployment [Joe Banks, 2024-06-16, -0/+82]
|
* Add Kubernetes volume alerts [Joe Banks, 2024-06-16, -0/+11]
|     It seems that Linode has added storage reporting info to the CSI
|     driver allowing us to pick up on the storage use of persistent
|     volume claims within the cluster.
|
|     This creates and deploys an alert that will report if any volume
|     has under 10% of space left.
|
|     I have excluded Prometheus as our TSDB retention settings mean that
|     it will always stay just below its volume size by design.
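
A minimal sketch of what such a rule could look like, assuming the standard kubelet volume metrics `kubelet_volume_stats_available_bytes` and `kubelet_volume_stats_capacity_bytes` are what the CSI driver now populates. The group name, alert name, `for` duration, severity label, and the `prometheus.*` claim-name pattern are hypothetical and not taken from the repository:

```yaml
groups:
  - name: volumes                      # hypothetical group name
    rules:
      - alert: KubernetesVolumeLowSpace  # hypothetical alert name
        # Fire when less than 10% of a persistent volume claim is free.
        # The matcher excludes claims assumed to belong to Prometheus,
        # which sits just below its volume size by design.
        expr: |
          kubelet_volume_stats_available_bytes{persistentvolumeclaim!~"prometheus.*"}
            / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim!~"prometheus.*"}
            < 0.10
        for: 5m                        # assumed grace period
        labels:
          severity: warning            # assumed severity
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} has under 10% of space left"
```
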
* Update Loki config with new compactor preferences for retention modes [Joe Banks, 2024-06-13, -1/+6]
|     * `retention_enabled`: enable retention mode within the compactor
|     * `delete_request_store`: store deletion requests within the s3
|       cluster that is also used to house log chunks
|     * `delete_request_cancel_period`: do not exercise log deletion
|       instructions until at least one hour has passed to prevent
|       accidental deletion
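
For orientation, the three settings above live under the `compactor` block of the Loki configuration. The values below are inferred from the descriptions in the commit message and are assumptions, not a copy of the deployed file:

```yaml
compactor:
  # Turn on retention handling inside the compactor.
  retention_enabled: true
  # Keep deletion requests in the same S3 object store as the log chunks.
  delete_request_store: s3
  # Wait one hour before acting on a delete request, so that an
  # accidental request can still be cancelled (assumed value).
  delete_request_cancel_period: 1h
```
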
* Update Prometheus deployment with a tmpfs for the reloader [Joe Banks, 2024-06-10, -0/+9]
|
* Add secrets for reloader webhook [Joe Banks, 2024-06-10, -0/+0]
|
* Add sidecar container to reload Prometheus config on change [Joe Banks, 2024-06-10, -0/+25]
|
* Add reloader hook configmap to reload prometheus on change [Joe Banks, 2024-06-10, -0/+38]
|
* Add Alert for Prometheus config reload failure [Joe Banks, 2024-06-10, -0/+9]
|
* Enable scraping of Prometheus pods [Joe Banks, 2024-06-10, -0/+3]
|
* Update Pinnwand logo to square image [Joe Banks, 2024-06-09, -1/+1]
|
* Update from command to args in site deployment [Joe Banks, 2024-06-07, -1/+1]
|     Kubernetes renames ENTRYPOINT in Docker images to command and any
|     additional args go in `args` (confusing, I know!)
|
|     This ensures that we run within the context of Poetry so we can
|     reach Django and other installed requirements when running
|     migrations.
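
The mapping described above can be sketched as follows. The container name, image reference, and migration command are placeholders, and the image's ENTRYPOINT is only assumed to invoke Poetry:

```yaml
# Docker ENTRYPOINT  <->  Kubernetes `command`
# Docker CMD         <->  Kubernetes `args`
containers:
  - name: site                            # hypothetical container name
    image: example.invalid/site:latest    # placeholder image reference
    # Setting `command` here would replace the image's ENTRYPOINT
    # (assumed to be something like `poetry run`), so only `args` is
    # set and the ENTRYPOINT stays in effect.
    args: ["python", "manage.py", "migrate"]   # assumed migration command
```
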
* Remove unnecessary shell execution for migration initContainer [Joe Banks, 2024-06-07, -3/+3]
|
* Update site to run migrations in an init container [Joe Banks, 2024-06-07, -0/+13]
|     In accordance with updates from python-discord/site#1338 this
|     changes the way migrations are run.
|
|     Previously, migrations would be run all from within the manage.py
|     execution process with the command being manually spawned using
|     Django internals.
|
|     After python-discord/site#1338 merges, the Dockerfile will directly
|     invoke gunicorn and bypass manage.py to simplify the process and
|     avoid problems with shared database contexts. Hence, we need to
|     manually run migrations using an init container.
|
|     In testing, there is no additional delay in doing this, as spinning
|     up an init container is cheap and we don't cut over any traffic
|     until the site passes a healthcheck anyway.
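
A rough sketch of the pattern, not the actual manifest: an init container runs the migrations to completion before the main container starts, and a readiness probe keeps traffic away until the site responds. All names, labels, the image reference, the probe path, and the port are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: site                                   # hypothetical name
spec:
  selector:
    matchLabels:
      app: site
  template:
    metadata:
      labels:
        app: site
    spec:
      initContainers:
        # Must exit successfully before the main container is started.
        - name: migrations
          image: example.invalid/site:latest   # placeholder image
          args: ["python", "manage.py", "migrate"]   # assumed command
      containers:
        - name: site
          image: example.invalid/site:latest   # placeholder image
          # Traffic is only routed to the pod once this probe passes.
          readinessProbe:
            httpGet:
              path: /                          # assumed healthcheck path
              port: 8000                       # assumed port
```
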
* Rename relabelledpods to just pods [Joe Banks, 2024-06-07, -1/+1]
|     This was a redundant rename and reduced the clarity of jobs when
|     querying from inside Grafana. This rectifies that by renaming the
|     stream to just `pods`.
* Reflect pydis.wtf certificate into Loki namespace [Joe Banks, 2024-06-07, -2/+2]
|
* Add secret for Loki authentication [Joe Banks, 2024-06-07, -0/+0]
|
* Add new Ingress for Loki gateway [Joe Banks, 2024-06-07, -0/+25]
|
* Add Metricity manifest [Joe Banks, 2024-06-06, -0/+30]
|     Copies the Metricity deployment manifest from the Metricity repo.
* Add tmpfs to King Arthur [Joe Banks, 2024-06-05, -0/+9]
|
* Remove PostgreSQL Exporter from Kubernetes [Joe Banks, 2024-06-02, -55/+0]
|
* Remove Kubernetes PostgreSQL Alerts [Joe Banks, 2024-06-02, -29/+0]
|
* Remove Kubernetes PostgreSQL backup from Blackbox [Joe Banks, 2024-06-02, -6/+1]
|
* Remove PostgreSQL deployment from Kubernetes [Joe Banks, 2024-06-02, -127/+0]
|
* Update pixels environment variable [Joe Banks, 2024-06-02, -0/+0]
|
* Update Metabase configuration secret [Joe Banks, 2024-06-02, -0/+0]
|
* Update site secret with new database address [Joe Banks, 2024-06-01, -0/+0]
|
* Update site and metricity with new metricity db user credentials [Joe Banks, 2024-05-28, -0/+0]
|
* Update kube-system namespace docs with new metrics-server details [Joe Banks, 2024-05-28, -4/+5]
|
* Add Helm deployment info for metrics-server [Joe Banks, 2024-05-28, -0/+24]
|     Due to the way Linode seems to issue certificates for our nodes, we
|     need to disable TLS verification for communications to fetch metric
|     information. It's unfortunate but non-critical and it does restore
|     metrics-server functionality.
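
One common way to do this with the metrics-server Helm chart is to pass the `--kubelet-insecure-tls` flag through the chart's `args` value. This is a sketch under that assumption; the commit message does not say which setting was actually used:

```yaml
# values.yaml for the metrics-server Helm chart (sketch)
args:
  # Skip verification of the kubelet serving certificates when
  # scraping node metrics.
  - --kubelet-insecure-tls
```
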
* Add documentation on services deployed to the kube-system namespace [Joe Banks, 2024-05-28, -0/+33]
|
* Add new ServiceAccount for cert issuance [Joe Banks, 2024-05-27, -0/+5]
|
* Update mTLS bundle for ingress-nginx [Joe Banks, 2024-05-27, -36/+46]
|
* Add Helm instructions for Vault [Joe Banks, 2024-05-27, -0/+54]
|
* Add pydis.wtf cert to vault namespace [Joe Banks, 2024-05-27, -2/+2]
|
* Fix AlertManager Discord instance formatting [Joe Banks, 2024-05-27, -1/+1]
|     We made a change to include the instance in alerts sent to Discord,
|     but not all of our configured alerts send this field. As a result,
|     we would have incorrectly formatted alerts being sent through to
|     Discord which were tricky to read.
|
|     The format template has now been changed to only conditionally
|     render the instance label if it is present on a triggered alert.
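
The conditional rendering can be illustrated with a Go template `if` guard inside an Alertmanager receiver. This sketch assumes Alertmanager's built-in `discord_configs` receiver and a `summary` annotation; the receiver name, webhook URL, and message layout are hypothetical and may differ from the real configuration:

```yaml
receivers:
  - name: discord                        # hypothetical receiver name
    discord_configs:
      - webhook_url: https://discord.invalid/api/webhooks/placeholder
        message: |-
          {{ range .Alerts }}
          {{ .Annotations.summary }}
          {{ if .Labels.instance }}Instance: {{ .Labels.instance }}{{ end }}
          {{ end }}
```

A missing `instance` label evaluates to an empty string, which the `if` treats as false, so alerts without that label simply omit the line.
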
* Take 15 minutes before alerting on high latency [Johannes Christ, 2024-05-20, -2/+2]
|
* Instruct code jam management to connect to lovelace [Johannes Christ, 2024-05-18, -0/+0]
|
* Instruct black knight to connect to lovelace [Johannes Christ, 2024-05-18, -0/+0]
|
* Annotations.instance => Labels.instance [Joe Banks, 2024-05-18, -1/+1]
|
* Add instance to AlertManager Discord embeds [Joe Banks, 2024-05-17, -1/+1]
|