4 changes: 2 additions & 2 deletions Makefile
@@ -6,8 +6,8 @@ all: install-site build

.PHONY: assets
assets:
docker run --rm -it -v $(PWD):/data rlespinasse/drawio-export:4.1.0 -s 3 -b 10 -f jpg --remove-page-suffix -o /data/static/img/assets/xks/operator-guide/ /data/assets/xks/operator-guide/
docker run --rm -it -v $(PWD):/data rlespinasse/drawio-export:4.1.0 -s 3 -b 10 -f jpg --remove-page-suffix -o /data/static/img/assets/xks/developer-guide/ /data/assets/xks/developer-guide/
docker run --rm -it -v $(PWD):/data rlespinasse/drawio-export:4.1.0 -s 3 -b 10 -f jpg --remove-page-suffix -o /data/static/img/assets/xkf/operator-guide/ /data/assets/xkf/operator-guide/
docker run --rm -it -v $(PWD):/data rlespinasse/drawio-export:4.1.0 -s 3 -b 10 -f jpg --remove-page-suffix -o /data/static/img/assets/xkf/developer-guide/ /data/assets/xkf/developer-guide/

.SILENT:
serve: all
@@ -11,12 +11,12 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

In the terminology of [Microsoft Cloud Adoption Framework](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/architecture) (CAF), Xenit Kubernetes Service is an enterprise-scale landing zone. The workload also supports multiple cloud providers: AWS is currently supported, but it still requires the governance part in Azure.

<img alt="XKS Overview" src={useBaseUrl("img/assets/xks/operator-guide/aks-overview.jpg")} />
<img alt="XKF Overview" src={useBaseUrl("img/assets/xkf/operator-guide/aks-overview.jpg")} />

### Glossary

- Platform team: the team managing the platform (XKF)
- Tenant: A group of people (team/project/product) at the company using XKS
- Tenant: A group of people (team/project/product) at the company using XKF

## Role-based access management

@@ -76,7 +76,7 @@ Other than that, most of the access and work with the tenant resources are done

By default, the network setup is expected to be quite autonomous and usually considered to be an external service compared to everything else in the organization using it. It is possible to set up peering with internal networks, but usually it begins with a much simpler setup and then grows organically when required.

<img alt="XKS Simple Network Design" src={useBaseUrl("img/assets/xks/operator-guide/simple-network-design.jpg")} />
<img alt="XKF Simple Network Design" src={useBaseUrl("img/assets/xkf/operator-guide/simple-network-design.jpg")} />

The cluster environments are completely separated from each other, but a hub in the production subscription has a peering with them to provide static IP addresses for CI/CD tools like Terraform to access resources.

@@ -104,10 +104,10 @@ Most of the management of the workloads that the tenants deploy are handled thro

## Xenit Kubernetes Framework

XKF is set up from a set of Terraform modules that when combined create the full XKS service. There are multiple individual states that all fulfill their own purpose and build
XKF is set up from a set of Terraform modules that when combined create the full XKF service. There are multiple individual states that all fulfill their own purpose and build
upon each other in a hierarchical manner. The first setup requires applying the Terraform in the correct order, but after that ordering should not matter. Separate states are used
as it allows for a more flexible architecture that could be changed in parallel.
<img alt="XKS Overview" src={useBaseUrl("img/assets/xks/operator-guide/aks-overview.jpg")} />
<img alt="XKF Overview" src={useBaseUrl("img/assets/xkf/operator-guide/aks-overview.jpg")} />

The AKS Terraform contains three modules that are used to set up a Kubernetes cluster. To allow for blue/green deployments of AKS clusters, resources have to be split up into
global resources that can be shared between the clusters, and cluster-specific resources.
@@ -117,4 +117,4 @@ The aks-global module contains the global resources like ACR, DNS and Azure AD c
The aks and aks-core modules create an AKS cluster and configure it. This cluster will have a suffix, normally a number, to allow for temporarily creating multiple clusters
when performing a blue/green deployment of the clusters. Namespaces will be created in the cluster for each of the configured tenants. Each namespace is linked to a resource
group in Azure where namespace resources are expected to be created.
<img alt="AKS Resource Groups" src={useBaseUrl("img/assets/xks/operator-guide/aks-rg-xks-overview.jpg")} />
<img alt="AKS Resource Groups" src={useBaseUrl("img/assets/xkf/operator-guide/aks-rg-xks-overview.jpg")} />
@@ -5,7 +5,7 @@ title: Continuous Delivery

import useBaseUrl from '@docusaurus/useBaseUrl';

Continuous Delivery (CD) should be the only way to make changes to running applications in the XKS service.
Continuous Delivery (CD) should be the only way to make changes to running applications in the XKF service.
This is to ensure that changes are consistent and tracked in a centralized manner that can be observed by all.

## GitOps
@@ -27,7 +27,7 @@ The core feature of the gitops repo is that one of the pipelines automatically u

We have to grant it permissions to do this, sadly manually...

<img alt="CI access" src={useBaseUrl("img/assets/xks/developer-guide/gitops_repo_settings.png")} />
<img alt="CI access" src={useBaseUrl("img/assets/xkf/developer-guide/gitops_repo_settings.png")} />

### Service connections

@@ -1,6 +1,6 @@
---
id: gitops
title: GitOps a la XKS
title: GitOps a la XKF
---

import useBaseUrl from '@docusaurus/useBaseUrl';
@@ -9,9 +9,9 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

> GitOps works by using Git as a single source of truth for declarative infrastructure and applications. With GitOps, the use of software agents can alert on any divergence between Git and what is running in [an environment]. If there is a difference, Kubernetes reconcilers automatically update or rollback the cluster depending on what is appropriate. &dash; _[Weave Works - Guide To GitOps](https://www.weave.works/technologies/gitops/)_

XKS supports GitHub and Azure DevOps with almost identical workflows. XKF refers to these as Git providers. For simplicity, we refer to their CI/CD automation as "pipelines". If you are using GitHub, whenever this text refers to "pipeline", think "GitHub Actions workflow". As you saw in the previous section, XKS comes with a set of pipelines that automatically detect app releases and promote them through a series of environments. This allows both rapid iteration and strong validation of apps.
XKF supports GitHub and Azure DevOps with almost identical workflows. XKF refers to these as Git providers. For simplicity, we refer to their CI/CD automation as "pipelines". If you are using GitHub, whenever this text refers to "pipeline", think "GitHub Actions workflow". As you saw in the previous section, XKF comes with a set of pipelines that automatically detect app releases and promote them through a series of environments. This allows both rapid iteration and strong validation of apps.

XKS is built around [trunk-based development](https://trunkbaseddevelopment.com/).
XKF is built around [trunk-based development](https://trunkbaseddevelopment.com/).

## User story: Emilia updates an app

@@ -36,7 +36,7 @@ The `dev` and `qa` environments have `auto: true` which means that new releases

The flow is fully automatic and is triggered by the container image upload.

<img alt="Apply to dev" src={useBaseUrl("img/assets/xks/developer-guide/developer-flow-apply-dev.jpg")} />
<img alt="Apply to dev" src={useBaseUrl("img/assets/xkf/developer-guide/developer-flow-apply-dev.jpg")} />

1. The <img src={useBaseUrl("img/gitops/acr-icon.png")} style={{width: '1em'}} /> / <img src={useBaseUrl("img/gitops/ecr-icon.png")} style={{width: '1em'}} /> container image upload triggers a pipeline in the GitOps repository that runs the <img src={useBaseUrl("img/gitops/devops-icon.png")} style={{width: '1em'}} /> / <img src={useBaseUrl("img/gitops/github-icon.png")} style={{width: '1em'}} /> [gitops-promotion new](https://github.com/XenitAB/gitops-promotion#gitops-promotion-new) command. It pushes a new branch and updates the `dev` environment manifest for the app with the new tag. It then opens an "auto-merging" pull request to integrate the new tag into the main branch.
1. The <img src={useBaseUrl("img/gitops/pr-icon.png")} style={{width: '1em'}} /> pull request triggers another pipeline that runs the <img src={useBaseUrl("img/gitops/devops-icon.png")} style={{width: '1em'}} /> / <img src={useBaseUrl("img/gitops/github-icon.png")} style={{width: '1em'}} /> [gitops-promotion status](https://github.com/XenitAB/gitops-promotion#gitops-promotion-new) command. Since `dev` is the first environment in the list, it does nothing and reports success.
@@ -46,7 +46,7 @@ The flow is fully automatic and is triggered by the container image upload.

### Applying to qa

<img alt="Apply to qa" src={useBaseUrl("img/assets/xks/developer-guide/developer-flow-apply-qa.jpg")} />
<img alt="Apply to qa" src={useBaseUrl("img/assets/xkf/developer-guide/developer-flow-apply-qa.jpg")} />

1. Merging a promotion to the main branch triggers a pipeline in the GitOps repository that runs the [gitops-promotion promote](https://github.com/XenitAB/gitops-promotion#gitops-promotion-promote) command. Like `new`, it creates a branch and updates the `qa` environment manifest for the app with the new tag. Because the configuration for this environment says `auto: true` it creates an auto-merging pull request.
1. As before, this new pull request triggers another pipeline that runs the `status` command. This time there is a previous environment and the status command reads the Flux commit status for that environment. Since Flux managed to apply the change in `dev` the `status` command reports success.
@@ -58,14 +58,14 @@ Emilia's team has configured Flux to notify them when updates fail and so Emilia

### Application to prod is blocked

<img alt="Apply to prod" src={useBaseUrl("img/assets/xks/developer-guide/developer-flow-apply-prod-blocked.jpg")} />
<img alt="Apply to prod" src={useBaseUrl("img/assets/xkf/developer-guide/developer-flow-apply-prod-blocked.jpg")} />

The workflow for applying to `prod` is similar to that of `qa` above, but since Flux reported failure when applying the update to `qa`, the pipeline running the `status` command will fail and the Git provider will block merging of the pull request.

Seeing that the rollout failed, Emilia investigates and realizes that the release is missing a database migration script. She pushes an updated release tagged `cc2b7e0a` and so triggers the pipeline running the `new` command. Because the configuration says `prflow: per-app`, the command "resets" the blocked pull request to apply the updated release to the `dev` environment.

### Second attempt applying to prod

<img alt="Apply to prod" src={useBaseUrl("img/assets/xks/developer-guide/developer-flow-apply-prod-success.jpg")} />
<img alt="Apply to prod" src={useBaseUrl("img/assets/xkf/developer-guide/developer-flow-apply-prod-success.jpg")} />

Emilia's updated app with database migration is successfully applied, first to the `dev` environment and then to the `qa` environment. The `status` check for the pull request against `prod` turns green and the pull request can be merged. Since the configuration says `auto: false`, the pull request is not automatically merged. Emilia can now verify the update in the `qa` environment and then merge the pull request through the Git provider's user interface.
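
The promotion rules in this user story can be summarized in a small sketch. This is only an illustrative model of the behavior described above, not the gitops-promotion implementation: the `Environment` class and the `status_check` and `promote` functions are hypothetical names, and the Flux results are passed in as a plain dictionary rather than read from commit statuses.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    name: str
    auto: bool  # True: promotion pull requests are merged automatically


# Hypothetical configuration mirroring the story: dev and qa auto-merge, prod does not.
ENVIRONMENTS = [
    Environment("dev", auto=True),
    Environment("qa", auto=True),
    Environment("prod", auto=False),
]


def status_check(envs, index, flux_ok):
    """Model of the `status` command: the first environment always passes,
    later ones pass only if Flux applied the release in the previous one."""
    if index == 0:
        return True
    return flux_ok[envs[index - 1].name]


def promote(envs, flux_ok):
    """Walk one release through the environments and record what happens
    to the pull request opened against each of them."""
    outcome = {}
    for index, env in enumerate(envs):
        if not status_check(envs, index, flux_ok):
            outcome[env.name] = "blocked"
            break
        if not env.auto:
            outcome[env.name] = "awaiting manual merge"
            break
        outcome[env.name] = "merged"
    return outcome


# First release: Flux applies it in dev but fails in qa, so prod is blocked.
print(promote(ENVIRONMENTS, {"dev": True, "qa": False}))
# Updated release: Flux succeeds in dev and qa, so prod waits for Emilia to merge.
print(promote(ENVIRONMENTS, {"dev": True, "qa": True}))
```

In this model the pull request against `qa` is still merged for the first release, because its `status` check only looks at `dev`. The failure in `qa` shows up one step later, as the blocked pull request against `prod`.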
@@ -3,16 +3,16 @@ id: cloud-iam
title: Cloud IAM
---

Sometimes applications will need to integrate with other cloud resources as they require things like persistent data storage. When working with XKS each namespace is accompanied by an Azure resource
Sometimes applications will need to integrate with other cloud resources as they require things like persistent data storage. When working with XKF each namespace is accompanied by an Azure resource
group or an AWS account. This is where cloud resources can be created by each tenant. To keep things simple it may be a good idea to not share these resources across multiple tenants, as one of the
tenants has to own each resource. Instead, look at other options such as exposing an API inside the cluster. As one may expect, the authentication methods differ when running XKS in Azure and AWS,
tenants has to own each resource. Instead, look at other options such as exposing an API inside the cluster. As one may expect, the authentication methods differ when running XKF in Azure and AWS,
because the APIs and underlying authentication methods differ greatly. It is important to take this into consideration when reading this documentation.

## Cloud Providers

### Azure

The recommended way to authenticate towards Azure in XKS is to make use of [AAD Pod Identity](https://github.com/Azure/aad-pod-identity) which runs inside the cluster. AAD Pod Identity allows Pods
The recommended way to authenticate towards Azure in XKF is to make use of [AAD Pod Identity](https://github.com/Azure/aad-pod-identity) which runs inside the cluster. AAD Pod Identity allows Pods
within the cluster to use [managed identities](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview) to authenticate towards Azure. This removes the need
for static credentials that have to be passed to the Pods. It works by intercepting API requests before they leave the cluster and will attach the correct credential based on the source Pod of the
request.
@@ -121,7 +121,7 @@ TBD

### AWS

When authenticating towards AWS in XKS we recommend using [IAM Roles for Service Accounts](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html) (IRSA). IRSA
When authenticating towards AWS in XKF we recommend using [IAM Roles for Service Accounts](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html) (IRSA). IRSA
works by intercepting AWS API calls before leaving the cluster and appending the correct authentication token to the request. This removes the need for static security credentials as it is handled
outside the app. IRSA works by annotating a Service Account with a reference to a specific AWS IAM role. When that Service Account is attached to a Pod, the Pod will be able to assume the IAM role.
The reason IRSA works in a multi-tenant cluster is because the reference is multi-directional. The Service Account has to specify the full role ARN it wants to assume and the IAM role has to specify