TAP GitOps Reference Implementation Overview
Tanzu Application Platform (TAP) 1.5 introduces a new and very exciting installation model based on the GitOps methodology.
The new installation model offers a secure and simple way to manage a TAP installation, no matter what type of configuration you are targeting.
What Problems Is This Trying To Solve
One issue we must deal with in any system we manage is deciding where our source of truth lives and where we store our configuration.
TAP requires credentials at installation time for several types of systems, such as Git repositories, image registries, authentication providers for TAP GUI, and Kubernetes credentials (in multi-cluster setups).
The challenge is that, with credentials in the mix, simply storing the config file in Git as-is is not secure. At the same time, the industry has standardized on Git as the source of truth, and keeping config locally on your machine is not sustainable either.
Another issue with the current installation flow is the imperative nature of the process. While the installation can be greatly streamlined with some experience, the ideal is to take the Kubernetes approach of being declarative rather than imperative.
We also want to solve the issue of drift detection and remediation. With the current TAP installation model, drift is very common: what is configured in the TAP values file is often not what is actually running on the cluster. This happens because people add overlays manually, tweak configuration inline (for example by pausing package reconciliation), and make other changes that simply happen, whether we like it or not.
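To make the idea of drift concrete, here is a minimal Python sketch that compares a desired configuration (what is in Git) with an observed one (what is on the cluster) and lists the paths that differ. It is only an illustration of the concept, not how TAP or Carvel implement reconciliation; the function name and the sample keys are hypothetical.

```python
def detect_drift(desired, actual, prefix=""):
    """Return the dotted paths at which `actual` differs from `desired`.

    Both arguments are nested dicts. A key present on only one side
    counts as drift (a missing key is treated the same as a None value).
    """
    drifted = []
    for key in sorted(set(desired) | set(actual)):
        path = f"{prefix}{key}"
        want, have = desired.get(key), actual.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse into nested sections and keep building the path.
            drifted.extend(detect_drift(want, have, prefix=f"{path}."))
        elif want != have:
            drifted.append(path)
    return drifted


if __name__ == "__main__":
    # Hypothetical sample data: Git says one thing, the cluster another.
    in_git = {"profile": "full", "tap_gui": {"service_type": "ClusterIP"}}
    on_cluster = {"profile": "full", "tap_gui": {"service_type": "LoadBalancer"}}
    print(detect_drift(in_git, on_cluster))
```

A GitOps controller goes one step further than this sketch: once it sees a difference, it reapplies the desired state instead of just reporting it.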
Another issue is that while installing TAP itself can be very simple, TAP is typically not all we need to install. We usually want the Tanzu Build Service (TBS) full dependencies, which means another package repository and another package installation. We may want Kubernetes operators such as the RabbitMQ operator or the VMware SQL operators, to allow easy provisioning of data services for our workloads to bind to. Another common example is an optional package that ships with TAP but is not part of a profile, such as External Secrets Operator (ESO), Spring Cloud Gateway (SCG), or Application Configuration Service (ACS).
The challenge is that while in our minds this is all one solution, the installation process is a set of bespoke imperative commands with different configuration files. They all need to be managed in combination with one another, but nothing ties them together and provides the much-needed glue.
What Is The Traditional Installation Model
In a simple scenario, the traditional TAP installation generally consists of the following steps:
- Accept The EULAs
- Install Tanzu CLI
- Relocate Tanzu Cluster Essentials images to your own registry
- Relocate TAP Package Repository to your own registry
- Relocate TBS Full Dependencies Package Repository to your own registry
- Install Tanzu Cluster Essentials
- Add the TAP package repository to your cluster
- Add the TBS Full Dependencies repo to your cluster
- Generate your TAP values file
- Install TAP via a single Tanzu CLI command
- Install the TBS full dependencies package to your cluster
After this, there are many additional steps one may take, such as those mentioned in the previous section.
What Does The GitOps Flow Look Like
In the GitOps flow the steps start out the same:
- Accept The EULAs
- Install Tanzu CLI
- Relocate Tanzu Cluster Essentials images to your own registry
- Relocate TAP Package Repository to your own registry
- Relocate TBS Full Dependencies Package Repository to your own registry
- Install Tanzu Cluster Essentials
At this point, though, things change.
The remaining steps in the GitOps flow depend heavily on which of the two provided solutions you use to manage the secrets in your TAP configuration.
Choosing A Secret Management Solution
As mentioned above, one of the things we must take care of when storing configuration in Git is protecting the credentials that are part of that configuration.
In TAP 1.5, the GitOps installation process supports two options: SOPS and ESO.
Let’s quickly look at what each solution provides, and why one may want to use it.
SOPS – Mozilla Secrets OPerationS
With this approach, which is the easier approach to configure, we store all of our sensitive data in an encrypted format within our Git repository.
SOPS is a tool from Mozilla that allows us to easily encrypt and decrypt files.
SOPS supports YAML, JSON, ENV, INI and BINARY formatted files and can encrypt the needed data with different mechanisms such as:
- AWS KMS
- GCP KMS
- Azure Key Vault
- age
- PGP
This approach is the easiest way to get started. It allows all of our configuration to be stored in Git, and it has no dependencies on other tools existing in your environment.
What Are The Remaining Steps If We Choose SOPS
While SOPS supports multiple mechanisms for managing the keys used to encrypt and decrypt our data, age is currently the only supported mechanism in the TAP GitOps flow. With some customization you could make PGP work as well, but since age is what is supported, that is what we will use.
The remaining steps if we go down this path are:
- Install the SOPS and age CLIs on your machine
- Scaffold a Git repository with the needed configuration, based on the GitOps download bundle included in the TAP 1.5 downloads
- Create an age key, which will be used to encrypt our sensitive data
- Generate a TAP values file, excluding any sections that are sensitive
- Generate another YAML file with all of the sensitive fields you left out of the TAP values file in the previous step
- Encrypt the sensitive file using the SOPS CLI with the age key generated above
- Push the configuration to a Git repository
- Run the provided bash script to configure the GitOps flow. This creates a secret with your new age key so the sensitive values can be decrypted, and installs a Carvel App CR that pulls down the Git configuration, deploys TAP for you automatically, and keeps the cluster's config in sync with the configuration in your Git repository.
Getting started with this approach is extremely easy, and the documentation does a very good job of walking you through all of the needed steps to get it up and running in no time.
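The two "generate a values file" steps above boil down to splitting one configuration into a non-sensitive part and a sensitive part. As a rough illustration, here is a minimal Python sketch that does this on a nested dict. The function name and the dotted paths in SENSITIVE_PATHS are hypothetical examples; which fields count as sensitive depends on your own values file, and the encryption itself is still done with the SOPS CLI.

```python
import copy

# Hypothetical dotted paths to sensitive fields; adjust to your own values file.
SENSITIVE_PATHS = [
    "shared.image_registry.password",
    "tap_gui.app_config.auth.providers.github.development.clientSecret",
]


def split_values(values, sensitive_paths):
    """Return (public, sensitive) copies of a nested values dict.

    Every dotted path in `sensitive_paths` that exists is removed from the
    public copy and recreated, with the same nesting, in the sensitive copy.
    The input dict is left untouched.
    """
    public = copy.deepcopy(values)
    sensitive = {}
    for path in sensitive_paths:
        keys = path.split(".")
        # Walk down to the parent of the sensitive leaf, if it exists.
        parent = public
        for key in keys[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break
        if not isinstance(parent, dict) or keys[-1] not in parent:
            continue  # this path is not present in the values file
        value = parent.pop(keys[-1])
        # Recreate the same nesting in the sensitive dict.
        target = sensitive
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return public, sensitive
```

The public half is what gets committed as the plain TAP values file, while the sensitive half is what you would write to a separate YAML file and encrypt with SOPS before committing.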
ESO – External Secrets Operator
While SOPS is the easiest solution to implement, many organizations have a central secret management system such as HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or one of the many other tools in this area.
External Secrets Operator is a Kubernetes operator that reads information from a third-party service, like those mentioned above and many more, and automatically injects the values into Kubernetes Secrets.
This approach allows us to store the sensitive values needed for the TAP installation in a secret management tool, and have ESO pull them down into our cluster as part of the installation process.
This approach is more involved in terms of setup requirements, and it splits our source of truth between two different systems, because our sensitive data is not saved in Git. However, it aligns with the secret management strategy many companies are adopting: storing all sensitive data in a dedicated system built exactly for this purpose.
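To give a feel for what ESO works with, here is a small Python sketch that builds an ExternalSecret manifest as a plain dict and prints it as JSON. It is an illustration under assumptions, not the manifests the TAP GitOps bundle generates: the resource names, the remote key, and the refresh interval are made up, and the `external-secrets.io/v1beta1` API version should be checked against the ESO version installed in your cluster.

```python
import json


def external_secret(name, namespace, store_name, remote_key, properties):
    """Build an ExternalSecret manifest as a plain dict.

    `properties` maps each key of the resulting Kubernetes Secret to the
    property to read from the remote secret identified by `remote_key`.
    """
    return {
        # Assumption: v1beta1 API; verify against your installed ESO version.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": "1h",  # how often ESO re-reads the remote value
            "secretStoreRef": {"name": store_name, "kind": "SecretStore"},
            "target": {"name": name},  # name of the Kubernetes Secret to create
            "data": [
                {
                    "secretKey": secret_key,
                    "remoteRef": {"key": remote_key, "property": prop},
                }
                for secret_key, prop in sorted(properties.items())
            ],
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    manifest = external_secret(
        name="tap-sensitive-values",
        namespace="tap-install",
        store_name="example-secret-store",
        remote_key="example/tap/sensitive-values",
        properties={"registry_password": "registry_password"},
    )
    print(json.dumps(manifest, indent=2))
```

The point of the shape is that only a pointer to the secret lives in Git; the value itself stays in the external store and ESO materializes it in the cluster.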
What Are The Remaining Steps If We Choose ESO
In TAP 1.5, the ESO integration is curated specifically for AWS Secrets Manager and EKS clusters. With some changes to the scripts and configuration files, one could make it work with any provider ESO supports, but EKS with AWS Secrets Manager is the only tested and verified combination as of now.
The remaining steps if we go down this path are:
- Install and configure the AWS CLI
- Scaffold a Git repository with the needed configuration, based on the GitOps download bundle included in the TAP 1.5 downloads
- Generate a TAP values file, excluding any sections that are sensitive
- Generate another YAML file with all of the sensitive fields you left out of the TAP values file in the previous step
- Create two IAM policies to allow the GitOps flow, as well as the TAP installation itself, to pull the secrets from AWS Secrets Manager
- Create two IRSA pairs for your cluster, binding the policies from above to the two service accounts needed in your cluster for the GitOps flow and the TAP installation itself
- Create the needed secrets in AWS Secrets Manager using the AWS CLI
- Create a configuration file, which will be stored in Git, pointing to the secrets in which you have stored the sensitive TAP values
- Push the configuration to a Git repository
- Run the provided bash script to configure the GitOps flow. This creates secrets to access your container registry, installs a Carvel App CR that pulls down the Git configuration and deploys TAP for you automatically, installs ESO, and generates the ExternalSecret CRs needed to pull down the sensitive values from AWS Secrets Manager and keep them in sync. Just as with SOPS, this flow keeps the cluster's config in sync with the configuration in your Git repository.
As you can tell, this is a bit more work to configure, but here too the documentation is very well written, with good examples, and clearly guides you along the path to setting this up.
Adding Custom Configurations To The Mix
While this approach already has huge value for installing TAP itself, it becomes even more powerful when we add our own configurations to the mix, making it a much more holistic, end-to-end solution for our cluster configuration.
Let's see a few examples of where this could be extremely valuable:
TBS Full Dependencies
TAP comes pre-baked with the lite dependencies of TBS, which are fine for a POC but should really be replaced with the full dependencies package in a production setup.
The TBS dependencies installation is similar to TAP in that we need to install a dedicated package repository and then install a package from that repository into our cluster.
We can easily add the manifests for the repository and package installation in our GitOps repo, and let the system install it for us automatically, directly alongside our TAP installation itself.
You can see an example of such a configuration in my Sample TAP GitOps Repository.
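As a rough sketch of what "a package repository plus a package installation" looks like, the Python below builds the two Carvel resources as plain dicts and prints them as JSON, which Kubernetes tooling generally accepts alongside YAML. The resource names, the registry URL, the package name `full-tbs-deps.tanzu.vmware.com`, the version, and the service account name are all assumptions for illustration; take the real values from your relocated bundle and your own cluster.

```python
import json

NAMESPACE = "tap-install"


def package_repository(name, bundle_image):
    """Carvel PackageRepository pointing at a relocated imgpkg bundle."""
    return {
        "apiVersion": "packaging.carvel.dev/v1alpha1",
        "kind": "PackageRepository",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {"fetch": {"imgpkgBundle": {"image": bundle_image}}},
    }


def package_install(name, package_name, version, service_account):
    """Carvel PackageInstall selecting one package version from a repository."""
    return {
        "apiVersion": "packaging.carvel.dev/v1alpha1",
        "kind": "PackageInstall",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            # Service account kapp-controller uses to apply the package.
            "serviceAccountName": service_account,
            "packageRef": {
                "refName": package_name,
                "versionSelection": {"constraints": version},
            },
        },
    }


if __name__ == "__main__":
    # Every value below is a hypothetical placeholder.
    repo = package_repository(
        "tbs-full-deps-repository",
        "registry.example.com/tap/tbs-full-deps:1.0.0",
    )
    install = package_install(
        "full-tbs-deps",
        "full-tbs-deps.tanzu.vmware.com",
        "1.0.0",
        "example-install-sa",
    )
    for manifest in (repo, install):
        print(json.dumps(manifest, indent=2))
```

Committing manifests like these next to the TAP configuration is all it takes for the GitOps flow to install the full dependencies in the same sync as TAP itself.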
Configuring TAP Overlays
While TAP exposes many knobs for us to configure things as we need, there are situations where we may need ytt overlays, either to work around a bug or simply to change a decision the TAP engineering team made in a component that does not meet the needs of our specific environment.
The TAP team understands the need for such a mechanism well, and has provided a great way to apply overlays directly via secret references in our TAP values file.
In the traditional, non-GitOps installation mode of TAP, we would simply create the tap-install namespace, create the secrets containing our overlays, and add references to those secrets in our TAP values before installing TAP.
As the tap-install namespace and TAP itself are now both part of the GitOps installation flow, we need another way to create these secrets, and ideally they should be managed side by side with our TAP config.
We can simply store our secret manifests alongside our TAP configuration in the same repo, and we can have the GitOps process manage these secrets for us as well!
You can see some examples of these types of overlays and the setup in my Sample TAP GitOps Repository.
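For illustration, here is a minimal Python sketch of the two pieces involved: a Secret manifest that carries a ytt overlay, and the matching `package_overlays` entry for the TAP values file. The overlay itself, the Deployment name it targets, and the secret name are hypothetical; a real overlay depends on the package you are patching.

```python
import json

# Hypothetical ytt overlay: set replicas on a Deployment named "example-server".
OVERLAY = """\
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind": "Deployment", "metadata": {"name": "example-server"}})
---
spec:
  #@overlay/match missing_ok=True
  replicas: 2
"""


def overlay_secret(name, overlay_text, namespace="tap-install"):
    """Secret manifest holding a ytt overlay as a plain-text file."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "stringData": {"overlay.yaml": overlay_text},
    }


def package_overlays_entry(package, secret_name):
    """Entry for the `package_overlays` list in the TAP values file."""
    return {"name": package, "secrets": [{"name": secret_name}]}


if __name__ == "__main__":
    # Names are placeholders; the secret is committed to Git with the TAP config.
    print(json.dumps(overlay_secret("example-overlay", OVERLAY), indent=2))
    print(json.dumps({"package_overlays": [package_overlays_entry("tap-gui", "example-overlay")]}, indent=2))
```

Both pieces live in the same repository, so the overlay secret and the values entry that references it are reviewed and rolled out together.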
Installing Operators
Another thing many people will want in a TAP cluster is additional operators, such as the VMware Data Services operators for MySQL, PostgreSQL and RabbitMQ, or any other operators you want to use in the cluster alongside TAP.
This is not limited to operators either; it can be any software you want installed in your cluster.
Typically these operators would also be installed imperatively, and we would face the same management challenges described above for TAP.
Because this solution is easily customizable, we can add the installation of these operators into our repo alongside the TAP installation config, and manage the entire setup together.
You can see some examples of these types of additions in my Sample TAP GitOps Repository.
Creating TAP Resources
Another common use case is installing your own custom supply chains, cluster instance classes, or any other TAP-related custom resources.
TAP is truly amazing in the number of capabilities it offers. As it is a Kubernetes-native solution where almost everything is defined in a dedicated CR, we have many extension points and configurations we can set up to make our clusters fit our needs perfectly.
By adding these custom supply chains, cluster instance classes, or really any other TAP-related resources, you can truly manage YOUR TAP configuration in a unified manner, following industry-standard approaches such as GitOps.
You can see an example of some custom resources I have put together in my Sample TAP GitOps Repository.
Summary
The new GitOps installation model is extremely beneficial, and makes the management of TAP environments a much cleaner and more seamless experience.
It is great to see GitOps being added as an official way to manage TAP environments, and I’m sure we will see more and more innovation in this area as the product keeps maturing and growing.
You can tell that the engineers behind this feature thought deeply about how this flow should look. They have built the solution in a way that makes lifecycle management easy and should make upgrades much more seamless, while leaving enough room for us to add our own customizations and fit the solution to our needs in a simple and secure manner!