TAP And Helm – A Story Of YTT Magic

In this post I want to cover an interesting scenario of customizing Tanzu Application Platform (TAP) that I have been working on, and how YTT magician John Ryan helped me reach an elegant and really cool solution to an edge case that I was not sure how to handle.

General Use Case

The use case I was trying to solve was making TAP easier to start with and more brownfield friendly for customers that are already heavily invested in Kubernetes and that manage their microservices via Helm charts.

The goal was to have a supply chain that, instead of deploying a Knative Service, would deploy a Helm chart.

I wanted to reuse as much of the Out Of The Box (OOTB) supply chains as possible and really only change what was needed for my specific use case.

General Approach

The technical approach I wanted to use was to integrate with the FluxCD Helm Controller, which provides a CRD called HelmRelease that defines, as one would expect, a Helm release.

The HelmRelease object points to a repository and a chart within it that should be deployed, carries some additional settings around how to deal with upgrades and things of that sort, and, most importantly for our use case, has a field in which I can supply the values I want to use when deploying the Helm chart.

Expected UX

The way I imagined the UX to work is that the platform administrator would install the Helm Controller on the cluster. Then either the platform administrator or the developer, depending on the roles and responsibilities within the organization, would create a HelmRepository CR for the Helm repository in which the charts we want to deploy are located.

Once that was done, from a TAP workload perspective, the user would define as parameters on the Workload YAML which chart they want to deploy, which version, and from which repository, along with any values they would like to use when deploying the Helm chart.

What was the first hurdle

One of the key differences between deploying a Helm chart and deploying, for example, a Knative Service is where the image reference lives. The Knative Service API is well defined, so the path at which we need to update the image reference is well defined too. With Helm, every chart could expect the image in a different field altogether, because the user is simply writing a values file for a templating engine and is not working with a Kubernetes-native object.

My first thought was to simply hard-code this to image.repository, as that is the default in many Helm charts. However, this doesn’t always work, and I decided that I wanted a more generic solution that could cover more scenarios.

How I planned to solve this hurdle

The idea I had in mind for solving this was to default to image.repository, as mentioned above, but to add another optional parameter a user could supply on the workload with a dot-separated path (similar in spirit to a JSON path) to the field in which we should place the image URI generated by the supply chain.

The Main Challenge

After coming up with the general idea, I started working on the Cartographer ClusterConfigTemplate resource that in the end will generate the HelmRelease yaml manifest.

All was going well until I started to deal with how to configure the image path as a variable for the helm chart.

The challenge was basically how to generate a YAML overlay in YTT from the dot-separated path I was given via the workload parameter.

To make this easier to understand, let's look at an example.

The manifest I am trying to manipulate looks something like this:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: #@ data.values.workload.metadata.name
  namespace: #@ data.values.workload.metadata.namespace
spec:
  interval: 5m
  chart:
    spec:
      chart: #@ data.values.params.chart_name
      version: #@ data.values.params.chart_version
      sourceRef:
        kind: HelmRepository
        name: #@ data.values.params.chart_name
        namespace: #@ data.values.workload.metadata.namespace
      interval: 1m
  upgrade:
    remediation:
      remediateLastFailure: true
  values: #@ chart_values()

And the data values that are received by the template look something like:

workload:
  metadata:
    name: test-iterate
    namespace: default
params:
  image_key_path: example.awesome.custom_image.field
  chart_name: tomcat
  chart_version: 1.0.0
  chart_values: 
    hello: 1
    testVar: test
    example:
      test: true
image: demo.image.url/example/test:1.0.0

The need is to take the value of data.values.image (demo.image.url/example/test:1.0.0) and overlay it onto chart_values, which is a YAML fragment, at the path specified in the image_key_path parameter.

Basically, the end goal is for the chart_values() function in the HelmRelease manifest I am templating to return, in our case:

hello: 1
testVar: test
example: 
  test: true
  awesome:
    custom_image:
      field: demo.image.url/example/test:1.0.0

After trying on my own and not being able to think of a solution, I reached out to John Ryan from the Carvel team to see if he had any ideas on how to solve this issue.

Luckily, John, who is always happy to help and has an amazing way of thinking and reasoning about tech in general and YTT in particular, came to the rescue. We started discussing the use case, breaking down the scenario and figuring out how to solve it one piece at a time.

The first part we needed to solve was how to go from a dot-separated path into the corresponding YAML fragment we could then use.

Basically, how to go from:

example.awesome.custom_image.field

To:

example: 
  awesome:
    custom_image:
      field: demo.image.url/example/test:1.0.0

But as we were discussing this and trying to come up with the mechanism, John brought up a very valid point which needed to be addressed as well. In YAML, a key can also contain a dot. This makes simply splitting on every dot a less than optimal solution.

After back and forth discussions, we came up with the idea that if we were able to split on every dot that didn’t have a “\” before it, which is a standard escaping character, and then remove the “\” character from the key itself we would have a solid solution to this issue.

Very quickly John came back to me with a set of YTT functions he wrote that solve this part of the issue.

#@ def as_dict(parts, leaf):
#@   if len(parts) == 0:
#@     return leaf
#@   end
#@   return {parts[0]: as_dict(parts[1:], leaf)}
#@ end

#@ def split_dot_path(path):
#@   # replace any dot that does NOT have a leading backslash with ':::'
#@   path = regexp.replace("([^\\\])\.", path, "$1:::")
#@   path = path.replace("\.", ".")   # consume escaping of '.'
#@   return path.split(":::")
#@ end

Basically, we split the string at every dot that is not preceded by a backslash, which is checked via a regular expression, and then pass the resulting list into a function that converts it into a nested dict, which in the end gives us the needed output.

With these two functions defined, we can run something like:

#@ path = data.values.params.image_key_path
#@ image = data.values.image
#@ image_path = as_dict(split_dot_path(path), image)

Now that is pretty slick and is extremely useful!!!
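For readers who think more easily in Python than in Starlark, the same two steps can be sketched in plain Python. This is only an illustration of the logic and is not part of the YTT template. The function names mirror the YTT ones, but the splitting uses a negative lookbehind instead of the ':::' placeholder trick (Go regular expressions, which YTT builds on, have no lookbehind), and the escaped path in the second example is a made-up value.

```python
import re


def split_dot_path(path):
    """Split on every dot that is not preceded by a backslash."""
    parts = re.split(r"(?<!\\)\.", path)
    # Drop the escaping backslash from dots that belong to a key.
    return [part.replace("\\.", ".") for part in parts]


def as_dict(parts, leaf):
    """Turn ["a", "b"] plus a leaf value into {"a": {"b": leaf}}."""
    if not parts:
        return leaf
    return {parts[0]: as_dict(parts[1:], leaf)}


image = "demo.image.url/example/test:1.0.0"
nested = as_dict(split_dot_path("example.awesome.custom_image.field"), image)
print(nested)

# Hypothetical path whose last key contains a literal, escaped dot.
escaped = split_dot_path(r"podAnnotations.example\.com/image")
print(escaped)
```

The second call shows why the escaping matters: the key example.com/image stays in one piece instead of being split into two levels.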

But then came challenge number 2. Now we needed to tackle the task of how to overlay this newly generated yaml for the image on top of the other chart values the user has supplied.

This would have been pretty simple if our chart values we needed to overlay were static and were specified in the YTT template itself as we could simply add something like:

#! original config: want to preserve all this...
---
chart_values:
  hello: 1
  testVar: test
  example:
    test: true
    
#@overlay/match by=overlay.subset({"chart_values":{}})
---
#@overlay/match-child-defaults missing_ok=True
chart_values: #@ image_path

This would work perfectly and would be great... except that the chart values we need to overlay are themselves located in a data value and are not static.

So my thought was to simply use the programmatic way of using the overlay mechanism in YTT which would basically look something like:

chart_values: #@ overlay.apply(data.values.params.chart_values, image_path)

In the example above we are basically saying to overlay what exists within the params.chart_values data value with the yaml we have defined in the variable image_path from above.

While I was hopeful this would work, sadly it did not.

The reason it didn’t work is that YTT is a very safe tool, and it does its best to make sure you don’t make mistakes when manipulating YAML.

One of the mechanisms that implements this safety is that when overlaying onto a YAML document, if you define a key that doesn’t exist in the base YAML, by default YTT won’t allow the overlay. The reasoning behind this is that in many cases you may have simply made a typo: you meant to update spec.replicas, but in the overlay you mistakenly wrote spec.replica without the “s”.

Many other tools would simply comply and would add a new field called replica which in the end would not give you the solution and end result you were looking for.

While this is a great default, that can save a lot of hassle battling with unfortunate typos, we also need a way around this which would allow an overlay to define a new key as well, if and when needed, but it should be explicit that we know that the key may not exist and that we are ok with that.
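To make the difference between the strict default and the explicit opt-in concrete, here is a toy model in Python. It is not how YTT implements overlays; the merge() function and its missing_ok flag are hypothetical names chosen only to echo the behaviour described above.

```python
def merge(base, patch, missing_ok=False):
    """Recursively merge patch into a copy of base.

    With missing_ok=False, a key that is absent from base is rejected,
    which is roughly how the strict default guards against typos.
    """
    result = dict(base)
    for key, value in patch.items():
        if key not in base:
            if not missing_ok:
                raise KeyError(f"key {key!r} not found in base document")
            result[key] = value
        elif isinstance(base[key], dict) and isinstance(value, dict):
            result[key] = merge(base[key], value, missing_ok)
        else:
            result[key] = value
    return result


base = {"spec": {"replicas": 1}}
typo = {"spec": {"replica": 3}}  # note the missing "s"

try:
    merge(base, typo)
    strict_result = "merged"
except KeyError:
    strict_result = "rejected"

lenient_result = merge(base, typo, missing_ok=True)
print(strict_result)
print(lenient_result)
```

In the strict case the typo is caught, while the lenient case quietly adds a second, useless key next to replicas. That is exactly the trade-off: we want the lenient behaviour for the image path, but only where we ask for it explicitly.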

The way this is typically defined in YTT is by adding an annotation, as in the static example from above:

#@overlay/match-child-defaults missing_ok=True

This annotation which we place above the overlaying step itself, basically says that we are aware that a key may be missing and that we are ok with that and want YTT to proceed with the overlay we have provided whether a key exists or not.

While this solution is great when using the annotation based approach for overlaying, there is no comparable solution offered for the overlay.apply() programmatic way of doing an overlay which is what we are using in this case.

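In the end, after more testing and a lot of trial and error, I worked out a solution to this issue. We can define a new function that returns our image_path value under a key (chart_values in our case) with the missing_ok=True annotation placed above it, and then use that function inside our overlay.apply() call. Because the overlay now starts at the chart_values key, the base we pass in becomes the whole params data value rather than params.chart_values. In the snippets that follow this function is called image_path(); in the full template at the end of this post it is named image_override().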

conf: #@ overlay.apply(data.values.params, image_path())

While this got us now very close to a solution, we still have one issue, which is that our new yaml that is exposed by this overlay.apply() function is not:

conf:
  hello: 1
  testVar: test
  example: 
    test: true
    awesome:
      custom_image:
        field: demo.image.url/example/test:1.0.0

But rather it is:

conf:
  chart_values:
    hello: 1
    testVar: test
    example: 
      test: true
      awesome:
        custom_image:
          field: demo.image.url/example/test:1.0.0

To me, this meant we probably just needed to pick the chart_values key out of the result of the overlay.apply() call, with something like:

conf: #@ overlay.apply(data.values.params, image_path()).chart_values

But I was wrong. The value returned by the overlay.apply() call is not a struct, where every key is an attribute of the parent object and can be referenced via dot notation. Instead, we get back a YTT custom type called a yamlfragment, which does not support dot notation because its keys are not attributes of the parent object.

While I did come up with a working solution for this, which I found by searching the history of the Carvel Slack channel, John gave me an even better approach that I did not know was possible.

Basically, while we can’t use dot notation to reference a key in a yamlfragment, we can use the [] notation.

This meant that I could use something like:

conf: #@ overlay.apply(data.values.params, image_path())["chart_values"]

While this is already a pretty awesome solution, I wanted to make it even more resilient. I wanted to support scenarios where the user didn’t define any additional chart_values, or did not define an image_key_path because they are fine with our default path of image.repository.
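The defaulting rules can be summarized in a short Python sketch before looking at the real template. This is an assumption-laden illustration rather than the YTT code itself: nest(), deep_merge() and the Python chart_values() function are hypothetical helpers, and a plain dict stands in for the workload params.

```python
import re

DEFAULT_IMAGE_KEY_PATH = "image.repository"


def nest(path, leaf):
    """Build a nested dict from a dot-separated path (backslash escapes a dot)."""
    parts = [p.replace("\\.", ".") for p in re.split(r"(?<!\\)\.", path)]
    result = leaf
    for key in reversed(parts):
        result = {key: result}
    return result


def deep_merge(base, patch):
    """Merge patch into a copy of base, adding keys that are missing."""
    result = dict(base)
    for key, value in patch.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result


def chart_values(params, image):
    """Fall back to the default path and to empty values when params omit them."""
    path = params.get("image_key_path", DEFAULT_IMAGE_KEY_PATH)
    return deep_merge(params.get("chart_values", {}), nest(path, image))


image = "demo.image.url/example/test:1.0.0"

# No parameters at all: only the default image path is produced.
defaults_only = chart_values({}, image)

# Custom path plus user-supplied values: both are preserved.
full = chart_values(
    {
        "image_key_path": "example.awesome.custom_image.field",
        "chart_values": {"hello": 1, "testVar": "test", "example": {"test": True}},
    },
    image,
)
print(defaults_only)
print(full)
```

The two calls cover the extremes: a workload with no Helm-specific parameters, and one that sets both a custom path and its own values.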

The Full Solution

In the end, after adding the defaulting logic and some checks to support as many permutations of user-provided inputs as possible, I had a fully working and pretty cool piece of YTT magic in my hands.

The final YTT template section that covers this solution turned out to be the following. Note that it reads the image from data.values.images.image.image, whereas the simplified example above used a top-level image value.

#@ load("@ytt:data", "data")
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:regexp", "regexp")

#@ def as_dict(parts, leaf):
#@   if len(parts) == 0:
#@     return leaf
#@   end
#@   return {parts[0]: as_dict(parts[1:], leaf)}
#@ end
#@ def split_dot_path(path):
#@   # replace any dot that does NOT have a leading backslash with ':::'
#@   path = regexp.replace("([^\\\])\.", path, "$1:::")
#@   path = path.replace("\.", ".")   # consume escaping of '.'
#@   return path.split(":::")
#@ end

#@ def image_path():
#@  if hasattr(data.values.params, "image_key_path"):
#@    return data.values.params.image_key_path
#@  else:
#@    return "image.repository"
#@  end
#@ end
#@ chart_config = as_dict(split_dot_path(image_path()), data.values.images.image.image)

#@ config_source = data.values.params
#@ def chart_overrides():
#@   return data.values.params
#@ end

#@ def image_override():
#@overlay/match-child-defaults missing_ok=True
chart_values: #@ chart_config
#@ end

#@ def chart_values():
#@   if hasattr(data.values.params, "chart_values"):
#@     return overlay.apply(chart_overrides(), image_override())["chart_values"]
#@   else:
#@     # image_override() returns a yamlfragment, so index with [] rather than dot notation
#@     return image_override()["chart_values"]
#@   end
#@ end

#@ def helm_release():
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: #@ data.values.workload.metadata.name
  namespace: #@ data.values.workload.metadata.namespace
  labels: #@ merge_labels({ "app.kubernetes.io/component": "helm-release", "app.tanzu.vmware.com/release-type": "helm" })
spec:
  interval: 5m
  chart:
    spec:
      chart: #@ data.values.params.chart_name
      #@ if/end hasattr(data.values.params, "chart_version"):
      version: #@ data.values.params.chart_version
      sourceRef:
        kind: #@ data.values.params.chart_repo.kind
        name: #@ data.values.params.chart_repo.name
        #@ if/end hasattr(data.values.params.chart_repo, "namespace"):
        namespace: #@ data.values.params.chart_repo.namespace
      interval: 1m
  upgrade:
    remediation:
      remediateLastFailure: true
  values: #@ chart_values()
#@ end

Summary

As you can see, this was a challenging but really interesting use case, and while it was not trivial to solve, YTT has all of the capabilities to handle these types of situations if we simply approach the tool in the right mindset.

YTT is truly amazing in my opinion because while it has a very easy to understand and easy to write mechanism for day to day simple tasks, it also has the full power of a programming language to extend the capabilities and possible use cases without needing to bring in other tooling.

The other part of YTT and Carvel in general that makes me love it so much, is the Carvel team itself.

I have worked with many open source projects, from the Kubernetes ecosystem and beyond, but have never encountered a team as eager to help and support end users as the Carvel team.

With great technology, and great people building and leading the Carvel toolset, I really think that the Carvel tools are a must for those in the Kubernetes ecosystem who want to boost their game and start exploring the true cutting edge of what is possible.

I really want to thank John Ryan again for his help in this scenario as well as many other times in the past!

For those interested in trying out this supply chain, or in more details on how to use it, you can check out the dedicated Git repository where I have uploaded the supply chain and the Cartographer template that implement this solution.
