Integrating Trivy scanner in TAP

Introduction

One of the really great features in TAP is the pluggable architecture of the scanning tools.

By default, TAP integrates with Grype as the source code and image scanner. It also currently has beta support for Snyk and Carbon Black, both limited to image scanning for now.

While these are the solutions provided by VMware, the pluggable architecture of the scanning components allows us to easily plug in our own scanner of choice. This could be an open source scanner like Trivy or a proprietary tool like Aqua or Prisma.

In this blog post, we will discuss how one could build such a custom integration, using the very common scanner Trivy.

Overview

The goal is to create a scan template that provides the exact same UX and features as the provided Grype one, changing only which scanner is used.
The default scan template we will use as our baseline is called private-image-scan-template, which is automatically created in the developer namespace defined in your TAP values file.
The scan template has the following flow, in which each step runs in its own dedicated container:

  1. Scan the image
  2. Configure access to the metadata store
  3. Input the scan results to the metadata store
  4. Check compliance against the defined scan policy
  5. Aggregate results and create the final output

In order to integrate our own scanner, the only image we need to build ourselves is the image used for the scanning process itself. We will also make some changes to the command line flags passed to the other containers, but we can use the provided images as is without any issues.

Data is passed between the containers via a shared volume, which is mounted into all containers at the path /workspace.

The general process we need to follow is:

  1. Build an image that accepts an image URI as an input via an environment variable
  2. The image should run the scan against the inputted image
  3. The image must output the scan results in CycloneDX or SPDX SBOM format
  4. The image must output a summary YAML with the CVE count in the image, split by severity
  5. The SBOM and summary YAML should be saved to files in the shared mounted volume so that they can be used in the following steps of the scanning process.
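
The contract described in the steps above can be checked with a small helper. This is only a minimal sketch and not part of the Grype template: the function name check_scan_outputs is hypothetical, and the file names scan.json and out.yaml are simply the ones used later in this post.

```shell
#!/bin/bash
# Hypothetical helper: verify that a scan step left both expected files
# in the shared volume. File names follow the ones used later in this post.
check_scan_outputs() {
  dir=$1
  for f in scan.json out.yaml; do
    # -s is true only if the file exists and is not empty
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "scan outputs present in $dir"
}
```

Running check_scan_outputs /workspace at the end of a custom scan script is one way to fail early, before the metadata store and compliance steps try to read files that are not there.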

Prerequisites

  1. The first thing we need is a TAP environment with the testing and scanning supply chain installed.
    We will utilize the out of the box scan templates defined for Grype later on as a baseline to build from.
  2. A machine with Docker installed to build our image

Let's get this working

Creating the scanning script

We will base the logic of our script on the script used in the official Grype scan template, which can be retrieved by running the following commands:

IMAGE=`kubectl get scantemplate -n default private-image-scan-template -o json | jq -r '.spec.template.initContainers[] | select(.name == "scan-plugin") | .image'`
docker pull $IMAGE
id=$(docker create $IMAGE)
docker cp $id:/image/scan-image.sh ./grype-script.sh
docker rm -v $id

We will start just like the Grype script, by accepting the scan directory, the scan file path, and whether or not to pull the image as variables.

#!/bin/bash
set -eu
SCAN_DIR=$1
SCAN_FILE=$2
PULL_IMAGE=""

if [[ $# -gt 2 ]]
then
    PULL_IMAGE=$3
fi
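
Because the script runs with set -eu and reads the image URI from the IMAGE environment variable, a missing argument or variable surfaces as a fairly cryptic "unbound variable" error. The following is a minimal sketch of an optional guard, under the assumption that you want friendlier messages; the function name validate_inputs is hypothetical and is not something the Grype script does.

```shell
#!/bin/bash
# Hypothetical guard for the top of scan-image.sh: fail early with a clear
# message instead of an "unbound variable" error from `set -u`.
validate_inputs() {
  if [ "$#" -lt 2 ]; then
    echo "usage: scan-image.sh SCAN_DIR SCAN_FILE [PULL_IMAGE]" >&2
    return 2
  fi
  if [ -z "${IMAGE:-}" ]; then
    echo "IMAGE environment variable must be set to the image URI" >&2
    return 2
  fi
  if [ ! -d "$1" ] || [ ! -w "$1" ]; then
    echo "scan directory $1 is missing or not writable" >&2
    return 2
  fi
}
```

It could be called as validate_inputs "$@" || exit $? right after the set -eu line.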

The next step is to change directories to the shared volume, where we have write permissions:

pushd $SCAN_DIR &> /dev/null

The next step is very important: we need to specify how to reference the image in the scan command in 2 different cases. The first case, designated by the variable PULL_IMAGE being an empty string, is when the source image is publicly accessible and does not require credentials to pull. The second case is where credentials are needed (this is the default assumption in TAP). In this second case, we need to pull down the image as a tarball and then tell our scanner to scan the local tarball instead of pulling from the registry.
There are many ways to pull the image, but we will use krane, the same tool VMware uses in the Grype image, as it will also be beneficial in a later step.

if [[ -z $PULL_IMAGE ]]
then
    ARGS=$IMAGE
else
    krane pull $IMAGE myimage
    ARGS="--input myimage"
fi

The next step is to run the scan itself and write the SBOM, with the vulnerability data embedded in it, to a file in a supported format, which in the case of Trivy will be CycloneDX JSON.

trivy image $ARGS --format cyclonedx --security-checks vuln > $SCAN_FILE
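
One caveat: newer Trivy releases renamed --security-checks to --scanners, so the command above may need adjusting depending on the version you install. The sketch below picks whichever flag the installed binary advertises. The function name trivy_vuln_flag is hypothetical, and detecting the flag by searching the help output is my own assumption rather than anything the scan template does.

```shell
#!/bin/bash
# Hypothetical helper: print the flag name this Trivy build understands.
# Assumption: the flag appears in the output of `trivy image --help`.
trivy_vuln_flag() {
  if trivy image --help 2>&1 | grep -q -- '--scanners'; then
    echo "--scanners"
  else
    echo "--security-checks"
  fi
}
```

The scan line would then become trivy image $ARGS --format cyclonedx "$(trivy_vuln_flag)" vuln > $SCAN_FILE.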

While this does give us a valid CycloneDX SBOM as output, TAP requires 2 specific fields to be set correctly for the metadata store to index the data, and Trivy does not set them out of the box. The needed fields are ".metadata.component.name", which should be the image repo URI without a tag or a digest at the end, and ".metadata.component.version", which should be the sha256 digest of the image.
To solve this, we will extract that data from the SBOM if it is there, otherwise parse the inputted image URI, and as a last resort ask krane for the digest. Finally, we add these fields to the outputted BOM file.

NAME=`cat $SCAN_FILE | jq -r '.metadata.component.properties[] | select(.name == "aquasecurity:trivy:RepoDigest") | .value | split("@") | .[0]'`
DIGEST=`cat $SCAN_FILE | jq -r '.metadata.component.properties[] | select(.name == "aquasecurity:trivy:RepoDigest") | .value | split("@") | .[1]'`
if [[ -z $NAME ]]; then
  NAME=`echo $IMAGE | awk -F "@" '{print $1}'`
fi
if [[ -z $DIGEST ]]; then
  DIGEST=`echo $IMAGE | awk -F "@" '{print $2}'`
fi
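if [[ -z $DIGEST ]]; then
  if [[ -n $PULL_IMAGE ]]; then
    # The image was pulled earlier into the local tarball "myimage"
    DIGEST=`krane digest --tarball myimage`
  else
    # Public image: no tarball exists, so ask the registry directly
    DIGEST=`krane digest $IMAGE`
  fi
fi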
cat $SCAN_FILE | jq '.metadata.component.name="'$NAME'"' | jq '.metadata.component.version="'$DIGEST'"' > $SCAN_FILE.tmp && mv $SCAN_FILE.tmp $SCAN_FILE
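
Note that the awk fallback above only removes a digest, so an image referenced by tag would keep its tag in the name field. As a hedged alternative, the split can be done with shell parameter expansion, which strips both while leaving a registry port alone. The function names image_repo and image_digest are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers: split an image reference with shell parameter expansion.
image_repo() {
  ref=${1%%@*}                 # drop "@sha256:..." if present
  last=${ref##*/}              # final path segment, e.g. "app:1.0"
  case $last in
    *:*) ref=${ref%:*} ;;      # strip the tag; a registry port is left alone
  esac
  echo "$ref"
}

image_digest() {
  case $1 in
    *@*) echo "${1#*@}" ;;     # text after "@"
    *)   echo "" ;;            # no digest in the reference
  esac
}
```

In the script this would look like NAME=$(image_repo "$IMAGE") and DIGEST=$(image_digest "$IMAGE") inside the two fallback branches.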

Now we need to create the summary report with the number of CVEs at each severity level. TAP has 5 defined levels: critical, high, medium, low, and unknown. While this may seem easy, the issue is that the SBOM may contain multiple different ratings for a single vulnerability, coming from different sources. In this example, I have decided to go with the highest severity found for each CVE. CycloneDX also has an "info" severity, which the script below counts as low.

critical=0
high=0
medium=0
low=0
unknown=0

for row in $(cat $SCAN_FILE | jq -r '.vulnerabilities[] | @base64'); do
  VULN=`echo ${row} | base64 --decode`
  if [[ `echo $VULN | jq '.ratings[] | select(.severity == "critical")'` != "" ]]; then
    critical=$((critical+1))
  elif [[ `echo $VULN | jq '.ratings[] | select(.severity == "high")'` != "" ]]; then
    high=$((high+1))
  elif [[ `echo $VULN | jq '.ratings[] | select(.severity == "medium")'` != "" ]]; then
    medium=$((medium+1))
  elif [[ `echo $VULN | jq '.ratings[] | select(.severity == "low")'` != "" ]]; then
    low=$((low+1))
  elif [[ `echo $VULN | jq '.ratings[] | select(.severity == "info")'` != "" ]]; then
    low=$((low+1))
  else
    unknown=$((unknown+1))
  fi
done
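
The "highest rating wins" rule can also be isolated into a small function, which makes it easy to check without an SBOM at hand. This is a minimal sketch that mirrors the mapping used in the loop above, including folding "info" into low; the function name highest_severity is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: given every severity reported for one CVE,
# print the bucket it should be counted in (highest rating wins).
highest_severity() {
  for level in critical high medium low info; do
    for s in "$@"; do
      if [ "$s" = "$level" ]; then
        # "info" has no TAP bucket, so it is folded into "low" as above
        [ "$level" = "info" ] && level=low
        echo "$level"
        return 0
      fi
    done
  done
  echo "unknown"
}
```

Inside the loop it could be fed with the severities of one vulnerability, for example highest_severity $(echo "$VULN" | jq -r '.ratings[]?.severity'), and the matching counter incremented from its output.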

Now that we have the counters for each level we can create the summary YAML file:

trivyVersion=`trivy --version --format json | jq -r .Version`
cat << EOF > $SCAN_DIR/out.yaml
scan:
  cveCount:
    critical: $critical
    high: $high
    medium: $medium
    low: $low
    unknown: $unknown
  scanner:
    name: Trivy
    vendor: Aqua
    version: $trivyVersion
  reports:
  - /workspace/scan.json
EOF

And the final step is to print the content of our 2 files, first the SBOM itself and then the summary YAML:

cat $SCAN_FILE
cat $SCAN_DIR/out.yaml

The script can now be saved on your machine. In my case, I called it scan-image.sh, just like VMware's script in the Grype scanner.

Creating the Dockerfile

Now that we have our script configured and ready to be used, we need to build a container image with the script and all needed tools within it.
While most of the dependencies are very common and can be downloaded as precompiled binaries, krane is not available as a precompiled binary and building it from source would be too much of a pain. To solve this, we will simply copy that file from the Grype scanner image we referenced earlier, as part of the Dockerfile and build process.
I have decided to use Ubuntu as my base image. The dependencies we need beyond krane are jq, wget, curl, trivy, and of course the script we built before.

chmod 755 scan-image.sh
IMAGE=`kubectl get scantemplate -n default private-image-scan-template -o json | jq -r '.spec.template.initContainers[] | select(.name == "scan-plugin") | .image'`
cat <<EOF > Dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y wget curl && rm -rf /var/lib/apt/lists/*
RUN wget "http://stedolan.github.io/jq/download/linux64/jq" && chmod 755 jq && mv jq /usr/local/bin/jq && curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin && mkdir /workspace
COPY --from=$IMAGE /usr/local/bin/krane /usr/local/bin/krane
COPY scan-image.sh /usr/local/bin/
USER 65534:65533
EOF
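
Once the image is built, it can be worth confirming that every tool the script relies on actually ended up on the PATH. The following is a minimal sketch of such a smoke test; the function name missing_tools is hypothetical, and it assumes you run it from a shell inside the built image.

```shell
#!/bin/bash
# Hypothetical smoke test: print any command from the argument list
# that cannot be found on PATH; returns non-zero if something is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return $status
}
```

For this image the call would be missing_tools jq wget curl trivy krane scan-image.sh, which should print nothing if the build went as planned.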

Now that we have our Dockerfile created, we can build the image and tag it with the repo URL we want this saved to, for example:

docker build . -t harbor.vrabbi.cloud/tap/trivy-scanner:1.0.0

Now we can push the image to our registry:

docker push harbor.vrabbi.cloud/tap/trivy-scanner:1.0.0

Creating the scan template CR

The final preparation step is to create our custom scan template CR YAML.
The first step is to output the out of the box scan template to a file we can edit:

kubectl get scantemplate -n default private-image-scan-template -o yaml > private-scan-template.yaml

We now need to clean up the YAML a bit and remove the following fields:

  • .metadata.annotations
  • .metadata.creationTimestamp
  • .metadata.generation
  • .metadata.labels
  • .metadata.namespace
  • .metadata.resourceVersion
  • .metadata.uid

The only field under metadata we should have left is the metadata.name field.
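
If you prefer not to edit the file by hand, the same cleanup can be sketched with jq, working on JSON instead of YAML (kubectl apply accepts either). The function name strip_metadata is hypothetical, and this is just one possible way to do it.

```shell
#!/bin/bash
# Hypothetical helper: read a scan template as JSON on stdin and drop the
# cluster-managed metadata fields listed above, keeping metadata.name.
strip_metadata() {
  jq 'del(.metadata.annotations,
          .metadata.creationTimestamp,
          .metadata.generation,
          .metadata.labels,
          .metadata.namespace,
          .metadata.resourceVersion,
          .metadata.uid)'
}
```

For example: kubectl get scantemplate -n default private-image-scan-template -o json | strip_metadata > private-scan-template.json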
Now we need to make a few changes to the spec of the scan template itself.
The first change is to point the initContainer with the name "scan-plugin" to use the image we just created and pushed to our registry.
After that we need to change the arguments passed to the container from:

./image/scan-image.sh /workspace /workspace/scan.xml true

To:

scan-image.sh /workspace /workspace/scan.json true

That deals with our scan step. Now we need to make a change in the 3rd initContainer, which has the name "metadata-store-plugin". Here the change is also in the arguments passed to the container, where we need to change the format it expects as an input as well as the SBOM file name. We need to change the args section from:

    - args:
      - image
      - add
      - --cyclonedxtype
      - xml
      - --path
      - /workspace/scan.xml

To:

    - args:
      - image
      - add
      - --cyclonedxtype
      - json
      - --path
      - /workspace/scan.json

The final change is in the policy compliance step, which is the final initContainer and is named "compliance-plugin". Here we also need to change the file name and the input format type in the args section, from:

    - args:
      - check
      - --policy
      - $(POLICY)
      - --scan-results
      - /workspace/scan.xml
      - --parser
      - xml
      - --format
      - yaml
      - --output
      - /workspace/compliance-plugin/out.yaml

To:

    - args:
      - check
      - --policy
      - $(POLICY)
      - --scan-results
      - /workspace/scan.json
      - --parser
      - json
      - --format
      - yaml
      - --output
      - /workspace/compliance-plugin/out.yaml
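
The three argument changes above are mechanical, so they can be sketched with sed. This assumes the exported YAML contains the strings exactly as shown in this post; the function name switch_to_json is hypothetical, and the image of the "scan-plugin" container still has to be changed separately.

```shell
#!/bin/bash
# Hypothetical helper: apply the XML -> JSON argument changes described above
# to an exported template. Usage: switch_to_json IN.yaml > OUT.yaml
switch_to_json() {
  sed \
    -e 's|\./image/scan-image\.sh|scan-image.sh|' \
    -e 's|/workspace/scan\.xml|/workspace/scan.json|g' \
    -e 's|^\([[:space:]]*-[[:space:]]*\)xml[[:space:]]*$|\1json|' \
    "$1"
}
```

The last expression only touches list items whose whole value is xml, which covers both the --cyclonedxtype and the --parser arguments. It is worth diffing the result against the original before applying it.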

Replacing the initial Scan Template

Because the initial scan template is managed by a Carvel package, a change made in place will be undone within a matter of minutes, the next time kapp-controller reconciles our packages.
To work around this for testing, we can pause the reconciliation of the relevant package:

kubectl patch pkgi -n tap-install grype --patch '{"spec": {"paused": true}}' --type=merge

Now we can apply the updated scantemplate to the cluster:

kubectl apply -f private-scan-template.yaml

If you want to revert to the initial scan template, you can simply unpause the package and let kapp-controller restore it:

kubectl patch pkgi -n tap-install grype --patch '{"spec": {"paused": false}}' --type=merge
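
Since a paused package is easy to forget about, it can help to check its state before and after testing. The following is a minimal sketch using kubectl's jsonpath output; the function name pkgi_paused is hypothetical, and the defaults are simply the namespace and package name used in the commands above.

```shell
#!/bin/bash
# Hypothetical helper: report whether kapp-controller reconciliation of a
# PackageInstall is paused. Namespace and name default to the ones used above.
pkgi_paused() {
  ns=${1:-tap-install}
  name=${2:-grype}
  paused=$(kubectl get pkgi -n "$ns" "$name" -o jsonpath='{.spec.paused}')
  if [ "$paused" = "true" ]; then
    echo "paused"
  else
    echo "reconciling"
  fi
}
```

Calling pkgi_paused with no arguments checks the grype package in tap-install.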

Same UX – Different Scanner

As TAP is all based on Kubernetes, which is always the source of truth, one of the great things is that TAP GUI picks up which scanner was used to scan the image, and we get visibility into that in the supply chain plugin inside of TAP GUI.

Even with the custom scanner in use, we lose none of the capabilities and nice features we get with TAP, like the visibility into CVEs from the supply chain plugin.

The data is also integrated in exactly the same way in the new security analysis plugin, which was added in TAP 1.3 and gives you a clear and concise view of the entire security landscape of your workloads.

Summary

While there is a bit of work in building an integration like this, especially in parsing the data and making sure you output it in the correct way, the fact that TAP allows us to do this while keeping the same great UX is pretty amazing.
As CycloneDX is becoming a widely adopted standard, and scanners like Aqua Enterprise and Prisma Cloud already support it as an output format, integrating them into TAP is almost identical to what we have done here. The huge benefit is that you can truly integrate TAP with your existing tooling in a non-disruptive yet very beneficial way.
