Dangerous Defaults
Don’t let your IaC Configurations Drift
Manage configuration drift with IaC: understand, automate, and scan modules for security risks.
Configuration drift, like s—, happens.
Configuration drift happens when a system’s actual state differs from its original or intended state. We often think of drift as the result of people’s actions, e.g., patching software, adding new resources, or making temporary “fixes” in prod without updating the codebase. But drift also arises when the expected state of the system was never clearly understood in the first place, e.g., when we use PaaS/IaaS and accept the default settings, or when we don’t pin the versions of the containers included in a project.
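To make the container example concrete, here is a minimal sketch of a check that flags image references without a fixed version. The function names `is_pinned` and `unpinned_images` are our own for illustration, and the rule is deliberately simple: a digest counts as pinned, as does any tag other than `latest`. Keep in mind that tags can be moved to a different image, so a digest is the stricter choice.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference names a digest or a non-'latest' tag."""
    # A digest (name@sha256:...) identifies exactly one image.
    if "@sha256:" in image:
        return True
    # Only look at the part after the last "/", so a registry port such as
    # registry.example.com:5000/app is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all, which resolves to "latest"
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


def unpinned_images(images: list[str]) -> list[str]:
    """Return the image references that would float to whatever is newest."""
    return [image for image in images if not is_pinned(image)]
```

Running `unpinned_images` over the image references in a project gives a quick list of places where the expected state is not actually written down.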
Wasn’t Infrastructure as Code (IaC) supposed to fix this? The goal of IaC is to increase development velocity: more features, deployed sooner. IaC and Continuous Deployment (the CD in CI/CD) can be powerful tools for establishing a known, expected state of the system. But they require full insight into all the buttons and knobs (configuration settings) of the infrastructure before it’s deployed.
With the right tools and automation techniques, it’s possible to minimize configuration drift. However, if teams (Dev, DevOps, IT, SRE, etc.) can make manual changes to apps or infrastructure (maybe by ssh’ing into prod, just this one time), those changes often go unnoticed until a breach or failure occurs. When that happens, engineers and developers have to spend valuable time working out why something that works in one environment behaves differently in another.
There are 2 key tactics that can help manage configuration drift:
This requires an automated deployment environment and a central repository of configurations. But when some configurations are unknown, because they are not explicitly set by templates/modules or are embedded in the settings provided by the IaaS/PaaS, there is still a risk of configuration drift and misconfiguration.
We’re adopting IaC and CD to go faster. To release more software sooner. But cloud native applications are complex. We use Terraform for our infrastructure.
Take the setup of AWS’s Elastic Kubernetes Service (EKS), for example. It can be very tedious. So we cheat a little: people create modules for common use cases. Modules reduce the learning curve and workload of adopting new technologies, which helps teams go faster. Many modules come from an “official” provider, such as the official HashiCorp AWS modules, and there are modules for specific use cases. For AWS alone, there are about 5000 unofficial modules covering a variety of special use cases.
Modules are great. We just want to understand the configuration and security risks associated with each module before deploying it into prod. There is no guarantee that current scanners (open source or commercial) will detect the configuration parameters in most modules unless someone has extended the ruleset or queries to cover those specific modules and their parameters. If you’re using community Kubernetes modules, you will probably get a false sense of security when you run your Terraform files through scanners and they report no complaints or errors.
Example: Terraform using the kubectl-module:
First of all, when you use an unofficial or a partner module, you need to know exactly why. Most of the time it is your DevOps/Developer velocity, and that is indeed an okay reason. If it is not, then stick to the official modules whose side-effects are well-understood and automatically checked.
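Knowing why a module is there starts with knowing which modules are in use. The following is a rough sketch, not a real HCL parser, that lists `module` blocks in a Terraform file and flags those whose `source` is not on a list you have reviewed, or that have no `version` pinned. The module names and sources in `EXAMPLE_TF` are made up, `audit_modules` and `reviewed_sources` are our own names, and the regular expressions assume `terraform fmt`-style layout with the closing brace of each block at the start of a line.

```python
import re

# Matches a top-level module block; assumes its closing brace starts a line.
MODULE_BLOCK = re.compile(
    r'^module\s+"(?P<name>[^"]+)"\s*\{(?P<body>.*?)^\}',
    re.DOTALL | re.MULTILINE,
)
# Matches simple string assignments for the two attributes we care about.
ATTRIBUTE = re.compile(
    r'^\s*(?P<key>source|version)\s*=\s*"(?P<value>[^"]*)"',
    re.MULTILINE,
)

# Hypothetical Terraform content, for illustration only.
EXAMPLE_TF = '''
module "eks" {
  source  = "example-org/eks/aws"
  version = "1.2.3"
}

module "kubectl" {
  source = "example-org/kubectl/aws"
}
'''


def audit_modules(terraform_text: str, reviewed_sources: set[str]) -> list[tuple[str, str]]:
    """Return (module name, issue) pairs for modules that need a closer look."""
    findings = []
    for block in MODULE_BLOCK.finditer(terraform_text):
        attrs = {m["key"]: m["value"] for m in ATTRIBUTE.finditer(block["body"])}
        source = attrs.get("source", "")
        if source not in reviewed_sources:
            findings.append((block["name"], f"unreviewed source: {source}"))
        # The version argument only applies to registry modules; this sketch
        # assumes every module in the file comes from a registry.
        if "version" not in attrs:
            findings.append((block["name"], "no version pinned"))
    return findings
```

Calling `audit_modules(EXAMPLE_TF, reviewed_sources={"example-org/eks/aws"})` reports the `kubectl` module twice: once for its unreviewed source and once for the missing version pin. A list like this does not replace a scanner, but it shows where a clean scan result may not mean much.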
In a perfect world, all of the configuration settings would be understood and set in a centralized repository before software has been deployed. But we’re realists. We often start by looking at production systems for misconfigurations and the unintended side-effects of configurations. To figure this out, we export the configuration of deployed cloud infrastructure to Terraform.
You can do this for AWS by running:
The current cloud infrastructure configuration is exported as Terraform files using the official providers (no community modules are used; it’s “pure” Terraform ;-). These files are then scanned with CoGuard for misconfigurations, security best practices, and compliance. Exporting the full configuration settings gives us a way to understand what the modules have actually done as part of their inclusion and, more importantly, what needs to be adjusted. You can fix the variances/issues directly by setting the corresponding configuration parameters in your Terraform file (and checking them into your IaC repo), or you can file a bug report with the module maintainers so that they address the issue.
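The idea of comparing what the repo declares with what the export shows can be sketched in a few lines. This is only an illustration of the comparison, not how CoGuard works internally. It assumes both sides have already been reduced to flat dictionaries of setting name to value, and the keys and values in `DECLARED` and `EXPORTED` are invented for the example.

```python
from typing import Any

# Settings explicitly set in the IaC repo (hypothetical values).
DECLARED = {
    "aws_s3_bucket.logs.versioning": "Enabled",
    "aws_eks_cluster.main.version": "1.29",
}
# Settings found in the export of the deployed infrastructure (hypothetical).
EXPORTED = {
    "aws_s3_bucket.logs.versioning": "Suspended",
    "aws_eks_cluster.main.version": "1.29",
    "aws_eks_cluster.main.endpoint_public_access": True,
}


def classify_settings(declared: dict[str, Any], exported: dict[str, Any]) -> dict[str, Any]:
    """Split exported settings into drifted, implicit, and missing ones."""
    # Declared in the repo, but production holds a different value.
    drifted = {
        key: (declared[key], value)
        for key, value in exported.items()
        if key in declared and declared[key] != value
    }
    # Present in production but never declared: set by a module or a default.
    implicit = {key: value for key, value in exported.items() if key not in declared}
    # Declared in the repo but absent from the export.
    missing = sorted(key for key in declared if key not in exported)
    return {"drifted": drifted, "implicit": implicit, "missing": missing}
```

For the sample data, `classify_settings(DECLARED, EXPORTED)` puts the bucket versioning under `drifted` and the public endpoint setting under `implicit`. The `implicit` group is the interesting one here: those are the settings a module or provider default chose for you, and the candidates for setting explicitly in your Terraform file.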
The use of third-party modules in Terraform or other IaC solutions can cause unintended configuration drift. So that you don’t have to sacrifice the great promise of having all configurations neatly as code in your repo, CoGuard offers the ability to snapshot, export, and scan your production cloud configurations. You can keep using third-party modules and still ensure that you are not hit by unintended side-effects.
Sign up for CoGuard and install the CoGuard CLI to create a Terraform file containing your cloud configuration settings (this assumes you have the CLI from your cloud provider installed locally).