Kafka, Storm, and similar services are at the core of modern data streaming and live processing infrastructure. Whether it is social media feeds or logs from your application, the data you collect and the way you process it require protection. By default, none of these services enforces authentication or encryption in transit. This is where the main issue lies: if someone manages to reach one of the nodes in your infrastructure, valuable information can be leaked.
Did you know that today there are no automated tools for checking the configuration of Kafka or Storm clusters? Companies have to trust that their IT professionals read the manuals and apply what they say. Do you think they have time for this?
CoGuard helps change this! CoGuard scans the configuration files of these services. In addition, any interdependencies (like ZooKeeper for older Kafka versions) are taken into account. With CoGuard, you ensure not only security, but also that scalability and performance are not affected by bad configurations.
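To make the Kafka case concrete, here is a minimal sketch of what an automated check could look like. It is not CoGuard's implementation. It reads a broker `server.properties` file and flags listeners whose security protocol sends data unencrypted. The property names `listeners` and `listener.security.protocol.map` are Kafka's own; the function names are hypothetical, and the parser assumes simple `key=value` lines.

```python
# Illustrative sketch only: flag Kafka listeners without encryption in transit.
# Function names are hypothetical; the property names are real Kafka broker settings.

DEFAULT_PROTOCOL_MAP = {
    p: p for p in ("PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL")
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_unencrypted_listeners(props):
    """Return (listener_name, protocol) pairs that do not use TLS."""
    raw_map = props.get("listener.security.protocol.map")
    if raw_map is None:
        protocol_map = dict(DEFAULT_PROTOCOL_MAP)
    else:
        protocol_map = {}
        for entry in raw_map.split(","):
            if ":" in entry:
                name, protocol = entry.split(":", 1)
                protocol_map[name.strip()] = protocol.strip()

    findings = []
    # If "listeners" is unset, the broker falls back to a PLAINTEXT listener.
    listeners = props.get("listeners", "PLAINTEXT://:9092")
    for listener in listeners.split(","):
        name = listener.split("://", 1)[0].strip()
        # Assumption for this sketch: an unmapped name is treated as its own protocol.
        protocol = protocol_map.get(name, name)
        if protocol in ("PLAINTEXT", "SASL_PLAINTEXT"):
            findings.append((name, protocol))
    return findings
```

Note that `PLAINTEXT` means neither authentication nor encryption, while `SASL_PLAINTEXT` authenticates clients but still sends data in the clear. Both are reported here.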
Today, not enough is done.
Some databases are covered by security benchmarks, others are not. Managers often have no clear view of how these services are configured and what the access controls are.
With CoGuard, you can now scan all configuration files and access controls. How the databases connect to the network is essential information, and every user and application should be given only the least amount of access necessary to perform its tasks.
Databases are the most fundamental services in every organization. They can be classical relational databases (MySQL, Postgres, Oracle), NoSQL databases (MongoDB) or streaming databases (like KSQL). No matter what you use, databases are where your customer data lies, and maybe even part of your intellectual property. They are the most attacked service type, by far, and they need to be protected at all costs. All databases ship with dangerous default configurations, which ease deployment but are not recommended for a production environment. How do you know you dotted all the i’s and crossed all the t’s there? And, if you think about your infrastructure: How do you know that one of your applications is not accidentally allowing full access to the otherwise hopefully isolated and locked database?
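As one example of such a check, the sketch below audits a PostgreSQL `pg_hba.conf` file for two risky patterns: the `trust` method (no password required) and rules that accept connections from any address. This is an illustration under stated assumptions, not CoGuard's rule set. The function name is hypothetical, and the parser assumes addresses are written in CIDR notation or as keywords, not as a separate IP and netmask pair.

```python
# Illustrative sketch only: flag risky rules in a PostgreSQL pg_hba.conf.
# Assumes "TYPE DATABASE USER [ADDRESS] METHOD" with CIDR or keyword addresses.

def audit_pg_hba(text):
    """Return (line_number, message) pairs for risky authentication rules."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "local":
            # Unix-socket rules have no ADDRESS column.
            if len(fields) < 4:
                continue
            address, method = None, fields[3]
        else:
            if len(fields) < 5:
                continue
            address, method = fields[3], fields[4]

        if method == "trust":
            findings.append((number, "trust method: no password required"))
        if address in ("0.0.0.0/0", "::/0", "all") and method != "reject":
            findings.append((number, "rule accepts connections from any address"))
    return findings
```

A rule such as `host all all 0.0.0.0/0 trust` triggers both findings, which is exactly the kind of leftover from a quick test deployment that should never reach production.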
Text Search Engines
Collecting and monitoring your application logs is fundamental to every compliance framework out there. The monitoring and analysis are usually accomplished by large text search engines like Elasticsearch or Apache Solr. These engines contain valuable information, and are sometimes even used like databases. Elasticsearch has been in the news a lot in connection with data breaches, precisely because it did not enforce authentication by default. The setup can also be tricky: Does your cluster use all the good features like fail-overs, proper indexing and encryption in transit? Many tweaks are possible in the configuration that make these engines more secure and faster.
What is done today? The correct configuration of Elasticsearch is such a widely searched question that it costs more than 10 dollars per click in online advertisements… There is a need, but no supply…
With CoGuard, you can change this. Because these engines are distributed, they constantly have sensitive data in transit. All of it needs to be secured, and proper scaling parameters need to be set so that the cluster can grow as the data grows. As with databases, least-privilege access should be enforced. CoGuard ensures this is done.
Read examples of past leaks online HERE.
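A minimal sketch of such a check for Elasticsearch follows. It reports security settings that are not explicitly set to `true` in `elasticsearch.yml`. The three `xpack.security.*` keys are real Elasticsearch settings; the function names are hypothetical, and the parser assumes the file uses flat dotted keys (one `key: value` per line) instead of nested YAML. Whether a missing key is actually a problem depends on the Elasticsearch version, so treat the result as a list of items to review.

```python
# Illustrative sketch only: report Elasticsearch security settings that are
# not explicitly enabled. Assumes flat "dotted.key: value" lines, not nested YAML.

REQUIRED_TRUE = (
    "xpack.security.enabled",
    "xpack.security.transport.ssl.enabled",
    "xpack.security.http.ssl.enabled",
)


def parse_flat_yaml(text):
    """Parse flat key: value lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def audit_elasticsearch(settings):
    """Return the required settings that are missing or not set to true."""
    return [
        key for key in REQUIRED_TRUE
        if settings.get(key, "").lower() != "true"
    ]
```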
What is done today? The book on Hadoop Security (online HERE) has 340 pages… Do you need to know more?
What really should be done? All configurations should be checked by a non-human eye. Period. This beast is far too complex.
Distributed File Systems
Computers break. Did you ever lose files because one computer failed and backups were not a thing? What if the amount of data you are handling is so large that backups alone are no longer feasible? This is where distributed file systems like Hadoop shine. Setting up Hadoop is easy; securing it is not. Plus, there are more than five different ways to set up a Hadoop cluster, and each of them may add nuances to securing it. Furthermore, one misconfiguration is enough to put you back at the original problem: one of your failed computer nodes was the only node holding the data you tried so hard to protect.
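The "one misconfiguration" scenario can be illustrated with the HDFS replication factor. The sketch below reads an `hdfs-site.xml` file and warns when `dfs.replication` is below 2, which means each block is stored on a single DataNode. `dfs.replication` is a real HDFS property with a default of 3; the function names are hypothetical and this is only one of many checks a Hadoop cluster needs.

```python
# Illustrative sketch only: warn when HDFS stores each block on a single node.
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Collect name/value pairs from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def replication_findings(props, default_replication=3):
    """Return a warning if the replication factor offers no redundancy."""
    replication = int(props.get("dfs.replication", default_replication))
    if replication < 2:
        return [
            f"dfs.replication={replication}: "
            "each block lives on a single DataNode"
        ]
    return []
```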
Infrastructure as Code (IaC) Tools
Cloud and cloud-like infrastructures are the future. Having resources at our fingertips and creating reproducible environments has been a major driver of innovation and productivity in recent years. The good thing about it is: Everyone can obtain and manage compute resources. The bad thing about it is: Everyone can obtain and manage compute resources. People bypass IT professionals, and with that, best practices are bypassed. The simplest problem happens regularly: Someone exposes an internal service online, and outsiders can then gain access to it. With infrastructure as code (Kubernetes, Terraform, CloudFormation, Ansible, Chef, Puppet,…), managing infrastructure has become something that can be encoded, versioned, and validated pre-deployment.
Today, a large number of different cloud monitoring solutions exist. These are mainly post-deployment, and are either agent-based or analyze the state using APIs. For pre-deployment, there are checkers for the surface level of infrastructure as code files. These checkers do simple sanity checks against commonly known best practices on individual services.
The simple checks for parameters are a good starting point, but there is more to the story: Every piece of infrastructure that is deployed may also have different services installed inside, and these are configured as well. These configuration files may point to other configurations in the infrastructure, and the connection is not evident. The ability to define policies that also span the interconnections between services is what makes CoGuard unique. In this way, breach paths can be detected pre-deployment.
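To show what a policy spanning two layers might look like, here is a deliberately simplified sketch. It is not how CoGuard models infrastructure; the `Service` and `Host` classes and the `find_breach_paths` function are hypothetical. The idea is to join information from the IaC definition (which ports are open to the internet) with information from the service configuration (whether authentication is enabled), since neither file alone reveals the problem.

```python
# Illustrative sketch only: a policy that combines IaC-level and service-level facts.
# The data model below is hypothetical and far simpler than a real deployment.
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    port: int
    auth_enabled: bool  # taken from the service's own configuration file


@dataclass
class Host:
    name: str
    open_ports: set  # ports reachable from the internet, per the IaC definition
    services: list = field(default_factory=list)


def find_breach_paths(hosts):
    """Return (host, service, port) for unauthenticated services exposed online."""
    paths = []
    for host in hosts:
        for service in host.services:
            if service.port in host.open_ports and not service.auth_enabled:
                paths.append((host.name, service.name, service.port))
    return paths
```

An open port is harmless if the service behind it requires authentication, and a service without authentication is tolerable if nothing can reach it. Only the combination is a finding, which is why the check needs both sources.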
Often disregarded as just the intermediary between content and browser, web servers can, if misconfigured, do great harm. Configuring them is an art in itself, with many ways to do things wrong. The most prominent web server based breach is Capital One in 2019, where a misconfigured web application firewall in the Apache web server allowed access to sensitive data.
What is done today? There are benchmarks for most web servers, and people manually check the configurations. HOWEVER: Web server configuration files are almost like full programming languages. It does not suffice to just check whether certain parameters are set. One needs to look at the different contexts in which they apply.
Instead, you really should be doing this. Web servers are your front door and your first line of defence. Have your configuration checked, and track whether new configuration recommendations are published. CoGuard can help you stay on top of this, and allow only minimal access through the page.
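The point about contexts can be sketched with an NGINX-style configuration, where a directive set in an inner block overrides the value inherited from an outer one. The example below is a minimal illustration, not a full parser: it ignores quoted strings and includes, and the function names are hypothetical. It reports every context in which `autoindex` (directory listing) is switched on, even if it is switched off elsewhere.

```python
# Illustrative sketch only: locate a directive in every context of an
# NGINX-style config. Does not handle quoted strings or include files.
import re


def iter_directives(config_text):
    """Yield (context, name, args) for each simple directive."""
    text = re.sub(r"#[^\n]*", "", config_text)  # drop comments
    tokens = re.findall(r"[{};]|[^\s{};]+", text)
    context = []  # stack of enclosing block headers
    current = []  # words of the directive being read
    for token in tokens:
        if token == "{":
            context.append(" ".join(current))
            current = []
        elif token == "}":
            context.pop()
            current = []
        elif token == ";":
            if current:
                yield tuple(context), current[0], current[1:]
            current = []
        else:
            current.append(token)


def find_autoindex_on(config_text):
    """Return the contexts in which directory listing is enabled."""
    return [
        ctx for ctx, name, args in iter_directives(config_text)
        if name == "autoindex" and args == ["on"]
    ]
```

A check that only searches the file for `autoindex off` would pass a configuration that sets it to `off` at the `http` level and back to `on` inside a single `location` block. Tracking the context stack is what catches that case.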