Demystifying MDR buzzwords
The term Managed Detection and Response (MDR) was coined by the global research and advisory firm Gartner back in 2016. At its core, MDR is a managed service that involves a combination of network and endpoint monitoring technologies for intrusion detection of known and unknown cyber-threats, as well as the human evaluation of data, to provide more accurate attack context and remediation support to stop identified attacks in near real-time.
Yet, with today's wide variety of managed service offerings, it can be challenging to understand the value of MDR. Why do organizations need MDR and what components make up a quality solution? Where does Endpoint Detection and Response (EDR) fit in, and what about XDR? How can you make sense of all of the MDR buzzwords that accompany any popular product? Lastly, how can you draw the most value from an MDR solution?
Gain context on common MDR terms
The MDR market is evolving, growing, and ultimately becoming more crowded. Vendors are trying to differentiate themselves by introducing additional features and using buzzwords to get your attention. While these buzzwords may check the box in your MDR solution wishlist, marketers sometimes use them in the incorrect context, making it unclear what a vendor is truly offering.
In the following sections, we break down some of the commonly misunderstood buzzwords around MDR in order to clarify the components of typical offerings. Understanding the meaning behind marketing speak will enable you to truly make an apples-to-apples comparison between vendors.
Artificial Intelligence (AI) & Machine Learning (ML)
Two terms that cyber security marketers have really latched onto are artificial intelligence (AI) and machine learning (ML), a subset of AI. It has become increasingly rare to pick up a piece of marketing literature for cyber products where these terms aren't front and center. They are even starting to appear in RFPs and are used to disqualify companies that do not employ such techniques. That said, it's important to quickly touch on the general relationship of the terms.
Artificial Intelligence (AI)
At a high level, AI is the ability to recognize data relationships and take an action, usually for a specific task. It's typically broken into Weak AI and Strong AI.
Weak AI, also called Narrow AI, is what you find in most cyber products. It relies on detection algorithms and predictive responses to "simulate" intelligence. Strong AI is much more complex. It endeavors to determine context and mimic reasoning as part of the decision-making process, so it more closely resembles the workings of the human brain. Since most cyber monitoring is less reliant on emotions, you don't normally see strong AI in cyber products.
Machine Learning (ML)
ML involves the ability to compare data from a known model with varied sets of data from an environment over varying timeframes to better learn and compare system behavior and take actions based on changes. By rapidly processing large volumes of disparate data, ML can more readily find patterns that may be lost during human review.
Baselining is an important element of this process, as it helps with understanding anomalous activity or automating a response. One point is especially important with ML: the data scientists building the model must clearly understand what good and bad data look like.
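As a rough illustration of baselining, the sketch below learns a simple statistical baseline from historical values and flags readings that fall far outside it. It is a minimal example under stated assumptions, not how any particular vendor builds its models: the function names, the sample login counts, and the three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the baseline: anything different is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical daily login counts for one account during a normal week.
history = [40, 42, 38, 41, 39, 43, 40]
baseline = build_baseline(history)

print(is_anomalous(41, baseline))   # False
print(is_anomalous(120, baseline))  # True
```

The quality of the result depends entirely on the history used to build the baseline, which is why knowing what good and bad data look like matters so much.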
There are numerous examples of AI and ML in cyber products. Some of these include:
- Determining if an email is legitimate or a phishing message.
- Looking at the behavior of a file as it runs to determine if it is malicious.
- Using previous attack and response data to recommend solutions to an analyst during real-time review.
Assessing vendors for AI + ML capabilities
With those definitions in mind, there are three points to consider about the AI or ML a potential vendor will use to deliver their services. These points have a large impact on how one vendor differentiates themselves from another vendor making similar claims.
1) AI/ML "Out of the box"
The first question to ask a vendor is whether the AI or ML is embedded in third-party tools or in the processes of the vendor. This is important because the algorithms used in ML are not static. You will want to make sure the vendor has some control over the algorithms to get the most value from the offering and the maximum possible customization for your environment and/or the latest threats. If AI or ML is part of their processes, they will need to have a data engineering and/or data science team that works on the models over time; you should review the size and expertise of this team. Just because you have tools doesn’t mean you’ll be effective at using them.
2) AI/ML applied to efficiency gains
The next point of focus is where AI or ML is used and what the related security improvements are. One main area of focus is improved tool efficiency; false positive reduction is a good example of this application category. You would expect that most companies are concerned with overall system performance and improvement over time. This is the most common example vendors may reference to check the box of ML use. Are you seeing cost savings over time from this application of machine learning, or is there some other benefit this application affords you as a client?
3) AI/ML applied to better security/detection/response
Effective application of AI or ML should result in improved security detection. The vendor should be able to give you specifics about new security detections they are working on. Going back to the required baselining, the vendors that are most effective with making security detection enhancements have a ready source of threat intelligence to clearly know the model of the latest attacks and/or vulnerabilities. There needs to be a separate team of threat researchers, malware reverse engineers, incident responders, red teamers and vulnerability hunters that feed the data science team with new problems to solve in order to move beyond standard detections.
Use cases versus MITRE ATT&CK mapping
It’s impossible to talk about Security Information and Event Management (SIEM) tools, SIEM as-a-service, or Endpoint Detection and Response without talking about Use Cases and MITRE ATT&CK. Many SIEM tools will talk about the wide range of supported use cases, but what exactly does this mean? At its core, a use case is a set of rules: when you see this code or set of codes, send this type of alert message. The broader meaning is sometimes tied to business cases, such as an “impossible travel” scenario where a user logs into the system from Europe and 10 minutes later logs in from the United States. The challenge that use cases present is that they typically lack context and enrichment. While a use case may show you a certain alert, the alerts themselves often require additional investigation to really understand the who, where, why, how and what of the situation. This is what earned many Managed Security Service Providers (MSSPs) a reputation for being a bell-ringing service. The alerts by themselves don’t provide the full picture and leave the receiver with a lot of work to do in figuring out what actually happened and how to respond.
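To make the idea of a rule-based use case concrete, here is a minimal sketch of an “impossible travel” check. It assumes each login record already carries a timestamp and an approximate latitude and longitude (for example, from IP geolocation). The record layout, the function names, and the 900 km/h speed threshold are illustrative assumptions, not part of any specific SIEM product.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 900.0  # assumed threshold, roughly airliner cruising speed


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def impossible_travel(first, second):
    """Return True if the implied travel speed between two logins is too high."""
    hours = abs((second["time"] - first["time"]).total_seconds()) / 3600
    km = distance_km(first["lat"], first["lon"], second["lat"], second["lon"])
    if hours == 0:
        # Simultaneous logins from two different places cannot be one person.
        return km > 0
    return km / hours > MAX_SPEED_KMH


# Hypothetical logins for one user, ten minutes apart.
paris = {"time": datetime(2024, 1, 1, 9, 0), "lat": 48.86, "lon": 2.35}
new_york = {"time": datetime(2024, 1, 1, 9, 10), "lat": 40.71, "lon": -74.01}

print(impossible_travel(paris, new_york))  # True
```

Note that the rule only answers whether the two logins are physically plausible. It says nothing about who logged in, whether a VPN was involved, or what happened afterward, which is exactly the missing context described above.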
To compensate for the lack of context, many tool providers started mapping their alerts to the MITRE ATT&CK framework, a collection of tactics and techniques known to compromise systems. Tactics capture the classic kill chain concept of attack sequencing and contain techniques, which describe the multitude of ways that the tactics can be implemented. The mapping gives significantly more context to the nature of the attack. By showing where the abnormalities you’re looking at fit into the sequence of a possible larger attack, the framework can help shed light on where the threat actor may have gained access. It can also help you find the threat group behind the attack (if the group is regularly tracked) and where you should look next based on previous investigations of that threat group’s tactics. This can significantly reduce the time required to confirm attacker presence and activity beyond the initial alert.
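The mapping described above can be sketched as a simple lookup that attaches a technique and tactic to each alert, then orders the alerts by where they fall in the attack sequence. The technique IDs and tactic names are taken from the public ATT&CK matrix, but the rule names, the host name, and the shortened tactic list are hypothetical and chosen only for illustration.

```python
# Hypothetical detection rule names mapped to ATT&CK (ID, technique, tactic).
RULE_TO_ATTACK = {
    "suspicious_email_link": ("T1566", "Phishing", "Initial Access"),
    "powershell_encoded_command": (
        "T1059", "Command and Scripting Interpreter", "Execution"),
    "repeated_failed_logins": ("T1110", "Brute Force", "Credential Access"),
    "large_upload_to_c2": (
        "T1041", "Exfiltration Over C2 Channel", "Exfiltration"),
}

# Simplified subset of ATT&CK tactics, in rough attack order.
TACTIC_ORDER = ["Initial Access", "Execution", "Credential Access", "Exfiltration"]


def enrich(alert):
    """Return a copy of the alert with ATT&CK context added, if a mapping exists."""
    enriched = dict(alert)
    mapping = RULE_TO_ATTACK.get(alert["rule"])
    if mapping:
        technique_id, technique, tactic = mapping
        enriched.update(technique_id=technique_id, technique=technique, tactic=tactic)
    return enriched


def attack_timeline(alerts):
    """Order the mapped alerts by the position of their tactic in the sequence."""
    mapped = [a for a in map(enrich, alerts) if "tactic" in a]
    return sorted(mapped, key=lambda a: TACTIC_ORDER.index(a["tactic"]))


alerts = [
    {"rule": "large_upload_to_c2", "host": "ws-17"},
    {"rule": "suspicious_email_link", "host": "ws-17"},
    {"rule": "unmapped_rule", "host": "ws-17"},
]
timeline = attack_timeline(alerts)
print([a["tactic"] for a in timeline])  # ['Initial Access', 'Exfiltration']
```

In this example the gap between Initial Access and Exfiltration is itself a clue: an analyst would look for Execution or Credential Access activity on the same host that no rule caught.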
Threat hunting
The next term used in marketing materials that carries multiple meanings is threat hunting. We’ve broken threat hunting down into its three main forms: reactive threat hunting, hypothesis-based threat hunting, and vulnerability threat hunting.
Reactive threat hunting
Considering that new threats are coming out on a daily basis, it is not practical to think your prevention and detection technologies will be able to stop or identify every attack. It is inevitable that threat actors will compromise a system at some point. When this happens, good MDR or EDR companies will investigate attack artifacts to determine the indicators of compromise (IoCs), which are typically threat actor command and control servers or connections to known malicious sites. As the data team identifies new IPs and domains, threat hunters search through logs to identify communication with these systems that might show data exfiltration or general system compromise.
This type of information typically comes from the MDR provider’s Incident Response (IR) practice that performs digital forensics and/or malware reverse engineering. This information helps a threat hunter understand the stages of an attack, identify behavior markers, and build those markers into new detection filters and use cases. Threat hunters can then take your log data and pass it back through the detection engine to determine whether you were a victim of a previously unknown type of attack. To do this in-house would require an even greater investment in highly trained staff (assuming you could even find them).
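A minimal sketch of this retrospective search follows. It assumes log records have already been parsed into dictionaries with a destination field. The field names and the helper function are hypothetical, and the indicators use reserved documentation addresses and an example domain rather than real threat data.

```python
def hunt_for_iocs(log_records, iocs):
    """Return log records whose destination matches a known indicator."""
    indicators = {i.lower() for i in iocs}
    return [r for r in log_records if r["dest"].lower() in indicators]


# Newly identified indicators (placeholder values, not real threat data).
new_iocs = {"203.0.113.50", "bad-domain.example"}

# Hypothetical historical connection logs, collected before the IoCs were known.
historical_logs = [
    {"host": "ws-01", "dest": "198.51.100.7"},
    {"host": "ws-02", "dest": "203.0.113.50"},
    {"host": "ws-03", "dest": "Bad-Domain.example"},
]

hits = hunt_for_iocs(historical_logs, new_iocs)
print([r["host"] for r in hits])  # ['ws-02', 'ws-03']
```

The value of the hunt comes from the indicator list rather than the search itself, which is why the source and freshness of the provider's threat intelligence matter.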
A major benefit of using a third party to manage and monitor your EDR tool is that reactive threat hunting IoCs do not solely come from your environment or your EDR tool vendor. What you learn from your network is small compared to the visibility a third party can collect while monitoring hundreds of networks. Additionally, security service providers often subscribe to more advanced threat feed services and have dedicated staff to work with global law enforcement and security organizations, which enables them to keep their finger on the pulse of the latest attacks and related detections. This is not something you can easily do on your own without significant investment in staff time, expertise, and service subscriptions (assuming you are even invited to participate).
Hypothesis-based threat hunting
One of the biggest challenges of log monitoring in general is the massive quantities of data that SOC analysts and threat hunters need to sort and process. This is where the second form of threat hunting, hypothesis-based threat hunting, comes into play. This form of threat hunting helps answer questions on when and where to investigate for unknown attacks.
As discussed in the machine learning (ML) overview, data scientists create models to identify anomalous and potentially malicious patterns in log data. These models point to systems that may require a deeper inspection but that don’t lend themselves to typical detection countermeasures, usually because of high false positive rates. While ML is often driven by threat intelligence and research teams, strong SOC analysts also play a role in noticing new patterns or linking two or more behaviors together and hunting in those scenarios. Once such systems are identified, threat hunters can bring additional inspection tools to collect new artifacts for use in the reactive hunting process. As with malware reverse engineers, finding and retaining these types of data scientists is very difficult and expensive.
Organizations that understand how attackers operate can help data scientists ask better questions to enhance the results generated in the models. Vendors with a history of red team and traditional penetration testing can especially have an upper hand. Part of hypothesis-based hunting also involves going through logs with a higher rate of false positives to ensure there are no other IoCs or unusual behavior that might necessitate more formal investigations. This is often the most time-consuming and difficult work analysts do.
Vulnerability threat hunting
Vulnerability hunting, the third form of hunting, involves your vulnerability management team. On a daily basis, software and hardware vendors disclose new vulnerabilities and the related patches to correct the situation. Unfortunately, many organizations have a period of days, weeks, or even months between scans. Leveraging data from previous device scans, continuous monitoring shows when existing devices have new vulnerabilities without having to constantly run intrusive device scans. Keeping up with new vulnerabilities on a daily basis allows your MDR vendor’s team to predict where new attacks may occur. Similar to hypothesis-based hunting, this provides a place to start looking for potentially unknown attacks.
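The idea of reusing earlier scan data can be sketched as a match between a stored software inventory and newly published advisories. Everything in the example is hypothetical: the host names, product names, advisory ID, and the simple dotted-number version comparison are assumptions made for illustration, and real advisories often describe affected versions in more complex ways.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def newly_exposed(inventory, advisories):
    """Match stored scan inventory against new advisories without rescanning."""
    findings = []
    for host, software in inventory.items():
        for advisory in advisories:
            installed = software.get(advisory["product"])
            if installed and parse_version(installed) < parse_version(advisory["fixed_in"]):
                findings.append((host, advisory["id"]))
    return sorted(findings)


# Hypothetical inventory captured during the last scheduled scan.
inventory = {
    "web-01": {"examplehttpd": "2.4.1"},
    "db-01": {"examplehttpd": "2.4.9", "exampledb": "10.2"},
}

# Hypothetical advisory published after that scan.
advisories = [{"id": "EXAMPLE-0001", "product": "examplehttpd", "fixed_in": "2.4.5"}]

print(newly_exposed(inventory, advisories))  # [('web-01', 'EXAMPLE-0001')]
```

The resulting list gives threat hunters a short set of hosts to examine first, well before the next scheduled scan would have surfaced the exposure.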
Response
Another term that has taken on multiple meanings is the Response portion of MDR. This term is confusing to many, since the first thing one might think of when they hear “response” is Incident Response, or IR. This includes digital forensics of full disk images and/or memory to determine the root cause of an attack. This big “R” often involves a chain of custody for use in legal cases. This is a very formal process and often takes as much as a week to analyze and write up a report for a single system. If you need legal support to file an insurance claim or for use in court, you’ll want to make sure your MDR provider has these services in house. In addition to ensuring a smooth transition between teams to help you get back to operations faster, the IR team can be a tremendous source of threat intelligence around the latest successful attacks. The challenge is that many smaller, regional MDR providers are not able to find or pay these consultants full time, which could lead to support challenges for your environment at the worst time.
As small organizations started using MSSPs, another type of “response” started to be a common request. This little “r” response was to fix the problem in the moment where damage mitigation was more important than root cause analysis. For this little “r” type of response, an MSSP will need to have some kind of management access to the devices in question. Historically, an MSSP might also manage the firewall, so this would be a place they could address an attack. In another situation, they might control an intrusion prevention system (IPS), where they could write a blocking rule for the attack to stop the problem. If an MSSP doesn’t have management access to devices that can isolate an affected system or attack vector, you will only get an alert to address the attack and you will have to take action to stop the attack.
Endpoint detection and response tools typically allow the practitioner to isolate the device and prevent the attack from moving beyond that device. If you really want your provider to be able to stop an attack in its tracks, having a managed EDR tool as part of the solution will provide more response flexibility.
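The dependence of the little “r” response on management access can be summarized in a short decision sketch. The control names, alert fields, and action strings below are hypothetical placeholders. A real provider would call the management interfaces of the specific EDR, IPS, or firewall products involved.

```python
def choose_response(alert, managed_controls):
    """Pick a containment action based on the controls the provider manages."""
    if "edr" in managed_controls:
        # EDR lets the provider contain the affected device directly.
        return f"isolate host {alert['host']}"
    if "ips" in managed_controls:
        return f"add IPS blocking rule for {alert['source_ip']}"
    if "firewall" in managed_controls:
        return f"drop firewall traffic from {alert['source_ip']}"
    # Without management access, the provider can only notify the customer.
    return "alert only: customer must act"


alert = {"host": "ws-17", "source_ip": "203.0.113.50"}

print(choose_response(alert, {"edr", "firewall"}))  # isolate host ws-17
print(choose_response(alert, {"firewall"}))         # drop firewall traffic from 203.0.113.50
print(choose_response(alert, set()))                # alert only: customer must act
```

Preferring EDR isolation first is a design choice made for this sketch. In practice the right order depends on the attack and on what the service agreement allows the provider to do.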
Somewhat related to the little “r” response is plugging the hole that the threat actor may have slipped through to get into the network. Your MDR service provider may identify unpatched systems as the cause of the attack. While many companies would like to have that same provider apply the necessary patches, it is often less expensive to hire a local contractor or employee to take care of this situation. Many managed security service providers don’t want to take on the additional up-time liability, and standard security analysts command a higher rate than IT technicians that patch boxes and adjust configurations.
Managed
To wrap up this section on buzzwords and definitions, let’s take a look at the term “Managed”. As cyber security is an ever-evolving arena, you can expect your needs to change over time. Ideally, anyone delivering managed security solutions should be able to keep pace with changes in the external threat landscape while also aligning with your internal program needs as business expansion and contraction change your risk. Your current cyber program size and maturity will often drive your requirements.
Additionally, when you measure a vendor’s ability to provide value, your focus needs to be on the security impact the management provides. How does the company go beyond the basics and/or what you are capable of doing in-house today? If your goal is to find a partner that can grow with you and/or take your program to a higher level, here are some elements of management to consider when evaluating your needs and potential vendor capabilities.
What do I need managed?
In the MDR ecosystem, there are three main tool sets to consider, in order of complexity:
• Network devices, especially firewalls and Intrusion Prevention/Detection Systems (IPS/IDS)
• Endpoint detection tools
• SIEM tools
Firewall and IPS/IDS management gives the vendor the ability to block attacks at the network traffic layer, either by dropping traffic based on active attack data or by providing additional countermeasures beyond those the hardware vendor offers off-the-shelf for improved prevention and/or detection.
Endpoint detection tool management gives the vendor the ability to isolate a device during an attack. Similar to network devices, vendors can also deploy additional countermeasures based on their threat intelligence in these tools to enhance detection capabilities. They may further use the tool to set up investigations and threat hunts on your behalf, either proactively or in response to tool alerts.
SIEM tool management covers a much longer list of tasks and is normally where legacy MSSPs were brought in for help. It starts with identifying the logging requirements for a range of in-scope devices to make sure the SIEM is collecting the right information on an ongoing basis. Next is the tuning of the correlation logic to minimize false positives, which varies widely in approach based on the SIEM platform and the capabilities of the vendor.
Some questions to consider are:
• Beyond minimizing false positives, is alert enrichment offered, similar to what the section on mapping to the MITRE ATT&CK Framework described?
• Are there remediation recommendations for next steps?
• Is the provider capable of taking those extra steps as part of their offering?
As described in the Threat Hunting section, there may be other tools that enhance event context, such as threat intelligence feeds for insights on emerging threats or potential attack vectors, vulnerability data that exposes new risks, and managed deception technologies to confirm that threat actors bypassed your existing controls or that you have malicious insiders. What capabilities does the vendor have to manage or incorporate such tools?
50% of organizations will be using MDR services for threat monitoring, detection, and response functions that offer threat containment capabilities by 2025.
Conclusion: MDR buzzwords demystified
As you can tell, the buzzwords surrounding MDR are numerous, and making sense of them isn’t the easiest task. That’s really because each word incorporates several components that can be broken down into more detailed pieces. To get the most out of the Managed aspect of an MDR program, you need to confirm what is managed and how that drives security outcomes beyond what you could do with your existing team today. Is the vendor flexible in their alert prioritization and can they accommodate your unique alerting and reporting needs?
Artificial intelligence is the ability to recognize data relationships and take an action, usually for a specific task. Machine learning is a subset of AI. In security, machine learning reviews data sets over time to identify anomalous activity and automate responses.
There are three things to consider when reviewing vendor AI and ML capabilities:
• AI and ML embedded into third-party tools and/or processes often limit the impact the vendor can make with them.
• Vendors should have in-house data scientists to build and modify the threat models used by AI/ML.
• Advanced threat intel from a cyber research team helps data scientists build better models by more clearly identifying what is bad.
Use Cases are sets of rules, but the term can also refer to business use cases.
• To compensate for the lack of context, many tool providers started mapping their alerts to the MITRE ATT&CK framework, a collection of tactics and techniques known to compromise systems.
Threat Hunting also includes three components: reactive threat hunting, hypothesis-based threat hunting, and vulnerability threat hunting.
• Reactive hunting: Investigates attack artifacts to determine the indicators of compromise when a compromise does occur.
• Hypothesis-based hunting: Leverages machine learning to answer questions on when and where to investigate for unknown attacks.
• Vulnerability hunting: Predicts where attacks may occur by keeping up with new vulnerabilities on a daily basis.
Response includes the little “r”, the ability of your third-party support to take action during an attack to prevent more damage or data exfiltration, as well as the big “R” of digital forensics when necessary.
• Make sure your third-party support has in-house forensics capabilities to ensure a smooth transition when bringing in this team to save time and money.
This guide and series will seek to provide clarity around a fast-moving, cluttered managed security marketplace. The installments of this guide will cover:
Part 1: Common buzzwords associated with MDR and their true meaning
Part 2: Why Endpoint Detection and Response (EDR) can only take you so far
Part 3: Is your MSSP providing true MDR capability?
Part 4: In-house MDR creates more problems than it solves, and the case for outsourcing MDR.