AI and Cyber Security: New Vulnerabilities CISOs Must Address

With the rapid deployment of Artificial Intelligence (AI) and Large Language Models (LLM) across virtually every business sector and use case, CISOs are rightfully concerned. With any new technology comes new threats, and just as companies are developing, testing, and evolving AI capabilities, rest assured that threat actors are doing the same.

While we know AI inherently introduces risk, the exact threat vectors are not well established, and many are theoretical—we know it's possible, even if we haven't seen it in the wild yet. Because business use of AI is still so new, threat research and mitigation practices are still in nascent stages. Yet, every day, organizations roll out vulnerable systems just waiting to be exploited. All it takes is for someone to notice.

Given this uncertainty, CISOs need to be aware of potential threats and organizational impacts and prioritize building resilience before their AI utilization exceeds their risk tolerance—or worse, invites a breach.

How has generative AI affected security?

With AI, trust goes out the window.

Part of what makes AI different from any other technology when it comes to security is that it is fundamentally untrustworthy. For example, let's say you built a web app. You made it, you know how it works, and you control it—therefore, you can trust it.

Conversely, AI interacts with and operates across many data sets, applications, and users, both trusted and untrusted. It can be manipulated to wreak havoc even when built on the most trusted components.

Organizations are only now recognizing the risks, and we've observed a pattern of recurring vulnerabilities. Here are just a few we've seen:

Prompt inclusion: In most instances of AI, user prompts become part of the model—a fact many casual users are unaware of. Unless specifically excluded (like in the case of Microsoft Copilot), prompts are used to train the LLM, and users should assume this unless assured otherwise. That means if users upload proprietary data for analysis, it becomes irrevocably part of the data set and is essentially impossible to extract. This ties directly to the notion of "shadow AI," where employees are using third-party LLMs without knowledge or authorization by IT teams. This practice has led to serious data leaks, so it's best to exercise discretion even for models that claim to exclude prompts. Training users on this important fact is essential guidance.

Prompt injection variants: As the use cases for AI grow, so too do the variations on compromise. One example is what OWASP calls "indirect prompt injection," when an attacker hijacks the conversation between the AI and the user. Cross-user prompt injection is also becoming more common. In this variation, instead of telling the AI to do something malicious, bad actors will prompt the AI to do something malicious to another user's account—like delete it, for example. This is a massive vulnerability caused by a failure to properly isolate data submitted by potential threat actors from the data the AI reads from trusted users.

Data poisoning: LLMs operate by repeating patterns but can't discern good patterns from bad. Bad actors can poison data sets so that, with the right code or phrase, they can control the model and direct it to perform malicious actions—and your company may never know. That's what happened with Microsoft's Tay chatbot, which was poisoned by prompts, and Microsoft had to quickly shut it down.

Model extraction or inversion: In this attack, a bad actor can prompt the model to either extract your data, duplicate the functionality, or clone the model itself. That means if you train the model on anything sensitive, threat actors can steal that data, even if they don't have direct access to it. That's why models trained on proprietary data should never be exposed to external parties. While this attack is academic right now, it could easily be exploited without proper segmentation.

Data pollution: In a similar scenario, threat actors can take advantage of models that interact with live data from untrusted sources and intentionally pollute that data. This introduces a wide range of vulnerabilities, the least of which is inaccurate results. For instance, an LLM that scrapes Amazon product reviews can be polluted with falsified reviews, resulting in skewed analysis and output. In some cases, LLM agents exposed to malicious data can, in turn, become agents of that data. CISOs must be aware of and cautious about the data their LLMs get exposed to to prevent this.

Excessive agency: One of the most severe vulnerabilities is when a model is given access to privileged functions that a user shouldn't have access to. This allows users to manipulate the model to escalate their privileges and access privileged functions.

New threats, same security fundamentals.

Because AI is mostly open-source and widely available for anyone to learn, we should anticipate more of these attacks as AI adoption skyrockets. Unfortunately, it doesn't take sophisticated nation-state actors or someone with vast experience in deep technical exploitation chains.

The good news is that defending against AI model attacks requires essentially the same security fundamentals CISOs have been leveraging for years, with a few new twists. While there are some frameworks to guide developers—ISO 42001 standardizes how organizations should implement AI systems, and Europe has just introduced the AI Act—these aren't holistic or broadly applicable enough.

For companies figuring it out as they go, here are some best practices to consider.

1. Training

Educate employees about the risks of even casual AI use in the workplace. Unfortunately, you can never fully trust it, and they should approach every interaction with that assumption.

2. Prioritize security by design

Developers need to think carefully about who and what the model will be exposed to and how that can influence its behavior. Security should be part of the process from inception, not as an afterthought. Don't assume the model will always behave the way you expect.

3. Conduct threat modeling

Just as you would with any new introduction to your technology landscape, perform a threat analysis on AI tools. Identify what the model has access to, what it's exposed to, and how it's intended to interact with other components or applications. Understand risks, data flows, and the threat landscape, and then implement trust boundaries.

4. Consider multi-directional access

Because there are so many touchpoints, most organizations don't realize the full scope of the risk. While User A may not have ill intent, User B can manipulate User A's account to control the model (horizontal threat), or someone could manipulate the model to escalate functionality or privileges (vertical threat). Where two models intersect—say, a text-based model that interacts with an image generation model—a multi-modal risk opens up when channels are left unrestricted to leak data across one another to the end user.

5. Deploy data-code segmentation

Anytime you expose an ML model to untrusted data, the model becomes an agent of that data. The solution: segment models from the data using a "gatekeeper" approach that prevents the model from accessing untrusted data and trusted functions simultaneously.

Even if the fundamentals are the same, the timeline has accelerated for many CISOs. Rapid adoption calls for urgent solutions before things get completely out of hand.

David Brauchler III

Technical Director, NCC Group NA

David Brauchler III is an NCC Group Technical Director in Dallas, Texas. He is an adjunct professor for the Cyber Security graduate program at Southern Methodist University with a master's degree in Security Engineering and the Offensive Security Certified Professional (OSCP) certification

David Brauchler published Analyzing AI Application Threat Models on NCC Group's research blog, introducing new Models-As-Threat-Actors (MATA) methodology to the AI security industry, which provided a new trust flow centric approach to evaluating risk in AI/ML-integrated environments. David also released several new threat vector categories, AI/ML security controls, and recommendations to maximize the effectiveness of AI penetration tests.

Put AI security best practices into action

NCC Group has you covered. Our AI Security Assessments include a broad range of bias, toxicity, configuration, and implementation assessments that provide insight into your current AI security posture and recommendations for remediation.

The best practice is a holistic approach examining your model in the context of your system's entire architecture. We look for possible connections to unexpected data sources and conduct novel exploits against these multilayered applications to understand how things might go wrong from a realistic and impact-driven point of view.

Our experts have examined AI systems across a wide range of business sectors and applications, and we're on top of the latest exploits—including those that are still theoretical.

As a CISO, you might feel thrown into the deep end of this brand-new technology without the resources and expertise to shore up your defenses. An experienced team like ours could be the partner you need to help assess, understand, and mitigate your AI platform risk.

Start building cyber resilience into AI systems.

Contact us today to open a conversation with one of our experts or jump right in with our AI/ML security assessment.

Learn more

AI and Cyber Security:New Vulnerabilities CISOs Must Address

How has generative AI affected security?

New threats, same security fundamentals.

Put AI security best practices into action

Start building cyber resilience into AI systems.

AI and Cyber Security:
New Vulnerabilities CISOs Must Address