Machine learning (ML) now sits at the forefront of data security. But when technology advances quickly, security often becomes a secondary priority. That pattern is all the more apparent given the improvised nature of many ML deployments, where organizations have no defined strategy for using ML responsibly.
Opportunities for exploitation are expanding not only because of risks and vulnerabilities within ML models themselves, but also within the foundational infrastructure that sustains them. Many base models, and the datasets used to train them, are openly accessible to developers and adversaries alike.
Distinct hazards for ML models
Ruben Boonen, CNE Capability Development Lead at IBM, highlighted a crucial issue: “A key problem is hosting these models on vast open-source databases. The origins and modifications of these models are often unclear, leading to potential issues. For instance, consider loading a PyTorch model from one of these databases, only to discover undisclosed alterations. Identifying such changes can be particularly challenging since the model’s behavior may seem normal in the majority of cases.”
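Boonen’s point about undisclosed alterations is concrete: pickle-based checkpoints can execute arbitrary code the moment they are loaded. Below is a minimal defensive sketch, assuming a recent PyTorch release and a checksum published by a source you trust; the file name and digest are placeholders, not values from any real incident.

```python
import hashlib
import torch

# Placeholders for illustration; substitute your own artifact and trusted digest.
CHECKPOINT_PATH = "downloaded_model.pt"
TRUSTED_SHA256 = "<expected-sha256-from-publisher>"

def sha256(path: str) -> str:
    """Compute the SHA-256 digest of a file in streaming fashion."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# 1. Verify the downloaded artifact against a hash published out of band.
if sha256(CHECKPOINT_PATH) != TRUSTED_SHA256:
    raise RuntimeError("Checkpoint does not match the published hash; refusing to load.")

# 2. Load weights only, so pickled objects in the file cannot run arbitrary code.
#    (weights_only is available in recent PyTorch releases.)
state_dict = torch.load(CHECKPOINT_PATH, weights_only=True)
```

Neither step proves the model behaves as advertised, but together they block the most common tampering path: a silently modified checkpoint that executes code on load.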
A recent discovery unveiled numerous harmful files hosted on Hugging Face, one of the major repositories for open-source generative AI models and training datasets. Among them were roughly a hundred malicious models capable of injecting malicious code onto user devices. In one instance, cybercriminals posed as the genetic testing company 23andMe to distribute a compromised model that could harvest AWS credentials. The model was downloaded many times before it was reported and removed.
In another recent case, red team researchers identified weaknesses in ChatGPT’s API: a single HTTP request produced two unexpected responses, suggesting a potentially exploitable code path. This could result in information leakage, denial-of-service attacks and even privilege escalation. The team also uncovered vulnerabilities in ChatGPT plugins that could lead to unauthorized account access.
While open-source licensing and cloud solutions fuel innovation in the realm of AI, they also introduce risks. In addition to these AI-specific risk factors, general concerns regarding infrastructure security, such as cloud configuration vulnerabilities and inadequate monitoring, are also pertinent.
AI models: The emerging domain of intellectual property theft
Imagine devoting considerable resources to developing a proprietary AI model, only to have it stolen or reverse-engineered. Unfortunately, model theft is a looming issue because AI models often contain sensitive data and can expose an organization’s confidential information if breached.
One common route to model theft is model extraction, in which attackers exploit API access or vulnerabilities to query a model repeatedly and collect enough of its responses to reverse engineer it. Even black-box models, such as ChatGPT, can be targeted this way.
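To make the mechanics concrete, here is a minimal sketch of what extraction looks like from the attacker’s side, against a purely hypothetical prediction endpoint; the URL, request format and response shape are all assumptions. The takeaway is how little is needed beyond ordinary API access.

```python
import numpy as np
import requests
from sklearn.tree import DecisionTreeClassifier

# Hypothetical prediction endpoint; URL and response shape are assumptions.
API_URL = "https://example.com/v1/predict"

def query_model(x: np.ndarray) -> str:
    """Send one feature vector to the black-box API and return its predicted label."""
    resp = requests.post(API_URL, json={"features": x.tolist()}, timeout=10)
    resp.raise_for_status()
    return resp.json()["label"]

# The attacker only needs enough input/output pairs to fit a surrogate model.
rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(500, 8))       # 500 probe points, 8 features
labels = np.array([query_model(x) for x in inputs])   # 500 ordinary API calls

# The surrogate approximates the proprietary model's decision boundary,
# which is why rate limiting and query monitoring matter.
surrogate = DecisionTreeClassifier(max_depth=8).fit(inputs, labels)
```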
Typically, AI systems operate in cloud environments rather than on local servers. Cloud platforms offer the scalable data processing power needed to run AI models efficiently, but that accessibility widens the attack surface, giving adversaries room to exploit weaknesses such as misconfigured access permissions.
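A quick configuration check is often the cheapest defense. The sketch below, assuming AWS as the cloud provider and a hypothetical bucket holding model artifacts, uses boto3 to flag an S3 bucket whose public access block is missing or incomplete.

```python
import boto3
from botocore.exceptions import ClientError

# Hypothetical bucket name; assumes AWS credentials are already configured.
BUCKET = "ml-model-artifacts"

s3 = boto3.client("s3")
try:
    config = s3.get_public_access_block(Bucket=BUCKET)["PublicAccessBlockConfiguration"]
    # All four settings should be True for a bucket that stores model weights.
    open_settings = [name for name, enabled in config.items() if not enabled]
    if open_settings:
        print(f"{BUCKET}: public access not fully blocked: {open_settings}")
except ClientError as err:
    if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
        print(f"{BUCKET}: no public access block configured at all")
    else:
        raise
```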
Boonen noted, “When organizations deploy these models, they often have customer-facing services like AI chatbots. Unauthorized individuals might exploit APIs to access unreleased models.”
Securing AI models with red teams
Preventing model theft and reverse engineering necessitates a multi-faceted strategy that combines established security practices with offensive measures.
In this context, red teams play a vital role. These teams can systematically assess various aspects of AI model security, including:
- API breaches: By mimicking adversary queries against black-box models, red teams can pinpoint weaknesses like inadequate rate limits or insufficient response filtering (see the sketch after this list).
- Side-channel vulnerabilities: Red teams can conduct side-channel evaluations, monitoring metrics like CPU and memory usage to glean insights into model structure and parameters.
- Container and orchestration flaws: By scrutinizing containerized AI dependencies, red teams can identify orchestration vulnerabilities such as misconfigured permissions or unauthorized container access.
- Supply chain attacks: Red teams can probe entire AI supply chains to ensure the use of only trusted components across different environments.
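As an example of the first item, the sketch below probes a hypothetical inference endpoint with a burst of requests and counts HTTP 429 responses. The endpoint, payload and burst size are assumptions, and a test like this belongs inside an authorized engagement.

```python
import time
import requests

# Hypothetical inference endpoint and burst size; adjust to the engagement's rules.
API_URL = "https://example.com/v1/chat"
BURST = 200

def probe_rate_limits() -> None:
    """Fire a burst of identical requests and record how the API responds."""
    statuses = []
    start = time.monotonic()
    for _ in range(BURST):
        resp = requests.post(API_URL, json={"prompt": "ping"}, timeout=10)
        statuses.append(resp.status_code)
    elapsed = time.monotonic() - start

    throttled = statuses.count(429)
    print(f"{BURST} requests in {elapsed:.1f}s, {throttled} throttled (HTTP 429)")
    if throttled == 0:
        print("No rate limiting observed: extraction-style querying would be cheap.")

probe_rate_limits()
```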
A comprehensive red teaming strategy can replicate real-world attacks on AI infrastructure, exposing security gaps and aiding in devising effective incident response plans to prevent model theft.
Addressing excessive agency in AI systems
Most AI systems have some degree of autonomy in how they interface with other systems and process inputs. But an excess of autonomy, functions or permissions in an AI system, which OWASP calls “excessive agency,” can lead to harmful outcomes, security gaps or unpredictable behavior.
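One practical control is to make the set of functions an AI system may call explicit. Below is a minimal sketch, with hypothetical tool names and placeholder implementations, that refuses any model-requested call outside an allow-list.

```python
from typing import Any, Callable, Dict

# Hypothetical tool registry for an LLM-driven assistant: the model can only
# invoke functions that were explicitly granted, nothing else.
ALLOWED_TOOLS: Dict[str, Callable[..., Any]] = {
    "summarize_meeting": lambda transcript: transcript[:200] + "...",
    "list_calendar_events": lambda day: [],  # placeholder implementation
}

def invoke_tool(name: str, **kwargs: Any) -> Any:
    """Run a model-requested tool call only if it is on the explicit allow-list."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{name}' is not permitted for this assistant.")
    return ALLOWED_TOOLS[name](**kwargs)

# A model asking for 'delete_file' or 'send_email' is refused rather than executed.
invoke_tool("summarize_meeting", transcript="Quarterly planning call ...")
```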
Boonen cautioned that components like optical character recognition (OCR) in multimodal systems may introduce vulnerabilities if not properly secured.
Granting AI systems excessive agency needlessly broadens the attack surface, giving adversaries more entry points. Enterprise-class AI systems are typically integrated into broad environments spanning diverse infrastructures, data sources and APIs. Excessive agency arises when those integrations trade security for functionality.
Consider an AI-powered personal assistant with unrestricted access to an individual’s Microsoft Teams meeting recordings stored in OneDrive for Business. The intent may be convenient meeting summaries, but an unsecured plugin could also reach other confidential data stored in the same OneDrive account. And if the plugin has write access, a vulnerability could open the door to malicious uploads.
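A least-privilege check can catch this before deployment. The sketch below compares the Microsoft Graph permission scopes a plugin requests against the scopes it actually needs; the scope names are real Graph permissions, but which scopes this particular plugin requests is an assumption for illustration.

```python
# Scopes the (hypothetical) meeting-summary plugin asks for at registration time.
REQUESTED_SCOPES = {"Files.ReadWrite.All", "OnlineMeetings.Read"}

# Least-privilege policy: the plugin only needs to read recordings, not write
# anywhere in OneDrive.
ALLOWED_SCOPES = {"Files.Read", "OnlineMeetings.Read"}

excessive = REQUESTED_SCOPES - ALLOWED_SCOPES
if excessive:
    # Fires in this example because Files.ReadWrite.All exceeds the policy.
    raise PermissionError(
        f"Plugin requests more access than it needs: {sorted(excessive)}"
    )
```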
Red team exercises can help uncover weaknesses in AI integrations, particularly in environments with many plugins and APIs. Through simulated attacks and detailed analysis, red teams can identify flawed access permissions and unnecessary risk, shrinking the attack surface even where no outright vulnerability exists.