Human interaction is multimodal. We take in information through many channels, which lets our minds perceive the world from different angles and combine those diverse inputs into a unified understanding of reality.
Artificial intelligence (AI) has now reached a point where it can do something similar, at least to a degree. Like our brains, multimodal AI applications process different types, or modalities, of information. OpenAI’s GPT-4o, for instance, can analyze text, images, and audio, which improves its contextual understanding and enables more human-like interactions.
Yet while these applications deliver clear business value for productivity and adaptability, their inherent complexity also introduces unique risks.
According to Ruben Boonen, CNE Capability Development Lead at IBM: “Attacks on multimodal AI systems generally involve manipulating them to produce harmful outcomes in end-user applications or to evade content moderation mechanisms. Picture these systems in a high-stakes environment, such as a computer vision model in an autonomous vehicle. If you could trick a vehicle into thinking it should not stop when it should, the consequences could be disastrous.”
An example of multimodal AI risk in the financial sector
Consider the following scenario:
A financial firm uses a multimodal AI application to support its trading strategies, processing both textual and visual data. The system runs a sentiment analysis tool over text-based data such as earnings reports, analyst opinions, and news updates to gauge market sentiment about specific financial instruments. It then performs technical analysis on visual data, such as stock charts and trend graphs, to generate insights into stock performance.
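The fusion logic in such a system can take many forms. As a minimal sketch, assuming hypothetical sentiment and chart-trend scores in the range [-1, 1] and illustrative weights and thresholds, the decision might combine the two modalities like this:

```python
# Minimal sketch of multimodal signal fusion. The score inputs, weights,
# and thresholds are illustrative assumptions, not a real trading system.
def trade_signal(sentiment_score: float, chart_score: float,
                 w_text: float = 0.5, w_chart: float = 0.5) -> str:
    """Scores are in [-1, 1]; positive means bullish."""
    combined = w_text * sentiment_score + w_chart * chart_score
    if combined > 0.3:
        return "BUY"
    if combined < -0.3:
        return "SELL"
    return "HOLD"
```

Because the final decision trusts both inputs, corrupting either the text stream or the chart images can be enough to push the combined signal past a decision threshold.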
Now suppose a rogue hedge fund manager exploits vulnerabilities in the system to manipulate its trading decisions. The attacker first launches a data poisoning attack, flooding online news outlets with fabricated stories about certain markets and financial assets. They then mount an adversarial attack, making imperceptible pixel-level alterations, known as perturbations, to stock performance charts to fool the AI’s visual analysis.
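What such a perturbation looks like in practice depends on the model, but a minimal sketch, assuming a PyTorch image classifier and white-box access (the model handle, label index, and step size here are purely illustrative), is the classic fast gradient sign method:

```python
# Minimal FGSM-style perturbation sketch. "model" is an assumed PyTorch
# classifier over chart images; epsilon controls how visible the change is.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Nudge the image just enough to push the model away from the
    correct reading, while staying visually imperceptible."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image.unsqueeze(0))                     # shape [1, num_classes]
    loss = F.cross_entropy(logits, torch.tensor([true_label]))
    loss.backward()
    # Step in the direction that increases the loss (FGSM).
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```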
The result? Fed manipulated inputs and misleading signals, the system recommends buy orders at artificially inflated prices. Unaware of the manipulation, the firm follows the AI’s recommendations, while the attacker, who holds shares in the targeted assets, sells them for an illicit profit.
Predicting and thwarting adversaries’ moves
Now picture the same attack carried out not by a rogue hedge fund manager but simulated by a red team specialist seeking to uncover vulnerabilities before malicious actors can exploit them.
By simulating these intricate, multi-pronged attacks in secure, controlled environments, red teams can reveal weaknesses that traditional security measures are likely to miss. This proactive approach is crucial for hardening multimodal AI applications before they are deployed in production.
According to the IBM Institute for Business Value, 96% of executives say that adopting generative AI will increase the likelihood of a security breach in their organization within the next three years. The rapid spread of multimodal AI models will only compound this challenge, underscoring the growing importance of AI-focused red teaming. These specialists can get ahead of a threat unique to multimodal AI: cross-modal attacks.
Cross-modal attacks: Manipulating inputs to produce malicious results
A cross-modal attack involves injecting malicious data in one modality to produce harmful outcomes in another. These attacks can take the form of data poisoning during model creation and development, or adversarial attacks once the model is in production.
“When you run multimodal systems, they essentially take in input data, and a module will interpret that information. For example, when you upload a PDF or image, there will be an image-parsing or OCR library that extracts the data. But such libraries have had issues in the past,” says Boonen.
Cross-modal data poisoning attacks are arguably the most serious, since a significant compromise may require retraining the entire model on corrected datasets. Generative AI uses encoders to convert input data into embeddings: numerical representations that capture the relationships and meaning within the data. Multimodal systems use a separate encoder for each data type, such as text, image, audio, and video, plus multimodal encoders that align and combine the different data types.
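As a rough illustration of how separate encoders feed one shared space, here is a toy sketch in PyTorch; the architectures, dimensions, and inputs are placeholders rather than any production design:

```python
# Toy sketch: a text encoder and an image encoder projecting into a shared
# embedding space, as a multimodal system might. All sizes are illustrative.
import torch
import torch.nn as nn

EMBED_DIM = 256

class TextEncoder(nn.Module):
    def __init__(self, vocab_size=30000):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, EMBED_DIM)  # mean-pools token vectors

    def forward(self, token_ids):
        return self.embed(token_ids)

class ImageEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, EMBED_DIM))

    def forward(self, pixels):
        return self.net(pixels)

# Both modalities land in the same vector space, so they can be compared
# directly; that shared space is also why a poisoned image-caption pair is so damaging.
text_vec = TextEncoder()(torch.randint(0, 30000, (1, 12)))
image_vec = ImageEncoder()(torch.rand(1, 3, 224, 224))
similarity = torch.cosine_similarity(text_vec, image_vec)
```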
In a cross-modal data poisoning attack, a malicious actor with access to training data and systems could tamper with inputs so that the encoders produce corrupted embeddings. For example, they could deliberately attach wrong or misleading captions to images to mislead the encoder, producing unwanted outputs. In scenarios where accurate classification is critical, such as AI systems for medical diagnosis or autonomous vehicles, the consequences can be severe.
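A minimal sketch of how such a poisoned image-caption set might be constructed, assuming a hypothetical list of (image_path, caption) pairs; the class name, decoy caption, and poisoning rate are illustrative:

```python
# Sketch of caption poisoning. "clean_pairs" is an assumed list of
# (image_path, caption) tuples; everything here is illustrative.
import random

def poison_captions(clean_pairs, target_class, decoy_caption, rate=0.05):
    """Relabel a small fraction of images whose captions mention
    `target_class` so the encoder learns a false association."""
    poisoned = []
    for image_path, caption in clean_pairs:
        if target_class in caption and random.random() < rate:
            poisoned.append((image_path, decoy_caption))   # swapped caption
        else:
            poisoned.append((image_path, caption))
    return poisoned
```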
Red teaming is essential for simulating these scenarios before they have real-world consequences. “Imagine you have an image classifier in a multimodal AI application,” Boonen explains. “There are tools that let you generate images and have the classifier assign a score. Now suppose a red team targets that scoring mechanism to progressively skew it into misclassifying images. Since we don’t have precise insight into how classifiers identify elements within images, you keep modifying the image, for example by adding noise. Eventually, the classifier’s accuracy degrades.”
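In practice this kind of probe can be run without any knowledge of the model’s internals. Below is a minimal sketch under that assumption; score_image stands in for whatever endpoint returns the classifier’s confidence and is not a real API:

```python
# Black-box probe sketch: greedily keep random noise that lowers the
# classifier's confidence. "score_image" is an assumed scoring callable.
import numpy as np

def degrade_confidence(image, score_image, steps=200, sigma=0.02):
    """Iteratively add small noise, keeping changes that reduce the score,
    and stop once confidence drops below an (arbitrary) threshold."""
    best = image.copy()
    best_score = score_image(best)
    for _ in range(steps):
        candidate = np.clip(best + np.random.normal(0, sigma, best.shape), 0, 1)
        score = score_image(candidate)
        if score < best_score:          # noise that confuses the model is kept
            best, best_score = candidate, score
        if best_score < 0.5:
            break
    return best, best_score
```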
Weak points in real-time machine learning models
Many multimodal models include real-time machine learning capabilities, continuously learning from new data, as in the scenario above, which is an example of a cross-modal adversarial attack. In these cases, an adversary can flood a deployed AI application with manipulated data to trick it into misinterpreting inputs. The same degradation can also happen unintentionally, which is one reason some argue that generative AI is gradually becoming “less intelligent.”
Ultimately, models trained or retrained on faulty data deteriorate over time, a phenomenon known as AI model drift. Multimodal AI systems amplify the problem because of the greater risk of inconsistencies between data types. Red teaming is therefore essential for identifying vulnerabilities in how different modalities interact, during both training and inference.
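One simple defensive counterpart, sketched below under illustrative assumptions about the baseline accuracy, window size, and tolerance, is to track a rolling accuracy per modality and flag drift before it compounds:

```python
# Drift-check sketch: compare rolling accuracy against a baseline.
# Baseline, window, and tolerance values are illustrative assumptions.
from collections import deque

class DriftMonitor:
    def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.window = deque(maxlen=window)
        self.tolerance = tolerance

    def record(self, was_correct: bool) -> bool:
        """Record one prediction outcome; return True if accuracy has
        drifted below the allowed band over a full window."""
        self.window.append(1.0 if was_correct else 0.0)
        accuracy = sum(self.window) / len(self.window)
        return (len(self.window) == self.window.maxlen
                and accuracy < self.baseline - self.tolerance)
```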
Red teams can also pinpoint flaws in security controls and in how those controls are integrated across modalities. Different data types require different security protocols, yet these must be aligned to avoid gaps. Consider, for example, an authentication system that lets users verify their identity by voice or facial recognition. If the voice verification component lacks robust anti-spoofing measures, attackers are likely to target that weaker modality.
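The weakness is easy to see in an “any modality passes” policy. The sketch below is purely illustrative; the score inputs and thresholds are assumptions:

```python
# Sketch: an "either factor passes" policy inherits the security of the
# weakest modality. Scores and thresholds are hypothetical.
def authenticate(voice_score: float, face_score: float,
                 voice_threshold: float = 0.70,
                 face_threshold: float = 0.95) -> bool:
    # An attacker only needs to spoof the modality with the weakest
    # anti-spoofing checks (here, voice) to get in.
    return voice_score >= voice_threshold or face_score >= face_threshold

def authenticate_strict(voice_score: float, face_score: float,
                        voice_threshold: float = 0.90,
                        face_threshold: float = 0.95) -> bool:
    # Requiring both factors (or a step-up check) closes that gap.
    return voice_score >= voice_threshold and face_score >= face_threshold
```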
Multimodal AI used in surveillance and access control is also susceptible to data synchronization risks. Such a system might detect suspicious activity by correlating lip movements captured on video with spoken passphrases or names. If an adversary can manipulate the feeds and introduce a slight delay between the two, they could fool the system with pre-recorded video or audio and gain access illicitly.
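A defensive sketch of the synchronization check, with the timestamp source and the 150 ms skew budget as illustrative assumptions:

```python
# Sketch: reject an access attempt when the video and audio events are
# out of step. The 150 ms tolerance is an illustrative assumption.
def streams_in_sync(video_event_ts: float, audio_event_ts: float,
                    max_skew_seconds: float = 0.15) -> bool:
    """Compare the timestamp of a detected lip movement with the
    timestamp of the matching spoken phrase."""
    return abs(video_event_ts - audio_event_ts) <= max_skew_seconds
```

A replayed clip injected into one feed typically drifts beyond that skew budget, so the attempt can be flagged for review.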
Getting started with multimodal AI red teaming
Although attacks on multimodal AI applications are still in their infancy, it is wise to take a pre-emptive stance.
As advanced AI applications become integral to routine business operations and the very systems safeguarding them, red teaming not only instills confidence but also uncovers weaknesses that traditional reactive security measures are likely to overlook.
Multimodal AI applications open a new frontier for red teaming, requiring organizations to tap specialized expertise to identify vulnerabilities before adversaries exploit them.