With artificial intelligence (AI) gaining popularity across sectors and use cases, safeguarding against attacks on the AI software supply chain is now more vital than ever.

A recent investigation by SentinelOne uncovered a new ransomware actor called NullBulge. This actor targets software supply chains by weaponizing code in public repositories like Hugging Face and GitHub. The group, which poses as a hacktivist entity driven by an anti-AI agenda, specifically targets these resources to poison the data sets used in AI model training.

Whether you use mainstream AI solutions, integrate them into your existing technology stack through application programming interfaces (APIs), or build your own models on top of open-source foundation models, the entire AI software supply chain is now in the crosshairs of cyber attackers.

Contaminating community-driven data sets

Open-source components play a significant role in the AI supply chain. Only the largest corporations possess the volumes of data required to train a model from scratch, so most teams rely heavily on community-driven data sets like LAION-5B or Common Corpus. The sheer size of these data sets makes it extremely difficult to maintain data quality and comply with copyright and privacy regulations. By contrast, several popular generative AI models, such as ChatGPT, are trained on proprietary data sets, which pose their own security risks.

Proprietary models are often built by further training an open-source foundation model on a company's own data. For instance, a company developing an advanced customer service chatbot might fine-tune a model on its past customer interactions. Data of this kind has long been a target for cybercriminals, but the explosive growth of generative AI has made it even more enticing to malicious actors.

By tampering with these data sets, cybercriminals can poison them with false information or malicious content. Once compromised data enters the model training phase, the damage ripples across the entire AI software lifecycle. Training a large language model (LLM) is already an arduous task, demanding enormous time and computing power; it is a highly expensive and environmentally taxing process. If the data sets used in training turn out to be compromised, the entire process may have to start over.
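One practical safeguard is to verify a data set's integrity before any training run begins. The Python sketch below is a minimal illustration of that idea: it compares each file's SHA-256 digest against a trusted manifest so that tampered files are caught before they reach the training pipeline. The manifest contents and file names are hypothetical placeholders, not any real data set's published digests.

```python
import hashlib
from pathlib import Path

# Hypothetical manifest: file name -> SHA-256 digest published by the
# data set maintainer. The digests here are placeholders.
TRUSTED_MANIFEST = {
    "train_shard_000.parquet": "0" * 64,
    "train_shard_001.parquet": "0" * 64,
}

def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 digest, streaming to avoid loading it all into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(data_dir: Path) -> bool:
    """Return True only if every manifest entry exists and matches its trusted digest."""
    for name, expected in TRUSTED_MANIFEST.items():
        path = data_dir / name
        if not path.exists() or sha256_of(path) != expected:
            print(f"Integrity check failed: {name}")
            return False
    return True

if __name__ == "__main__":
    if not verify_dataset(Path("data/")):
        raise SystemExit("Refusing to train on an unverified data set.")
```

A check like this only detects tampering after the maintainer published the digests; it does nothing against poison that was present from the start, which is why data quality review remains essential.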


Emerging attack avenues

Most AI software supply chain attacks use backdoor methods like those described above. But that is not the only approach, and attacks targeting AI systems are growing more sophisticated. One emerging technique is the flood attack, in which attackers inundate an AI system with massive volumes of benign data to mask something else, such as a segment of malicious code.
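A simple, admittedly coarse, defense against flooding is to track submission volume per source and quarantine sudden bursts for review before the data is ingested. The rolling-window counter below is a minimal Python sketch; the window size, limit, and function names are illustrative assumptions, not a specific product's mechanism.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60    # illustrative rolling window
MAX_SUBMISSIONS = 100  # illustrative per-source limit

_submissions: dict[str, deque] = defaultdict(deque)

def allow_submission(source_id: str, now: float | None = None) -> bool:
    """Return False (hold for review) when a source exceeds the rolling-window limit."""
    now = time.monotonic() if now is None else now
    window = _submissions[source_id]
    # Drop timestamps that have aged out of the window
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_SUBMISSIONS:
        return False  # suspicious burst: quarantine this source's data for inspection
    window.append(now)
    return True
```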

We are also witnessing a rise in attacks on APIs, especially those lacking robust authentication. APIs are crucial for integrating AI into the many operations businesses now use it for, and while responsibility for API security is often assumed to lie with the solution provider, in reality it is shared between provider and consumer.
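Shared responsibility in practice means the API consumer authenticates and signs every call rather than relying on the provider alone. Below is a minimal Python sketch of HMAC-based request signing with replay protection; the header names and secret handling are assumptions for illustration, not any particular vendor's scheme.

```python
import hashlib
import hmac
import time

def sign_request(method: str, path: str, body: bytes, secret: bytes) -> dict[str, str]:
    """Client side: build a timestamp plus an HMAC-SHA256 signature over the request."""
    timestamp = str(int(time.time()))
    message = f"{method}\n{path}\n{timestamp}\n".encode() + body
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Hypothetical header names; real providers define their own scheme
    return {"X-Timestamp": timestamp, "X-Signature": signature}

def verify_request(method: str, path: str, body: bytes,
                   headers: dict[str, str], secret: bytes,
                   max_skew: int = 300) -> bool:
    """Server side: reject stale timestamps (replays) and signature mismatches."""
    try:
        ts = int(headers["X-Timestamp"])
    except (KeyError, ValueError):
        return False
    if abs(time.time() - ts) > max_skew:
        return False
    message = f"{method}\n{path}\n{ts}\n".encode() + body
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing
    return hmac.compare_digest(expected, headers.get("X-Signature", ""))
```

Timestamped HMAC signing is one common pattern; whatever the specific scheme, the point is that unauthenticated or weakly authenticated AI endpoints should be treated as an open door.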

Recent examples of AI API attacks include the ZenML breach and the Nvidia AI Platform vulnerability. Both issues have since been fixed by their respective providers, but more incidents are likely as cybercriminals broaden and diversify their attacks on software supply chains.

Securing your AI initiatives

These developments should not dissuade anyone from adopting AI, any more than phishing threats should stop you from using email. What they do mean is that AI has become the new frontier in cybercrime, and a robust security framework is needed in every phase of developing, deploying, using, and maintaining AI-driven technologies, whether proprietary or third-party.

To get there, businesses need comprehensive visibility into every component used in AI development, along with full interpretability and validation for each AI-generated output. None of this is achievable without human oversight and a security-first strategy. Conversely, if you view AI merely as a way to save time and cut costs by laying off staff, with little concern for the consequences, it is only a matter of time before disaster strikes.

AI-driven security solutions also play a key role in combating these threats. They augment skilled security analysts rather than replace them, empowering analysts to do what they do best at a scale that would otherwise be unachievable.