David Dewaele, Author at Checkmarx
https://checkmarx.com/author/david-dewaele/

The AI Inventory Gap: Why Your Organization Has No Idea What AI Assets Are Part of Your Software Supply Chain
https://checkmarx.com/blog/ai-llm-tools-in-application-security/the-ai-inventory-gap-why-your-organization-has-no-idea-what-ai-assets-are-part-of-your-software-supply-chain/
Sun, 11 Jan 2026 10:42:46 +0000

Your developers are already embedding or calling AI assets as part of your applications, whether you know it or not. Models, weights, MCPs, agent frameworks, and AI libraries are quietly making their way into codebases.

Once these AI assets land in your repositories or container images, they become part of your software supply chain. The next Log4j doesn’t have to be a package; it can just as easily be a model, an MCP, or an AI asset you didn’t even know you shipped.

AI supply chain risks include any risks introduced by AI assets that become part of your software supply chain, ranging from poisoned data to malicious or over-privileged MCPs and agents to assets of unknown provenance.

And yet, despite this rapid adoption, organizations can’t answer a simple question:  

What AI components are we using in our software development, and where?  

Without visibility, AI-related risk compounds: AI assets spread across codebases without security review, inventory, or policy controls, creating new blind spots or widening existing ones across your software supply chain.  

Why Most Organizations Lack Visibility into Their Devs’ AI Usage 

AI adoption is growing rapidly across organizations. In fact, our recent Future of AppSec report found that one in three respondents said over 60% of their organization’s code is written by AI. Yet only 18% have any form of AI governance in place.

This rapid growth in AI usage (especially among developers), combined with a lack of oversight, has created a visibility gap fueled by fragmentation and by tooling that was never built to address AI-specific risks.

Here are the main reasons:  

  • Emergence of new AI-focused protocols and technologies. Example: MCP (Model Context Protocol) was introduced only in November 2024.
  • AI fragmentation (Copilot, Claude, Microsoft, OpenAI, etc.). Multiple providers, teams picking different tools, and no standardization mean many tools with different security approaches.
  • Code security and supply chain security require different approaches. AppSec tools are evolving to detect AI-related code vulnerabilities: identifying prompt injections, tracking sensitive data flows to LLMs, and flagging improper output handling. This addresses how developers write and use AI in their code. But a separate challenge remains: gaining visibility into what AI assets exist across your software supply chain. Traditional scanners analyze data flows and patterns, but AI supply chain security requires comprehensive asset inventory, provenance tracking, and governance.

What is the AI Inventory Gap? 

The AI Inventory Gap refers to all the AI-related components embedded in your applications that your organization hasn’t tracked, reviewed, or governed, yet still ships as part of your software supply chain. 

It typically includes: 

  • Models & weights: Pre-trained or fine-tuned LLMs, CV models, embeddings 
  • Agent frameworks: Software toolkits and structures for building, managing, and orchestrating autonomous AI agents
  • MCP servers: Programs that enable AI models, particularly large language models (LLMs), to access external data, tools, and workflows, acting as a bridge between AI agents and the real world
  • Datasets: Training and evaluation data, sometimes with sensitive or licensed content 
  • Prompts: Operational logic dispersed across code and configuration 
  • AI libraries & integrations: SDKs, connectors, and wrappers that pull AI into runtime 
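
To make these categories concrete, here is a minimal sketch of how a repository scan might map files and dependencies to asset types. The file extensions, package names, and the `discover` function are illustrative assumptions rather than a complete rule set or any vendor's detection logic; a real inventory tool would need far richer signals (container layers, prompts in config, MCP server definitions, and so on).

```python
import re
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical, deliberately small signature sets for illustration only.
MODEL_EXTENSIONS = {".safetensors", ".gguf", ".onnx", ".pt", ".pth"}
DATASET_EXTENSIONS = {".parquet", ".jsonl"}  # heuristic: not every such file is a dataset
AI_LIBRARIES = {"openai", "anthropic", "transformers", "torch"}
AGENT_FRAMEWORKS = {"langchain", "langgraph", "crewai"}


def classify_file(path: Path) -> Optional[str]:
    """Map a file to an asset type based on its extension, or None."""
    suffix = path.suffix.lower()
    if suffix in MODEL_EXTENSIONS:
        return "model"
    if suffix in DATASET_EXTENSIONS:
        return "dataset"
    return None


def classify_dependency(name: str) -> Optional[str]:
    """Map a package name to an asset type, or None if it is not AI-related."""
    if name in AGENT_FRAMEWORKS:
        return "agent_framework"
    if name in AI_LIBRARIES:
        return "library"
    return None


def requirement_name(line: str) -> Optional[str]:
    """Extract the package name from one requirements.txt line."""
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):  # skip blanks, comments, pip options
        return None
    return re.split(r"[\s=<>!~;\[@]", line, maxsplit=1)[0].lower()


def discover(root: Path) -> List[Dict[str, str]]:
    """Walk a repository and return one finding per detected AI asset."""
    findings: List[Dict[str, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        kind = classify_file(path)
        if kind:
            findings.append({"type": kind, "name": rel, "location": rel})
        if path.name == "requirements.txt":
            for line in path.read_text(encoding="utf-8").splitlines():
                name = requirement_name(line)
                dep_kind = classify_dependency(name) if name else None
                if dep_kind:
                    findings.append({"type": dep_kind, "name": name, "location": rel})
    return findings
```

Calling `discover(Path("."))` from a repository root returns a list of dictionaries that can seed an initial inventory. Because the rules are simple lookups, the same input always yields the same findings.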

The Risks of the AI Inventory Gap 

When AI components operate without control, the consequences are serious: hidden attack surfaces, operational surprises, compliance failures, and reputational damage, often discovered only after an incident or audit begins.

Key risks include:  

  • Model poisoning: Silently introduces backdoors, blind spots, or biased behavior that attackers exploit without detection 
  • Unverified or malicious weights: Sourced from unknown origins without integrity checks. Loading them is like running untrusted binaries: they can expose you to remote code execution, contain hidden payloads or logic, or create backdoors for data exfiltration or resource abuse.
  • Dataset exposure: Sensitive or licensed data leaked via training, prompts, or logs 
  • Unsafe agents & tools: Autonomous agents that can access files, networks, or services without guardrails 
  • Unpinned versions: Silent updates to models or libraries change behavior and risk posture overnight. Unpinned versions can allow unexpected or malicious updates to be introduced automatically, leading to supply-chain attacks, breaking changes, or non-reproducible and insecure builds. 
  • Compliance gaps: Missing documentation, provenance, and audit trails lead to penalties and delays 
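
Two of the risks above, unverified weights and unpinned versions, lend themselves to simple automated checks. The sketch below is a minimal illustration: it assumes you record a trusted SHA-256 digest for each weights file when it is first reviewed and that Python dependencies live in a `requirements.txt`-style list. The function names are hypothetical. A matching digest only shows the file has not changed since review; it says nothing about whether the reviewed file was safe in the first place.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file matches the digest recorded at review time."""
    return sha256_of(path) == expected_sha256.strip().lower()


def unpinned_requirements(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    flagged = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):  # skip blanks, comments, pip options
            continue
        if "==" not in line:
            flagged.append(line)
    return flagged
```

A build step could call `verify_weights` before loading any model file and fail when `unpinned_requirements` returns a non-empty list.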

AI Governance: Can You Trace Every Model, Dataset, and Dependency? 

AI adoption is following a pattern similar to what we see with open-source software: developers move fast while governance lags behind.  

From a compliance point of view, reporting on and maintaining an inventory of AI components is no longer optional.

But the reality is messy: 

  • Regulatory Expectations Are Rising: Frameworks and regulations (e.g., EU AI-related requirements, AI governance standards) demand accountability and evidence, yet teams lack the tooling to inventory, assess, and report on AI usage across the enterprise. Compliance becomes reactive and costly.
  • Developer-Led AI Adoption Outpaces Governance: Developers integrate models, datasets, and frameworks to solve real problems fast. If governance processes are slow or unclear, people ship and promise to “clean it up later.” Those quick wins become permanent dependencies, often without reviews, version pinning, or provenance checks. 
  • Fragmented Work Processes Make Inventory Hard: AI usage spans multiple teams and repos: data science, platform engineering, mobile, web, back-end, cloud functions. Without a central AI inventory, leadership cannot answer basic questions about which AI assets are in use, where they are deployed, and what risks are attached. This leads to reactive security and the risk of dealing with vulnerabilities only after it’s too late.
  • No Clear Ownership of AI Governance: Developers are focused on shipping features. They experiment with models, libraries, MCPs, and agent frameworks to solve problems quickly—not to define governance boundaries or maintain inventories. 

Application security teams, meanwhile, are often left to document and report on AI usage after the fact. They’re suddenly asked to answer questions about models, datasets, and agents embedded across the software supply chain, without the visibility or tooling needed to do so. 

At the leadership level, responsibility is also often fragmented. CTOs drive adoption and velocity, CISOs are accountable for risk and compliance, and no single function clearly owns end-to-end governance of AI assets. The result is predictable: AI moves fast, ownership stays unclear, and Shadow AI fills the gap. 

What Can Organizations Do About It?

AI inventory isn’t a problem you solve with a single control. It requires ownership, visibility, and governance embedded into how software is built. 

This is a relatively new challenge, and while tooling is evolving, organizations can take concrete steps today to regain control: 

  • Assign ownership: Name a clear owner of the AI inventory, with defined responsibilities and authority, to whom other teams report.
  • Baseline your AI usage: Run deterministic discovery across prioritized repos and services to build an initial inventory. 
  • Classify & assess risks: Tag assets by type (model, agent, dataset, prompt, library) and apply AI-specific risk checks. 
  • Generate AI‑BOMs: Produce standards-aligned BOMs with provenance, licensing, dependencies, and risk metadata. 
  • Define your policies: Maintain deny and allow lists of assets, set acceptable risk thresholds, block builds for specific types of risk, and so on.
  • Embed governance where work happens: Add PR checks, CI/CD gates, and dashboards to enforce policies and track trends.
  • Measure & iterate: Monitor coverage, findings, MTTR, and compliance posture. Expand to more teams, apps, and environments. 
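
The inventory, policy, and CI gate steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `AIAsset` fields, the `Policy` rules, and the JSON layout are hypothetical and do not follow any particular BOM standard, so a standards-aligned AI-BOM would carry more metadata (licensing, dependencies, risk scores).

```python
import json
from dataclasses import asdict, dataclass
from typing import FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class AIAsset:
    """One inventory entry; field names are illustrative, not a standard schema."""
    name: str
    type: str                 # model, agent, dataset, prompt, or library
    version: Optional[str]    # None means the asset is not pinned
    source: str               # where the asset came from (provenance)
    sha256: Optional[str]     # integrity digest, if one was recorded


@dataclass(frozen=True)
class Policy:
    """A hypothetical policy: a deny list plus two hygiene requirements."""
    denied_names: FrozenSet[str] = frozenset()
    require_pinned_version: bool = True
    require_digest_for: FrozenSet[str] = frozenset({"model"})


def evaluate(assets: Sequence[AIAsset], policy: Policy) -> List[str]:
    """Return human-readable policy violations, in inventory order."""
    violations = []
    for asset in assets:
        if asset.name in policy.denied_names:
            violations.append(f"{asset.name}: denied by policy")
        if policy.require_pinned_version and not asset.version:
            violations.append(f"{asset.name}: version is not pinned")
        if asset.type in policy.require_digest_for and not asset.sha256:
            violations.append(f"{asset.name}: missing integrity digest")
    return violations


def to_bom(assets: Sequence[AIAsset]) -> str:
    """Serialize the inventory as JSON for reporting or archiving."""
    return json.dumps({"components": [asdict(a) for a in assets]}, indent=2)


def gate(assets: Sequence[AIAsset], policy: Policy) -> int:
    """Return a process exit code: 0 to pass the build, 1 to block it."""
    violations = evaluate(assets, policy)
    for violation in violations:
        print(f"POLICY VIOLATION: {violation}")
    return 1 if violations else 0
```

In a pipeline, a script would build the asset list from discovery results, write `to_bom(...)` to an artifact, and pass the result of `gate(...)` to `sys.exit` so that violations fail the PR check.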

Final Thoughts  

AI has moved beyond the experiment phase. It is now part of the day-to-day reality of modern development teams, already deeply embedded into modern software stacks. But without visibility, every untracked model, dataset, or agent becomes a potential vulnerability.  

The bottom line? If you can’t see the AI in your software, you can’t control the risk.  

How Checkmarx Defends Against the Shai-Hulud Second Coming Malicious Package Campaign
https://checkmarx.com/blog/how-checkmarx-defends-against-the-shai-hulud-second-coming-malicious-package-campaign/
Sun, 30 Nov 2025 05:49:01 +0000

On 24 November 2025, news broke of a major attack against npm, the open-source package registry that is the primary source of dependencies for JavaScript and TypeScript applications. Checkmarx responded rapidly to keep our customers safe. The attack was a more aggressive and stealthier successor to the earlier Shai-Hulud attack; the attackers called it a “Second Coming” of Shai-Hulud, named after the fictional great worm from the Dune science-fiction novels.

This malicious package campaign created a self-replicating “worm” that:

  1. Steals GitHub, npm, and related credentials from developer workstations and CI/CD environments.
  2. Uses those credentials to infect other npm packages (over 770 as of this writing) and GitHub repositories (over 27,000 as of this writing), allowing the malicious code to spread on its own. 
  3. Deletes user directories (also known as home directories) if it is unable to successfully harvest credentials, causing damage to developer workstations, failed builds, and the associated lost productivity. 

The Checkmarx Zero security research team continuously identifies potentially affected packages and, once they are verified as malicious, adds them to our Malicious Package Protection (MPP) system. Our customers who use MPP are alerted if any of their applications consume one of the infected package versions, so that security teams can respond quickly to address the threat.

And customers who adopt the Malicious Package Identification API (MPIAPI) as a proactive defense can actively block the installation of package versions infected with Shai-Hulud or the Shai-Hulud Second Coming, preventing the compromise from occurring in the first place.  

These defenses are possible because Checkmarx maintains the world’s largest human-verified database of malicious open-source packages. 

Malicious Packages: The Exploit That Installs Itself 

Everyone worries about vulnerabilities, but malicious packages are unique in that they don’t wait to be exploited – they are the exploit. 

When you think of software supply chain threats, vulnerabilities usually come to mind: legitimate open-source packages with hidden weaknesses that attackers can exploit. But these flaws require a trigger: an attacker, a campaign, a moment of exploitation.

Malicious packages are different. 
They’re the attacker’s code, published directly into public repositories like npm or PyPI, but disguised as legitimate software. The moment they’re installed, they execute harmful code inside your environment, no exploit needed. These packages can exfiltrate credentials, steal data, or establish persistent access before you even know they’re there. 

That’s what makes malicious packages the most insidious threat in modern software development: they bypass traditional vulnerability scanners because they embed the attacker directly into your supply chain. 

Malicious Packages Are Everyone’s Problem 

Malicious packages aren’t just a developer mistake or a DevOps oversight – they’re a business risk. Once installed, they can: 

  • Exfiltrate sensitive data and credentials 
  • Compromise systems and CI/CD pipelines 
  • Leak intellectual property 
  • Disrupt operations and introduce backdoors 
  • Damage customer trust and your organization’s reputation 
  • Trigger regulatory and compliance violations 

In a hyperconnected ecosystem, one malicious dependency can cascade across partners, customers, and entire industries. 

Proactive Defense With Checkmarx Malicious Package Protection 

The best defense against malicious packages is to detect and block them before they ever enter your environment. 

Checkmarx Malicious Package Protection (MPP) provides complete, automated protection that fits seamlessly into existing workflows: 

  • Within Checkmarx Software Composition Analysis (SCA): Customers receive automated alerts whenever a malicious package is detected, along with safe, vetted alternatives. 
  • Through the Checkmarx Malicious Package Identification API: 
    Teams can integrate detection at key checkpoints. This provides full flexibility to be protected even within a team’s own processes and pipelines: 
    – Before downloading from public repositories (npm, Maven Central, etc.) 
    – Before adding or retrieving from private registries 
    – During SCA scans of existing dependencies 
    – In CI/CD build stages, prior to installation 
  • Within Checkmarx AI Developer Assist: Developers are shielded directly in their IDE, ensuring they never inadvertently import malicious open-source dependencies. 
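
As a rough illustration of the "prior to installation" checkpoint, the sketch below compares the exact package versions in an npm `package-lock.json` (lockfile version 2 or 3, which lists entries under a `packages` key) against a deny list before `npm install` runs. It does not use the Checkmarx Malicious Package Identification API, whose interface is not described here; the deny list is assumed to be a local set of `(name, version)` pairs obtained from whatever intelligence source you use, and the function names are hypothetical.

```python
import json
from typing import List, Set, Tuple

PackageVersion = Tuple[str, str]


def locked_packages(lock_text: str) -> Set[PackageVersion]:
    """Collect (name, version) pairs from a package-lock.json v2/v3 document."""
    data = json.loads(lock_text)
    found: Set[PackageVersion] = set()
    for key, meta in data.get("packages", {}).items():
        if not key:  # the empty key is the root project itself
            continue
        # Keys look like "node_modules/a/node_modules/@scope/b"; the package
        # name is whatever follows the last "node_modules/" segment.
        name = key.rsplit("node_modules/", 1)[-1]
        version = meta.get("version")
        if version:
            found.add((name, version))
    return found


def blocked_packages(lock_text: str, denylist: Set[PackageVersion]) -> List[PackageVersion]:
    """Return locked package versions that appear on the deny list, sorted."""
    return sorted(locked_packages(lock_text) & denylist)
```

A CI step could read the lockfile, call `blocked_packages`, and abort before installation if the result is non-empty. Matching on exact versions matters because a campaign typically infects specific releases of otherwise legitimate packages.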

This multilayered approach ensures your pipelines, developers, and software assets stay protected at every stage of the software lifecycle. 

Built on the World’s Largest Malicious Package Database 

Effectiveness depends on intelligence, and Checkmarx leads with the world’s largest malicious package database, powered by years of research and continuous monitoring.

  • Over 420,000 malicious packages identified across 92.8 million versions
  • Coverage across ecosystems including PyPI, npm, RubyGems, NuGet, and Maven Central
  • Powered by advanced automation and Checkmarx Zero, our dedicated research team, which manually validates every package before inclusion

This unparalleled intelligence ensures accuracy, reliability, and confidence when identifying emerging threats before they become public. 

Stay Ahead of the Next Attack 

The difference between containing the attack and being compromised comes down to proactive visibility and automated prevention. Checkmarx Malicious Package Protection enables you to stay ahead of attackers, protect your software supply chain, and empower developers to innovate safely. 

Want to learn more about how Checkmarx can protect your organization against the next attack? Contact us or see it in action.

Learn more about Checkmarx Malicious Package Protection 

Explore Checkmarx Security Research website 
