The Problem That Looks Like a Tuesday

It usually starts with a help desk ticket or a Slack message that says something like: “The model we deployed last month is throwing errors in production.” An engineer digs in. The culprit is not the model itself. It is a transitive dependency—a package that another package requires—that updated silently, broke a version constraint, and destabilized the environment your model runs in.

This is not an edge case. According to Anaconda’s Bridging the AI Model Governance Gap report, 67% of organizations experience AI deployment delays due to security and dependency issues. Meanwhile, only 25% describe their AI toolchain as highly unified. The rest are managing a patchwork of environments, registries, and scanning tools that don’t talk to each other, and hoping the seams hold.

The engineering team files this under “dev problem.” The security team says “just don’t use it.” The compliance officer doesn’t hear about it at all. And leadership wonders why the AI roadmap keeps slipping.

Dependency management in machine learning is not a developer inconvenience. It is an enterprise risk with direct exposure to security, regulatory compliance, reproducibility, and production reliability.

Why AI Dependency Chains Are Uniquely Dangerous

Software dependency management has always been complex. In AI and ML environments, it is an order of magnitude more complicated.

A typical enterprise Python machine learning stack might involve PyTorch or TensorFlow as a base framework, with dozens of downstream libraries, each with their own dependency trees: NumPy, SciPy, scikit-learn, Hugging Face Transformers, CUDA bindings, and more. A single model training pipeline can carry 150 to 400 transitive dependencies. Change one, and the ripple effects are rarely predictable.

The volatility compounds the problem. The ML ecosystem moves fast. Major libraries ship breaking changes across minor versions. CUDA and cuDNN version requirements shift with new hardware generations. Package maintainers deprecate APIs without coordinated notice. When a data scientist installs a new library to test a technique, they may unknowingly introduce a version conflict that doesn’t surface until a scheduled training job fails at 2 a.m.

The security surface is equally exposed. Open-source packages can contain malicious code, undisclosed vulnerabilities, or, increasingly, supply chain attacks where a dependency several layers deep has been compromised. In 2024, one enterprise entertainment company lost 1.1 terabytes of confidential data to an attack that succeeded not through sophisticated intrusion, but through a malicious file disguised as a legitimate AI tool. The entry point was exactly the kind of unvetted package that gets installed informally during experimentation.

And because AI environments are typically spun up by data scientists who prioritize experimentation speed, governance mechanisms like vulnerability scanning, registry controls, and environment pinning are often applied inconsistently, or not at all.

Where Standard Approaches Fall Short

Most enterprise teams handling this problem are running one of three playbooks, and each has a structural limitation when applied to AI environments.

pip install + requirements.txt / pyproject.toml. The traditional approach: pin versions in a requirements file, lock them with pip-compile or similar, and hope. The problem is that pip’s resolver was not designed for the deep, interacting constraint graphs that ML stacks generate: the legacy resolver installed greedily, and even the modern backtracking resolver can be slow, or fail outright, on heavily constrained environments. Worse, requirements files document what a developer intended to install, not what was actually installed—a subtle but critical distinction for reproducibility and audit.
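The intended-versus-installed gap is easy to see concretely. A minimal sketch (function names here are illustrative, not from any real tool) compares a loose requirements file against a frozen snapshot and surfaces the transitive dependencies that the requirements file never mentions:

```python
import re

def declared_names(requirement_lines):
    """Extract package names from specifiers like 'numpy>=1.24' or 'numpy==1.26.4'."""
    names = set()
    for line in requirement_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"[A-Za-z0-9_.\-]+", line)
        if match:
            names.add(match.group(0).lower())
    return names

def undeclared_installs(requirement_lines, frozen_lines):
    """Packages present in the frozen snapshot but never declared: the
    transitive dependencies an audit based on requirements.txt would miss."""
    return sorted(declared_names(frozen_lines) - declared_names(requirement_lines))

# A requirements file with two direct dependencies...
reqs = ["numpy>=1.24", "scikit-learn~=1.4"]
# ...freezes to more packages once transitive dependencies are resolved.
frozen = ["numpy==1.26.4", "scikit-learn==1.4.2", "scipy==1.13.0",
          "joblib==1.4.0", "threadpoolctl==3.4.0"]
print(undeclared_installs(reqs, frozen))  # ['joblib', 'scipy', 'threadpoolctl']
```

In a real ML stack the undeclared set is not three packages but hundreds, which is exactly why the requirements file alone cannot serve as an audit record.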

Virtual environments per project. Isolation helps, but it does not solve the governance problem. It relocates it. If every data scientist maintains their own conda environment or venv, security teams have no visibility into what is running across the organization. When a CVE is disclosed against a specific NumPy version, answering “where are we exposed?” means manually auditing dozens of disconnected environments.

Point-in-time security scans. Many enterprises run vulnerability scans at deployment time. This catches known CVEs at the moment of release, but it misses two classes of risk: vulnerabilities disclosed after deployment (and ML models stay in production far longer than typical web services) and the ongoing drift between the environment that was scanned and the environment that is actually running. Without continuous monitoring, a clean scan at deployment provides a false sense of security within weeks.

The core failure is that these approaches treat dependency management as a point-in-time technical task, when the actual risk is continuous, distributed, and crosses team boundaries.

A Governance-First Framework for ML Dependency Management

Addressing this at the enterprise level requires shifting from reactive dependency management to proactive dependency governance. The distinction is practical, not semantic. Governance means visibility, policy, accountability, and audit trail—not just tooling.

1. Treat the environment as an artifact, not a configuration.

Every ML environment—the combination of packages, versions, and dependencies used to train or serve a model—should be captured, versioned, and stored alongside the model itself. Conda environment exports (.yaml) or pip-compiled lockfiles (.txt) should be first-class artifacts in your model registry, not optional developer notes. When a model is promoted to production, the exact environment used to build it should be reproducible on demand. This is the foundation of both reproducibility and incident response.
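One way to make the environment a first-class artifact is to snapshot the interpreter’s exact installed package set at training time and write it beside the model file, so both promote together. A minimal sketch, using only the standard library; the directory layout and filenames are illustrative:

```python
from importlib.metadata import distributions
from pathlib import Path

def snapshot_environment():
    """Return the installed package set as exact 'name==version' pins."""
    pins = {f"{d.metadata['Name']}=={d.version}"
            for d in distributions() if d.metadata["Name"]}
    return sorted(pins, key=str.lower)

def save_with_model(model_dir):
    """Write the lockfile beside the model so both are versioned together."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    lockfile = model_dir / "environment.lock.txt"
    lockfile.write_text("\n".join(snapshot_environment()) + "\n")
    return lockfile

lock = save_with_model("artifacts/churn-model-v3")  # hypothetical model path
print(lock.read_text().splitlines()[:3])            # first few exact pins
```

The same snapshot can be produced with `pip freeze` or `conda env export`; the point is that it is captured automatically at training time and stored in the model registry, not left to developer discipline.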

2. Implement an internal package registry with vetted, curated distributions.

Allowing data scientists to install packages directly from PyPI or conda-forge without review is the equivalent of allowing employees to install arbitrary software from the internet. An internal registry populated with pre-vetted, vulnerability-scanned packages gives security teams control over what enters the environment while giving developers the flexibility they need to work. The security team defines the approved perimeter; data scientists operate within it without friction.
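Operationally, routing installs through an internal registry can be as simple as pointing the package manager’s index at it. A hypothetical pip configuration, where the URL is illustrative and your registry (Artifactory, devpi, a curated conda channel, etc.) supplies the real value:

```ini
# pip.conf (or pip.ini on Windows): all installs resolve against the
# internal, vetted index instead of public PyPI.
[global]
index-url = https://packages.internal.example.com/pypi/simple
```

With this in place, `pip install` behaves exactly as developers expect, but the universe of installable packages is the one the security team has approved.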

3. Generate and maintain an AI Bill of Materials (AIBOM).

Software Bill of Materials (SBOM) practices—now increasingly required by regulators and enterprise security standards—need to extend explicitly to AI and ML supply chains. An AIBOM documents not just which packages are used, but their provenance, license terms, version history, and known vulnerability status. When a new CVE is disclosed, an AIBOM lets your security team immediately identify which models and pipelines are affected, turning a multi-day investigation into a targeted, rapid response.
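A minimal sketch of the idea, not a full implementation: record each installed package’s name, version, and license alongside a model identifier. Real deployments would emit a standard format such as CycloneDX or SPDX and enrich it with provenance and vulnerability data; the fields and structure here are illustrative only.

```python
import json
from importlib.metadata import distributions

def build_aibom(model_name, model_version):
    """Assemble a minimal bill of materials for the current environment."""
    components = []
    for dist in distributions():
        meta = dist.metadata
        if meta["Name"] is None:  # skip broken metadata entries
            continue
        components.append({
            "name": meta["Name"],
            "version": dist.version,
            "license": meta.get("License", "UNKNOWN"),
        })
    return {
        "model": {"name": model_name, "version": model_version},
        "components": sorted(components, key=lambda c: c["name"].lower()),
    }

aibom = build_aibom("churn-model", "3.1.0")  # hypothetical model identifier
print(json.dumps(aibom, indent=2)[:200])
```

The value shows up at CVE time: a query over stored AIBOMs immediately answers which models carry the affected package and version.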

4. Separate experimentation environments from governed production paths.

Data scientists need freedom to explore. Production models need stability and accountability. These goals are not incompatible, but they require structural separation. Sandbox environments can allow broader package access for experimentation; promotion to staging and production requires passing through a governed pipeline that validates dependencies against your security policy, checks licenses for compliance, and generates an audit record. The governance checkpoint happens at the boundary, not at the expense of exploration velocity.
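The promotion-gate check can be sketched as a pure function: before a model moves to staging, its lockfile is validated against a deny list. The policy format and function names are hypothetical; a real pipeline would pull denied versions from a vulnerability feed rather than a hard-coded dict.

```python
def check_lockfile(pins, denied):
    """pins: ['numpy==1.26.4', ...]; denied: {'pyyaml': {'5.3.1'}, ...}.
    Returns the pins that violate policy (empty list means the gate passes)."""
    violations = []
    for pin in pins:
        name, _, version = pin.partition("==")
        if version in denied.get(name.lower(), set()):
            violations.append(pin)
    return violations

# PyYAML 5.3.1 is affected by a known deserialization CVE (CVE-2020-14343).
policy = {"pyyaml": {"5.3.1"}}
clean = ["numpy==1.26.4", "pyyaml==6.0.1"]
dirty = ["numpy==1.26.4", "pyyaml==5.3.1"]

print(check_lockfile(clean, policy))  # []
print(check_lockfile(dirty, policy))  # ['pyyaml==5.3.1']
```

In CI, a non-empty result fails the promotion job and writes the audit record; the data scientist’s sandbox is never touched.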

5. Monitor continuously, not just at deployment.

AI models run in production for months, sometimes years. A dependency that was clean at deployment can become a critical vulnerability six months later. Continuous monitoring, meaning automated alerts when new CVEs are disclosed against deployed packages and drift detection when environments diverge from their baseline, is not an operational luxury. For organizations operating under SOC 2, HIPAA, or EU AI Act requirements, it is increasingly a compliance obligation.
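Drift detection itself is a small diff: compare the environment’s current pins against the baseline captured at deployment and report anything added, removed, or changed. A minimal sketch; in practice the baseline comes from the lockfile stored with the model and the check runs on a schedule.

```python
def environment_drift(baseline, current):
    """Both arguments are lists of exact 'name==version' pins."""
    base = dict(pin.split("==") for pin in baseline)
    cur = dict(pin.split("==") for pin in current)
    return {
        "added": sorted(set(cur) - set(base)),
        "removed": sorted(set(base) - set(cur)),
        "changed": sorted(n for n in set(base) & set(cur) if base[n] != cur[n]),
    }

baseline = ["numpy==1.26.4", "scipy==1.13.0", "requests==2.31.0"]
current  = ["numpy==1.26.4", "scipy==1.14.0", "urllib3==2.2.1"]
print(environment_drift(baseline, current))
# {'added': ['urllib3'], 'removed': ['requests'], 'changed': ['scipy']}
```

A non-empty result means the running environment is no longer the one that was scanned and approved, which is exactly the gap point-in-time scans leave open.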

The Regulatory Horizon Makes This Urgent

AI governance is no longer an internal best practice conversation. The EU AI Act, with most provisions applicable from August 2026, introduces explicit requirements for documentation, traceability, and risk management in AI systems. Organizations using AI in high-risk contexts (healthcare, financial services, critical infrastructure, HR) will need to demonstrate not just that their models perform as intended, but that they can account for the entire development and deployment environment.

This makes dependency governance urgent now, not later: the records regulators will ask for can only be produced by practices already in place when the model was built. If you cannot reproduce the environment in which a model was trained, you cannot reliably explain its behavior. If you cannot show that your packages were scanned for vulnerabilities at deployment and monitored continuously thereafter, you cannot demonstrate the security controls regulators are beginning to require.

The organizations building these practices now—environment-as-artifact, internal registries, AIBOM generation, governed promotion pipelines—will have a meaningful head start. Those treating dependency management as a tactical dev problem will face these requirements reactively, under time pressure, with incomplete records.

The teams that win on AI governance will not be the ones with the most restrictive policies. They will be the ones that make secure practices the path of least resistance for their developers.

Governance at the Ecosystem Level

The practical challenge of implementing dependency governance at enterprise scale is that it requires consistent tooling across every team, project, and environment. Ad hoc solutions—a scanner here, a registry there, lockfiles when someone remembers—produce inconsistent security posture and incomplete audit trails.

Curated, enterprise-grade open-source ecosystems address this by embedding governance at the infrastructure level. When the packages available to your teams are pre-vetted, vulnerability-scanned, and license-filtered before they are ever installed, security is not a checkpoint that slows developers down. It is a property of the environment they operate in. When SBOM generation is automated and environment artifacts are captured as part of standard workflow, compliance documentation is produced continuously, not scrambled together before an audit.

The data is instructive: Forrester’s Total Economic Impact report found that organizations using automated vulnerability scanning reduced security incidents by up to 60%. Not because the scanning was uniquely sophisticated, but because it was systematic and applied consistently across the entire package surface—not just at deployment or when someone remembered to run it.

Dependency hell is a real phenomenon. But it is not inevitable. It is the predictable result of treating a governance problem as a development problem. Organizations that recognize the distinction and build their ML infrastructure accordingly will find that the same practices that improve security also improve reproducibility, accelerate compliance, and reduce the deployment delays that currently cost AI projects their momentum.

Ishita Verma is a Senior Machine Learning Engineer at Netflix on the Core Recommendations team, where she designs and scales AI systems that power member experiences and large-scale personalization worldwide. Previously, she worked as a Senior Machine Learning Engineer at Adobe on the Globalization team, leading the development and deployment of AI and generative AI capabilities across products. Her work focuses on building high-impact recommender systems, advancing generative AI, and delivering robust, scalable machine learning solutions in production.