The Hidden Dangers of AI/ML Libraries: Remote Code Execution Unveiled
We've uncovered critical vulnerabilities in three open-source AI/ML Python libraries that allow remote code execution (RCE) when loading model files with malicious metadata. The exposure is broad: these libraries are used by popular models on Hugging Face with tens of millions of downloads between them.
The Culprits: NeMo, Uni2TS, and FlexTok
These libraries, NeMo by NVIDIA, Uni2TS by Salesforce, and FlexTok by Apple and EPFL VILAB, are designed for research and development of diverse AI/ML models and complex systems. The vulnerabilities stem from how they use metadata to configure models: a shared third-party library executes parts of the provided metadata as code, so an attacker who controls a model's metadata can embed arbitrary code in it.
The Discovery and Response
Our investigation revealed that vulnerable versions of these libraries enable RCE. Palo Alto Networks took swift action, notifying affected vendors in April 2025. NVIDIA, Salesforce, and Apple responded with fixes and CVE records, addressing the issues.
The Hydra Connection
All identified vulnerabilities involve the hydra.utils.instantiate() function, which is designed to instantiate different implementations of an interface based on a configuration object. However, because the configuration can name any importable callable, attackers can abuse this function for RCE via standard Python functions such as eval() and os.system().
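To illustrate the pattern, here is a minimal stdlib-only sketch of how Hydra-style instantiation resolves a dotted `_target_` path to a callable and invokes it. This is not Hydra's actual implementation, only an illustration of why feeding it attacker-controlled metadata is dangerous.

```python
import importlib

def instantiate_sketch(config: dict):
    """Sketch of Hydra-style instantiation: resolve the dotted
    `_target_` path to a callable via import, then call it with
    the supplied positional arguments."""
    module_path, _, attr_name = config["_target_"].rpartition(".")
    module = importlib.import_module(module_path)
    target = getattr(module, attr_name)
    return target(*config.get("_args_", []))

# A benign config instantiates an ordinary object...
benign = {"_target_": "collections.Counter", "_args_": ["aabbc"]}
print(instantiate_sketch(benign))  # Counter({'a': 2, 'b': 2, 'c': 1})

# ...but attacker-controlled metadata can name any importable callable,
# e.g. builtins.eval or os.system, turning config loading into RCE.
malicious = {"_target_": "builtins.eval", "_args_": ["6 * 7"]}
print(instantiate_sketch(malicious))  # 42
```

Nothing in the resolution step distinguishes a model class from `os.system`; the safety of the call depends entirely on the trustworthiness of the `_target_` string.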
NeMo's Vulnerability
NVIDIA's NeMo library, a scalable AI framework, uses its own file formats with .nemo and .qnemo extensions. The main entry points, restore_from() and from_pretrained(), are vulnerable because metadata is not sanitized before instantiation. This allows attackers to craft malicious model_config.yaml files that lead to RCE when the model is loaded.
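One way to reduce this risk is to vet the parsed configuration before any Hydra-style instantiation takes place. The following is a hypothetical defensive check, not NeMo's actual code or the vendor's fix: it walks the metadata and flags any `_target_` value outside an explicit allowlist (the prefixes below are illustrative).

```python
# Illustrative allowlist of module prefixes considered safe to instantiate.
ALLOWED_TARGET_PREFIXES = ("nemo.", "torch.")

def find_disallowed_targets(node, found=None):
    """Recursively collect `_target_` strings that fall outside the
    allowlist, descending into nested dicts and lists."""
    if found is None:
        found = []
    if isinstance(node, dict):
        target = node.get("_target_")
        if isinstance(target, str) and not target.startswith(ALLOWED_TARGET_PREFIXES):
            found.append(target)
        for value in node.values():
            find_disallowed_targets(value, found)
    elif isinstance(node, list):
        for item in node:
            find_disallowed_targets(item, found)
    return found

# Metadata shaped like a tampered model_config.yaml after parsing
# (the class and structure here are illustrative, not real NeMo config):
config = {
    "model": {"_target_": "nemo.collections.asr.models.EncDecCTCModel"},
    "optim": {"sched": {"_target_": "os.system", "_args_": ["id"]}},
}
print(find_disallowed_targets(config))  # ['os.system']
```

An allowlist is preferable to a denylist here, since attackers can reach code execution through many importable callables beyond the obvious `eval()` and `os.system()`.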
Uni2TS and FlexTok: Exploiting Safetensors
Uni2TS and FlexTok, designed for time series analysis and image processing respectively, exclusively work with the safetensors format, which is intended to be safe. However, both libraries use Hydra to load configurations from model metadata, enabling RCE. Uni2TS leverages a mechanism in huggingface_hub to decode configurations, while FlexTok uses Python's ast.literal_eval() for decoding.
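The subtlety is that ast.literal_eval() is genuinely safe as a decoder: it only builds Python literals and rejects expressions and function calls. The danger lies one step later, when the safely decoded dictionary is handed to Hydra for instantiation. A short sketch (the metadata string below is illustrative, not taken from a real model):

```python
import ast

# Safetensors metadata values are strings; FlexTok-style code decodes
# them with ast.literal_eval, which safely builds a plain dict.
metadata_value = "{'_target_': 'os.system', '_args_': ['id']}"
decoded = ast.literal_eval(metadata_value)
print(decoded["_target_"])  # os.system

# literal_eval itself refuses to evaluate actual code:
try:
    ast.literal_eval("__import__('os').system('id')")
except ValueError:
    print("literal_eval rejected the expression")

# But if the decoded dict is later passed to hydra.utils.instantiate(),
# the `_target_` string is imported and called, and the payload runs.
```

In other words, the safe decoder does not neutralize the payload; it faithfully delivers it to the unsafe instantiation step.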
The Impact and Mitigation
As of January 2026, we have detected no malicious activity exploiting these vulnerabilities, but the potential for exploitation remains. Palo Alto Networks offers enhanced protection through Prisma AIRS and Cortex Cloud's Vulnerability Management, identifying vulnerable models and managing base images for cloud environments. The Unit 42 AI Security Assessment further assists organizations in securing AI innovation and governance.
The Ongoing Battle
While newer formats and updates address security issues, they don't make applications impervious to traditional exploits. The attack surface is vast, with numerous libraries and formats in use. Palo Alto Networks remains vigilant, sharing findings with the Cyber Threat Alliance to disrupt malicious cyber actors and protect customers.
The Takeaway
This discovery highlights the importance of comprehensive security assessments in AI/ML development. As AI/ML models become more sophisticated, so do the threats they face. It's crucial to stay vigilant and implement robust security measures to safeguard these powerful technologies.