Exposed Hugging Face API tokens jeopardized GenAI models

Lasso Security researchers discovered 1,681 Hugging Face API tokens exposed in code repositories, which left vendors such as Google, Meta, Microsoft and VMware open to potential supply chain attacks.

In a blog post published Monday, Lasso Security said the exposed API tokens gave its researchers access to 723 organizations’ GitHub and Hugging Face repositories, which contained high-value data on large language models (LLMs) and generative AI projects. Hugging Face, a data science community and development platform, says it hosts more than 500,000 AI models and 250,000 data sets.

According to Lasso Security, the exposed API tokens left organizations’ GenAI models and data sets open to a variety of threats, including supply chain attacks, poisoning of training data and theft of models. Bar Lanyado, security researcher at Lasso, wrote that 655 organizations’ tokens had write permissions, which gave the researchers full access to the repositories.

Some of the repositories that were open to full access were for platforms and LLMs such as the open source Meta Llama 2, EleutherAI’s Pythia and BigScience Workshop’s Bloom.

“The gravity of the situation cannot be overstated. With control over an organization boasting millions of downloads, we now possess the capability to manipulate existing models, potentially turning them into malicious entities,” Lanyado wrote in the blog post. “This implies a dire threat, as the injection of corrupted models could affect millions of users who rely on these foundational models for their applications.”

In a statement to TechTarget Editorial, Hugging Face said all exposed API tokens have been revoked, but the company appeared to put the blame primarily on customers. “The tokens were exposed due to users posting their tokens in platforms such as the Hugging Face Hub, GitHub and others,” the company said. “In general, we recommend users do not publish any tokens to any code hosting platform.”

Readers Also Like: New Mollars ICO Token may see Bigger ROI Yields than Chainlink, Fetch.AI, and Injective Combined - ZyCrypto

However, Lanyado wrote that Hugging Face bears responsibility as well, and recommended that it continually scan for exposed API tokens and either revoke them directly or notify users. “Organizations and developers should understand Hugging Face and other likewise platforms aren’t taking active actions for securing their users exposed tokens,” he wrote.

Lanyado credited several organizations with fast responses to Lasso Security’s findings. “Many of the organizations (Meta, Google, Microsoft, VMware, and more) and users took very fast and responsible actions, they revoked the tokens and removed the public access token code on the same day of the report,” he wrote in the blog post.

Hugging Face said it is working on measures that will better prevent other exposures in the future.

“All Hugging Face tokens detected by the security researcher have been invalidated and the team has taken and is continuing to take measures to prevent this issue from happening more in the future, for example, by giving companies more granularity in terms of permissions for their tokens with enterprise hub and detection of malicious behaviors,” the company said in its statement. “We are also working with external platforms like GitHub to prevent valid tokens from getting published in public repositories.”

Exposed Hugging Face API tokens jeopardized GenAI models – TechTarget

Searching for API tokens

Related posts:

You Might Also Like

Recommended For You