On Monday, Microsoft acknowledged a significant security lapse: the company had accidentally exposed 38 terabytes of private data, an issue it moved quickly to remediate.
The leak originated in a GitHub repository maintained by Microsoft's AI research team, where the company was publishing a collection of open-source training data.
The published data also included a disk backup of two former employees' workstations, containing confidential information such as secrets, keys, passwords, and more than 30,000 internal Teams messages.
The repository, known as “robust-models-transfer,” is no longer available. Before it was removed, it contained source code and machine learning models related to a research paper from 2020 titled “Do Adversarially Robust ImageNet Models Transfer Better?”
According to a report by cloud security firm Wiz, the breach stemmed from an overly permissive Shared Access Signature (SAS) token, an Azure feature that lets users share storage data in a way that is difficult to monitor and revoke. Microsoft was informed of the problem on June 22, 2023.
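For background on why SAS tokens are hard to monitor and revoke: an account SAS token is not recorded anywhere server-side. It is simply an HMAC-SHA256 signature over a newline-joined "string to sign" that anyone holding the account key can mint. The stdlib-only sketch below (with a hypothetical account name and a fake key) follows the documented field order of the account-SAS string-to-sign for pre-2020 service versions; newer service versions append additional fields.

```python
import base64
import hashlib
import hmac

def account_sas_signature(account_name, account_key_b64, permissions,
                          services, resource_types, start, expiry,
                          ip="", protocol="https", version="2019-02-02"):
    """Sign the account-SAS string-to-sign with the storage account key.

    Field order follows the Azure account-SAS spec for pre-2020 service
    versions; newer versions append fields such as signedEncryptionScope.
    """
    string_to_sign = "\n".join([
        account_name,    # storage account being shared
        permissions,     # e.g. "r" (read-only) vs "racwdl" (full control)
        services,        # b=blob, f=file, q=queue, t=table
        resource_types,  # s=service, c=container, o=object
        start,           # signed start time (may be empty)
        expiry,          # signed expiry; a distant date makes a long-lived token
        ip,              # optional IP range restriction
        protocol,        # allowed protocol(s)
        version,         # storage service version
    ]) + "\n"            # the spec requires a trailing newline
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")

# Hypothetical account and deliberately fake key, purely for illustration.
sig = account_sas_signature(
    account_name="examplestorage",
    account_key_b64=base64.b64encode(b"not-a-real-key").decode(),
    permissions="r", services="b", resource_types="co",
    start="", expiry="2023-10-01T00:00:00Z",
)
```

Because Azure never sees the token at creation time, revoking a leaked account SAS effectively requires rotating the account key itself.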
The repository's README.md instructed developers to download the machine learning models from a specific Azure Storage URL.
However, the SAS token embedded in that URL was scoped to the entire storage account rather than to the models alone.
As a result, anyone with the link could browse, and potentially retrieve, additional private data stored in that account, exposing far more than intended.
“In addition to the overly permissive access scope, the token was also misconfigured to allow ‘full control’ permissions instead of read-only,” Wiz researchers Hillai Ben-Sasson and Ronny Greenberg said. “Meaning, not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well.”
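To make the read-only versus full-control distinction concrete: a SAS URL encodes its granted permissions as single letters in the `sp` query parameter. The small sketch below, using hypothetical URLs modeled on the incident, decodes that field; `sp=r` grants read access only, while `sp=racwdl` amounts to the "full control" the researchers describe.

```python
from urllib.parse import urlparse, parse_qs

# Permission letters used in the "sp" field of a SAS token.
PERMISSION_NAMES = {
    "r": "read", "a": "add", "c": "create",
    "w": "write", "d": "delete", "l": "list",
}

def describe_sas_permissions(sas_url):
    """Return the human-readable permissions granted by a SAS URL's sp field."""
    query = parse_qs(urlparse(sas_url).query)
    sp = query.get("sp", [""])[0]
    return [PERMISSION_NAMES.get(ch, f"unknown({ch})") for ch in sp]

# Hypothetical URLs: the second grants far more than read access.
read_only = "https://example.blob.core.windows.net/models?sv=2019-02-02&sp=r&sig=..."
full_ctrl = "https://example.blob.core.windows.net/?sv=2019-02-02&sp=racwdl&sig=..."

print(describe_sas_permissions(read_only))  # ['read']
print(describe_sas_permissions(full_ctrl))  # ['read', 'add', 'create', 'write', 'delete', 'list']
```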
Following the disclosure, Microsoft investigated and reported that it found no evidence of unauthorized access to customer data, and that no other internal services were compromised or put at risk. The company stressed that customers do not need to take any action in response to the incident.
Microsoft acted promptly, revoking the SAS token and blocking all external access to the storage account. The issue was resolved two days after the responsible disclosure.
To prevent similar risks in the future, Microsoft has expanded its secret scanning service to detect any SAS token with an overly permissive expiration period or set of privileges.
It also fixed a bug that had caused its scanning system to dismiss the repository's SAS URL as a false positive. These measures are part of Microsoft's broader effort to harden security and minimize vulnerabilities.
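The kind of scanning Microsoft describes can be approximated in a few lines: find candidate SAS URLs in text, then flag any whose `sp` field grants more than read/list access or whose `se` expiry lies too far in the future. The sketch below is purely illustrative (hypothetical URL, arbitrary 7-day threshold), not a reconstruction of Microsoft's actual scanner.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

# Crude heuristic: a URL whose query string carries a SAS signature.
SAS_URL_RE = re.compile(r"https://\S*\?\S*sig=\S+")

def flag_risky_sas(text, max_lifetime_days=7, now=None):
    """Flag SAS URLs that are over-privileged or long-lived."""
    now = now or datetime.now(timezone.utc)
    findings = []
    for url in SAS_URL_RE.findall(text):
        q = parse_qs(urlparse(url).query)
        sp = q.get("sp", [""])[0]
        se = q.get("se", [""])[0]
        # Anything beyond read (r) and list (l) is treated as risky.
        if any(ch not in "rl" for ch in sp):
            findings.append(f"over-privileged (sp={sp}): {url}")
        if se:
            expiry = datetime.fromisoformat(se.replace("Z", "+00:00"))
            if expiry - now > timedelta(days=max_lifetime_days):
                findings.append(f"long-lived (se={se}): {url}")
    return findings

# Hypothetical README text with a token that fails both checks.
repo_text = """README: download the models from
https://example.blob.core.windows.net/models?sv=2019-02-02&sp=racwdl&se=2037-01-01T00:00:00Z&sig=abc123
"""
fixed_now = datetime(2023, 9, 18, tzinfo=timezone.utc)
for finding in flag_risky_sas(repo_text, now=fixed_now):
    print(finding)
```

A real scanner would also have to handle tokens split across lines, URL-encoded parameters, and service-SAS variants, which is part of why false positives and negatives, like the bug Microsoft fixed, are hard to avoid.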
Here’s what the Wiz researchers had to say:
“Due to the lack of security and governance over Account SAS tokens, they should be considered as sensitive as the account key itself,” the researchers said.
“Therefore, it is highly recommended to avoid using Account SAS for external sharing. Token creation mistakes can easily go unnoticed and expose sensitive data.”
This is not the first time misconfigured Azure storage accounts have raised security concerns. In July 2022, JUMPSEC Labs showed how threat actors could exploit such misconfigurations to pivot into an enterprise’s on-premises environment.
The episode underscores the continued need for robust security practices and vigilance in cloud services to safeguard sensitive data and systems.
The incident adds to a series of recent security lapses at Microsoft. Just under two weeks earlier, the company disclosed that hackers believed to be operating from China had breached its systems and obtained a highly sensitive signing key after compromising an engineer’s corporate account, through which they are suspected to have accessed a crash dump of the consumer signing system. These challenges highlight the ongoing work needed to protect Microsoft’s infrastructure and sensitive data from a range of threats.
“AI unlocks huge potential for tech companies. However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards,” Wiz CTO and co-founder Ami Luttwak said in a statement.
“This emerging technology requires large sets of data to train on. With many development teams needing to manipulate massive amounts of data, share it with their peers or collaborate on public open-source projects, cases like Microsoft’s are increasingly hard to monitor and avoid.”