The Microsoft AI research division accidentally leaked dozens of terabytes of sensitive data starting in July 2020 while contributing open-source AI learning models to a public GitHub repository.
Almost three years later, cloud security firm Wiz discovered the exposure: its researchers found that a Microsoft employee had inadvertently shared the URL of a misconfigured Azure Blob storage bucket containing the leaked data.
Microsoft attributed the data exposure to an excessively permissive Shared Access Signature (SAS) token, which allowed full control over the shared files. This Azure feature enables data sharing in a manner that Wiz researchers described as challenging to monitor and revoke.
When used correctly, SAS tokens offer a secure means of granting delegated access to resources within a storage account.
This includes precise control over a client's data access: which resources the client can interact with, what permissions it holds on those resources, and how long the SAS token remains valid.
"Due to a lack of monitoring and governance, SAS tokens pose a security risk, and their usage should be as limited as possible. These tokens are very hard to track, as Microsoft does not provide a centralized way to manage them within the Azure portal," Wiz warned today.
"In addition, these tokens can be configured to last effectively forever, with no upper limit on their expiry time. Therefore, using Account SAS tokens for external sharing is unsafe and should be avoided."
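The revocation problem Wiz describes follows from how SAS tokens work: a token is just a signed query string, computed from the storage account key with HMAC-SHA256, so Azure keeps no registry of issued tokens and can only invalidate them by rotating the key. The stdlib-only sketch below illustrates that signing model with a deliberately simplified string-to-sign; the real Azure format includes more fields (see the Storage REST documentation), and all names and values here are illustrative, not the actual Azure wire format.

```python
import base64
import hashlib
import hmac
import urllib.parse
from datetime import datetime, timedelta, timezone


def make_service_sas(account, account_key_b64, container, blob,
                     permissions="r", valid_hours=1):
    """Illustrative (simplified) service-SAS signing.

    The caller signs locally with the account key; the service later
    verifies by recomputing the HMAC -- no server-side token record exists.
    """
    # Expiry timestamp in the ISO-8601 UTC form Azure uses ("se" parameter).
    expiry = (datetime.now(timezone.utc) +
              timedelta(hours=valid_hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    canonical_resource = f"/blob/{account}/{container}/{blob}"

    # Simplified string-to-sign; the genuine Azure version has many more
    # newline-separated fields (start time, IP range, protocol, etc.).
    string_to_sign = "\n".join(
        [permissions, "", expiry, canonical_resource, "2021-08-06"])

    signature = base64.b64encode(
        hmac.new(base64.b64decode(account_key_b64),
                 string_to_sign.encode("utf-8"),
                 hashlib.sha256).digest()).decode()

    # The token is only a query string appended to the blob URL.
    return urllib.parse.urlencode({
        "sp": permissions,   # permissions, e.g. "r" = read-only
        "se": expiry,        # expiry time -- nothing caps how far out this is
        "sv": "2021-08-06",  # service version
        "sr": "b",           # resource type: blob
        "sig": signature,
    })
```

Because the signature depends only on the account key and the token's own parameters, limiting the `sp` permissions and keeping `se` short is the practical mitigation Wiz recommends: a token scoped to read-only with a near-term expiry leaks far less than the full-control, long-lived token involved here.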
38TB of private data exposed via Azure storage bucket
The Wiz Research Team found that besides the open-source models, the internal storage account also inadvertently allowed access to 38TB worth of additional private data.
The exposed data included backups of personal information belonging to Microsoft employees, including passwords for Microsoft services, secret keys, and an archive of over 30,000 internal Microsoft Teams messages originating from 359 Microsoft employees.
In an advisory on Monday, the Microsoft Security Response Center (MSRC) team said that no customer data was exposed and that no other internal services were put at risk by the incident.
Wiz reported the incident to MSRC on June 22nd, 2023, and Microsoft mitigated the issue on June 24th, 2023, by revoking the SAS token to block all external access to the Azure storage account.
"AI unlocks huge potential for tech companies. However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards," Wiz CTO & Cofounder Ami Luttwak told BleepingComputer.
"This emerging technology requires large sets of data to train on. With many development teams needing to manipulate massive amounts of data, share it with their peers or collaborate on public open-source projects, cases like Microsoft's are increasingly hard to monitor and avoid."
BleepingComputer also reported one year ago that, in September 2022, threat intelligence firm SOCRadar spotted another misconfigured Azure Blob Storage bucket belonging to Microsoft. That bucket contained sensitive data in files dated from 2017 through August 2022, linked to more than 65,000 entities across 111 countries.
SOCRadar also created a data leak search portal named BlueBleed that enables companies to find out if their sensitive data was exposed online.
Microsoft later added that it believed SOCRadar "greatly exaggerated the scope of this issue" and "the numbers."
Comments
Shplad - 7 months ago
<sigh>. These companies just have no respect for anyone's data at all. Governments, please legislate fines for this sort of flagrant mess up.
h_b_s - 7 months ago
This report along with the private key loss gives lie to Microsoft's assurances that it can handle data security better than others. It's obvious to people paying attention that *not even Microsoft* can handle data security in their own platform any longer. So don't let Microsoft force you off prem. They can't do it any better than you can, and you have absolutely no control over what they're doing in their data centers. You can retain control if you properly train, compensate, acknowledge their contributions, and otherwise keep your employees happy with your management and listen to your IT staff on matters related to questionable security practices.
horsedoggs - 7 months ago
Here we go, I bet you’re an old white guy with grey hair tied up in a pony tail. Yes no one’s perfect. At the end of the day Cloud is the future of computing, Cloud benefits heavily out weigh onprem anyday and MS do an amazing job of delivering the infrastructure and yes people are human…. The amount of onprem failures and poor practices I’ve had to clean up in my time is shameful but old white guys with Grey hair tied up in a ponytail are also only human.