Microsoft's Copilot AI assistant has been found to expose the contents of more than 20,000 private GitHub repositories belonging to major companies such as Google and IBM. Originally public, these repositories were set to private after developers realized they contained sensitive data. AI security firm Lasso discovered that, because of Bing's caching mechanism, repositories that were ever public remain accessible through Copilot. Despite Microsoft's attempts to rectify the issue after Lasso's report in November, concerns about potential data breaches and unauthorized access once again highlight the ongoing security challenges in AI development.
After realizing that any data on GitHub, even if public for just a moment, can be indexed and potentially exposed by tools like Copilot, we were struck by how easily this information could be accessed.
Determined to understand the full extent of the issue, we set out to automate the process of identifying zombie repositories (repositories that were once public and are now private) and validate our findings.
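Lasso has not published its exact tooling, but the check it describes can be sketched in a few lines of Python: a repository counts as a "zombie" if GitHub itself no longer serves it (GitHub returns a 404 for both private and deleted repositories) while a cached snapshot of its page is still retrievable. The cache endpoint, query format, and repository names below are illustrative assumptions, not Lasso's actual implementation; real lookups against Bing's former cc.bingj.com cache also required document IDs taken from search results.

```python
import requests

def private_or_deleted(owner: str, repo: str) -> bool:
    """GitHub answers 404 for repositories that are private or deleted."""
    r = requests.get(f"https://github.com/{owner}/{repo}", timeout=10)
    return r.status_code == 404

def cached_snapshot_exists(owner: str, repo: str) -> bool:
    """Probe a page cache for a snapshot of the repository page.

    NOTE: the endpoint and query format are assumptions for illustration;
    they are not Lasso's tooling or a documented Bing API.
    """
    url = "https://cc.bingj.com/cache.aspx"
    r = requests.get(url, params={"q": f"github.com/{owner}/{repo}"}, timeout=10)
    return r.status_code == 200

def is_zombie(owner: str, repo: str) -> bool:
    """A 'zombie' repository: gone from public GitHub, still in a cache."""
    return private_or_deleted(owner, repo) and cached_snapshot_exists(owner, repo)

if __name__ == "__main__":
    # Placeholder names; in practice this would run over a candidate
    # list of owner/repo pairs harvested from search-engine results.
    print(is_zombie("example-org", "example-repo"))
```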
After finding in January that Copilot was still storing private repositories and serving their contents, Lasso set out to measure how big the problem really was.
After Lasso reported the problem in November, Microsoft introduced changes designed to fix it.