Docker Hub – developer repository or confidential information dump?
Researchers in Germany have found that thousands of public Docker images contain highly sensitive data.
Researchers at RWTH Aachen University in Germany recently published a study revealing that tens of thousands of container images hosted on Docker Hub contain sensitive data such as private keys and API secrets. This poses a serious threat to the security of software, online platforms, and end users.
Docker Hub is a cloud-based repository for the Docker community where developers can store, distribute, and share images. These images act as container templates: they include all the code, runtime, libraries, environment variables, and configuration files needed to deploy an application with Docker.
Scheme for creating a Docker image
German researchers analyzed 337,171 images from Docker Hub and thousands of private repositories and found that approximately 8.5% of them contained sensitive data such as private keys and API secrets.
As it turns out, many of the exposed keys are in active use, undermining the security of everything that depends on them, including hundreds of certificates.
The study collected a huge dataset of 1,647,300 layers from 337,171 Docker images, using the latest images from each repository whenever possible.
Scanning this data with regular expressions written to match specific kinds of secrets revealed 52,107 valid private keys and 3,158 distinct API secrets in 28,621 Docker images.
The researchers validated these figures, which exclude test keys, example API secrets, and invalid matches.
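The detection approach described above, pattern matching followed by filtering out placeholder material, can be illustrated with a minimal Python sketch. The two patterns and the `PLACEHOLDER_HINTS` list are assumptions chosen for illustration; the article does not give the rule set the researchers actually used, and a real scanner would need many more rules and stricter validation.

```python
import re

# Illustrative patterns only; not the rule set used in the study.
PATTERNS = {
    # PEM header of an unencrypted private key (several common variants).
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
    # Long-term AWS access key IDs: "AKIA" followed by 16 more characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

# Hypothetical, deliberately crude markers of example or test material.
PLACEHOLDER_HINTS = ("example", "test", "dummy")


def find_secrets(text):
    """Return (kind, line_number) pairs for lines that look like real secrets."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip lines that are obviously documentation examples or fixtures.
        if any(hint in line.lower() for hint in PLACEHOLDER_HINTS):
            continue
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((kind, number))
    return hits
```

Calling `find_secrets` on the text of a file returns the kind of each match and the line it was found on. The substring filter is intentionally simple and will also discard some real secrets (any line containing the word "test", for instance), so it should be read as a stand-in for the validation step, not a reproduction of it.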
Most of the exposed secrets (95% of the private keys and 90% of the API secrets) were found in images belonging to individual users, which suggests they were most likely leaked through negligence.
The analysis showed that 9% of the images on Docker Hub exposed secrets, compared with 6.3% of the images from private repositories.
This difference may indicate that Docker Hub users generally have a weaker understanding of container security than those who set up private repositories.
The researchers then needed to determine how the exposed secrets were actually being used in order to assess the scale of a potential attack. They found 22,082 compromised certificates relying on the exposed private keys, including 7,546 certificates signed by private CAs and 1,060 certificates signed by public CAs.
Compromised CA-signed certificates are a particular concern because large numbers of users typically rely on them.
At the time of the study, only 141 of the CA-signed certificates were still valid, which reduces the risk somewhat but does not eliminate the threat.
To further gauge real-world use of the exposed secrets, the researchers drew on 15 months of Internet measurement data from the Censys database and found 275,269 hosts that rely on compromised keys. Among them:
- 8,674 MQTT hosts and 19 AMQP hosts, which potentially transmit privacy-sensitive Internet of Things (IoT) data.
- 6,672 FTP instances, 426 PostgreSQL instances, 3 Elasticsearch instances, and 3 MySQL instances, which serve potentially sensitive data.
- 216 SIP hosts used for telephony.
- 8,165 SMTP servers, 1,516 POP3 servers, and 1,798 IMAP servers used for email.
- 240 SSH servers and 24 Kubernetes instances that use leaked keys, which can lead to remote shell access, botnet installations, or access to sensitive data.
This level of exposure highlights a serious problem in container security and widespread carelessness in building images without first removing secrets from them.
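Removing secrets before publishing presupposes a way to find them in the image itself. Image layers are distributed as tar archives, so one rough pre-publication check is to walk a layer and flag files that contain a PEM private key marker. The sketch below assumes the layer is already available as bytes; the function names and the `make_demo_layer` helper are hypothetical and exist only to make the example self-contained. It is not a substitute for a dedicated secret scanner.

```python
import io
import tarfile

# Trailing part of PEM headers such as "-----BEGIN RSA PRIVATE KEY-----".
KEY_MARKER = b"PRIVATE KEY-----"


def files_with_private_keys(layer_bytes):
    """Return sorted names of regular files in a tar archive that contain KEY_MARKER."""
    flagged = []
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # ignore directories, links, and device entries
            handle = archive.extractfile(member)
            if handle is not None and KEY_MARKER in handle.read():
                flagged.append(member.name)
    return sorted(flagged)


def make_demo_layer(files):
    """Build an in-memory tar archive from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

For example, a demo layer built with `make_demo_layer` from an application file and an SSH key file would have only the key file's path returned by `files_with_private_keys`. The check reads each file fully into memory, which is acceptable for a sketch but not for very large layers.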
Developers should be aware of these risks and follow security best practices to keep sensitive data from leaking. Securing containers and protecting cryptographic keys remain essential to operating modern IT systems safely.