MongoDB Database Exposed 188 Million Records: Researchers
Data Apparently Originated in a GitHub Repository
Security researchers have found yet another unsecured database that left personal data exposed to the internet. In this latest case, a MongoDB database containing about 188 million records, mostly culled from websites and search engines, was exposed, researchers say.
The exposed MongoDB database, which was discovered on June 18, included information from searches conducted on Pipl.com and LexisNexis, according to security researcher Bob Diachenko, who discovered the database along with the privacy advocacy firm Comparitech. Their findings were published in a blog Tuesday.
Pipl is an Idaho-based people search engine that provides data-broking services; LexisNexis is a legal search engine based in New York.
The exposed records from Pipl included first and last name, email address, date of birth, phone number, social media profile links, race and religion, the researchers say. Approximately 800,000 exposed records from LexisNexis contained names, addresses, gender and family members.
Latest in Series of Incidents
This is the second MongoDB data leak reported in the last three months. In May, 275 million personal records in India were exposed to the internet (see: Passwordless MongoDB Database Exposes 275 Million Records).
This week, a pair of security researchers described how they found customer data in an unsecured database belonging to Fieldwork Software (see: Fieldwork Software Database Exposed Customer Data: Report).
One of the bigger discoveries, however, belonged to UpGuard. That company's researchers found unsecured Amazon S3 buckets belonging to IT services firm Attunity that left at least 1 TB of data exposed (see: UpGuard: Unsecured Amazon S3 Buckets Exposed 1 TB of Data).
Diachenko says he traced the origin of the personal data to a GitHub repository connected to a search API belonging to a company by the name of Thedatarepo. A close inspection of this repository found that while the data was exposed to the internet, it does not appear that anyone breached LexisNexis or Pipl to obtain it, Diachenko says.
"Judging by the 'dataSource' fields in the database, it looks like the creators of the API either scraped or purchased the data from Pipl and LexisNexus, and it does not seem likely that Pipl and LexisNexus were actually breached," the blog notes.
Thedatarepo shut down access to the database on July 3, and Diachenko was not able to determine if anyone had accessed it before he first discovered it in June. Thedatarepo closed down its website by the time the blog published on Tuesday, the researcher notes.
"Exposures like that appear from time to time, no matter who manages the data," Diachenko tells Information Security Media Group. "It is still cyber hygiene rules that are not followed properly - no password policy set in this case. Still, information like this is considered to be sensitive. We see many cases when similar data then appears to have been used in a form of public digital impersonation."
Lack of Authentication
One of the primary reasons MongoDB databases are left exposed is that MongoDB, by default, has no password mechanism in place, according to security researchers.
"The root cause has been the lack of security in MongoDB, which allows anyone to access it remotely," says Sachin Raste, a researcher at the security firm seeScan. "Organizations using it need to add layers of authentication to make data secure."
But Ben Wolfson, a MongoDB spokesperson, tells ISMG that a data exposure such as the latest incident is highly unlikely under MongoDB's default security configuration, which restricts network access from the internet. Instead, the issue lies with how the "free to download and use" community version was configured and not with the MongoDB database itself, he says.
"To be clear - this instance does not involve a MongoDB customer but a user of the free to download and free to use community version," Wolfson says. "To be exposed in this manner, an administrator would have to change the default security configuration to allow unrestricted internet traffic in and out."
While MongoDB provides education and guidance to ensure security configuration best practices are easily set up and deployed, Wolfson says that these best practices are sometimes ignored, which can result in a misconfigured database being exposed.
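To illustrate the kind of misconfiguration Wolfson describes, the following is a minimal sketch of a check an administrator might run against a parsed MongoDB configuration. The option names `net.bindIp`, `net.bindIpAll` and `security.authorization` come from MongoDB's configuration file format; the function name, the sample dictionaries and the assumption that a missing `bindIp` means localhost-only are illustrative and are not taken from the researchers' report.

```python
from typing import Dict, List


def audit_mongod_config(config: Dict) -> List[str]:
    """Return warnings for a parsed mongod configuration (hypothetical helper)."""
    warnings = []
    net = config.get("net", {})
    security = config.get("security", {})

    # bindIp is a comma-separated string; assume localhost-only when it is absent.
    bind_ips = [ip.strip() for ip in str(net.get("bindIp", "127.0.0.1")).split(",")]
    listens_everywhere = bool(net.get("bindIpAll", False)) or any(
        ip in ("0.0.0.0", "::") for ip in bind_ips
    )
    if listens_everywhere:
        warnings.append("listens on all network interfaces")

    # Access control is only enforced when authorization is set to "enabled".
    if security.get("authorization", "disabled") != "enabled":
        warnings.append("access control (authorization) is not enabled")

    return warnings


# Illustrative configurations, not the one involved in this incident.
exposed = {"net": {"bindIp": "0.0.0.0", "port": 27017}}
hardened = {
    "net": {"bindIp": "127.0.0.1", "port": 27017},
    "security": {"authorization": "enabled"},
}

for name, cfg in (("exposed", exposed), ("hardened", hardened)):
    print(name, audit_mongod_config(cfg))
```

A check like this only inspects the configuration file; it does not confirm whether a firewall or cloud security group blocks outside traffic to the server.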
Steps to Follow
Diachenko advises organizations to follow better security precautions when creating databases and uploading data to various cloud platforms. For example, he recommends that organizations:
- Create new passwords rather than reusing old ones;
- Read documentation for your database software;
- Add firewall rules and access control lists so that external and non-authorized devices can't access your data from the internet;
- Implement data retention limitation policies. Store only what the business needs and purge the rest.
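Two of the recommendations above, access control lists and retention limits, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a production control: the network ranges, the `collected_at` field, the one-year retention window and both function names are hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta

# Hypothetical allowlist: only these internal networks may reach the database.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in ALLOWED_NETWORKS)


def purge_expired(records, now, max_age_days=365):
    """Keep only records collected within the retention window."""
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records if r["collected_at"] >= cutoff]


# Illustrative data only.
now = datetime(2019, 7, 1)
records = [
    {"id": 1, "collected_at": datetime(2019, 6, 1)},
    {"id": 2, "collected_at": datetime(2017, 1, 1)},
]

print(is_allowed("10.1.2.3"))      # internal address
print(is_allowed("203.0.113.7"))   # external address
print([r["id"] for r in purge_expired(records, now)])
```

In practice, the allowlist would be enforced by a firewall or cloud security group rather than in application code, and the retention period would be set by business and legal requirements.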
(Managing Editor Scott Ferguson contributed to this report.)