Trove included more than 1.8 billion posts spanning eight years, many from people in the US.
A Pentagon contractor left a vast archive of social-media posts on a publicly accessible Amazon account in what appears to be a military-sponsored intelligence-gathering operation that targeted people in the US and other parts of the world.
The three cloud-based storage buckets contained at least 1.8 billion scraped online posts spanning eight years, researchers from security firm UpGuard’s Cyber Risk Team said in a blog post published Friday. The cache included many posts that appeared to be benign, and in many cases those posts came from people in the US, a finding that raises privacy and civil-liberties questions. Facebook was one of the sites that originally hosted the scraped content. Other venues included soccer discussion groups and video game forums. Topics in the scraped content were extremely wide-ranging and included Arabic-language posts mocking ISIS and Pashto-language comments made on the official Facebook page of Pakistani politician Imran Khan.
The scrapings were left in three Amazon Web Services S3 cloud storage buckets that were configured to allow access to anyone with a freely available AWS account. It’s only the latest trove of sensitive documents left unsecured on Amazon. In recent months, UpGuard has also found private data belonging to Viacom, security firm TigerSwan, and defense contractor Booz Allen Hamilton similarly exposed. In Friday’s post, UpGuard analyst Dan O’Sullivan wrote:
Massive in scale, it is difficult to state exactly how or why these particular posts were collected over the course of almost a decade. Given the enormous size of these data stores, a cursory search reveals a number of foreign-sourced posts that either appear entirely benign, with no apparent ties to areas of concern for US intelligence agencies, or ones that originate from American citizens, including a vast quantity of Facebook and Twitter posts, some stating political opinions. Among the details collected are the web addresses of targeted posts, as well as other background details on the authors which provide further confirmation of their origins from American citizens.
Who is VendorX?
Settings inside one of the three exposed buckets indicated its contents were scraped and analyzed by a company called VendorX. The settings table included details about the company employees given privileges to run software that processed the data. The buckets were titled centcom-backup, centcom-archive, and pacom-archive. Internet searches revealed multiple people who work for VendorX describing work they did for the US Central Command, based in Tampa, Florida. The project was called Outpost and was described as a “multi-lingual platform designed to positively influence change in high-risk youth in unstable regions of the world.”
Besides raising questions about the collection of data from people located in the US, the UpGuard finding also exposes security practices so lax they’re hard to fathom.
“A single permission settings change would have meant the difference between these data repositories being revealed to the wider Internet, or remaining secured,” O’Sullivan wrote. “If critical information of a highly sensitive nature cannot be secured by the government—or by third-party vendors entrusted with the information—the consequences will affect not only whatever government organizations and contractors that are responsible, but anybody whose information or Internet posts were targeted.”
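To give a sense of what such a permission setting looks like, here is a minimal Python sketch. It is an illustration only, not UpGuard's method or the contractor's actual configuration. It assumes an access-control list shaped like the `Grants` list that boto3's `get_bucket_acl` call returns, and it flags grants to S3's two predefined broad groups: everyone, and anyone signed in to any AWS account. The function name `broad_grants` and the sample data are hypothetical.

```python
# Group URIs that S3 ACLs use for "everyone" and "any AWS account holder".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
BROAD_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}


def broad_grants(grants):
    """Return (group URI, permission) pairs that open a bucket to broad groups."""
    found = []
    for grant in grants:
        grantee = grant.get("Grantee", {})
        # Only group grantees carry a URI; individual accounts use an ID instead.
        if grantee.get("Type") == "Group" and grantee.get("URI") in BROAD_GROUPS:
            found.append((grantee["URI"], grant.get("Permission")))
    return found


# Hypothetical ACL for illustration; not taken from the exposed buckets.
example_grants = [
    {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"}, "Permission": "FULL_CONTROL"},
    {"Grantee": {"Type": "Group", "URI": AUTHENTICATED_USERS}, "Permission": "READ"},
]

print(broad_grants(example_grants))
```

In this sketch, removing the single grant to the authenticated-users group would leave only the owner's entry, and the function would return an empty list.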