NSA and CIA helped commercial firms develop data mining capabilities



Andrew Leonard over at Salon confirms what some of us already suspected: The National Security Agency (NSA) is not merely an end user of data collected by internet firms. NSA and the Central Intelligence Agency (CIA) were directly involved in developing the capabilities of those firms to amass that data in the first place.

Leonard describes how an open source computer program called Hadoop “effectively enabled the surveillance state” by making it possible to make sense of large bodies of data, thus encouraging the collection of even more data. He describes as “incestuous” the “intertwining of the intelligence agencies with the larger open source software community.”

The NSA doesn’t just use Hadoop. NSA programmers have improved and extended Hadoop and donated their changes and additions back to the larger community. The CIA actively invests in start-ups that are commercializing Hadoop and other open source projects. [Salon]

Thus, government surveillance was built into software that influenced or directly enabled internet services provided by companies like Google, Yahoo, Netflix, Facebook and many more.

In 2008, a group of Yahoo employees that eventually included Doug Cutting formed a start-up designed to commercialize Hadoop called Cloudera. The CIA, through its In-Q-Tel (named after James Bond’s Q character) venture capital arm, was an early investor in, and customer of, Cloudera. The NSA built a significant piece of software that works “on top” of Hadoop called Accumulo designed to add sophisticated security controls managing how data could be accessed, and then promptly donated that code to the Apache Software Foundation. Later, a group of NSA software engineers formed another spinoff company, Sqrrl, to commercialize Accumulo. [Salon]

Leonard calls this collaboration “a new military industrial open source Big Data complex,” but concludes:

“[I]n principle, there is nothing necessarily wrong going on here. There is no one to blame.”

Skip Graham, Chief Strategy Officer for Digital Net Agency, came to a different conclusion. He sent out an email to “a large, private industry email list” calling out the industry as complicit in the scandal by “making the public collection and use of personal information seem harmless, permissible, inevitable, and sometimes even desired.” (Violet Blue/ZDNet)

Blue describes how companies like Google incorporated keyword matching into products like Gmail, giving consumers free use in exchange for permission to monitor the content of correspondence and display related ads. (Indeed, since developing the products AdWords and AdSense, Google’s revenue has skyrocketed.)

Initial public reaction was “extremely negative,” however, so businesses sought ways to make consumers feel comfortable and safe in allowing use of their data. Enter the advertising industry, tasked with turning wariness into trust, a trust that many now regret. And, despite Edward Snowden’s revelations of government data mining, companies like Facebook continue developing ways to further “fatten up user data profiles with keyword targeting…[as]…seen in the new #hashtag implementation at Facebook.”

Blue offers this chilling assessment:

The dark shadow of modern tech is its data-grabbing arms race: so-called “people finder” data registries, the advertising and marketing industry, spammers, corporations (such as Google and Facebook) and the U.S. government, all comprising a frenzy to collect as much private and personal information on as many people as possible — up to, and sometimes exceeding, the limit of the law. (Blue/ZDNet)

All of this is scandalous and diabolical, a stunning violation of public trust that helps us better understand why Edward Snowden was upset enough to disclose the details of NSA data collection that are now rocking the world.

Photo by Mike Licht, NotionsCapital.com at Flickr.com