Big Data Reality Causes Privacy Concerns

Twice a year, Thoughtworks publishes the “Technology Radar”—our view on the technology trends that are important in the industry right now, and the trends that will be important in the near future.

It’s a unique perspective from Thoughtworks and our 2,500 consultants around the world, based on first-hand experience delivering real software for our clients. Third parties cannot pay to be featured on the Radar, and we decide entirely independently which technologies to include and what we say about them. The latest edition of the Radar was published last week.

One of the large themes we have been tracking over the past couple of years is Big Data and Analytics. We think the “big” part of Big Data is over-hyped; most of the time you don’t actually need a massive cluster of machines to process your data. But the sheer variety, or “messiness,” of all of this data presents new challenges, and there’s a real opportunity to use Advanced Analytics (statistical modeling, machine learning and so on) to gain new insight into your business and into customer behavior. An important trend we note in the Radar is the accessibility of all of these new Analytics techniques. If you truly do have lots of data, you can simply rent a portion of the cloud to process it, with services from Amazon, Google, Rackspace and others. If you want to analyze your data, you can do it with point-and-click tools or open-source offerings such as the amazing D3.js JavaScript library.[1] Open source is a huge democratizing factor here: you no longer need to pay for an expensive “big iron” solution for data processing and analysis.

We’re excited about the increased awareness around data because software systems can use data and analytics to provide significantly better end-user experiences, as well as delivering increased value to businesses. As has already happened with unit-testing, we expect it to become every developer’s job to understand the importance of data and what can be done with it. That’s not to say every developer needs a statistics degree or a PhD, but we’re expecting data engineering and analysis to become a bread-and-butter part of a developer’s job rather than some weird thing “those data science people” do in a corner.

While there’s much to be gained from better retention, analysis and understanding of data, it comes with a darker side. Companies employing advanced analytics have quickly realized that they need to avoid being too accurate with their insights, or people feel unnerved, even violated. One way to avoid spooking people is to deliberately show a customer some less-relevant offers and advertisements, so they don’t feel targeted. The strategy is to get right up to the “spookiness” line but not to cross it.

As we’ve seen over the past few months, any digital trail can potentially be considered an indelible record. Responsible organizations need to look at these revelations, as well as the weekly news of private-sector security breaches, and consider their response. In Europe, many companies are adopting a strategy of Datensparsamkeit[2], a term that roughly translates as “data austerity” or “data parsimony.” The method originates in Germany where data privacy laws are significantly stricter than in the US. Rather than taking an approach of storing and logging every possible scrap of information about a customer and their interactions, Datensparsamkeit advocates only storing the data you absolutely need in order to provide your service to that customer. This way their privacy is maintained even in the unfortunate event of a data breach.
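To make the idea concrete, here is a minimal sketch of Datensparsamkeit applied at the point where data is stored. It is an illustration under stated assumptions, not a prescribed implementation: the field names, the `REQUIRED_FIELDS` allowlist and the `minimize_record` helper are all hypothetical, and a real service would decide for itself which fields it genuinely needs.

```python
# Hypothetical allowlist: the only fields this imaginary service needs
# in order to serve the customer. Everything else is never stored.
REQUIRED_FIELDS = frozenset({"customer_id", "email", "shipping_country"})


def minimize_record(raw_record, required_fields=REQUIRED_FIELDS):
    """Return a copy of raw_record containing only the allowlisted fields."""
    return {
        key: value
        for key, value in raw_record.items()
        if key in required_fields
    }


# Example input (made-up data): an incoming event carrying more than we need.
raw_event = {
    "customer_id": 1042,
    "email": "pat@example.com",
    "shipping_country": "DE",
    "ip_address": "203.0.113.7",
    "browser_fingerprint": "a9f3",
    "pages_viewed": ["/home", "/cart"],
}

# Only the minimized record would be written to storage or logs.
stored = minimize_record(raw_event)
```

Using an allowlist rather than a blocklist means a newly collected field stays out of storage until someone makes a deliberate decision to keep it.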

Society is increasingly driven by technology, and is changing at an ever-increasing pace. As technologists, it’s our responsibility to consider not just what we can do with our new tools, but whether it’s the right thing to do. Ethics are not the sole purview of philosophers, lawyers and politicians: we must all do our part.

This blog was originally posted on CitizenTekk.

[1] http://d3js.org/

[2] http://martinfowler.com/bliki/Datensparsamkeit.html

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
