O.R./Analytics at Work Blog
To start, I want to say that I am not trying to make my headline sensationalist à la Business Insider, mostly because I think the essence of the title will ring true. I am optimistic that society will continually work to stop the misuse of metadata, so I won’t say much about that side of the argument.
However, I do hold the belief that there needs to be much more effort placed in educating the public on important topics in internet privacy, and convincing them why they should join the discussion at all.
Achievements in the field of Big Data continue to grow at a rapid rate, and that rate is not forecasted to slow down in the next 10 years given the recent influx of money into the tech sector. According to an IDC forecast for 2012-2015, “the market for Big Data technology and services … is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015. Storage is expected to grow most, at 61% CAGR.” (ref. 1)
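For readers who want to check the growth figure themselves, here is a minimal sketch of the compound annual growth rate (CAGR) implied by the two dollar amounts in the IDC quote. The function name `cagr` is my own, and I am assuming the forecast spans the five years from 2010 to 2015. This is only back-of-the-envelope arithmetic, not IDC's methodology.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the IDC quote, in billions of dollars.
# Assumption: the growth period is the five years from 2010 to 2015.
overall = cagr(3.2, 16.9, 2015 - 2010)
print(f"Implied overall CAGR: {overall:.1%}")
```

The result comes out a little under 40% per year, which makes the quoted 61% rate for storage look even more striking by comparison.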
So given that Big Data is growing, it follows that people are more exposed to threats to their internet privacy. Making sure that the typical internet surfer truly understands these threats is important to the future of the Big Data field. In fact, education of the entire community should be at the forefront of the ethics discussion.
Truth be told, people don’t like what they don’t understand. It’s my experience that people don’t really understand much when it comes to how their internet metadata is being collected. After working in an IT Data Center for the last 9 months, I’ve realized that even I, as a college student in a STEM program, had almost zero concept of how the internet worked at all.
Like most others, I thought that the internet magically permeated the air around us, or else I gave it no thought at all. In reality, the internet and the data its users create are incredibly complex.
A major study has shown that college students spend on average 9.5 hours on internet-connected devices. Shouldn’t we do more to make sure that people have more knowledge of internet privacy, given how heavily the internet factors in their daily lives?
Without a doubt, the Edward Snowden situation has shed an important light on the topic. Personal opinions on the case aside, it was clearly an important step in beginning the discussion of what data mining means for the average internet user.
True, the first rule of the internet is that everything is public, sort of the antithesis of the Las Vegas motto: what happens here, could end up going viral.
So why will the internet privacy discussion be so different in the near future than it has been recently?
Well, vast quantities of personal information have always been out there, but we are just getting around to unlocking the full potential of all that information. And the people who will continue to unlock the potential uses of big data are us: the certified analytics professionals, the data scientists, the INFORMS member base, the students of computer science, statistics, and so on. We have an ethical responsibility to be as transparent as possible in our practices.
For the generation that is currently under 10 years old, I believe the ethics of internet privacy are going to be presented very similarly to the way sexual education was presented to teenagers of the 1990s and early 2000s.
This education should start when kids get their first computer, laptop, smartphone, or tablet. In the near future, parents are going to have to sit their kids down and have “the talk.” Not the sexual education talk, the internet privacy talk -- parents expressing to their children the idea that although anything on the internet was never really private, nowadays it’s really, really never private.
It’s not acceptable now to give kids as young as 14 free rein of the internet without properly educating them on the lasting consequences of their internet usage.
Internet education should be more present in schools, too.
There should be general school assemblies (hence the sex ed. comparison) that really educate kids on the possible misuses of the internet metadata we all create. Maybe there just hasn’t been much need for this discussion since we’re only beginning to realize the potential mishandling of metadata, but I believe it will be necessary soon.
In truth, I can’t help but feel that blaming the data scientist without educating the internet user is comparable to parents trying to shelter their children from all the harsh realities of life. Eventually the general computer user, like a sheltered child, has to find out what the world is really like. That time for the internet user should be early on. It’s too late to start caring after our information has been compromised.
As a member of the IT team that works in two data centers and 10 core network buildings on a large college campus, I’ve been more privy than most to the workings of the internet and data mining. However, as scary as this may sound, I still have no idea where all the collected user information ends up. And if I have no clue as to who sees all the metadata and how that data is being used, I can only imagine what the casual internet user knows and believes.
So I’ve begun to understand that many of the misconceptions about data mining are simply due to people being uninformed. And it’s very clear that these misconceptions are hurting the reputation of Big Data. This reputation is hurting the overall tech sector in many ways, even stunting some areas’ growth forecasts. According to reports, “technology giants IBM and Cisco … have seen sales slump by more than $1.7bn year-on-year in the important Asia-Pacific region” since the Snowden revelations. (ref. 3)
Luckily, the Big Data field is more than capable of restoring its reputation and projected growth. I am certain that the level of the general public’s internet awareness will be a key factor in that restoration in the years to come.
However, there will always be those in the field who hold the opinion that it doesn’t really matter how much the average internet user knows about what goes on behind the scenes – the idea that internet ignorance is bliss.
So, to those working in Big Data or related fields:
What is your personal assessment of the average internet user’s knowledge of the internet and Big Data? Do you think it matters how much a person knows who is far removed from any tech field? What do you see being done in the future about educating the public on the byproducts of internet use?
A few references: