Media outlets have published several stories of late that point to both the innovative aspects of harnessing "big data" and the risks to personal privacy associated with it.
There was the story about the Target Corp. analytics program that revealed a teenaged girl's pregnancy to her family, much to their chagrin. Then, there was the article about online travel agency Orbitz, which began up-charging Apple users after data-crunching revealed that they are generally willing to pay more for travel.
Experts agree that companies like Target and Orbitz are being innovative and setting viable business goals as they collect large amounts of data in an effort to market to customers more effectively. But doing so also brings up questions of privacy, data ownership and information ethics.
Defining "big data"
Stamford, Conn.-based IT research firm Gartner Inc. defines "big data" as "high-volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation." Other experts point out that big data might include unstructured textual information from social media sites, machine-generated log data and a host of other information collected by cloud applications, on-premises applications and websites.
"The opportunities that big data offer to impact social, cultural, political change in our lives are promising. There are lots of people who are very excited about that," said Kord Davis, a former analyst with Paris-based consultancy Cap Gemini and author of Ethics of Big Data. "The challenge on the flipside of that is the risk of unintended consequences."
The mainstream stories about big data have triggered more than a few passionate reactions, according to Davis. Some believe that big data-crunching leads to "creepy" privacy issues. Others feel that the benefits outweigh the risks. But regardless of which side of the debate one lands on, Davis said, it will be difficult to formulate a standard set of ethics around big data if individual moral standards are used as a guide. What one person finds invasive might seem fair to another, but that type of argument is what Davis is looking to avoid.
Governments investigate privacy concerns
Privacy issues related to big data analytics and the collection of consumer information have been hot topics in political circles of late. Governments are now working to force a conversation between businesses and the public, at least from a legal standpoint.
U.S. Reps. Edward Markey (D-Mass.) and Joe Barton (R-Texas) are investigating what they call "data brokers" -- companies that collect consumer information and then sell it to other companies. They have also been highly critical of vendors and applications that ignore consumers who select the "do not track" option in their Web browsers. Although the option is available in many browsers, it is not legally binding, so application providers are free to disregard it. The Digital Advertising Alliance instructed its members to do exactly that in a statement issued in October.
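For readers curious about the mechanics: browsers that support "do not track" signal the preference by adding a DNT header with the value 1 to each HTTP request, and honoring it is left entirely to the receiving application. The Python sketch below shows roughly what honoring the signal could look like. It is illustrative only; the function names and the in-memory event list are hypothetical and are not drawn from any vendor's code.

```python
from typing import List, Mapping


def requests_do_not_track(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers carry 'DNT: 1'."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


def record_event(headers: Mapping[str, str], event: str, store: List[str]) -> bool:
    """Hypothetical analytics hook: store the event only if the visitor has not opted out."""
    if requests_do_not_track(headers):
        return False  # respect the visitor's preference and collect nothing
    store.append(event)
    return True


if __name__ == "__main__":
    events: List[str] = []
    record_event({"DNT": "1"}, "page_view", events)  # skipped
    record_event({}, "page_view", events)            # recorded
    print(events)
```

An application that skips a check like this still receives the header; it simply ignores it, which is the behavior the lawmakers have criticized.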
U.S. Sen. Jay Rockefeller (D-W.Va.) has opened his own investigation into data brokers, paralleling the House effort. Meanwhile, with the threat of antitrust legislation looming, the European Union has called on Google to be more transparent about how it collects data.
"There will certainly be more pressure to establish greater regulations around this consumer data, and that can be defined as a true consumer or as businesses consuming that data," said Jeff Kaplan, managing director of Wellesley, Mass.-based consultancy THINKstrategies. Kaplan explained that there are a number of companies that offer discounted or free services because they want to have an ancillary business in selling data collected from the use of those services. Businesses that collect data also use powerful analytics tools to gain insight into customers, increasing sales. Any legal action would have the potential to deeply affect these types of activities.
"If their ability to do that is in any way compromised, they probably will have to rethink their business models," Kaplan said.
Developing big data standards no easy task
Davis is advocating for more discussion on what the rules should be regarding big data -- and he also wants that discussion to grow beyond the realm of privacy. To understand the issues, he believes, it's important for businesses, legislators and consumers to agree on rules regarding privacy, identity, ownership and, from a business perspective, reputation.
"I think it's a fairly evolutionary process. We're going to have to learn how to have those conversations in environments where we haven't typically had to have them," he said.
While Davis believes that governments pushing the privacy issue will force businesses to have these discussions internally, he doesn't see a clear set of rules coming easily.
"It's a frustrating circumstance for everyone, myself included," he said. "Being able to come up with a broad-based, global set of guidelines for big data-handling in an ethical fashion is going to be difficult."
Davis also doesn't see a need for the conversation to be focused exclusively on big data. In his opinion, what's considered big data today will probably be just "data" in the future.
Many applications lack transparency
Applications, whether they're used by individual consumers or businesses, and whether they're on-premises or in the cloud, are often less than transparent about what they do with data they collect, according to experts.
"The lack of transparency as to what's being done post-data generation causes a lot of fear," Davis said.
Davis praises some organizations for giving users detailed information about how data is handled. But he said some major data gatherers are less transparent, for a variety of reasons, including the design of privacy controls and the question of what constitutes consent.
Largely because of their wide user bases and pioneering approaches to big data, Facebook and Google are often cited as examples when pundits examine issues of privacy and data ownership. Looking through that lens, Wayne Eckerson, a business intelligence consultant and research director for TechTarget Inc., the company that owns this website, describes several privacy problems facing businesses and users.
"Facebook has been the poster boy for data privacy, and maybe unfairly, considering who is gathering that amount of data," he said, adding that one of the reasons the company has gained that image is its privacy controls. "It's never been that easy to access those privacy controls. They've been hidden a little bit, and they don't necessarily explain what's going on. And sometimes the user doesn't pay enough attention."
Eckerson believes many of the negative reactions from users about having data collected are "knee-jerk." When the same tools are used to deliver something the users actually want, he argued, they don't complain.
"When it works, it works, and people don't even know it," he said.
Kaplan agrees with Eckerson's double-edged sword argument.
"The perfect example [is Amazon]," he said. "You buy yourself a book and the first thing that pops up is a statement that says, 'Hey, you liked this book; here's another book over here.'"
What will be done?
Davis has no timeline for when he expects major data ethics issues to be answered. He doesn't even purport to have the answers. But he does hope that companies begin to discuss these issues, with a focus on their own internal values and their reputations.
He expects data ownership questions to drive legislative agendas "sooner rather than later" and believes businesses would be smart to get out in front of the issue by aligning data policy with organizational culture and values. He doesn't necessarily think businesses will do that without being required to, but he believes consumers will respond to companies that do.
"My intuition is that the question of data ownership is going to drive a lot of legislative agendas, and the reason is because people are going to want to protect their business models," he said.
Kaplan believes that eventually, businesses will need to make it clear up front how consumer data is being used. While this is already done through Terms of Service agreements, those agreements are often changed after a user first consents to them and are cumbersome for laypeople to understand.
"It will be interesting to see how [businesses] pull this off," he said.