Big Data and Social Enterprise


In an era of big data, and data analytics, where does the social innovation sector fit in?

Although big data and data analytics have become a hot topic among practitioners and organizations alike, their ultimate use and role in the social innovation sector is still uncertain. Social enterprises (SEs) could benefit hugely from big data, which could change the way the sector approaches social impact measurement, scaling up, and financial backing. But the sector still faces a number of core questions and concerns before it can reach a consensus on how big data should be used. Key issues of access, usefulness, representativeness, privacy, and safety continue to dog the big data debate.

But the age of big data is also one of inevitability: the state and the private sector continue to collect our personal data at lightning pace, both with and without our permission. What does this mean for the future of the social innovation sector? How will it confront these newfound challenges and opportunities?

In this three-part series, Aaron Wytze Wilson and Ivan Peng of SURGE Taiwan discuss the relationship between “big data” and “social enterprise”. Their conversation is grouped around three key themes over three pieces: ‘Access and Availability’, ‘Use and Representativeness’, and finally, ‘Safety and Potential Abuse of Big Data’.


Part 1: Access and Availability

Aaron: A lot of non-profits and social enterprises still don’t have access to curated data sets from the corporate and government sectors. Companies in the corporate sector consider this information proprietary, since a company’s valuation rests largely on the kind of data it is able to collect from the public. We’re still in a transition period, and we need to see a greater number of organizations in both sectors releasing curated data sets, or tools that allow the non-profit and social innovation sector to create their own data sets.

There also are not enough people within the non-profit sector who understand data analytics and can make the appropriate breakthroughs in using big data. This leaves much of the potential of big data untapped. Although there is information out there that social businesses could use to target underprivileged groups or solve environmental problems, the sector lacks the technical manpower to analyze and tackle these problems, and I think closing this gap will be difficult.

In the near future, I think we will see more partnerships between social enterprise and the private sector. SEs need to build more channels with data companies, form secure relationships, and show that businesses also benefit when they provide consumer data to the non-profit and social business sectors. The private sector moves at a much faster pace in collecting critical information about the public, and SEs could benefit hugely if data companies opened their doors to them.

However, I think organizations that have double and triple bottom lines also need to tread carefully when using data collected from the government or data companies. The recent backlash against the US government and big companies like Google and Facebook for collecting massive amounts of private information in America and abroad has cast an ugly light on data collection practices. Without legislation in place to safeguard our basic rights to privacy, it could be risky for non-profit groups to cooperate with companies and government entities, potentially exposing the groups they serve to harassment or other dangers.

Ivan: I think for now, the people who are utilizing big data are at organizations like the World Bank that collect data on poverty, income inequality, infant mortality rates, and the effects of war on economies. It’s difficult for social businesses to utilize the data the World Bank publishes. I think these inference techniques will remain an academic exercise for think tanks and PhDs, and will be difficult in the near term to transfer to applicable projects for social businesses.

The government is looking at open data transparency. It is opening up its data on health care, energy, and military spending in the hope that bright minds will just take it and run with it. There’s a case in Haiti that shows the potential for big data. Haiti doesn’t have a sewage system yet, but its mobile penetration rate is close to 90%. Once smartphone penetration is higher, we could use this data to track cholera outbreaks from emergency telephone calls. You could pair GPS data with the emergency call data to track the movement of cholera in the country.

Also, for big data to be useful it has to be structured data. The data that the World Bank releases is on a macro scale and hasn’t reached a micro scale. The question of when we get to the micro scale goes back to the issue of manpower. When we get to a stage where we have set methods to collect and organize this data on a micro scale, we can revisit this conversation. I don’t think the non-profit or social innovation sectors should try to jump on a bandwagon just because big companies are.


Part 2: Use and Representativeness

Aaron: A big issue with the data being collected from companies and government entities is that it does not create a complete picture of the problem. It’s collecting information from one segment of the population. For example, if we’re collecting information from Twitter, we’re getting tweets from a specific age group, ethnicity, and economic level. And if we look at the world picture, Twitter is even less representative, because it excludes countries like China (China uses Weibo instead of Twitter). So the slice of the population we can actually see can be quite narrow, even though the number of people sampled can be quite large. A further problem is that the data released is very unstructured, and might not be in the form that a company or a non-profit is looking for.

Ivan: It comes down to a lot of sample bias. With a lot of polls, if you have some sort of bias in how you collect data, it can skew the information greatly. With Twitter, for example, you’re already working with a sample bias. You can’t infer what most 42-year-olds are going to buy or how they will react from Twitter, because that isn’t Twitter’s demographic. I think this will be a big challenge for social businesses that want to take a data tabulation route. It’s difficult to see what inferences we can make from such a small sample size. Data collection that uses randomized samples still gives the most accurate picture, in my opinion.
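
Ivan’s point about sample bias can be made concrete with a small simulation. The sketch below is illustrative only: the population, the age cut-off for the “platform”, and the link between age and opinion are all invented assumptions, not figures from Twitter or any real survey. It compares a large sample drawn from one age group with a much smaller randomized sample drawn from everyone.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 people aged 18 to 80. The chance that a
# person holds some opinion falls with age (an invented relationship).
population = []
for _ in range(100_000):
    age = random.randint(18, 80)
    p_yes = 0.8 - 0.01 * (age - 18)  # 0.80 at age 18, 0.18 at age 80
    population.append((age, random.random() < p_yes))


def share_yes(people):
    """Fraction of the given people who hold the opinion."""
    return sum(1 for _, yes in people if yes) / len(people)


true_share = share_yes(population)

# "Platform" sample: large, but drawn only from people under 30.
platform_users = [person for person in population if person[0] < 30]
platform_sample = random.sample(platform_users, 10_000)

# Randomized sample: twenty times smaller, but drawn from everyone.
random_sample = random.sample(population, 500)

platform_error = abs(share_yes(platform_sample) - true_share)
random_error = abs(share_yes(random_sample) - true_share)

print(f"true share:            {true_share:.3f}")
print(f"platform sample error: {platform_error:.3f}")
print(f"random sample error:   {random_error:.3f}")
```

Under these assumptions the large platform sample misses the true share by a wide margin, while the small randomized sample lands close to it. Collecting more records from the same skewed group would not fix this, because the error comes from who is sampled rather than how many.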

Aaron: I’m also not sure big data can be useful across the entire non-profit and social innovation sector. If big data is used to target specific groups through data collection, what does a social enterprise do if its target market is “off the map”? For SEs based in America or Canada, where the penetration of social media and smartphone usage is very high, it would be quite useful. But in a place like rural Taiwan, where smartphone penetration might be lower, how useful would big data be?

Ivan: I think big data could be useful in rural Taiwan to understand things like the market for local goods. What is the supply and demand in the market at a given point in time? If data points are taken over a span of time, farmers could use the data to decide where to sell their goods for maximum benefit. Farmers could factor in other things as well, like weather and transportation costs. But it’s very difficult to find a structured way to get this directly to the farmer to make it useful.
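
As a rough illustration of the decision Ivan describes, the sketch below picks the market with the highest revenue after transport costs. Every name and number in it is hypothetical: the markets, prices, and costs are invented, and a real tool would also need the weather and timing data mentioned above.

```python
# Invented figures: the price per kilogram offered at each market, and the
# cost of transporting one load there.
markets = {
    "Market A": {"price_per_kg": 32.0, "transport_cost": 1500.0},
    "Market B": {"price_per_kg": 35.0, "transport_cost": 4000.0},
    "Market C": {"price_per_kg": 30.0, "transport_cost": 800.0},
}


def net_revenue(market, kilograms):
    """Revenue from selling the load at this market, minus transport."""
    return market["price_per_kg"] * kilograms - market["transport_cost"]


def best_market(markets, kilograms):
    """Name of the market with the highest net revenue for this load."""
    return max(markets, key=lambda name: net_revenue(markets[name], kilograms))


for load in (500, 2000):
    print(load, "kg ->", best_market(markets, load))
```

With these made-up numbers, a small load is best sold at the closer market, while a large load justifies the trip to the market with the higher price. The hard part, as Ivan notes, is getting reliable data of this kind to the farmer in the first place.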


Part 3: Safety and Potential Abuse of Big Data

Aaron: I think there’s an ugly truth about big data that many of us haven’t faced yet. The data collection will never stop. Additionally, there is little legislation stopping big companies like Google, Facebook, and Apple from collecting our personal information, other than the tacit agreement we give these companies by using their products. These companies, which hold the biggest repositories of our personal data, are pushing hard to keep their activity unlegislated.

Much of what I’ve read about the concerns over big data centres on its power to discriminate by identifying certain groups based on age, ethnicity, religion, and gender. For social businesses and non-profits that are looking to affect the lives of vulnerable groups, having these data sets freely available is a frightening prospect that could cause more harm than good. A vulnerable group could be abused even more because of big data releases. We have to be careful about who is getting this data, and what this data is being used for.

At the same time, I’m not sure it’s necessarily a bad thing for social enterprises to be able to target their impact to one specific group through the use of data analytics. Say, for example, there’s an SE that wants to help single mothers open their own businesses or find more employment opportunities. If it had data sets that could identify all the single mothers in Taiwan, Canada, or America, and could notify them about non-profits or SEs that want to reach out to them, that would be a tremendous opportunity.

Ivan: There’s a degree of trust between companies and end users that needs to be reached, and there is some cruelty to big data in that you exist only as a data point. You are, after all, only one point in a pool of millions. It’s not good, it’s not bad, it’s just how it is. However, in terms of social innovation, this isn’t necessarily bad either. A lot of social businesses want to target a specific group, or even a specific person. These businesses don’t want to see you as a data point, but as a real person. So there is a conflict of mindsets that people haven’t wrapped their heads around yet. This also comes down to having bright people working in your security department to ensure that your databases are not going to be accessed by unwanted individuals.

At the same time, there’s a lot of good that can be done with this data. Take Orange, the mobile network company in the Ivory Coast, which released a dataset of all the text messages sent over the span of a month. Academics took that data and simulated a contagion model. If there were ever an epidemic or disaster, they could see how the communication network sprawls out, and which centres are the most vulnerable. Trained data models can look for that.

Aaron: I think those models are interesting, but they also touch on something quite dangerous, which is track-back. It has been shown that even when these datasets are made fairly anonymous by removing all personal information, there is still a very good chance, around 90%, of tracing the information back to a specific individual.
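
A toy example shows why removing names is not enough to prevent track-back. The sketch below is a simplification built on invented records, and it is not the method behind the 90% figure Aaron cites. It simply counts how many records in a small “anonymized” table are the only ones with their particular combination of ordinary attributes, since anyone who knows those attributes about a person can pick out a unique record.

```python
from collections import Counter

# Invented "anonymized" records: names removed, but three ordinary
# attributes kept (district, birth year, gender).
records = [
    ("Daan", 1985, "F"),
    ("Daan", 1985, "F"),
    ("Daan", 1990, "M"),
    ("Xinyi", 1985, "F"),
    ("Xinyi", 1972, "M"),
    ("Xinyi", 1990, "F"),
    ("Wanhua", 1985, "M"),
    ("Wanhua", 1964, "F"),
]


def unique_fraction(rows):
    """Fraction of rows whose combination of values appears exactly once."""
    counts = Counter(rows)
    return sum(1 for row in rows if counts[row] == 1) / len(rows)


# Using the district alone, nobody stands out.
district_only = [(district,) for district, _, _ in records]
print("district only:   ", unique_fraction(district_only))

# Using all three attributes together, most records are one of a kind.
print("all three fields:", unique_fraction(records))
```

In this made-up table no record is unique by district alone, but six of the eight become unique once district, birth year, and gender are combined. Each extra attribute left in a released dataset narrows the crowd a person can hide in.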

Ivan: This comes back to data curating, and the responsibility of the people who own the data sets. It also comes down to, unfortunately, blind trust.

Aaron: We’re still in a pre-pubescent stage where we don’t actually know where big data and social enterprises are going to go. The value of big data has not been well applied to the social innovation sector yet, but the era of big data pervading every part of our lives will arrive sooner rather than later. People in the social innovation sector are taking a very cautious approach to big data, but I think they should start preparing for its power now.

Ivan: I’m an optimist. When governments and corporations get their accountability problems sorted out, and once smartphones have a higher penetration rate, more data collection can really begin. After that, we can really start to talk about new trends in health, energy, and the environment.


Editor’s Note: This article series was featured on SocialFinance.ca in partnership with SURGE Taiwan. Image by Don Quichot.

Big Data and Social Enterprise and Innovation is one key area that i-genius will be exploring over the next two years through its European Union project, Web-COSI.
