March 2016

Zettabytes and Other Interesting "Big Data" Facts


Pat Stricker, RN, MEd
Senior Vice President
TCS Healthcare Technologies

While researching last month’s article on Healthcare Trends and Predictions for 2016, I was amazed at the statistics and the overall breadth of "Big Data." For example, the data generated in 2015 was equal to 120,000 times the total of all previously written words in history, and more data has been created in the past two years than in the entire previous history of the human race. I found these statistics totally amazing, so I decided to do more research on this phenomenon.

Big Data is definitely a buzzword that can be looked at positively, as a powerful research tool, or negatively, as an overwhelming technological privacy and security threat. Big Data has a great impact on business decisions, clinical treatment and outcomes, quality improvement, financial savings, and a myriad of other important facets in healthcare, but my main goal for this article is to merely explain what Big Data is and how prevalent it is in today’s society. Its impact on healthcare and how it could or should be managed may be addressed in a future article. 

If you do a Google search on "big data," you will get a list with 399,000,000 results! If you try to filter that by searching for just "big data healthcare," you’ll still get 51,500,000 results! So I guess there is no doubt that Big Data is a big issue, but what exactly is it?

Wikipedia describes Big Data as a collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. A 2011 McKinsey study defined it as data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. And Gartner defined it as high-volume, high-velocity and/or high-variety information that requires cost-effective, innovative forms of information processing to analyze the data. The "3 Vs" in their definition are described as:

  • Volume: Enormous amounts of generated and stored data from machines, networks and human interaction on systems, such as social media.
  • Velocity: The pace at which real-time, massive and continuous data is created.
  • Variety: Numerous sources and types of structured and unstructured data. Inconsistency in this varied data can create problems with storing, mining, and analyzing it.

Other leaders in the field added three more "Vs": Veracity, Validity, and Volatility.

  • Veracity: "Clean", meaningful, accurate, and high quality data that is free of biases and "dirty data".
  • Validity: Correct and accurate data for the intended use. This is critical!
  • Volatility: The length of time data is valid and relevant to the current analysis, and how it should be stored.

And the University of Wisconsin added yet three more "Vs":

  • Variability: Data constantly changes and words frequently have several meanings, so it must be analyzed using sophisticated programs that understand the context and meaning of data.
  • Visualization: The creation of complex graphs that "tell the story". They transform data into information, information into insight, insight into knowledge, and knowledge into advantage.
  • Value: The secrets within Big Data can be a goldmine of opportunity for decision-making, better outcomes, and resultant savings.

While I was researching this article, PBS happened to air a program entitled "The Human Face of Big Data." How lucky was that? The film describes the promise and peril of Big Data and focuses on the human side of the data story. It likens Big Data to a global nervous system in which each of us has become a nerve that transmits and receives data. There are several examples of how Big Data has been used in medical research, e.g., identifying flu outbreaks and epidemics in real time, identifying signs of infections in ICU preemies before they exhibited outward symptoms, predicting depression 2 days before symptoms are identified, and determining options for personalized medicine. The film states that Big Data is likely to have a thousand times more impact on our lives than the Internet! If you are interested in this topic and have an opportunity to view the film, I would highly recommend it.

These are some of the facts I found most interesting:

  • We are exposed to more information each day than our 15th century ancestors were exposed to in a lifetime.
  • Every two days we generate as much data as was generated from the dawn of humanity to 2003.
  • Almost everything we do leaves a digital trail that is recorded and saved forever.
  • We have done more data processing in the past 2 years than we have in the past 2000 years.
  • Digital data volume is doubling in size every 2 years, so by the year 2020, it will increase from today’s 4.4 zettabytes to 44 zettabytes. You may be wondering what a zettabyte is. (I know I sure did). Or you may be thinking that an increase from 4.4 to 44 doesn’t seem to be that big, but it is HUGE!

               - A zettabyte is 1,000,000,000,000,000,000,000 bytes (yes, that’s 21 zeroes) or 1 trillion gigabytes. To put it in perspective, one zettabyte is equivalent to the data on about 250 billion DVDs.

               - This means by 2020 the digital volume of 44 zettabytes will be equal to about 44 trillion gigabytes. That is almost as many as there are stars in the universe and 75 times more than the total grains of sand on our planet. Do you agree 44 zettabytes is HUGE?

  • Most data is not used and is meaningless until someone asks a question and begins to analyze it.
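The zettabyte figures above are easy to check with a few lines of arithmetic. The following is a minimal Python sketch, not part of the sources cited in this article. It assumes decimal (SI) units and a DVD capacity of about 4.7 GB, and the function names are made up for illustration.

```python
import math

# Decimal (SI) unit sizes in bytes.
GIGABYTE = 10**9
ZETTABYTE = 10**21

# Assumption (not from the article): a single-layer DVD holds about 4.7 GB.
DVD_BYTES = 4_700_000_000


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * ZETTABYTE // GIGABYTE


def dvds_per_zettabyte(dvd_bytes=DVD_BYTES):
    """Number of DVDs needed to hold one zettabyte."""
    return ZETTABYTE / dvd_bytes


def years_to_grow(factor, doubling_period_years=2):
    """Years needed to grow by `factor` when volume doubles every period."""
    return math.log2(factor) * doubling_period_years


if __name__ == "__main__":
    print(len(str(ZETTABYTE)) - 1)            # number of zeroes in one zettabyte
    print(zettabytes_to_gigabytes(44))        # 44 zettabytes expressed in gigabytes
    print(round(dvds_per_zettabyte() / 1e9))  # DVDs per zettabyte, in billions
    print(round(years_to_grow(10), 1))        # years for a tenfold increase
```

Under these assumptions, one zettabyte works out to roughly 213 billion DVDs, the same order of magnitude as the 250 billion quoted above (which corresponds to about 4 GB per disc). At one doubling every two years, a tenfold increase such as 4.4 to 44 zettabytes takes about 6.6 years.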

The following articles contain other interesting or unusual facts about Big Data. If you want to see all the facts or get more detail, you can review the entire article by clicking on the link.

10 Powerful Facts About Big Data

This slideshow examines the challenges and capabilities of Big Data.

  • Data-sharing devices, collectively known as the Internet of Things, are predicted to soar from 9.1 billion (in 2013) to 28.1 billion by 2020.
  • Data volume is going to grow 50 times year-over-year between now and 2020.
  • 85% of the data is coming from internet data sources, such as mobile, social media, etc.
  • Most companies estimate that they are only analyzing 12% of their data. The reasons given for this are: lack of big data analytic tools, data silos, and the inability to know which information is really valuable.

Big Data: 20 Mind-Boggling Facts Everyone Must Read

This Forbes article by Bernard Marr is a great read if you are interested in some mind-boggling facts about Big Data. Here are a few that really stood out for me:

  • More data has been created in the past two years than in the entire previous history of the human race.
  • By 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
  • 40,000 Google search queries are created every second. That amounts to 1.2 trillion searches per year.
  • Facebook: In August 2015, over 1 billion people used Facebook in a single day. Users send an average of 31.25 million messages and view 2.77 million videos every minute.
  • YouTube: Up to 300 hours of videos are uploaded every minute.
  • In 2015, over 1.4 billion smartphones, capable of collecting data, were shipped. By 2020, there will be over 6.1 billion smartphones.
  • In addition, within five years there will be over 50 billion smart-connected devices in the world.
  • Estimates suggest that healthcare could save as much as $300 billion a year by better integrating big data. That’s a savings of $1000 a year for every man, woman, and child.
  • Amazingly, even given the astounding volume of data generated, less than 0.5% of the data is ever analyzed and used! Think of the potential that is there and just waiting to be used.
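Several of the figures in this list are simple rate conversions, so they can be sanity-checked. The sketch below is illustrative only. The helper names are hypothetical, and the US population of 320 million is a round number assumed here, not a figure from the Forbes article.

```python
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_YEAR = 365

# Assumption (not from the article): a round US population figure.
US_POPULATION = 320_000_000


def per_year(rate_per_second):
    """Scale a per-second rate up to a full 365-day year."""
    return rate_per_second * SECONDS_PER_DAY * DAYS_PER_YEAR


def per_person(total, population=US_POPULATION):
    """Split a total evenly across a population."""
    return total / population


# 40,000 Google searches per second, expressed per year.
searches_per_year = per_year(40_000)

# $300 billion in annual healthcare savings, expressed per person.
savings_per_person = per_person(300_000_000_000)

# 1.7 megabytes per second per person, expressed per day.
megabytes_per_person_per_day = 1.7 * SECONDS_PER_DAY

if __name__ == "__main__":
    print(searches_per_year)
    print(savings_per_person)
    print(megabytes_per_person_per_day)
```

With these assumptions, the search rate comes to about 1.26 trillion per year and the savings to a little under $1,000 per person, both in line with the rounded figures quoted above. The 1.7 megabytes per second adds up to roughly 147 gigabytes per person per day.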

The Big-data Revolution in US Health Care: Accelerating Value and Innovation

This McKinsey article explains the potential impact of big data on health care in the U.S.

  • Healthcare expenses represent 17.6 percent of our gross domestic product (GDP), which is about $600 billion more than the expected benchmark for a nation with the size and wealth of the United States.
  • Based on innovative, initial studies using mobile devices and applications that capture daily activity or patient-reported outcomes, it is estimated that if these programs were used systemwide, they would provide significant savings:

                   - $300 billion to $450 billion in reduced health-care spending

                   - 12 to 17 percent of the $2.6 trillion baseline in US health-care costs
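The two savings figures above describe the same estimate in different units, which a quick calculation confirms. This is a minimal sketch using only the numbers quoted from the McKinsey article. The function name is hypothetical.

```python
# Baseline US health-care costs quoted above, in dollars.
BASELINE = 2_600_000_000_000


def share_of_baseline(savings, baseline=BASELINE):
    """Express a dollar savings figure as a percentage of the baseline."""
    return 100 * savings / baseline


low_share = share_of_baseline(300_000_000_000)   # lower savings estimate
high_share = share_of_baseline(450_000_000_000)  # upper savings estimate

if __name__ == "__main__":
    print(round(low_share), round(high_share))
```

Rounded to whole numbers, the two shares come out to 12 and 17 percent, matching the range quoted above.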

Based on these statistics, it certainly seems that Big Data could have a significant impact if we could just learn how to manage the key challenges in Big Data:

  • Information Strategy: Harnessing the power of information; finding new ways to leverage information.
  • Data Analytics: Gaining more insight from Big Data to predict future behaviors, trends, and outcomes.
  • Enterprise Information Management: Managing access to massive information management requirements and driving innovation to process the information.

Industry leaders predicted that the buzz of Big Data would fade because it’s "just data". However, that has not happened. Big Data holds a wealth of information that, if harnessed and managed correctly, could drastically change and improve the quality and efficiency of healthcare, improve treatment options and outcomes, and provide significant savings. What a challenge! But how exciting!

Pat Stricker, RN, MEd, is senior vice president of Clinical Services at TCS Healthcare Technologies. She can be reached at pstricker@tcshealthcare.com.
 
 

