While researching last month’s article on Healthcare
Trends and Predictions for 2016, I was amazed at the
statistics and the overall breadth of "Big Data". For example, the data generated in 2015 was
equal to 120,000 times the total of all previously written words in
history, and more
data has been created in the past two years than
in the entire previous history of the human race. I found these statistics so amazing that I decided to do more research on this
phenomenon.
Big
Data is definitely a buzzword that can be looked at positively, as a powerful
research tool, or negatively, as an overwhelming technological privacy
and security threat. Big Data has a great impact
on business decisions, clinical treatment and outcomes, quality improvement,
financial savings, and a myriad of other important facets in healthcare, but my
main goal for this article is merely to explain what Big Data is and how
prevalent it is in today’s society. Its impact on healthcare and how it could
or should be managed may be addressed in a future article.
If you do a Google search on "big data," you will get a list with 399,000,000
results! If you try to narrow
that by searching for just "big data healthcare", you’ll still get 51,500,000
results! So there is no doubt that Big
Data is a big issue, but what exactly is it?
Wikipedia
describes Big Data as a collection of data sets so large and complex that it
becomes difficult to process using traditional data processing applications. A 2011
McKinsey study defined it as data sets whose size is beyond the
ability of typical database software tools to capture, store, manage, and analyze.
And Gartner defined it as high-volume,
high-velocity and/or high-variety information that requires cost-effective,
innovative forms of information processing to analyze the data. The "3 Vs" in
their definition are described as:
- Volume: Enormous amounts of generated and stored data from machines, networks
and human interaction on systems, such as social media.
- Velocity: The pace at which real-time, massive and continuous data is created.
- Variety: Numerous sources and types of structured and unstructured data. Inconsistency in this
varied data can create problems with storing, mining, and analyzing it.
Other leaders in
the field added three more "Vs": Veracity, Validity, and Volatility.
- Veracity: "Clean", meaningful,
accurate, and high quality data that is free of biases and "dirty data".
- Validity: Correct and accurate data for the intended use. This is critical!
- Volatility: The length of time
data is valid and relevant to the current analysis, and how it should be stored.
And the University of Wisconsin added three more "Vs" still:
- Variability: Data constantly
changes and words frequently have several meanings, so it must be analyzed
using sophisticated programs that understand the context and meaning of data.
- Visualization: The creation of complex
graphs that "tell the story". They transform data into information, information
into insight, insight into knowledge, and knowledge into advantage.
- Value: The secrets within Big Data can be a goldmine
of opportunity for decision-making, better outcomes, and resultant savings.
While I was researching this article, PBS
happened to air a program entitled "The Human Face of Big Data." How lucky was that? The film describes
the promise and peril of Big Data and focuses on the human side of the data
story. It likens Big Data to a global nervous system in which each of us has
become a nerve that transmits and receives data. It gives several examples of
how Big Data has been used in medical research, e.g., identifying flu outbreaks
and epidemics in real time, identifying signs of infection in ICU preemies
before they exhibit outward symptoms, predicting depression two days before
symptoms are identified, and determining options for personalized medicine. The
film states that Big Data is likely to have a thousand times more impact on our
lives than the Internet! If you are
interested in this topic and have an opportunity to view it, I highly recommend it.
These are some of the facts I found
most interesting:
- We are exposed to more information each
day than our 15th century ancestors were exposed to in a lifetime.
- Every two days we generate as much data as was generated from the dawn of
humanity to 2003.
- Almost everything we do leaves a digital trail that is recorded and saved forever.
- We have done more data processing in the past 2 years than in the previous
2,000 years.
- Digital data volume is doubling in size every 2
years, so by the year 2020 it will increase from today’s 4.4 zettabytes to 44 zettabytes.
You may be wondering what a zettabyte is. (I know I sure did.) Or you may be
thinking that an increase from 4.4 to 44 doesn’t seem to be that big, but it is
HUGE!
- A zettabyte is 1,000,000,000,000,000,000,000
bytes (yes, that’s 21 zeroes), or 1 trillion gigabytes. To put it in
perspective, one zettabyte is equivalent to the data on about 250
billion DVDs.
- This means that by 2020 the digital volume of 44 zettabytes will be equal to
about 44 trillion gigabytes. That is almost as
many as there are stars in the universe and 75 times more than the total grains
of sand on our planet. Do you agree 44 zettabytes is HUGE?
- Most data is not used and is meaningless until someone asks a
question and begins to analyze it.
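For anyone who wants to check the zettabyte arithmetic, here is a minimal Python sketch. The function names are my own, and the DVD capacities are assumptions: the 250 billion figure quoted above corresponds to roughly 4 GB per disc, while a nominal 4.7 GB single-layer disc gives a somewhat smaller count. The last function estimates how long a tenfold increase takes if volume really does double every 2 years.

```python
import math

BYTES_PER_ZETTABYTE = 10**21  # a 1 followed by 21 zeroes
BYTES_PER_GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZETTABYTE / BYTES_PER_GIGABYTE


def dvds_per_zettabyte(dvd_gigabytes):
    """Number of DVDs needed to hold one zettabyte (capacity is an assumption)."""
    return BYTES_PER_ZETTABYTE / (dvd_gigabytes * BYTES_PER_GIGABYTE)


def years_to_multiply(factor, doubling_years=2.0):
    """Years needed to grow by `factor` when volume doubles every `doubling_years`."""
    return doubling_years * math.log2(factor)


print(f"1 ZB  = {zettabytes_to_gigabytes(1):,.0f} GB")
print(f"44 ZB = {zettabytes_to_gigabytes(44):,.0f} GB")
print(f"DVDs per ZB at 4.0 GB each: {dvds_per_zettabyte(4.0):,.0f}")
print(f"DVDs per ZB at 4.7 GB each: {dvds_per_zettabyte(4.7):,.0f}")
print(f"Years for a 10x increase:   {years_to_multiply(10):.2f}")
```

Under these assumptions, a tenfold jump from 4.4 to 44 zettabytes takes a little under 7 years of doubling every 2 years, which suggests the 4.4 zettabyte starting point dates from a few years before this article.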
The following articles contain other
interesting or unusual facts about Big Data. If you want to see all the facts
or get more detail, you can review the entire article by clicking on the link.
This slideshow
examines the challenges and capabilities of Big Data.
- Data-sharing devices, collectively known as the Internet of Things,
are predicted to soar from 9.1 billion (in 2013) to 28.1 billion by 2020.
- Data volume is going to grow 50 times
year-over-year between now and 2020.
- 85% of the data is coming from internet data
sources, such as mobile, social media, etc.
- Most companies estimate that they are only
analyzing 12% of their data. The reasons
given for this are: lack of big data analytic tools, data silos, and the
inability to know which information is really
valuable.
This
Forbes article by Bernard Marr is a great read if you are interested in some
mind-boggling facts about Big Data. Here
are a few that really stood out for me:
- More data has been
created in the past two years than in the entire previous history of the human
race.
- By 2020, about 1.7 megabytes of new information will be created
every second for every human being on the planet.
- 40,000 Google search
queries are created every second. That amounts to 1.2 trillion searches per
year.
- Facebook: In August 2015, over 1 billion people used
Facebook in a single day. Users send an average of 31.25 million messages and view
2.77 million videos every
minute.
- YouTube: Up to 300 hours of videos are uploaded every
minute.
- In 2015, over 1.4 billion smartphones, capable of collecting
data, were shipped. By 2020, there will be over 6.1 billion smartphones.
- In addition, within
five years there will be over 50 billion smart-connected devices in the world.
- Estimates suggest that
healthcare could save as much as $300 billion a year by
better integrating big data. That’s a savings of about $1,000 a year for every man,
woman, and child.
- Amazingly, even given
the astounding volume of data generated, less than 0.5% of the data is ever analyzed and used!
Think of the potential that is there and just waiting to be used.
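A couple of these numbers are easy to sanity-check. The short Python sketch below is only a rough check under my own assumptions: a 365-day year, and a U.S. population of about 320 million (an approximate 2015 figure that does not come from the Forbes article).

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # assumes a 365-day year

# Google searches: 40,000 queries every second, from the list above.
searches_per_second = 40_000
searches_per_year = searches_per_second * SECONDS_PER_YEAR

# Healthcare savings: $300 billion a year spread across the population.
annual_savings_dollars = 300_000_000_000
us_population = 320_000_000  # assumption: approximate 2015 U.S. population
savings_per_person = annual_savings_dollars / us_population

print(f"Searches per year:  {searches_per_year:,}")
print(f"Savings per person: ${savings_per_person:,.2f}")
```

Both results land close to the quoted figures: a bit over 1.2 trillion searches a year, and a little under $1,000 per person.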
This McKinsey article explains the
potential impact of big data on health care in the U.S.
- Healthcare expenses represent 17.6 percent of our gross domestic product
(GDP), which is about $600 billion more than the expected benchmark for a
nation with the size and wealth of the United States.
- Based on innovative, initial studies using mobile devices and
applications that capture daily activity or patient-reported outcomes, it is
estimated that if these programs were used systemwide, they would provide
significant savings:
  - $300 billion to $450 billion in reduced health-care spending
  - 12 to 17 percent of the $2.6 trillion baseline in US health-care costs
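The dollar range and the percentage range are two views of the same estimate, which a quick Python sketch can confirm. It works in billions of dollars, and the variable names are my own.

```python
BASELINE_BILLIONS = 2_600  # the $2.6 trillion baseline, in billions of dollars


def share_of_baseline(percent):
    """Return `percent` of the baseline, in billions of dollars."""
    return BASELINE_BILLIONS * percent / 100


low_estimate = share_of_baseline(12)
high_estimate = share_of_baseline(17)

print(f"12% of baseline: ${low_estimate:,.0f} billion")
print(f"17% of baseline: ${high_estimate:,.0f} billion")
```

The results, $312 billion and $442 billion, sit inside the $300 billion to $450 billion range, so the two figures agree once rounding is allowed for.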
Based on these statistics, it certainly seems that Big Data has the
potential to make a significant impact if we can just learn how to manage
its key
challenges:
- Information Strategy: Harnessing the power of information; finding new
ways to leverage information.
- Data Analytics: Gaining more insight from Big Data to predict
future behaviors, trends, and outcomes.
- Enterprise Information Management: Managing access to massive
volumes of information and driving innovation in how that information is
processed.
Industry leaders predicted that the buzz of Big Data would fade
because it’s "just data". However, that has not happened. Big Data holds a
wealth of information that, if harnessed and managed correctly, could
drastically change and improve the quality and efficiency of healthcare, improve
treatment options and outcomes, and provide significant savings. What a
challenge! But how exciting!