24 Aug

What is Big Data?

If you are reading this blog, you are probably aware of what big data is. If not, you should be. In either case, most people are at a loss as to what big data really is, other than the fact that it seems to be all the rage these days. Big data is not just a large set of data; the industry defines it as data that has at least three attributes: Volume, Velocity, and Variety. Let’s look at each one briefly:

1. Volume: This is the sheer amount of data. We are talking about terabytes of data. There is no magic number that serves as a threshold. As a rule of thumb, though, most of this data is unstructured or has a complex structure.

2. Velocity: Definitions of this term vary, but I think it should indicate how quickly new data is being generated or how quickly a company needs to access its data. Enterprises are moving towards ‘real-time’, which means that terabytes of data should be available for analysis instantly.

3. Variety: Data can come from several sources, and the data from each source could have a different format. With the growth in data generated from social media, sensors of all kinds, photos, transactions, etc., there is a wide variety of data sources.
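To make the third attribute a little more concrete, here is a minimal Python sketch of two sources that describe the same kind of event in different formats. The payloads, the field names, and the `from_json` / `from_csv` helpers are all made up for illustration; real sources would be far messier.

```python
import csv
import io
import json

# Hypothetical sample payloads; the field names are illustrative assumptions.
JSON_EVENT = '{"user": "alice", "amount": "12.50", "ts": "2020-01-01T10:00:00"}'
CSV_EVENTS = "customer,total,timestamp\nbob,7.25,2020-01-01T10:05:00\n"


def from_json(text):
    """Map one JSON event onto a common record shape."""
    raw = json.loads(text)
    return {"user": raw["user"], "amount": float(raw["amount"]), "time": raw["ts"]}


def from_csv(text):
    """Map each CSV row onto the same common record shape."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"user": r["customer"], "amount": float(r["total"]), "time": r["timestamp"]}
        for r in rows
    ]


# Both sources end up as records with identical keys, ready for analysis.
records = [from_json(JSON_EVENT)] + from_csv(CSV_EVENTS)
```

Every new source needs its own mapping like these, which is part of why a wide range of formats makes big data harder to work with than a single large table.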

So, large amounts of structured or unstructured data that come from different sources and have different semantics can roughly be defined as ‘Big Data’.

