What is Big Data?
If you are reading this blog, you are probably aware of what big data is. If not, you should be. In either case, most people are at a loss as to what big data really is, other than the fact that it seems to be all the rage these days. Big data is not just a large set of data; the industry commonly defines it as data that has at least three attributes: Volume, Velocity, and Variety. Let’s look at each one briefly:
1. Volume: This is the sheer amount of data. We are talking about terabytes of data. There is no concrete magic number that serves as a threshold, but as a rule of thumb, most of this data is unstructured or has a complex structure.
2. Velocity: Definitions of this term vary, but I think it should indicate how quickly new data is being generated or how quickly a company needs to access that data. Enterprises are moving towards ‘Real-Time’, which means that terabytes of data should be available for analysis instantly.
3. Variety: Data can come from several sources, and data from each source could have a different format. With the growth in data generated from social media, sensors of all kinds, photos, transactions, etc., there is a wide variety of data sources.
So, large amounts of structured or unstructured data that come from different sources and have different semantics can roughly be defined as ‘Big Data’.
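To make the variety attribute a little more concrete, here is a minimal sketch in Python of two sources that describe events in different formats being mapped onto one common record shape. The payloads, field names, and helper functions (`from_sensor`, `from_sales`) are all made up for illustration; real sources will look different and be far larger.

```python
import csv
import io
import json

# Hypothetical payloads from two sources; the field names are illustrative only.
SENSOR_JSON = '{"device": "thermo-1", "ts": 1700000000, "reading": 21.5}'
SALES_CSV = "timestamp,store,amount\n1700000060,north,19.99\n"


def from_sensor(raw):
    """Map one JSON sensor message onto the shared record shape."""
    msg = json.loads(raw)
    return {
        "source": "sensor",
        "timestamp": int(msg["ts"]),
        "value": float(msg["reading"]),
    }


def from_sales(raw):
    """Map each CSV transaction row onto the shared record shape."""
    rows = csv.DictReader(io.StringIO(raw))
    return [
        {
            "source": "sales",
            "timestamp": int(row["timestamp"]),
            "value": float(row["amount"]),
        }
        for row in rows
    ]


# Combine both sources and order the records by time.
records = [from_sensor(SENSOR_JSON)] + from_sales(SALES_CSV)
records.sort(key=lambda record: record["timestamp"])

for record in records:
    print(record)
```

Each source keeps its own format and semantics (a temperature reading versus a sale amount), and the shared shape only makes them comparable side by side. At big data scale the same idea applies, just with many more sources and much more data.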