Big Data

Big Data is a massive collection of data that keeps growing rapidly over time. The data set is so large and complex that conventional data management tools cannot store or process it efficiently. In other words, big data is ordinary data at a far larger scale.

The categories of Big Data are as follows:

  • Structured
  • Unstructured
  • Semi-structured

Structured data is any data that can be stored, retrieved, and processed in a fixed, predetermined format. Over time, computer science has developed increasingly effective techniques for working with such data (where the format is fully known in advance) and extracting value from it. Problems arise, however, when the volume of such data grows to enormous proportions, with sizes now quoted in the zettabyte range. A database table named ‘Employee’ is an example of structured data.
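
As a minimal sketch of what "a predetermined format" means in practice, the snippet below builds an ‘Employee’ table in an in-memory SQLite database. The column names and sample rows are assumptions made up for illustration; the text only names the table itself.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# The schema is declared before any data arrives: every row must fit it.
# Column names here are hypothetical, chosen only for illustration.
conn.execute(
    """
    CREATE TABLE Employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     REAL NOT NULL
    )
    """
)

# Made-up sample rows.
rows = [
    (1, "Asha", "Finance", 65000.0),
    (2, "Ben", "Admin", 50000.0),
    (3, "Chen", "Finance", 70000.0),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", rows)

# Because the format is known in advance, retrieval is a simple query.
finance = conn.execute(
    "SELECT name, salary FROM Employee WHERE department = ? ORDER BY id",
    ("Finance",),
).fetchall()

avg_salary = conn.execute("SELECT AVG(salary) FROM Employee").fetchone()[0]

print(finance)
print(avg_salary)
```

The fixed schema is what makes the filter and the average one-line queries. The same approach becomes harder to sustain once the table no longer fits on a single machine.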

Unstructured data is any data whose form or organization is not known in advance. Beyond its sheer volume, unstructured data is hard to process and derive value from. A typical example is a heterogeneous data source that mixes plain text files, photos, videos, and other content. Organizations today have a wealth of data at their disposal, but they often cannot extract value from it because it is in raw, unstructured form. The list of results returned by a ‘Google Search’ is another example of unstructured data.
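
To illustrate why raw text needs extra processing before it yields value, here is a small sketch that imposes a very simple structure (word counts) on free-form text. The sample sentences and the `tokenize` helper are hypothetical, and real pipelines for text, images, or video are far more involved.

```python
import re
from collections import Counter

# Made-up free-form text: no columns, no schema, just strings.
documents = [
    "Big data tools process large data sets.",
    "Photos and videos are stored as raw bytes.",
    "Search results mix text, links, and images.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Structure has to be derived: here, a table of word frequencies.
word_counts = Counter(word for doc in documents for word in tokenize(doc))

print(word_counts.most_common(3))
```

Unlike a database table, nothing in the input says what a "field" is. The program has to decide how to split and interpret the text before any analysis can begin.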

Semi-structured data combines features of both types. It appears structured, but its structure is not defined by a table definition in a relational database management system. A data set stored in an XML file is an example of semi-structured data.
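
The following sketch shows what this looks like for a small XML document parsed with Python's standard `xml.etree.ElementTree` module. The element names and values are invented for illustration. Note that the two records carry different fields, which a fixed relational table definition would not allow without changes.

```python
import xml.etree.ElementTree as ET

# Made-up XML: tags give each value a label, but no schema forces
# every <employee> to contain the same fields.
xml_text = """
<employees>
  <employee id="1">
    <name>Asha</name>
    <department>Finance</department>
  </employee>
  <employee id="2">
    <name>Ben</name>
    <email>ben@example.com</email>
  </employee>
</employees>
""".strip()

root = ET.fromstring(xml_text)

# Turn each <employee> element into a dict of whatever fields it has.
records = [
    {"id": emp.get("id"), **{child.tag: child.text for child in emp}}
    for emp in root.findall("employee")
]

for record in records:
    print(record)
```

The tags make the data self-describing, so it can be read without a table definition, but any consumer has to cope with fields that may or may not be present.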
