What is Big Data? Introduction, Types, Characteristics, Examples (2024)

By: David Taylor

Updated December 30, 2023

Before introducing Big Data, we first need to answer a basic question:

What is Data?

Data is the set of quantities, characters, or symbols on which a computer performs operations. It may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.

Now, let’s look at the definition of Big Data.

What is Big Data?

Big Data is a collection of data that is huge in volume and still growing exponentially with time. Its size and complexity are so great that no traditional data management tool can store or process it efficiently. In short, Big Data is still data, just at an enormous scale.

What is an Example of Big Data?

The following are some examples of Big Data:

The New York Stock Exchange generates about one terabyte of new trade data per day.

Social Media

Reported statistics indicate that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, comments, etc.

A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, the data generated reaches many petabytes.

Types Of Big Data

Following are the types of Big Data:

Structured
Unstructured
Semi-structured

Structured

Any data that can be stored, accessed, and processed in a fixed format is termed ‘structured’ data. Over time, computer science has become very successful at developing techniques for working with this kind of data (where the format is well known in advance) and at deriving value from it. However, issues now arise when the size of such data grows to a huge extent, with typical sizes in the range of multiple zettabytes.

Do you know? 10²¹ bytes equal 1 zettabyte; in other words, one billion terabytes form a zettabyte.

Looking at these figures one can easily understand why the name Big Data is given and imagine the challenges involved in its storage and processing.
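The zettabyte figure can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where each step up is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between two decimal storage units."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# One zettabyte expressed in terabytes.
terabytes_per_zettabyte = convert(1, "ZB", "TB")
print(terabytes_per_zettabyte)  # prints 1000000000.0 (one billion)
```

The same helper can be used to compare any of the figures quoted in this article, for example terabytes against petabytes.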

Employee_ID	Employee_Name	Gender	Department	Salary_In_lacs
2365	Rajesh Kulkarni	Male	Finance	650000
3398	Pratibha Joshi	Female	Admin	650000
7465	Shushil Roy	Male	Admin	500000
7500	Shubhojit Das	Male	Finance	500000
7699	Priya Sane	Female	Finance	550000
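Because every row of this table follows the same fixed format, it can be loaded and queried with very little code. The following is a minimal sketch using Python's standard `csv` module; the function names `load_employees` and `average_salary_by_department` are made up for illustration, and the column names are taken from the table as-is.

```python
import csv
import io
from collections import defaultdict

# The employee table from the text, as tab-separated values.
EMPLOYEE_TSV = (
    "Employee_ID\tEmployee_Name\tGender\tDepartment\tSalary_In_lacs\n"
    "2365\tRajesh Kulkarni\tMale\tFinance\t650000\n"
    "3398\tPratibha Joshi\tFemale\tAdmin\t650000\n"
    "7465\tShushil Roy\tMale\tAdmin\t500000\n"
    "7500\tShubhojit Das\tMale\tFinance\t500000\n"
    "7699\tPriya Sane\tFemale\tFinance\t550000\n"
)


def load_employees(text):
    """Parse the rows and cast numeric columns to int, as the fixed schema allows."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    employees = []
    for row in reader:
        row["Employee_ID"] = int(row["Employee_ID"])
        row["Salary_In_lacs"] = int(row["Salary_In_lacs"])
        employees.append(row)
    return employees


def average_salary_by_department(employees):
    """Group salaries by department and return the mean for each group."""
    salaries = defaultdict(list)
    for employee in employees:
        salaries[employee["Department"]].append(employee["Salary_In_lacs"])
    return {dept: sum(values) / len(values) for dept, values in salaries.items()}


employees = load_employees(EMPLOYEE_TSV)
averages = average_salary_by_department(employees)
print(averages["Admin"])  # prints 575000.0
```

The point of the sketch is that the schema is known in advance, so each column can be cast and aggregated without inspecting the data first.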

Unstructured

Any data whose form or structure is unknown is classified as unstructured data. In addition to its huge size, unstructured data poses multiple challenges when it is processed to derive value from it. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc. Nowadays organizations have a wealth of data available to them, but unfortunately they don’t know how to derive value from it, since the data is in its raw, unstructured form.

Examples Of Un-structured Data

The output returned by ‘Google Search’

Semi-structured

Semi-structured data can contain both forms of data. It appears structured in form, but it is not actually defined with, for example, a table definition in a relational DBMS. An example of semi-structured data is data represented in an XML file.

Examples Of Semi-structured Data

Personal data stored in an XML file:

<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
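The tags in these records carry the structure themselves, so a program can read them without a table definition. Below is a minimal sketch using Python's standard `xml.etree.ElementTree` module. One assumption to note: the records above have no single enclosing element, so the sketch wraps them in a hypothetical `<people>` root to make the fragment well-formed; `parse_people` is likewise an illustrative name.

```python
import xml.etree.ElementTree as ET

# The records from the text, concatenated as they appear in the file.
RECORDS = (
    "<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>"
    "<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>"
    "<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>"
    "<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>"
    "<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>"
)


def parse_people(fragment):
    """Wrap the fragment in a hypothetical <people> root and read each <rec>."""
    root = ET.fromstring(f"<people>{fragment}</people>")
    people = []
    for rec in root.findall("rec"):
        people.append(
            {
                "name": rec.findtext("name"),
                "sex": rec.findtext("sex"),
                "age": int(rec.findtext("age")),
            }
        )
    return people


people = parse_people(RECORDS)
older_than_30 = [person["name"] for person in people if person["age"] > 30]
print(older_than_30)  # prints ['Prashant Rao', 'Seema R.', 'Jeremiah J.']
```

Unlike a relational table, nothing here guarantees that every record has the same fields; a record missing its `<age>` tag would need extra handling that this sketch leaves out.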

Data Growth over the years

Please note that web application data, which is unstructured, consists of log files, transaction history files, etc. OLTP systems are built to work with structured data, in which data is stored in relations (tables).

Characteristics Of Big Data

Big data can be described by the following characteristics:

Volume
Variety
Velocity
Variability

(i) Volume – The name Big Data itself refers to an enormous size. The size of data plays a crucial role in determining the value that can be derived from it. Whether a particular data set can be considered Big Data at all also depends on its volume. Hence, ‘Volume’ is one characteristic that needs to be considered when dealing with Big Data solutions.

(ii) Variety – The next aspect of Big Data is its variety.

Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. In earlier days, spreadsheets and databases were the only sources of data considered by most applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, etc. is also considered in analysis applications. This variety of unstructured data poses certain issues for storing, mining, and analyzing data.

(iii) Velocity – The term ‘velocity’ refers to the speed at which data is generated. How fast the data is generated and processed to meet demand determines the real potential in the data.

Big Data velocity deals with the speed at which data flows in from sources like business processes, application logs, networks, social media sites, sensors, mobile devices, etc. The flow of data is massive and continuous.
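One simple way to put a number on velocity is to count how many events arrive within a recent time window. The sketch below is only an illustration of that idea: the `RateMeter` class is a made-up name, and it assumes timestamps are given in seconds and arrive in non-decreasing order.

```python
from collections import deque


class RateMeter:
    """Track how many events arrived within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Register one event and drop events that fell out of the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def events_per_second(self):
        """Average arrival rate over the current window."""
        return len(self._timestamps) / self.window_seconds


meter = RateMeter(window_seconds=10)
for t in [0, 1, 2, 3, 12]:
    meter.record(t)

# At t=12 only the events at t=3 and t=12 are still inside the 10-second window.
print(meter.events_per_second())  # prints 0.2
```

A real ingestion pipeline would measure bytes as well as event counts, but the windowing idea is the same.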

(iv) Variability – This refers to the inconsistency which can be shown by the data at times, thus hampering the process of being able to handle and manage the data effectively.

Advantages Of Big Data Processing

The ability to process Big Data in a DBMS brings multiple benefits, such as:

Businesses can utilize outside intelligence when making decisions

Access to social data from search engines and sites like Facebook and Twitter is enabling organizations to fine-tune their business strategies.

Improved customer service

Traditional customer feedback systems are getting replaced by new systems designed with Big Data technologies. In these new systems, Big Data and natural language processing technologies are being used to read and evaluate consumer responses.

Early identification of risk to the product/services, if any
Better operational efficiency

Big Data technologies can be used for creating a staging area or landing zone for new data before identifying what data should be moved to the data warehouse. In addition, such integration of Big Data technologies and data warehouse helps an organization to offload infrequently accessed data.

Summary

Big Data definition: Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time.
Examples of Big Data include stock exchanges, social media sites, jet engines, etc.
Big Data can be 1) structured, 2) unstructured, or 3) semi-structured.
Volume, Variety, Velocity, and Variability are a few Big Data characteristics.
Improved customer service, better operational efficiency, and better decision making are a few advantages of Big Data.

FAQs

What is Big Data? Introduction, Types, Characteristics, Examples?

Big data is defined as a complex and voluminous set of information comprising structured, unstructured, and semi-structured datasets, which is challenging to manage using traditional data processing tools. It requires additional infrastructure to govern, analyze, and convert into insights.

What are big data introduction types, characteristics, and examples?

Big data can be classified into structured, semi-structured, and unstructured data. Structured data is highly organized and fits neatly into traditional databases. Semi-structured data, like JSON or XML, is partially organized, while unstructured data, such as text or multimedia, lacks a predefined structure.

What is introduction to big data?

The definition of big data is data that contains greater variety, arriving in increasing volumes and with more velocity. This is also known as the three “Vs.” Put simply, big data is larger, more complex data sets, especially from new data sources.

Which is an example of big data characteristic?

Three characteristics define Big Data: volume, variety, and velocity. Together, these characteristics define “Big Data”.

What are the 5 characteristics of big data?

Big data is a collection of data from many different sources and is often described by five characteristics: volume, value, variety, velocity, and veracity.

What are the 5 characters of big data?

The 5 V's of big data (velocity, volume, value, variety, and veracity) are the five main, innate characteristics of big data.

What is big data and its characteristics?

Big data is a combination of structured, semi-structured and unstructured data that organizations collect, analyze and mine for information and insights. It's used in machine learning projects, predictive modeling and other advanced analytics applications.

What is a data example?

Examples of data include the number of visitors to a website in one month, inventory levels in a warehouse on a specific date, individual satisfaction scores on a customer service survey, and the price of a competitor's product.

What are the three characteristics of big data?

There are three defining properties that can help break down the term. Dubbed the three Vs (volume, velocity, and variety), these are key to understanding how we can measure big data and just how different big data is from old-fashioned data.

What are the four common characteristics of big data?

Most people consider data “big” if it has the four Vs: volume, velocity, variety, and veracity.

What are examples of data characteristics?

There are five traits that you'll find within data quality: accuracy, completeness, reliability, relevance, and timeliness.

Why are big data characteristics important?

Big Data is characterized by its volume, variety, and velocity. It can be utilized to speed up product development, cut down on time and resources, and ensure customer satisfaction. Some of the popular tools available for Big Data include Apache Hadoop, Apache Hive, and Tableau.

What is big data structure with example?

Structured data in big data has a defined data model that is formatted to store data structures before it is moved to data storage. It is stored in its original format and not processed until it is executed. The data is stored in tabular format and requires less storage space. Examples: Excel, Sheets, or an SQL database.

What is big data and what are its characteristics?

Big Data refers to a large amount of data that cannot be processed by traditional data storage or processing units. It is used by many multinational companies to process the data and business of many organizations. The data flow would exceed 150 exabytes per day before replication.

What is data type introduction?

A data type is an attribute associated with a piece of data that tells a computer system how to interpret its value. Understanding data types ensures that data is collected in the preferred format and the value of each property is as expected.

What are the four common characteristics of big data and provide two examples?

Volume. Volume refers to how much data is actually collected. ...
Veracity. Veracity relates to how reliable data is. ...
Velocity. Velocity in big data refers to how fast data can be generated, gathered and analyzed. ...
Variety. Variety refers to how many points of reference are used to collect data.
