The fact is, we are all involved in it. Therefore, in this article we will talk about almost everything related to Big Data & Big Data Analytics. It is meant to be a simple & easy guide covering what everyone needs to know about the topic. It includes the following sections:
- The famous question “What is Big Data?”
- How can the world make use of it?
- Can Big Data affect you personally?
- What a Big Data solution looks like
- Who is working in that field?
- Studying & Working in this field
What is big data analytics?
This is a common question these days. Many ask: Is it just large-sized data? A type of software? A process? Or something else? Let us take it step by step.
How big is big data?
Big Data: as the name suggests, part of it is about dealing with big volumes of data. On our computers & servers, we work with Gigabytes and maybe a few Terabytes. On the other hand, imagine having Petabytes of data (1 Petabyte = 1000 Terabytes), or Exabytes (1000 Petabytes), or Zettabytes (1000 Exabytes). Definitely, you will need different ways to store and process this huge amount of data. The previous image (from UNECE) shows the increase in data with time.
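To get a feel for these units, here is a minimal Python sketch that converts a raw byte count into the largest fitting unit, using the same factor of 1000 between units as above. The function name `human_readable` and the sample values are made up for illustration.

```python
# Decimal storage units, each 1000 times larger than the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at the last unit in the list.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

# Illustrative values only (not measurements from any real system).
print(human_readable(2_500_000_000_000))  # a laptop-scale disk
print(human_readable(10**15))             # one Petabyte
print(human_readable(5 * 10**21))         # several Zettabytes
```

The jump from the first line of output to the last one is the gap between what a single machine holds and what Big Data tools are built for.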
Another perspective is the rate of data flow. For example, your mailbox may receive, say, 20 new emails per day; on busy days, let us say it gets 100. Twitter, by comparison, receives 500 million tweets per day. That is about 350,000 tweets per minute. In Aug 2015, around 1 billion people used Facebook in a single day.
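The per-minute figure is simple arithmetic on the daily total. A quick Python check, using only the 500 million tweets per day quoted above, looks like this:

```python
# Back-of-the-envelope check of the tweet rate quoted in the text.
MINUTES_PER_DAY = 24 * 60               # 1,440 minutes
SECONDS_PER_DAY = MINUTES_PER_DAY * 60  # 86,400 seconds

tweets_per_day = 500_000_000            # figure quoted in the text

tweets_per_minute = tweets_per_day / MINUTES_PER_DAY
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

print(f"{tweets_per_minute:,.0f} tweets per minute")
print(f"{tweets_per_second:,.0f} tweets per second")
```

The result is roughly 347,000 tweets per minute, which rounds to the 350,000 mentioned above, or close to 5,800 tweets every second.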
A third aspect is the data format. Imagine you have a variety of data formats including database tables, Word documents, images, and videos. All of that represents valuable content for processing and analysis.
Based on the above points, there is a common concept named the 3 Vs, which stands for Volume, Velocity and Variety. Some add Veracity (the messiness or trustworthiness of the data) and Value. That makes 5 Vs describing what data can look like.
Big Data definition
Now we can answer the question. Big Data is the process of dealing with data that has the previous characteristics. That includes data capture, processing, storage, analysis, and visualization. Doing that requires special software and applications for handling data of large Volume, high Velocity, and a Variety of formats.
Big Data Analytics
Data analysis was there even before Big Data technology appeared. When it appeared, doors opened for new levels of data analysis that answer complicated questions. Therefore, the term Big Data Analytics came to represent the process of analyzing large data sets and building analytical models to answer different business questions.
Business Intelligence versus Big Data Analytics
Actually, this is a common question: What is the difference between Business Intelligence & Big Data?
Business Intelligence (BI) is an excellent tool for reporting the previous & current status. Therefore, we call this Descriptive Analytics. Business Intelligence uses structured data to generate advanced reports and charts. For example, users are able to drill down or across the data. Accordingly, they can generate multiple reports to help managers measure performance in the different departments.
Big Data includes the previous capabilities of Business Intelligence. In addition, Big Data analytics can build models for answering what-if scenarios. For example, while BI can provide reports about the current sales figures, Big Data analytics can answer questions like: What will the sales volume be if we change this item? Big Data analytics also helps in optimization, because it answers questions like: What should we change to reach this target? That is why we call this Predictive Analytics.
So, if you want a short answer about the difference between BI & Big Data Analytics: the first is about descriptive analytics, while the second is about predictive analytics.
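To make the contrast concrete, here is a small Python sketch. The monthly sales numbers are invented for illustration, and the straight-line trend is only one very simple way to predict. Real Big Data analytics models are usually far richer.

```python
from statistics import mean

# Hypothetical monthly sales (units sold), invented for illustration.
monthly_sales = [100, 110, 120, 130, 140, 150]
months = list(range(1, len(monthly_sales) + 1))

# Descriptive analytics (BI style): summarize what already happened.
total_sales = sum(monthly_sales)
average_sales = mean(monthly_sales)

# Predictive analytics: fit a straight-line trend with ordinary least
# squares, then use it to estimate a month that has not happened yet.
x_mean = mean(months)
y_mean = mean(monthly_sales)
slope = (
    sum((x - x_mean) * (y - y_mean) for x, y in zip(months, monthly_sales))
    / sum((x - x_mean) ** 2 for x in months)
)
intercept = y_mean - slope * x_mean

def forecast(month):
    """Estimate sales for a given month number from the fitted trend."""
    return intercept + slope * month

print("Descriptive: total =", total_sales, "average =", average_sales)
print("Predictive: forecast for month 7 =", forecast(7))
```

The first print statement is a report about the past, which is what BI delivers. The second one is an estimate about the future, which is the kind of question predictive analytics is meant to answer.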
Many factors came together and resulted in the emergence of Big Data.
One of the factors is the breakthrough in storage technology. We are now able to store huge amounts of data. That encouraged individuals and entities to keep everything. Accordingly, the content expanded exponentially.
Business is an important driver. Companies discovered they have a treasure of data. Banks have tons of data about their customers and their historical transactions. Retail stores have thousands of purchase records showing who bought what and when. Accordingly, business entities needed new tools to analyze their Big Data and extract trends and recommendations to improve their business.
We cannot forget earlier technologies such as data mining and business intelligence. They contributed to Big Data as well.
Big Data Applications & Business
Almost all sectors use Big Data.
Retail stores apply Big Data analytics to identify the links between products. In brief, they analyze transaction history to detect that most customers who bought product A also bought product B. That is “Affinity Analysis” or “Basket Analysis”. Then, they use that outcome to increase sales. In online stores like Amazon, when you check an item you get a list of related products that other users bought with it. In a physical retail store, you will find product A located just next to product B. As a result, they increase their sales figures.
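The core idea of basket analysis can be sketched in a few lines of Python. The transactions below are invented for illustration, and the `confidence` helper is a hypothetical name. It simply measures how often product B shows up in the baskets that contain product A. Real retailers run this kind of counting over millions of receipts with dedicated tools.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]

item_counts = Counter()  # how many baskets contain each item
pair_counts = Counter()  # how many baskets contain each pair of items
for basket in transactions:
    item_counts.update(basket)
    # Sort so that a pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(a, b):
    """Share of the baskets containing `a` that also contain `b`."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / item_counts[a]

most_common_pair, times_seen = pair_counts.most_common(1)[0]
print("Most frequent pair:", most_common_pair, "seen", times_seen, "times")
print("Bought bread, also bought butter:", confidence("bread", "butter"))
```

In this toy data, bread and butter appear together more often than any other pair, so a store might place them side by side or recommend one when a customer views the other.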
Additionally, they identify their best customers and find the best offers for them. That is why you sometimes receive SMS offers for specific products on your cell phone. Actually, they pick these products by analyzing your previous history and understanding your interests.
Banks are major users of Big Data. They use it to detect potentially illegal or fraudulent transactions such as credit card fraud. That helps them protect their customers.
Mobile operators apply Big Data analytics to find the suitable package for each customer. In addition, they use it to minimize churn rates by detecting angry customers and finding solutions for retaining them.
Government entities use Big Data to analyze community needs and the priority areas that need attention.
Health organizations use Big Data to analyze the spread of specific diseases and try to detect the factors affecting the spread of infection to new zones.
In sports too: data scientists record performance attributes of football players and analyze them. Then, coaches get reports about the exact areas of potential progress for each player.
Police officers utilize Big Data to analyze crime history attributes. They find out whether there is a link between crimes and specific locations, times or people. Accordingly, they detect the spots with a high probability of future crime.
That was a quick sample. The actual uses are many and you can get more details here.
How we are involved
As we mentioned in the beginning, we are all involved in Big Data.
In our daily life, we are continuously generating data that other entities can use. When you visit Google and search for something, you are generating data. Accordingly, Google analyzes our search inputs and comes out with Google Trends. Moreover, Google understands your interests. Then, your browser displays ads related to your interests.
Similarly, when you post something on social media, some entities analyze your posts to find new trends.
Even if you do nothing, your cell phone keeps sending signals with some info to mobile operators. They can use it to build awareness of where their customers are. Then, they optimize their network accordingly.
Generally, we are generating data all the time. At the same time, we are potential customers. Therefore, business entities analyze us through our data so that they can give us better offers.
How to be a Data Scientist
A data scientist is someone who works with Big Data & Data Analytics. They are responsible for managing, interpreting and analyzing data, and for presenting the results visually to management.
Technically, a data scientist needs to have a mix of skills and knowledge about:
- Statistics and mathematical models
- Programming and writing scripts
- Hardware capabilities
- Writing queries
- Charts & visualization techniques
In addition, a data scientist should build a business understanding. Analysis and visualization should serve the actual business need.
There are several options for studying to be a data scientist. Many universities around the world provide programs in Big Data. For academic degrees, there are Bachelor’s degrees, diplomas & Master’s degrees. Additionally, there are many Big Data courses aside from the academic degrees.
For managers, there are short courses about Big Data & Data Science. The goal is to give them a high-level orientation about Big Data. Consequently, they learn how they can embed that technology in their business.
Luckily, some of these degrees and courses are available online.
The link below provides URLs for a number of Big Data degrees & courses in several countries, as well as URLs for online courses.
Big Data is in a fast-growing phase. According to a Gartner survey, more than 75 percent of companies are investing or planning to invest in Big Data in the next two years. Hence, a lot of companies are currently looking for data scientists. Generally, many different sectors need data scientists: government, banking, telecom, health, retail, etc.
As for salary, the average salary of a data scientist exceeds USD 100,000 per year (around 118,000). However, there is a wide range. Some salaries are around $50K while others may reach $130K.
In big enterprises, data scientists usually report to a CDO (Chief Data Officer), who reports to the CEO. However, this is not the case in all organizations.
What does a Big Data solution look like?
We will not dive into any technical details. However, it is nice to understand the high-level architecture of a Big Data solution.
In a simplified view, a Big Data architecture has four main layers:
- Hardware: servers or machines
- File system: for storing data of different formats (database, videos, images, documents, etc.). A well-known example here is the Hadoop Distributed File System (HDFS).
- Analysis: software layer for accessing and analyzing data.
- Applications: tools for end users. They visualize the analysis results in a friendly & easy interface.
Who are the market players?
Generally, most of the big names provide solutions for Big Data. Additionally, many startups and entrepreneurs are entering the market as well. Literally, you will find hundreds of companies working in this field. No one provides everything. Some companies focus on data storage while others focus on analysis and applications, and so on.
Among the big names, you will find Microsoft, IBM, Oracle, Teradata, Google, SAP and others.
For a clear overview, a nice representation of the market is in the image below of the Big Data landscape, 2016. It shows the companies and the solutions they provide.
A writer & GIS consultant … Studied the Management of Technology … dreaming of a better world.