mirror of
https://github.com/linkedin/school-of-sre
synced 2026-01-21 07:58:03 +00:00
Merge branch 'main' into BigData
@@ -1,38 +1,35 @@
# School of SRE: Big Data
# Big Data

## Pre - Reads
## Prerequisites

- Basics of Linux File systems.
- Basic understanding of System Design.

## Target Audience

The concept of Big Data has been around for years; most organizations now understand that if they capture all the data that streams into their businesses, they can apply analytics and get significant value from it.
This training material covers the basics of Big Data (using Hadoop) for beginners who would like to get started quickly and get their hands dirty in this domain.

## What to expect from this training
## What to expect from this course

This course covers the basics of Big Data and how it has evolved to become what it is today. We will take a look at a few realistic scenarios where Big Data would be a perfect fit. An interesting assignment on designing a Big Data system is followed by understanding the architecture of Hadoop and the tooling around it.

## What is not covered under this training
## What is not covered under this course

Writing programs to draw analytics from data.

## TOC:
## Course Content

1. Overview of Big Data
2. Usage of Big Data techniques
3. Evolution of Hadoop
4. Architecture of Hadoop
### Table of Contents

1. [Overview of Big Data](https://linkedin.github.io/school-of-sre/big_data/intro/#overview-of-big-data)
2. [Usage of Big Data techniques](https://linkedin.github.io/school-of-sre/big_data/intro/#usage-of-big-data-techniques)
3. [Evolution of Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/)
4. [Architecture of Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/#architecture-of-hadoop)
    1. HDFS
    2. YARN
5. MapReduce framework
6. Other tooling around Hadoop
5. [MapReduce framework](https://linkedin.github.io/school-of-sre/big_data/evolution/#mapreduce-framework)
6. [Other tooling around Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/#other-tooling-around-hadoop)
    1. Hive
    2. Pig
    3. Spark
    4. Presto
7. Data Serialisation and storage
7. [Data Serialisation and storage](https://linkedin.github.io/school-of-sre/big_data/evolution/#data-serialisation-and-storage)

# Overview of Big Data
@@ -50,7 +47,7 @@ Writing programs to draw analytics from data.
4. Examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.

# Usage of Big Data techniques
# Usage of Big Data Techniques

1. Take the example of the traffic lights problem.
    1. There are more than 300,000 traffic lights in the US as of 2018.
@@ -59,4 +56,5 @@ Writing programs to draw analytics from data.
    4. How would you go about processing that and telling me how many of the signals were “green” at 10:45 am on a particular day?
2. Consider the next example on Unified Payments Interface (UPI) transactions:
    1. We had about 1.15 billion UPI transactions in India in October 2019.
    2. If we extrapolate this data to about a year and want to find the most common payments made through a particular UPI ID, how would you go about it?
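
Both questions above reduce to the same pattern: split the records into chunks, count within each chunk independently (map), then merge the partial counts (reduce). The sketch below illustrates this on the traffic light question with a tiny in-memory dataset. The record layout `(signal_id, timestamp, colour)`, the sample values, and all function names are assumptions made for illustration; they do not come from the course material, and a real system would distribute the chunks across machines instead of looping over them in one process.

```python
from collections import Counter
from functools import reduce

# Hypothetical records: (signal_id, "YYYY-MM-DD HH:MM", colour).
READINGS = [
    ("s1", "2018-06-01 10:45", "green"),
    ("s2", "2018-06-01 10:45", "red"),
    ("s3", "2018-06-01 10:45", "green"),
    ("s1", "2018-06-01 10:46", "red"),
    ("s4", "2018-06-01 10:45", "yellow"),
]


def split_into_chunks(records, chunk_size):
    """Split the input into fixed-size chunks, one per (imaginary) worker."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk, timestamp):
    """Map step: count the colours seen at `timestamp` within one chunk."""
    return Counter(colour for _, ts, colour in chunk if ts == timestamp)


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def count_colours_at(records, timestamp, chunk_size=2):
    """Run the map step on every chunk, then fold the partial counts together."""
    partials = [map_chunk(c, timestamp) for c in split_into_chunks(records, chunk_size)]
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    counts = count_colours_at(READINGS, "2018-06-01 10:45")
    print(counts["green"])  # number of signals that were green at 10:45
```

The UPI question has the same shape: filter the transactions to one UPI ID in the map step and count by payee instead of by colour.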