mirror of
https://github.com/linkedin/school-of-sre
synced 2026-01-20 15:38:03 +00:00
Deployed 6d74e6c with MkDocs version: 1.1.2
@@ -610,7 +610,7 @@
|
||||
<li class="md-nav__item">
|
||||
<a href="../overview/" class="md-nav__link">
|
||||
<a href="../overview.md" class="md-nav__link">
|
||||
Overview of Big Data
|
||||
</a>
|
||||
</li>
|
||||
@@ -622,7 +622,7 @@
|
||||
<li class="md-nav__item">
|
||||
<a href="../usage/" class="md-nav__link">
|
||||
<a href="../usage.md" class="md-nav__link">
|
||||
Usage of Big Data techniques
|
||||
</a>
|
||||
</li>
|
||||
@@ -646,7 +646,7 @@
|
||||
<li class="md-nav__item">
|
||||
<a href="../architecture/" class="md-nav__link">
|
||||
<a href="../architecture.md" class="md-nav__link">
|
||||
Architecture of Hadoop
|
||||
</a>
|
||||
</li>
|
||||
@@ -946,23 +946,56 @@
|
||||
<h2 id="course-content">Course Content</h2>
|
||||
<h3 id="table-of-contents">Table of Contents</h3>
|
||||
<ol>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/overview/">Overview of Big Data</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/overview/">Usage of Big Data techniques</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/intro/#overview-of-big-data">Overview of Big Data</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/intro/#usage-of-big-data-techniques">Usage of Big Data techniques</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/evolution/">Evolution of Hadoop</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/architecture/">Architecture of hadoop</a><ol>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/evolution/#architecture-of-hadoop">Architecture of hadoop</a><ol>
|
||||
<li>HDFS</li>
|
||||
<li>Yarn</li>
|
||||
</ol>
|
||||
</li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/architecture/#mapreduce-framework">MapReduce framework</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/architecture/#other-tooling-around-hadoop">Other tooling around hadoop</a><ol>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/evolution/#mapreduce-framework">MapReduce framework</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/evolution/#other-tooling-around-hadoop">Other tooling around hadoop</a><ol>
|
||||
<li>Hive</li>
|
||||
<li>Pig</li>
|
||||
<li>Spark</li>
|
||||
<li>Presto</li>
|
||||
</ol>
|
||||
</li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/architecture/#data-serialisation-and-storage">Data Serialisation and storage</a></li>
|
||||
<li><a href="https://linkedin.github.io/school-of-sre/big_data/evolution/#data-serialisation-and-storage">Data Serialisation and storage</a></li>
|
||||
</ol>
|
||||
<h1 id="overview-of-big-data">Overview of Big Data</h1>
|
||||
<ol>
|
||||
<li>Big Data refers to collections of datasets that are too large to be processed using traditional computing techniques. It is not a single technique or tool; rather, it has become a complete subject that involves various tools, techniques and frameworks.</li>
|
||||
<li>Big Data could consist of:<ol>
|
||||
<li>Structured data</li>
|
||||
<li>Unstructured data</li>
|
||||
<li>Semi-structured data</li>
|
||||
</ol>
|
||||
</li>
|
||||
<li>Characteristics of Big Data:<ol>
|
||||
<li>Volume</li>
|
||||
<li>Variety</li>
|
||||
<li>Velocity</li>
|
||||
<li>Variability</li>
|
||||
</ol>
|
||||
</li>
|
||||
<li>Examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.</li>
|
||||
</ol>
|
||||
<h1 id="usage-of-big-data-techniques">Usage of Big Data Techniques</h1>
|
||||
<ol>
|
||||
<li>Take the example of the traffic lights problem.<ol>
|
||||
<li>There are more than 300,000 traffic lights in the US as of 2018.</li>
|
||||
<li>Let us assume that we placed a device on each of them to collect metrics and send them to a central metrics collection system.</li>
|
||||
<li>If each of the IoT devices sends 10 events per minute, we have 300,000 x 10 x 60 x 24 = 432 x 10^7 (about 4.32 billion) events per day.</li>
|
||||
<li>How would you process that volume of data to tell how many of the signals were “green” at 10:45 am on a particular day?</li>
|
||||
</ol>
|
||||
</li>
|
||||
<li>Consider the next example on Unified Payments Interface (UPI) transactions:<ol>
|
||||
<li>India recorded about 1.15 billion UPI transactions in October 2019.</li>
|
||||
<li>If we extrapolate this volume to a full year and want to find the common payments made through a particular UPI ID, how would you suggest we go about it?</li>
|
||||
</ol>
|
||||
</li>
|
||||
</ol>
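<p>A minimal Python sketch of the traffic lights example above. It reproduces the events-per-day arithmetic and shows a naive, single-machine way to answer the “green at 10:45 am” question. The event layout <code>(signal_id, timestamp, state)</code> and the function names are assumptions made for illustration; they are not part of the course material.</p>

```python
from datetime import datetime

# Figures taken from the traffic lights example in the text.
TRAFFIC_LIGHTS = 300_000
EVENTS_PER_MINUTE = 10


def events_per_day(devices: int, events_per_minute: int) -> int:
    """Total events emitted by all devices over 24 hours."""
    return devices * events_per_minute * 60 * 24


def count_green_at(events, when: datetime) -> int:
    """Count signals whose latest event inside the given minute is 'green'.

    `events` is an iterable of (signal_id, timestamp, state) tuples.
    This layout is hypothetical; a real metrics system would define its own.
    """
    minute = when.replace(second=0, microsecond=0)
    latest = {}  # signal_id -> (timestamp, state) of the newest event seen
    for signal_id, ts, state in events:
        # Ignore events that fall outside the minute we are asking about.
        if ts.replace(second=0, microsecond=0) != minute:
            continue
        if signal_id not in latest or ts >= latest[signal_id][0]:
            latest[signal_id] = (ts, state)
    return sum(1 for _, state in latest.values() if state == "green")


if __name__ == "__main__":
    print(events_per_day(TRAFFIC_LIGHTS, EVENTS_PER_MINUTE))
```

<p>Scanning every event in a single process like this is unlikely to be practical at roughly 4.32 billion events per day, which is the kind of gap that distributed storage and processing tools aim to address.</p>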
@@ -997,13 +1030,13 @@
|
||||
</a>
|
||||
<a href="../overview/" class="md-footer-nav__link md-footer-nav__link--next" rel="next">
|
||||
<a href="../evolution/" class="md-footer-nav__link md-footer-nav__link--next" rel="next">
|
||||
<div class="md-footer-nav__title">
|
||||
<div class="md-ellipsis">
|
||||
<span class="md-footer-nav__direction">
|
||||
Next
|
||||
</span>
|
||||
Overview of Big Data
|
||||
Evolution of Hadoop
|
||||
</div>
|
||||
</div>
|
||||
<div class="md-footer-nav__button md-icon">