mirror of
https://github.com/linkedin/school-of-sre
synced 2026-01-07 17:18:03 +00:00
Deployed 52e7ed5 with MkDocs version: 1.1.2
@@ -1301,12 +1301,12 @@
|
||||
</ul>
|
||||
<h3 id="building-and-testing-locally">Building and testing locally</h3>
|
||||
<p>Run the following commands to build and view the site locally before opening a PR.</p>
|
||||
<div class="highlight"><pre><span></span><code>python3 -m venv .venv
|
||||
<span class="nb">source</span> .venv/bin/activate
|
||||
<pre><code>python3 -m venv .venv
|
||||
source .venv/bin/activate
|
||||
pip install -r requirements.txt
|
||||
mkdocs build
|
||||
mkdocs serve
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<h3 id="opening-a-pr">Opening a PR</h3>
|
||||
<p>Follow the <a href="https://guides.github.com/introduction/flow/">GitHub PR workflow</a> for your contributions.</p>
|
||||
<p>Fork this repo, create a feature branch, commit your changes and open a PR to this repo.</p>
|
||||
|
||||
@@ -1291,11 +1291,10 @@ What is the output of running the pig queries in the right column against the da
|
||||
</ol>
|
||||
<p><img alt="Pig Example" src="../images/pig_example.png" /></p>
|
||||
<p>Output:
|
||||
<div class="highlight"><pre><span></span><code>7,Komal,Nayak,24,9848022334,trivendram
|
||||
<code>7,Komal,Nayak,24,9848022334,trivendram
|
||||
8,Bharathi,Nambiayar,24,9848022333,Chennai
|
||||
5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
|
||||
6,Archana,Mishra,23,9848022335,Chennai
|
||||
</code></pre></div></p>
|
||||
6,Archana,Mishra,23,9848022335,Chennai</code></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><a href="https://spark.apache.org/"><strong>Spark</strong></a></p>
|
||||
@@ -1307,10 +1306,9 @@ What is the output of running the pig queries in the right column against the da
|
||||
<li>Presto is a high performance, distributed SQL query engine for Big Data.</li>
|
||||
<li>Its architecture allows users to query a variety of data sources such as Hadoop, AWS S3, Alluxio, MySQL, Cassandra, Kafka, and MongoDB.</li>
|
||||
<li>Example Presto query:
|
||||
<div class="highlight"><pre><span></span><code><span class="n">use</span> <span class="n">studentDB</span><span class="p">;</span>
|
||||
<span class="k">show</span> <span class="n">tables</span><span class="p">;</span>
|
||||
<span class="k">SELECT</span> <span class="n">roll_no</span><span class="p">,</span> <span class="n">name</span> <span class="k">FROM</span> <span class="n">studentDB</span><span class="p">.</span><span class="n">studentDetails</span> <span class="k">where</span> <span class="n">section</span><span class="o">=</span><span class="s1">'A'</span> <span class="k">limit</span> <span class="mi">5</span><span class="p">;</span>
|
||||
</code></pre></div>
|
||||
<code>use studentDB;
|
||||
show tables;
|
||||
SELECT roll_no, name FROM studentDB.studentDetails where section='A' limit 5;</code>
|
||||
</br></li>
|
||||
</ol>
|
||||
</li>
|
||||
|
||||
@@ -1493,17 +1493,17 @@ Web caching
|
||||
<h3 id="hashing">Hashing</h3>
|
||||
<p>A hash function is a function that maps one piece of data—typically describing some kind of object, often of arbitrary size—to another piece of data, typically an integer, known as <em>hash code</em>, or simply <em>hash</em>. In a partitioned database, it is important to consistently map a key to a server/replica. </p>
|
||||
<p>For example, a simple modulo function can serve as the hash:</p>
|
||||
<div class="highlight"><pre><span></span><code>_p = k mod n_
|
||||
</code></pre></div>
|
||||
<pre><code>_p = k mod n_
|
||||
</code></pre>
|
||||
<p>Where </p>
|
||||
<div class="highlight"><pre><span></span><code>p -> partition,
|
||||
<pre><code>p -> partition,
|
||||
|
||||
|
||||
k -> primary key
|
||||
|
||||
|
||||
n -> no of nodes
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>The downside of this simple hash is that whenever the cluster topology changes (that is, n changes), the key-to-partition mapping changes for most keys. For in-memory caches this is tolerable: when a node joins or leaves the topology, partitions can be reassigned and a cache miss is simply repopulated from the backend DB. For persistent data it is not, because the new node does not hold the data it is now expected to serve. This brings us to consistent hashing.</p>
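<p>To make the cost of a topology change concrete, here is a minimal Python sketch of the <code>p = k mod n</code> scheme. The helper names <code>partition</code> and <code>moved_fraction</code>, and the choice of 1000 integer keys, are assumptions made for illustration only.</p>

```python
def partition(key: int, n: int) -> int:
    """Map an integer primary key to one of n nodes using p = k mod n."""
    return key % n


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose partition changes when the node count changes."""
    keys = list(keys)
    moved = sum(1 for k in keys if partition(k, old_n) != partition(k, new_n))
    return moved / len(keys)


# Hypothetical workload: 1000 sequential integer keys, cluster grows from 4 to 5 nodes.
fraction = moved_fraction(range(1000), old_n=4, new_n=5)
print(f"{fraction:.0%} of keys map to a different node")  # prints "80% of keys ..."
```

<p>Only keys with <code>k mod 20</code> in 0..3 keep their partition in this example, so adding a single node relocates four out of every five keys.</p>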
|
||||
<h4 id="consistent-hashing">Consistent Hashing</h4>
|
||||
<p>Consistent hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed <em>hash table</em> by assigning them a position on an abstract circle, or <em>hash ring</em>. This allows servers and objects to scale without affecting the overall system.</p>
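<p>The hash ring can be sketched in a few lines of Python. This is an illustrative sketch, not a production implementation: the <code>HashRing</code> class, the node names and the use of MD5 as the ring hash are all assumptions, and real systems usually add virtual nodes to even out the distribution.</p>

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Place a string on the ring with a stable hash (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Hypothetical minimal ring: one position per node, no virtual nodes."""

    def __init__(self, nodes):
        self._ring = sorted((ring_position(n), n) for n in nodes)

    def add(self, node: str) -> None:
        bisect.insort(self._ring, (ring_position(node), node))

    def lookup(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        positions = [pos for pos, _ in self._ring]
        i = bisect.bisect_right(positions, ring_position(key)) % len(self._ring)
        return self._ring[i][1]


keys = [f"user-{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}

# Keys either stay where they were or move to the newly added node.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to node-d")
```

<p>Unlike the modulo scheme, adding a node only takes over the arc of the ring immediately before the new node's position; keys owned by the other arcs are untouched.</p>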
|
||||
|
||||
@@ -1298,8 +1298,8 @@
|
||||
<p><a href="https://stackify.com/what-are-crud-operations/">CRUD operations</a> - create, read, update, delete queries</p>
|
||||
<p>Management operations - create DBs/tables/indexes etc, backup, import/export, users, access controls</p>
|
||||
<p>Exercise: Classify the queries below into four types, DDL (definition), DML (manipulation), DCL (control) and TCL (transactions), and explain each in detail.</p>
|
||||
<div class="highlight"><pre><span></span><code>insert, create, drop, delete, update, commit, rollback, truncate, alter, grant, revoke
|
||||
</code></pre></div>
|
||||
<pre><code>insert, create, drop, delete, update, commit, rollback, truncate, alter, grant, revoke
|
||||
</code></pre>
|
||||
<p>You can practise these in the <a href="https://linkedin.github.io/school-of-sre/databases_sql/lab/">lab section</a>.</p>
|
||||
</li>
|
||||
<li>
|
||||
|
||||
@@ -1207,7 +1207,7 @@
|
||||
<p><strong>Setup</strong></p>
|
||||
<p>Create a working directory named sos or something similar, and cd into it.</p>
|
||||
<p>Enter the following into a file named my.cnf under a directory named custom.</p>
|
||||
<div class="highlight"><pre><span></span><code>sos $ cat custom/my.cnf
|
||||
<pre><code>sos $ cat custom/my.cnf
|
||||
[mysqld]
|
||||
# These settings apply to MySQL server
|
||||
# You can set port, socket path, buffer size etc.
|
||||
@@ -1215,76 +1215,76 @@
|
||||
slow_query_log=1
|
||||
slow_query_log_file=/var/log/mysqlslow.log
|
||||
long_query_time=0.1
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Start a container and enable slow query log with the following:</p>
|
||||
<div class="highlight"><pre><span></span><code>sos $ docker run --name db -v custom:/etc/mysql/conf.d -e <span class="nv">MYSQL_ROOT_PASSWORD</span><span class="o">=</span>realsecret -d mysql:8
|
||||
sos $ docker cp custom/my.cnf <span class="k">$(</span>docker ps -qf <span class="s2">"name=db"</span><span class="k">)</span>:/etc/mysql/conf.d/custom.cnf
|
||||
sos $ docker restart <span class="k">$(</span>docker ps -qf <span class="s2">"name=db"</span><span class="k">)</span>
|
||||
</code></pre></div>
|
||||
<pre><code>sos $ docker run --name db -v custom:/etc/mysql/conf.d -e MYSQL_ROOT_PASSWORD=realsecret -d mysql:8
|
||||
sos $ docker cp custom/my.cnf $(docker ps -qf "name=db"):/etc/mysql/conf.d/custom.cnf
|
||||
sos $ docker restart $(docker ps -qf "name=db")
|
||||
</code></pre>
|
||||
<p>Import a sample database</p>
|
||||
<div class="highlight"><pre><span></span><code>sos $ git clone git@github.com:datacharmer/test_db.git
|
||||
sos $ docker cp test_db <span class="k">$(</span>docker ps -qf <span class="s2">"name=db"</span><span class="k">)</span>:/home/test_db/
|
||||
sos $ docker <span class="nb">exec</span> -it <span class="k">$(</span>docker ps -qf <span class="s2">"name=db"</span><span class="k">)</span> bash
|
||||
root@3ab5b18b0c7d:/# <span class="nb">cd</span> /home/test_db/
|
||||
<pre><code>sos $ git clone git@github.com:datacharmer/test_db.git
|
||||
sos $ docker cp test_db $(docker ps -qf "name=db"):/home/test_db/
|
||||
sos $ docker exec -it $(docker ps -qf "name=db") bash
|
||||
root@3ab5b18b0c7d:/# cd /home/test_db/
|
||||
root@3ab5b18b0c7d:/# mysql -uroot -prealsecret mysql < employees.sql
|
||||
root@3ab5b18b0c7d:/etc# touch /var/log/mysqlslow.log
|
||||
root@3ab5b18b0c7d:/etc# chown mysql:mysql /var/log/mysqlslow.log
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p><em>Workshop 1: Run some sample queries</em>
|
||||
Run the following
|
||||
<div class="highlight"><pre><span></span><code>$ mysql -uroot -prealsecret mysql
|
||||
Run the following</p>
|
||||
<pre><code>$ mysql -uroot -prealsecret mysql
|
||||
mysql>
|
||||
|
||||
<span class="c1"># inspect DBs and tables</span>
|
||||
<span class="c1"># the last 4 are MySQL internal DBs</span>
|
||||
# inspect DBs and tables
|
||||
# the last 4 are MySQL internal DBs
|
||||
|
||||
mysql> show databases<span class="p">;</span>
|
||||
mysql> show databases;
|
||||
+--------------------+
|
||||
<span class="p">|</span> Database <span class="p">|</span>
|
||||
| Database |
|
||||
+--------------------+
|
||||
<span class="p">|</span> employees <span class="p">|</span>
|
||||
<span class="p">|</span> information_schema <span class="p">|</span>
|
||||
<span class="p">|</span> mysql <span class="p">|</span>
|
||||
<span class="p">|</span> performance_schema <span class="p">|</span>
|
||||
<span class="p">|</span> sys <span class="p">|</span>
|
||||
| employees |
|
||||
| information_schema |
|
||||
| mysql |
|
||||
| performance_schema |
|
||||
| sys |
|
||||
+--------------------+
|
||||
|
||||
> use employees<span class="p">;</span>
|
||||
mysql> show tables<span class="p">;</span>
|
||||
> use employees;
|
||||
mysql> show tables;
|
||||
+----------------------+
|
||||
<span class="p">|</span> Tables_in_employees <span class="p">|</span>
|
||||
| Tables_in_employees |
|
||||
+----------------------+
|
||||
<span class="p">|</span> current_dept_emp <span class="p">|</span>
|
||||
<span class="p">|</span> departments <span class="p">|</span>
|
||||
<span class="p">|</span> dept_emp <span class="p">|</span>
|
||||
<span class="p">|</span> dept_emp_latest_date <span class="p">|</span>
|
||||
<span class="p">|</span> dept_manager <span class="p">|</span>
|
||||
<span class="p">|</span> employees <span class="p">|</span>
|
||||
<span class="p">|</span> salaries <span class="p">|</span>
|
||||
<span class="p">|</span> titles <span class="p">|</span>
|
||||
| current_dept_emp |
|
||||
| departments |
|
||||
| dept_emp |
|
||||
| dept_emp_latest_date |
|
||||
| dept_manager |
|
||||
| employees |
|
||||
| salaries |
|
||||
| titles |
|
||||
+----------------------+
|
||||
|
||||
<span class="c1"># read a few rows</span>
|
||||
mysql> <span class="k">select</span> * from employees limit <span class="m">5</span><span class="p">;</span>
|
||||
# read a few rows
|
||||
mysql> select * from employees limit 5;
|
||||
|
||||
<span class="c1"># filter data by conditions</span>
|
||||
mysql> <span class="k">select</span> count<span class="o">(</span>*<span class="o">)</span> from employees where <span class="nv">gender</span> <span class="o">=</span> <span class="s1">'M'</span> limit <span class="m">5</span><span class="p">;</span>
|
||||
# filter data by conditions
|
||||
mysql> select count(*) from employees where gender = 'M' limit 5;
|
||||
|
||||
<span class="c1"># find count of particular data</span>
|
||||
mysql> <span class="k">select</span> count<span class="o">(</span>*<span class="o">)</span> from employees where <span class="nv">first_name</span> <span class="o">=</span> <span class="s1">'Sachin'</span><span class="p">;</span>
|
||||
</code></pre></div></p>
|
||||
<p><em>Workshop 2: Use explain and explain analyze to profile a query, identify and add indexes required for improving performance</em>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># View all indexes on table </span>
|
||||
<span class="c1"># (\G prints each row vertically; replace it with ; to get table output)</span>
|
||||
mysql> show index from employees from employees<span class="se">\G</span>
|
||||
*************************** <span class="m">1</span>. row ***************************
|
||||
# find count of particular data
|
||||
mysql> select count(*) from employees where first_name = 'Sachin';
|
||||
</code></pre>
|
||||
<p><em>Workshop 2: Use explain and explain analyze to profile a query, identify and add indexes required for improving performance</em></p>
|
||||
<pre><code># View all indexes on table
|
||||
# (\G prints each row vertically; replace it with ; to get table output)
|
||||
mysql> show index from employees from employees\G
|
||||
*************************** 1. row ***************************
|
||||
Table: employees
|
||||
Non_unique: <span class="m">0</span>
|
||||
Non_unique: 0
|
||||
Key_name: PRIMARY
|
||||
Seq_in_index: <span class="m">1</span>
|
||||
Seq_in_index: 1
|
||||
Column_name: emp_no
|
||||
Collation: A
|
||||
Cardinality: <span class="m">299113</span>
|
||||
Cardinality: 299113
|
||||
Sub_part: NULL
|
||||
Packed: NULL
|
||||
Null:
|
||||
@@ -1294,28 +1294,28 @@ Index_comment:
|
||||
Visible: YES
|
||||
Expression: NULL
|
||||
|
||||
<span class="c1"># This query uses an index, identified by the 'key' field</span>
|
||||
<span class="c1"># By prefixing explain keyword to the command, </span>
|
||||
<span class="c1"># we get query plan (including key used)</span>
|
||||
mysql> explain <span class="k">select</span> * from employees where emp_no < <span class="m">10005</span><span class="se">\G</span>
|
||||
*************************** <span class="m">1</span>. row ***************************
|
||||
id: <span class="m">1</span>
|
||||
# This query uses an index, identified by the 'key' field
|
||||
# By prefixing explain keyword to the command,
|
||||
# we get query plan (including key used)
|
||||
mysql> explain select * from employees where emp_no < 10005\G
|
||||
*************************** 1. row ***************************
|
||||
id: 1
|
||||
select_type: SIMPLE
|
||||
table: employees
|
||||
partitions: NULL
|
||||
type: range
|
||||
possible_keys: PRIMARY
|
||||
key: PRIMARY
|
||||
key_len: <span class="m">4</span>
|
||||
key_len: 4
|
||||
ref: NULL
|
||||
rows: <span class="m">4</span>
|
||||
filtered: <span class="m">100</span>.00
|
||||
rows: 4
|
||||
filtered: 100.00
|
||||
Extra: Using where
|
||||
|
||||
<span class="c1"># Compare that to the next query which does not utilize any index</span>
|
||||
mysql> explain <span class="k">select</span> first_name, last_name from employees where <span class="nv">first_name</span> <span class="o">=</span> <span class="s1">'Sachin'</span><span class="se">\G</span>
|
||||
*************************** <span class="m">1</span>. row ***************************
|
||||
id: <span class="m">1</span>
|
||||
# Compare that to the next query which does not utilize any index
|
||||
mysql> explain select first_name, last_name from employees where first_name = 'Sachin'\G
|
||||
*************************** 1. row ***************************
|
||||
id: 1
|
||||
select_type: SIMPLE
|
||||
table: employees
|
||||
partitions: NULL
|
||||
@@ -1324,64 +1324,64 @@ possible_keys: NULL
|
||||
key: NULL
|
||||
key_len: NULL
|
||||
ref: NULL
|
||||
rows: <span class="m">299113</span>
|
||||
filtered: <span class="m">10</span>.00
|
||||
rows: 299113
|
||||
filtered: 10.00
|
||||
Extra: Using where
|
||||
|
||||
<span class="c1"># Let's see how much time this query takes</span>
|
||||
mysql> explain analyze <span class="k">select</span> first_name, last_name from employees where <span class="nv">first_name</span> <span class="o">=</span> <span class="s1">'Sachin'</span><span class="se">\G</span>
|
||||
*************************** <span class="m">1</span>. row ***************************
|
||||
EXPLAIN: -> Filter: <span class="o">(</span>employees.first_name <span class="o">=</span> <span class="s1">'Sachin'</span><span class="o">)</span> <span class="o">(</span><span class="nv">cost</span><span class="o">=</span><span class="m">30143</span>.55 <span class="nv">rows</span><span class="o">=</span><span class="m">29911</span><span class="o">)</span> <span class="o">(</span>actual <span class="nv">time</span><span class="o">=</span><span class="m">28</span>.284..3952.428 <span class="nv">rows</span><span class="o">=</span><span class="m">232</span> <span class="nv">loops</span><span class="o">=</span><span class="m">1</span><span class="o">)</span>
|
||||
-> Table scan on employees <span class="o">(</span><span class="nv">cost</span><span class="o">=</span><span class="m">30143</span>.55 <span class="nv">rows</span><span class="o">=</span><span class="m">299113</span><span class="o">)</span> <span class="o">(</span>actual <span class="nv">time</span><span class="o">=</span><span class="m">0</span>.095..1996.092 <span class="nv">rows</span><span class="o">=</span><span class="m">300024</span> <span class="nv">loops</span><span class="o">=</span><span class="m">1</span><span class="o">)</span>
|
||||
# Let's see how much time this query takes
|
||||
mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin'\G
|
||||
*************************** 1. row ***************************
|
||||
EXPLAIN: -> Filter: (employees.first_name = 'Sachin') (cost=30143.55 rows=29911) (actual time=28.284..3952.428 rows=232 loops=1)
|
||||
-> Table scan on employees (cost=30143.55 rows=299113) (actual time=0.095..1996.092 rows=300024 loops=1)
|
||||
|
||||
|
||||
<span class="c1"># Cost(estimated by query planner) is 30143.55</span>
|
||||
<span class="c1"># actual time=28.284ms for first row, 3952.428ms for all rows</span>
|
||||
<span class="c1"># Now let's try adding an index and running the query again</span>
|
||||
mysql> create index idx_firstname on employees<span class="o">(</span>first_name<span class="o">)</span><span class="p">;</span>
|
||||
Query OK, <span class="m">0</span> rows affected <span class="o">(</span><span class="m">1</span>.25 sec<span class="o">)</span>
|
||||
Records: <span class="m">0</span> Duplicates: <span class="m">0</span> Warnings: <span class="m">0</span>
|
||||
# Cost(estimated by query planner) is 30143.55
|
||||
# actual time=28.284ms for first row, 3952.428ms for all rows
|
||||
# Now let's try adding an index and running the query again
|
||||
mysql> create index idx_firstname on employees(first_name);
|
||||
Query OK, 0 rows affected (1.25 sec)
|
||||
Records: 0 Duplicates: 0 Warnings: 0
|
||||
|
||||
mysql> explain analyze <span class="k">select</span> first_name, last_name from employees where <span class="nv">first_name</span> <span class="o">=</span> <span class="s1">'Sachin'</span><span class="p">;</span>
|
||||
mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin';
|
||||
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||
<span class="p">|</span> EXPLAIN <span class="p">|</span>
|
||||
| EXPLAIN |
|
||||
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||
<span class="p">|</span> -> Index lookup on employees using idx_firstname <span class="o">(</span><span class="nv">first_name</span><span class="o">=</span><span class="s1">'Sachin'</span><span class="o">)</span> <span class="o">(</span><span class="nv">cost</span><span class="o">=</span><span class="m">81</span>.20 <span class="nv">rows</span><span class="o">=</span><span class="m">232</span><span class="o">)</span> <span class="o">(</span>actual <span class="nv">time</span><span class="o">=</span><span class="m">0</span>.551..2.934 <span class="nv">rows</span><span class="o">=</span><span class="m">232</span> <span class="nv">loops</span><span class="o">=</span><span class="m">1</span><span class="o">)</span>
|
||||
<span class="p">|</span>
|
||||
| -> Index lookup on employees using idx_firstname (first_name='Sachin') (cost=81.20 rows=232) (actual time=0.551..2.934 rows=232 loops=1)
|
||||
|
|
||||
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||
<span class="m">1</span> row <span class="k">in</span> <span class="nb">set</span> <span class="o">(</span><span class="m">0</span>.01 sec<span class="o">)</span>
|
||||
1 row in set (0.01 sec)
|
||||
|
||||
<span class="c1"># Actual time=0.551ms for first row</span>
|
||||
<span class="c1"># 2.934ms for all rows. A huge improvement!</span>
|
||||
<span class="c1"># Also notice that the query involves only an index lookup,</span>
|
||||
<span class="c1"># and no table scan (reading all rows of table)</span>
|
||||
<span class="c1"># ..which vastly reduces load on the DB.</span>
|
||||
</code></pre></div></p>
|
||||
<p><em>Workshop 3: Identify slow queries on a MySQL server</em>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># Run the command below in two terminal tabs to open two shells into the container.</span>
|
||||
docker <span class="nb">exec</span> -it <span class="k">$(</span>docker ps -qf <span class="s2">"name=db"</span><span class="k">)</span> bash
|
||||
# Actual time=0.551ms for first row
|
||||
# 2.934ms for all rows. A huge improvement!
|
||||
# Also notice that the query involves only an index lookup,
|
||||
# and no table scan (reading all rows of table)
|
||||
# ..which vastly reduces load on the DB.
|
||||
</code></pre>
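<p>The same before/after comparison can be reproduced without a MySQL server. The sketch below swaps in SQLite (through Python's standard <code>sqlite3</code> module) and its <code>EXPLAIN QUERY PLAN</code> statement, which is not the MySQL <code>explain</code> used above; the table contents and the <code>plan</code> helper are made up for illustration, and the exact wording of the plan text varies between SQLite versions.</p>

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL employees table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees (first_name, last_name) VALUES (?, ?)",
    [(f"name{i % 100}", f"last{i}") for i in range(10_000)],
)

query = "SELECT first_name, last_name FROM employees WHERE first_name = 'name7'"


def plan(sql: str) -> str:
    """Return SQLite's query plan; the last column of each row is the description."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_firstname ON employees(first_name)")
plan_after = plan(query)   # expected to mention idx_firstname

print("before:", plan_before)
print("after: ", plan_after)
```

<p>As in the MySQL workshop, the plan changes from scanning every row to searching through <code>idx_firstname</code> once the index exists.</p>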
|
||||
<p><em>Workshop 3: Identify slow queries on a MySQL server</em></p>
|
||||
<pre><code># Run the command below in two terminal tabs to open two shells into the container.
|
||||
docker exec -it $(docker ps -qf "name=db") bash
|
||||
|
||||
<span class="c1"># Open a mysql prompt in one of them and execute this command</span>
|
||||
<span class="c1"># We have configured MySQL to log queries that take longer than 0.1s (long_query_time=0.1),</span>
|
||||
<span class="c1"># so this sleep(3) will be logged</span>
|
||||
# Open a mysql prompt in one of them and execute this command
|
||||
# We have configured MySQL to log queries that take longer than 0.1s (long_query_time=0.1),
|
||||
# so this sleep(3) will be logged
|
||||
mysql -uroot -prealsecret mysql
|
||||
mysql> <span class="k">select</span> sleep<span class="o">(</span><span class="m">3</span><span class="o">)</span><span class="p">;</span>
|
||||
mysql> select sleep(3);
|
||||
|
||||
<span class="c1"># Now, in the other terminal, tail the slow log to find details about the query</span>
|
||||
# Now, in the other terminal, tail the slow log to find details about the query
|
||||
root@62c92c89234d:/etc# tail -f /var/log/mysqlslow.log
|
||||
/usr/sbin/mysqld, Version: <span class="m">8</span>.0.21 <span class="o">(</span>MySQL Community Server - GPL<span class="o">)</span>. started with:
|
||||
Tcp port: <span class="m">3306</span> Unix socket: /var/run/mysqld/mysqld.sock
|
||||
/usr/sbin/mysqld, Version: 8.0.21 (MySQL Community Server - GPL). started with:
|
||||
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
|
||||
Time Id Command Argument
|
||||
<span class="c1"># Time: 2020-11-26T14:53:44.822348Z</span>
|
||||
<span class="c1"># User@Host: root[root] @ localhost [] Id: 9</span>
|
||||
<span class="c1"># Query_time: 5.404938 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1</span>
|
||||
use employees<span class="p">;</span>
|
||||
<span class="c1"># Time: 2020-11-26T14:53:58.015736Z</span>
|
||||
<span class="c1"># User@Host: root[root] @ localhost [] Id: 9</span>
|
||||
<span class="c1"># Query_time: 10.000225 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1</span>
|
||||
SET <span class="nv">timestamp</span><span class="o">=</span><span class="m">1606402428</span><span class="p">;</span>
|
||||
<span class="k">select</span> sleep<span class="o">(</span><span class="m">3</span><span class="o">)</span><span class="p">;</span>
|
||||
</code></pre></div></p>
|
||||
# Time: 2020-11-26T14:53:44.822348Z
|
||||
# User@Host: root[root] @ localhost [] Id: 9
|
||||
# Query_time: 5.404938 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
|
||||
use employees;
|
||||
# Time: 2020-11-26T14:53:58.015736Z
|
||||
# User@Host: root[root] @ localhost [] Id: 9
|
||||
# Query_time: 10.000225 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
|
||||
SET timestamp=1606402428;
|
||||
select sleep(3);
|
||||
</code></pre>
|
||||
<p>These were simulated examples with minimal complexity. In real life, the queries would be much more complex and the explain/analyze and slow query logs would have more details.</p>
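<p>On a busier server the slow log quickly becomes too long to read by eye. As a rough sketch, the entries can be pulled apart with a few lines of Python. The <code>parse_slow_log</code> helper and the <code>SAMPLE_LOG</code> string (copied from the output above) are illustrative assumptions; the parser relies only on the <code># Query_time:</code> line format shown in this workshop.</p>

```python
import re

# Excerpt of the slow log shown above, embedded so the sketch is self-contained.
SAMPLE_LOG = """\
# Time: 2020-11-26T14:53:44.822348Z
# User@Host: root[root] @ localhost [] Id: 9
# Query_time: 5.404938 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
use employees;
# Time: 2020-11-26T14:53:58.015736Z
# User@Host: root[root] @ localhost [] Id: 9
# Query_time: 10.000225 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
SET timestamp=1606402428;
select sleep(3);
"""

QUERY_TIME = re.compile(r"^# Query_time: (\d+\.\d+)")


def parse_slow_log(text: str) -> list:
    """Group statements under the Query_time header that precedes them."""
    entries = []
    current = None
    for line in text.splitlines():
        match = QUERY_TIME.match(line)
        if match:
            current = {"query_time": float(match.group(1)), "sql": []}
            entries.append(current)
        elif current is None or not line.strip() or line.startswith("#"):
            continue  # other header lines, blank lines, or text before the first entry
        elif not line.startswith("SET timestamp="):
            current["sql"].append(line.strip())
    return entries


entries = parse_slow_log(SAMPLE_LOG)
for entry in sorted(entries, key=lambda e: e["query_time"], reverse=True):
    print(f'{entry["query_time"]:>10.3f}s  {" ".join(entry["sql"])}')
```

<p>For real workloads, MySQL ships the <code>mysqldumpslow</code> tool, which summarizes slow query logs.</p>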
|
||||
|
||||
|
||||
|
||||
@@ -1271,145 +1271,145 @@
|
||||
<p>Coming back to our local repo, which has two commits. So far, what we have is a single line of history: commits are chained one after another. But sometimes you may need to work on two different features in parallel in the same repo. One option would be to make a new folder/repo with the same code and use that for the other feature. But there's a better way: use <em>branches</em>. Since git follows a tree-like structure for commits, we can use branches to work on different sets of features. From a commit, two or more branches can be created, and branches can also be merged.</p>
|
||||
<p>Using branches, multiple lines of history can exist, and we can check out any of them and work on it. Checking out, as we discussed earlier, simply means replacing the contents of the directory (repo) with the snapshot at the checked-out version.</p>
|
||||
<p>Let's create a branch and see what it looks like:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git branch b1
|
||||
<pre><code class="language-bash">$ git branch b1
|
||||
$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> master, b1<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 7f3b00e (HEAD -> master, b1) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>We create a branch called <code>b1</code>. Git log tells us that <code>b1</code> also points to the last commit (7f3b00e), but <code>HEAD</code> is still pointing to master. If you remember, HEAD points to the commit/reference you currently have checked out. So if we check out <code>b1</code>, HEAD should point to that. Let's confirm:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git checkout b1
|
||||
Switched to branch <span class="s1">'b1'</span>
|
||||
<pre><code class="language-bash">$ git checkout b1
|
||||
Switched to branch 'b1'
|
||||
$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> b1, master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 7f3b00e (HEAD -> b1, master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p><code>b1</code> still points to the same commit, but HEAD now points to <code>b1</code>. Since we created the branch at commit <code>7f3b00e</code>, there will be two lines of history starting from this commit. Which line of history progresses depends on the branch you have checked out.</p>
|
||||
<p>At this moment, we are checked out on branch <code>b1</code>, so making a new commit will advance branch reference <code>b1</code> to that commit and current <code>b1</code> commit will become its parent. Let's do that.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># Creating a file and making a commit</span>
|
||||
$ <span class="nb">echo</span> <span class="s2">"I am a file in b1 branch"</span> > b1.txt
|
||||
<pre><code class="language-bash"># Creating a file and making a commit
|
||||
$ echo "I am a file in b1 branch" > b1.txt
|
||||
$ git add b1.txt
|
||||
$ git commit -m <span class="s2">"adding b1 file"</span>
|
||||
<span class="o">[</span>b1 872a38f<span class="o">]</span> adding b1 file
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> b1.txt
|
||||
$ git commit -m "adding b1 file"
|
||||
[b1 872a38f] adding b1 file
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 b1.txt
|
||||
|
||||
<span class="c1"># The new line of history</span>
|
||||
# The new line of history
|
||||
$ git log --oneline --graph
|
||||
* 872a38f <span class="o">(</span>HEAD -> b1<span class="o">)</span> adding b1 file
|
||||
* 7f3b00e <span class="o">(</span>master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
* 872a38f (HEAD -> b1) adding b1 file
|
||||
* 7f3b00e (master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
$
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Do note that master is still pointing to the old commit it was pointing to. We can now check out the master branch and make commits there. This will result in another line of history starting from commit 7f3b00e.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># checkout to master branch</span>
|
||||
<pre><code class="language-bash"># checkout to master branch
|
||||
$ git checkout master
|
||||
Switched to branch <span class="s1">'master'</span>
|
||||
Switched to branch 'master'
|
||||
|
||||
<span class="c1"># Creating a new commit on master branch</span>
|
||||
$ <span class="nb">echo</span> <span class="s2">"new file in master branch"</span> > master.txt
|
||||
# Creating a new commit on master branch
|
||||
$ echo "new file in master branch" > master.txt
|
||||
$ git add master.txt
|
||||
$ git commit -m <span class="s2">"adding master.txt file"</span>
|
||||
<span class="o">[</span>master 60dc441<span class="o">]</span> adding master.txt file
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> master.txt
|
||||
$ git commit -m "adding master.txt file"
|
||||
[master 60dc441] adding master.txt file
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 master.txt
|
||||
|
||||
<span class="c1"># The history line</span>
|
||||
# The history line
|
||||
$ git log --oneline --graph
|
||||
* 60dc441 <span class="o">(</span>HEAD -> master<span class="o">)</span> adding master.txt file
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>Notice how branch b1 is not visible here since we are on master. Let's visualize both to get the whole picture:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git log --oneline --graph --all
|
||||
* 60dc441 <span class="o">(</span>HEAD -> master<span class="o">)</span> adding master.txt file
|
||||
<span class="p">|</span> * 872a38f <span class="o">(</span>b1<span class="o">)</span> adding b1 file
|
||||
<span class="p">|</span>/
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">$ git log --oneline --graph --all
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
| * 872a38f (b1) adding b1 file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>The tree structure above should make things clear. Notice the clear branch/fork at commit 7f3b00e. This is how we create branches. There are now two separate lines of history on which feature development can be done independently.</p>
|
||||
<p><strong>To reiterate, internally, git is just a tree of commits. Branch names (human readable) are pointers to those commits in the tree. We use various git commands to work with the tree structure and references. Git accordingly modifies contents of our repo.</strong></p>
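<p>The "branches are pointers into a tree of commits" idea can be modelled in a few lines of Python. This is a toy model for intuition only, not how git is implemented: the <code>Commit</code> and <code>Repo</code> classes are hypothetical, and the commit IDs are passed in by hand (copied from the logs above), whereas git derives them from the commit contents.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    sha: str
    message: str
    parent: Optional["Commit"] = None  # each commit remembers where it came from


class Repo:
    """Toy model: branches map names to commits, HEAD names the checked-out branch."""

    def __init__(self):
        self.branches = {}
        self.head = "master"

    def commit(self, sha: str, message: str) -> None:
        parent = self.branches.get(self.head)
        # Only the checked-out branch pointer advances to the new commit.
        self.branches[self.head] = Commit(sha, message, parent)

    def branch(self, name: str) -> None:
        # A new branch is just another pointer to the current commit.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name: str) -> None:
        self.head = name

    def log(self, name: str) -> list:
        """Walk parent links from the branch tip back to the first commit."""
        shas, commit = [], self.branches[name]
        while commit is not None:
            shas.append(commit.sha)
            commit = commit.parent
        return shas


repo = Repo()
repo.commit("df2fb7a", "adding file 1")
repo.commit("7f3b00e", "adding file 2")
repo.branch("b1")
repo.checkout("b1")
repo.commit("872a38f", "adding b1 file")
repo.checkout("master")
repo.commit("60dc441", "adding master.txt file")

print(repo.log("master"))  # ['60dc441', '7f3b00e', 'df2fb7a']
print(repo.log("b1"))      # ['872a38f', '7f3b00e', 'df2fb7a']
```

<p>Both branches share the history up to 7f3b00e and diverge after it, which matches the fork shown by <code>git log --oneline --graph --all</code>.</p>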
|
||||
<h2 id="merges">Merges</h2>
|
||||
<p>Now say the feature you were working on in branch <code>b1</code> is complete and you need to merge it into the master branch, where the final version of the code lives. First you check out master and pull the latest code from upstream (e.g. GitHub). Then you merge your code from <code>b1</code> into master. There are two ways to do this.</p>
|
||||
<p>Here is the current history:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git log --oneline --graph --all
|
||||
* 60dc441 <span class="o">(</span>HEAD -> master<span class="o">)</span> adding master.txt file
|
||||
<span class="p">|</span> * 872a38f <span class="o">(</span>b1<span class="o">)</span> adding b1 file
|
||||
<span class="p">|</span>/
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">$ git log --oneline --graph --all
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
| * 872a38f (b1) adding b1 file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p><strong>Option 1: Directly merge the branch.</strong> Merging the branch b1 into master will result in a new merge commit. This will merge changes from two different lines of history and create a new commit of the result.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git merge b1
|
||||
Merge made by the <span class="s1">'recursive'</span> strategy.
|
||||
b1.txt <span class="p">|</span> <span class="m">1</span> +
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> b1.txt
|
||||
<pre><code class="language-bash">$ git merge b1
|
||||
Merge made by the 'recursive' strategy.
|
||||
b1.txt | 1 +
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 b1.txt
|
||||
$ git log --oneline --graph --all
|
||||
* 8fc28f9 <span class="o">(</span>HEAD -> master<span class="o">)</span> Merge branch <span class="s1">'b1'</span>
|
||||
<span class="p">|</span><span class="se">\</span>
|
||||
<span class="p">|</span> * 872a38f <span class="o">(</span>b1<span class="o">)</span> adding b1 file
|
||||
* <span class="p">|</span> 60dc441 adding master.txt file
|
||||
<span class="p">|</span>/
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 8fc28f9 (HEAD -> master) Merge branch 'b1'
|
||||
|\
|
||||
| * 872a38f (b1) adding b1 file
|
||||
* | 60dc441 adding master.txt file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>You can see that a new merge commit (8fc28f9) was created. You will be prompted for the commit message. If there are a lot of branches in the repo, this approach ends up producing a lot of merge commits, which looks ugly compared to a single line of development history. So let's look at an alternative approach.</p>
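<p>What makes a merge commit special is that it has two parents, one per line of history. The sketch below rebuilds a similar history in a throwaway repository (file contents and the <code>demo</code> identity are placeholders, so the commit IDs will differ from the ones above) and counts the <code>parent</code> lines of the merge commit.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt && git add file1.txt && git commit -q -m "adding file 1"

# One commit on b1, one on master, so the two histories diverge
git checkout -q -b b1
echo "b1 file" > b1.txt && git add b1.txt && git commit -q -m "adding b1 file"
git checkout -q master
echo "master file" > master.txt && git add master.txt && git commit -q -m "adding master.txt file"

# --no-edit accepts the default merge message instead of opening an editor
git merge -q --no-edit b1

# A merge commit lists one "parent" line per merged line of history
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $parent_count parents"
```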
|
||||
<p>First let's <a href="https://git-scm.com/docs/git-reset">reset</a> our last merge and go to the previous state.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git reset --hard 60dc441
|
||||
<pre><code class="language-bash">$ git reset --hard 60dc441
|
||||
HEAD is now at 60dc441 adding master.txt file
|
||||
$ git log --oneline --graph --all
|
||||
* 60dc441 <span class="o">(</span>HEAD -> master<span class="o">)</span> adding master.txt file
|
||||
<span class="p">|</span> * 872a38f <span class="o">(</span>b1<span class="o">)</span> adding b1 file
|
||||
<span class="p">|</span>/
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
| * 872a38f (b1) adding b1 file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p><strong>Option 2: Rebase.</strong> Now, instead of merging two branches which have a common base (commit: 7f3b00e), let us rebase branch b1 onto the current master. <strong>What this means is: take branch <code>b1</code> (from commit 7f3b00e to commit 872a38f) and rebase it (put its commits on top of) master (60dc441).</strong></p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># Switch to b1</span>
|
||||
<pre><code class="language-bash"># Switch to b1
|
||||
$ git checkout b1
|
||||
Switched to branch <span class="s1">'b1'</span>
|
||||
Switched to branch 'b1'
|
||||
|
||||
<span class="c1"># Rebase (b1 which is current branch) on master</span>
|
||||
# Rebase (b1 which is current branch) on master
|
||||
$ git rebase master
|
||||
First, rewinding head to replay your work on top of it...
|
||||
Applying: adding b1 file
|
||||
|
||||
<span class="c1"># The result</span>
|
||||
# The result
|
||||
$ git log --oneline --graph --all
|
||||
* 5372c8f <span class="o">(</span>HEAD -> b1<span class="o">)</span> adding b1 file
|
||||
* 60dc441 <span class="o">(</span>master<span class="o">)</span> adding master.txt file
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 5372c8f (HEAD -> b1) adding b1 file
|
||||
* 60dc441 (master) adding master.txt file
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>You can see that <code>b1</code> had one commit, whose parent was <code>7f3b00e</code>. Since we rebased it onto master (<code>60dc441</code>), that commit is its parent now. Note that the rebased commit also got a new ID (<code>5372c8f</code> instead of <code>872a38f</code>), because its parent changed. As a side effect, the history has become a single line. Now if we were to merge <code>b1</code> into <code>master</code>, it would simply mean changing <code>master</code> to point to <code>5372c8f</code>, which is <code>b1</code>. Let's try it:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># checkout to master since we want to merge code into master</span>
|
||||
<pre><code class="language-bash"># checkout to master since we want to merge code into master
|
||||
$ git checkout master
|
||||
Switched to branch <span class="s1">'master'</span>
|
||||
Switched to branch 'master'
|
||||
|
||||
<span class="c1"># the current history, where b1 is based on master</span>
|
||||
# the current history, where b1 is based on master
|
||||
$ git log --oneline --graph --all
|
||||
* 5372c8f <span class="o">(</span>b1<span class="o">)</span> adding b1 file
|
||||
* 60dc441 <span class="o">(</span>HEAD -> master<span class="o">)</span> adding master.txt file
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
* 5372c8f (b1) adding b1 file
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
|
||||
|
||||
<span class="c1"># Performing the merge, notice the "fast-forward" message</span>
|
||||
# Performing the merge, notice the "fast-forward" message
|
||||
$ git merge b1
|
||||
Updating 60dc441..5372c8f
|
||||
Fast-forward
|
||||
b1.txt <span class="p">|</span> <span class="m">1</span> +
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> b1.txt
|
||||
b1.txt | 1 +
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 b1.txt
|
||||
|
||||
<span class="c1"># The Result</span>
|
||||
# The Result
|
||||
$ git log --oneline --graph --all
|
||||
* 5372c8f <span class="o">(</span>HEAD -> master, b1<span class="o">)</span> adding b1 file
|
||||
* 5372c8f (HEAD -> master, b1) adding b1 file
|
||||
* 60dc441 adding master.txt file
|
||||
* 7f3b00e adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>Now you can see that both <code>b1</code> and <code>master</code> point to the same commit. Your code has been merged into the master branch and it can be pushed. We also have a clean line of history! :D</p>
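<p>The rebase-then-merge flow can be condensed into a short script. This is a sketch under assumptions (throwaway repository, placeholder file contents and identity); it uses <code>git merge --ff-only</code>, which refuses to create a merge commit and therefore fails loudly if the branch was not rebased first.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt && git add file1.txt && git commit -q -m "adding file 1"
git checkout -q -b b1
echo "b1 file" > b1.txt && git add b1.txt && git commit -q -m "adding b1 file"
git checkout -q master
echo "master file" > master.txt && git add master.txt && git commit -q -m "adding master.txt file"

# Replay b1's commit on top of master, then fast-forward master to it
git checkout -q b1
git rebase -q master
git checkout -q master
git merge -q --ff-only b1

git log --oneline --graph --all
```

<p>The resulting log is a single line of three commits, with <code>master</code> and <code>b1</code> on the same commit.</p>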
|
||||
|
||||
|
||||
|
||||
@@ -1487,162 +1487,162 @@
|
||||
<p>Though you might be aware already, let's revisit why we need a version control system. As the project grows and multiple developers start working on it, an efficient method for collaboration is warranted. Git helps the team collaborate easily and also maintains the history of the changes happening with the codebase.</p>
|
||||
<h3 id="creating-a-git-repo">Creating a Git Repo</h3>
|
||||
<p>Any folder can be converted into a git repository. After executing the following command, we will see a <code>.git</code> folder within the folder, which makes our folder a git repository. <strong>The <code>.git</code> folder is what enables all the magic that git does.</strong></p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># creating an empty folder and changing current dir to it</span>
|
||||
$ <span class="nb">cd</span> /tmp
|
||||
<pre><code class="language-bash"># creating an empty folder and changing current dir to it
|
||||
$ cd /tmp
|
||||
$ mkdir school-of-sre
|
||||
$ <span class="nb">cd</span> school-of-sre/
|
||||
$ cd school-of-sre/
|
||||
|
||||
<span class="c1"># initialize a git repo</span>
|
||||
# initialize a git repo
|
||||
$ git init
|
||||
Initialized empty Git repository <span class="k">in</span> /private/tmp/school-of-sre/.git/
|
||||
</code></pre></div>
|
||||
Initialized empty Git repository in /private/tmp/school-of-sre/.git/
|
||||
</code></pre>
|
||||
<p>As the output says, an empty git repo has been initialized in our folder. Let's take a look at what is there.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ ls .git/
|
||||
<pre><code class="language-bash">$ ls .git/
|
||||
HEAD config description hooks info objects refs
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>There are a bunch of folders and files in the <code>.git</code> folder. As I said, all of these enable git to do its magic. We will look into some of these folders and files. But for now, what we have is an empty git repository.</p>
|
||||
<h3 id="tracking-a-file">Tracking a File</h3>
|
||||
<p>Now, as you might already know, let us create a new file in our repo (we will refer to the folder as <em>repo</em> from now on) and check the git status.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> <span class="s2">"I am file 1"</span> > file1.txt
|
||||
<pre><code class="language-bash">$ echo "I am file 1" > file1.txt
|
||||
$ git status
|
||||
On branch master
|
||||
|
||||
No commits yet
|
||||
|
||||
Untracked files:
|
||||
<span class="o">(</span>use <span class="s2">"git add <file>..."</span> to include <span class="k">in</span> what will be committed<span class="o">)</span>
|
||||
(use "git add <file>..." to include in what will be committed)
|
||||
|
||||
file1.txt
|
||||
|
||||
nothing added to commit but untracked files present <span class="o">(</span>use <span class="s2">"git add"</span> to track<span class="o">)</span>
|
||||
</code></pre></div>
|
||||
nothing added to commit but untracked files present (use "git add" to track)
|
||||
</code></pre>
|
||||
<p>The current git status says <code>No commits yet</code> and there is one untracked file. Since we just created the file, git is not tracking it. We need to explicitly ask git to track files and folders (also check out <a href="https://git-scm.com/docs/gitignore">gitignore</a>). We do that with the <code>git add</code> command, as suggested in the above output. Then we go ahead and create a commit.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git add file1.txt
|
||||
<pre><code class="language-bash">$ git add file1.txt
|
||||
$ git status
|
||||
On branch master
|
||||
|
||||
No commits yet
|
||||
|
||||
Changes to be committed:
|
||||
<span class="o">(</span>use <span class="s2">"git rm --cached <file>..."</span> to unstage<span class="o">)</span>
|
||||
(use "git rm --cached <file>..." to unstage)
|
||||
|
||||
new file: file1.txt
|
||||
|
||||
$ git commit -m <span class="s2">"adding file 1"</span>
|
||||
<span class="o">[</span>master <span class="o">(</span>root-commit<span class="o">)</span> df2fb7a<span class="o">]</span> adding file <span class="m">1</span>
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> file1.txt
|
||||
</code></pre></div>
|
||||
$ git commit -m "adding file 1"
|
||||
[master (root-commit) df2fb7a] adding file 1
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 file1.txt
|
||||
</code></pre>
|
||||
<p>Notice how, after adding the file, git status says <code>Changes to be committed:</code>. This means that whatever is listed there will be included in the next commit. Then we go ahead and create a commit, with a message attached via <code>-m</code>.</p>
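<p>The untracked and staged states are also visible in git's machine-readable status output. The sketch below, run in a throwaway repository (an assumption made only for illustration), uses <code>git status --porcelain</code>, where <code>??</code> marks an untracked file and <code>A</code> marks a file added to the next commit.</p>

```shell
# Throwaway repo in a temp dir, used only for illustration
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "I am file 1" > file1.txt
before=$(git status --porcelain)   # file exists but git is not tracking it yet
git add file1.txt
after=$(git status --porcelain)    # file is now staged for the next commit

echo "before add: $before"
echo "after add:  $after"
```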
|
||||
<h3 id="more-about-a-commit">More About a Commit</h3>
|
||||
<p>A commit is a snapshot of the repo. Whenever a commit is made, a snapshot of the current state of the repo (the folder) is taken and saved. Each commit has a unique ID (<code>df2fb7a</code> for the commit we made in the previous step). As we keep adding/changing more and more contents and keep making commits, all those snapshots are stored by git. Again, all this magic happens inside the <code>.git</code> folder. This is where all these snapshots or versions are stored <em>in an efficient manner.</em></p>
|
||||
<h3 id="adding-more-changes">Adding More Changes</h3>
|
||||
<p>Let us create one more file and commit the change. It would look the same as the previous commit we made.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> <span class="s2">"I am file 2"</span> > file2.txt
|
||||
<pre><code class="language-bash">$ echo "I am file 2" > file2.txt
|
||||
$ git add file2.txt
|
||||
$ git commit -m <span class="s2">"adding file 2"</span>
|
||||
<span class="o">[</span>master 7f3b00e<span class="o">]</span> adding file <span class="m">2</span>
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> file2.txt
|
||||
</code></pre></div>
|
||||
$ git commit -m "adding file 2"
|
||||
[master 7f3b00e] adding file 2
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 file2.txt
|
||||
</code></pre>
|
||||
<p>A new commit with ID <code>7f3b00e</code> has been created. You can issue <code>git status</code> at any time to see the state of the repository.</p>
|
||||
<div class="highlight"><pre><span></span><code> **IMPORTANT: Note that commit IDs are long strings (SHA hashes), but we can also refer to a commit by its first few characters (7 in the examples here), as long as they are unambiguous. We will use shorter and longer commit IDs interchangeably.**
|
||||
</code></pre></div>
|
||||
<pre><code> **IMPORTANT: Note that commit IDs are long strings (SHA hashes), but we can also refer to a commit by its first few characters (7 in the examples here), as long as they are unambiguous. We will use shorter and longer commit IDs interchangeably.**
|
||||
</code></pre>
|
||||
<p>Now that we have two commits, let's visualize them:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">$ git log --oneline --graph
|
||||
* 7f3b00e (HEAD -> master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p><code>git log</code>, as the name suggests, prints the log of all the git commits. Here you see two additional arguments: <code>--oneline</code> prints the shorter version of the log, i.e. only the abbreviated commit ID and the commit message, and not the person who made the commit or when. <code>--graph</code> prints it in graph format.</p>
|
||||
<p><strong>At this moment the commits might look like just one per line, but all commits are stored internally by git as a tree-like data structure. That means a given commit can have two or more child commits, not just a single line of commits. We will look more into this part when we get to the Branches section. For now this is our commit history:</strong></p>
|
||||
<div class="highlight"><pre><span></span><code> <span class="nv">df2fb7a</span> <span class="o">===</span>> 7f3b00e
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash"> df2fb7a ===> 7f3b00e
|
||||
</code></pre>
|
||||
<h3 id="are-commits-really-linked">Are commits really linked?</h3>
|
||||
<p>As I just said, the two commits we just made are linked via a tree-like data structure, and we saw how they are linked. But let's actually verify it. Everything in git is an object. Newly created files are stored as objects. Changes to files are stored as objects, and even commits are objects. To view the contents of an object, we can use the following command with the object's ID. We will take a look at the contents of the second commit.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git cat-file -p 7f3b00e
|
||||
<pre><code class="language-bash">$ git cat-file -p 7f3b00e
|
||||
tree ebf3af44d253e5328340026e45a9fa9ae3ea1982
|
||||
parent df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a
|
||||
author Sanket Patel <spatel1@linkedin.com> <span class="m">1603273316</span> -0700
|
||||
committer Sanket Patel <spatel1@linkedin.com> <span class="m">1603273316</span> -0700
|
||||
author Sanket Patel <spatel1@linkedin.com> 1603273316 -0700
|
||||
committer Sanket Patel <spatel1@linkedin.com> 1603273316 -0700
|
||||
|
||||
adding file <span class="m">2</span>
|
||||
</code></pre></div>
|
||||
adding file 2
|
||||
</code></pre>
|
||||
<p>Take note of the <code>parent</code> attribute in the above output. It points to the commit ID of the first commit we made. So this proves that they are linked! Additionally, you can see the second commit's message in this object. As I said, all this magic is enabled by the <code>.git</code> folder, and the object we are looking at is also in that folder.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ ls .git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
|
||||
<pre><code class="language-bash">$ ls .git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
|
||||
.git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>It is stored in <code>.git/objects/</code> folder. All the files and changes to them as well are stored in this folder.</p>
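<p>To see that files, folder listings and commits really are separate objects, <code>git cat-file -t</code> can be used to print an object's type. The following is a minimal sketch in a throwaway repository with a placeholder identity (both assumptions); <code>HEAD^{tree}</code> and <code>HEAD:file1.txt</code> are standard revision syntax for "the tree of HEAD" and "file1.txt as stored in HEAD".</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"

commit_type=$(git cat-file -t HEAD)              # the commit itself
tree_type=$(git cat-file -t 'HEAD^{tree}')       # the folder listing it points to
blob_type=$(git cat-file -t HEAD:file1.txt)      # the stored file contents
blob_content=$(git cat-file -p HEAD:file1.txt)   # -p pretty-prints the object

echo "$commit_type / $tree_type / $blob_type"
echo "$blob_content"
```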
|
||||
<h3 id="the-version-control-part-of-git">The Version Control part of Git</h3>
|
||||
<p>We can already see two commits (versions) in our git log. One thing a version control tool gives you is the ability to browse back and forth in history. For example: some of your users are running an old version of the code and they are reporting an issue. In order to debug the issue, you need access to the old code. The one in your current repo is the latest code. In this example, you are working on the second commit (7f3b00e) and someone reported an issue with the code snapshot at commit (df2fb7a). This is how you would get access to the code at any older commit:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># Current contents, two files present</span>
|
||||
<pre><code class="language-bash"># Current contents, two files present
|
||||
$ ls
|
||||
file1.txt file2.txt
|
||||
|
||||
<span class="c1"># checking out to (an older) commit</span>
|
||||
# checking out to (an older) commit
|
||||
$ git checkout df2fb7a
|
||||
Note: checking out <span class="s1">'df2fb7a'</span>.
|
||||
Note: checking out 'df2fb7a'.
|
||||
|
||||
You are <span class="k">in</span> <span class="s1">'detached HEAD'</span> state. You can look around, make experimental
|
||||
changes and commit them, and you can discard any commits you make <span class="k">in</span> this
|
||||
You are in 'detached HEAD' state. You can look around, make experimental
|
||||
changes and commit them, and you can discard any commits you make in this
|
||||
state without impacting any branches by performing another checkout.
|
||||
|
||||
If you want to create a new branch to retain commits you create, you may
|
||||
<span class="k">do</span> so <span class="o">(</span>now or later<span class="o">)</span> by using -b with the checkout <span class="nb">command</span> again. Example:
|
||||
do so (now or later) by using -b with the checkout command again. Example:
|
||||
|
||||
git checkout -b <new-branch-name>
|
||||
|
||||
HEAD is now at df2fb7a adding file <span class="m">1</span>
|
||||
HEAD is now at df2fb7a adding file 1
|
||||
|
||||
<span class="c1"># checking contents, can verify it has old contents</span>
|
||||
# checking contents, can verify it has old contents
|
||||
$ ls
|
||||
file1.txt
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>So this is how we would get access to old versions/snapshots. All we need is a <em>reference</em> to that snapshot. Upon executing <code>git checkout ...</code>, what git does for you is use the <code>.git</code> folder, see what was the state of things (files and folders) at that version/reference and replace the contents of current directory with those contents. The then-existing content will no longer be present in the local dir (repo) but we can and will still get access to them because they are tracked via git commit and <code>.git</code> folder has them stored/tracked.</p>
|
||||
<h3 id="reference">Reference</h3>
|
||||
<p>I mentioned in the previous section that we need a <em>reference</em> to the version. A git repo is made of a tree of commits, and each commit has a unique ID. But the unique ID is not the only way to reference a commit. There are multiple ways to reference commits. For example: <code>HEAD</code> is a reference to the current commit. <em>Whatever commit your repo is checked out at, <code>HEAD</code> will point to that.</em> <code>HEAD~1</code> is a reference to the previous commit. So while checking out the previous version in the section above, we could have done <code>git checkout HEAD~1</code>.</p>
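<p>As a quick check of the <code>HEAD~1</code> reference, here is a sketch that builds the same two-commit history in a throwaway repository (the commit IDs will differ from the ones in this section, and the <code>demo</code> identity is a placeholder) and checks out the previous commit by reference instead of by ID.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt && git add file1.txt && git commit -q -m "adding file 1"
echo "I am file 2" > file2.txt && git add file2.txt && git commit -q -m "adding file 2"

# Resolve the reference to a commit ID, then check it out (detached HEAD)
first_commit=$(git rev-parse HEAD~1)
git checkout -q HEAD~1
current_commit=$(git rev-parse HEAD)

echo "HEAD~1 resolved to $first_commit"
ls   # only file1.txt is present at this snapshot
```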
|
||||
<p>Similarly, master is also a reference (to a branch). Since git uses a tree-like structure to store commits, there will of course be branches. The default branch is called <code>master</code>. Master (or any branch reference) will point to the latest commit in the branch. Even though we have checked out the previous commit in our repo, <code>master</code> still points to the latest commit. We can get back to the latest version by checking out the <code>master</code> reference:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git checkout master
|
||||
Previous HEAD position was df2fb7a adding file <span class="m">1</span>
|
||||
Switched to branch <span class="s1">'master'</span>
|
||||
<pre><code class="language-bash">$ git checkout master
|
||||
Previous HEAD position was df2fb7a adding file 1
|
||||
Switched to branch 'master'
|
||||
|
||||
<span class="c1"># now we will see latest code, with two files</span>
|
||||
# now we will see latest code, with two files
|
||||
$ ls
|
||||
file1.txt file2.txt
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Note: instead of <code>master</code> in the above command, we could have used the commit's ID as well.</p>
|
||||
<h3 id="references-and-the-magic">References and The Magic</h3>
|
||||
<p>Let's look at the state of things. There are two commits, and both the <code>master</code> and <code>HEAD</code> references point to the latest commit.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">$ git log --oneline --graph
|
||||
* 7f3b00e (HEAD -> master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>The magic? Let's examine these files:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ cat .git/refs/heads/master
|
||||
<pre><code class="language-bash">$ cat .git/refs/heads/master
|
||||
7f3b00eaa957815884198e2fdfec29361108d6a9
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Voila! Where master is pointing to is stored in a file. <strong>Whenever git needs to know where the master reference is pointing, it reads the file above, and whenever git needs to update where master points, it just needs to update that file.</strong> So when you create a new commit, a new commit is created on top of the current commit and the master file is updated with the new commit's ID.</p>
|
||||
<p>Similarly, for the <code>HEAD</code> reference:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ cat .git/HEAD
|
||||
<pre><code class="language-bash">$ cat .git/HEAD
|
||||
ref: refs/heads/master
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>We can see <code>HEAD</code> is pointing to a reference called <code>refs/heads/master</code>. So <code>HEAD</code> will point wherever <code>master</code> points.</p>
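<p>The same information can be read through git commands rather than by opening the files. A minimal sketch, assuming a throwaway repository with a single commit and a placeholder identity: <code>git symbolic-ref HEAD</code> prints the reference that <code>HEAD</code> points to, and <code>git rev-parse</code> resolves it to a commit ID.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt && git add file1.txt && git commit -q -m "adding file 1"

head_target=$(git symbolic-ref HEAD)          # which reference HEAD points to
head_commit=$(git rev-parse HEAD)             # the commit that reference resolves to
file_commit=$(cat .git/refs/heads/master)     # the same ID, read straight from the file

echo "HEAD -> $head_target -> $head_commit"
```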
|
||||
<h3 id="little-adventure">Little Adventure</h3>
|
||||
<p>We discussed how git will update the files as we execute commands. But let's try to do it ourselves, by hand, and see what happens.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">$ git log --oneline --graph
|
||||
* 7f3b00e (HEAD -> master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>Now let's change master to point to the previous/first commit.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a > .git/refs/heads/master
|
||||
<pre><code class="language-bash">$ echo df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a > .git/refs/heads/master
|
||||
$ git log --oneline --graph
|
||||
* df2fb7a <span class="o">(</span>HEAD -> master<span class="o">)</span> adding file <span class="m">1</span>
|
||||
* df2fb7a (HEAD -> master) adding file 1
|
||||
|
||||
<span class="c1"># RESETTING TO ORIGINAL</span>
|
||||
$ <span class="nb">echo</span> 7f3b00eaa957815884198e2fdfec29361108d6a9 > .git/refs/heads/master
|
||||
# RESETTING TO ORIGINAL
|
||||
$ echo 7f3b00eaa957815884198e2fdfec29361108d6a9 > .git/refs/heads/master
|
||||
$ git log --oneline --graph
|
||||
* 7f3b00e <span class="o">(</span>HEAD -> master<span class="o">)</span> adding file <span class="m">2</span>
|
||||
* df2fb7a adding file <span class="m">1</span>
|
||||
</code></pre></div>
|
||||
* 7f3b00e (HEAD -> master) adding file 2
|
||||
* df2fb7a adding file 1
|
||||
</code></pre>
|
||||
<p>We just edited the <code>master</code> reference file, and now we can see only the first commit in git log. Undoing the change to the file brings the state back to the original. Not so much magic, is it?</p>
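<p>Editing the file by hand is fine for an experiment, but git also ships a plumbing command for the same job, <code>git update-ref</code>. The sketch below repeats the adventure with it in a throwaway repository (placeholder identity, different commit IDs than above). Like the manual edit, it only moves the pointer and leaves the working directory untouched.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "I am file 1" > file1.txt && git add file1.txt && git commit -q -m "adding file 1"
echo "I am file 2" > file2.txt && git add file2.txt && git commit -q -m "adding file 2"

first=$(git rev-parse HEAD~1)
second=$(git rev-parse HEAD)

# Point master at the first commit, look at the log, then restore it
git update-ref refs/heads/master "$first"
moved=$(git rev-parse master)
git log --oneline --graph

git update-ref refs/heads/master "$second"
restored=$(git rev-parse master)
git log --oneline --graph
```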
|
||||
|
||||
|
||||
|
||||
@@ -1281,23 +1281,23 @@
|
||||
</ul>
|
||||
<h2 id="hooks">Hooks</h2>
|
||||
<p>Git has another nice feature called hooks. Hooks are basically scripts which will be called when a certain event happens. Here is where hooks are located:</p>
|
||||
<div class="highlight"><pre><span></span><code>$ ls .git/hooks/
|
||||
<pre><code class="language-bash">$ ls .git/hooks/
|
||||
applypatch-msg.sample fsmonitor-watchman.sample pre-applypatch.sample pre-push.sample pre-receive.sample update.sample
|
||||
commit-msg.sample post-update.sample pre-commit.sample pre-rebase.sample prepare-commit-msg.sample
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>The names are self-explanatory. These hooks are useful when you want to do certain things when a certain event happens. If you want to run tests before pushing code, you would want to set up a <code>pre-push</code> hook. Let's try to create a pre-commit hook.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> <span class="s2">"echo this is from pre commit hook"</span> > .git/hooks/pre-commit
|
||||
<pre><code class="language-bash">$ echo "echo this is from pre commit hook" > .git/hooks/pre-commit
|
||||
$ chmod +x .git/hooks/pre-commit
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>We basically create a file called <code>pre-commit</code> in hooks folder and make it executable. Now if we make a commit, we should see the message getting printed.</p>
|
||||
<div class="highlight"><pre><span></span><code>$ <span class="nb">echo</span> <span class="s2">"sample file"</span> > sample.txt
|
||||
<pre><code class="language-bash">$ echo "sample file" > sample.txt
|
||||
$ git add sample.txt
|
||||
$ git commit -m <span class="s2">"adding sample file"</span>
|
||||
this is from pre commit hook <span class="c1"># <===== THE MESSAGE FROM HOOK EXECUTION</span>
|
||||
<span class="o">[</span>master 9894e05<span class="o">]</span> adding sample file
|
||||
<span class="m">1</span> file changed, <span class="m">1</span> insertion<span class="o">(</span>+<span class="o">)</span>
|
||||
create mode <span class="m">100644</span> sample.txt
|
||||
</code></pre></div>
|
||||
$ git commit -m "adding sample file"
|
||||
this is from pre commit hook # <===== THE MESSAGE FROM HOOK EXECUTION
|
||||
[master 9894e05] adding sample file
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 sample.txt
|
||||
</code></pre>
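<p>A hook can also stop the event it is attached to: if a <code>pre-commit</code> hook exits with a non-zero status, git aborts the commit. The following is a sketch under assumptions (throwaway repository, placeholder identity, and a made-up rule that refuses to commit while a file named <code>BLOCK</code> exists), not a recommended policy.</p>

```shell
# Throwaway repo in a temp dir; the identity below is a placeholder (assumption)
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical rule: refuse to commit while a file named BLOCK exists
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
if [ -e BLOCK ]; then
    echo "pre-commit: remove the BLOCK file before committing" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

echo "sample file" > sample.txt
git add sample.txt

# First attempt: the hook exits non-zero, so git aborts the commit
touch BLOCK
if git commit -q -m "adding sample file"; then first_try=committed; else first_try=rejected; fi

# Second attempt: the condition is gone, so the commit goes through
rm BLOCK
if git commit -q -m "adding sample file"; then second_try=committed; else second_try=rejected; fi

echo "first attempt: $first_try, second attempt: $second_try"
```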
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -1740,15 +1740,15 @@ online bash shell.</p>
|
||||
This command is very useful for many other purposes but we will discuss
|
||||
the simplest use case of creating a new file.</p>
|
||||
<p>General syntax of using touch command</p>
|
||||
<div class="highlight"><pre><span></span><code>touch <file_name>
|
||||
</code></pre></div>
|
||||
<pre><code>touch <file_name>
|
||||
</code></pre>
|
||||
<p><img alt="" src="../images/linux/commands/image9.png" /></p>
|
||||
<h3 id="mkdir-create-new-directories">mkdir (create new directories)</h3>
|
||||
<p>The mkdir command is used to create directories. You can use the ls command
|
||||
to verify that the new directory is created.</p>
|
||||
<p>General syntax of using mkdir command</p>
|
||||
<div class="highlight"><pre><span></span><code>mkdir <directory_name>
|
||||
</code></pre></div>
|
||||
<pre><code>mkdir <directory_name>
|
||||
</code></pre>
|
||||
<p><img alt="" src="../images/linux/commands/image11.png" /></p>
|
||||
<h3 id="rm-delete-files-and-directories">rm (delete files and directories)</h3>
|
||||
<p>The rm command can be used to delete files and directories. It is very
|
||||
@@ -1757,8 +1757,8 @@ directories. It's almost impossible to recover these files and
|
||||
directories once you have executed rm command on them successfully. Do
|
||||
run this command with care.</p>
|
||||
<p>General syntax of using rm command:</p>
|
||||
<div class="highlight"><pre><span></span><code>rm <file_name>
|
||||
</code></pre></div>
|
||||
<pre><code>rm <file_name>
|
||||
</code></pre>
|
||||
<p>Let's try to understand the rm command with an example. We will try to
|
||||
delete the file and directory we created using touch and mkdir command
|
||||
respectively.</p>
|
||||
@@ -1769,8 +1769,8 @@ to another. Do note that the cp command doesn't do any change to the
|
||||
original files or directories. The original files or directories and
|
||||
their copy both co-exist after running cp command successfully.</p>
|
||||
<p>General syntax of using cp command:</p>
|
||||
<div class="highlight"><pre><span></span><code>cp <source_path> <destination_path>
|
||||
</code></pre></div>
|
||||
<pre><code>cp <source_path> <destination_path>
|
||||
</code></pre>
|
||||
<p>We are currently in the '/home/runner' directory. We will use the mkdir
|
||||
command to create a new directory named "test_directory". We will now
|
||||
try to copy the "_test_runner.py" file to the directory we created just
|
||||
@@ -1792,8 +1792,8 @@ location to another or it can be used to rename files or directories. Do
|
||||
note that moving files and copying them are very different. When you
|
||||
move the files or directories, the original copy is lost.</p>
|
||||
<p>General syntax of using mv command:</p>
|
||||
<div class="highlight"><pre><span></span><code>mv <source_path> <destination_path>
|
||||
</code></pre></div>
|
||||
<pre><code>mv <source_path> <destination_path>
|
||||
</code></pre>
|
||||
<p>In this example, we will use the mv command to move the
|
||||
"_test_runner.py" file to "test_directory". In this case, this file
|
||||
already exists in "test_directory". The mv command will just replace it.
|
||||
@@ -1924,8 +1924,8 @@ words in a text file. It will display all the lines in a file that
|
||||
contains a particular input. The word we want to search is provided as
|
||||
an input to the grep command.</p>
|
||||
<p>General syntax of using grep command:</p>
|
||||
<div class="highlight"><pre><span></span><code>grep <word_to_search> <file_name>
|
||||
</code></pre></div>
|
||||
<pre><code>grep <word_to_search> <file_name>
|
||||
</code></pre>
|
||||
<p>In this example, we are trying to search for a string "1" in this file.
|
||||
The grep command outputs the lines where it found this string.</p>
|
||||
<p><img alt="" src="../images/linux/commands/image5.png" /></p>
|
||||
@@ -1933,8 +1933,8 @@ The grep command outputs the lines where it found this string.</p>
|
||||
<p>The sed command in its simplest form can be used to replace a text in a
|
||||
file.</p>
|
||||
<p>General syntax of using the sed command for replacement:</p>
|
||||
<div class="highlight"><pre><span></span><code>sed <span class="s1">'s/<text_to_replace>/<replacement_text>/'</span> <file_name>
|
||||
</code></pre></div>
|
||||
<pre><code>sed 's/<text_to_replace>/<replacement_text>/' <file_name>
|
||||
</code></pre>
|
||||
<p>Let's try to replace each occurrence of "1" in the file with "3" using
|
||||
sed command.</p>
|
||||
<p><img alt="" src="../images/linux/commands/image31.png" /></p>
|
||||
|
||||
@@ -1269,63 +1269,64 @@
|
||||
|
||||
<h1 id="dns">DNS</h1>
|
||||
<p>Domain names are simple, human-readable names for websites. The Internet understands only IP addresses, but since memorizing incoherent numbers is not practical, domain names are used instead. These domain names are translated into IP addresses by the DNS infrastructure. When somebody tries to open <a href="https://www.linkedin.com">www.linkedin.com</a> in the browser, the browser tries to convert <a href="https://www.linkedin.com">www.linkedin.com</a> to an IP address. This process is called DNS resolution. A simple pseudocode depicting this process looks like this:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="n">ip</span><span class="p">,</span> <span class="n">err</span> <span class="o">=</span> <span class="n">getIPAddress</span><span class="p">(</span><span class="n">domainName</span><span class="p">)</span>
|
||||
<span class="k">if</span> <span class="n">err</span><span class="p">:</span>
|
||||
<span class="nb">print</span><span class="p">(</span><span class="err">“</span><span class="n">unknown</span> <span class="n">Host</span> <span class="ne">Exception</span> <span class="k">while</span> <span class="n">trying</span> <span class="n">to</span> <span class="n">resolve</span><span class="p">:</span><span class="o">%</span><span class="n">s</span><span class="err">”</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">domainName</span><span class="p">))</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-python">ip, err = getIPAddress(domainName)
|
||||
if err:
|
||||
    print("Unknown host exception while trying to resolve: {}".format(domainName))
|
||||
</code></pre>
|
||||
<p>Now let’s try to understand what happens inside the getIPAddress function. The browser has a DNS cache of its own, where it checks whether a mapping from the domainName to an IP address is already available, in which case the browser uses that IP address. If no such mapping exists, the browser calls the gethostbyname library function to ask the operating system to find the IP address for the given domainName.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">getIPAddress</span><span class="p">(</span><span class="n">domainName</span><span class="p">):</span>
|
||||
<span class="n">resp</span><span class="p">,</span> <span class="n">fail</span> <span class="o">=</span> <span class="n">lookupCache</span><span class="p">(</span><span class="n">domainName</span><span class="p">)</span>
|
||||
<span class="n">If</span> <span class="ow">not</span> <span class="n">fail</span><span class="p">:</span>
|
||||
<span class="k">return</span> <span class="n">resp</span>
|
||||
<span class="k">else</span><span class="p">:</span>
|
||||
<span class="n">resp</span><span class="p">,</span> <span class="n">err</span> <span class="o">=</span> <span class="n">gethostbyname</span><span class="p">(</span><span class="n">domainName</span><span class="p">)</span>
|
||||
<span class="k">if</span> <span class="n">err</span><span class="p">:</span>
|
||||
<span class="k">return</span> <span class="n">null</span><span class="p">,</span> <span class="n">err</span>
|
||||
<span class="k">else</span><span class="p">:</span>
|
||||
<span class="k">return</span> <span class="n">resp</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-python">def getIPAddress(domainName):
    # Check the browser's own DNS cache first
    resp, fail = lookupCache(domainName)
    if not fail:
        return resp, None
    # Cache miss: ask the operating system to resolve the name
    resp, err = gethostbyname(domainName)
    if err:
        return None, err
    return resp, None
|
||||
</code></pre>
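<p>The cache-then-resolver flow above can be sketched as runnable Python. This is only an illustration under stated assumptions: the function name <code>get_ip_address</code>, the plain dictionary used as a cache and the injectable <code>resolver</code> parameter are hypothetical and not part of the original pseudocode; only <code>socket.gethostbyname</code> is a real standard library call.</p>

```python
import socket


def get_ip_address(domain_name, cache, resolver=socket.gethostbyname):
    """Return (ip, err): try the local cache first, then fall back to the resolver.

    `cache` is a plain dict standing in for the browser's DNS cache (an assumption).
    `resolver` defaults to socket.gethostbyname and can be swapped for testing.
    """
    if domain_name in cache:
        return cache[domain_name], None
    try:
        ip = resolver(domain_name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return None, exc
    cache[domain_name] = ip  # remember the answer for later lookups
    return ip, None


if __name__ == "__main__":
    ip, err = get_ip_address("localhost", {})
    if err:
        print("Unknown host exception while trying to resolve: localhost")
    else:
        print(ip)
```

<p>Passing the resolver as a parameter keeps the sketch testable without network access; a real browser cache would also honour record expiry, which is omitted here.</p>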
|
||||
<p>Now let's understand what the operating system does when the <a href="https://man7.org/linux/man-pages/man3/gethostbyname.3.html">gethostbyname</a> function is called. On Linux, the system resolver library looks at the <a href="https://man7.org/linux/man-pages/man5/nsswitch.conf.5.html">/etc/nsswitch.conf</a> file, which usually has a line</p>
|
||||
<div class="highlight"><pre><span></span><code>hosts: files dns
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">hosts: files dns
|
||||
</code></pre>
|
||||
<p>This line means the OS first looks in a file (/etc/hosts) and falls back to the DNS protocol only if there is no match in /etc/hosts.</p>
|
||||
<p>The file /etc/hosts is of format</p>
|
||||
<p>IPAddress FQDN [FQDN].*</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="m">127</span>.0.0.1 localhost.localdomain localhost
|
||||
<pre><code class="language-bash">127.0.0.1 localhost.localdomain localhost
|
||||
::1 localhost.localdomain localhost
|
||||
</code></pre></div>
|
||||
</code></pre>
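<p>To make the file format concrete, here is a minimal sketch that parses hosts-style text into a name-to-address mapping. The function name <code>parse_hosts</code> and the sample text are hypothetical, and keeping only the first address seen for a name is a simplification chosen for this example, not a description of what the system resolver does.</p>

```python
def parse_hosts(text):
    """Map each hostname in /etc/hosts-style text to the first address listed for it."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # skip blank or comment-only lines
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, address)  # keep the first match only
    return mapping


# Hypothetical sample in the "IPAddress FQDN [FQDN]*" layout
SAMPLE = """\
127.0.0.1 localhost.localdomain localhost
::1 localhost.localdomain localhost
127.0.0.1 test.linkedin.com
"""

if __name__ == "__main__":
    hosts = parse_hosts(SAMPLE)
    print(hosts.get("test.linkedin.com"))
```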
|
||||
<p>If a match exists for a domain in this file, then that IP address is returned by the OS. Let's add a line to this file</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="m">127</span>.0.0.1 test.linkedin.com
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">127.0.0.1 test.linkedin.com
|
||||
</code></pre>
|
||||
<p>And then do ping test.linkedin.com</p>
|
||||
<div class="highlight"><pre><span></span><code>ping test.linkedin.com -n
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code>PING test.linkedin.com <span class="o">(</span><span class="m">127</span>.0.0.1<span class="o">)</span> <span class="m">56</span><span class="o">(</span><span class="m">84</span><span class="o">)</span> bytes of data.
|
||||
<span class="m">64</span> bytes from <span class="m">127</span>.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">1</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span><span class="m">0</span>.047 ms
|
||||
<span class="m">64</span> bytes from <span class="m">127</span>.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">2</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span><span class="m">0</span>.036 ms
|
||||
<span class="m">64</span> bytes from <span class="m">127</span>.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span><span class="m">3</span> <span class="nv">ttl</span><span class="o">=</span><span class="m">64</span> <span class="nv">time</span><span class="o">=</span><span class="m">0</span>.037 ms
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">ping test.linkedin.com -n
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">PING test.linkedin.com (127.0.0.1) 56(84) bytes of data.
|
||||
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.047 ms
|
||||
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.036 ms
|
||||
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.037 ms
|
||||
|
||||
</code></pre>
|
||||
<p>As mentioned earlier, if no match exists in /etc/hosts, the OS tries to do a DNS resolution using the DNS protocol. The Linux system makes a DNS request to the first IP in /etc/resolv.conf. If there is no response, requests are sent to subsequent servers in resolv.conf. These servers in resolv.conf are called DNS resolvers. The DNS resolvers are populated by <a href="https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol">DHCP</a> or statically configured by an administrator.
|
||||
<a href="https://linux.die.net/man/1/dig">Dig</a> is a userspace DNS tool which creates and sends requests to DNS resolvers and prints the responses it receives to the console.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">#run this command in one shell to capture all DNS requests</span>
|
||||
sudo tcpdump -s <span class="m">0</span> -A -i any port <span class="m">53</span>
|
||||
<span class="c1">#make a dig request from another shell</span>
|
||||
<pre><code class="language-bash">#run this command in one shell to capture all DNS requests
|
||||
sudo tcpdump -s 0 -A -i any port 53
|
||||
#make a dig request from another shell
|
||||
dig linkedin.com
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code><span class="m">13</span>:19:54.432507 IP <span class="m">172</span>.19.209.122.56497 > <span class="m">172</span>.23.195.101.53: <span class="m">527</span>+ <span class="o">[</span>1au<span class="o">]</span> A? linkedin.com. <span class="o">(</span><span class="m">41</span><span class="o">)</span>
|
||||
....E..E....@.n....z...e...5.1.:... .........linkedin.com.......<span class="o">)</span>........
|
||||
<span class="m">13</span>:19:54.485131 IP <span class="m">172</span>.23.195.101.53 > <span class="m">172</span>.19.209.122.56497: <span class="m">527</span> <span class="m">1</span>/0/1 A <span class="m">108</span>.174.10.10 <span class="o">(</span><span class="m">57</span><span class="o">)</span>
|
||||
....E..U..@.<span class="p">|</span>. ....e...z.5...A...............linkedin.com..............3..l.
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">13:19:54.432507 IP 172.19.209.122.56497 > 172.23.195.101.53: 527+ [1au] A? linkedin.com. (41)
|
||||
....E..E....@.n....z...e...5.1.:... .........linkedin.com.......)........
|
||||
13:19:54.485131 IP 172.23.195.101.53 > 172.19.209.122.56497: 527 1/0/1 A 108.174.10.10 (57)
|
||||
....E..U..@.|. ....e...z.5...A...............linkedin.com..............3..l.
|
||||
|
||||
..<span class="o">)</span>........
|
||||
</code></pre></div>
|
||||
..)........
|
||||
</code></pre>
|
||||
<p>The packet capture shows that a request for linkedin.com is made to 172.23.195.101:53 (this is the resolver in /etc/resolv.conf) and a response is received from 172.23.195.101 with the IP address of linkedin.com, 108.174.10.10.</p>
|
||||
<p>Now let's try to understand how the DNS resolver finds the IP address of linkedin.com. The DNS resolver first looks at its cache. Since many devices in the network can query for the domain name linkedin.com, the name resolution result may already exist in the cache. If there is a cache miss, it starts the DNS resolution process. The DNS resolver breaks “linkedin.com” into “.”, “com.” and “linkedin.com.” and starts DNS resolution from “.”. The “.” is called the root domain, and the IPs of its nameservers are known to the DNS resolver software. The DNS resolver queries the root nameservers to find the nameservers which can respond with details for “com.”. The address of the authoritative nameserver of “com.” is returned. Now the DNS resolver contacts the authoritative nameserver for “com.” to fetch the authoritative nameserver for “linkedin.com”. Once an authoritative nameserver of “linkedin.com” is known, the resolver contacts LinkedIn’s nameserver to get the IP address of “linkedin.com”. This whole process can be visualized by running</p>
|
||||
<div class="highlight"><pre><span></span><code>dig +trace linkedin.com
|
||||
</code></pre></div>
|
||||
<p><div class="highlight"><pre><span></span><code>linkedin.com. <span class="m">3600</span> IN A <span class="m">108</span>.174.10.10
|
||||
</code></pre></div>
|
||||
This DNS response has 5 fields, where the first field is the request and the last field is the response. The second field is the Time to Live (TTL), which says how long the DNS response is valid, in seconds. In this case, this mapping of linkedin.com is valid for 1 hour. This is how the resolvers and applications (such as the browser) maintain their cache. Any request for linkedin.com beyond 1 hour will be treated as a cache miss, as the mapping has exceeded its TTL and the whole process has to be redone.
|
||||
<pre><code class="language-bash">dig +trace linkedin.com
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">linkedin.com. 3600 IN A 108.174.10.10
|
||||
</code></pre>
|
||||
<p>This DNS response has 5 fields, where the first field is the request and the last field is the response. The second field is the Time to Live (TTL), which says how long the DNS response is valid, in seconds. In this case, this mapping of linkedin.com is valid for 1 hour. This is how the resolvers and applications (such as the browser) maintain their cache. Any request for linkedin.com beyond 1 hour will be treated as a cache miss, as the mapping has exceeded its TTL and the whole process has to be redone.
|
||||
The 4th field says the type of DNS response/request. Some of the various DNS query types are
|
||||
A, AAAA, NS, TXT, PTR, MX and CNAME.
|
||||
- A record returns the IPv4 address of the domain name
|
||||
@@ -1333,12 +1334,12 @@ A, AAAA, NS, TXT, PTR, MX and CNAME.
|
||||
- NS record returns the authoritative nameserver for the domain name
|
||||
- CNAME records are aliases to the domain names. Some domains point to other domain names, and resolving the latter domain name gives an IP which is used as the IP for the former domain name as well. For example, www.linkedin.com’s IP address is the same as that of 2-01-2c3e-005a.cdx.cedexis.net.
|
||||
- For brevity, we are not discussing other DNS record types; the RFCs for each of these records are available <a href="https://en.wikipedia.org/wiki/List_of_DNS_record_types">here</a>.</p>
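<p>The five-field layout and the TTL rule described above can be sketched in a few lines of Python. The names <code>Record</code>, <code>parse_record</code> and <code>is_expired</code> are hypothetical helpers written for this illustration; real resolvers track expiry in their own ways.</p>

```python
import time
from typing import NamedTuple


class Record(NamedTuple):
    name: str    # field 1: the name that was asked for
    ttl: int     # field 2: validity in seconds
    rclass: str  # field 3: class, normally IN
    rtype: str   # field 4: record type such as A, AAAA, NS or CNAME
    value: str   # field 5: the answer


def parse_record(line):
    """Split one answer line as printed by dig into its five fields."""
    name, ttl, rclass, rtype, value = line.split(None, 4)
    return Record(name, int(ttl), rclass, rtype, value)


def is_expired(record, fetched_at, now=None):
    """A cached record is stale once more than `ttl` seconds have passed."""
    now = time.time() if now is None else now
    return now - fetched_at > record.ttl


if __name__ == "__main__":
    rec = parse_record("linkedin.com.  3600  IN  A  108.174.10.10")
    print(rec.rtype, rec.value, is_expired(rec, fetched_at=time.time()))
```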
|
||||
<p><div class="highlight"><pre><span></span><code>dig A linkedin.com +short
|
||||
<span class="m">108</span>.174.10.10
|
||||
<pre><code class="language-bash">dig A linkedin.com +short
|
||||
108.174.10.10
|
||||
|
||||
|
||||
dig AAAA linkedin.com +short
|
||||
<span class="m">2620</span>:109:c002::6cae:a0a
|
||||
2620:109:c002::6cae:a0a
|
||||
|
||||
|
||||
dig NS linkedin.com +short
|
||||
@@ -1352,9 +1353,9 @@ ns3.p43.dynect.net.
|
||||
dns1.p09.nsone.net.
|
||||
|
||||
dig www.linkedin.com CNAME +short
|
||||
<span class="m">2</span>-01-2c3e-005a.cdx.cedexis.net.
|
||||
</code></pre></div>
|
||||
Armed with these fundamentals of DNS, let's see the use cases where DNS is used by SREs.</p>
|
||||
2-01-2c3e-005a.cdx.cedexis.net.
|
||||
</code></pre>
|
||||
<p>Armed with these fundamentals of DNS, let's see the use cases where DNS is used by SREs.</p>
|
||||
<h2 id="applications-in-sre-role">Applications in SRE role</h2>
|
||||
<p>This section covers some of the common solutions an SRE can derive from DNS.</p>
|
||||
<ol>
|
||||
|
||||
@@ -1226,112 +1226,113 @@
|
||||
<h1 id="http">HTTP</h1>
|
||||
<p>Up to this point, we have only obtained the IP address of linkedin.com. The HTML page of linkedin.com is served over the HTTP protocol and rendered by the browser. The browser sends an HTTP request to the IP of the server determined above.
|
||||
A request has a verb (such as GET, PUT or POST) followed by a path and query parameters, then lines of key-value pairs which give information about the client and its capabilities (such as the content types it can accept), and optionally a body (usually in POST or PUT).</p>
|
||||
<p><div class="highlight"><pre><span></span><code><span class="c1"># Eg run the following in your container and have a look at the headers </span>
|
||||
<pre><code class="language-bash"># Eg run the following in your container and have a look at the headers
|
||||
curl linkedin.com -v
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code>* Connected to linkedin.com <span class="o">(</span><span class="m">108</span>.174.10.10<span class="o">)</span> port <span class="m">80</span> <span class="o">(</span><span class="c1">#0)</span>
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">* Connected to linkedin.com (108.174.10.10) port 80 (#0)
|
||||
> GET / HTTP/1.1
|
||||
> Host: linkedin.com
|
||||
> User-Agent: curl/7.64.1
|
||||
> Accept: */*
|
||||
>
|
||||
< HTTP/1.1 <span class="m">301</span> Moved Permanently
|
||||
< Date: Mon, <span class="m">09</span> Nov <span class="m">2020</span> <span class="m">10</span>:39:43 GMT
|
||||
< HTTP/1.1 301 Moved Permanently
|
||||
< Date: Mon, 09 Nov 2020 10:39:43 GMT
|
||||
< X-Li-Pop: prod-esv5
|
||||
< X-LI-Proto: http/1.1
|
||||
< Location: https://www.linkedin.com/
|
||||
< Content-Length: <span class="m">0</span>
|
||||
< Content-Length: 0
|
||||
<
|
||||
* Connection <span class="c1">#0 to host linkedin.com left intact</span>
|
||||
* Closing connection <span class="m">0</span>
|
||||
</code></pre></div></p>
|
||||
* Connection #0 to host linkedin.com left intact
|
||||
* Closing connection 0
|
||||
</code></pre>
|
||||
<p>Here, in the first line, GET is the verb, / is the path and 1.1 is the HTTP protocol version. Then there are key-value pairs which give the client capabilities and some details to the server. The server responds with the HTTP version, <a href="https://en.wikipedia.org/wiki/List_of_HTTP_status_codes">status code and status message</a>. Status codes 2xx denote success, 3xx denote redirection, 4xx denote client-side errors and 5xx denote server-side errors.</p>
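<p>As a small illustration of the status line and the status code classes, the following sketch splits a response status line and maps the code to its class. The helper names <code>parse_status_line</code> and <code>status_class</code> are hypothetical and exist only for this example.</p>

```python
def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 301 Moved Permanently' into its parts."""
    version, _, rest = line.strip().partition(" ")
    code, _, reason = rest.partition(" ")
    return version, int(code), reason


def status_class(code):
    """Map a status code to the class named by its first digit."""
    classes = {
        2: "success",
        3: "redirection",
        4: "client-side error",
        5: "server-side error",
    }
    return classes.get(code // 100, "other")


if __name__ == "__main__":
    version, code, reason = parse_status_line("HTTP/1.1 301 Moved Permanently")
    print(version, code, reason, "->", status_class(code))
```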
|
||||
<p>We will now jump in to see the difference between HTTP/1.0 and HTTP/1.1. </p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">#On the terminal type</span>
|
||||
telnet www.linkedin.com <span class="m">80</span>
|
||||
<span class="c1">#Copy and paste the following with an empty new line at last in the telnet STDIN</span>
|
||||
<pre><code class="language-bash">#On the terminal type
|
||||
telnet www.linkedin.com 80
|
||||
#Copy and paste the following with an empty new line at last in the telnet STDIN
|
||||
GET / HTTP/1.1
|
||||
HOST:linkedin.com
|
||||
USER-AGENT: curl
|
||||
</code></pre></div>
|
||||
|
||||
</code></pre>
|
||||
<p>This gets the server response and then waits for the next input, as the underlying connection to www.linkedin.com can be reused for further queries. The benefits of this will become clear when we go through TCP. In HTTP/1.0, however, this connection is closed immediately after the response by default, meaning a new connection has to be opened for each query. HTTP/1.1 can have only one inflight request on an open connection, but the connection can be reused for multiple requests one after another. One of the benefits of HTTP/2.0 over HTTP/1.1 is that we can have multiple inflight requests on the same connection. We are restricting our scope to generic HTTP and not jumping into the intricacies of each protocol version, but they should be straightforward to understand after the course.</p>
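<p>Connection reuse in HTTP/1.1 can be observed locally without contacting any real site. The sketch below is an assumption-laden demo, not part of the original material: it starts a throwaway server on 127.0.0.1 using Python's standard <code>http.server</code>, sends two requests over one <code>http.client.HTTPConnection</code> and records the client-side port each time. The names <code>Handler</code> and <code>reused_connection_ports</code> are hypothetical.</p>

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def reused_connection_ports(requests=2):
    """Send several requests on one connection and return the local port used for each."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    ports = []
    try:
        for _ in range(requests):
            conn.request("GET", "/")
            conn.getresponse().read()  # the response must be fully read before reuse
            ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print(reused_connection_ports())
```

<p>If both entries in the returned list are equal, both requests travelled over the same TCP connection, one after another.</p>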
|
||||
<p>HTTP is called a <strong>stateless protocol</strong>. In this section, we will try to understand what stateless means. Say we logged in to linkedin.com; each request to linkedin.com from the client carries no context of the user, and it makes no sense to prompt the user to log in for each page/resource. This problem of HTTP is solved by the <em>COOKIE</em>. A session is created for the user when the user logs in. This session identifier is sent to the browser via the <em>SET-COOKIE</em> header. The browser stores the cookie till the expiry set by the server and sends the cookie with each request to linkedin.com from then on. More details on cookies are available <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies">here</a>. Cookies are a critical piece of information, like a password, and since HTTP is a plain-text protocol, any man in the middle can capture either passwords or cookies and breach the privacy of the user. Similarly, as discussed during DNS, a spoofed IP of linkedin.com can enable a phishing attack in which a user gives LinkedIn’s password to log in on the malicious site. To solve both problems, HTTPS came into place, and HTTPS has to be mandated.</p>
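<p>The cookie round trip can be sketched with Python's standard <code>http.cookies</code> module. The helper name <code>cookie_header_from</code> and the cookie names and values are made up for illustration; a real browser also applies domain, path and expiry rules before sending a cookie back, which this sketch ignores.</p>

```python
from http.cookies import SimpleCookie


def cookie_header_from(set_cookie_values):
    """Build the Cookie request header value from received Set-Cookie header values."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # parses the name, value and attributes such as Path or Secure
    # Only name=value pairs are sent back; the attributes stay with the client
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


if __name__ == "__main__":
    received = [
        "session_id=abc123; Path=/; Secure; HttpOnly",  # hypothetical session cookie
        "lang=en-us; Path=/",
    ]
    print("Cookie:", cookie_header_from(received))
```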
|
||||
<p>HTTPS has to provide server identification and encryption of data between client and server. The server administrator has to generate a private-public key pair and a certificate request. This certificate request has to be signed by a certificate authority, which converts the certificate request to a certificate. The server administrator has to install the certificate and private key on the webserver. The certificate has details about the server (like the domain name it serves and the expiry date) and the public key of the server. The private key is a secret of the server, and losing the private key loses the trust the server provides. When a client connects, it sends a HELLO. The server sends its certificate to the client. The client checks the validity of the cert by seeing if it is within its expiry time, if it is signed by a trusted authority and if the hostname in the cert is the same as the server's. This validation makes sure the server is the right server and there is no phishing. Once that is validated, the client negotiates a symmetric key and cipher with the server by encrypting the negotiation with the public key of the server. Nobody other than the server, which has the private key, can understand this data. Once negotiation is complete, that symmetric key and algorithm are used for further encryption, which can be decrypted only by the client and server from then on, as only they know the symmetric key and algorithm. The switch from an asymmetric to a symmetric encryption algorithm is made to avoid straining the resources of client devices, as symmetric encryption is generally less resource intensive than asymmetric encryption.</p>
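<p>The client-side checks described above (trusted issuer, matching hostname, expiry) are performed by Python's standard <code>ssl</code> module when a default context is used. The sketch below fetches a server certificate and summarises the fields discussed here. The function names <code>fetch_cert</code> and <code>summarize_cert</code> are hypothetical, and <code>fetch_cert</code> needs network access to run.</p>

```python
import socket
import ssl


def summarize_cert(cert):
    """Pick the subject CN, issuer CN and expiry date out of a getpeercert() dict."""
    def common_name(rdns):
        for rdn in rdns:
            for key, value in rdn:
                if key == "commonName":
                    return value
        return None

    return {
        "subject": common_name(cert.get("subject", ())),
        "issuer": common_name(cert.get("issuer", ())),
        "expires": cert.get("notAfter"),
    }


def fetch_cert(hostname, port=443, timeout=5.0):
    """Connect over TLS and return the validated peer certificate as a dict."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    print(summarize_cert(fetch_cert("www.linkedin.com")))
```

<p>If validation fails, <code>wrap_socket</code> raises an <code>ssl.SSLError</code> (for example <code>ssl.SSLCertVerificationError</code>) instead of returning a connection.</p>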
|
||||
<p><div class="highlight"><pre><span></span><code><span class="c1">#Try the following on your terminal to see the cert details like Subject Name(domain name), Issuer details, Expiry date</span>
|
||||
<pre><code class="language-bash">#Try the following on your terminal to see the cert details like Subject Name(domain name), Issuer details, Expiry date
|
||||
curl https://www.linkedin.com -v
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code>* Connected to www.linkedin.com <span class="o">(</span><span class="m">13</span>.107.42.14<span class="o">)</span> port <span class="m">443</span> <span class="o">(</span><span class="c1">#0)</span>
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">* Connected to www.linkedin.com (13.107.42.14) port 443 (#0)
|
||||
* ALPN, offering h2
|
||||
* ALPN, offering http/1.1
|
||||
* successfully <span class="nb">set</span> certificate verify locations:
|
||||
* successfully set certificate verify locations:
|
||||
* CAfile: /etc/ssl/cert.pem
|
||||
CApath: none
|
||||
* TLSv1.2 <span class="o">(</span>OUT<span class="o">)</span>, TLS handshake, Client hello <span class="o">(</span><span class="m">1</span><span class="o">)</span>:
|
||||
<span class="o">}</span> <span class="o">[</span><span class="m">230</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS handshake, Server hello <span class="o">(</span><span class="m">2</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">90</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS handshake, Certificate <span class="o">(</span><span class="m">11</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">3171</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS handshake, Server key exchange <span class="o">(</span><span class="m">12</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">365</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS handshake, Server finished <span class="o">(</span><span class="m">14</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">4</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>OUT<span class="o">)</span>, TLS handshake, Client key exchange <span class="o">(</span><span class="m">16</span><span class="o">)</span>:
|
||||
<span class="o">}</span> <span class="o">[</span><span class="m">102</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>OUT<span class="o">)</span>, TLS change cipher, Change cipher spec <span class="o">(</span><span class="m">1</span><span class="o">)</span>:
|
||||
<span class="o">}</span> <span class="o">[</span><span class="m">1</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>OUT<span class="o">)</span>, TLS handshake, Finished <span class="o">(</span><span class="m">20</span><span class="o">)</span>:
|
||||
<span class="o">}</span> <span class="o">[</span><span class="m">16</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS change cipher, Change cipher spec <span class="o">(</span><span class="m">1</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">1</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 <span class="o">(</span>IN<span class="o">)</span>, TLS handshake, Finished <span class="o">(</span><span class="m">20</span><span class="o">)</span>:
|
||||
<span class="o">{</span> <span class="o">[</span><span class="m">16</span> bytes data<span class="o">]</span>
|
||||
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
|
||||
} [230 bytes data]
|
||||
* TLSv1.2 (IN), TLS handshake, Server hello (2):
|
||||
{ [90 bytes data]
|
||||
* TLSv1.2 (IN), TLS handshake, Certificate (11):
|
||||
{ [3171 bytes data]
|
||||
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
|
||||
{ [365 bytes data]
|
||||
* TLSv1.2 (IN), TLS handshake, Server finished (14):
|
||||
{ [4 bytes data]
|
||||
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
|
||||
} [102 bytes data]
|
||||
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
|
||||
} [1 bytes data]
|
||||
* TLSv1.2 (OUT), TLS handshake, Finished (20):
|
||||
} [16 bytes data]
|
||||
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
|
||||
{ [1 bytes data]
|
||||
* TLSv1.2 (IN), TLS handshake, Finished (20):
|
||||
{ [16 bytes data]
|
||||
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
|
||||
* ALPN, server accepted to use h2
|
||||
* Server certificate:
|
||||
* subject: <span class="nv">C</span><span class="o">=</span>US<span class="p">;</span> <span class="nv">ST</span><span class="o">=</span>California<span class="p">;</span> <span class="nv">L</span><span class="o">=</span>Sunnyvale<span class="p">;</span> <span class="nv">O</span><span class="o">=</span>LinkedIn Corporation<span class="p">;</span> <span class="nv">CN</span><span class="o">=</span>www.linkedin.com
|
||||
* start date: Oct <span class="m">2</span> <span class="m">00</span>:00:00 <span class="m">2020</span> GMT
|
||||
* expire date: Apr <span class="m">2</span> <span class="m">12</span>:00:00 <span class="m">2021</span> GMT
|
||||
* subjectAltName: host <span class="s2">"www.linkedin.com"</span> matched cert<span class="s1">'s "www.linkedin.com"</span>
|
||||
<span class="s1">* issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA</span>
|
||||
<span class="s1">* SSL certificate verify ok.</span>
|
||||
<span class="s1">* Using HTTP2, server supports multi-use</span>
|
||||
<span class="s1">* Connection state changed (HTTP/2 confirmed)</span>
|
||||
<span class="s1">* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0</span>
|
||||
<span class="s1">* Using Stream ID: 1 (easy handle 0x7fb055808200)</span>
|
||||
<span class="s1">* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!</span>
|
||||
<span class="s1"> 0 82117 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0</span>
|
||||
<span class="s1">* Connection #0 to host www.linkedin.com left intact</span>
|
||||
<span class="s1">HTTP/2 200 </span>
|
||||
<span class="s1">cache-control: no-cache, no-store</span>
|
||||
<span class="s1">pragma: no-cache</span>
|
||||
<span class="s1">content-length: 82117</span>
|
||||
<span class="s1">content-type: text/html; charset=utf-8</span>
|
||||
<span class="s1">expires: Thu, 01 Jan 1970 00:00:00 GMT</span>
|
||||
<span class="s1">set-cookie: JSESSIONID=ajax:2747059799136291014; SameSite=None; Path=/; Domain=.www.linkedin.com; Secure</span>
|
||||
<span class="s1">set-cookie: lang=v=2&lang=en-us; SameSite=None; Path=/; Domain=linkedin.com; Secure</span>
|
||||
<span class="s1">set-cookie: bcookie="v=2&70bd59e3-5a51-406c-8e0d-dd70befa8890"; domain=.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; SameSite=None</span>
|
||||
<span class="s1">set-cookie: bscookie="v=1&202011091050107ae9b7ac-fe97-40fc-830d-d7a9ccf80659AQGib5iXwarbY8CCBP94Q39THkgUlx6J"; domain=.www.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; HttpOnly; SameSite=None</span>
|
||||
<span class="s1">set-cookie: lissc=1; domain=.linkedin.com; Path=/; Secure; Expires=Tue, 09-Nov-2021 10:50:10 GMT; SameSite=None</span>
|
||||
<span class="s1">set-cookie: lidc="b=VGST04:s=V:r=V:g=2201:u=1:i=1604919010:t=1605005410:v=1:sig=AQHe-KzU8i_5Iy6MwnFEsgRct3c9Lh5R"; Expires=Tue, 10 Nov 2020 10:50:10 GMT; domain=.linkedin.com; Path=/; SameSite=None; Secure</span>
|
||||
<span class="s1">x-fs-txn-id: 2b8d5409ba70</span>
|
||||
<span class="s1">x-fs-uuid: 61bbf94956d14516302567fc882b0000</span>
|
||||
<span class="s1">expect-ct: max-age=86400, report-uri="https://www.linkedin.com/platform-telemetry/ct"</span>
|
||||
<span class="s1">x-xss-protection: 1; mode=block</span>
|
||||
<span class="s1">content-security-policy-report-only: default-src '</span>none<span class="s1">'; connect-src '</span>self<span class="s1">' www.linkedin.com www.google-analytics.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://linkedin.sc.omtrdc.net/b/ss/ static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; script-src '</span>sha256-THuVhwbXPeTR0HszASqMOnIyxqEgvGyBwSPBKBF/iMc<span class="o">=</span><span class="s1">' '</span>sha256-PyCXNcEkzRWqbiNr087fizmiBBrq9O6GGD8eV3P09Ik<span class="o">=</span><span class="s1">' '</span>sha256-2SQ55Erm3CPCb+k03EpNxU9bdV3XL9TnVTriDs7INZ4<span class="o">=</span><span class="s1">' '</span>sha256-S/KSPe186K/1B0JEjbIXcCdpB97krdzX05S+dHnQjUs<span class="o">=</span><span class="s1">' platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src '</span>self<span class="s1">' '</span>unsafe-inline<span class="s1">' static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; media-src dms.licdn.com; child-src blob: *; frame-src '</span>self<span class="s1">' lnkd.demdex.net linkedin.cdn.qualaroo.com; manifest-src '</span>self<span class="s1">'; report-uri https://www.linkedin.com/platform-telemetry/csp?f=g</span>
|
||||
<span class="s1">content-security-policy: default-src *; connect-src '</span>self<span class="s1">' https://media-src.linkedin.com/media/ www.linkedin.com s.c.lnkd.licdn.com m.c.lnkd.licdn.com s.c.exp1.licdn.com s.c.exp2.licdn.com m.c.exp1.licdn.com m.c.exp2.licdn.com wss://*.linkedin.com dms.licdn.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://accounts.google.com/gsi/status https://linkedin.sc.omtrdc.net/b/ss/ www.google-analytics.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com media.licdn.com media-exp1.licdn.com media-exp2.licdn.com media-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src '</span>unsafe-inline<span class="s1">' '</span>self<span class="s1">' static-src.linkedin.com *.licdn.com; script-src '</span>report-sample<span class="s1">' '</span>unsafe-inline<span class="s1">' '</span>unsafe-eval<span class="s1">' '</span>self<span class="s1">' spdy.linkedin.com static-src.linkedin.com *.ads.linkedin.com *.licdn.com static.chartbeat.com www.google-analytics.com ssl.google-analytics.com bcvipva02.rightnowtech.com www.bizographics.com sjs.bizographics.com js.bizographics.com d.la4-c1-was.salesforceliveagent.com slideshare.www.linkedin.com https://snap.licdn.com/li.lms-analytics/ platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com; object-src '</span>none<span class="s1">'; media-src blob: *; child-src blob: lnkd-communities: voyager: *; frame-ancestors '</span>self<span class="err">'</span><span class="p">;</span> report-uri https://www.linkedin.com/platform-telemetry/csp?f<span class="o">=</span>l
|
||||
* subject: C=US; ST=California; L=Sunnyvale; O=LinkedIn Corporation; CN=www.linkedin.com
|
||||
* start date: Oct 2 00:00:00 2020 GMT
|
||||
* expire date: Apr 2 12:00:00 2021 GMT
|
||||
* subjectAltName: host "www.linkedin.com" matched cert's "www.linkedin.com"
|
||||
* issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
|
||||
* SSL certificate verify ok.
|
||||
* Using HTTP2, server supports multi-use
|
||||
* Connection state changed (HTTP/2 confirmed)
|
||||
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
|
||||
* Using Stream ID: 1 (easy handle 0x7fb055808200)
|
||||
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
|
||||
0 82117 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
|
||||
* Connection #0 to host www.linkedin.com left intact
|
||||
HTTP/2 200
|
||||
cache-control: no-cache, no-store
|
||||
pragma: no-cache
|
||||
content-length: 82117
|
||||
content-type: text/html; charset=utf-8
|
||||
expires: Thu, 01 Jan 1970 00:00:00 GMT
|
||||
set-cookie: JSESSIONID=ajax:2747059799136291014; SameSite=None; Path=/; Domain=.www.linkedin.com; Secure
|
||||
set-cookie: lang=v=2&lang=en-us; SameSite=None; Path=/; Domain=linkedin.com; Secure
|
||||
set-cookie: bcookie="v=2&70bd59e3-5a51-406c-8e0d-dd70befa8890"; domain=.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; SameSite=None
|
||||
set-cookie: bscookie="v=1&202011091050107ae9b7ac-fe97-40fc-830d-d7a9ccf80659AQGib5iXwarbY8CCBP94Q39THkgUlx6J"; domain=.www.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; HttpOnly; SameSite=None
|
||||
set-cookie: lissc=1; domain=.linkedin.com; Path=/; Secure; Expires=Tue, 09-Nov-2021 10:50:10 GMT; SameSite=None
|
||||
set-cookie: lidc="b=VGST04:s=V:r=V:g=2201:u=1:i=1604919010:t=1605005410:v=1:sig=AQHe-KzU8i_5Iy6MwnFEsgRct3c9Lh5R"; Expires=Tue, 10 Nov 2020 10:50:10 GMT; domain=.linkedin.com; Path=/; SameSite=None; Secure
|
||||
x-fs-txn-id: 2b8d5409ba70
|
||||
x-fs-uuid: 61bbf94956d14516302567fc882b0000
|
||||
expect-ct: max-age=86400, report-uri="https://www.linkedin.com/platform-telemetry/ct"
|
||||
x-xss-protection: 1; mode=block
|
||||
content-security-policy-report-only: default-src 'none'; connect-src 'self' www.linkedin.com www.google-analytics.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://linkedin.sc.omtrdc.net/b/ss/ static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; script-src 'sha256-THuVhwbXPeTR0HszASqMOnIyxqEgvGyBwSPBKBF/iMc=' 'sha256-PyCXNcEkzRWqbiNr087fizmiBBrq9O6GGD8eV3P09Ik=' 'sha256-2SQ55Erm3CPCb+k03EpNxU9bdV3XL9TnVTriDs7INZ4=' 'sha256-S/KSPe186K/1B0JEjbIXcCdpB97krdzX05S+dHnQjUs=' platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src 'self' 'unsafe-inline' static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; media-src dms.licdn.com; child-src blob: *; frame-src 'self' lnkd.demdex.net linkedin.cdn.qualaroo.com; manifest-src 'self'; report-uri https://www.linkedin.com/platform-telemetry/csp?f=g
|
||||
content-security-policy: default-src *; connect-src 'self' https://media-src.linkedin.com/media/ www.linkedin.com s.c.lnkd.licdn.com m.c.lnkd.licdn.com s.c.exp1.licdn.com s.c.exp2.licdn.com m.c.exp1.licdn.com m.c.exp2.licdn.com wss://*.linkedin.com dms.licdn.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://accounts.google.com/gsi/status https://linkedin.sc.omtrdc.net/b/ss/ www.google-analytics.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com media.licdn.com media-exp1.licdn.com media-exp2.licdn.com media-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src 'unsafe-inline' 'self' static-src.linkedin.com *.licdn.com; script-src 'report-sample' 'unsafe-inline' 'unsafe-eval' 'self' spdy.linkedin.com static-src.linkedin.com *.ads.linkedin.com *.licdn.com static.chartbeat.com www.google-analytics.com ssl.google-analytics.com bcvipva02.rightnowtech.com www.bizographics.com sjs.bizographics.com js.bizographics.com d.la4-c1-was.salesforceliveagent.com slideshare.www.linkedin.com https://snap.licdn.com/li.lms-analytics/ platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com; object-src 'none'; media-src blob: *; child-src blob: lnkd-communities: voyager: *; frame-ancestors 'self'; report-uri https://www.linkedin.com/platform-telemetry/csp?f=l
|
||||
x-frame-options: sameorigin
|
||||
x-content-type-options: nosniff
|
||||
strict-transport-security: max-age<span class="o">=</span><span class="m">2592000</span>
|
||||
strict-transport-security: max-age=2592000
|
||||
x-li-fabric: prod-lva1
|
||||
x-li-pop: afd-prod-lva1
|
||||
x-li-proto: http/2
|
||||
x-li-uuid: <span class="nv">Ybv5SVbRRRYwJWf8iCsAAA</span><span class="o">==</span>
|
||||
x-msedge-ref: Ref A: CFB9AC1D2B0645DDB161CEE4A4909AEF Ref B: BOM02EDGE0712 Ref C: <span class="m">2020</span>-11-09T10:50:10Z
|
||||
date: Mon, <span class="m">09</span> Nov <span class="m">2020</span> <span class="m">10</span>:50:10 GMT
|
||||
x-li-uuid: Ybv5SVbRRRYwJWf8iCsAAA==
|
||||
x-msedge-ref: Ref A: CFB9AC1D2B0645DDB161CEE4A4909AEF Ref B: BOM02EDGE0712 Ref C: 2020-11-09T10:50:10Z
|
||||
date: Mon, 09 Nov 2020 10:50:10 GMT
|
||||
|
||||
* Closing connection <span class="m">0</span>
|
||||
</code></pre></div></p>
|
||||
* Closing connection 0
|
||||
</code></pre>
|
||||
<p>Here my system keeps the list of certificate authorities it trusts in the file /etc/ssl/cert.pem. Curl checks that the certificate was issued for www.linkedin.com by comparing the hostname against the subjectAltName entries of the certificate (the CN in the subject names the same host). It checks the expire date to make sure the certificate has not expired. It also validates the signature on the certificate using the public key of the issuer DigiCert, found in /etc/ssl/cert.pem. Once this is done, the client and server negotiate the cipher TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 and agree on a symmetric key, with the public key of www.linkedin.com used to authenticate the key exchange. Subsequent data transfer, including the first HTTP request, uses the same cipher and symmetric key.</p>
|
||||
|
||||
|
||||
|
||||
@@ -1269,14 +1269,14 @@
|
||||
|
||||
<h1 id="ip-routing-and-data-link-layer">IP Routing and Data Link Layer</h1>
|
||||
<p>We will dig into how packets that leave the client reach the server and vice versa. Before the packet reaches the IP layer, the transport layer populates the source and destination ports. The IP/network layer populates the destination IP (discovered from DNS) and then looks up the route to the destination IP in the routing table.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">#Linux route -n command gives the default routing table</span>
|
||||
<pre><code class="language-bash">#Linux route -n command gives the default routing table
|
||||
route -n
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code>Kernel IP routing table
|
||||
</code></pre>
|
||||
<pre><code class="language-bash">Kernel IP routing table
|
||||
Destination Gateway Genmask Flags Metric Ref Use Iface
|
||||
<span class="m">0</span>.0.0.0 <span class="m">172</span>.17.0.1 <span class="m">0</span>.0.0.0 UG <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> eth0
|
||||
<span class="m">172</span>.17.0.0 <span class="m">0</span>.0.0.0 <span class="m">255</span>.255.0.0 U <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> eth0
|
||||
</code></pre></div>
|
||||
0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
|
||||
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
|
||||
</code></pre>
|
||||
<p>Here the destination IP is bitwise AND’d with the Genmask, and if the result equals the Destination column of that row, the row’s gateway and interface are picked for routing. Here linkedin.com’s IP 108.174.10.10 is AND’d with 255.255.0.0 and the answer we get is 108.174.0.0, which doesn’t match the destination 172.17.0.0 in the routing table. Then Linux does an AND of the destination IP with 0.0.0.0 and we get 0.0.0.0. This answer matches the default row.</p>
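<p>The lookup described here can be sketched in a few lines of Python. This is only an illustration of the AND-and-compare idea, not how the kernel implements it: the <code>ROUTES</code> list copies the two rows of the routing table above, and the <code>lookup</code> function name is hypothetical.</p>

```python
import ipaddress

# (destination, genmask, gateway, interface) rows copied from `route -n` above
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "172.17.0.1", "eth0"),
    ("172.17.0.0", "255.255.0.0", "0.0.0.0", "eth0"),
]


def lookup(dest_ip, routes=ROUTES):
    """Return (gateway, interface) of the most specific matching route."""
    dest = int(ipaddress.IPv4Address(dest_ip))
    best = None
    for destination, genmask, gateway, iface in routes:
        mask = int(ipaddress.IPv4Address(genmask))
        network = int(ipaddress.IPv4Address(destination))
        # A row matches when (destination IP AND genmask) equals its Destination
        if dest & mask == network:
            # Prefer the longest mask; 0.0.0.0 (the default route) always loses
            if best is None or mask > best[0]:
                best = (mask, gateway, iface)
    if best is None:
        raise LookupError(f"no route to {dest_ip}")
    return best[1], best[2]


print(lookup("108.174.10.10"))  # falls through to the default route
print(lookup("172.17.0.5"))     # matches the directly connected network
```

<p>A gateway of 0.0.0.0 in the second row means the destination is on the directly connected network, so no next hop is needed.</p>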
|
||||
<p>The routing table is processed in order of decreasing genmask length, so the most specific match wins, and the row with genmask 0.0.0.0 is the default route used if nothing else matches.
|
||||
At the end of this operation, Linux has figured out that the packet has to be sent to the next hop 172.17.0.1 via eth0. The source IP of the packet will be set to the IP of the interface eth0.
|
||||
|
||||
@@ -1271,11 +1271,11 @@
|
||||
<p>TCP is a transport layer protocol like UDP but it guarantees reliability, flow control and congestion control.
|
||||
TCP guarantees reliable delivery by using sequence numbers. A TCP connection is established by a three-way handshake. In our case, the client sends a SYN packet along with the starting sequence number it plans to use; the server acknowledges the SYN packet and sends a SYN with its own sequence number. Once the client acknowledges the server's SYN packet, the connection is established. Each segment transferred from here on is considered delivered reliably once an acknowledgement for its sequence number is received by the sender.</p>
|
||||
<p><img alt="3-way handshake" src="../images/established.png" /></p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">#To understand handshake run packet capture on one bash session</span>
|
||||
tcpdump -S -i any port <span class="m">80</span>
|
||||
<span class="c1">#Run curl on another bash session</span>
|
||||
<pre><code class="language-bash">#To understand handshake run packet capture on one bash session
|
||||
tcpdump -S -i any port 80
|
||||
#Run curl on another bash session
|
||||
curl www.linkedin.com
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p><img alt="tcpdump-3way" src="../images/pcap.png" /></p>
|
||||
<p>Here the client sends a SYN, shown by the [S] flag, with sequence number 1522264672. The server acknowledges receipt of the SYN with an ACK [.] flag and sends a SYN [S] flag for its own sequence number. The server uses sequence number 1063230400 and tells the client it is expecting sequence number 1522264673 (client sequence+1). The client sends a zero-length acknowledgement packet to the server (server sequence+1) and the connection stands established. This is called the three-way handshake. The client then sends a 76-byte packet and increments its sequence number by 76. The server sends a 170-byte response and closes the connection. This was the difference we were talking about between HTTP/1.1 and HTTP/1.0: in HTTP/1.1 the same connection can be reused, which removes the overhead of a three-way handshake for each HTTP request. If a packet is lost between client and server, the server won’t send an ACK to the client, and the client retries sending the packet until the ACK is received. This guarantees reliability.
|
||||
Flow control is implemented through the win size field in each segment. The win size gives the available TCP buffer length in the kernel that can be used to buffer received segments. A size of 0 means the receiver has a backlog to clear from its socket buffer, and the sender has to pause sending packets so that the receiver can catch up. This flow control protects against the slow receiver and fast sender problem.</p>
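<p>The sequence number arithmetic from the capture can be checked with a small Python sketch. The two initial sequence numbers are the ones quoted above; the helper <code>next_seq</code> is a hypothetical name used only for illustration, and the sketch models nothing beyond the counting rules (a SYN or FIN consumes one sequence number, payload bytes consume one each).</p>

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def next_seq(seq, payload_len=0, syn=False, fin=False):
    """Sequence number the peer should acknowledge after this segment."""
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_SPACE


client_isn = 1522264672  # client's [S] segment in the capture
server_isn = 1063230400  # server's [S.] segment in the capture

# The server's ACK for the client's SYN: client sequence + 1
server_ack = next_seq(client_isn, syn=True)
# The client's zero-length ACK for the server's SYN: server sequence + 1
client_ack = next_seq(server_isn, syn=True)
# After the 76-byte HTTP request, the client's sequence number has advanced by 76
client_seq_after_request = next_seq(server_ack, payload_len=76)

print(server_ack, client_ack, client_seq_after_request)
```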
|
||||
|
||||
@@ -1414,28 +1414,28 @@
|
||||
<p>This might sound a little weird to you: Python is, in a way, a compiled language! Python has a built-in compiler! This is obvious in the case of Java, since we compile it using a separate command, i.e. <code>javac helloWorld.java</code>, which produces a <code>.class</code> file containing what we know as <em>bytecode</em>. Well, Python is very similar. One difference is that no separate compile command/binary is needed to run a Python program.</p>
|
||||
<p><strong>What is the difference, then, between Java and Python?</strong>
|
||||
Well, Java's compiler is stricter and more sophisticated. As you might know, Java is a statically typed language, so the compiler is written so that it can catch type-related errors at compile time. Python, being a <em>dynamic</em> language, does not know types until the program is run. So in a way, the Python compiler is dumb (or less strict). But there is indeed a compile step involved when a Python program is run. You might have seen Python bytecode files with the <code>.pyc</code> extension. Here is how you can see the bytecode for a given Python program.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># Create a Hello World</span>
|
||||
$ <span class="nb">echo</span> <span class="s2">"print('hello world')"</span> > hello_world.py
|
||||
<pre><code class="language-bash"># Create a Hello World
|
||||
$ echo "print('hello world')" > hello_world.py
|
||||
|
||||
<span class="c1"># Making sure it runs</span>
|
||||
# Making sure it runs
|
||||
$ python3 hello_world.py
|
||||
hello world
|
||||
|
||||
<span class="c1"># The bytecode of the given program</span>
|
||||
# The bytecode of the given program
|
||||
$ python3 -m dis hello_world.py
|
||||
<span class="m">1</span> <span class="m">0</span> LOAD_NAME <span class="m">0</span> <span class="o">(</span>print<span class="o">)</span>
|
||||
<span class="m">2</span> LOAD_CONST <span class="m">0</span> <span class="o">(</span><span class="s1">'hello world'</span><span class="o">)</span>
|
||||
<span class="m">4</span> CALL_FUNCTION <span class="m">1</span>
|
||||
<span class="m">6</span> POP_TOP
|
||||
<span class="m">8</span> LOAD_CONST <span class="m">1</span> <span class="o">(</span>None<span class="o">)</span>
|
||||
<span class="m">10</span> RETURN_VALUE
|
||||
</code></pre></div>
|
||||
1 0 LOAD_NAME 0 (print)
|
||||
2 LOAD_CONST 0 ('hello world')
|
||||
4 CALL_FUNCTION 1
|
||||
6 POP_TOP
|
||||
8 LOAD_CONST 1 (None)
|
||||
10 RETURN_VALUE
|
||||
</code></pre>
|
||||
<p>Read more about dis module <a href="https://docs.python.org/3/library/dis.html">here</a></p>
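<p>The same compile step can also be triggered from inside Python with the built-in <code>compile()</code> function and inspected with <code>dis</code>. This is a minimal sketch; the filename label <code>&lt;example&gt;</code> is arbitrary, and the exact opcode names printed depend on the Python version you run it with.</p>

```python
import dis

source = "print('hello world')"

# compile() turns source text into a code object without running it
code_obj = compile(source, "<example>", "exec")

# Names and constants referenced by the bytecode are stored on the code object
print(code_obj.co_names)
print(code_obj.co_consts)

# The instruction names vary between Python versions
opnames = [instruction.opname for instruction in dis.get_instructions(code_obj)]
print(opnames)

# Running the code object is a separate step
exec(code_obj)
```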
|
||||
<p>Now coming to C/C++, there of course is a compiler. But its output is different from what a Java/Python compiler produces. Compiling a C program produces what we know as <em>machine code</em>, as opposed to bytecode.</p>
|
||||
<h3 id="running-the-programs">Running The Programs</h3>
|
||||
<p>We know compilation is involved in all 3 languages we are discussing. The compilers are just different in nature and output different types of content. In the case of C/C++, the output is machine code which can be directly read by your operating system. When you execute that program, your OS will know exactly how to run it. <strong>But this is not the case with bytecode.</strong></p>
|
||||
<p>Bytecode is language specific. Python has its own set of bytecode instructions defined (more in the <code>dis</code> module) and so does Java. So naturally, your operating system will not know how to run it. To run this bytecode, we have something called virtual machines, i.e. the JVM or the Python VM (CPython, Jython). These so-called virtual machines are programs which can read the bytecode and run it on a given operating system. Python has multiple VMs available. CPython is a Python VM implemented in C; similarly, Jython is a Java implementation of the Python VM. <strong>At the end of the day, what they should be capable of is to understand Python language syntax, compile it to bytecode and run that bytecode.</strong> You can implement a Python VM in any language! (And people do so, just because it can be done.)</p>
|
||||
<div class="highlight"><pre><span></span><code> The Operating System
|
||||
<pre><code> The Operating System
|
||||
|
||||
+------------------------------------+
|
||||
| |
|
||||
@@ -1465,7 +1465,7 @@ hello_world.c OS Specific machinecode | A New Pr
|
||||
| |
|
||||
| |
|
||||
+------------------------------------+
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Two things to note for above diagram:</p>
|
||||
<ol>
|
||||
<li>Generally, when we run a Python program, a Python VM process is started which reads the Python source code, compiles it to bytecode and runs it in a single step. Compiling is not a separate step; it is shown only for illustration purposes.</li>
|
||||
|
||||
@@ -1338,84 +1338,84 @@
|
||||
<p><strong>Everything in Python is an object.</strong></p>
|
||||
<p>That includes functions, lists, dicts, classes, modules, a running function (an instance of a function definition), everything. In CPython, this means there is an underlying struct variable for each object.</p>
|
||||
<p>In Python's current execution context, all the variables are stored in a dict that maps strings to objects. If you have a function and a float variable defined in the current context, here is how they are handled internally.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">float_number</span><span class="o">=</span><span class="mf">42.0</span>
|
||||
<span class="o">>>></span> <span class="k">def</span> <span class="nf">foo_func</span><span class="p">():</span>
|
||||
<span class="o">...</span> <span class="k">pass</span>
|
||||
<span class="o">...</span>
|
||||
<pre><code class="language-python">>>> float_number=42.0
|
||||
>>> def foo_func():
|
||||
... pass
|
||||
...
|
||||
|
||||
<span class="c1"># NOTICE HOW VARIABLE NAMES ARE STRINGS, stored in a dict</span>
|
||||
<span class="o">>>></span> <span class="nb">locals</span><span class="p">()</span>
|
||||
<span class="p">{</span><span class="s1">'__name__'</span><span class="p">:</span> <span class="s1">'__main__'</span><span class="p">,</span> <span class="s1">'__doc__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__package__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__loader__'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">_frozen_importlib</span><span class="o">.</span><span class="n">BuiltinImporter</span><span class="s1">'>, '</span><span class="n">__spec__</span><span class="s1">': None, '</span><span class="vm">__annotations__</span><span class="s1">': </span><span class="si">{}</span><span class="s1">, '</span><span class="n">__builtins__</span><span class="s1">': <module '</span><span class="n">builtins</span><span class="s1">' (built-in)>, '</span><span class="n">float_number</span><span class="s1">': 42.0, '</span><span class="n">foo_func</span><span class="s1">': <function foo_func at 0x1055847a0>}</span>
|
||||
</code></pre></div>
|
||||
# NOTICE HOW VARIABLE NAMES ARE STRINGS, stored in a dict
|
||||
>>> locals()
|
||||
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'float_number': 42.0, 'foo_func': <function foo_func at 0x1055847a0>}
|
||||
</code></pre>
|
||||
<h2 id="python-functions">Python Functions</h2>
|
||||
<p>Since functions too are objects, we can see all the attributes a function contains as follows:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="k">def</span> <span class="nf">hello</span><span class="p">(</span><span class="n">name</span><span class="p">):</span>
|
||||
<span class="o">...</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Hello, </span><span class="si">{</span><span class="n">name</span><span class="si">}</span><span class="s2">!"</span><span class="p">)</span>
|
||||
<span class="o">...</span>
|
||||
<span class="o">>>></span> <span class="nb">dir</span><span class="p">(</span><span class="n">hello</span><span class="p">)</span>
|
||||
<span class="p">[</span><span class="s1">'__annotations__'</span><span class="p">,</span> <span class="s1">'__call__'</span><span class="p">,</span> <span class="s1">'__class__'</span><span class="p">,</span> <span class="s1">'__closure__'</span><span class="p">,</span> <span class="s1">'__code__'</span><span class="p">,</span> <span class="s1">'__defaults__'</span><span class="p">,</span> <span class="s1">'__delattr__'</span><span class="p">,</span> <span class="s1">'__dict__'</span><span class="p">,</span>
|
||||
<span class="s1">'__dir__'</span><span class="p">,</span> <span class="s1">'__doc__'</span><span class="p">,</span> <span class="s1">'__eq__'</span><span class="p">,</span> <span class="s1">'__format__'</span><span class="p">,</span> <span class="s1">'__ge__'</span><span class="p">,</span> <span class="s1">'__get__'</span><span class="p">,</span> <span class="s1">'__getattribute__'</span><span class="p">,</span> <span class="s1">'__globals__'</span><span class="p">,</span> <span class="s1">'__gt__'</span><span class="p">,</span>
|
||||
<span class="s1">'__hash__'</span><span class="p">,</span> <span class="s1">'__init__'</span><span class="p">,</span> <span class="s1">'__init_subclass__'</span><span class="p">,</span> <span class="s1">'__kwdefaults__'</span><span class="p">,</span> <span class="s1">'__le__'</span><span class="p">,</span> <span class="s1">'__lt__'</span><span class="p">,</span> <span class="s1">'__module__'</span><span class="p">,</span> <span class="s1">'__name__'</span><span class="p">,</span>
|
||||
<span class="s1">'__ne__'</span><span class="p">,</span> <span class="s1">'__new__'</span><span class="p">,</span> <span class="s1">'__qualname__'</span><span class="p">,</span> <span class="s1">'__reduce__'</span><span class="p">,</span> <span class="s1">'__reduce_ex__'</span><span class="p">,</span> <span class="s1">'__repr__'</span><span class="p">,</span> <span class="s1">'__setattr__'</span><span class="p">,</span> <span class="s1">'__sizeof__'</span><span class="p">,</span> <span class="s1">'__str__'</span><span class="p">,</span>
|
||||
<span class="s1">'__subclasshook__'</span><span class="p">]</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-python">>>> def hello(name):
|
||||
... print(f"Hello, {name}!")
|
||||
...
|
||||
>>> dir(hello)
|
||||
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__',
|
||||
'__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__',
|
||||
'__hash__', '__init__', '__init_subclass__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__',
|
||||
'__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
|
||||
'__subclasshook__']
|
||||
</code></pre>
|
||||
<p>While there are a lot of them, let's look at some interesting ones</p>
|
||||
<h4 id="globals"><strong>globals</strong></h4>
|
||||
<p>This attribute, as the name suggests, holds references to global variables. If you ever need to know which global variables are in the scope of this function, this will tell you. See how the function starts seeing the new variable in globals.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__globals__</span>
|
||||
<span class="p">{</span><span class="s1">'__name__'</span><span class="p">:</span> <span class="s1">'__main__'</span><span class="p">,</span> <span class="s1">'__doc__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__package__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__loader__'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">_frozen_importlib</span><span class="o">.</span><span class="n">BuiltinImporter</span><span class="s1">'>, '</span><span class="n">__spec__</span><span class="s1">': None, '</span><span class="vm">__annotations__</span><span class="s1">': </span><span class="si">{}</span><span class="s1">, '</span><span class="n">__builtins__</span><span class="s1">': <module '</span><span class="n">builtins</span><span class="s1">' (built-in)>, '</span><span class="n">hello</span><span class="s1">': <function hello at 0x7fe4e82554c0>}</span>
|
||||
<pre><code class="language-python">>>> hello.__globals__
|
||||
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'hello': <function hello at 0x7fe4e82554c0>}
|
||||
|
||||
<span class="c1"># adding new global variable</span>
|
||||
<span class="o">>>></span> <span class="n">GLOBAL</span><span class="o">=</span><span class="s2">"g_val"</span>
|
||||
<span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__globals__</span>
|
||||
<span class="p">{</span><span class="s1">'__name__'</span><span class="p">:</span> <span class="s1">'__main__'</span><span class="p">,</span> <span class="s1">'__doc__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__package__'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'__loader__'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">_frozen_importlib</span><span class="o">.</span><span class="n">BuiltinImporter</span><span class="s1">'>, '</span><span class="n">__spec__</span><span class="s1">': None, '</span><span class="vm">__annotations__</span><span class="s1">': </span><span class="si">{}</span><span class="s1">, '</span><span class="n">__builtins__</span><span class="s1">': <module '</span><span class="n">builtins</span><span class="s1">' (built-in)>, '</span><span class="n">hello</span><span class="s1">': <function hello at 0x7fe4e82554c0>, '</span><span class="n">GLOBAL</span><span class="s1">': '</span><span class="n">g_val</span><span class="s1">'}</span>
|
||||
</code></pre></div>
|
||||
# adding new global variable
|
||||
>>> GLOBAL="g_val"
|
||||
>>> hello.__globals__
|
||||
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'hello': <function hello at 0x7fe4e82554c0>, 'GLOBAL': 'g_val'}
|
||||
</code></pre>
|
||||
<h3 id="code"><strong>code</strong></h3>
|
||||
<p>This is an interesting one! As everything in Python is an object, this includes the bytecode too. The compiled Python bytecode is a Python code object, which is accessible via the <code>__code__</code> attribute here. A function has an associated code object which carries some interesting information.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1"># the file in which function is defined</span>
|
||||
<span class="c1"># stdin here since this is run in an interpreter</span>
|
||||
<span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_filename</span>
|
||||
<span class="s1">'<stdin>'</span>
|
||||
<pre><code class="language-python"># the file in which function is defined
|
||||
# stdin here since this is run in an interpreter
|
||||
>>> hello.__code__.co_filename
|
||||
'<stdin>'
|
||||
|
||||
<span class="c1"># number of arguments the function takes</span>
|
||||
<span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_argcount</span>
|
||||
<span class="mi">1</span>
|
||||
# number of arguments the function takes
|
||||
>>> hello.__code__.co_argcount
|
||||
1
|
||||
|
||||
<span class="c1"># local variable names</span>
|
||||
<span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_varnames</span>
|
||||
<span class="p">(</span><span class="s1">'name'</span><span class="p">,)</span>
|
||||
# local variable names
|
||||
>>> hello.__code__.co_varnames
|
||||
('name',)
|
||||
|
||||
<span class="c1"># the function code's compiled bytecode</span>
|
||||
<span class="o">>>></span> <span class="n">hello</span><span class="o">.</span><span class="vm">__code__</span><span class="o">.</span><span class="n">co_code</span>
|
||||
<span class="sa">b</span><span class="s1">'t</span><span class="se">\x00</span><span class="s1">d</span><span class="se">\x01</span><span class="s1">|</span><span class="se">\x00\x9b\x00</span><span class="s1">d</span><span class="se">\x02\x9d\x03\x83\x01\x01\x00</span><span class="s1">d</span><span class="se">\x00</span><span class="s1">S</span><span class="se">\x00</span><span class="s1">'</span>
|
||||
</code></pre></div>
|
||||
# the function code's compiled bytecode
|
||||
>>> hello.__code__.co_code
|
||||
b't\x00d\x01|\x00\x9b\x00d\x02\x9d\x03\x83\x01\x01\x00d\x00S\x00'
|
||||
</code></pre>
|
||||
<p>There are more code attributes, which you can list with <code>>>> dir(hello.__code__)</code></p>
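<p>As a small script-form illustration of the same attributes, here is a sketch using a slightly different function than the one in the session above: this <code>hello</code> has an extra local variable <code>greeting</code> (an addition made only for this example) so that <code>co_varnames</code> shows both the argument and the local.</p>

```python
def hello(name):
    # `greeting` is a local variable, so it appears in co_varnames after `name`
    greeting = f"Hello, {name}!"
    return greeting


code = hello.__code__

print(code.co_name)       # name the function was defined with
print(code.co_argcount)   # number of positional arguments
print(code.co_varnames)   # arguments first, then other local variables
print(type(code.co_code)) # the raw bytecode is a bytes object
```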
|
||||
<h2 id="decorators">Decorators</h2>
|
||||
<p>Related to functions, python has another feature called decorators. Let's see how that works, keeping <code>everything is an object</code> in mind.</p>
|
||||
<p>Here is a sample decorator:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="k">def</span> <span class="nf">deco</span><span class="p">(</span><span class="n">func</span><span class="p">):</span>
|
||||
<span class="o">...</span> <span class="k">def</span> <span class="nf">inner</span><span class="p">():</span>
|
||||
<span class="o">...</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"before"</span><span class="p">)</span>
|
||||
<span class="o">...</span> <span class="n">func</span><span class="p">()</span>
|
||||
<span class="o">...</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"after"</span><span class="p">)</span>
|
||||
<span class="o">...</span> <span class="k">return</span> <span class="n">inner</span>
|
||||
<span class="o">...</span>
|
||||
<span class="o">>>></span> <span class="nd">@deco</span>
|
||||
<span class="o">...</span> <span class="k">def</span> <span class="nf">hello_world</span><span class="p">():</span>
|
||||
<span class="o">...</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"hello world"</span><span class="p">)</span>
|
||||
<span class="o">...</span>
|
||||
<span class="o">>>></span>
|
||||
<span class="o">>>></span> <span class="n">hello_world</span><span class="p">()</span>
|
||||
<span class="n">before</span>
|
||||
<span class="n">hello</span> <span class="n">world</span>
|
||||
<span class="n">after</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-python">>>> def deco(func):
|
||||
...     def inner():
|
||||
...         print("before")
|
||||
...         func()
|
||||
...         print("after")
|
||||
...     return inner
|
||||
...
|
||||
>>> @deco
|
||||
... def hello_world():
|
||||
...     print("hello world")
|
||||
...
|
||||
>>>
|
||||
>>> hello_world()
|
||||
before
|
||||
hello world
|
||||
after
|
||||
</code></pre>
|
||||
<p>Here the <code>@deco</code> syntax is used to decorate the <code>hello_world</code> function. It is essentially the same as doing:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="k">def</span> <span class="nf">hello_world</span><span class="p">():</span>
|
||||
<span class="o">...</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"hello world"</span><span class="p">)</span>
|
||||
<span class="o">...</span>
|
||||
<span class="o">>>></span> <span class="n">hello_world</span> <span class="o">=</span> <span class="n">deco</span><span class="p">(</span><span class="n">hello_world</span><span class="p">)</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-python">>>> def hello_world():
|
||||
...     print("hello world")
|
||||
...
|
||||
>>> hello_world = deco(hello_world)
|
||||
</code></pre>
|
||||
<p>What happens inside the <code>deco</code> function might seem complex. Let's try to uncover it.</p>
|
||||
<ol>
|
||||
<li>Function <code>hello_world</code> is created</li>
|
||||
@@ -1429,7 +1429,7 @@
|
||||
<li><code>hello_world</code> is replaced with above function</li>
|
||||
</ol>
|
||||
<p>Let's visualize it for better understanding</p>
|
||||
<div class="highlight"><pre><span></span><code> BEFORE function_object (ID: 100)
|
||||
<pre><code> BEFORE function_object (ID: 100)
|
||||
|
||||
"hello_world" +--------------------+
|
||||
+ |print("hello_world")|
|
||||
@@ -1456,7 +1456,7 @@
|
||||
|
|
||||
|
|
||||
"hello_world" +-------------+
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>Note how the <code>hello_world</code> name now points to a new function object, while that new function object still holds a reference (ID) to the original function.</p>
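<p>The relationship described above can be checked from Python itself. This is a minimal sketch reusing the <code>deco</code> example; the names <code>original</code> and <code>captured</code> are introduced here only for illustration. The wrapper keeps the original function alive in a closure cell.</p>

```python
def deco(func):
    def inner():
        print("before")
        func()
        print("after")
    return inner


def hello_world():
    print("hello world")


original = hello_world           # keep a handle on the undecorated function
hello_world = deco(hello_world)  # what the @deco syntax does for us

# `inner` refers to `func` as a free variable, so Python stores it in a closure cell.
captured = hello_world.__closure__[0].cell_contents

print(hello_world.__name__)  # the name now points at `inner`
print(captured is original)  # the wrapper still references the original function
```

<p>Because the name now resolves to <code>inner</code>, attributes such as <code>__name__</code> change as well; <code>functools.wraps</code> from the standard library is the usual way to copy them over from the original function.</p>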
|
||||
<h2 id="some-gotchas">Some Gotchas</h2>
|
||||
<ul>
|
||||
|
||||
@@ -1284,31 +1284,31 @@
|
||||
<p>Since serving web requests is no longer a simple task like reading files from disk and returning their contents, we need to process each HTTP request, perform some operations programmatically and construct a response.</p>
|
||||
<h2 id="sockets">Sockets</h2>
|
||||
<p>Though we have frameworks like Flask, HTTP is still a protocol that works over TCP. So let us set up a TCP server, send it an HTTP request and inspect the request's payload. Note that this is not a tutorial on socket programming; what we are doing here is inspecting the HTTP protocol at its ground level to see what its contents look like. (Ref: <a href="https://realpython.com/python-sockets/">Socket Programming in Python (Guide) on RealPython</a>)</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">socket</span>
|
||||
<pre><code class="language-python">import socket
|
||||
|
||||
<span class="n">HOST</span> <span class="o">=</span> <span class="s1">'127.0.0.1'</span> <span class="c1"># Standard loopback interface address (localhost)</span>
|
||||
<span class="n">PORT</span> <span class="o">=</span> <span class="mi">65432</span> <span class="c1"># Port to listen on (non-privileged ports are > 1023)</span>
|
||||
HOST = '127.0.0.1' # Standard loopback interface address (localhost)
|
||||
PORT = 65432 # Port to listen on (non-privileged ports are > 1023)
|
||||
|
||||
<span class="k">with</span> <span class="n">socket</span><span class="o">.</span><span class="n">socket</span><span class="p">(</span><span class="n">socket</span><span class="o">.</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">socket</span><span class="o">.</span><span class="n">SOCK_STREAM</span><span class="p">)</span> <span class="k">as</span> <span class="n">s</span><span class="p">:</span>
|
||||
<span class="n">s</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="n">HOST</span><span class="p">,</span> <span class="n">PORT</span><span class="p">))</span>
|
||||
<span class="n">s</span><span class="o">.</span><span class="n">listen</span><span class="p">()</span>
|
||||
<span class="n">conn</span><span class="p">,</span> <span class="n">addr</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="n">accept</span><span class="p">()</span>
|
||||
<span class="k">with</span> <span class="n">conn</span><span class="p">:</span>
|
||||
<span class="nb">print</span><span class="p">(</span><span class="s1">'Connected by'</span><span class="p">,</span> <span class="n">addr</span><span class="p">)</span>
|
||||
<span class="k">while</span> <span class="kc">True</span><span class="p">:</span>
|
||||
<span class="n">data</span> <span class="o">=</span> <span class="n">conn</span><span class="o">.</span><span class="n">recv</span><span class="p">(</span><span class="mi">1024</span><span class="p">)</span>
|
||||
<span class="k">if</span> <span class="ow">not</span> <span class="n">data</span><span class="p">:</span>
|
||||
<span class="k">break</span>
|
||||
<span class="nb">print</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
|
||||
</code></pre></div>
|
||||
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
|
||||
    s.bind((HOST, PORT))
|
||||
    s.listen()
|
||||
    conn, addr = s.accept()
|
||||
    with conn:
|
||||
        print('Connected by', addr)
|
||||
        while True:
|
||||
            data = conn.recv(1024)
|
||||
            if not data:
|
||||
                break
|
||||
            print(data)
|
||||
</code></pre>
|
||||
<p>Then we open <code>localhost:65432</code> in our web browser and the following is the output:</p>
|
||||
<div class="highlight"><pre><span></span><code>Connected by <span class="o">(</span><span class="s1">'127.0.0.1'</span>, <span class="m">54719</span><span class="o">)</span>
|
||||
b<span class="s1">'GET / HTTP/1.1\r\nHost: localhost:65432\r\nConnection: keep-alive\r\nDNT: 1\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36 Edg/85.0.564.44\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\nSec-Fetch-Site: none\r\nSec-Fetch-Mode: navigate\r\nSec-Fetch-User: ?1\r\nSec-Fetch-Dest: document\r\nAccept-Encoding: gzip, deflate, br\r\nAccept-Language: en-US,en;q=0.9\r\n\r\n'</span>
|
||||
</code></pre></div>
|
||||
<pre><code class="language-bash">Connected by ('127.0.0.1', 54719)
|
||||
b'GET / HTTP/1.1\r\nHost: localhost:65432\r\nConnection: keep-alive\r\nDNT: 1\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36 Edg/85.0.564.44\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\nSec-Fetch-Site: none\r\nSec-Fetch-Mode: navigate\r\nSec-Fetch-User: ?1\r\nSec-Fetch-Dest: document\r\nAccept-Encoding: gzip, deflate, br\r\nAccept-Language: en-US,en;q=0.9\r\n\r\n'
|
||||
</code></pre>
|
||||
<p>Examine it closely and you will see that the content follows the HTTP protocol's format, i.e.:</p>
|
||||
<div class="highlight"><pre><span></span><code>HTTP_METHOD URI_PATH HTTP_VERSION
|
||||
<pre><code>HTTP_METHOD URI_PATH HTTP_VERSION
|
||||
HEADERS_SEPARATED_BY_SEPARATOR
|
||||
</code></pre></div>
|
||||
</code></pre>
|
||||
<p>So though it's a blob of bytes, knowing the <a href="https://tools.ietf.org/html/rfc2616">HTTP protocol specification</a>, you can parse that string (i.e. split by <code>\r\n</code>) and get meaningful information out of it.</p>
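<p>A rough sketch of such parsing is shown below. <code>parse_http_request</code> is a hypothetical helper written for this illustration, not part of any library, and it ignores many details that real servers handle (repeated headers, malformed input, bodies arriving in several chunks).</p>

```python
def parse_http_request(raw: bytes):
    """Split a raw HTTP/1.x request into request line, headers and body."""
    # An empty line (\r\n\r\n) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line: HTTP_METHOD URI_PATH HTTP_VERSION
    method, path, version = lines[0].split(" ", 2)

    # Remaining lines: "Name: value" headers.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, path, version, headers, body


raw = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost:65432\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)
method, path, version, headers, body = parse_http_request(raw)
print(method, path, version)
print(headers["host"])
```

<p>Header names are lower-cased in this sketch because HTTP header names are case-insensitive.</p>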
|
||||
<h2 id="flask">Flask</h2>
|
||||
<p>Flask and other such frameworks do pretty much what we just discussed in the last section (with added sophistication). They listen on a port on a TCP socket, receive an HTTP request, parse the data according to the protocol format and make it available to you in a convenient manner.</p>
|
||||
|
||||
@@ -1369,87 +1369,87 @@
|
||||
<h3 id="5-other">5. Other</h3>
|
||||
<p>We are not accounting for users in our app, nor for other possible features like rate limiting or customized links, but these will eventually come up with time. Depending on the requirements, they too might need to be incorporated.</p>
|
||||
<p>The minimal working code is given below for reference but I'd encourage you to come up with your own.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">flask</span> <span class="kn">import</span> <span class="n">Flask</span><span class="p">,</span> <span class="n">redirect</span><span class="p">,</span> <span class="n">request</span>
|
||||
<pre><code class="language-python">from flask import Flask, redirect, request
|
||||
|
||||
<span class="kn">from</span> <span class="nn">hashlib</span> <span class="kn">import</span> <span class="n">md5</span>
|
||||
from hashlib import md5
|
||||
|
||||
<span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="s2">"url_shortener"</span><span class="p">)</span>
|
||||
app = Flask("url_shortener")
|
||||
|
||||
<span class="n">mapping</span> <span class="o">=</span> <span class="p">{}</span>
|
||||
mapping = {}
|
||||
|
||||
<span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/shorten"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s2">"POST"</span><span class="p">])</span>
|
||||
<span class="k">def</span> <span class="nf">shorten</span><span class="p">():</span>
|
||||
<span class="k">global</span> <span class="n">mapping</span>
|
||||
<span class="n">payload</span> <span class="o">=</span> <span class="n">request</span><span class="o">.</span><span class="n">json</span>
|
||||
@app.route("/shorten", methods=["POST"])
|
||||
def shorten():
|
||||
    global mapping
|
||||
    payload = request.json
|
||||
|
||||
<span class="k">if</span> <span class="s2">"url"</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">payload</span><span class="p">:</span>
|
||||
<span class="k">return</span> <span class="s2">"Missing URL Parameter"</span><span class="p">,</span> <span class="mi">400</span>
|
||||
    if "url" not in payload:
|
||||
        return "Missing URL Parameter", 400
|
||||
|
||||
<span class="c1"># TODO: check if URL is valid</span>
|
||||
    # TODO: check if URL is valid
|
||||
|
||||
<span class="n">hash_</span> <span class="o">=</span> <span class="n">md5</span><span class="p">()</span>
|
||||
<span class="n">hash_</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">payload</span><span class="p">[</span><span class="s2">"url"</span><span class="p">]</span><span class="o">.</span><span class="n">encode</span><span class="p">())</span>
|
||||
<span class="n">digest</span> <span class="o">=</span> <span class="n">hash_</span><span class="o">.</span><span class="n">hexdigest</span><span class="p">()[:</span><span class="mi">5</span><span class="p">]</span> <span class="c1"># limiting to 5 chars. The shorter the digest, the higher the chance of collision</span>
|
||||
    hash_ = md5()
|
||||
    hash_.update(payload["url"].encode())
|
||||
    digest = hash_.hexdigest()[:5]  # limiting to 5 chars. The shorter the digest, the higher the chance of collision
|
||||
|
||||
<span class="k">if</span> <span class="n">digest</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">:</span>
|
||||
<span class="n">mapping</span><span class="p">[</span><span class="n">digest</span><span class="p">]</span> <span class="o">=</span> <span class="n">payload</span><span class="p">[</span><span class="s2">"url"</span><span class="p">]</span>
|
||||
<span class="k">return</span> <span class="sa">f</span><span class="s2">"Shortened: r/</span><span class="si">{</span><span class="n">digest</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
|
||||
<span class="k">else</span><span class="p">:</span>
|
||||
<span class="c1"># TODO: check for hash collision</span>
|
||||
<span class="k">return</span> <span class="sa">f</span><span class="s2">"Already exists: r/</span><span class="si">{</span><span class="n">digest</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
|
||||
    if digest not in mapping:
|
||||
        mapping[digest] = payload["url"]
|
||||
        return f"Shortened: r/{digest}\n"
|
||||
    else:
|
||||
        # TODO: check for hash collision
|
||||
        return f"Already exists: r/{digest}\n"
|
||||
|
||||
|
||||
<span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/r/<hash_>"</span><span class="p">)</span>
|
||||
<span class="k">def</span> <span class="nf">redirect_</span><span class="p">(</span><span class="n">hash_</span><span class="p">):</span>
|
||||
<span class="k">if</span> <span class="n">hash_</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">mapping</span><span class="p">:</span>
|
||||
<span class="k">return</span> <span class="s2">"URL Not Found"</span><span class="p">,</span> <span class="mi">404</span>
|
||||
<span class="k">return</span> <span class="n">redirect</span><span class="p">(</span><span class="n">mapping</span><span class="p">[</span><span class="n">hash_</span><span class="p">])</span>
|
||||
@app.route("/r/<hash_>")
|
||||
def redirect_(hash_):
|
||||
    if hash_ not in mapping:
|
||||
        return "URL Not Found", 404
|
||||
    return redirect(mapping[hash_])
|
||||
|
||||
|
||||
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">"__main__"</span><span class="p">:</span>
|
||||
<span class="n">app</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">debug</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
|
||||
if __name__ == "__main__":
|
||||
    app.run(debug=True)
|
||||
|
||||
<span class="sd">"""</span>
|
||||
<span class="sd">OUTPUT:</span>
|
||||
"""
|
||||
OUTPUT:
|
||||
|
||||
|
||||
<span class="sd">===> SHORTENING</span>
|
||||
===> SHORTENING
|
||||
|
||||
<span class="sd">$ curl localhost:5000/shorten -H "content-type: application/json" --data '{"url":"https://linkedin.com"}'</span>
|
||||
<span class="sd">Shortened: r/a62a4</span>
|
||||
$ curl localhost:5000/shorten -H "content-type: application/json" --data '{"url":"https://linkedin.com"}'
|
||||
Shortened: r/a62a4
|
||||
|
||||
|
||||
<span class="sd">===> REDIRECTING, notice the response code 302 and the location header</span>
|
||||
===> REDIRECTING, notice the response code 302 and the location header
|
||||
|
||||
<span class="sd">$ curl localhost:5000/r/a62a4 -v</span>
|
||||
<span class="sd">* Uses proxy env variable NO_PROXY == '127.0.0.1'</span>
|
||||
<span class="sd">* Trying ::1...</span>
|
||||
<span class="sd">* TCP_NODELAY set</span>
|
||||
<span class="sd">* Connection failed</span>
|
||||
<span class="sd">* connect to ::1 port 5000 failed: Connection refused</span>
|
||||
<span class="sd">* Trying 127.0.0.1...</span>
|
||||
<span class="sd">* TCP_NODELAY set</span>
|
||||
<span class="sd">* Connected to localhost (127.0.0.1) port 5000 (#0)</span>
|
||||
<span class="sd">> GET /r/a62a4 HTTP/1.1</span>
|
||||
<span class="sd">> Host: localhost:5000</span>
|
||||
<span class="sd">> User-Agent: curl/7.64.1</span>
|
||||
<span class="sd">> Accept: */*</span>
|
||||
<span class="sd">></span>
|
||||
<span class="sd">* HTTP 1.0, assume close after body</span>
|
||||
<span class="sd">< HTTP/1.0 302 FOUND</span>
|
||||
<span class="sd">< Content-Type: text/html; charset=utf-8</span>
|
||||
<span class="sd">< Content-Length: 247</span>
|
||||
<span class="sd">< Location: https://linkedin.com</span>
|
||||
<span class="sd">< Server: Werkzeug/0.15.4 Python/3.7.7</span>
|
||||
<span class="sd">< Date: Tue, 27 Oct 2020 09:37:12 GMT</span>
|
||||
<span class="sd"><</span>
|
||||
<span class="sd"><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"></span>
|
||||
<span class="sd"><title>Redirecting...</title></span>
|
||||
<span class="sd"><h1>Redirecting...</h1></span>
|
||||
<span class="sd">* Closing connection 0</span>
|
||||
<span class="sd"><p>You should be redirected automatically to target URL: <a href="https://linkedin.com">https://linkedin.com</a>. If not click the link.</span>
|
||||
<span class="sd">"""</span>
|
||||
</code></pre></div>
|
||||
$ curl localhost:5000/r/a62a4 -v
|
||||
* Uses proxy env variable NO_PROXY == '127.0.0.1'
|
||||
* Trying ::1...
|
||||
* TCP_NODELAY set
|
||||
* Connection failed
|
||||
* connect to ::1 port 5000 failed: Connection refused
|
||||
* Trying 127.0.0.1...
|
||||
* TCP_NODELAY set
|
||||
* Connected to localhost (127.0.0.1) port 5000 (#0)
|
||||
> GET /r/a62a4 HTTP/1.1
|
||||
> Host: localhost:5000
|
||||
> User-Agent: curl/7.64.1
|
||||
> Accept: */*
|
||||
>
|
||||
* HTTP 1.0, assume close after body
|
||||
< HTTP/1.0 302 FOUND
|
||||
< Content-Type: text/html; charset=utf-8
|
||||
< Content-Length: 247
|
||||
< Location: https://linkedin.com
|
||||
< Server: Werkzeug/0.15.4 Python/3.7.7
|
||||
< Date: Tue, 27 Oct 2020 09:37:12 GMT
|
||||
<
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
|
||||
<title>Redirecting...</title>
|
||||
<h1>Redirecting...</h1>
|
||||
* Closing connection 0
|
||||
<p>You should be redirected automatically to target URL: <a href="https://linkedin.com">https://linkedin.com</a>. If not click the link.
|
||||
"""
|
||||
</code></pre>
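<p>The code above leaves hash collisions as a TODO. One possible way to handle them, shown here only as a sketch, is to extend the digest prefix until a free slot (or the same URL) is found. The standalone <code>shorten</code> function and its <code>length</code> parameter are hypothetical and kept free of Flask so the logic can be tried on its own.</p>

```python
from hashlib import md5

mapping = {}


def shorten(url: str, length: int = 5) -> str:
    """Return a short digest for url, growing the prefix on collision."""
    full = md5(url.encode()).hexdigest()

    # Try the 5-char prefix first, then progressively longer ones.
    for end in range(length, len(full) + 1):
        digest = full[:end]
        existing = mapping.get(digest)
        if existing is None:
            mapping[digest] = url  # free slot: claim it
            return digest
        if existing == url:
            return digest          # same URL was shortened before
        # Otherwise another URL owns this digest: a collision, so try a longer prefix.

    raise RuntimeError("could not find a free digest for this URL")


first = shorten("https://linkedin.com")
again = shorten("https://linkedin.com")
print(first, again)
```

<p>Other strategies, such as adding a salt and re-hashing or using a counter-based ID, are equally valid; which one fits depends on the requirements discussed earlier.</p>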
|
||||
|
||||
|
||||
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -1516,15 +1516,16 @@
|
||||
<ul>
|
||||
<li>Applications regularly fail to process transactions for many reasons. How they fail can determine if an application is secure or not.</li>
|
||||
</ul>
|
||||
<p><div class="highlight"><pre><span></span><code><span class="n">is_admin</span> <span class="o">=</span> <span class="kc">true</span><span class="p">;</span>
|
||||
<span class="k">try</span> <span class="p">{</span>
|
||||
<span class="n">code_which_may_fail</span><span class="p">();</span>
|
||||
<span class="n">is_admin</span> <span class="o">=</span> <span class="n">is_user_assigned_role</span><span class="p">(</span><span class="s">"Administrator"</span><span class="p">);</span>
|
||||
<span class="p">}</span>
|
||||
<span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">err</span><span class="p">)</span> <span class="p">{</span>
|
||||
<span class="n">log</span><span class="p">.</span><span class="na">error</span><span class="p">(</span><span class="n">err</span><span class="p">.</span><span class="na">toString</span><span class="p">());</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<pre><code>is_admin = true;
|
||||
try {
|
||||
    code_which_may_fail();
|
||||
    is_admin = is_user_assigned_role("Administrator");
|
||||
}
|
||||
catch (Exception err) {
|
||||
    log.error(err.toString());
|
||||
}
|
||||
</code></pre>
|
||||
<p>If either the call that may fail or the role check throws an exception, <code>is_admin</code> keeps its initial value of <code>true</code>, so the user is an admin by default. This is obviously a security risk.</p>
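<p>A fail-secure variant starts from the least privileged state, so that any failure leaves the user without admin rights. The sketch below is a Python rendering of the pseudocode above; both helper functions are stand-ins written for this illustration.</p>

```python
import logging

log = logging.getLogger(__name__)


def code_which_may_fail():
    # Stand-in for any operation that can raise; here it always fails.
    raise RuntimeError("simulated failure")


def is_user_assigned_role(role):
    # Stand-in for a real role lookup.
    return False


def check_admin():
    is_admin = False  # deny by default
    try:
        code_which_may_fail()
        is_admin = is_user_assigned_role("Administrator")
    except Exception as err:
        # On failure we only log; is_admin stays False.
        log.error("admin check failed: %s", err)
    return is_admin


print(check_admin())
```

<p>The only change from the insecure version is the initial value, which is what decides the outcome whenever the <code>try</code> block does not complete.</p>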
|
||||
</li>
|
||||
<li>
|
||||
@@ -1596,14 +1597,17 @@
|
||||
<ul>
|
||||
<li>Ciphers are the cornerstone of cryptography. A cipher is a set of algorithms that performs encryption or decryption on a message. An encryption algorithm (E) takes a secret key (k) and a message (m) and produces a ciphertext (c). Similarly, a decryption algorithm (D) takes the secret key (k) and the resulting ciphertext (c) and recovers the message (m). They are represented as follows:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre><span></span><code>E(k,m) = c
|
||||
<pre><code>
|
||||
E(k,m) = c
|
||||
D(k,c) = m
|
||||
</code></pre></div>
|
||||
|
||||
</code></pre>
|
||||
<ul>
|
||||
<li>This also means that, to be a cipher, the pair (E, D) must satisfy the following consistency equation, which is what makes decryption possible.</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre><span></span><code>D(k,E(k,m)) = m
|
||||
</code></pre></div>
|
||||
<pre><code>
|
||||
D(k,E(k,m)) = m
|
||||
</code></pre>
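<p>To make the notation concrete, here is a toy cipher that satisfies the consistency equation: a byte-wise XOR with a random key as long as the message. It is a sketch for illustration only, not something to use for protecting real data, and the function names simply mirror the symbols above.</p>

```python
import secrets


def E(k: bytes, m: bytes) -> bytes:
    """Encrypt m by XOR-ing each byte with the matching key byte."""
    if len(k) != len(m):
        raise ValueError("key must be exactly as long as the message")
    return bytes(kb ^ mb for kb, mb in zip(k, m))


def D(k: bytes, c: bytes) -> bytes:
    """Decrypt c; XOR is its own inverse, so this reuses E."""
    return E(k, c)


m = b"attack at dawn"
k = secrets.token_bytes(len(m))  # random key, same length as the message
c = E(k, m)

# Consistency equation: D(k, E(k, m)) = m
print(D(k, c) == m)
```

<p>XOR works here because applying the same key twice cancels out, which is exactly the property the consistency equation asks for.</p>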
|
||||
<p>Stream Ciphers:</p>
|
||||
<ul>
|
||||
<li>The message is broken into characters or bits and enciphered with a key or keystream that is as long as the plaintext bitstream. The keystream should be random and generated independently of the message stream.</li>
|
||||
|
||||
@@ -1835,8 +1835,8 @@ Correspondence between layers of the TCP/IP architecture and the OSI model. Also
|
||||
<li>Nmap is often used to determine alive hosts in a network, open ports on those hosts, services running on those open ports, and version identification of that service on that port.</li>
|
||||
<li>More at http://scanme.nmap.org/</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre><span></span><code>nmap <span class="o">[</span>scan type<span class="o">]</span> <span class="o">[</span>options<span class="o">]</span> <span class="o">[</span>target specification<span class="o">]</span>
|
||||
</code></pre></div>
|
||||
<pre><code>nmap [scan type] [options] [target specification]
|
||||
</code></pre>
|
||||
<p>Nmap uses 6 different port states:</p>
|
||||
<ul>
|
||||
<li><strong>Open</strong> — An open port is one that is actively accepting TCP, UDP or SCTP connections. Open ports are what interests us the most because they are the ones that are vulnerable to attacks. Open ports also show the available services on a network.</li>
|
||||
@@ -2195,13 +2195,13 @@ IDS sensors can be software and hardware-based used to collect and analyze the n
|
||||
<p>Abuse of the normal operation or settings of these flags can be used by attackers to launch DoS attacks. This causes network servers or web servers to crash or hang.</p>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre><span></span><code>| SYN | FIN | PSH | RST | Validity|
|
||||
<pre><code>| SYN | FIN | PSH | RST | Validity|
|
||||
|-----|-----|-----|-----|---------------------|
|
||||
|  1  |  1  |  0  |  0  | Illegal Combination |
|
||||
|  1  |  1  |  1  |  0  | Illegal Combination |
|
||||
|  1  |  1  |  0  |  1  | Illegal Combination |
|
||||
|  1  |  1  |  1  |  1  | Illegal Combination |
|
||||
</code></pre></div>
|
||||
</code></pre>
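<p>Every row in the table has both SYN and FIN set, so a detector only needs to test those two bits. The sketch below uses the flag bit positions of the TCP header; <code>is_illegal_combination</code> is a hypothetical helper, and a real IDS would inspect many more conditions.</p>

```python
# Bit masks of the TCP header flags used in the table above.
FIN = 0x01
SYN = 0x02
RST = 0x04
PSH = 0x08


def is_illegal_combination(flags: int) -> bool:
    """Return True when SYN and FIN are both set, whatever PSH and RST are."""
    return bool(flags & SYN) and bool(flags & FIN)


# The four rows of the table.
rows = [SYN | FIN, SYN | FIN | PSH, SYN | FIN | RST, SYN | FIN | PSH | RST]
print([is_illegal_combination(flags) for flags in rows])

# A plain SYN, as sent when opening a connection, is fine.
print(is_illegal_combination(SYN))
```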
|
||||
<ul>
|
||||
<li>The attacker's ultimate goal is to write special programs or pieces of code that can construct these illegal combinations resulting in an efficient DoS attack.</li>
|
||||
</ul>
|
||||
|
||||
@@ -1873,14 +1873,14 @@ the typical time to live (TTL) for cached entries is a couple of hours, thereby
|
||||
<li>A successful exploit will allow attackers to access, modify, or delete information in the database.</li>
|
||||
<li>It permits attackers to steal sensitive information stored within the backend databases of affected websites, which may include such things as user credentials, email addresses, personal information, and credit card numbers.</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">SELECT</span> <span class="n">USERNAME</span><span class="p">,</span><span class="n">PASSWORD</span> <span class="k">from</span> <span class="n">USERS</span> <span class="k">where</span> <span class="n">USERNAME</span><span class="o">=</span><span class="s1">'<username>'</span> <span class="k">AND</span> <span class="n">PASSWORD</span><span class="o">=</span><span class="s1">'<password>'</span><span class="p">;</span>
|
||||
<pre><code>SELECT USERNAME,PASSWORD from USERS where USERNAME='<username>' AND PASSWORD='<password>';
|
||||
|
||||
<span class="n">Here</span> <span class="n">the</span> <span class="n">username</span> <span class="o">&</span> <span class="n">password</span> <span class="k">is</span> <span class="n">the</span> <span class="k">input</span> <span class="n">provided</span> <span class="k">by</span> <span class="n">the</span> <span class="k">user</span><span class="p">.</span> <span class="n">Suppose</span> <span class="n">an</span> <span class="n">attacker</span> <span class="n">gives</span> <span class="n">the</span> <span class="k">input</span> <span class="k">as</span> <span class="ss">" OR '1'='1'"</span> <span class="k">in</span> <span class="k">both</span> <span class="n">fields</span><span class="p">.</span> <span class="n">Therefore</span> <span class="n">the</span> <span class="k">SQL</span> <span class="n">query</span> <span class="n">will</span> <span class="n">look</span> <span class="k">like</span><span class="p">:</span>
|
||||
Here the username & password are the inputs provided by the user. Suppose an attacker enters ' OR '1'='1 in both fields. The SQL query then becomes:
|
||||
|
||||
<span class="k">SELECT</span> <span class="n">USERNAME</span><span class="p">,</span><span class="n">PASSWORD</span> <span class="k">from</span> <span class="n">USERS</span> <span class="k">where</span> <span class="n">USERNAME</span><span class="o">=</span><span class="s1">''</span> <span class="k">OR</span> <span class="s1">'1'</span><span class="o">=</span><span class="s1">'1'</span> <span class="k">AND</span> <span class="n">PASSWORD</span><span class="o">=</span><span class="s1">''</span> <span class="k">OR</span> <span class="s1">'1'</span><span class="o">=</span><span class="s1">'1'</span><span class="p">;</span>
|
||||
SELECT USERNAME,PASSWORD from USERS where USERNAME='' OR '1'='1' AND PASSWORD='' OR '1'='1';
|
||||
|
||||
<span class="n">This</span> <span class="n">query</span> <span class="n">results</span> <span class="k">in</span> <span class="n">a</span> <span class="k">true</span> <span class="k">statement</span> <span class="o">&</span> <span class="n">the</span> <span class="k">user</span> <span class="n">gets</span> <span class="n">logged</span> <span class="k">in</span><span class="p">.</span> <span class="n">This</span> <span class="n">example</span> <span class="n">depicts</span> <span class="n">the</span> <span class="n">bost</span> <span class="n">basic</span> <span class="k">type</span> <span class="k">of</span> <span class="k">SQL</span> <span class="n">injection</span>
|
||||
</code></pre></div>
|
||||
The WHERE clause of this query always evaluates to true, so the user gets logged in. This example depicts the most basic type of SQL injection.
|
||||
</code></pre>
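<p>The effect can be reproduced with Python's built-in <code>sqlite3</code> module. The sketch below uses an in-memory database with a made-up user (and a plain-text password, purely for illustration), and contrasts string-built SQL with a parameterized query, where the same input is treated as data rather than as SQL.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")  # made-up row

payload = "' OR '1'='1"  # attacker input for both fields

# Vulnerable: user input is pasted straight into the SQL text.
unsafe_sql = (
    "SELECT username FROM users "
    f"WHERE username='{payload}' AND password='{payload}'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: placeholders keep the input out of the SQL grammar.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username=? AND password=?",
    (payload, payload),
).fetchall()

print(unsafe_rows)  # the injected condition matches every row
print(safe_rows)    # no user has that literal name and password
conn.close()
```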
|
||||
<h3 id="sql-injection-attack-defenses">SQL Injection Attack Defenses</h3>
|
||||
<ul>
|
||||
<li>SQL injection can be prevented by filtering the query to eliminate malicious syntax, which involves employing tools in order to (a) scan the source code.</li>
|
||||
|
||||
BIN
sitemap.xml.gz
Binary file not shown.