diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 120000 index 0000000..439ad26 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1 @@ +courses/CONTRIBUTING.md \ No newline at end of file diff --git a/NOTICE b/NOTICE new file mode 100644 index 0000000..d00a531 --- /dev/null +++ b/NOTICE @@ -0,0 +1,7 @@ +Copyright 2020 LinkedIn Corporation +All Rights Reserved. +Licensed under the BSD 2-Clause License (the "License"). +See LICENSE in the project root for license information. + +This product includes: +* N/A diff --git a/README.md b/README.md new file mode 120000 index 0000000..2406e9f --- /dev/null +++ b/README.md @@ -0,0 +1 @@ +courses/index.md \ No newline at end of file diff --git a/courses/CONTRIBUTING.md b/courses/CONTRIBUTING.md new file mode 100644 index 0000000..52323d2 --- /dev/null +++ b/courses/CONTRIBUTING.md @@ -0,0 +1,5 @@ +We realise that the initial content we created is just a starting point and our hope is that the community can help in the journey refining and extending the contents. + +As a contributor, you represent that the content you submit is not plagiarised. By submitting the content, you (and, if applicable, your employer) are licensing the submitted content to LinkedIn and the open source community subject to the BSD 2-Clause license. + +We suggest to open an issue first and seek advice for your changes before submitting a pull request. diff --git a/courses/big_data/intro.md b/courses/big_data/intro.md index 5443037..143cc8d 100644 --- a/courses/big_data/intro.md +++ b/courses/big_data/intro.md @@ -1,35 +1,32 @@ -# School of SRE: Big Data +# Big Data -## Pre - Reads +## Prerequisites - Basics of Linux File systems. - Basic understanding of System Design. -## Target Audience - -The concept of Big Data has been around for years; most organizations now understand that if they capture all the data that streams into their businesses, they can apply analytics and get significant value from it. 
-This training material covers the basics of Big Data(using Hadoop) for beginners, who would like to quickly get started and get their hands dirty in this domain. - -## What to expect from this training +## What to expect from this course This course covers the basics of Big Data and how it has evolved to become what it is today. We will take a look at a few realistic scenarios where Big Data would be a perfect fit. An interesting assignment on designing a Big Data system is followed by understanding the architecture of Hadoop and the tooling around it. -## What is not covered under this training +## What is not covered under this course Writing programs to draw analytics from data. -## TOC: +## Course Content -1. Overview of Big Data -2. Usage of Big Data techniques -3. Evolution of Hadoop -4. Architecture of hadoop +### Table of Contents + +1. [Overview of Big Data](https://linkedin.github.io/school-of-sre/big_data/overview/) +2. [Usage of Big Data techniques](https://linkedin.github.io/school-of-sre/big_data/overview/) +3. [Evolution of Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/) +4. [Architecture of hadoop](https://linkedin.github.io/school-of-sre/big_data/architecture/) 1. HDFS 2. Yarn -5. MapReduce framework -6. Other tooling around hadoop +5. [MapReduce framework](https://linkedin.github.io/school-of-sre/big_data/architecture/#mapreduce-framework) +6. [Other tooling around hadoop](https://linkedin.github.io/school-of-sre/big_data/architecture/#other-tooling-around-hadoop) 1. Hive 2. Pig 3. Spark 4. Presto -7. Data Serialisation and storage \ No newline at end of file +7. 
[Data Serialisation and storage](https://linkedin.github.io/school-of-sre/big_data/architecture/#data-serialisation-and-storage) diff --git a/courses/git/git-basics.md b/courses/git/git-basics.md index fa257ed..be8df86 100644 --- a/courses/git/git-basics.md +++ b/courses/git/git-basics.md @@ -1,6 +1,6 @@ -# School Of SRE: Git +# Git -## Prerequisite +## Prerequisites 1. Have Git installed [https://git-scm.com/downloads](https://git-scm.com/downloads) 2. Have taken any git high level tutorial or following LinkedIn learning courses @@ -8,22 +8,22 @@ - [https://www.linkedin.com/learning/git-branches-merges-and-remotes/](https://www.linkedin.com/learning/git-branches-merges-and-remotes/) - [The Official Git Docs](https://git-scm.com/doc) -## What to expect from this training +## What to expect from this course As an engineer in the field of computer science, having knowledge of version control tools becomes almost a requirement. While there are a lot of version control tools that exist today like SVN, Mercurial, etc, Git perhaps is the most used one and this course we will be working with Git. While this course does not start with Git 101 and expects basic knowledge of git as a prerequisite, it will reintroduce the git concepts known by you with details covering what is happening under the hood as you execute various git commands. So that next time you run a git command, you will be able to press enter more confidently! -## What is not covered under this training +## What is not covered under this course Advanced usage and specifics of internal implementation details of Git. -## Training Content +## Course Content ### Table of Contents - 1. Git Basics - 2. Working with Branches - 3. Git with Github - 4. Hooks + 1. [Git Basics](https://linkedin.github.io/school-of-sre/git/git-basics/#git-basics) + 2. [Working with Branches](https://linkedin.github.io/school-of-sre/git/branches/) + 3. 
[Git with Github](https://linkedin.github.io/school-of-sre/git/github-hooks/#git-with-github) + 4. [Hooks](https://linkedin.github.io/school-of-sre/git/github-hooks/#hooks) ## Git Basics diff --git a/courses/git/github-hooks.md b/courses/git/github-hooks.md index a16a9e2..db78e76 100644 --- a/courses/git/github-hooks.md +++ b/courses/git/github-hooks.md @@ -1,4 +1,4 @@ -## Git with Github +# Git with Github Till now all the operations we did were in our local repo while git also helps us in a collaborative environment. GitHub is one place on the internet where you can centrally host your git repos and collaborate with other developers. diff --git a/courses/img/favicon.ico b/courses/img/favicon.ico new file mode 100644 index 0000000..0090cbb Binary files /dev/null and b/courses/img/favicon.ico differ diff --git a/courses/img/sos.png b/courses/img/sos.png new file mode 100644 index 0000000..584c1b4 Binary files /dev/null and b/courses/img/sos.png differ diff --git a/courses/index.md b/courses/index.md index 5ac001b..28d24aa 100644 --- a/courses/index.md +++ b/courses/index.md @@ -1 +1,25 @@ -Hello, World!!! +# School of SRE +![School of SRE](img/sos.png) +Early 2019, we started visiting campuses to recruit the brightest minds to ensure LinkedIn and all the services that it is composed of is always available for everyone. This function at Linkedin falls in the purview of the Site Reliability Engineering team and Site Reliability Engineers ( SRE ) who are Software Engineers who specialize in reliability. SREs apply the principles of computer science and engineering to the design and development of computer systems: generally, large distributed ones. + +As we continued on this journey we started getting a lot of questions from these campuses on what exactly site engineering roll entails? and, how could someone learn the skills and the disciplines involved to become a successful site engineer? 
Fast forward a few months, and a few of these campus students had joined LinkedIn either as interns or as full-time engineers to become a part of the Site Engineering team; we also had a few lateral hires who joined our organization who were not from a traditional SRE background. That's when a few of us got together and started to think about how we can onboard new graduate engineers to the Site Engineering team. + +There is a vast amount of resources scattered throughout the web on what the roles and responsibilities of SREs are, how to monitor site health, handle incidents, maintain SLO/SLI etc. But there are very few resources out there guiding someone on what basic skill sets one has to acquire as a beginner. Because of the lack of these resources we felt that individuals are having a tough time getting into open positions in the industry. We created School Of SRE as a starting point for anyone wanting to build their career in the role of SRE. + +In this course we are focusing on building strong foundational skills. The course is structured in a way to provide more real life examples and how learning each of the topics can play a bigger role in your day to day SRE life. 
Currently we are covering the following topics under the School Of SRE: + +- Fundamentals Series + - [Linux Basics](https://linkedin.github.io/school-of-sre/linux_basics/intro/) + - [Git](https://linkedin.github.io/school-of-sre/git/git-basics/) + - [Linux Networking](https://linkedin.github.io/school-of-sre/linux_networking/intro/) +- [Python and Web](https://linkedin.github.io/school-of-sre/python_web/intro/) +- Data + - Relational databases (MySQL) + - NoSQL concepts + - [Big Data](https://linkedin.github.io/school-of-sre/big_data/intro/) +- [Systems Design](https://linkedin.github.io/school-of-sre/systems_design/intro/) +- [Security](https://linkedin.github.io/school-of-sre/security/intro/) + +We believe continuous learning will help in acquiring deeper knowledge and competencies in order to expand your skill sets, every module has added reference which could be a guide for further learning. Our hope is that by going through these modules we should be able build the essential skills required for a Site Reliability Engineer. + +At Linkedin we are using this curriculum for onboarding our non-traditional hires and new college grads to the SRE role. We had multiple rounds of successful onboarding experience with the new members and helped them to be productive in a very short period of time. This motivated us to opensource these contents for helping other organisations onboarding new engineers to the role and individuals to get into the role. We realise that the initial content we created is just a starting point and our hope is that the community can help in the journey refining and extending the contents. diff --git a/courses/linux_basics/command_line_basics.md b/courses/linux_basics/command_line_basics.md new file mode 100644 index 0000000..54845ff --- /dev/null +++ b/courses/linux_basics/command_line_basics.md @@ -0,0 +1,459 @@ +# Command Line Basics + +## What is a command ? + + +A command is a program that tells the operating system to perform +specific work. 
Programs are stored as files in linux. Therefore, a +command is also a file which is stored somewhere on the disk. + +Commands may also take additional arguments as input from the user. +These arguments are called command line arguments. Knowing how to use +the commands is important and there are many ways to get help in Linux, +especially for commands. Almost every command will have some form of +documentation, most commands will have a command-line argument -h or +\--help that will display a reasonable amount of documentation. But the +most popular documentation system in Linux is called man pages - short +for manual pages. + +Using \--help to show the documentation for ls command. + +![](images/linux/commands/image19.png) + +## File System Organization + +The linux file system has a hierarchical (or tree-like) structure with +its highest level directory called root ( denoted by / ). Directories +present inside the root directory stores file related to the system. +These directories in turn can either store system files or application +files or user related files. + +![](images/linux/commands/image17.png) + + bin | The executable program of most commonly used commands reside in bin directory + sbin | This directory contains programs used for system administration. + home | This directory contains user related files and directories. 
+ lib | This directory contains all the library files + etc | This directory contains all the system configuration files + proc | This directory contains files related to the running processes on the system + dev | This directory contains files related to devices on the system + mnt | This directory contains files related to mounted devices on the system + tmp | This directory is used to store temporary files on the system + usr | This directory is used to store application programs on the system + +## Commands for Navigating the File System + +There are three basic commands which are used frequently to navigate the +file system: + +- ls + +- pwd + +- cd + +We will now try to understand what each command does and how to use +these commands. You should also practice the given examples on the +online bash shell. + +### pwd (print working directory) + +At any given moment of time, we will be standing in a certain directory. +To get the name of the directory in which we are standing, we can use +the pwd command in linux. + +![](images/linux/commands/image2.png) + +We will now use the cd command to move to a different directory and then +print the working directory. + +![](images/linux/commands/image20.png) + +### cd (change directory) + +The cd command can be used to change the working directory. Using the +command, you can move from one directory to another. + +In the below example, we are initially in the root directory. We have +then used the cd command to change the directory. + +![](images/linux/commands/image3.png) + +### ls (list files and directories) + +The ls command is used to list the contents of a directory. It will list +down all the files and folders present in the given directory. + +If we just type ls in the shell, it will list all the files and +directories present in the current directory. + +![](images/linux/commands/image7.png) + +We can also provide the directory name as an argument to the ls command. 
It +will then list all the files and directories inside the given directory. + +![](images/linux/commands/image4.png) + +## Commands for Manipulating Files + +There are five basic commands which are used frequently to manipulate +files: + +- touch + +- mkdir + +- cp + +- mv + +- rm + +We will now try to understand what each command does and how to use +these commands. You should also practice the given examples on the +online bash shell. + +### touch (create new file) + +The touch command can be used to create an empty new file. +This command is very useful for many other purposes but we will discuss +the simplest use case of creating a new file. + +General syntax of using touch command: + +``` +touch +``` + +![](images/linux/commands/image9.png) + +### mkdir (create new directories) + +The mkdir command is used to create directories. You can use the ls command +to verify that the new directory is created. + +General syntax of using mkdir command: + +``` +mkdir +``` + +![](images/linux/commands/image11.png) + +### rm (delete files and directories) + +The rm command can be used to delete files and directories. It is very +important to note that this command permanently deletes the files and +directories. It's almost impossible to recover these files and +directories once you have executed the rm command on them successfully. Do +run this command with care. + +General syntax of using rm command: + +``` +rm +``` + +Let's try to understand the rm command with an example. We will try to +delete the file and directory we created using the touch and mkdir commands +respectively. + +![](images/linux/commands/image18.png) + +### cp (copy files and directories) + +The cp command is used to copy files and directories from one location +to another. Do note that the cp command doesn't make any change to the +original files or directories. The original files or directories and +their copies both co-exist after running the cp command successfully. 
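A minimal sketch of the copy behaviour described above. The names `sample.txt` and `test_directory` are hypothetical, and the example assumes `mktemp` is available so that everything happens in a scratch directory:

```shell
# Work in a throwaway directory so the example cannot touch real files.
workdir=$(mktemp -d)
cd "$workdir"

# Create a sample file and a destination directory (hypothetical names).
echo "hello" > sample.txt
mkdir test_directory

# Copy the file; cp leaves the original where it was.
cp sample.txt test_directory/

# Both the original and the copy can now be listed.
ls sample.txt test_directory/sample.txt
```

After the copy, both paths exist and hold the same content, which is the main difference from a move.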
+ +General syntax of using cp command: + +``` +cp +``` + +We are currently in the '/home/runner' directory. We will use the mkdir +command to create a new directory named "test_directory". We will now +try to copy the "\_test_runner.py" file to the directory we created just +now. + +![](images/linux/commands/image23.png) + +Do note that nothing happened to the original "\_test_runner.py" file. +It's still there in the current directory. A new copy of it got created +inside the "test_directory". + +![](images/linux/commands/image14.png) + +We can also use the cp command to copy the whole directory from one +location to another. Let's try to understand this with an example. + +![](images/linux/commands/image12.png) + +We again used the mkdir command to create a new directory called +"another_directory". We then used the cp command along with an +additional argument '-r' to copy the "test_directory". + +### mv (move files and directories) + +The mv command can either be used to move files or directories from one +location to another or it can be used to rename files or directories. Do +note that moving files and copying them are very different. When you +move the files or directories, the original copy is lost. + +General syntax of using mv command: + +``` +mv +``` + +In this example, we will use the mv command to move the +"\_test_runner.py" file to "test_directory". In this case, this file +already exists in "test_directory". The mv command will just replace it. +**Do note that the original file doesn't exist in the current directory +after mv command ran successfully.** + +![](images/linux/commands/image26.png) + +We can also use the mv command to move a directory from one location to +another. In this case, we do not need to use the '-r' flag that we did +while using the cp command. Do note that the original directory will not +exist if we use mv command. + +One of the important uses of the mv command is to rename files and +directories. 
Let's see how we can use this command for renaming. + +We have first changed our location to "test_directory". We then use the +mv command to rename the "\_test_runner.py" file to "test.py". + +![](images/linux/commands/image29.png) + +## Commands for Viewing Files + +There are three basic commands which are used frequently to view the +files: + +- cat + +- head + +- tail + +We will now try to understand what each command does and how to use +these commands. You should also practice the given examples on the +online bash shell. + +We will create a new file called "numbers.txt" and insert numbers from 1 +to 100 in this file. Each number will be in a separate line. + +![](images/linux/commands/image21.png) + +Do not worry about the above command now. It's an advanced command which +is used to generate numbers. We have then used a redirection operator to +push these numbers to the file. We will be discussing I/O redirection in the +later sections. + + +### cat + +The simplest use of the cat command is to print the contents of the file on +your output screen. This command is very useful and can be used for many +other purposes. We will study other use cases later. + +![](images/linux/commands/image1.png) + +You can try to run the above command and you will see numbers being +printed from 1 to 100 on your screen. You will need to scroll up to view +all the numbers. + +### head + +The head command displays the first 10 lines of the file by default. We +can include additional arguments to display as many lines as we want +from the top. + +In this example, we are only able to see the first 10 lines from the +file when we use the head command. + +![](images/linux/commands/image15.png) + +By default, the head command will only display the first 10 lines. If we +want to specify the number of lines we want to see from the start, use the +'-n' argument to provide the input. 
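As a rough illustration of the '-n' argument, the sketch below builds a scratch file of the numbers 1 to 100 and prints only its first five lines. The use of `seq` to generate the numbers and the variable names are assumptions made for this example, not necessarily what the screenshots use:

```shell
# Generate the numbers 1 to 100, one per line, into a scratch file.
numbers_file=$(mktemp)
seq 1 100 > "$numbers_file"

# Ask head for the first 5 lines instead of the default 10.
first_five=$(head -n 5 "$numbers_file")
echo "$first_five"
```

The same '-n' argument works with tail to pick lines from the end of the file.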
+ +![](images/linux/commands/image16.png) + +### tail + +The tail command displays the last 10 lines of the file by default. We +can include additional arguments to display as many lines as we want +from the end of the file. + +![](images/linux/commands/image22.png) + +By default, the tail command will only display the last 10 lines. If we +want to specify the number of lines we want to see from the end, use '-n' +argument to provide the input. + +![](images/linux/commands/image10.png) + +In this example, we are only able to see the last 5 lines from the file +when we use the tail command with explicit -n option. + + +## Echo Command in Linux + +The echo command is one of the simplest commands that is used in the +shell. This command is equivalent to what we have in other +programming languages. + +The echo command prints the given input string on the screen. + +![](images/linux/commands/image24.png) + +## Text Processing Commands + +In the previous section, we learned how to view the content of a file. +In many cases, we will be interested in performing the below operations: + +- Print only the lines which contain a particular word(s) + +- Replace a particular word with another word in a file + +- Sort the lines in a particular order + +There are three basic commands which are used frequently to process +texts: + +- grep + +- sed + +- sort + +We will now try to understand what each command does and how to use +these commands. You should also practice the given examples on the +online bash shell. + +We will create a new file called "numbers.txt" and insert numbers from 1 +to 10 in this file. Each number will be in a separate line. + +![](images/linux/commands/image8.png) + +### grep + +The grep command in its simplest form can be used to search particular +words in a text file. It will display all the lines in a file that +contains a particular input. The word we want to search is provided as +an input to the grep command. 
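A small sketch of the search described above, assuming a scratch file that holds the numbers 1 to 10 (generated here with `seq`, which is an assumption of this example). Note that grep matches the string anywhere in a line, so "10" is printed along with "1":

```shell
# Build a scratch file with the numbers 1 to 10, one per line.
numbers_file=$(mktemp)
seq 1 10 > "$numbers_file"

# Print every line that contains the string "1".
matches=$(grep "1" "$numbers_file")
echo "$matches"
```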
+ +General syntax of using grep command: + +``` +grep +``` + +In this example, we are trying to search for a string "1" in this file. +The grep command outputs the lines where it found this string. + +![](images/linux/commands/image5.png) + +### sed + +The sed command in its simplest form can be used to replace text in a +file. + +General syntax of using the sed command for replacement: + +``` +sed 's///' +``` + +Let's try to replace each occurrence of "1" in the file with "3" using +the sed command. + +![](images/linux/commands/image31.png) + +The content of the file will not change in the above +example. To change the file itself, we have to use an extra argument '-i' so that the +changes are reflected back in the file. + +### sort + +The sort command can be used to sort the input provided to it as an +argument. By default, it will sort in increasing order. + +Let's first see the content of the file before trying to sort it. + +![](images/linux/commands/image27.png) + +Now, we will try to sort the file using the sort command. The sort +command sorts the content in lexicographical order. + +![](images/linux/commands/image32.png) + +The content of the file will not change in the above +example. + +## I/O Redirection + +Each open file gets assigned a file descriptor. A file descriptor is a +unique identifier for an open file within a process. There are always three +default files open, stdin (the keyboard), stdout (the screen), and +stderr (error messages output to the screen). These files can be +redirected. + +Everything is a file in linux - +[https://unix.stackexchange.com/questions/225537/everything-is-a-file](https://unix.stackexchange.com/questions/225537/everything-is-a-file) + +Till now, we have displayed all the output on the screen which is the +standard output. We can use some special operators to redirect the +output of the command to files or even to the input of other commands. +I/O redirection is a very powerful feature. 
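The two kinds of redirection mentioned above can be sketched as follows. The file names are hypothetical and the example runs in a scratch directory created with `mktemp`:

```shell
# Work in a throwaway directory so the example cannot touch real files.
workdir=$(mktemp -d)
cd "$workdir"

# '>' sends the standard output of echo to a file instead of the screen.
echo "hello world" > output.txt

# '|' feeds the standard output of one command into the standard input
# of the next; here sort orders the lines and uniq drops the repeats.
printf '3\n1\n3\n2\n' | sort | uniq > unique.txt

# Show what ended up in both files.
cat output.txt unique.txt
```

Note that '>' overwrites the target file if it already exists.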
+ +In the below example, we have used the '>' operator to redirect the +output of ls command to output.txt file. + +![](images/linux/commands/image30.png) + +In the below example, we have redirected the output from echo command to +a file. + +![](images/linux/commands/image13.png) + +We can also redirect the output of a command as an input to another +command. This is possible with the help of pipes. + +In the below example, we have passed the output of cat command as an +input to grep command using pipe(\|) operator. + +![](images/linux/commands/image6.png) + +In the below example, we have passed the output of sort command as an +input to uniq command using pipe(\|) operator. The uniq command only +prints the unique numbers from the input. + +![](images/linux/commands/image28.png) + +I/O redirection - +[https://tldp.org/LDP/abs/html/io-redirection.html](https://tldp.org/LDP/abs/html/io-redirection.html) + +## Applications in SRE Role + +- As an SRE, you will be required to perform some general tasks on these linux servers. You will also be using the command line when you are troubleshooting issues. + +- Moving from one location to another in the filesystem will require the help of ls, pwd and cd commands. + +- You may need to search for some specific information in the log files. The grep command would be very useful here. I/O redirection will come in handy if you want to store the output in a file or pass it as an input to another command. + +- The tail command is very useful to view the latest data in the log file. + +## Useful courses and tutorials + + +- [edX Linux course](https://courses.edx.org/courses/course-v1:LinuxFoundationX+LFS101x+1T2020/course/) - + This video course can be very helpful in developing the basics of linux command line. This course is provided + in both free and paid modes by edX. If you take the free course, you will not be able to access the assignments. 
+ +- [https://linuxcommand.org/lc3_learning_the_shell.php](https://linuxcommand.org/lc3_learning_the_shell.php) diff --git a/courses/linux_basics/images/linux/admin/image1.png b/courses/linux_basics/images/linux/admin/image1.png new file mode 100644 index 0000000..a0847e0 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image1.png differ diff --git a/courses/linux_basics/images/linux/admin/image10.png b/courses/linux_basics/images/linux/admin/image10.png new file mode 100644 index 0000000..2022fab Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image10.png differ diff --git a/courses/linux_basics/images/linux/admin/image11.png b/courses/linux_basics/images/linux/admin/image11.png new file mode 100644 index 0000000..c45dffb Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image11.png differ diff --git a/courses/linux_basics/images/linux/admin/image12.png b/courses/linux_basics/images/linux/admin/image12.png new file mode 100644 index 0000000..328b0a6 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image12.png differ diff --git a/courses/linux_basics/images/linux/admin/image13.png b/courses/linux_basics/images/linux/admin/image13.png new file mode 100644 index 0000000..1d701a7 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image13.png differ diff --git a/courses/linux_basics/images/linux/admin/image14.png b/courses/linux_basics/images/linux/admin/image14.png new file mode 100644 index 0000000..42d25e5 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image14.png differ diff --git a/courses/linux_basics/images/linux/admin/image15.png b/courses/linux_basics/images/linux/admin/image15.png new file mode 100644 index 0000000..c0d9979 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image15.png differ diff --git a/courses/linux_basics/images/linux/admin/image16.png b/courses/linux_basics/images/linux/admin/image16.png new file mode 
100644 index 0000000..e5fc1a0 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image16.png differ diff --git a/courses/linux_basics/images/linux/admin/image17.png b/courses/linux_basics/images/linux/admin/image17.png new file mode 100644 index 0000000..9416248 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image17.png differ diff --git a/courses/linux_basics/images/linux/admin/image18.png b/courses/linux_basics/images/linux/admin/image18.png new file mode 100644 index 0000000..8e21fa8 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image18.png differ diff --git a/courses/linux_basics/images/linux/admin/image19.png b/courses/linux_basics/images/linux/admin/image19.png new file mode 100644 index 0000000..fc61c5f Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image19.png differ diff --git a/courses/linux_basics/images/linux/admin/image2.png b/courses/linux_basics/images/linux/admin/image2.png new file mode 100644 index 0000000..0b84a99 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image2.png differ diff --git a/courses/linux_basics/images/linux/admin/image20.png b/courses/linux_basics/images/linux/admin/image20.png new file mode 100644 index 0000000..1e49ce0 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image20.png differ diff --git a/courses/linux_basics/images/linux/admin/image21.png b/courses/linux_basics/images/linux/admin/image21.png new file mode 100644 index 0000000..b8034fa Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image21.png differ diff --git a/courses/linux_basics/images/linux/admin/image22.png b/courses/linux_basics/images/linux/admin/image22.png new file mode 100644 index 0000000..8a339a5 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image22.png differ diff --git a/courses/linux_basics/images/linux/admin/image23.png b/courses/linux_basics/images/linux/admin/image23.png new file 
mode 100644 index 0000000..2d43eac Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image23.png differ diff --git a/courses/linux_basics/images/linux/admin/image24.png b/courses/linux_basics/images/linux/admin/image24.png new file mode 100644 index 0000000..8aed08a Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image24.png differ diff --git a/courses/linux_basics/images/linux/admin/image25.png b/courses/linux_basics/images/linux/admin/image25.png new file mode 100644 index 0000000..0606aab Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image25.png differ diff --git a/courses/linux_basics/images/linux/admin/image26.png b/courses/linux_basics/images/linux/admin/image26.png new file mode 100644 index 0000000..a35f697 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image26.png differ diff --git a/courses/linux_basics/images/linux/admin/image27.png b/courses/linux_basics/images/linux/admin/image27.png new file mode 100644 index 0000000..394b79f Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image27.png differ diff --git a/courses/linux_basics/images/linux/admin/image28.png b/courses/linux_basics/images/linux/admin/image28.png new file mode 100644 index 0000000..b4bd8ba Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image28.png differ diff --git a/courses/linux_basics/images/linux/admin/image29.png b/courses/linux_basics/images/linux/admin/image29.png new file mode 100644 index 0000000..6ccaf73 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image29.png differ diff --git a/courses/linux_basics/images/linux/admin/image3.png b/courses/linux_basics/images/linux/admin/image3.png new file mode 100644 index 0000000..6d9ebc4 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image3.png differ diff --git a/courses/linux_basics/images/linux/admin/image30.png b/courses/linux_basics/images/linux/admin/image30.png new 
file mode 100644 index 0000000..d9dccae Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image30.png differ diff --git a/courses/linux_basics/images/linux/admin/image31.jpg b/courses/linux_basics/images/linux/admin/image31.jpg new file mode 100644 index 0000000..d781a27 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image31.jpg differ diff --git a/courses/linux_basics/images/linux/admin/image32.png b/courses/linux_basics/images/linux/admin/image32.png new file mode 100644 index 0000000..e11722a Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image32.png differ diff --git a/courses/linux_basics/images/linux/admin/image33.png b/courses/linux_basics/images/linux/admin/image33.png new file mode 100644 index 0000000..17a3bad Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image33.png differ diff --git a/courses/linux_basics/images/linux/admin/image34.png b/courses/linux_basics/images/linux/admin/image34.png new file mode 100644 index 0000000..fba8d86 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image34.png differ diff --git a/courses/linux_basics/images/linux/admin/image35.png b/courses/linux_basics/images/linux/admin/image35.png new file mode 100644 index 0000000..ae39c08 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image35.png differ diff --git a/courses/linux_basics/images/linux/admin/image36.png b/courses/linux_basics/images/linux/admin/image36.png new file mode 100644 index 0000000..f38a037 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image36.png differ diff --git a/courses/linux_basics/images/linux/admin/image37.png b/courses/linux_basics/images/linux/admin/image37.png new file mode 100644 index 0000000..021d4a9 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image37.png differ diff --git a/courses/linux_basics/images/linux/admin/image38.png 
b/courses/linux_basics/images/linux/admin/image38.png new file mode 100644 index 0000000..e90f505 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image38.png differ diff --git a/courses/linux_basics/images/linux/admin/image39.png b/courses/linux_basics/images/linux/admin/image39.png new file mode 100644 index 0000000..286308d Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image39.png differ diff --git a/courses/linux_basics/images/linux/admin/image4.png b/courses/linux_basics/images/linux/admin/image4.png new file mode 100644 index 0000000..6c34447 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image4.png differ diff --git a/courses/linux_basics/images/linux/admin/image40.png b/courses/linux_basics/images/linux/admin/image40.png new file mode 100644 index 0000000..6f699ae Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image40.png differ diff --git a/courses/linux_basics/images/linux/admin/image41.png b/courses/linux_basics/images/linux/admin/image41.png new file mode 100644 index 0000000..e445a74 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image41.png differ diff --git a/courses/linux_basics/images/linux/admin/image42.png b/courses/linux_basics/images/linux/admin/image42.png new file mode 100644 index 0000000..2fb65b1 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image42.png differ diff --git a/courses/linux_basics/images/linux/admin/image43.png b/courses/linux_basics/images/linux/admin/image43.png new file mode 100644 index 0000000..b2effc4 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image43.png differ diff --git a/courses/linux_basics/images/linux/admin/image44.png b/courses/linux_basics/images/linux/admin/image44.png new file mode 100644 index 0000000..00661ff Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image44.png differ diff --git 
a/courses/linux_basics/images/linux/admin/image45.png b/courses/linux_basics/images/linux/admin/image45.png new file mode 100644 index 0000000..8c85219 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image45.png differ diff --git a/courses/linux_basics/images/linux/admin/image46.png b/courses/linux_basics/images/linux/admin/image46.png new file mode 100644 index 0000000..ff22c92 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image46.png differ diff --git a/courses/linux_basics/images/linux/admin/image47.png b/courses/linux_basics/images/linux/admin/image47.png new file mode 100644 index 0000000..2258509 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image47.png differ diff --git a/courses/linux_basics/images/linux/admin/image48.png b/courses/linux_basics/images/linux/admin/image48.png new file mode 100644 index 0000000..00ec3be Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image48.png differ diff --git a/courses/linux_basics/images/linux/admin/image49.png b/courses/linux_basics/images/linux/admin/image49.png new file mode 100644 index 0000000..1197ea4 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image49.png differ diff --git a/courses/linux_basics/images/linux/admin/image5.png b/courses/linux_basics/images/linux/admin/image5.png new file mode 100644 index 0000000..749506c Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image5.png differ diff --git a/courses/linux_basics/images/linux/admin/image50.png b/courses/linux_basics/images/linux/admin/image50.png new file mode 100644 index 0000000..25da00e Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image50.png differ diff --git a/courses/linux_basics/images/linux/admin/image51.png b/courses/linux_basics/images/linux/admin/image51.png new file mode 100644 index 0000000..90eb629 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image51.png differ diff 
--git a/courses/linux_basics/images/linux/admin/image52.png b/courses/linux_basics/images/linux/admin/image52.png new file mode 100644 index 0000000..62f8843 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image52.png differ diff --git a/courses/linux_basics/images/linux/admin/image53.png b/courses/linux_basics/images/linux/admin/image53.png new file mode 100644 index 0000000..2866610 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image53.png differ diff --git a/courses/linux_basics/images/linux/admin/image54.png b/courses/linux_basics/images/linux/admin/image54.png new file mode 100644 index 0000000..599a905 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image54.png differ diff --git a/courses/linux_basics/images/linux/admin/image55.png b/courses/linux_basics/images/linux/admin/image55.png new file mode 100644 index 0000000..64a41df Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image55.png differ diff --git a/courses/linux_basics/images/linux/admin/image56.png b/courses/linux_basics/images/linux/admin/image56.png new file mode 100644 index 0000000..8717179 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image56.png differ diff --git a/courses/linux_basics/images/linux/admin/image57.png b/courses/linux_basics/images/linux/admin/image57.png new file mode 100644 index 0000000..382f95d Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image57.png differ diff --git a/courses/linux_basics/images/linux/admin/image58.png b/courses/linux_basics/images/linux/admin/image58.png new file mode 100644 index 0000000..2d8a003 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image58.png differ diff --git a/courses/linux_basics/images/linux/admin/image6.png b/courses/linux_basics/images/linux/admin/image6.png new file mode 100644 index 0000000..f1f270d Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image6.png differ 
diff --git a/courses/linux_basics/images/linux/admin/image7.png b/courses/linux_basics/images/linux/admin/image7.png new file mode 100644 index 0000000..4153cbe Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image7.png differ diff --git a/courses/linux_basics/images/linux/admin/image8.png b/courses/linux_basics/images/linux/admin/image8.png new file mode 100644 index 0000000..003f462 Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image8.png differ diff --git a/courses/linux_basics/images/linux/admin/image9.png b/courses/linux_basics/images/linux/admin/image9.png new file mode 100644 index 0000000..8ab7b9e Binary files /dev/null and b/courses/linux_basics/images/linux/admin/image9.png differ diff --git a/courses/linux_basics/images/linux/commands/image1.png b/courses/linux_basics/images/linux/commands/image1.png new file mode 100644 index 0000000..c6eb3d1 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image1.png differ diff --git a/courses/linux_basics/images/linux/commands/image10.png b/courses/linux_basics/images/linux/commands/image10.png new file mode 100644 index 0000000..36dec6b Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image10.png differ diff --git a/courses/linux_basics/images/linux/commands/image11.png b/courses/linux_basics/images/linux/commands/image11.png new file mode 100644 index 0000000..1f80575 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image11.png differ diff --git a/courses/linux_basics/images/linux/commands/image12.png b/courses/linux_basics/images/linux/commands/image12.png new file mode 100644 index 0000000..7ca5dd7 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image12.png differ diff --git a/courses/linux_basics/images/linux/commands/image13.png b/courses/linux_basics/images/linux/commands/image13.png new file mode 100644 index 0000000..fe2ed7f Binary files /dev/null and 
b/courses/linux_basics/images/linux/commands/image13.png differ diff --git a/courses/linux_basics/images/linux/commands/image14.png b/courses/linux_basics/images/linux/commands/image14.png new file mode 100644 index 0000000..474377a Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image14.png differ diff --git a/courses/linux_basics/images/linux/commands/image15.png b/courses/linux_basics/images/linux/commands/image15.png new file mode 100644 index 0000000..31b39c8 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image15.png differ diff --git a/courses/linux_basics/images/linux/commands/image16.png b/courses/linux_basics/images/linux/commands/image16.png new file mode 100644 index 0000000..a85334f Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image16.png differ diff --git a/courses/linux_basics/images/linux/commands/image17.png b/courses/linux_basics/images/linux/commands/image17.png new file mode 100644 index 0000000..04ce543 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image17.png differ diff --git a/courses/linux_basics/images/linux/commands/image18.png b/courses/linux_basics/images/linux/commands/image18.png new file mode 100644 index 0000000..592c5dd Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image18.png differ diff --git a/courses/linux_basics/images/linux/commands/image19.png b/courses/linux_basics/images/linux/commands/image19.png new file mode 100644 index 0000000..63eaf66 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image19.png differ diff --git a/courses/linux_basics/images/linux/commands/image2.png b/courses/linux_basics/images/linux/commands/image2.png new file mode 100644 index 0000000..49c10e8 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image2.png differ diff --git a/courses/linux_basics/images/linux/commands/image20.png 
b/courses/linux_basics/images/linux/commands/image20.png new file mode 100644 index 0000000..1f389b6 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image20.png differ diff --git a/courses/linux_basics/images/linux/commands/image21.png b/courses/linux_basics/images/linux/commands/image21.png new file mode 100644 index 0000000..c605d6f Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image21.png differ diff --git a/courses/linux_basics/images/linux/commands/image22.png b/courses/linux_basics/images/linux/commands/image22.png new file mode 100644 index 0000000..8a53a36 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image22.png differ diff --git a/courses/linux_basics/images/linux/commands/image23.png b/courses/linux_basics/images/linux/commands/image23.png new file mode 100644 index 0000000..b5e7c2d Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image23.png differ diff --git a/courses/linux_basics/images/linux/commands/image24.png b/courses/linux_basics/images/linux/commands/image24.png new file mode 100644 index 0000000..f80d6c7 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image24.png differ diff --git a/courses/linux_basics/images/linux/commands/image25.png b/courses/linux_basics/images/linux/commands/image25.png new file mode 100644 index 0000000..8dfaad9 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image25.png differ diff --git a/courses/linux_basics/images/linux/commands/image26.png b/courses/linux_basics/images/linux/commands/image26.png new file mode 100644 index 0000000..c171945 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image26.png differ diff --git a/courses/linux_basics/images/linux/commands/image27.png b/courses/linux_basics/images/linux/commands/image27.png new file mode 100644 index 0000000..f0dcc70 Binary files /dev/null and 
b/courses/linux_basics/images/linux/commands/image27.png differ diff --git a/courses/linux_basics/images/linux/commands/image28.png b/courses/linux_basics/images/linux/commands/image28.png new file mode 100644 index 0000000..9467313 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image28.png differ diff --git a/courses/linux_basics/images/linux/commands/image29.png b/courses/linux_basics/images/linux/commands/image29.png new file mode 100644 index 0000000..4d9dd96 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image29.png differ diff --git a/courses/linux_basics/images/linux/commands/image3.png b/courses/linux_basics/images/linux/commands/image3.png new file mode 100644 index 0000000..0693a5a Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image3.png differ diff --git a/courses/linux_basics/images/linux/commands/image30.png b/courses/linux_basics/images/linux/commands/image30.png new file mode 100644 index 0000000..7d9088c Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image30.png differ diff --git a/courses/linux_basics/images/linux/commands/image31.png b/courses/linux_basics/images/linux/commands/image31.png new file mode 100644 index 0000000..1912203 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image31.png differ diff --git a/courses/linux_basics/images/linux/commands/image32.png b/courses/linux_basics/images/linux/commands/image32.png new file mode 100644 index 0000000..fb2bf71 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image32.png differ diff --git a/courses/linux_basics/images/linux/commands/image4.png b/courses/linux_basics/images/linux/commands/image4.png new file mode 100644 index 0000000..c589d5f Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image4.png differ diff --git a/courses/linux_basics/images/linux/commands/image5.png 
b/courses/linux_basics/images/linux/commands/image5.png new file mode 100644 index 0000000..d993a83 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image5.png differ diff --git a/courses/linux_basics/images/linux/commands/image6.png b/courses/linux_basics/images/linux/commands/image6.png new file mode 100644 index 0000000..0996234 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image6.png differ diff --git a/courses/linux_basics/images/linux/commands/image7.png b/courses/linux_basics/images/linux/commands/image7.png new file mode 100644 index 0000000..40625ed Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image7.png differ diff --git a/courses/linux_basics/images/linux/commands/image8.png b/courses/linux_basics/images/linux/commands/image8.png new file mode 100644 index 0000000..a96fad7 Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image8.png differ diff --git a/courses/linux_basics/images/linux/commands/image9.png b/courses/linux_basics/images/linux/commands/image9.png new file mode 100644 index 0000000..1de865d Binary files /dev/null and b/courses/linux_basics/images/linux/commands/image9.png differ diff --git a/courses/linux_basics/intro.md b/courses/linux_basics/intro.md new file mode 100644 index 0000000..94e01d7 --- /dev/null +++ b/courses/linux_basics/intro.md @@ -0,0 +1,169 @@ +# Introduction + +## Prerequisites + +- Experience of working on any operating systems like Windows, Linux or Mac +- Basics of operating system + +## What to expect from this course + +This course is divided into three parts. In the first part, we will cover the +fundamentals of linux operating systems. We will talk about linux architecture, +linux distributions and uses of linux operating systems. We will also talk about +difference between GUI and CLI. + +In the second part, we will study about some of the basic commands that are used +in linux. 
We will focus on commands used for navigating the file system, commands used +for manipulating files, commands used for viewing files, I/O redirection etc. + +In the third part, we will study linux system administration. In this part, we +will focus on day-to-day tasks performed by linux admins like managing users/groups, +managing file permissions, monitoring system performance, log files etc. + +In the second and third parts, we will use examples to understand the concepts. + +## What is not covered under this course + +We are not covering advanced linux commands and bash scripting in this +course. We will also not be covering linux internals. + +## Course Content + +### Table of Contents + +The following topics are covered in this course: + +- Introduction to Linux + - [What are Linux Operating Systems](https://linkedin.github.io/school-of-sre/linux_basics/intro/#what-are-linux-operating-systems) + - [Linux Distributions](https://linkedin.github.io/school-of-sre/linux_basics/intro/#what-are-popular-linux-distributions) + - [Uses of Linux Operating Systems](https://linkedin.github.io/school-of-sre/linux_basics/intro/#uses-of-linux-operating-systems) + - [Linux Architecture](https://linkedin.github.io/school-of-sre/linux_basics/intro/#linux-architecture) + - [GUI vs CLI](https://linkedin.github.io/school-of-sre/linux_basics/intro/#graphical-user-interface-gui-vs-command-line-interface-cli) +- [Command Line Basics](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/) + - [Navigating File System](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/#commands-for-navigating-the-file-system) + - [Manipulating Files](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/#commands-for-manipulating-files) + - [Viewing Files](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/#commands-for-viewing-files) + - [Text Processing 
Commands](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/#text-processing-commands) + - [I/O Redirection](https://linkedin.github.io/school-of-sre/linux_basics/command_line_basics/#io-redirection) +- [Linux system administration](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/) + - [User/Groups management](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#usergroup-management-in-linux) + - [Superuser in Linux](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#becoming-a-superuser-in-linux) + - [File Permissions](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#file-permissions-in-linux) + - [SSH Command](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#ssh-command) + - [Package Management](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#package-management) + - [Process Management](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#process-management) + - [Memory Management](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#memory-management) + - [Daemons and Systemd](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#daemons) + - [Logs](https://linkedin.github.io/school-of-sre/linux_basics/linux_server_administration/#logs) + + +## What are Linux operating systems + +Most of us will be familiar with the Windows operating system, which is +used on more than 75% of personal computers. The Windows operating systems +are based on the Windows NT kernel. A kernel is the most important part of +an operating system which performs important functions like process +management, memory management, filesystem management etc. + +Linux operating systems are based on the Linux kernel. 
A linux-based +operating system consists of the linux kernel, GUI/CLI, system libraries +and system utilities. The Linux kernel was independently developed and +released by Linus Torvalds. The linux kernel is free and open-source - +[https://github.com/torvalds/linux](https://github.com/torvalds/linux) + +History of Linux - +[https://en.wikipedia.org/wiki/History_of_Linux](https://en.wikipedia.org/wiki/History_of_Linux) + +## What are popular Linux distributions + +A linux distribution (distro) is an operating system that is based on +the linux kernel and a package management system. A package management +system consists of tools that help in installing, upgrading, +configuring and removing software on the operating system. + +Software is usually adapted to a distribution and packaged in a +distro-specific format. These packages are available through a +distro-specific repository. Packages are installed and managed in the operating +system by a package manager. + +**List of popular Linux distributions:** + +- Fedora + +- Ubuntu + +- Debian + +- CentOS + +- Red Hat Enterprise Linux + +- SUSE + +- Arch Linux + + +| Packaging systems | Distributions | Package manager +| ---------------------- | ------------------------------------------ | ----------------- +| Debian style (.deb) | Debian, Ubuntu | APT +| Red Hat style (.rpm) | Fedora, CentOS, Red Hat Enterprise Linux | YUM + +## Linux Architecture + +![](images/linux/commands/image25.png) + +- The Linux kernel is monolithic in nature. + +- System calls are used to interact with the linux kernel space. + +- Kernel code can only be executed in the kernel mode. Non-kernel code is executed in the user mode. + +- Device drivers are used to communicate with the hardware devices. 
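The kernel and distribution described above can be inspected from user space. The following is a minimal sketch, assuming a system that provides `uname` and, on most modern distributions, an `/etc/os-release` file; the variable names `kernel_name` and `kernel_release` are ours for illustration.

```shell
# Kernel name and release, as reported by the running kernel.
kernel_name="$(uname -s)"
kernel_release="$(uname -r)"
echo "Kernel: $kernel_name $kernel_release"

# Distribution name, read from /etc/os-release when the file is present.
if [ -r /etc/os-release ]; then
    # Source the file in a subshell so its variables do not leak into this shell.
    ( . /etc/os-release && echo "Distribution: ${PRETTY_NAME:-unknown}" )
fi
```

Two systems with different distributions can report the same kernel name, since the distribution is the kernel plus the packaging and tooling built around it.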
+ +## Uses of Linux Operating Systems + +Operating systems based on the linux kernel are widely used in: + +- Personal computers + +- Servers + +- Mobile phones - Android is based on the linux kernel + +- Embedded devices - watches, televisions, traffic lights etc + +- Satellites + +- Network devices - routers, switches etc. + +## Graphical user interface (GUI) vs Command line interface (CLI) + +A user interacts with a computer with the help of user interfaces. The +user interface can be either GUI or CLI. + +Graphical user interface allows a user to interact with the computer +using graphics such as icons and images. When a user clicks on an icon +to open an application on a computer, he or she is actually using the +GUI. It's easy to perform tasks using GUI. + +Command line interface allows a user to interact with the computer using +commands. A user types the command in a terminal and the system helps in +executing these commands. A new user with experience on GUI may find it +difficult to interact with CLI as he/she needs to be aware of the commands +to perform a particular operation. + +## Shell vs Terminal + +Shell is a program that takes a command or a group of commands from the +users and gives them to the operating system for processing. Shell is an +example of command line interface. Bash is one of the most popular shell +programs available on linux servers. Other popular shell programs are +zsh, ksh and tcsh. + +Terminal is a program that opens a window and lets you interact with the +shell. Some popular examples of terminals are gnome-terminal, xterm, +konsole etc. + +Linux users often use the terms shell, terminal, prompt, console etc. +interchangeably. In simple terms, these all refer to a way of taking +commands from the user. 
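The difference between the shell and the terminal can be checked from the command line. The following is a minimal sketch, assuming a linux system with the `/proc` filesystem mounted; the helper name `current_shell` is hypothetical and not a standard command.

```shell
# Login shell configured for the current user, taken from the environment.
echo "Login shell: ${SHELL:-unknown}"

# Hypothetical helper: print the path of the program running this shell.
# $$ expands to the PID of the current shell, and /proc/<pid>/exe is a
# symlink to the executable of that process (linux-specific).
current_shell() {
    readlink "/proc/$$/exe"
}

echo "Running shell: $(current_shell)"

# List the shells registered on the system, if the file exists.
if [ -r /etc/shells ]; then
    cat /etc/shells
fi
```

The login shell and the running shell can differ, for example when zsh is started from inside a bash session. The terminal program does not appear in any of these outputs, because it only hosts the shell.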
diff --git a/courses/linux_basics/linux_server_administration.md b/courses/linux_basics/linux_server_administration.md new file mode 100644 index 0000000..9da1e44 --- /dev/null +++ b/courses/linux_basics/linux_server_administration.md @@ -0,0 +1,622 @@ +# Linux Server Administration + +In this course, we will try to cover some of the common tasks that a linux +server administrator performs. We will first try to understand what a +particular command does and then try to understand the commands using +examples. Do keep in mind that it's very important to practice the linux +commands on your own. + +## Lab Environment Setup + +- Install docker on your system - [https://docs.docker.com/engine/install/](https://docs.docker.com/engine/install/) + +- We will be running all the commands on a Red Hat Enterprise Linux (RHEL) 8 system. + + ![](images/linux/admin/image19.png) + +- We will run most of the commands used in this module in the above docker container. + +## Multi-User Operating Systems + +An operating system is considered multi-user if it allows multiple people/users to use a computer without affecting each other's files and preferences. Linux-based operating systems are multi-user in nature as they allow multiple users to access the system at the same time. A typical computer will only have one keyboard and monitor but multiple users can log in via ssh if the computer is connected to the network. We will cover more about ssh later. + +As server administrators, we are mostly concerned with linux servers which are physically located far away from us. We can connect to these servers with the help of remote login methods like ssh. + +Since linux supports multiple users, we need to have a method which can protect the users from each other. 
One user should not be able to access and modify files of other users. + + +## User/Group Management in Linux + +- Each user in linux has an associated user ID called UID + +- Each user also has a home directory and a login shell associated with him/her + +- A group is a collection of one or more users. A group makes it easier to share permissions among a group of users. + +- Each group has a group ID called GID associated with it. + +### id command in linux + +id command can be used to find the uid and gid associated with a user. +It also lists the groups to which the user belongs. + +The uid and gid associated with the root user are 0. +![](images/linux/admin/image30.png) + +A good way to find out the current user in linux is to use the whoami +command. + +![](images/linux/admin/image35.png) + +**"root" user or superuser is the most privileged user with** +**unrestricted access to all the resources on the system. It has UID 0** + +### Important files associated with users/groups + +| /etc/passwd | Stores the user name, the uid, the gid, the home directory, the login shell etc | +| -------------| --------------------------------------------------------------------------------- +| /etc/shadow | Stores the hashed passwords associated with the users | +| /etc/group | Stores information about different groups on the system | + +![](images/linux/admin/image23.png) + +![](images/linux/admin/image21.png) + +![](images/linux/admin/image9.png) + +If you want to understand each field discussed in the above outputs, you can go +through the links below: + +- [https://tldp.org/LDP/lame/LAME/linux-admin-made-easy/shadow-file-formats.html](https://tldp.org/LDP/lame/LAME/linux-admin-made-easy/shadow-file-formats.html) + +- [https://tldp.org/HOWTO/User-Authentication-HOWTO/x71.html](https://tldp.org/HOWTO/User-Authentication-HOWTO/x71.html) + +## Important commands for managing users + +Some of the commands which are used frequently to manage users/groups +on linux are the 
following: + +- useradd - Creates a new user + +- passwd - Adds or modifies passwords for a user + +- usermod - Modifies attributes of a user + +- userdel - Deletes a user + +### useradd + +The useradd command adds a new user in linux. + +We will create a new user 'shivam'. We will also verify that the user +has been created by tailing the /etc/passwd file. The uid and gid are +1000 for the newly created user. The home directory assigned to the user +is /home/shivam and the login shell assigned is /bin/bash. Do note that +the user home directory and login shell can be modified later on. + +![](images/linux/admin/image41.png) + +If we do not specify any value for attributes like home directory or +login shell, default values will be assigned to the user. We can also +override these default values when creating a new user. + +![](images/linux/admin/image54.png) + +### passwd + +The passwd command is used to create or modify passwords for a user. + +In the above examples, we have not assigned any password for users +'shivam' or 'amit' while creating them. + +\"!!\" in an account entry in shadow means the account of a user has +been created, but not yet given a password. + +![](images/linux/admin/image13.png) + +Let's now try to create a password for user "shivam". + +![](images/linux/admin/image55.png) + +Do remember the password as we will need it in later +examples. + +Also, let's change the password for the root user now. When we switch +from a normal user to the root user, we will be asked for a password. +The password will also be asked for when we log in as the root user. + +![](images/linux/admin/image39.png) + +### usermod + +The usermod command is used to modify the attributes of a user like the +home directory or the shell. + +Let's try to modify the login shell of user "amit" to "/bin/bash". + +![](images/linux/admin/image17.png) + +In a similar way, you can also modify many other attributes for a user. 
+Try 'usermod -h' for a list of attributes you can modify. + +### userdel + +The userdel command is used to remove a user on linux. Once we remove a +user, all the information related to that user will be removed. + +Let's try to delete the user "amit". After deleting the user, you will +not find the entry for that user in "/etc/passwd" or "/etc/shadow" file. + +![](images/linux/admin/image34.png) + +## Important commands for managing groups + +Commands for managing groups are quite similar to the commands used for managing users. Each command is not explained in detail here as they are quite similar. You can try running these commands on your system. + + +| groupadd \ | Creates a new group | +| ------------------------ | ------------------------------- | +| groupmod \ | Modifies attributes of a group | +| groupdel \ | Deletes a group | +| gpasswd \ | Modifies password for group | + +![](images/linux/admin/image52.png) + +We will now try to add user "shivam" to the group we have created above. + +![](images/linux/admin/image33.png) + +## Becoming a Superuser in Linux + +**Before running the below commands, do make sure that you have set up a +password for user "shivam" and user "root" using the passwd command +described in the above section.** + +The su command can be used to switch users in linux. Let's now try to +switch to user "shivam". + +![](images/linux/admin/image37.png) + +Let's now try to open the "/etc/shadow" file. + +![](images/linux/admin/image29.png) + +The operating system didn't allow the user "shivam" to read the content +of the "/etc/shadow" file. This is an important file in linux which +stores the passwords of users. This file can only be accessed by root or +users who have the superuser privileges. + + +**The sudo command allows a** **user to run commands with the security +privileges of the root user.** Do remember that the root user has all +the privileges on a system. 
We can also use the su command to switch to the +root user and open the above file but doing that will require the +password of the root user. An alternative way which is preferred on most +modern operating systems is to use the sudo command for becoming a +superuser. With this approach, a user has to enter his/her own password and +needs to be a part of the sudo group. + +**How to provide superuser privileges to other users?** + +Let's first switch to the root user using the su command. Do note that using +the below command will need you to enter the password for the root user. + +![](images/linux/admin/image44.png) + +In case you forgot to set a password for the root user, type "exit" and +you will be back as the root user. Now, set up a password using the +passwd command. + +**The file /etc/sudoers holds the names of users permitted to invoke +sudo**. In redhat operating systems, this file is not present by +default. We will need to install sudo. + +![](images/linux/admin/image3.png) + +We will discuss the yum command in detail in later sections. + +Try to open the "/etc/sudoers" file on the system. The file has a lot of +information. This file stores the rules that users must follow when +running the sudo command. For example, root is allowed to run any +commands from anywhere. + +![](images/linux/admin/image8.png) + +One easy way of providing root access to users is to add them to a group +which has permissions to run all the commands. "wheel" is a group in +redhat linux with such privileges. + +![](images/linux/admin/image25.png) + +Let's add the user "shivam" to this group so that it also has sudo +privileges. + +![](images/linux/admin/image48.png) + +Let's now switch back to user "shivam" and try to access the +"/etc/shadow" file. + +![](images/linux/admin/image56.png) + +We need to use sudo before running the command since the file can only be +accessed with sudo privileges. We have already given sudo privileges +to user “shivam” by adding him to the group “wheel”. 
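Whether a user gets sudo access through group membership can be checked without root privileges. The sketch below assumes the `id` utility from coreutils; the function name `in_group` is hypothetical, and the group name `wheel` follows the redhat convention described above (Debian-based systems typically use a group named `sudo` instead).

```shell
# in_group USER GROUP: succeed if USER is a member of GROUP.
in_group() {
    local user="$1" group="$2"
    # "id -nG" prints the names of all groups the user belongs to,
    # separated by spaces; put one name per line and match it exactly.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$group"
}

user="$(id -un)"
if in_group "$user" wheel; then
    echo "$user is in the wheel group"
else
    echo "$user is not in the wheel group"
fi
```

Group membership is only one of the ways the sudoers rules can grant access, so a user outside the group may still be allowed to run sudo through a rule of their own.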
+ + +## File Permissions in Linux + +On a linux operating system, each file and directory is assigned access +permissions for the owner of the file, the members of a group of related +users and everybody else. This is to make sure that one user is not +allowed to access the files and resources of another user. + +To see the permissions of a file, we can use the ls command. Let's look +at the permissions of the /etc/passwd file. + +![](images/linux/admin/image40.png) + +Let's go over some of the important fields in the output that are +related to file permissions. + +![](images/linux/admin/image31.jpg) + + +![](images/linux/admin/image57.png) + +### Chmod command in linux + +The chmod command is used to modify file and directory permissions in +linux. + +The chmod command accepts permissions as a numerical argument. We can +think of a permission as a series of bits with 1 representing True or +allowed and 0 representing False or not allowed. + +| Permission | rwx | Binary | Decimal | +| -------------------------| ------- | ------- | --------- | +| Read, write and execute | rwx | 111 | 7 | +| Read and write | rw- | 110 | 6 | +| Read and execute | r-x | 101 | 5 | +| Read only | r-- | 100 | 4 | +| Write and execute | -wx | 011 | 3 | +| Write only | -w- | 010 | 2 | +| Execute only | --x | 001 | 1 | +| None | --- | 000 | 0 | + +We will now create a new file and check the permission of the file. + +![](images/linux/admin/image15.png) + +The group owner doesn't have the permission to write to this file. Let's +give the group owner or root the permission to write to it using the chmod +command. + +![](images/linux/admin/image26.png) + +The chmod command can also be used to change the permissions of a directory +in a similar way. + +### Chown command in linux + +The chown command is used to change the owner of files or +directories in linux.
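
Before changing ownership or permissions, it can help to inspect a file's current state. The sketch below, written in Python as an illustration, sets a numeric mode on a temporary file and reads it back in both the octal form and the `rwx` form used in the table from the chmod section. The helper names `describe_file` and `demo` are assumptions made for this example.

```python
import os
import stat
import tempfile


def describe_file(path):
    """Return (octal_mode, symbolic_mode, owner_uid) for path."""
    info = os.stat(path)
    # S_IMODE keeps only the permission bits; filemode renders the ls -l style string.
    return oct(stat.S_IMODE(info.st_mode)), stat.filemode(info.st_mode), info.st_uid


def demo():
    # Work on a throwaway file so nothing else on the system is modified.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    try:
        os.chmod(path, 0o640)  # owner: rw-, group: r--, others: ---
        return describe_file(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Each octal digit maps to one row of the permission table: 6 is `rw-`, 4 is `r--` and 0 is `---`, which is why mode 640 reads back as `-rw-r-----`.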
+ +Command syntax: chown \ \ + +![](images/linux/admin/image6.png) + +**In case we do not have root privileges, we need to use the sudo +command**. Let's switch to user 'shivam' and try changing the owner. We +have also changed the owner of the file to root before running the below +command. + +![](images/linux/admin/image12.png) + +The chown command can also be used to change the owner of a directory in a +similar way. + +### Chgrp command in linux + +The chgrp command can be used to change the group ownership of files or +directories in linux. The syntax is very similar to that of the chown +command. + +![](images/linux/admin/image27.png) + +The chgrp command can also be used to change the group ownership of a directory in a +similar way. + +## SSH Command + +The ssh command is used for logging into remote systems, transferring files between systems and executing commands on a remote machine. SSH stands for secure shell and is used to provide an encrypted, secure connection between two hosts over an insecure network like the internet. + +Reference: +[https://www.ssh.com/ssh/command/](https://www.ssh.com/ssh/command/) + +We will now discuss passwordless authentication which is secure and most +commonly used for ssh authentication. + +### Passwordless Authentication Using SSH + +Using this method, we can ssh into hosts without entering the password. +This method is also useful when we want some scripts to perform +ssh-related tasks. + +Passwordless authentication requires the use of a public and private key pair. As the name implies, the public key can be shared with anyone but the private key should be kept private. +Let's not get into the details of how this authentication works. You can read more about it +[here](https://www.digitalocean.com/community/tutorials/understanding-the-ssh-encryption-and-connection-process). + +Steps for setting up passwordless authentication with a remote host: + +1.
 Generating public-private key pair + + **If we already have a key pair stored in the \~/.ssh directory, we will not need to generate keys again.** + + Install the openssh package which contains all the commands related to ssh. + + ![](images/linux/admin/image49.png) + + Generate a key pair using the ssh-keygen command. One can choose the + default values for all prompts. + + ![](images/linux/admin/image47.png) + + After running the ssh-keygen command successfully, we should see two + keys present in the \~/.ssh directory. id_rsa is the private key and + id_rsa.pub is the public key. Do note that the private key can only be + read and modified by you. + + ![](images/linux/admin/image7.png) + +2. Transferring the public key to the remote host + + There are multiple ways to transfer the public key to the remote server. + We will look at one of the most common ways of doing it using the + ssh-copy-id command. + + ![](images/linux/admin/image11.png) + + Install the openssh-clients package to use the ssh-copy-id command. + + ![](images/linux/admin/image46.png) + + Use the ssh-copy-id command to copy your public key to the remote host. + + ![](images/linux/admin/image50.png) + + Now, ssh into the remote host using password authentication. + + ![](images/linux/admin/image51.png) + + Our public key should be there in \~/.ssh/authorized_keys now. + + ![](images/linux/admin/image4.png) + + \~/.ssh/authorized_keys contains a list of public keys. The users + associated with these public keys have ssh access to the remote + host. + + +### How to run commands on a remote host ? + +General syntax: ssh \@\ \ + +![](images/linux/admin/image14.png) + +### How to transfer files from one host to another host ? + +General syntax: scp \ \ + +![](images/linux/admin/image32.png) + +## Package Management + +Package management is the process of installing and managing software on +the system. We can install the packages which we require from the linux +package distributor.
Different distributors use different packaging +systems. + +| Packaging systems | Distributions | +| ---------------------- | ------------------------------------------ | +| Debian style (.deb) | Debian, Ubuntu | +| Red Hat style (.rpm) | Fedora, CentOS, Red Hat Enterprise Linux | + +**Popular Packaging Systems in Linux** + +|Command | Description | +| ----------------------------- | --------------------------------------------------- | +| yum install \ | Installs a package on your system | +| yum update \ | Updates a package to its latest available version | +| yum remove \ | Removes a package from your system | +| yum search \ | Searches for a particular keyword | + +[DNF](https://docs.fedoraproject.org/en-US/quick-docs/dnf/) is +the successor to YUM which is now used in Fedora for installing and +managing packages. DNF may replace YUM in the future on all RPM based +linux distributions. + +![](images/linux/admin/image20.png) + +We did find an exact match for the keyword httpd when we searched using +the yum search command. Let's now install the httpd package. + +![](images/linux/admin/image28.png) + +After httpd is installed, we will use the yum remove command to remove +the httpd package. + +![](images/linux/admin/image43.png) + +## Process Management + +In this section, we will look at some useful commands that can be +used to monitor the processes on linux systems. + +### ps (process status) + +The ps command is used to display information about a process or a list of +processes. + +![](images/linux/admin/image24.png) + +If you get an error "ps command not found" while running the ps command, +install the **procps** package. + +ps without any arguments is not very useful. Let's try to list all the +processes on the system by using the below command.
+ +Reference: +[https://unix.stackexchange.com/questions/106847/what-does-aux-mean-in-ps-aux](https://unix.stackexchange.com/questions/106847/what-does-aux-mean-in-ps-aux) + +![](images/linux/admin/image42.png) + +We can use an additional argument with ps command to list the +information about the process with a specific process ID. + +![](images/linux/admin/image2.png) + +We can use grep in combination with ps command to list only specific +processes. + +![](images/linux/admin/image1.png) + +### top + +The top command is used to show information about linux processes +running on the system in real time. It also shows a summary of the +system information. + +![](images/linux/admin/image53.png) + +For each process, top lists down the process ID, owner, priority, state, +cpu utilization, memory utilization and much more information. It also +lists down the memory utilization and cpu utilization of the system as a +whole along with system uptime and cpu load average. + +## Memory Management + +In this section, we will study about some useful commands that can be +used to view information about the system memory. + +### free + +The free command is used to display the memory usage of the system. The +command displays the total free and used space available in the RAM +along with space occupied by the caches/buffers. + +![](images/linux/admin/image22.png) + +free command by default shows the memory usage in kilobytes. We can use +an additional argument to get the data in human-readable format. + +![](images/linux/admin/image5.png) + +### vmstat + +The vmstat command can be used to display the memory usage along with +additional information about io and cpu usage. + +![](images/linux/admin/image38.png) + +## Checking Disk Space in Linux + +In this section, we will study about some useful commands that can be +used to view disk space on linux. + +### df (disk free) + +The df command is used to display the free and available space for each +mounted file system. 
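
The same kind of information can be read from a script. As a minimal sketch, assuming Python 3 is available on the host, the standard library function `shutil.disk_usage` reports total, used and free space for the file system that holds a given path. The helper name `disk_usage_summary` and the choice to report values in GiB are assumptions made for this example.

```python
import shutil


def disk_usage_summary(path="/"):
    """Return total, used and free space in GiB for the file system holding path."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "total_gib": round(usage.total / gib, 2),
        "used_gib": round(usage.used / gib, 2),
        "free_gib": round(usage.free / gib, 2),
    }


if __name__ == "__main__":
    print(disk_usage_summary("/"))
```

Unlike df, which lists every mounted file system, this sketch looks at a single path at a time, so a monitoring script would call it once per mount point of interest.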
+ +![](images/linux/admin/image36.png) + +### du (disk usage) + +The du command is used to display disk usage of files and directories on +the system. + +![](images/linux/admin/image10.png) + +The below command can be used to display the top 5 largest directories +in the root directory. + +![](images/linux/admin/image18.png) + +## Daemons + +A computer program that runs as a background process is called a daemon. +Traditionally, the names of daemon processes end with d - sshd, httpd +etc. We cannot interact directly with a daemon process as it runs in the +background. + +The terms service and daemon are used interchangeably most of the time. + +## Systemd + +Systemd is a system and service manager for Linux operating systems. +Systemd units are the building blocks of systemd. These units are +represented by unit configuration files. + +The example below shows the unit configuration files available at +/usr/lib/systemd/system which are distributed by installed RPM packages. +We are more interested in the configuration files that end with service +as these are service units. + +![](images/linux/admin/image16.png) + +### Managing System Services + +Service units end with the .service file extension. The systemctl command can be +used to start/stop/restart the services managed by systemd. + +| Command | Description | +| ------------------------------- | -------------------------------------- | +| systemctl start name.service | Starts a service | +| systemctl stop name.service | Stops a service | +| systemctl restart name.service | Restarts a service | +| systemctl status name.service | Check the status of a service | +| systemctl reload name.service | Reload the configuration of a service | + +## Logs + +In this section, we will talk about some important files and directories +which can be very useful for viewing system logs and application logs +in linux. These logs can be very useful when you are troubleshooting on +the system.
+ +![](images/linux/admin/image58.png) + +## Applications in SRE Role + +- Different users will have different permissions depending on their + roles. We will also not want everyone in the company to access our + servers for security reasons. Users permissions can be restricted + with chown, chmod and chgrp commands. + +- SSH is one of the most frequently used commands for a SRE. Logging + into servers and troubleshooting along with performing basic + administration tasks will only be possible if we are able to login + into the server. + +- What if we want to run an apache server or nginx on a server ? We + will first install it using the package manager. Package + management commands become important here. + +- Managing services on servers is another critical responsibility of a + SRE. Systemd related commands can help in troubleshooting issues. + If a service goes down, we can start it using systemctl start + command. We can also stop a service in case it is not needed. + +- Monitoring is another core responsibility of a SRE. Memory and CPU + are two important system level metrics which should be monitored. + Commands like top and free are quite helpful here. + +- If a service is throwing an error, how do we find out the root cause + of the error ? We will certainly need to check logs to find out + the whole stack trace of the error. The log file will also tell us + the number of times the error has occurred along with time when it + started. + +## Useful courses and tutorials + +- Edx Red Hat Enterprise Linux Course - [https://courses.edx.org/courses/course-v1:RedHat+RH066x+2T2017/course/](https://courses.edx.org/courses/course-v1:RedHat+RH066x+2T2017/course/) diff --git a/courses/linux_networking/conclusion.md b/courses/linux_networking/conclusion.md new file mode 100644 index 0000000..0b15e6d --- /dev/null +++ b/courses/linux_networking/conclusion.md @@ -0,0 +1,11 @@ +# Conclusion + +With this we have traversed through the TCP/IP stack completely. 
We hope there will be a different perspective when one opens any website in the browser after the course. + +During the course we have also dissected the common tasks in this pipeline which fall under the ambit of SRE. + +# Post Training Exercises +1. Set up your own DNS resolver in the dev environment which acts as an authoritative DNS server for example.com and forwarder for other domains. Update resolv.conf to use the new DNS resolver running in localhost +2. Set up a site dummy.example.com in localhost and run a webserver with a self signed certificate. Update the trusted CAs or pass self signed CA’s public key as a parameter so that curl https://dummy.example.com -v works properly without self signed cert warning +3. Update the routing table to use another host(container/VM) in the same network as a gateway for 8.8.8.8/32 and run ping 8.8.8.8. Do the packet capture on the new gateway to see L3 hop is working as expected(might need to disable icmp_redirect) + diff --git a/courses/linux_networking/dns.md b/courses/linux_networking/dns.md new file mode 100644 index 0000000..2ff14a8 --- /dev/null +++ b/courses/linux_networking/dns.md @@ -0,0 +1,142 @@ +# DNS +Domain Names are the simple human-readable names for websites. The Internet understands only IP addresses, but since memorizing incoherent numbers is not practical, domain names are used instead. These domain names are translated into IP addresses by the DNS infrastructure. When somebody tries to open www.linkedin.com in the browser, the browser tries to convert www.linkedin.com to an IP Address. This process is called DNS resolution. A simple pseudocode depicting this process looks like this + +```python +ip, err = getIPAddress(domainName) +if err: +    print("unknown Host Exception while trying to resolve: {}".format(domainName)) +``` + +Now let’s try to understand what happens inside the getIPAddress function.
The browser would have a DNS cache of its own where it checks if there is a mapping for the domainName to an IP Address already available, in which case the browser uses that IP address. If no such mapping exists, the browser calls the gethostbyname library function to ask the operating system to find the IP address for the given domainName + +```python +def getIPAddress(domainName): +    resp, fail = lookupCache(domainName) +    if not fail: +        return resp, None +    else: +        resp, err = gethostbyname(domainName) +        if err: +            return None, err +        else: +            return resp, None +``` + +Now let's understand what happens when the gethostbyname function is called. On Linux, the C library's resolver looks at the file [/etc/nsswitch.conf](https://man7.org/linux/man-pages/man5/nsswitch.conf.5.html), which usually has a line + +```bash +hosts: files dns +``` + +This line means the OS has to look up first in file (/etc/hosts) and then use DNS protocol to do the resolution if there is no match in /etc/hosts. + +The file /etc/hosts is of format + + +IPAddress FQDN [FQDN].* + +```bash +127.0.0.1 localhost.localdomain localhost +::1 localhost.localdomain localhost +``` + +If a match exists for a domain in this file then that IP address is returned by the OS. Let's add a line to this file + +```bash +127.0.0.1 test.linkedin.com +``` + +And then do ping test.linkedin.com + +```bash +ping test.linkedin.com -n +``` + +```bash +PING test.linkedin.com (127.0.0.1) 56(84) bytes of data. +64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.047 ms +64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.036 ms +64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.037 ms + +``` + +As mentioned earlier, if no match exists in /etc/hosts, the OS tries to do a DNS resolution using the DNS protocol. The linux system makes a DNS request to the first IP in /etc/resolv.conf. If there is no response, requests are sent to subsequent servers in resolv.conf. These servers in resolv.conf are called DNS resolvers.
The DNS resolvers are populated by [DHCP](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol) or statically configured by an administrator. +[Dig](https://linux.die.net/man/1/dig) is a userspace DNS system which creates and sends requests to DNS resolvers and prints the response it receives to the console. + +```bash +#run this command in one shell to capture all DNS requests +sudo tcpdump -s 0 -A -i any port 53 +#make a dig request from another shell +dig linkedin.com +``` + +```bash +13:19:54.432507 IP 172.19.209.122.56497 > 172.23.195.101.53: 527+ [1au] A? linkedin.com. (41) +....E..E....@.n....z...e...5.1.:... .........linkedin.com.......)........ +13:19:54.485131 IP 172.23.195.101.53 > 172.19.209.122.56497: 527 1/0/1 A 108.174.10.10 (57) +....E..U..@.|. ....e...z.5...A...............linkedin.com..............3..l. + +..)........ +``` + + +The packet capture shows a request is made to 172.23.195.101:53 (this is the resolver in /etc/resolv.conf) for linkedin.com and a response is received from 172.23.195.101 with the IP address of linkedin.com 108.174.10.10 + +Now let's try to understand how DNS resolver tries to find the IP address of linkedin.com. DNS resolver first looks at its cache. Since many devices in the network can query for the domain name linkedin.com, the name resolution result may already exist in the cache. If there is a cache miss, it starts the DNS resolution process. The DNS server breaks “linkedin.com” to “.”, “com.” and “linkedin.com.” and starts DNS resolution from “.”. The “.” is called root domain and those IPs are known to the DNS resolver software. DNS resolver queries the root domain Nameservers to find the right nameservers which could respond regarding details for "com.". The address of the authoritative nameserver of “com.” is returned. Now the DNS resolution service contacts the authoritative nameserver for “com.” to fetch the authoritative nameserver for “linkedin.com”. 
Once an authoritative nameserver of “linkedin.com” is known, the resolver contacts Linkedin’s nameserver to provide the IP address of “linkedin.com”. This whole process can be visualized by running + +```bash +dig +trace linkedin.com +``` + + +```bash +linkedin.com. 3600 IN A 108.174.10.10 +``` +This DNS response has 5 fields where the first field is the request and the last field is the response. The second field is the Time to Live which says how long the DNS response is valid in seconds. In this case this mapping of linkedin.com is valid for 1 hour. This is how the resolvers and application(browser) maintain their cache. Any request for linkedin.com beyond 1 hour will be treated as a cache miss as the mapping has expired its TTL and the whole process has to be redone. +The 4th field says the type of DNS response/request. Some of the various DNS query types are +A, AAAA, NS, TXT, PTR, MX and CNAME. +- A record returns IPV4 address of the domain name +- AAAA record returns the IPV6 address of the domain Name +- NS record returns the authoritative nameserver for the domain name +- CNAME records are aliases to the domain names. Some domains point to other domain names and resolving the latter domain name gives an IP which is used as an IP for the former domain name as well. Example www.linkedin.com’s IP address is the same as 2-01-2c3e-005a.cdx.cedexis.net. +- For the brevity we are not discussing other DNS record types, the RFC of each of these records are available [here](https://en.wikipedia.org/wiki/List_of_DNS_record_types). + +```bash +dig A linkedin.com +short +108.174.10.10 + + +dig AAAA linkedin.com +short +2620:109:c002::6cae:a0a + + +dig NS linkedin.com +short +dns3.p09.nsone.net. +dns4.p09.nsone.net. +dns2.p09.nsone.net. +ns4.p43.dynect.net. +ns1.p43.dynect.net. +ns2.p43.dynect.net. +ns3.p43.dynect.net. +dns1.p09.nsone.net. + +dig www.linkedin.com CNAME +short +2-01-2c3e-005a.cdx.cedexis.net. 
+``` +Armed with these fundamentals of DNS, let's see use cases where DNS is used by SREs. + +## Applications in SRE role + +This section covers some of the common solutions SRE can derive from DNS +1. Every company has to have its internal DNS infrastructure for intranet sites and internal services like databases and other internal applications like wiki. So there has to be a DNS infrastructure maintained for those domain names by the infrastructure team. This DNS infrastructure has to be optimized and scaled so that it doesn’t become a single point of failure. Failure of the internal DNS infrastructure can cause API calls of microservices to fail and other cascading effects. +2. DNS can also be used for discovering services. For example the hostname serviceb.internal.example.com could list instances which run service b internally in example.com company. Cloud providers provide options to enable DNS discovery([example](https://docs.aws.amazon.com/whitepapers/latest/microservices-on-aws/service-discovery.html#dns-based-service-discovery)) +3. DNS is used by cloud providers and CDN providers to scale their services. In Azure/AWS, Load Balancers are given a CNAME instead of IPAddress. They update the IPAddress of the Loadbalancers as they scale by changing the IP Address of alias domain names. This is one of the reasons why A records of such alias domains are short lived like 1 minute. +4. DNS can also be used to make clients get IP addresses closer to their location so that their HTTP calls can be responded to faster if the company has a geographically distributed presence. +5. SREs also have to understand that since there is no verification in the DNS infrastructure, these responses can be spoofed. This is safeguarded by other protocols like HTTPS(dealt later). DNSSEC protects from forged or manipulated DNS responses. +6. Stale DNS cache can be a problem.
Some [apps](https://stackoverflow.com/questions/1256556/how-to-make-java-honor-the-dns-caching-timeout) might still be using expired DNS records for their api calls. This is something SRE has to be wary of when doing maintenance. +7. DNS Loadbalancing and service discovery also has to understand TTL and the servers can be removed from the pool only after waiting till TTL post the changes are made to DNS records. If this is not done, a certain portion of the traffic will fail as the server is removed before the TTL. + + + + + diff --git a/courses/linux_networking/http.md b/courses/linux_networking/http.md new file mode 100644 index 0000000..a02adff --- /dev/null +++ b/courses/linux_networking/http.md @@ -0,0 +1,129 @@ +# HTTP + +Till this point we have only got the IP address of linkedin.com. The HTML page of linkedin.com is served by HTTP protocol which the browser renders. Browser sends a HTTP request to the IP of the server determined above. +Request has a verb GET, PUT, POST followed by a path and query parameters and lines of key value pair which gives information about the client and capabilities of the client like contents it can accept and a body (usually in POST or PUT) + +```bash +# Eg run the following in your container and have a look at the headers +curl linkedin.com -v +``` +```bash +* Connected to linkedin.com (108.174.10.10) port 80 (#0) +> GET / HTTP/1.1 +> Host: linkedin.com +> User-Agent: curl/7.64.1 +> Accept: */* +> +< HTTP/1.1 301 Moved Permanently +< Date: Mon, 09 Nov 2020 10:39:43 GMT +< X-Li-Pop: prod-esv5 +< X-LI-Proto: http/1.1 +< Location: https://www.linkedin.com/ +< Content-Length: 0 +< +* Connection #0 to host linkedin.com left intact +* Closing connection 0 +``` + +Here, in the first line GET is the verb, / is the path and 1.1 is the HTTP protocol version. Then there are key value pairs which give client capabilities and some details to the server. The server responds back with HTTP version, Status Code and Status message. 
Status codes 2xx means success, 3xx denotes redirection, 4xx denotes client side errors and 5xx server side errors. + +We will now jump in to see the difference between HTTP/1.0 and HTTP/1.1. + +```bash +#On the terminal type +telnet www.linkedin.com 80 +#Copy and paste the following with an empty new line at last in the telnet STDIN +GET / HTTP/1.1 +HOST:linkedin.com +USER-AGENT: curl + +``` + + +This would get server response and waits for next input as the underlying connection to www.linkedin.com can be reused for further queries. While going through TCP, we can understand the benefits of this. But in HTTP/1.0 this connection will be immediately closed after the response meaning new connection has to be opened for each query. HTTP/1.1 can have only one inflight request in an open connection but connection can be reused for multiple requests one after another. One of the benefits of HTTP/2.0 over HTTP/1.1 is we can have multiple inflight requests on the same connection. We are restricting our scope to generic HTTP and not jumping to the intricacies of each protocol version but they should be straight forward to understand post the course. + +HTTP is called **stateless protocol**. This section we will try to understand what stateless means. Say we logged in to linkedin.com, each request to linkedin.com from the client will have no context of the user and it makes no sense to prompt user to login for each page/resource. This problem of HTTP is solved by *COOKIE*. A user is created a session when a user logs in. This session identifier is sent to the browser via *SET-COOKIE* header. The browser stores the COOKIE till the expiry set by the server and sends the cookie for each request from hereon for linkedin.com. More details on cookies are available [here](https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies). 
Cookies are a critical piece of information like password and since HTTP is a plain text protocol, any man in the middle can capture either password or cookies and can breach the privacy of the user. Similarly as discussed during DNS a spoofed IP of linkedin.com can cause a phishing attack on users where an user can give linkedin’s password to login on the malicious site. To solve both problems HTTPs came in place and HTTPs has to be mandated. + +HTTPS has to provide server identification and encryption of data between client and server. The server administrator has to generate a private public key pair and certificate request. This certificate request has to be signed by a certificate authority which converts the certificate request to a certificate. The server administrator has to update the certificate and private key to the webserver. The certificate has details about the server (like domain name for which it serves, expiry date), public key of the server. The private key is a secret to the server and losing the private key loses the trust the server provides. When clients connect, the client sends a HELLO. The server sends its certificate to the client. The client checks the validity of the cert by seeing if it is within its expiry time, if it is signed by a trusted authority and the hostname in the cert is the same as the server. This validation makes sure the server is the right server and there is no phishing. Once that is validated, the client negotiates a symmetrical key and cipher with the server by encrypting the negotiation with the public key of the server. Nobody else other than the server who has the private key can understand this data. Once negotiation is complete, that symmetric key and algorithm is used for further encryption which can be decrypted only by client and server from thereon as they only know the symmetric key and algorithm. 
The switch to symmetric algorithm from asymmetric encryption algorithm is to not strain the resources of client devices as symmetric encryption is generally less resource intensive than asymmetric. + +```bash +#Try the following on your terminal to see the cert details like Subject Name(domain name), Issuer details, Expiry date +curl https://www.linkedin.com -v +``` +```bash +* Connected to www.linkedin.com (13.107.42.14) port 443 (#0) +* ALPN, offering h2 +* ALPN, offering http/1.1 +* successfully set certificate verify locations: +* CAfile: /etc/ssl/cert.pem + CApath: none +* TLSv1.2 (OUT), TLS handshake, Client hello (1): +} [230 bytes data] +* TLSv1.2 (IN), TLS handshake, Server hello (2): +{ [90 bytes data] +* TLSv1.2 (IN), TLS handshake, Certificate (11): +{ [3171 bytes data] +* TLSv1.2 (IN), TLS handshake, Server key exchange (12): +{ [365 bytes data] +* TLSv1.2 (IN), TLS handshake, Server finished (14): +{ [4 bytes data] +* TLSv1.2 (OUT), TLS handshake, Client key exchange (16): +} [102 bytes data] +* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1): +} [1 bytes data] +* TLSv1.2 (OUT), TLS handshake, Finished (20): +} [16 bytes data] +* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1): +{ [1 bytes data] +* TLSv1.2 (IN), TLS handshake, Finished (20): +{ [16 bytes data] +* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384 +* ALPN, server accepted to use h2 +* Server certificate: +* subject: C=US; ST=California; L=Sunnyvale; O=LinkedIn Corporation; CN=www.linkedin.com +* start date: Oct 2 00:00:00 2020 GMT +* expire date: Apr 2 12:00:00 2021 GMT +* subjectAltName: host "www.linkedin.com" matched cert's "www.linkedin.com" +* issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA +* SSL certificate verify ok. 
+* Using HTTP2, server supports multi-use +* Connection state changed (HTTP/2 confirmed) +* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0 +* Using Stream ID: 1 (easy handle 0x7fb055808200) +* Connection state changed (MAX_CONCURRENT_STREAMS == 100)! + 0 82117 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 +* Connection #0 to host www.linkedin.com left intact +HTTP/2 200 +cache-control: no-cache, no-store +pragma: no-cache +content-length: 82117 +content-type: text/html; charset=utf-8 +expires: Thu, 01 Jan 1970 00:00:00 GMT +set-cookie: JSESSIONID=ajax:2747059799136291014; SameSite=None; Path=/; Domain=.www.linkedin.com; Secure +set-cookie: lang=v=2&lang=en-us; SameSite=None; Path=/; Domain=linkedin.com; Secure +set-cookie: bcookie="v=2&70bd59e3-5a51-406c-8e0d-dd70befa8890"; domain=.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; SameSite=None +set-cookie: bscookie="v=1&202011091050107ae9b7ac-fe97-40fc-830d-d7a9ccf80659AQGib5iXwarbY8CCBP94Q39THkgUlx6J"; domain=.www.linkedin.com; Path=/; Secure; Expires=Wed, 09-Nov-2022 22:27:42 GMT; HttpOnly; SameSite=None +set-cookie: lissc=1; domain=.linkedin.com; Path=/; Secure; Expires=Tue, 09-Nov-2021 10:50:10 GMT; SameSite=None +set-cookie: lidc="b=VGST04:s=V:r=V:g=2201:u=1:i=1604919010:t=1605005410:v=1:sig=AQHe-KzU8i_5Iy6MwnFEsgRct3c9Lh5R"; Expires=Tue, 10 Nov 2020 10:50:10 GMT; domain=.linkedin.com; Path=/; SameSite=None; Secure +x-fs-txn-id: 2b8d5409ba70 +x-fs-uuid: 61bbf94956d14516302567fc882b0000 +expect-ct: max-age=86400, report-uri="https://www.linkedin.com/platform-telemetry/ct" +x-xss-protection: 1; mode=block +content-security-policy-report-only: default-src 'none'; connect-src 'self' www.linkedin.com www.google-analytics.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://linkedin.sc.omtrdc.net/b/ss/ static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; script-src 'sha256-THuVhwbXPeTR0HszASqMOnIyxqEgvGyBwSPBKBF/iMc=' 
'sha256-PyCXNcEkzRWqbiNr087fizmiBBrq9O6GGD8eV3P09Ik=' 'sha256-2SQ55Erm3CPCb+k03EpNxU9bdV3XL9TnVTriDs7INZ4=' 'sha256-S/KSPe186K/1B0JEjbIXcCdpB97krdzX05S+dHnQjUs=' platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src 'self' 'unsafe-inline' static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com; media-src dms.licdn.com; child-src blob: *; frame-src 'self' lnkd.demdex.net linkedin.cdn.qualaroo.com; manifest-src 'self'; report-uri https://www.linkedin.com/platform-telemetry/csp?f=g +content-security-policy: default-src *; connect-src 'self' https://media-src.linkedin.com/media/ www.linkedin.com s.c.lnkd.licdn.com m.c.lnkd.licdn.com s.c.exp1.licdn.com s.c.exp2.licdn.com m.c.exp1.licdn.com m.c.exp2.licdn.com wss://*.linkedin.com dms.licdn.com https://dpm.demdex.net/id lnkd.demdex.net blob: https://accounts.google.com/gsi/status https://linkedin.sc.omtrdc.net/b/ss/ www.google-analytics.com static.licdn.com static-exp1.licdn.com static-exp2.licdn.com static-exp3.licdn.com media.licdn.com media-exp1.licdn.com media-exp2.licdn.com media-exp3.licdn.com; img-src data: blob: *; font-src data: *; style-src 'unsafe-inline' 'self' static-src.linkedin.com *.licdn.com; script-src 'report-sample' 'unsafe-inline' 'unsafe-eval' 'self' spdy.linkedin.com static-src.linkedin.com *.ads.linkedin.com *.licdn.com static.chartbeat.com www.google-analytics.com ssl.google-analytics.com bcvipva02.rightnowtech.com www.bizographics.com sjs.bizographics.com js.bizographics.com d.la4-c1-was.salesforceliveagent.com slideshare.www.linkedin.com https://snap.licdn.com/li.lms-analytics/ platform.linkedin.com platform-akam.linkedin.com platform-ecst.linkedin.com platform-azur.linkedin.com; object-src 'none'; media-src blob: *; child-src blob: lnkd-communities: voyager: *; frame-ancestors 'self'; 
report-uri https://www.linkedin.com/platform-telemetry/csp?f=l +x-frame-options: sameorigin +x-content-type-options: nosniff +strict-transport-security: max-age=2592000 +x-li-fabric: prod-lva1 +x-li-pop: afd-prod-lva1 +x-li-proto: http/2 +x-li-uuid: Ybv5SVbRRRYwJWf8iCsAAA== +x-msedge-ref: Ref A: CFB9AC1D2B0645DDB161CEE4A4909AEF Ref B: BOM02EDGE0712 Ref C: 2020-11-09T10:50:10Z +date: Mon, 09 Nov 2020 10:50:10 GMT + +* Closing connection 0 +``` + +Here my system keeps the list of certificate authorities it trusts in the file /etc/ssl/cert.pem. Curl validates that the certificate is for www.linkedin.com by checking the subjectAltName (and the CN in the subject) of the certificate. It makes sure the certificate has not expired by checking the expire date. It also validates the signature on the certificate using the public key of the issuer DigiCert, chaining up to a CA listed in /etc/ssl/cert.pem. With the certificate verified, client and server use the cipher TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384: the RSA key of www.linkedin.com authenticates the ECDHE key exchange, and that exchange yields the symmetric key. Subsequent data transfer, including the first HTTP request, uses the same cipher and symmetric key. 
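The checks curl reports above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard `ssl` and `socket` modules; the function names `fetch_certificate` and `is_expired` are hypothetical helpers introduced for illustration, not part of the course material.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and return the protocol, cipher and certificate fields.

    create_default_context() loads the system trust store and enables hostname
    checking, so the handshake raises an ssl.SSLError subclass when the chain,
    the validity period or the hostname does not validate.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),      # negotiated TLS version
                "cipher": tls.cipher()[0],      # negotiated cipher suite name
                "subject": cert.get("subject"),
                "issuer": cert.get("issuer"),
                "subject_alt_names": cert.get("subjectAltName"),
                "not_after": cert.get("notAfter"),
            }


def is_expired(not_after: str, now_epoch: float) -> bool:
    """Return True if the certificate's notAfter time is at or before now_epoch."""
    return ssl.cert_time_to_seconds(not_after) <= now_epoch
```

Calling `fetch_certificate("www.linkedin.com")` needs network access and should return the same kind of subject, issuer and expiry details that `curl -v` prints; the exact values depend on the certificate being served at the time.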
+ + diff --git a/courses/linux_networking/images/arp.gif b/courses/linux_networking/images/arp.gif new file mode 100644 index 0000000..3395030 Binary files /dev/null and b/courses/linux_networking/images/arp.gif differ diff --git a/courses/linux_networking/images/closed.png b/courses/linux_networking/images/closed.png new file mode 100644 index 0000000..f1d98ed Binary files /dev/null and b/courses/linux_networking/images/closed.png differ diff --git a/courses/linux_networking/images/established.png b/courses/linux_networking/images/established.png new file mode 100644 index 0000000..5879e9e Binary files /dev/null and b/courses/linux_networking/images/established.png differ diff --git a/courses/linux_networking/images/pcap.png b/courses/linux_networking/images/pcap.png new file mode 100644 index 0000000..2209864 Binary files /dev/null and b/courses/linux_networking/images/pcap.png differ diff --git a/courses/linux_networking/intro.md b/courses/linux_networking/intro.md new file mode 100644 index 0000000..34f7484 --- /dev/null +++ b/courses/linux_networking/intro.md @@ -0,0 +1,26 @@ +# Linux Networking Fundamentals + +## Prerequisites + +This course requires high-level knowledge of commonly used jargon in TCP/IP stack like DNS, TCP, UDP and HTTP. Basic familiarity with Linux jargon is sufficient to start this course. This course also expects basic exposure to Linux command-line tools. The course will require you to install certain utilities and run them as a part of the course exercises. + +## What to expect from this course + +Throughout the course, we cover how an SRE can optimize the system to improve their web stack performance and troubleshoot if there is an issue in any of the layers of the networking stack. This course tries to dig through each layer of traditional TCP/IP stack and expects an SRE to have a picture beyond the bird’s eye view of the functioning of the Internet. 
+ +## What is not covered under this course + +This course spends time on the fundamentals. We are not covering concepts like HTTP/2.0, QUIC, TCP congestion control protocols, Anycast, BGP, CDN, Tunnels and Multicast. We expect that this course will provide the relevant basics to understand such concepts. + +## Course Content + +### Birds eye view of the course + +The course covers the question “What happens when you open linkedin.com in your browser?” The course follows the flow of the TCP/IP stack. More specifically, the course covers the application layer protocols DNS and HTTP, the transport layer protocols UDP and TCP, the network layer protocol IP and the data link layer protocols. + +## Table of Contents +1. [DNS](https://linkedin.github.io/school-of-sre/linux_networking/dns/) +2. [UDP](https://linkedin.github.io/school-of-sre/linux_networking/udp/) +3. [HTTP](https://linkedin.github.io/school-of-sre/linux_networking/http/) +4. [TCP](https://linkedin.github.io/school-of-sre/linux_networking/tcp/) +5. [IP Routing](https://linkedin.github.io/school-of-sre/linux_networking/ipr/) diff --git a/courses/linux_networking/ipr.md b/courses/linux_networking/ipr.md new file mode 100644 index 0000000..2e404fc --- /dev/null +++ b/courses/linux_networking/ipr.md @@ -0,0 +1,32 @@ +# IP Routing and Data Link Layer +We will dig into how packets that leave the client reach the server and vice versa. By the time the packet reaches the IP layer, the transport layer has populated the source and destination ports. The IP/Network layer populates the destination IP (discovered from DNS) and then looks up the route to the destination IP in the routing table. 
+ +```bash +#Linux route -n command gives the default routing table +route -n +``` + +```bash +Kernel IP routing table +Destination Gateway Genmask Flags Metric Ref Use Iface +0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0 0 eth0 +172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0 +``` + +Here the destination IP is bitwise AND’d with the Genmask, and if the result equals the Destination column of that row, then that row's gateway and interface are picked for routing. Here linkedin.com’s IP 108.174.10.10 is AND’d with 255.255.0.0 and the answer we get is 108.174.0.0, which doesn’t match 172.17.0.0, the destination of that row. Then Linux does an AND of the destination IP with 0.0.0.0 and we get 0.0.0.0. This answer matches the default row. + +The routing table is processed in order of the most 1 bits set in the genmask (the most specific route first), and genmask 0.0.0.0 is the default route if nothing else matches. +At the end of this operation Linux has figured out that the packet has to be sent to the next hop 172.17.0.1 via eth0. The source IP of the packet will be set to the IP of interface eth0. +Now, to send the packet to 172.17.0.1, Linux has to figure out the MAC address of 172.17.0.1. The MAC address is found by looking at the internal ARP cache, which stores the translation between IP addresses and MAC addresses. If there is a cache miss, Linux broadcasts an ARP request within the internal network asking who has 172.17.0.1. The owner of the IP sends an ARP response, which is cached by the kernel, and the kernel sends the packet to the gateway by setting the source MAC address to the MAC address of eth0 and the destination MAC address to that of 172.17.0.1, which we just obtained. A similar routing lookup is followed at each hop till the packet reaches the actual server. The transport layer and the layers above it come into play only at the end servers. At intermediate hops, only the layers up to the IP/Network layer are involved. + +![Screengrab for above explanation](images/arp.gif) + +One weird gateway we saw in the routing table is 0.0.0.0. 
This gateway means no Layer3 (Network layer) hop is needed to send the packet. Both source and destination are in the same network. The kernel has to figure out the MAC address of the destination, populate the source and destination MAC addresses appropriately and send the packet out so that it reaches the destination without any Layer3 hop in the middle. + +As we did in other modules, let's complete this session with SRE use cases. + +## Applications in SRE role +1. Generally the routing table is populated by DHCP and playing around with it is not a good practice. There can be reasons where one has to modify the routing table, but take that path only when it's absolutely necessary. +2. Understanding error messages better: for example, a “No route to host” error can mean the MAC address of the destination host was not found, which in turn can mean the destination host is down. +3. In rare cases, looking at the ARP table can help us understand if there is an IP conflict where the same IP is assigned to two hosts by mistake and this is causing unexpected behavior. + diff --git a/courses/linux_networking/tcp.md b/courses/linux_networking/tcp.md new file mode 100644 index 0000000..4a194eb --- /dev/null +++ b/courses/linux_networking/tcp.md @@ -0,0 +1,35 @@ +# TCP + +TCP is a transport layer protocol like UDP, but it also provides reliability, flow control and congestion control. +TCP guarantees reliable delivery by using sequence numbers. A TCP connection is established by a three way handshake. In our case, the client sends a SYN packet along with the starting sequence number it plans to use, and the server acknowledges the SYN packet and sends a SYN with its own sequence number. Once the client acknowledges the server's SYN packet, the connection is established. 
Each segment transferred from here on is considered delivered reliably once an acknowledgement for that sequence number is received by the sender. + +![3-way handshake](images/established.png) + +```bash +#To understand the handshake, run a packet capture in one bash session +tcpdump -S -i any port 80 +#Run curl in another bash session +curl www.linkedin.com +``` + +![tcpdump-3way](images/pcap.png) + + +Here the client sends a SYN, shown by the [S] flag, with sequence number 1522264672. The server acknowledges receipt of the SYN with an ack [.] flag and sends a SYN flag [S] for its own sequence number. The server uses the sequence number 1063230400 and tells the client it is expecting sequence number 1522264673 (client sequence+1). The client sends a zero length acknowledgement packet to the server (server sequence+1) and the connection stands established. This is called the three way handshake. The client then sends a 76 byte packet and increments its sequence number by 76. The server sends a 170 byte response and closes the connection. This was the difference we were talking about between HTTP/1.1 and HTTP/1.0. In HTTP/1.1 the same connection can be reused, which avoids the overhead of a 3 way handshake for each HTTP request. If a packet is lost between client and server, the server won’t send an ACK to the client and the client retries sending the packet till the ACK is received. This guarantees reliability. +Flow control is established by the win size field in each segment. The win size gives the available TCP buffer length in the kernel which can be used to buffer received segments. A size of 0 means the receiver has a backlog to clear from its socket buffer and the sender has to pause sending packets so that the receiver can catch up. This flow control protects against the slow receiver and fast sender problem. + +TCP also does congestion control which determines how many segments can be in transit without an ack. 
Linux lets us configure the algorithm used for congestion control, which we are not covering here. + +While closing a connection, the client or server calls the close syscall. Let's assume the client does that. The client’s kernel will send a FIN packet to the server. The server’s kernel can’t close the connection till the close syscall is called by the server application. Once the server app calls close, the server also sends a FIN packet and the client enters the time wait state for 2*MSL (twice the Maximum Segment Lifetime, 120s) so that this socket can’t be reused for that time period, to prevent any TCP state corruption due to stray stale packets. + +![Connection tearing](images/closed.png) + +Armed with our TCP and HTTP knowledge, let's see how this is used by SREs in their role. + +## Applications in SRE role +1. Scaling HTTP performance using load balancers needs solid knowledge of both TCP and HTTP. There are [different kinds of load balancing](https://blog.envoyproxy.io/introduction-to-modern-network-load-balancing-and-proxying-a57f6ff80236?gi=428394dbdcc3) like L4, L7 load balancing, Direct Server Return etc. HTTPS offloading can be done on the load balancer or directly on servers based on the performance and compliance needs. +2. Tweaking sysctl variables for rmem and wmem like we did for UDP can improve throughput of sender and receiver. +3. The sysctl variables tcp_max_syn_backlog and somaxconn determine how many connections the kernel can complete the 3 way handshake for before the app calls the accept syscall. This is especially useful in single threaded applications. Once the backlog is full, new connections stay in SYN_RCVD state (when you run netstat) till the application calls the accept syscall. +4. Apps can run out of file descriptors if there are too many short lived connections. Digging through [tcp_tw_reuse and tcp_tw_recycle](http://lxr.linux.no/linux+v3.2.8/Documentation/networking/ip-sysctl.txt#L464) can help reduce time spent in the time wait state (it has its own risks). 
Making apps reuse a pool of connections instead of creating ad hoc connections can also help. +5. Understanding performance bottlenecks by looking at metrics and classifying whether it's a problem on the app side or the network side. For example, too many sockets in CLOSE_WAIT state is a problem in the application, whereas retransmissions are more likely a problem in the network or the OS stack than in the application itself. Understanding the fundamentals can help us narrow down where the bottleneck is. + diff --git a/courses/linux_networking/udp.md b/courses/linux_networking/udp.md new file mode 100644 index 0000000..351e59f --- /dev/null +++ b/courses/linux_networking/udp.md @@ -0,0 +1,15 @@ +# UDP + + +UDP is a transport layer protocol. DNS is an application layer protocol that runs on top of UDP (most of the time). Before jumping into UDP, let's try to understand what the application and transport layers are. The DNS protocol is used by a DNS client (e.g. dig) and a DNS server (e.g. named). The transport layer makes sure the DNS request reaches the DNS server process and similarly the response reaches the DNS client process. Multiple processes can run on a system and they can listen on any [ports](https://en.wikipedia.org/wiki/Port_(computer_networking)). DNS servers usually listen on port number 53. When a client makes a DNS request, after filling in the necessary application payload, it passes the payload to the kernel via the **sendto** system call. The kernel picks a random port number ([>1024](https://www.cyberciti.biz/tips/linux-increase-outgoing-network-sockets-range.html)) as the source port number, puts 53 as the destination port number and sends the packet to the lower layers. When the kernel on the server side receives the packet, it checks the port number and queues the packet to the application buffer of the DNS server process, which makes a **recvfrom** system call and reads the packet. 
This process by the kernel is called multiplexing (combining packets from multiple applications onto the same lower layers) and demultiplexing (segregating packets from a single lower layer to multiple applications). Multiplexing and demultiplexing are done by the transport layer. + +UDP is one of the simplest transport layer protocols and it does only multiplexing and demultiplexing. Another common transport layer protocol, TCP, does a bunch of other things like reliable communication, flow control and congestion control. UDP is designed to be lightweight and handle communications with little overhead, so it doesn’t do anything beyond multiplexing and demultiplexing. If applications running on top of UDP need any of the features of TCP, they have to implement them in the application. + +This [example from python wiki](https://wiki.python.org/moin/UdpCommunication) covers a sample UDP client and server where “Hello World” is an application payload sent to a server listening on port number 5005. The server receives the packet and prints the “Hello World” string from the client. + +## Applications in SRE role + + +1. If the underlying network is slow and the UDP layer is unable to queue packets down to the networking layer, the sendto syscall from the application will hang till the kernel finds that some of its buffer has been freed. This can affect the throughput of the system. Increasing write memory buffer values using [sysctl variables](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/tuning_and_optimizing_red_hat_enterprise_linux_for_oracle_9i_and_10g_databases/sect-oracle_9i_and_10g_tuning_guide-adjusting_network_settings-changing_network_kernel_settings) *net.core.wmem_max* and *net.core.wmem_default* provides some cushion to the application from the slow network. +2. Similarly if the receiver process is slow in consuming from its buffer, the kernel has to drop packets which it can’t queue due to the buffer being full. 
Since UDP doesn’t guarantee reliability these dropped packets can cause data loss unless tracked by the application layer. Increasing sysctl variables *rmem_default* and *rmem_max* can provide some cushion to slow applications from fast senders. + diff --git a/courses/python_web/intro.md b/courses/python_web/intro.md index 807f0b7..52df081 100644 --- a/courses/python_web/intro.md +++ b/courses/python_web/intro.md @@ -1,11 +1,11 @@ -# School of SRE: Python and The Web +# Python and The Web -## Pre - Reads +## Prerequisites - Basic understanding of python language. - Basic familiarity with flask framework. -## What to expect from this training +## What to expect from this course This course is divided into two high level parts. In the first part, assuming familiarity with python language’s basic operations and syntax usage, we will dive a little deeper into understanding python as a language. We will compare python with other programming languages that you might already know like Java and C. We will also explore concepts of Python objects and with help of that, explore python features like decorators. @@ -13,25 +13,25 @@ In the second part which will revolve around the web, and also assume familiarit And to introduce SRE flavour to the course, we will design, develop and deploy (in theory) a URL shortening application. We will emphasize parts of the whole process that are more important as an SRE of the said app/service. -## What is not covered under this training +## What is not covered under this course Extensive knowledge of python internals and advanced python. -## Training Content +## Course Content ### Lab Environment Setup Have latest version of python installed -### TOC +### Table of Contents -1. The Python Language +1. [The Python Language](https://linkedin.github.io/school-of-sre/python_web/intro/#the-python-language) 1. Some Python Concepts 2. Python Gotchas -2. Python and Web +2. 
[Python and Web](https://linkedin.github.io/school-of-sre/python_web/python-web-flask/) 1. Sockets 2. Flask -3. The URL Shortening App +3. [The URL Shortening App](https://linkedin.github.io/school-of-sre/python_web/url-shorten-app/) 1. Design 2. Scaling The App 3. Monitoring The App diff --git a/courses/python_web/python-web-flask.md b/courses/python_web/python-web-flask.md index df09e35..c797830 100644 --- a/courses/python_web/python-web-flask.md +++ b/courses/python_web/python-web-flask.md @@ -1,4 +1,4 @@ -# Python, Web amd Flask +# Python, Web and Flask Back in the old days, websites were simple. They were simple static html contents. A webserver would be listening on a defined port and according to the HTTP request received, it would read files from disk and return them in response. But since then, complexity has evolved and websites are now dynamic. Depending on the request, multiple operations need to be performed like reading from database or calling other API and finally returning some response (HTML data, JSON content etc.) diff --git a/courses/python_web/sre-conclusion.md b/courses/python_web/sre-conclusion.md index 8ced4a1..bd76f27 100644 --- a/courses/python_web/sre-conclusion.md +++ b/courses/python_web/sre-conclusion.md @@ -1,4 +1,4 @@ -# SRE Parts of The App and Conclusion +# Conclusion ## Scaling The App diff --git a/courses/security/fundamentals.md b/courses/security/fundamentals.md index 515ab24..06b1a89 100644 --- a/courses/security/fundamentals.md +++ b/courses/security/fundamentals.md @@ -42,7 +42,18 @@ - Fail securely - Applications regularly fail to process transactions for many reasons. How they fail can determine if an application is secure or not. 
- ![image2](images/image2.png) + ``` + + is_admin = true; + try { + code_which_may_fail(); + is_admin = is_user_assigned_role("Administrator"); + } + catch (Exception err) { + log.error(err.toString()); + } + + ``` - If either codeWhichMayFail() or isUserInRole fails or throws an exception, the user is an admin by default. This is obviously a security risk. - Don’t trust services @@ -102,11 +113,19 @@ - Ciphers are the cornerstone of cryptography. A cipher is a set of algorithms that performs encryption or decryption on a message. An encryption algorithm (E) takes a secret key (k) and a message (m), and produces a ciphertext (c). Similarly, a Decryption algorithm (D) takes a secret key (K) and the previous resulting Ciphertext (C). They are represented as follows: - ![image3](images/image3.png) +``` + +E(k,m) = c +D(k,c) = m + +``` - This also means that in order for it to be a cipher, it must satisfy the consistency equation as follows, making it possible to decrypt. - ![image4](images/image4.png) +``` + +D(k,E(k,m)) = m +``` Stream Ciphers: @@ -286,7 +305,7 @@ Certificate chain - What the OpenSSL command line doesn’t show here is the trust store that contains the list of CA certificates trusted by the system OpenSSL runs on. - The public certificate of GlobalSign Authority must be present in the system’s trust store to close the verification chain. This is called a chain of trust, and figure below summarizes its behavior at a high level. - ![image12](images/image12.png) + ![image122](images/image122.png) - High-level view of the concept of chain of trust applied to verifying the authenticity of a website. The Root CA in the Firefox trust store provides the initial trust to verify the entire chain and trust the end-entity certificate. @@ -305,8 +324,6 @@ At the end of the handshake, both parties possess a secret session key used to e - TLS 1.0 was released in 1999, making it a nearly two-decade-old protocol. 
It has been known to be vulnerable to attacks—such as BEAST and POODLE—for years, in addition to supporting weak cryptography, which doesn’t keep modern-day connections sufficiently secure. - TLS 1.1 is the forgotten “middle child.” It also has bad cryptography like its younger sibling. In most software it was leapfrogged by TLS 1.2 and it’s rare to see TLS 1.1 used. - ![image13](images/image13.png) - ### “Perfect” Forward Secrecy - The term “ephemeral” in the key exchange provides an important security feature mis-named perfect forward secrecy (PFS) or just “Forward Secrecy”. diff --git a/courses/security/images/image1.png b/courses/security/images/image1.png index 773a2d2..59f80f1 100644 Binary files a/courses/security/images/image1.png and b/courses/security/images/image1.png differ diff --git a/courses/security/images/image12.png b/courses/security/images/image12.png deleted file mode 100644 index bbb370c..0000000 Binary files a/courses/security/images/image12.png and /dev/null differ diff --git a/courses/security/images/image122.png b/courses/security/images/image122.png new file mode 100644 index 0000000..bb4a191 Binary files /dev/null and b/courses/security/images/image122.png differ diff --git a/courses/security/images/image13.png b/courses/security/images/image13.png deleted file mode 100644 index fe428c9..0000000 Binary files a/courses/security/images/image13.png and /dev/null differ diff --git a/courses/security/images/image16.png b/courses/security/images/image16.png deleted file mode 100644 index c8a10b6..0000000 Binary files a/courses/security/images/image16.png and /dev/null differ diff --git a/courses/security/images/image2.png b/courses/security/images/image2.png deleted file mode 100644 index e72bff4..0000000 Binary files a/courses/security/images/image2.png and /dev/null differ diff --git a/courses/security/images/image21.png b/courses/security/images/image21.png deleted file mode 100644 index e30b78a..0000000 Binary files 
a/courses/security/images/image21.png and /dev/null differ diff --git a/courses/security/images/image24.png b/courses/security/images/image24.png deleted file mode 100644 index db9deb3..0000000 Binary files a/courses/security/images/image24.png and /dev/null differ diff --git a/courses/security/images/image25.png b/courses/security/images/image25.png deleted file mode 100644 index c2db336..0000000 Binary files a/courses/security/images/image25.png and /dev/null differ diff --git a/courses/security/images/image3.png b/courses/security/images/image3.png deleted file mode 100644 index b384b24..0000000 Binary files a/courses/security/images/image3.png and /dev/null differ diff --git a/courses/security/images/image4.png b/courses/security/images/image4.png deleted file mode 100644 index ab4fc26..0000000 Binary files a/courses/security/images/image4.png and /dev/null differ diff --git a/courses/security/intro.md b/courses/security/intro.md index 7e02ed6..e05622e 100644 --- a/courses/security/intro.md +++ b/courses/security/intro.md @@ -1,47 +1,33 @@ # Security ---- -### Target Audience - -The material is suitable for new SRE hires or graduate computer science majors straight out of college, anyone who has a basic technical background, or readers who have a basic understanding of IT security and want to expand their knowledge. - -The approach being covered here deals with the fundamentals of computer security in the modern IT landscape moreover it sheds light on most of the dangerous "things" out there on public internet which are potentially a gateway to compromising systems. As an SRE, you are expected to design, build and develop products, this course will give you that ‘security knob’ into your thinking and problem-solving approach which is expected to be turned on as a critical area that always takes precedence over anything else. - ---- - -### Pre Requirements +## Prerequisites 1. Basics of Linux fundamentals & command line usage 2. 
Networking Module ---- -### What to expect from this training +## What to expect from this course The course covers fundamentals of information security along with touching on subjects of system security, network & web security. The aim of this course is to get familiar with the basics of information security in day to day operations & then as an SRE develop the mindset of ensuring that security takes a front-seat while developing solutions. The course also serves as an introduction to common risks and best practices along with practical ways to find out vulnerable systems and loopholes which might become compromised if not secured. ---- -### What is not covered under this training +## What is not covered under this course The courseware is not an ethical hacking workshop or a very deep dive into the fundamentals of the problems. The course does not deal with hacking or breaking into systems but rather an approach on how to ensure you don’t get into those situations and also to make you aware of different ways a system can be compromised. ---- -### Training Content +## Course Content -Part I: Fundamentals +### Table of Contents -Part II: Network Security +1. [Fundamentals](https://linkedin.github.io/school-of-sre/security/fundamentals/) +2. [Network Security](https://linkedin.github.io/school-of-sre/security/network_security/) +3. [Threats, Attacks & Defence](https://linkedin.github.io/school-of-sre/security/threats_attacks_defences/) +4. 
[Writing Secure Code & More](https://linkedin.github.io/school-of-sre/security/writing_secure_code/) -Part III: Threats, Attacks & Defense -PART IV: Writing Secure Code & More - ---- - -### Post Training asks/ Further Reading +## Post Training asks/ Further Reading - CTF Events like : - Penetration Testing : diff --git a/courses/security/network_security.md b/courses/security/network_security.md index 440afc8..cb41677 100644 --- a/courses/security/network_security.md +++ b/courses/security/network_security.md @@ -142,7 +142,10 @@ Let us see how we keep a check on the perimeter i.e the edges, the first layer o - Nmap is often used to determine alive hosts in a network, open ports on those hosts, services running on those open ports, and version identification of that service on that port. - More at http://scanme.nmap.org/ - ![image16](images/image16.png) +``` +nmap [scan type] [options] [target specification] +``` + Nmap uses 6 different port states: @@ -413,8 +416,17 @@ TCP Flags ![image20](images/image20.png) - Abuse of the normal operation or settings of these flags can be used by attackers to launch DoS attacks. This causes network servers or web servers to crash or hang. - ![image21](images/image21.png) - - The attacker's ultimate goal is to write special programs or pieces of code that are able to construct these illegal combinations resulting in an efficient DoS attack. + +``` +| SYN | FIN | PSH | RST | Validity| +|------|------|-------|------|---------| +| 1 |1 |0 |0 |Illegal Combination +| 1 |1 |1 |0 |Illegal Combination +| 1 |1 |0 |1 |Illegal Combination +| 1 |1 |1 |1 |Illegal Combination +``` + +- The attacker's ultimate goal is to write special programs or pieces of code that are able to construct these illegal combinations resulting in an efficient DoS attack. 
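The table above can be turned into a simple validity check. The sketch below covers only what the table shows, namely that any segment with both SYN and FIN set is an illegal combination; the function name `is_illegal_flag_combination` is a hypothetical helper, and a real filter would inspect the flag bits of parsed TCP headers rather than a set of names.

```python
def is_illegal_flag_combination(flags: set) -> bool:
    """Return True when a segment sets both SYN and FIN.

    SYN opens a connection and FIN closes one, so every row of the table
    with SYN=1 and FIN=1 is illegal regardless of the PSH and RST bits.
    """
    normalized = {flag.upper() for flag in flags}
    return {"SYN", "FIN"} <= normalized
```

A packet filter or intrusion detection rule could drop or log segments for which such a check returns True, since legitimate TCP stacks do not produce them.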
SYN FLOOD diff --git a/courses/security/threats_attacks_defences.md b/courses/security/threats_attacks_defences.md index c922240..593cb5c 100644 --- a/courses/security/threats_attacks_defences.md +++ b/courses/security/threats_attacks_defences.md @@ -44,7 +44,6 @@ the typical time to live (TTL) for cached entries is a couple of hours, thereby - Blackhole routes are best defence against many common viral attacks where the traffic is dropped from infected machines to/from command & control masters. - Infamous BGP Injection attack on Youtube - ![image24](images/image24.png) - EX: In 2008, Pakistan decided to block YouTube by creating a BGP route that led into a black hole. Instead this routing information got transmitted to a hong kong ISP and from there accidentally got propagated to the rest of the world meaning millions were routed through to this black hole and therefore unable to access YouTube. - Potentially, the greatest risk to BGP occurs in a denial of service attack in which a router is flooded with more packets than it can handle. Network overload and router resource exhaustion happen when the network begins carrying an excessive number of BGP messages, overloading the router control processors, memory, routing table and reducing the bandwidth available for data traffic. - Refer : @@ -101,7 +100,16 @@ BGP Security - A successful exploit will allow attackers to access, modify, or delete information in the database. - It permits attackers to steal sensitive information stored within the backend databases of affected websites, which may include such things as user credentials, email addresses, personal information, and credit card numbers - ![image25](images/image25.png) +``` +SELECT USERNAME,PASSWORD from USERS where USERNAME='' AND PASSWORD=''; + +Here the username & password are the inputs provided by the user. Suppose an attacker gives the input ' OR '1'='1 in both fields. The SQL query will then look like: 
Therefore the SQL query will look like: + +SELECT USERNAME,PASSWORD from USERS where USERNAME='' OR '1'='1' AND PASSWORD='' OR '1'='1'; + +This query results in a true statement & the user gets logged in. This example depicts the most basic type of SQL injection. +``` + ### SQL Injection Attack Defenses diff --git a/courses/systems_design/availability.md b/courses/systems_design/availability.md index 48c1ad8..a4f9c65 100644 --- a/courses/systems_design/availability.md +++ b/courses/systems_design/availability.md @@ -1,4 +1,4 @@ -## HA - Availability - Common “Nines” +# HA - Availability - Common “Nines” Availability is generally expressed as “Nines”, common ‘Nines’ are listed below. | Availability % | Downtime per year | Downtime per month | Downtime per week | Downtime per day | diff --git a/courses/systems_design/conclusion.md b/courses/systems_design/conclusion.md index 9c9f3ba..5d182ac 100644 --- a/courses/systems_design/conclusion.md +++ b/courses/systems_design/conclusion.md @@ -1,3 +1,3 @@ -## Conclusion +# Conclusion Armed with these principles, we hope the course will give a fresh perspective to design software systems. It might be over engineering to get all this on day zero. But some are really important from day 0 like eliminating single points of failure, making scalable services by just increasing replicas. As a bottleneck is reached, we can split code by services, shard data to scale. As the organisation matures, bringing in [chaos engineering](https://en.wikipedia.org/wiki/Chaos_engineering) to measure how systems react to failure will help in designing robust software systems.
diff --git a/courses/systems_design/fault-tolerance.md b/courses/systems_design/fault-tolerance.md index d33003a..bc97d45 100644 --- a/courses/systems_design/fault-tolerance.md +++ b/courses/systems_design/fault-tolerance.md @@ -1,4 +1,4 @@ -## Fault Tolerance +# Fault Tolerance Failures are not avoidable in any system and will happen all the time, hence we need to build systems that can tolerate failures or recover from them. diff --git a/courses/systems_design/intro.md b/courses/systems_design/intro.md index 4f74620..be222b2 100644 --- a/courses/systems_design/intro.md +++ b/courses/systems_design/intro.md @@ -1,27 +1,30 @@ # Systems Design -## Pre - Requisites +## Prerequisites Fundamentals of common software system components: - Operating Systems - Networking - Databases RDBMS/NoSQL -## What to expect from this training +## What to expect from this course Thinking about and designing for scalability, availability, and reliability of large scale software systems. -## What is not covered under this training +## What is not covered under this course Individual software components’ scalability and reliability concerns like e.g. Databases, while the same scalability principles and thinking can be applied, these individual components have their own specific nuances when scaling them and thinking about their reliability. 
More light will be shed on concepts rather than on setting up and configuring components like Loadbalancers to achieve scalability, availability and reliability of systems -## Training Content -- Introduction -- Scalability -- High Availability -- Fault Tolerance +## Course Content + +### Table of Contents + +- [Introduction](https://linkedin.github.io/school-of-sre/systems_design/intro/#backstory) +- [Scalability](https://linkedin.github.io/school-of-sre/systems_design/scalability/) +- [High Availability](https://linkedin.github.io/school-of-sre/systems_design/availability/) +- [Fault Tolerance](https://linkedin.github.io/school-of-sre/systems_design/fault-tolerance/) ## Introduction diff --git a/img/sos.png b/img/sos.png new file mode 100644 index 0000000..584c1b4 Binary files /dev/null and b/img/sos.png differ diff --git a/mkdocs.yml b/mkdocs.yml index ce865f4..39f0927 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,33 +1,53 @@ -site_name: school_of_sre +site_name: SchoolOfSRE docs_dir: courses +theme: + name: material + logo: img/sos.png + favicon: img/favicon.ico +plugins: [] nav: - Home: index.md -- Git: - - Git Basics: git/git-basics.md - - Working With Branches: git/branches.md - - Github and Hooks: git/github-hooks.md +- Fundamentals Series: + - Linux Basics: + - Introduction: linux_basics/intro.md + - Command Line Basics: linux_basics/command_line_basics.md + - Server Administration: linux_basics/linux_server_administration.md + - Git: + - Git Basics: git/git-basics.md + - Working With Branches: git/branches.md + - Github and Hooks: git/github-hooks.md + - Linux Networking: + - Introduction: linux_networking/intro.md + - DNS: linux_networking/dns.md + - UDP: linux_networking/udp.md + - HTTP: linux_networking/http.md + - TCP: linux_networking/tcp.md + - Routing: linux_networking/ipr.md + - Conclusion: linux_networking/conclusion.md - Python and Web: - - Intro: python_web/intro.md + - Introduction: python_web/intro.md - Some Python Concepts: 
python_web/python-concepts.md - Python, Web and Flask: python_web/python-web-flask.md - The URL Shortening App: python_web/url-shorten-app.md - - SRE Aspects of The App and Conclusion: python_web/sre-conclusion.md + - Conclusion: python_web/sre-conclusion.md +- Data: + - Big Data: + - Introduction: big_data/intro.md + - Overview of Big Data: big_data/overview.md + - Usage of Big Data techniques: big_data/usage.md + - Evolution of Hadoop: big_data/evolution.md + - Architecture of Hadoop: big_data/architecture.md + - Tasks and conclusion: big_data/tasks.md - Systems Design: - - Intro: systems_design/intro.md + - Introduction: systems_design/intro.md - Scalability: systems_design/scalability.md - Availability: systems_design/availability.md - Fault Tolerance: systems_design/fault-tolerance.md - Conclusion: systems_design/conclusion.md -- Big Data: - - Intro: big_data/intro.md - - Overview of Big Data: big_data/overview.md - - Usage of Big Data techniques: big_data/usage.md - - Evolution of Hadoop: big_data/evolution.md - - Architecture of Hadoop: big_data/architecture.md - - Tasks and conclusion: big_data/tasks.md - Security: - - Inro: security/intro.md + - Introduction: security/intro.md - Fundamentals of Security: security/fundamentals.md - - Network Securuty: security/network_security.md + - Network Security: security/network_security.md - Threat, Attacks & Defences: security/threats_attacks_defences.md - Writing Secure code: security/writing_secure_code.md +- Contribute: CONTRIBUTING.md