Notes - Vinay Hegde

Overview

Welcome to my notes on various DevOps / SRE topics, such as:

  • Linux Operating Systems
  • Networking
  • Web / Email / DNS / Database Servers
  • GIT
  • Configuration Management Tools (Chef, Ansible, Puppet)
  • Programming (Python, Ruby, Shell Scripting) and many more

Tech Stack:

  • reStructuredText (RST) - The markup used to write the .rst source files. You can start learning via this guide.
  • Sphinx - The engine that generates the content as HTML, LaTeX or ePub. A beginner’s guide can be found on their page.
  • GitHub - For maintaining the source code in version control.
  • ReadTheDocs - Hosts my documentation. Refer to this excellent document to get up and running in no time.

How-To-Use

  • Click any link on the left-hand side of this webpage, such as Utilities, CPU or Monitoring, to read more on that topic.
  • Once you do so, you will find subsections (these vary by topic) that can be expanded by clicking on the + icon.
  • You can also enter keywords in the Search Box in the top-left corner of this webpage to return relevant results.

Please Note:

  • Because the content of every topic keeps evolving, this will forever be a work in progress. Feedback, suggestions and queries are always appreciated!
  • For contributions, please read the guidelines on how to share Contributions to this project.
  • At all times, please refer to the Code of Conduct guidelines on interacting with other community members in a respectful and cordial manner.

Boot Process

Some useful links to explain the concepts of the Boot Process

Commands
Troubleshooting & Log Parsing

CPU

Some useful links to explain the concepts of CPU processing

Commands
HTop
Visual representation of all HTop parameters
_images/cpu-visual-htop.png
Configuration
Troubleshooting & Log Parsing
  • Find the ten commands with the highest number of running threads
sudo ps -AL --no-headers | awk -F: '{print $3}' | cut -d' ' -f2 | sort | uniq -c | sort -n | tail -10
  • Check for zombie processes with PPID
sudo ps axo stat,ppid,pid,comm | grep -w defunct
  • View Column Headers in ps output
sudo ps aux | head -1 && sudo ps aux | grep <process-name> | grep -v grep
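A zombie can only be cleaned up by its parent, so grouping the zombies by PPID usually shows which process needs attention. Below is a minimal sketch of that idea. The function name zombies_by_parent is made up for illustration, and it reads "STAT PPID" pairs from stdin so it can be tried on sample data first.

```shell
# Count zombie processes per parent PID.
# Input (stdin): lines of "<STAT> <PPID>", e.g. from: ps -e -o stat= -o ppid=
# Output: "<count> <ppid>" lines, parent with the most zombies first.
zombies_by_parent() {
    awk '$1 ~ /^Z/ { count[$2]++ }
         END { for (p in count) print count[p], p }' | sort -k1,1nr -k2,2n
}

# Sample data: two zombies under PID 100, one under PID 200.
printf 'Z 100\nS 1\nZ+ 200\nZ 100\nR 300\n' | zombies_by_parent
```

On a live system the same function could be fed with: ps -e -o stat= -o ppid= | zombies_by_parent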

Swap

Some useful links to explain the concepts of Swap

Commands
  • List the ten processes using the most swap, in descending order (first in kB, then converted to MB)
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf "%s %s", $2, $3}END{ print ""}' "$file"; done | sort -k 2 -nr | head -10
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf "%s %s", $2, $3}END{ print ""}' "$file"; done | awk '{print $1 " " $2/1024 " MB" }' | sort -k 2 -n -r | head -10
  • Alternatively, run the command below
sudo nice top   (press Shift+O, then p, to sort processes by swap usage)
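The same idea can be wrapped in a small function that skips entries without a VmSwap line (kernel threads, for instance). This is only a sketch. The name swap_by_process and the optional directory argument are assumptions added so the function can be pointed at a test directory instead of /proc.

```shell
# Print "<swap_kB> <name>" for every status file under a proc-like
# directory (default: /proc), largest swap user first.
swap_by_process() {
    proc_dir="${1:-/proc}"
    for status in "$proc_dir"/[0-9]*/status; do
        # The process may have exited, or the glob may not have matched at all.
        [ -r "$status" ] || continue
        awk '/^Name:/   { name = $2 }
             /^VmSwap:/ { swap = $2 }
             END { if (swap != "") print swap, name }' "$status" 2>/dev/null
    done | sort -k1,1nr
}

# Typical use on a Linux host (run as root to see every process):
#   swap_by_process | head -10
```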

SSH

Some useful links to explain the concepts of SSH

Commands
Tuning & Hardening
_images/ssh-keys-ciphers-kex.png
Troubleshooting & Log Parsing

Utilities

File Permissions

Some useful links to cover the working of File Permissions

UMask Values
Others

Editors

Some useful links to explain the concepts of Editors

VI Editor Cheat Sheet
_images/editors-vim-cheat-sheet.png
To delete all lines in vim
_images/editors-vim-delete-all-lines.png

Hardware

Some useful links to explain the concepts of Hardware (Dell / SuperMicro)

Dell OMSA
Commands
  • We use 2 commands to monitor / change parameters in Dell servers
omreport - Checks the server details via specified parameters.
omconfig - Modifies the server details via specified parameters.
Examples :
  • List all available system / chassis / storage commands
sudo omreport system -?
sudo omreport chassis -?
sudo omreport storage -?
  • Retrieve general system information
sudo omreport system summary | less
  • Display the Hardware logs
sudo omreport system esmlog
  • Retrieve the RAID configuration
sudo omreport storage vdisk controller=0
  • Clearing the logs (esmlog is the hardware log; replace esmlog with alertlog or cmdlog to clear those instead)
sudo omconfig system esmlog action=clear

Storage

Some useful links to explain the concepts of Storage, I/O

File Systems

Some useful links to explain the concepts of File Systems

Concepts

Package-Management

Some useful links to explain the concepts of Package-Management

Fundamentals

Some useful links to explain the concepts of Web-Servers

Apache

Some useful links to explain the concepts of Apache

Configuration
Commands
  • Find hits per client IP in the access log, in ascending order (the client IP is assumed to be the first field, as in the common and combined log formats)
sudo tail -n 10000 <path-to-log-file> | awk '{print $1}' | sort | uniq -c | sort -n
sudo grep 'text' <path-to-access-log> | cut -d' ' -f1 | sort | uniq -c | sort -n
  • Count connections per local IP (column 4) and per remote IP (column 5), in ascending order
sudo netstat -antulp | awk '{print $4}' | cut -d":" -f1 | sort | uniq -c | sort -n
sudo netstat -antulp | awk '{print $5}' | cut -d":" -f1 | sort | uniq -c | sort -n

Nginx

Some useful links to explain the concepts of Nginx

HAProxy

Some useful links to explain the concepts of HAProxy

Configuration

SSL

Some useful links to explain the concepts of SSL

Concepts
_images/web-servers-ssl-handshake.png

Monitoring

Some useful links to explain the concepts of Monitoring

Nagios
What Nagios does
  • Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)
  • Monitoring of host resources (processor load, disk usage, system logs) on most network operating systems, including Microsoft Windows with the NSClient++ plugin or Check_MK.
  • Monitoring through remotely run scripts via the Nagios Remote Plugin Executor, or through SSH or SSL encrypted tunnels.
  • Contact notifications when service or host problems occur & when they get resolved (via e-mail, pager, SMS, or any user-defined method through the plugin system).
  • The ability to define event handlers that run during service or host events for proactive problem resolution.
  • Automatic log file rotation.
  • Support for implementing redundant monitoring hosts.
  • An optional web interface for viewing current network status, notifications, problem history, log files, etc.
  • Data storage via text files rather than a database.
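Host resource checks are done by plugins, which report their result through a one-line message and an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below only illustrates that convention. The function name check_threshold, the percentage unit and the thresholds are assumptions, not part of any shipped plugin.

```shell
# Map a measured integer percentage to a Nagios-style status line and
# return code: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
# Usage: check_threshold <label> <value> <warn> <crit>
check_threshold() {
    label="$1"; value="$2"; warn="$3"; crit="$4"
    case "$value" in
        ''|*[!0-9]*)
            echo "UNKNOWN - $label value '$value' is not an integer"
            return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is ${value}%"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is ${value}%"; return 1
    fi
    echo "OK - $label is ${value}%"; return 0
}

# Example: warn at 80%, go critical at 90%.
check_threshold "disk usage" 50 80 90
```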

Graphing

Some useful links to explain the concepts of Graphing

Ganglia
What Ganglia does
  • Graph different properties of a server, such as CPU, memory and load.
  • Compare the current trend of those properties with earlier trends to identify which node or host is causing an issue.
  • Create custom metrics to graph for different processes.
  • Represent machines from different data centers that belong to one cluster within that single cluster, in a single interface.
Important points
  • Node : A SINGLE machine sending data to the Ganglia monitoring daemon. (Every individual server is a node, whether or not it is part of a cluster.)
  • Cluster : All the nodes used for one particular purpose form a CLUSTER.
  • Grid : A collection of clusters is a GRID.
Parts of Ganglia Monitoring Tool
  • 1. Gmond :
    • Ganglia Monitoring daemon (a service that must be installed on every node that needs to be monitored)
    • Sends data as XML over TCP; its main configuration file is /etc/gmond.conf
  • 2. Gmetad :
    • Collects data from the Gmond daemons & stores it in RRD (round-robin database) files
    • Its main configuration file is /etc/gmetad.conf; it should be installed on one node of each cluster
  • 3. RRD tool :
    • Used by Ganglia to store the data for particular time intervals & to graph it.
  • 4. PHP Front-End :
    • A web interface on the master node that displays graphs and metrics from the data stored by the RRD tool.

Logging

Some useful links to explain the concepts of Logging

Networking

Some useful links to explain the concepts of Networking Protocols

Concepts
MAC Addresses : Explained
_images/networking-mac-address.png
UDP
Commands
NMap

Nmap Command - IPs in use in a particular subnet (addresses that do not show up are candidates for free IPs)

for i in $(sudo nmap -sP <subnet/CIDR> | grep -i 'Nmap scan report for' | awk '{print $5}'); do ping -c 1 $i; done | grep from
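The scan above reports the hosts that answer, so the free addresses are whatever is left over. A rough sketch of that last step for a /24 network follows. The function name free_ips_in_24 is hypothetical, and a host that simply does not answer pings will also be listed as free, so treat the result as a list of candidates.

```shell
# Print the host addresses (.1 to .254) of a /24 that are NOT in the list
# of live IPs read from stdin (one IP per line).
# $1: first three octets of the network, e.g. 192.168.1
free_ips_in_24() {
    awk -v prefix="$1" '
        { used[$1] = 1 }
        END {
            for (host = 1; host <= 254; host++) {
                ip = prefix "." host
                if (!(ip in used)) print ip
            }
        }'
}

# Sample data: pretend .1 to .252 answered the scan.
seq 1 252 | sed 's/^/192.168.1./' | free_ips_in_24 192.168.1
```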
Configuration

Security

Some useful links to explain the concepts of Security in Linux OS

Concepts

IPTables

Some useful links to explain the concepts of IPTables

Generic

Some useful links to explain the concepts of Programming

Configuration
Online Interpreter for Multiple Languages

Python

Some useful links to explain the concepts of Python

Bash

Some useful links to explain the concepts of Bash Scripting

GIT

Some useful links to explain the concepts of GIT

Random Notes

Puppet

Some useful links to explain the concepts of Puppet

Chef

Some useful links to explain the concepts of Chef

Configuration
Troubleshooting

Ansible

Some useful links to explain the concepts of Ansible

Configuration

Docker

Some useful links to explain the concepts of Docker

Official Docker Documentation

AWS

Some useful links to explain the concepts of AWS

Official AWS Documentation

MySQL

Some useful links to explain the concepts of MySQL

Configuration

PostgreSQL

Some useful links to explain the concepts of PostgreSQL

NoSQL

Some useful links to explain the concepts of NoSQL

Configuration
Troubleshooting

Redis

Some useful links to cover the concepts of Redis

Concepts
Official Redis Documentation
Configuration
Troubleshooting & Log Parsing

Kafka

Some useful links to cover the concepts of Kafka

Configuration
Tuning & Hardening
Troubleshooting & Log Parsing

Email

Some useful links to explain the concepts of Email

Postfix

Some useful links to explain the concepts of Postfix

Concepts
Commands
  • Counting queued mails per From address, in ascending order
sudo mailq | awk '/^[0-9A-F]/ {print $7}' | sort | uniq -c | sort -n
  • Holding queued mails by From address
sudo mailq| grep '^[A-Z0-9]'| grep <sender-ID>| cut -f1 -d' ' | tr -d \*|sudo postsuper -h -
  • Holding queued mails by To address
sudo mailq | tail -n +2 | grep -v '^ *(' | awk  'BEGIN { RS = "" } { if ($8 == "<recipient>") print $1 } ' | tr -d '*!' | sudo postsuper -h -
  • Holding queued mails by Domain
sudo mailq| grep '^[A-Z0-9]'| grep @<domain>| cut -f1 -d' ' | tr -d \*|sudo postsuper -h -
  • Holding emails from the active or deferred queue based on subject (replace <active|deferred> with the queue name)
sudo find /var/spool/postfix/<active|deferred>/ -type f -exec grep -il '<subject>' '{}' \; | xargs -n1 basename | sudo postsuper -h -
  • Removing Mails based on sender Address
sudo mailq| grep '^[A-Z0-9]'| grep <sender-ID>| cut -f1 -d' ' | tr -d \*|sudo postsuper -d -
  • Removing Mails based on Domain
sudo mailq| grep '^[A-Z0-9]'| grep @<domain>| cut -f1 -d' ' | tr -d \*|sudo postsuper -d -
  • Delete mails to a specific mail address
sudo mailq | tail -n +2 | grep -v '^ *(' | awk  'BEGIN { RS = "" } { if ($8 == "<recipient-ID>") print $1 } ' | tr -d '*!' | sudo postsuper -d -
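Since postsuper -d is irreversible, it can help to look at the queue IDs before piping them anywhere. The sketch below pulls out the IDs for one exact sender address from mailq-style output. The function name queue_ids_by_sender is made up for illustration, and it assumes the usual mailq layout where the sender is the 7th field of the line that starts with the queue ID.

```shell
# Print the queue IDs of all queued mails from one exact sender address.
# stdin: mailq output, $1: sender address
queue_ids_by_sender() {
    awk -v sender="$1" '
        /^[0-9A-Za-z]/ && $7 == sender {
            id = $1
            sub(/[*!]$/, "", id)   # drop the active (*) or hold (!) marker
            print id
        }'
}

# Review first:  sudo mailq | queue_ids_by_sender <sender-ID>
# Then act:      sudo mailq | queue_ids_by_sender <sender-ID> | sudo postsuper -h -
```

Matching on the whole field avoids the partial matches that a plain grep on the address could produce.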

DNS

Some useful links to explain the concepts of DNS

Concepts
  • Authoritative NS
    • A name server that holds the zone data of the queried domain answers authoritatively; any other server either points to another NS or serves a cached copy of another NS's answer
  • Zone file
    • A simple text file containing the mapping between domain names and IP addresses, e.g. for www.google.com
  • Root Servers
    • 13 named servers, a to m (a.root-servers.net to m.root-servers.net); queries are routed to the nearest mirror of each server
  • TLD servers :
    • The servers responsible for a top-level domain such as .com (others are .org, .net, .edu etc.)
  • Domain Level NS
    • the server containing the actual records of the requested domain (ns1.google.com, ns2.google.com etc)
  • TTL - Time to live
    • A timer attached to each record. Caching name servers may keep serving the cached record until its TTL runs out
  • Records
domain.com.  IN SOA ns1.domain.com. admin.domain.com. (
    12083 ; serial number    - incremented on every zone file change. The slave NS checks whether the master NS serial > its cached serial; if yes, it requests the updated zone, else it keeps serving the same zone file
    3h    ; refresh interval - the slave NS waits this long between polls of the master NS for changes
    30m   ; retry interval   - after a failed poll, the slave NS retries the master NS at this interval
    3w    ; expiry period    - if the slave NS cannot contact the master for this long, it no longer returns authoritative responses for the zone
    1h    ; negative TTL     - a NS caches errors for this period
)
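To compare these values on a live zone, the one-line SOA answer from dig +short SOA <domain> can be labelled field by field. A small sketch follows. The function name describe_soa is hypothetical, and the sample line simply restates the record above with the intervals written in seconds (3h = 10800, 30m = 1800, 3w = 1814400, 1h = 3600).

```shell
# Label the seven fields of a one-line SOA answer read from stdin.
describe_soa() {
    awk 'NF == 7 {
        print "primary_ns="   $1
        print "contact="      $2
        print "serial="       $3
        print "refresh="      $4
        print "retry="        $5
        print "expire="       $6
        print "negative_ttl=" $7
    }'
}

# Sample input, in the same order as the zone file record.
echo 'ns1.domain.com. admin.domain.com. 12083 10800 1800 1814400 3600' | describe_soa
```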
Zone Transfer (AXFR)
  • The original DNS specifications RFC-1034 & RFC-1035 envisaged that slave (or secondary) DNS servers would poll the master.
  • The time between such ‘polling’ is determined by the refresh value on the domain’s SOA Resource Record
  • The polling process is accomplished by the ‘slave’ sending a query to the master and requesting its current SOA record.
  • If the serial number of this record is higher than the one currently held by the slave, a zone transfer (AXFR) is requested & carried out over TCP port 53.
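The decision the slave makes after polling can be sketched as a plain comparison of the two serial numbers. This is a simplification: the function name needs_zone_transfer is made up, and real name servers compare serials with wrap-around-aware serial number arithmetic rather than a simple "greater than".

```shell
# Succeed (return 0) when the master's serial is newer than the slave's.
# Simplified integer comparison, ignoring serial number wrap-around.
needs_zone_transfer() {
    master_serial="$1"; slave_serial="$2"
    [ "$master_serial" -gt "$slave_serial" ]
}

if needs_zone_transfer 12084 12083; then
    echo "serial increased: request AXFR over TCP port 53"
else
    echo "zone is current: keep serving the cached zone"
fi
```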
DNS uses UDP for DNS queries over port 53
  • DNS uses UDP to reply to client DNS queries, such as a client asking the DNS server for a name to IP or IP to name resolution.
  • The reason is that UDP is not connection oriented, so it is lightweight & fast, and results reach the client sooner than they would over TCP.
  • If needed, DNS can also serve queries over TCP, for example when a response is too large to fit in a UDP reply, but UDP is preferred for ordinary queries because of its lower overhead.
Why DNS uses TCP for zone file transfers over port 53
  • DNS uses a master & slave architecture, in which one main authoritative name server holds all the entries & the others replicate them from the master (the zone files are transferred) & also serve DNS queries.
  • As there must not be any inconsistency between zone files, DNS transfers them over TCP, which makes sure that the zone files are transferred reliably.
Resource Records
  • A record
    • maps a host to an IP address (A for IPv4, AAAA for IPv6)

host     IN      A       IPv4_address
host     IN      AAAA    IPv6_address

  • MX Record
    • maps the domain to the mail exchange (mail server) used for it

IN  MX  10  mail.domain.com.   (10 is the record priority; at DNS lookup, priority is given to the MX with the lowest value)

  • PTR
    • maps an IP address to a reverse name
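The MX priority rule (lowest value wins) can be tried out on the "priority host" lines that dig +short MX <domain> prints. A minimal sketch follows. The function name preferred_mx and the backup host in the sample data are made up for illustration.

```shell
# Pick the preferred mail exchanger from "<priority> <host>" lines on stdin.
# The record with the lowest priority value is tried first.
preferred_mx() {
    sort -k1,1n | awk 'NR == 1 { print $2 }'
}

# Sample data: a backup MX (20) listed before the primary MX (10).
printf '20 backup.domain.com.\n10 mail.domain.com.\n' | preferred_mx
```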
How do resolvers work
  • What happens when you set resolvers in a PC (Windows) and / or router
    • A browser first checks its internal cache of recent queries. On a miss it asks the system resolver, which consults the local hosts file (/etc/hosts) and, if the name is not there, forwards the request to the configured resolver.
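The "hosts file first, then the configured resolver" part of that order can be sketched as below. This is a simplification: the function names are hypothetical, the browser cache is left out, and on Linux the actual lookup order is decided by the system's name service configuration.

```shell
# Look a name up in a hosts-format file (default: /etc/hosts).
# Prints the matching IP, or nothing when the name is not listed.
lookup_hosts_file() {
    name="$1"; hosts_file="${2:-/etc/hosts}"
    awk -v name="$name" '
        $1 ~ /^#/ { next }                      # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # rest of the line is a comment
                if ($i == name) { print $1; exit }
            }
        }' "$hosts_file"
}

# Hosts file first; otherwise the query would go to the configured resolver.
resolve_name() {
    ip=$(lookup_hosts_file "$1" "${2:-/etc/hosts}")
    if [ -n "$ip" ]; then
        echo "$ip (hosts file)"
    else
        echo "$1: not in hosts file, would be forwarded to the configured resolver"
    fi
}
```

For example, resolve_name localhost would normally be answered from /etc/hosts without any DNS query leaving the machine.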
Types of DNS Servers