Day01: HLD-Basics (OSI/TCP/IP Model, TCP vs UDP, HTTP vs HTTPS)

Here we start! On this first day, let's understand networking from scratch. Over the series we will go from client-server architecture and TCP/UDP all the way to sharding, load balancing, scaling, distributed transactions, rate limiting, gRPC, QUIC, and more. Let's first understand client-server architecture.

Client-Server Architecture

In the simplest terms, a client is any device that can send a request to a server. A mobile phone, a laptop, or a desktop computer: each is a client that requests a payload. A server, by contrast, can be treated as a black box that takes a request and sends back a response. Think of a server as a machine that is ready to serve incoming requests: it accepts a request, performs the required computation, and sends a response to the client that asked for it.

Client and server are relative terms, because the same machine can act as both. Let's take an example. Suppose a client device sends a request to the backend server of an 'XYZ' company. Now suppose that this server has to contact a database hosted on a different continent (like MongoDB Atlas :) ). In this case, our backend server acts as a client of the database service, while the database acts as a server and sends its response back to the backend server. The backend server then uses that response to answer the original device. Study this example closely and the hierarchy will become clear.

So, we can conclude that the client-server model is a distributed application structure that divides tasks or workloads between the providers of a resource or service, called servers, and the service requesters, called clients. Applications that use the client-server architecture include email, network printing, the World Wide Web, and many more.
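To make the "server that is also a client" idea concrete, here is a minimal sketch using only Python's standard library. Everything in it is an assumption made for illustration: the one-line text protocol, the in-memory "database", and the names DatabaseHandler, BackendHandler, and demo are hypothetical and stand in for a real backend and a real database.

```python
import socket
import socketserver
import threading


def request(address, message):
    """Act as a client: connect, send one line, read one line back."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message.encode() + b"\n")
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.decode().strip()


class DatabaseHandler(socketserver.StreamRequestHandler):
    """Hypothetical 'database': looks a key up in an in-memory dict."""
    records = {"user:1": "Alice"}

    def handle(self):
        key = self.rfile.readline().decode().strip()
        self.wfile.write(self.records.get(key, "NOT_FOUND").encode() + b"\n")


class BackendHandler(socketserver.StreamRequestHandler):
    """Backend: a server to the device, but a client to the database."""
    database_address = None  # set once the database server is listening

    def handle(self):
        user_id = self.rfile.readline().decode().strip()
        name = request(self.database_address, f"user:{user_id}")  # client role
        self.wfile.write(f"Hello, {name}".encode() + b"\n")       # server role


def start(handler):
    """Start a threaded TCP server on a free localhost port."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    """Device -> backend -> database -> backend -> device."""
    database = start(DatabaseHandler)
    BackendHandler.database_address = database.server_address
    backend = start(BackendHandler)
    try:
        return request(backend.server_address, "1")
    finally:
        for server in (backend, database):
            server.shutdown()
            server.server_close()
```

Calling demo() plays the part of the client device. Note that the same request() helper is used both by the device and by the backend, which is exactly the point: the role depends on who is asking, not on the machine.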

Why do we need OSI/TCP models for client-server interaction?

What is a protocol?

A protocol is a set of rules that the endpoints of a communication link follow when they communicate. Protocols specify the interactions between the communicating entities: the format, order, and meaning of the messages they exchange.

Let's understand it with a real-world scenario.

Suppose you meet a person who speaks only Spanish, while the only language you know is French. The two of you want to communicate, but you can't, because neither understands what the other is saying. One possible solution is for you to learn Spanish, but that doesn't scale: tomorrow you might meet someone who knows only Chinese. Now you see how big the problem is.

The most feasible solution is for everyone to learn one common language, say English, since it is widely used as a global standard for communication.

The same is true in networking. A network contains many types of devices, each with its own way of working and of producing input and output. To let all of them interoperate, we came up with a set of rules, or standards, that any device has to follow if it wants to take part in network communication.

So, basically, the OSI and TCP/IP models provide a set of rules with which a uniform method of communication can be established in a network.

Why do we refer to the TCP/IP model more as compared to the OSI model?

In simple words, we use the TCP/IP model because that's reality: it is the model the Internet's protocols were actually built around. The OSI model survives mostly as a teaching aid and a shared vocabulary for talking about layers.

The OSI model is a logical, seven-layer representation of the stages data passes through on its way from source to destination. The TCP/IP model is a more condensed, four-layer version of the same idea.
For example -
Layer 7 (Application), 6 (Presentation) and 5 (Session) are combined into a single Layer in TCP/IP called the Application Layer. The user in this layer directly interacts with the applications (like surfing the web using a web browser).
Similarly, layer 2 (Data Link) and layer 1 (Physical) are combined into the Network Access layer as both of these layers provide a way to access the network by physical means or components.

So TCP/IP makes things easier to understand by staying close to how real networks are built instead of being purely conceptual. Each layer of the model uses the services of the layer below it to do its own job.
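The layer folding described above can be written down as a small lookup table. This is only a sketch: the Application and Network Access groupings come from the text, while the Transport and Internet rows are filled in from the standard four-layer TCP/IP model, and the names OSI_TO_TCPIP and tcpip_layers are made up for this example.

```python
# OSI layer number -> (OSI layer name, TCP/IP layer it folds into)
OSI_TO_TCPIP = {
    7: ("Application", "Application"),
    6: ("Presentation", "Application"),
    5: ("Session", "Application"),
    4: ("Transport", "Transport"),
    3: ("Network", "Internet"),
    2: ("Data Link", "Network Access"),
    1: ("Physical", "Network Access"),
}


def tcpip_layers():
    """Group OSI layer names under the TCP/IP layer that absorbs them, top-down."""
    grouped = {}
    for number in sorted(OSI_TO_TCPIP, reverse=True):
        osi_name, tcpip_name = OSI_TO_TCPIP[number]
        grouped.setdefault(tcpip_name, []).append(osi_name)
    return grouped


if __name__ == "__main__":
    for tcpip_name, osi_names in tcpip_layers().items():
        print(f"{tcpip_name:15} <- {', '.join(osi_names)}")
```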

For system design, we must focus mostly on the Transport layer and the Application layer.

Source: GFG*

Transport Layer - TCP/UDP

TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both Internet protocols that transfer data across the Internet. TCP is a connection-oriented protocol, which means a connection is established before data transmission and closed after it, whereas UDP is a datagram-oriented protocol, which means no connection exists before, during, or after the data transmission.

Confused, right? Let us put this in simpler terms. TCP is the protocol to choose when we care about reliable message transfer: we want a guarantee that whatever data we send or receive arrives complete and in the right order. Examples include email services such as Gmail, banking systems, etc. (cases where we have to store or process all of the incoming data).

Compared to this, with UDP we are mainly concerned with the speed of message transfer: we want low latency, even at the cost of occasional loss. This is a trade-off: reliability adds overhead, so the more delivery guarantees we ask for, the more latency we accept, and vice versa. Typical UDP use cases include online gaming, voice and video calls, and live streaming (cases where we can tolerate a few lost packets because we care more about the current state than about old data).
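The difference shows up directly in socket code. The sketch below sends one message over each protocol on localhost using Python's standard socket module; the function names tcp_echo_once and udp_echo_once are hypothetical, and a loopback test like this says nothing about real-world speed, it only shows which steps each protocol requires.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """TCP: listen/accept/connect set up a connection before any data moves."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))      # port 0 = let the OS pick a free port
    listener.listen(1)

    def serve():
        conn, _ = listener.accept()      # the handshake has completed here
        with conn:
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=serve)
    worker.start()
    try:
        with socket.create_connection(listener.getsockname(), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
    finally:
        worker.join()
        listener.close()
    return reply


def udp_echo_once(message: bytes) -> bytes:
    """UDP: no connection; every sendto() is an independent datagram."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.settimeout(5)
        client.settimeout(5)
        server.bind(("127.0.0.1", 0))
        client.sendto(message, server.getsockname())   # no connect, no handshake
        data, sender = server.recvfrom(1024)
        server.sendto(data, sender)
        reply, _ = client.recvfrom(1024)
    finally:
        server.close()
        client.close()
    return reply
```

On the TCP side nothing can be sent until accept() and create_connection() have met; on the UDP side the very first call already carries data, and nothing would tell the sender if that datagram were lost.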

Why is TCP slower than UDP?

Transmission Control Protocol, or TCP, is often referred to as a connection-oriented protocol. It provides a high level of reliability in data transmission by establishing a connection between the sender and the receiver before any data is exchanged.

How TCP works (Source: nordvpn.com/blog/tcp-or-udp-which-is-better/)

How UDP works (Source: nordvpn.com/blog/tcp-or-udp-which-is-better/)

In TCP we have the concept of handshaking, which is one of the main reasons it is a slower protocol than UDP. TCP uses two kinds of handshakes:

Three-Way Handshaking: For TCP, a three-way handshake is required before any data is transmitted between the client and the server. It makes sure both sides are reachable and agree on the initial sequence numbers that are later used to detect lost data. First, the client sends a synchronize (SYN) request to the server to initiate the TCP connection. The server then acknowledges it by replying with a synchronize-acknowledgement (SYN-ACK). When the client receives the SYN-ACK, it sends back an acknowledgement (ACK), and reliable communication can start between the two.
Four-Way Handshaking: When we want to gracefully terminate a TCP connection between the server and the client, we follow these four steps, which form the four-way handshake:
1. First, one side of the connection, either the client or the server (whichever starts the termination is called the Initiator), sends a FIN flag (meaning finish) as the request to terminate the connection.
2. In the second step, the side that receives the FIN flag (called the Receiver) sends an ACK flag back to acknowledge the closing request. This step is immediate: as soon as the Receiver gets the FIN flag, it acknowledges it.
3. In a later step, the Receiver sends its own FIN flag as a closing signal to the other side, once its internal computation or file transfer has finished. (This may take a few ms.)
4. In the final step, the Initiator, which has now received the Receiver's FIN flag, sends an ACK flag as the final acknowledgement, and the connection is closed.
This procedure can be illustrated by:
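Both handshakes can also be sketched as plain message sequences. The code below is a simplified model, not a TCP implementation: the Segment class and both function names are invented for this example, it assumes no data is exchanged during the close, and it leaves out details such as the ACK flag that real FIN segments usually carry. What it does keep is the rule that a SYN or a FIN uses up one sequence number, so the acknowledgement is always "your sequence number plus one".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None   # None means "no acknowledgement carried"


def three_way_handshake(client_isn: int, server_isn: int):
    """SYN -> SYN-ACK -> ACK, starting from each side's initial sequence number."""
    return [
        Segment("client", "SYN", seq=client_isn),
        Segment("server", "SYN-ACK", seq=server_isn, ack=client_isn + 1),
        Segment("client", "ACK", seq=client_isn + 1, ack=server_isn + 1),
    ]


def four_way_close(initiator_seq: int, receiver_seq: int):
    """FIN -> ACK -> FIN -> ACK, as in steps 1 to 4 of the graceful close."""
    return [
        Segment("initiator", "FIN", seq=initiator_seq),                       # step 1
        Segment("receiver", "ACK", seq=receiver_seq, ack=initiator_seq + 1),  # step 2
        Segment("receiver", "FIN", seq=receiver_seq),                         # step 3
        Segment("initiator", "ACK", seq=initiator_seq + 1,
                ack=receiver_seq + 1),                                        # step 4
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(100, 300) + four_way_close(500, 900):
        print(segment)
```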

Because of this handshaking, TCP carries the overhead of establishing (and tearing down) a connection, so it is slower than UDP, which has no connection setup at all.

TCP is reliable: data is either delivered or the sender learns that the connection has failed. UDP is not reliable, so delivery of data is not guaranteed.

TCP performs extensive error checking, sequencing, and ordering of data, whereas UDP offers only basic error checking (a checksum) and no sequencing. This is another reason TCP is slower and UDP is faster. Because TCP tracks what has been acknowledged, it can retransmit lost packets; UDP never retransmits on its own.
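A rough way to see what sequencing and retransmission buy us is to simulate a lossy network. The sketch below uses a stop-and-wait scheme (send one chunk, wait for its acknowledgement, resend on loss), which is a deliberate simplification: real TCP uses a sliding window and timers. The function names and the drop_attempts parameter, a set of (sequence number, attempt) pairs that the fake network loses, are assumptions made for this example.

```python
def deliver_reliably(chunks, drop_attempts):
    """Stop-and-wait: resend each numbered chunk until it gets through.

    Returns (chunks received in order, total number of transmissions).
    """
    received = []
    transmissions = 0
    for seq, chunk in enumerate(chunks):
        attempt = 0
        while True:
            transmissions += 1
            lost = (seq, attempt) in drop_attempts
            attempt += 1
            if not lost:
                received.append(chunk)   # receiver got it and acknowledges seq
                break                    # sender moves on to the next chunk
            # no acknowledgement arrived, so the sender retransmits
    return received, transmissions


def deliver_unreliably(chunks, drop_attempts):
    """UDP-style: one attempt per datagram, losses are never repaired."""
    return [c for seq, c in enumerate(chunks) if (seq, 0) not in drop_attempts]


def reassemble(segments):
    """Receiver side: order (seq, chunk) pairs by seq and drop duplicates."""
    return [chunk for _, chunk in sorted(dict(segments).items())]
```

With the same losses, the reliable path delivers everything at the cost of extra transmissions, while the unreliable path is done after one attempt per chunk but leaves a gap. reassemble shows why sequence numbers matter even without loss: they let the receiver undo reordering and duplication.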

Broadcasting is not supported in TCP, because every TCP connection is a one-to-one link that has to be set up with its own three-way handshake. UDP, being connectionless, can send the same datagram to a broadcast address.
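For reference, this is roughly what enabling broadcast looks like with Python's socket module. Treat it as a sketch under assumptions: the helper names are made up, port 50000 is an arbitrary choice, and whether a datagram sent to 255.255.255.255 actually reaches other hosts depends on the operating system and the local network.

```python
import socket


def make_broadcast_socket() -> socket.socket:
    """Create a UDP socket that is allowed to send to a broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def announce(message: bytes, port: int = 50000) -> None:
    """Send one datagram addressed to every host on the local segment.

    The port number is a hypothetical example value.
    """
    with make_broadcast_socket() as sock:
        sock.sendto(message, ("255.255.255.255", port))
```

There is no handshake anywhere in this code, which is why one sendto() can be addressed to many receivers at once.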

Application Layer - HTTP/HTTPS/FTP/SMTP/SNMP/DNS/etc

We will cover each of these application layer protocols in upcoming blogs; for this blog, let's have a basic discussion of HTTP and HTTPS. Let's start with HTTP. Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, hypermedia information systems, and it is the foundation of data communication for the World Wide Web. The main difference between HTTP and HTTPS is that HTTPS is the secure version of HTTP. HTTPS stands for Hypertext Transfer Protocol Secure, and it uses a security protocol called TLS (the successor to SSL, which is why you will often see it written as SSL/TLS) to encrypt data between the client and the server. HTTP, on the other hand, does not encrypt the data transmitted between the client and the server.
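To see that plain HTTP really is readable text on top of a TCP connection, the sketch below starts a tiny local server with Python's http.server module and talks to it over a raw socket. The handler name HelloHandler, the function fetch_raw, and the response body are all made up for this example.

```python
import http.server
import socket
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a fixed text body."""

    def do_GET(self):
        body = b"hello over plain HTTP\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_raw(path: str = "/") -> str:
    """Send a hand-written HTTP request over TCP and return the raw response."""
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    try:
        with socket.create_connection((host, port), timeout=5) as conn:
            conn.sendall(request.encode("ascii"))
            response = b""
            while chunk := conn.recv(4096):   # read until the server closes
                response += chunk
    finally:
        server.shutdown()
        server.server_close()
    return response.decode("iso-8859-1")


if __name__ == "__main__":
    print(fetch_raw())
```

Anyone who can observe this connection can read the request and the response as they are. With HTTPS the same bytes would travel through a TLS-wrapped socket instead (in Python, for example, one obtained from ssl.create_default_context().wrap_socket(...)), so an observer would only see encrypted data.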

In HTTPS, when the client tries to connect to the server, the server returns its SSL certificate. The certificate contains a public key and a digital signature. The signature is used to check whether the certificate is valid, and the public key is used for the asymmetric encryption step. First, let's try to understand asymmetric and symmetric encryption.

Asymmetric encryption is also known as public-key cryptography. Here, the public key is used to encrypt the data, while the private key is used to decrypt it.

Symmetric encryption is a type of encryption where a single key is used both to encrypt and to decrypt the data. The communicating parties must share that key in order to encrypt and decrypt each other's data.

So let's understand how HTTPS uses these encryptions:

1. During the handshake, the server sends the client an SSL certificate that contains its asymmetric public key. The matching private key stays on the web server.
2. The client generates a session key, encrypts it with the server's public key, and sends it to the server.
3. The server uses its asymmetric private key to decrypt the encrypted session key and so obtains the session key.
4. From now on, the browser and the server both use the session key to encrypt and decrypt the data for the session. This is symmetric encryption. The data is secure because only the client and the server know the session key. Once the session expires, the session key is no longer valid and the process is repeated from step 1.
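The steps above can be sketched end to end with toy cryptography. To be very clear about the assumptions: this is NOT how a real TLS library works and it is not secure. The key pair is textbook RSA built from two tiny primes, the symmetric cipher is an XOR against a SHA-256-derived keystream, certificate signature checking is skipped entirely, and the names ToyServer, keystream_xor, and run_session are invented for this example. It only shows who knows which key at each step.

```python
import hashlib
import secrets


class ToyServer:
    def __init__(self):
        p, q, self.e = 61, 53, 17                      # toy primes, far too small
        self.n = p * q                                 # public modulus
        self._d = pow(self.e, -1, (p - 1) * (q - 1))   # private key, never sent
        self.session_key = None

    def certificate(self):
        """Step 1: hand out the public key (signature checking is omitted)."""
        return {"public_key": (self.e, self.n)}

    def receive_session_key(self, encrypted_key: int) -> None:
        """Step 3: recover the session key with the private key."""
        self.session_key = pow(encrypted_key, self._d, self.n)


def keystream_xor(data: bytes, session_key: int) -> bytes:
    """Toy symmetric cipher: the same call both encrypts and decrypts."""
    pad = b""
    counter = 0
    while len(pad) < len(data):
        pad += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, pad))


def run_session(message: bytes):
    server = ToyServer()
    e, n = server.certificate()["public_key"]              # step 1
    client_key = secrets.randbelow(n - 2) + 2              # step 2: pick a session key
    server.receive_session_key(pow(client_key, e, n))      # steps 2 and 3
    ciphertext = keystream_xor(message, client_key)        # step 4: client encrypts
    plaintext = keystream_xor(ciphertext, server.session_key)  # server decrypts
    return ciphertext, plaintext
```

Only the encrypted session key and the ciphertext ever cross the "network" in run_session; the private exponent never leaves ToyServer, and the slow asymmetric operation happens exactly once per session.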

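Hijacking the session key is tough, as it is valid only for a short period. Symmetric encryption is also much faster than asymmetric encryption, as it involves far less computation, which is why it is used for the bulk of the data. (The flow above is the classic RSA-style key exchange; newer TLS versions agree on the session key with a Diffie-Hellman exchange instead, but the overall idea of using asymmetric cryptography to set up a symmetric key is the same.)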

There's a lot of information in this blog to digest, so let's wrap up for today. We will continue the discussion with more interesting topics in upcoming blogs.

Happy Coding!