Elastic Search Nodes Communication Process

DISTRIBUTED SYSTEM FT. ELASTIC SEARCH

What is Elasticsearch

Elasticsearch is a distributed search and analytics engine commonly used for indexing and searching large volumes of data. It is designed to be highly scalable and fault-tolerant.

What is Distributed System

A distributed system is a group of computers that work together to achieve a shared goal. The computers are located on different machines and communicate with each other over a network. The system appears as a single coherent system to the end-user.

Why Elasticsearch is a Distributed System Marvel

Elasticsearch is a perfect example of a distributed system because it exhibits all of the key characteristics of distributed systems:

Scalability: Elasticsearch can be scaled up or down to meet the needs of your application. You can simply add or remove nodes from your cluster to increase or decrease capacity.
Reliability: Elasticsearch is highly reliable. Even if one node fails, the other nodes can continue to operate without interruption.
Fault tolerance: Elasticsearch can tolerate the failure of individual nodes without losing data. Availability: Elasticsearch is highly available. You can access your data from anywhere in the world, at any time.

Before we start discussing how the nodes of elastic search cluster communicate with each other we need to understand what nodes are and why they do need to communicate.

A cluster is a collection of one or more nodes that work together to store and manage data. All nodes within a cluster share the same cluster name and work together to provide a unified interface to clients. So the nodes need to communicate with each other to store and manage the data , also to maintain the health of the cluster.

A node is a single instance of Elasticsearch running on a server or machine. Each node can be considered as an independent Elasticsearch server.

There are two types of nodes in an Elasticsearch cluster:

Data nodes: Data nodes store the data and perform search and indexing operations. Master nodes: Master nodes are responsible for managing the cluster, such as adding and removing nodes, and rebalancing data.

Now, coming to the main part ,

In distributed systems like Elasticsearch clusters, effective communication among nodes is crucial for maintaining cluster health, discovering node status, and achieving fault tolerance. One of the key communication mechanisms used in Elasticsearch is Gossip Protocol.

How does Gossip Protocol work :

Elasticsearch uses the Zen Discovery mechanism, which incorporates a Gossip Protocol, for node discovery and cluster management.

Zen Discovery in Elasticsearch

Zen Discovery is the default discovery mechanism used in Elasticsearch, and it relies on Gossip Protocol for cluster coordination. Here's a simplified overview of how Zen Discovery and the Gossip Protocol work in Elasticsearch:

Node Startup: When an Elasticsearch node starts, it initializes its Zen Discovery component.
Initial Seed Nodes: The node contacts one or more initial seed nodes. These seed nodes are typically specified in the Elasticsearch configuration, and they serve as entry points into the cluster. The new node learns about the cluster's existence and other seed nodes from this initial contact.
Gossiping Information: The node starts gossiping with other nodes it knows about, sending and receiving gossip messages. These messages contain information about the node's identity, state, and the cluster's current state.
Fault Detection: Nodes use the Gossip Protocol to detect the availability of other nodes in the cluster. If a node stops responding or leaves the cluster gracefully, this information is gossiped to the other nodes.
Leader Election: Zen Discovery also plays a role in leader election, where nodes compete to become the cluster's master node. The master node is responsible for cluster-wide decisions and coordination.

Zen Discovery Example :

cluster.name: my-elasticsearch-cluster 
node.name: Node1 
network.host: 192.168.1.101 
discovery.seed_hosts: ["192.168.1.101", "192.168.1.102"] 
cluster.initial_master_nodes: ["Node1"]

cluster.name: my-elasticsearch-cluster 
node.name: Node2 
network.host: 192.168.1.102 
discovery.seed_hosts: ["192.168.1.101", "192.168.1.102"] 
cluster.initial_master_nodes: ["Node1"]

In this configuration:

cluster.name: Specifies the name of the Elasticsearch cluster.
http.host: Specifies the network address that Elasticsearch binds to. It's set to the IP address of new node.
discovery.seed_hosts: Lists the seed nodes that Elasticsearch uses for initial discovery. Both Node1 and Node2 are listed.
cluster.initial_master_nodes: Specifies the initial master node when the cluster starts. In this case, 10.12.0.157,10.12.0.148,10.12.0.156 is the initial master.

Node Discovery Process:

Both Node1 and Node2 start with the specified configurations. They use the discovery.seed_hosts list to find seed nodes (Node1 and Node2 themselves in this case). Gossip messages are exchanged between the nodes, allowing them to discover each other's presence and learn about the cluster's state. The cluster.initial_master_nodes setting designates Node1 as the initial master node. The cluster forms with Node1 as the master node and Node2 as a data node. They continuously exchange gossip messages to maintain a consistent view of the cluster state and react to changes dynamically (e.g., when new nodes join).

The key takeaway is that Zen Discovery and the Gossip Protocol enable nodes to discover each other, coordinate cluster activities, and maintain cluster integrity in Elasticsearch.

Hope you liked this article. If you did, please please don't forget to put a ❤️. And guess what, in the next part of the Elasticsearch Series , I will address another interesting topic that you will love as well. So, Please Stay Tuned and let me know how you feel about this blog!! 🧘