Elasticsearch Cluster Tutorial
Description
This is a small tutorial about creating a Cluster of Elasticsearch Servers with Metricbeat instances.
I will create 3 identical Ubuntu 20.04 servers in different regions of the world.
I will install Elasticsearch and Metricbeat on them and configure them with identical settings. Note that I am using Metricbeat as an example collector. You can install other Beats, such as Filebeat, or other collectors instead of or in addition to Metricbeat. There are many possibilities.
My servers will be from DigitalOcean.
I will select the basic $10 a month droplets (Ubuntu 20.04, 2GB RAM, 1 CPU, 50GB SSD) and start them in New York, Amsterdam and Singapore.
I will give them hostnames of ES1, ES2 and ES3.
They all have unique IP addresses which I will need to use in the Elasticsearch and Metricbeat configurations.
I will also name the nodes in the cluster as node-1, node-2 and node-3.
Hostname | Node Name | IP Address |
---|---|---|
ES1 | node-1 | 203.0.113.1 |
ES2 | node-2 | 203.0.113.2 |
ES3 | node-3 | 203.0.113.3 |
Note
The IP addresses used in the above example table are for demonstration only. Replace them with the IP addresses or domain names of your own Elasticsearch servers.
Install Elasticsearch
SSH onto all 3 servers and enter the following commands.
Download and install the Elasticsearch public signing key.
Install dependencies
Save the repository definition
Update and install the Elasticsearch package
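The four install steps above can be sketched as a single shell function. This is a minimal sketch, assuming the Elastic 7.x apt repository and the `apt-key` workflow that Ubuntu 20.04 still supports; the function name `install_elasticsearch` is my own, and nothing runs until you call it.

```shell
# Sketch: install Elasticsearch 7.x from the Elastic apt repository.
# Assumption: Ubuntu 20.04, where apt-key is still available.
install_elasticsearch() {
    # Download and install the Elasticsearch public signing key.
    wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
    # Install dependencies (HTTPS transport for apt).
    sudo apt-get install -y apt-transport-https
    # Save the repository definition.
    echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" |
        sudo tee /etc/apt/sources.list.d/elastic-7.x.list
    # Update the package index and install the Elasticsearch package.
    sudo apt-get update && sudo apt-get install -y elasticsearch
}
```

Run `install_elasticsearch` on each of the 3 servers.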
Edit the Elasticsearch configuration file, /etc/elasticsearch/elasticsearch.yml.
Modify the properties in each elasticsearch.yml by adding your node names and IP addresses.
ES1
ES2
ES3
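As a rough guide, the six per-node settings might look like the output of the helper below. This is a sketch under assumptions: the function `es_config` is hypothetical, the setting names are standard Elasticsearch 7.x settings, and binding `network.host` to both `_local_` and the node's IP address is my own choice so that `curl http://localhost:9200` keeps working on the server itself.

```shell
# Print the cluster-related elasticsearch.yml settings for one node.
# Usage: es_config NODE_NAME NODE_IP
es_config() {
    node_name=$1
    node_ip=$2
    cat <<EOF
cluster.name: mycluster
node.name: ${node_name}
network.host: ["_local_", "${node_ip}"]
http.port: 9200
discovery.seed_hosts: ["203.0.113.1", "203.0.113.2", "203.0.113.3"]
cluster.initial_master_nodes: ["node-1"]
EOF
}

es_config node-1 203.0.113.1   # settings for ES1
```

Call it with `node-2 203.0.113.2` on ES2 and `node-3 203.0.113.3` on ES3, then copy the output into /etc/elasticsearch/elasticsearch.yml on that server. Only `node.name` and `network.host` differ between the nodes.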
Note that I named the cluster mycluster. You can name it anything you want using lowercase letters (a-z) and hyphens (-).
Also, in the above settings, I have chosen node-1 to be the initial master node. This only matters when starting the servers for the first time. I will start node-1 first and confirm it has started before starting node-2 and node-3. This ensures that all nodes register using the same cluster UUID. After the cluster has started and all nodes are connected, any of the nodes can be elected master if the current master goes offline for any period of time. In poor network conditions, your master node may change regularly, and all the other nodes will resynchronise with the newly agreed master.
Start Elasticsearch Master Node
Start Elasticsearch on ES1 first, then wait and confirm that its status is active.
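A minimal sketch of starting the service and checking its status, assuming the package installed a systemd unit named `elasticsearch` (the function name is my own):

```shell
# Start the Elasticsearch service and print its status.
start_elasticsearch() {
    sudo systemctl start elasticsearch
    # --no-pager prints the status and returns instead of opening a pager.
    sudo systemctl status elasticsearch --no-pager
}
```

The status output should contain `active (running)` once the node is up, which can take a little while on a small server.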
Check its default response and cluster health.
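Both checks are plain HTTP requests against port 9200. The sketch below assumes the node answers on `http://localhost:9200`; the helper names `es_info` and `es_health` are hypothetical, and each takes an optional base URL.

```shell
# Default response of the node (name, cluster_name, cluster_uuid, version).
es_info() {
    curl -s "${1:-http://localhost:9200}/?pretty"
}

# Cluster health (status, number_of_nodes, and so on).
es_health() {
    curl -s "${1:-http://localhost:9200}/_cluster/health?pretty"
}
```

For example, `es_health` queries the local node and `es_health http://203.0.113.1:9200` queries ES1 by address.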
There should be no errors.
Take note of the cluster_uuid of the master node.
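The cluster_uuid is part of the node's default response, so one way to read it is to filter that response. A small sketch, assuming the node answers on `http://localhost:9200` (the function name `cluster_uuid` is my own):

```shell
# Print only the cluster_uuid from the node's default response.
# Usage: cluster_uuid [BASE_URL]
cluster_uuid() {
    curl -s "${1:-http://localhost:9200}/?pretty" |
        sed -n 's/.*"cluster_uuid" *: *"\([^"]*\)".*/\1/p'
}
```

Running `cluster_uuid` on each server makes it easy to compare the values side by side.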
Start Elasticsearch Data Nodes
Start Elasticsearch on ES2 and ES3 in the same way and confirm that their statuses are active.
Check Health
Now on any of the nodes (master or data), check the cluster health.
It should show that number_of_nodes is greater than 1; if you have 3 nodes in total, it should say number_of_nodes : 3.
If not, the other nodes have probably not detected the master and have created their own cluster UUID.
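If you only want the node count, the health response can be narrowed with the `filter_path` query parameter. A sketch, assuming the node answers on `http://localhost:9200` (the function name `es_node_count` is hypothetical):

```shell
# Print number_of_nodes from the cluster health response.
# Usage: es_node_count [BASE_URL]
es_node_count() {
    curl -s "${1:-http://localhost:9200}/_cluster/health?filter_path=number_of_nodes" |
        sed -n 's/.*"number_of_nodes" *: *\([0-9][0-9]*\).*/\1/p'
}
```

With all three nodes joined, `es_node_count` should print 3 on every server.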
On each of the other nodes, query the cluster_uuid and check whether it matches the cluster_uuid on the master node that you started first (node-1 in my case).
If the cluster_uuid doesn't match, delete the nodes folder on the data node server and restart Elasticsearch.
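A sketch of that reset, assuming the default data path of the Debian package, /var/lib/elasticsearch. The function name is my own, and stopping the service before deleting is my addition for safety. Be careful: this removes all index data stored on that node, so only use it on a node that has not yet joined the real cluster.

```shell
# Discard the local cluster state of a node that formed its own cluster.
# WARNING: deletes all data stored on this node.
reset_data_node() {
    sudo systemctl stop elasticsearch
    sudo rm -rf /var/lib/elasticsearch/nodes
    sudo systemctl start elasticsearch
}
```

On the next start the node has no stored cluster UUID and can join the cluster it finds through discovery.seed_hosts.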
Check the cluster health again for the correct value.
In the end, when all nodes are running, they should all agree on the same cluster_uuid and all have chosen the same master node.
To see a list of node UUIDs that are active in the cluster,
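The elected master and the node IDs are both available from the cat APIs. A minimal sketch, assuming the node answers on `http://localhost:9200`; the helper names are hypothetical, and the column list passed with `h=` is just one reasonable selection.

```shell
# Show which node is currently the elected master.
es_master() {
    curl -s "${1:-http://localhost:9200}/_cat/master?v"
}

# List every node in the cluster with its full (untruncated) node ID.
es_node_ids() {
    curl -s "${1:-http://localhost:9200}/_cat/nodes?v&full_id=true&h=id,name,ip,master"
}
```

In the `_cat/nodes` output, the master column shows `*` for the elected master and `-` for the other nodes.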
IP Rules
If your Elasticsearch servers are all public on the internet, then you should create some IP rules to block outside access.
In my example, the IP addresses of the ES nodes are 203.0.113.1, 203.0.113.2 and 203.0.113.3, so I will create IP rules that allow only those addresses to communicate with each other.
By default, Elasticsearch uses port 9200 for HTTP and port 9300 for communication between nodes.
On all 3 ES nodes execute,
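One way to build such a rule set with iptables is sketched below. The generator function `es_firewall_rules` is hypothetical; it only prints the commands so you can review them first. For each port it accepts traffic from the local machine and from each listed node, then drops everything else. The order matters, because iptables stops at the first matching rule.

```shell
# Print iptables commands that restrict ports 9200 and 9300 to the
# given node addresses. Usage: es_firewall_rules IP [IP...]
es_firewall_rules() {
    for port in 9200 9300; do
        # Allow the server to talk to itself.
        echo "iptables -A INPUT -p tcp -s 127.0.0.1 --dport $port -j ACCEPT"
        # Allow each cluster node.
        for ip in "$@"; do
            echo "iptables -A INPUT -p tcp -s $ip --dport $port -j ACCEPT"
        done
        # Drop every other source.
        echo "iptables -A INPUT -p tcp --dport $port -j DROP"
    done
}

es_firewall_rules 203.0.113.1 203.0.113.2 203.0.113.3
```

After reviewing the printed commands, they can be applied with `es_firewall_rules 203.0.113.1 203.0.113.2 203.0.113.3 | sudo sh`. Rules added this way are not saved across reboots unless you persist them separately.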
Note
Replace my example IP addresses above with your real IP addresses or domain names.
Install Metricbeat
Now that the cluster is confirmed running, it's time to start ingesting data into it. I will use Metricbeat since it is a very popular solution and quick to set up.
On each of the master and data nodes, install the Metricbeat service.
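A sketch of the install step, assuming the Elastic 7.x apt repository is already configured on the server, so the `metricbeat` package can be installed from it. The function name is my own.

```shell
# Install Metricbeat from the already configured Elastic apt repository.
install_metricbeat() {
    sudo apt-get update && sudo apt-get install -y metricbeat
}
```

Installing from the same repository keeps the Metricbeat and Elasticsearch versions on the same 7.x line.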
Edit the configuration to point to all of the Elasticsearch nodes.
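The relevant part of the configuration is the `output.elasticsearch` section of /etc/metricbeat/metricbeat.yml. The helper below is a hypothetical sketch that prints that section for a list of node addresses, assuming every node listens on port 9200.

```shell
# Print the output.elasticsearch section for metricbeat.yml.
# Usage: metricbeat_output IP [IP...]
metricbeat_output() {
    hosts=""
    for ip in "$@"; do
        # Append "ip:9200", separated by ", " after the first entry.
        hosts="${hosts:+$hosts, }\"$ip:9200\""
    done
    printf 'output.elasticsearch:\n  hosts: [%s]\n' "$hosts"
}

metricbeat_output 203.0.113.1 203.0.113.2 203.0.113.3
```

Replace the existing `output.elasticsearch` section in /etc/metricbeat/metricbeat.yml with the printed lines. With several hosts listed, Metricbeat can still deliver events when one node is unreachable.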
Confirm that the system module is enabled
Start and test status
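These two steps can be sketched with the `metricbeat modules` subcommands and systemd, assuming the package installed a systemd unit named `metricbeat`. The function names are my own.

```shell
# List the modules, then make sure the system module is enabled.
enable_system_module() {
    sudo metricbeat modules list
    sudo metricbeat modules enable system
}

# Start the Metricbeat service and print its status.
start_metricbeat() {
    sudo systemctl start metricbeat
    sudo systemctl status metricbeat --no-pager
}
```

In the output of `metricbeat modules list`, system should appear under the enabled modules.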
Check for indices
Check for cluster health
Check who is the master node
Check the ids of each node
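The four checks above can be run in one pass. This is a sketch, assuming the node answers on `http://localhost:9200`; the function name `es_overview` is hypothetical.

```shell
# Print indices, cluster health, the elected master and the node IDs.
# Usage: es_overview [BASE_URL]
es_overview() {
    base=${1:-http://localhost:9200}
    for path in '_cat/indices?v' '_cluster/health?pretty' \
                '_cat/master?v' '_cat/nodes?v&full_id=true'; do
        echo "== $path"
        curl -s "$base/$path"
    done
}
```

Once Metricbeat is shipping data, the indices list should contain at least one index whose name starts with metricbeat-.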
Add An Elasticsearch Datasource in Grafana
Key | Value | Notes |
---|---|---|
Name | Elasticsearch | Or whatever name you want to use |
URL | http://203.0.113.1:9200 | IP address or domain name of your ES server |
Index name | metricbeat-7.10.* | Check the correct index name using curl http://localhost:9200/_cat/indices from the ES server |
Version | 7.0+ | Elasticsearch version 7.10 was used in this tutorial |
Save and Test
Do you have problems connecting?
The cause is probably the IP/firewall rules, or that particular ES server is not running.
Add a new rule to each ES server to allow your Grafana server to access port 9200.
Get IP rule line numbers
Insert a rule for your Grafana server, at position 5 for example. Your Grafana IP address or domain name will be different from mine.
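A sketch of both steps with iptables. The function names are my own, and the address 203.0.113.10 in the usage note is a placeholder, not a real Grafana server. The new ACCEPT rule must sit above the DROP rule for port 9200, which is why it is inserted at a position rather than appended.

```shell
# Show the INPUT rules with their line numbers.
list_input_rules() {
    sudo iptables -L INPUT -n --line-numbers
}

# Insert an ACCEPT rule for the Grafana server on port 9200.
# Usage: allow_grafana GRAFANA_IP [POSITION]   (POSITION defaults to 5)
allow_grafana() {
    grafana_ip=$1
    position=${2:-5}
    sudo iptables -I INPUT "$position" -p tcp -s "$grafana_ip" --dport 9200 -j ACCEPT
}
```

For example, run `list_input_rules` to find the line number of the port 9200 DROP rule, then `allow_grafana 203.0.113.10 5` if the DROP rule is at line 5 or later.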