Building a Linux Cluster

DevynCJohnson

Sometimes, businesses and individuals need extra computing power. Thankfully, computers can combine their resources and act as one system. This single entity is called a cluster, and the act of making a cluster is called clustering. In a cluster, the computers are connected together on a Local Area Network (LAN). Each computer is called a node, and each node acts as a server. Every node must have an operating system running on it. Linux is one of many operating systems that support clustering.

The nodes are not required to have identical hardware. In other words, some of the nodes can have a 2GHz processor and 3GB of RAM while other nodes have a 3.5GHz processor and 4GB of RAM. The nodes can even be different brands of computers (Dell, Toshiba, ThinkPenguin, etc.).

The simplest type of cluster is a Beowulf cluster. Such a cluster has nodes connected together on a LAN. All of the nodes work together as a single machine, as opposed to a "Cluster of Workstations" (a COW cluster), where the nodes work together but not as a single machine. However, in a Beowulf cluster, each node runs its own operating system, so it may appear to users that the machines are not working together.

To set up a Linux MPICH1 Beowulf cluster, obtain some computers and connect them together on the same network. Decide which computer will be which node, and write down the node number assigned to each computer. The computer with the best hardware should be node0. Next, select a Linux distro and install that distro on each computer. After the install, follow the steps below on each computer.

Create a user on each node. Be sure to use the same username so that the user is the same across all of the nodes.

Add the IP address of each node to /etc/hosts. Remember to assign the same IP to the same node on each computer's /etc/hosts file. For instance, in all of the /etc/hosts files, node0 would have the IP address of 192.168.2.1.
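The entries can be generated once and then pasted into every node's /etc/hosts so that they stay identical everywhere. Below is a minimal sketch, assuming the nodes are named node0, node1, and so on, and that their addresses are consecutive starting at 192.168.2.1 (the consecutive numbering, the four-node example, and the function name "print_host_entries" are assumptions made for illustration).
Code:
```shell
#!/bin/sh
# Print /etc/hosts entries for a cluster of N nodes.
# Assumption: node names are node0..node(N-1) and the addresses are
# consecutive, starting at 192.168.2.1 for node0.
print_host_entries() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '192.168.2.%d\tnode%d\n' "$((i + 1))" "$i"
        i=$((i + 1))
    done
}

# Example: entries for a four-node cluster.
print_host_entries 4
```

The output only goes to the terminal; review it before adding it to /etc/hosts with root privileges.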

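Now, log in as the user that was created on all of the nodes. Once logged in, open a terminal and type "ssh-keygen -t rsa" (DSA keys, created with "-t dsa", are disabled by default since OpenSSH 7.0). If the cluster is not connected to the Internet and security is not a concern, then the generated SSH key can lack a passphrase. When done, open ~/.ssh/id_rsa.pub and copy the public key. Next, in the same directory, create a file called "authorized_keys" and place the public key inside. For the nodes to log in to each other without a password, each node's authorized_keys file must also contain the public keys of the other nodes that will connect to it.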
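Copying keys by hand makes it easy to paste the same key twice or to leave the file with loose permissions. Below is a minimal sketch of a helper that appends a public key to an authorized_keys file only when it is not already there. The function name "add_authorized_key" and the KEYFILE placeholder are hypothetical and not part of OpenSSH.
Code:
```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# Arguments: path to the .pub file, path to the authorized_keys file.
add_authorized_key() {
    pubfile=$1
    authfile=$2
    mkdir -p "$(dirname "$authfile")"
    touch "$authfile"
    key=$(cat "$pubfile")
    # -x matches whole lines, -F treats the key as a fixed string.
    if ! grep -q -x -F -e "$key" "$authfile"; then
        printf '%s\n' "$key" >> "$authfile"
    fi
    # With its default StrictModes setting, sshd rejects key files
    # that are writable by other users.
    chmod 600 "$authfile"
}

# Typical use on a node (KEYFILE is a placeholder for the .pub file
# that ssh-keygen wrote):
#   add_authorized_key "$KEYFILE" "$HOME/.ssh/authorized_keys"
```

Running the helper again with the same key leaves the file unchanged, so it is safe to repeat on every node.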

Afterwards, download MPICH1 ( http://dcjtech.info/wp-content/uploads/2015/03/mpich.tar.gz ) and extract it. Open a terminal in the MPICH1 source code folder (the folder that was in the compressed file). Once inside, run the following commands, which will compile and install MPICH1. GCC may need to be installed before running the commands.
Code:
mkdir ~/mpich1
./configure --prefix="$HOME/mpich1"
make
make install

NOTE: Remember to follow these steps on all of the nodes.

After the installation, open ~/.bashrc and place the below code inside. If the file does not exist, then create it and insert the below code.
Code:
# $HOME is used because "~" is not expanded inside double quotes
export PATH="$HOME/mpich1/bin:$PATH"
export LD_LIBRARY_PATH="$HOME/mpich1/lib:$LD_LIBRARY_PATH"
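Because ~/.bashrc is read by every new interactive shell, including nested ones, a plain prepend can add the same directory to PATH several times. Below is a minimal sketch of a variant that prepends a directory only when it is missing. The helper name "prepend_once" is hypothetical, and the ~/mpich1 install location is the one used in the steps above.
Code:
```shell
#!/bin/sh
# Prepend a directory to a colon-separated list only if it is not
# already present, and print the resulting list.
prepend_once() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *)
            if [ -n "$list" ]; then
                printf '%s:%s\n' "$dir" "$list"
            else
                printf '%s\n' "$dir"
            fi
            ;;
    esac
}

PATH=$(prepend_once "$HOME/mpich1/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_once "$HOME/mpich1/lib" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Sourcing these lines a second time leaves both variables unchanged.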

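Next, with root privileges, add the full path of the MPICH1 bin directory (for example, /home/USERNAME/mpich1/bin) to the PATH value in /etc/environment (a line of the form PATH="..."). Running "sudo echo ~/mpich1/bin >> /etc/environment" does not work for this: the redirection is performed by the unprivileged shell rather than by sudo, and /etc/environment expects NAME=value lines, not a bare path. When finished, log out and then log back in so that the .bashrc and /etc/environment changes take effect.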

MPICH1 needs to be configured. To do so, find the file called "machines.LINUX". It will be in ~/mpich1/share/ or ~/mpich1/util/machines/. In the file, list the hostnames of the other nodes, one per line. Do not add the hostname of the node that owns the file; for instance, on node3, the file should not contain node3's hostname. After each hostname, type a space followed by a colon and then another space, and then the number of cores in that node. For illustration, if node1 is a quad-core, then the line would look like "node1 : 4".
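Since every node needs a slightly different list (its own hostname is left out), the file can be produced from one shared list of nodes. Below is a minimal sketch, assuming the input is one "hostname cores" pair per line. The function name "make_machines" and the core counts in the example are made up for illustration.
Code:
```shell
#!/bin/sh
# Write a machines.LINUX-style list for one node.
# Input on stdin: one "hostname cores" pair per line for the whole cluster.
# Argument: the hostname of the node that will own the file (it is left out).
make_machines() {
    self=$1
    while read -r host cores; do
        [ -z "$host" ] && continue          # skip blank lines
        [ "$host" = "$self" ] && continue   # skip the owning node
        printf '%s : %s\n' "$host" "$cores"
    done
}

# Example: the list for node3 in a four-node cluster.
make_machines node3 <<'EOF'
node0 8
node1 4
node2 2
node3 2
EOF
```

The example prints the lines for node0, node1, and node2; redirect the output into the machines.LINUX file on node3 once it looks right.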

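The cluster is now complete. To take advantage of the clustering, open a terminal and type "mpirun -np # PROGRAM", where "#" is the number of processes to create and "PROGRAM" is the program to run on the cluster. The program must be written for MPI and compiled with MPICH's compiler wrapper (such as mpicc) for the work to be divided among the nodes; an ordinary program is simply started "#" times.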
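A common starting point for "#" is one process per core. Below is a minimal sketch that adds up the core counts in a machines file written as described above ("node1 : 4") plus the cores of the local node, which the file does not list. The function name "total_cores", the MACHINES_FILE placeholder, and the program name in the usage comment are hypothetical.
Code:
```shell
#!/bin/sh
# Sum the core counts in a machines file whose lines look like "node1 : 4",
# then add the cores of the local node, which the file does not list.
# Arguments: path to the machines file, number of local cores.
total_cores() {
    machines=$1
    local_cores=$2
    awk -F ':' -v n="$local_cores" '
        NF == 2 { n += $2 }   # the second field is the core count
        END { print n }
    ' "$machines"
}

# Example (MACHINES_FILE and ./my_mpi_program are placeholders):
#   mpirun -np "$(total_cores "$MACHINES_FILE" 4)" ./my_mpi_program
```

Whether one process per core is the best choice depends on the program, so treat the result as a first guess.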

Further Reading

Setting up an MPICH2 cluster - https://help.ubuntu.com/community/MpichCluster
 


I am confused. I have heard a little bit about DRBD clusters. They seem different from Beowulf clusters. Can anyone explain the difference between Gluster, DRBD, and Beowulf?
 

Gluster (the name combines "GNU" and "cluster") is a cluster of Linux computers that all share the same storage using GlusterFS. Gluster is often used to provide storage for cloud systems.
http://www.gluster.org/

A Beowulf cluster is a group of separate computers that may be using different hardware and platforms. They remain separate machines, but they act as one. Think of a Beowulf cluster like a corporation or government body: each is made up of many individual people, but still acts as one unit.

A Distributed Replicated Block Device (DRBD) cluster is similar to Gluster, but works at a different level of storage: DRBD mirrors whole block devices between nodes over the network, while GlusterFS shares files. DRBD is specific to Linux, and its code is in the vanilla (official, mainstream, and unchanged) Linux kernel.
http://drbd.linbit.com/
https://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device

More differences may exist. Plus, I am new to Gluster, and I had never heard of DRBD until you mentioned it.
 
Where it says to create the file ~/.bashrc, where do I place it if it's missing?
I have a file named bash.bashrc. Is that good also?
 
As far as I know, MPICH2 is better than MPICH1. Why do you install MPICH1 instead of MPICH2?
 
