Installing a large Linux cluster, Part 2: Management server configuration and node installation

Getting started with a large Linux cluster

Graham White (gwhite@uk.ibm.com), Systems Management Specialist, IBM, Software Group

Summary:  Create a working Linux® cluster from many separate pieces of hardware and software, including System x® and IBM TotalStorage® systems. This second part in a multipart series describes configuring the management server and installing the nodes in the cluster.


Date:  25 Jan 2007
Level:  Advanced
Also available in:  Chinese Russian


Introduction

This is the second of several articles that cover the installation and setup of a large Linux computer cluster. The purpose of the series is to bring together in one place up-to-date information from various sources in the public domain about the process required to create a working Linux cluster from many separate pieces of hardware and software. These articles are not intended to provide the basis for the complete design of a new large Linux cluster; refer to the relevant reference materials and Redbooks® mentioned throughout for general architecture pointers.

This series is addressed to systems architects and systems engineers planning and implementing a Linux cluster using the IBM eServer® Cluster 1350 framework (see Resources for more information about the framework). Some parts might also be relevant to cluster administrators for educational purposes and during normal cluster operation.

Part 1 of the series provides detailed instructions on setting up the hardware for the cluster. This second part takes you through the next steps after hardware configuration: software installation using the IBM systems management software, Cluster Systems Management (CSM), and node installation.

Additional parts of the series deal with the storage back-end of the cluster. Those articles cover the storage hardware configuration and the installation and configuration of the IBM shared file system, General Parallel File System (GPFS).

Configuring the management server

The software side of setting up a cluster is a two-stage process: first, install the cluster management server, as described in the first part of this article; and second, install the rest of the cluster, as described beginning in the section "Installing nodes". Following this process enables you to use the management server to help configure the rest of the cluster and to prepare for post-installation maintenance and operation.

Installing Linux

Install a fresh operating system on the management server. The specifics depend on the management server you have. This example uses a System x 346 machine, which is a typical IBM management server, running Red Hat Enterprise Linux (RHEL) 3. However, this could be another type of computer running a different Linux distribution, such as Suse Linux Enterprise Server (SLES). The System x 346 machine is 64-bit capable, so the operating system installed is the x86_64 architecture version of RHEL 3 across a two-disk mirror using a ServeRAID 7k card. Again, your environment might be slightly different, but the basics of installing the CSM management server should be roughly the same.

Boot the server with the latest IBM ServeRAID support CD to configure the on-board disks for RAID 1 (mirroring). This assumes you have at least two disks in the server and you require protection from disk failure for your operating system.

With the disks configured as a single mirror, boot the server with the first RHEL CD to install the RHEL operating system. Depending on your console, you might need to alter the appearance of the installation. For example, for low-resolution consoles, you might need to boot the CD by typing linux vga=normal at the boot prompt. After you see the Linux installation GUI, install as normal, following these instructions:

  1. Select your language, keyboard map, mouse type, and so on.

  2. Configure disk partitions, as follows:
    • 128 MB /boot primary partition.
    • 2 GB swap partition.
    • Allocate the remaining space to an LVM partition without formatting.


  3. Perform logical volume (LVM) setup, as follows:
    • Name the volume group system.
    • Add logical volumes as shown in Table 1.


  4. Set up the network interfaces, as follows:
    • Activate eth0 on boot with fixed IP address 192.168.0.253/24 according to our example hosts file above.
    • Set the hostname to mgmt001.cluster.com.
    • No gateway/DNS is required at this stage, but if you have external IP information, you can configure it during installation.


  5. Set firewall to no firewall to allow all connections. Again, if you have a requirement for IP tables, you can configure this later.

  6. Apply your local settings and select the appropriate time zone.

  7. Set the root password; our example password is cluster.

  8. Customize the package installation to include the following:
    • X Window system
    • KDE (that is, K desktop environment)
    • Graphical internet
    • Server configuration tools
    • FTP server
    • Network servers
    • Legacy software development
    • Administration tools


  9. Start the installation.

Table 1. Logical volume layout
Logical volume   Mount point    Size
root             /              8192 MB
var              /var           8192 MB
usr              /usr           8192 MB
opt              /opt           4096 MB
tmp              /tmp           2048 MB
csminstall       /csminstall    10240 MB

Once you complete installation, you need to navigate through any post-installation setup screens. Make any custom post-installation changes to the management server for your environment. For example, you might need to configure the X server to work comfortably with your KVM (keyboard, video, and mouse) setup.

Installing CSM

Installing the Cluster Systems Management (CSM) software is generally trivial on a supported system. Good documentation is available in HTML and PDF formats in the IBM Linux Cluster documentation library (see Resources).

The first step is to copy the software onto the management server. Because you must perform the installation as the root user, you can store all of this in the root home directory. Table 2 shows an appropriate directory structure.


Table 2. CSM software
Directory              Description
/root/manuals/csm/     The CSM documentation PDFs
/root/manuals/gpfs/    The GPFS documentation PDFs
/root/manuals/rsct/    The RSCT documentation PDFs
/root/csm/             The CSM software (contents of the CSM tar package)
/root/csm/downloads/   Your open source download RPMs for CSM (for example, autorpm)

To install CSM, install the csm.core i386 RPM package. This package works for the x86_64 architecture too. After you install that package, you have the command available to install the CSM management server. First, source /etc/profile.d/Csm.sh into your current shell to pick up the new path setting. Then, run the installms command and apply the CSM license to the system. Here are the commands you need to enter:

rpm -ivh /root/csm/csm.core*.i386.rpm
. /etc/profile.d/Csm.sh
installms -p /root/csm/downloads:/root/csm
csmconfig -L <Your License File>

Note: If you do not have a CSM license file, you can run the same csmconfig -L command without the license file to accept the 60-day CSM try-and-buy license. You must then apply a full CSM license to continue CSM function after the 60-day period expires.

Optimizing for large cluster usage

CSM is designed to be scalable, and Red Hat Linux works well in most standard situations. However, you can make some tweaks to the management server in order to make a large cluster environment run a little more smoothly. Here are some examples of ways to optimize the system:

Listen for DHCP requests on a specific interface.
Edit the /etc/sysconfig/dhcpd DHCPD configuration file so that DHCPDARGS is set to the appropriate interface. The DHCPDARGS variable is used by the /etc/init.d/dhcpd start-up script in Red Hat Linux to start the DHCP daemon with the specified arguments. Ensure multiple arguments are all contained in quotes. To listen on eth0, set it like this:
DHCPDARGS="eth0"

Increase ARP table size and timeouts.
If you have a single large network with much of or the entire cluster on the same subnet, the ARP table can become overloaded, giving the impression that CSM and network requests are slow. To avoid this, make the following changes to the running system, and add them as entries to the /etc/sysctl.conf file, too, in order to make the changes persistent:
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.neigh.default.gc_thresh1 = 512
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv4.neigh.default.gc_stale_time = 240
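
To make the new values take effect immediately, assuming you have added the entries above to /etc/sysctl.conf, reload that file with the standard sysctl tool:

# apply every setting listed in /etc/sysctl.conf to the running kernel
sysctl -p /etc/sysctl.conf

# spot-check one of the new values
sysctl net.ipv4.neigh.default.gc_thresh3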

Increase the number of NFS daemons.
By default, the standard CSM fanout value is set to 16. This means commands run across the cluster are run 16 nodes at a time, including node installation. The standard Red Hat Linux setting for NFS is eight running daemons. You can make NFS scale better by increasing the number of NFSD threads to 16 to match the default CSM fanout value. However, if you increase the fanout value, you might also want to increase the number of NFS threads further. Typically, a fanout of 32 with 32 NFS threads is enough to give good speed and reliability, and it also allows a single rack of 32 nodes to be installed concurrently. To do this, create the configuration file /etc/sysconfig/nfs, and add the following line:
RPCNFSDCOUNT=16
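
If you later raise the CSM fanout to 32, as suggested above, the matching change might look like the following sketch, assuming the standard Red Hat nfs init script that reads RPCNFSDCOUNT from /etc/sysconfig/nfs:

# run 32 NFS server threads to match a CSM fanout of 32
echo "RPCNFSDCOUNT=32" > /etc/sysconfig/nfs

# restart NFS so the new thread count takes effect
service nfs restart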

Set up an NTP server.
The default Red Hat Linux configuration should work for the NTP server. Add a configuration line to the /etc/ntp.conf NTP configuration file to allow nodes on your cluster network to synchronize their time to the management server clock, as shown here:
restrict 192.168.0.253 mask 255.255.255.0 notrust nomodify notrap

If your management server can reach an outside time server, add this line to synchronize the management server clock to it:
server server.full.name

Ensure the NTP server is running and starts automatically at boot time using the following commands:
chkconfig ntpd on
service ntpd start
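
To confirm that ntpd is running and has selected a time source, you can query it with the standard ntpq tool:

# list the peers ntpd knows about and its synchronization status
ntpq -p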

Installing nodes

The CSM management server is now installed, with all setup and configuration steps completed. Before you can install any nodes, however, you need to complete additional configuration on the CSM management server to define how the nodes are installed. Do the installation steps in this section on the CSM management server.

Defining nodes

You can define nodes using any method described in the definenode manual page. However, an easy way to define large numbers of nodes is to use a node definition file. With this method, you create a stanza file and pass it as an argument to CSM to define all the nodes listed. It is easy to script the creation of the stanza file.

Listing 1 shows a short example node definition file. If you have similar properties for different nodes, you can define them at the top of the file in a default stanza. After this, each stanza should represent a node name with node-specific attributes listed under it. The example shows how to define three machines in the example cluster -- two compute nodes and one storage server.


Listing 1. Example node definition file
default:
  ConsoleMethod = mrv
  ConsoleSerialDevice = ttyS0
  ConsoleSerialSpeed = 9600
  InstallAdapterName = eth0
  InstallCSMVersion = 1.4.1
  InstallMethod = kickstart
  InstallOSName = Linux
  InstallPkgArchitecture = x86_64
  ManagementServer = mgmt001.cluster.com
  PowerMethod = bmc
node001.cluster.com:
  ConsolePortNum = 1
  ConsoleServerName = term002
  HWControlNodeId = node001
  HWControlPoint = node001_d.cluster.com
  InstallDistributionName = RedHatEL-WS
  InstallDistributionVersion = 4
  InstallServiceLevel = QU1
node002.cluster.com:
  ConsolePortNum = 2
  ConsoleServerName = term002
  HWControlNodeId = node002
  HWControlPoint = node002_d.cluster.com
  InstallDistributionName = RedHatEL-WS
  InstallDistributionVersion = 4
  InstallServiceLevel = QU1
stor001.cluster.com:
  ConsolePortNum = 2
  ConsoleServerName = term001
  HWControlNodeId = stor001
  HWControlPoint = stor001_d.cluster.com
  InstallDistributionName = RedHatEL-AS
  InstallDistributionVersion = 3
  InstallServiceLevel = QU5

The node definition file you create with your script will be far longer than this for a large-scale cluster. However, passing it to CSM with the following command creates the nodes quickly:

definenode -f <node-def-filename>

Note that node-def-filename should be changed to match the name of the file where you stored the node definition file just described, for example, definenode -f /tmp/my_nodes.def.

The CSM node database should now contain a list of all your nodes. For the small example cluster, this would consist of 16 compute nodes, a user node, a scheduler node, and a storage server. The CSM management server does not appear in the CSM database. You can see a list of the nodes with the lsnodes command. You can use the lsnode -F command to see a more detailed list that you can also use to back up your CSM node definitions. Redirect the output from this command to a file, and you can redefine the nodes by using the definenode -f command again.
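
A minimal backup-and-restore sketch based on that description follows; the backup file name is just an example:

# save the full node definitions to a stanza-format file
lsnode -F > /root/csm/nodedef.backup

# if the node database ever needs rebuilding, feed the file back to definenode
# definenode -f /root/csm/nodedef.backup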

Defining node groups

CSM allows nodes to be grouped together using arbitrary conditions that later allow you to use other CSM commands against a particular group of nodes. This can be particularly useful for referring to certain types of nodes with similar attributes.

CSM allows the use of both dynamic and static node groups. Static node groups contain a set list of node names that the administrator maintains manually. For example, when using static node groups, you must manually add any newly defined nodes to any relevant node groups. Dynamic node groups are of more use in a large cluster and, with some careful setup, can save significant time and minimize typing on the command line. A dynamic node group defines its members by a condition: any node that meets the condition, including a newly defined node, is automatically placed in that node group. Table 3 shows some example dynamic node group definitions.


Table 3. Dynamic node groups (each definition command is followed by a comment)

nodegrp -w "Hostname like 'node%'" ComputeNodes
    Create a ComputeNodes node group.
nodegrp -w "Hostname like 'schd%'" SchedulerNodes
    Create a SchedulerNodes node group.
nodegrp -w "Hostname like 'stor%'" StorageNodes
    Create a StorageNodes node group.
nodegrp -w "Hostname like 'user%'" UserNodes
    Create a UserNodes node group.
nodegrp -w "Hostname like 'node%' && ConsoleServerName=='term002'" Rack02
nodegrp -w "Hostname like 'node%' && ConsoleServerName=='term003'" Rack03
nodegrp -w "Hostname like 'node%' && ConsoleServerName=='term...'" Rack...
    Create node groups for each rack, based on Hostname and ConsoleServerName (assumes one console server per rack).
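
Because the per-rack definitions differ only in the console server number, they lend themselves to a small loop. The sketch below assumes racks 2 through 10 are served by console servers term002 through term010, matching the naming in the table; adjust the range and names to your layout:

# create Rack02..Rack10 dynamic node groups, one per console server
for i in $(seq -w 2 10); do
    nodegrp -w "Hostname like 'node%' && ConsoleServerName=='term0${i}'" "Rack${i}"
done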

Preparing Linux distributions

The CSM management server should contain the CD contents of all the Linux distributions you will install across the cluster. It should also be prepared for CSM installation on client machines, which you should do before any installations. CSM provides two commands to help, which must be run for each Linux distribution you are going to install.

To prepare the /csminstall/Linux tree with the required CSM data, run the copycsmpkgs command. For example:

copycsmpkgs -p /path/to/csm:/path/to/downloads InstallDistributionName=RedHatEL-WS InstallDistributionVersion=4 InstallServiceLevel=QU1
copycsmpkgs -p /path/to/csm:/path/to/downloads InstallDistributionName=RedHatEL-AS InstallDistributionVersion=3 InstallServiceLevel=QU5

To prepare the /csminstall/Linux tree with the required Linux distribution CDs, run the copycds command. For example:

copycds InstallDistributionName=RedHatEL-WS InstallDistributionVersion=4 InstallServiceLevel=QU1
copycds InstallDistributionName=RedHatEL-AS InstallDistributionVersion=3 InstallServiceLevel=QU5

Once you have the directory structure set up for the CDs, you can add any customized packages to install or upgrade during system installation, such as the following:

  • Copy to /csminstall/Linux/.../x86_64/install to ensure they are installed.
  • Copy to /csminstall/Linux/.../x86_64/updates to install only if an existing version is present.

You can create subdirectories with the name of a node group to install or update RPMs only on a particular group of nodes, if required.

Setting up CFM

CSM provides a mechanism called Configuration File Manager (CFM) that you can use to distribute common files across the cluster. If you set this up before node installation, the files are distributed during the installation process.

CFM can contain links to files in other directories on the management server. The links are followed, rather than copied, when the files are sent to the nodes. This is particularly useful for files such as the hosts file, as shown here:

mkdir /cfmroot/etc
ln -s /etc/hosts /cfmroot/etc/hosts

Instead of linking files, you can copy files into CFM. For example:

  • Copy the default NTP configuration file to /cfmroot/etc/ntp.conf.
  • Add a server entry for your management server using the following:
    mkdir /cfmroot/etc/ntp
    echo "management.server.full.name" > /cfmroot/etc/ntp/step-tickers

These files will be distributed across the cluster.

Use CFM when you need to transfer a few files to specific locations on the cluster. However, it is generally not a good idea to overload CFM with large numbers of files on a large cluster. For example, do not use CFM to install extra software from a tar archive. When used in this way across a large cluster, CFM takes a long time to run, making it painful to use. Stick to getting software installed using the supported installation mechanisms. For example, use RPMs instead of tar files to install software, and copy just the configuration files (that is, files that are likely to change over time) into CFM.

Customizing node build

CSM interfaces with the standard network installation mechanism for the operating system you plan to install on each node. For example, this might be NIM on AIX®, AutoYaST on Suse Linux, or kickstart on Red Hat Linux. Again, Red Hat is used here for the example of installing a node with kickstart and a kickstart configuration file.

Before starting kickstart setup, check that you have rpower control over all the nodes. This helps CSM get the UUID for each computer in later CSM versions. If the UUID is not available, or you have a CSM version older than 1.4.1.3, CSM tries to get the MAC address from the first Ethernet device of each node. In order for CSM MAC address collection to work, the terminal server configuration must match the settings in the BIOS of the nodes. Check the terminal server connection using the rconsole command. When you are satisfied you have established rpower control and a terminal server connection (if appropriate), continue kickstart configuration.
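
A quick way to perform both checks from the management server is sketched here, assuming the standard CSM rpower and rconsole options:

# query the power state of all defined nodes; errors indicate missing rpower control
rpower -a query

# open a remote console to one node to verify the terminal server settings
rconsole -n node001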

CSM provides default kickstart templates in the files /opt/csm/install/kscfg.tmpl.*. You can copy these templates to a different filename and customize them to better suit your environment, if required. The templates are a good starting point, and you should generally customize a template file rather than any other standard kickstart file. This is because the templates contain macros for a variety of CSM functions, such as running a post-installation script. CSM contributes to the kickstart process by analyzing your kickstart template file before producing the final kickstart file for each node to use. The final file contains all the parsed macros and includes full scripts for everything defined in the template.

Generally, you might want to change the template in the following ways:

  • Alter the disk partitioning, perhaps to include LVM
  • Change the default password
  • Edit the package list to be installed

Once you have edited the kickstart template, run the CSM setup command to generate the final kickstart file and perform the initial UUID or MAC address collection, as follows:

csmsetupks -n node001 -k /opt/csm/install/your.kickstart.file -x

Note: Use the -x switch because the copycds command was run earlier.

Updating drivers

If you have hardware to be installed in the cluster that the operating system does not directly support, there might still be drivers available. This procedure also works for driver upgrades, when required. CSM can include additional or replacement drivers automatically, both for the final installation and for the RAM disk used when installing the operating system.

When using the System x hardware example, you might want the performance and stability that the Broadcom network driver gives for the on-board Broadcom Ethernet adapters. To do this, follow these steps, which show how to use the Broadcom bcm5700 driver instead of the standard tg3 network driver that Red Hat Linux provides (a command sketch follows the steps):

  1. Because you are building a kernel module, ensure the kernel source installed for your target system matches the kernel level and type (UP or SMP).

  2. Download the latest bcm57xx driver from Broadcom (see Resources), and unpack the driver source code.

  3. Run make from the src directory of the bcm driver you have unpacked to build against the running kernel.

  4. Copy the built driver (bcm5700.ko for 2.6 kernels or bcm5700.o for 2.4 kernels) to /csminstall/csm/drivers/<kernel version>/x86_64 on the management server.

  5. If you want to build against other kernel versions, you can run make clean to wipe the current build and then run make LINUX=/path/to/your/kernel/source.
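
A minimal command sketch of steps 2 through 4 follows, assuming a 2.6-based target kernel; the archive name and the <kernel version> directory are placeholders to replace with the values for your driver release and target kernel:

# unpack the Broadcom driver source (archive name is an example)
tar xzf bcm5700-<version>.tar.gz
cd bcm5700-<version>/src

# build against the running kernel and copy the module to where CSM picks up drivers
make
cp bcm5700.ko /csminstall/csm/drivers/<kernel version>/x86_64/

# to build for another kernel, clean up and point make at that kernel's source tree
make clean
make LINUX=/path/to/your/kernel/source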

CSM uses the drivers from the directory structure under /csminstall/csm/drivers/<kernel version>/<architecture> when building RAM disk images. These images are used to boot the system during installation if the kernel version matches the RAM disk kernel version. Be careful when creating drivers for installation images: some kernel version numbers are different for the installation kernel. For example, Red Hat typically appends the word BOOT to the end of the version string. If the kernel version matches the running kernel of the installed system, the driver is made available to the running operating system as well. If you are unsure about the kernel versions, investigate inside the RAM disk images as described in the following section.

Modifying the RAM disk

This step is not typically recommended. However, some situations require it, such as when you are unsure about kernel versions. The command list below can also be helpful when investigating the RAM disk images while making new drivers and in other circumstances.

When storage is directly attached to a Red Hat Linux system through a host bus adapter (HBA), the storage driver (such as the qlogic qla2300 driver) might be loaded before the ServeRAID driver (used for the internal system disk that holds the operating system). If this happens, the installation takes place on the wrong disk: /dev/sda represents a LUN on the attached storage rather than the local disk. In this situation, beware of overwriting data on your SAN instead of on a local disk when installing a new operating system. To avoid this, remove the qlogic drivers from the default Red Hat RAM disk image CSM uses to create the boot image for installation. Of course, you need the drivers when the system is running, so use another mechanism, such as a post-installation script, to install the drivers for the running operating system. This is recommended because the default Red Hat qlogic drivers are generally not failover drivers.

For example, remove the qla2300 drivers from the default RAM disk image for Red Hat Enterprise Linux Advanced Server Version 3. Table 4 shows the commands to do this.


Table 4. RAM disk commands (each command is followed by its purpose)

cd /csminstall/Linux/RedHatEL-AS/3/x86_64/RedHatEL-AS3-QU5/images/pxeboot
    Change to the directory containing the RAM disk image you need to modify.
cp initrd.img initrd.img.orig
    Back up the original image.
mkdir mnt
    Create a mount point.
gunzip -S .img initrd.img
    Unpack the image (this produces an uncompressed file named initrd).
mount -o loop initrd mnt
    Mount the image on the mount point.
(manual step)
    Manually remove all references to qla[23]00 in mnt/modules/*.
cp mnt/modules/modules.cgz .
    Copy the modules archive from the image to the current directory.
gunzip -c modules.cgz | cpio -ivd
    Unpack the modules archive.
rm modules.cgz
    Delete the modules archive.
rm 2.4.21-32.EL/ia32e/qla2*
    Delete the qlogic modules from the unpacked modules archive.
find 2.4.21-32.EL -type f | cpio -o -H crc | gzip -c -9 > modules.cgz
    Pack up the modules archive with the qlogic modules removed.
rm -rf 2.4.21-32.EL
    Delete the unpacked modules tree.
mv -f modules.cgz mnt/modules
    Replace the old modules archive with the new one.
umount mnt
    Unmount the RAM disk image.
rmdir mnt
    Remove the mount point.
gzip -S .img initrd
    Pack up the RAM disk image again (this recreates initrd.img).

Note: To modify the RAM disk for Suse or SLES, make sure ips (the ServeRAID driver) appears before any HBA drivers in the INITRD_MODULES line of the /etc/sysconfig/kernel file. The Suse or SLES mechanism for creating RAM disk images loads the drivers in the order listed.
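
As an illustration only (the module list is an assumption for a system with ServeRAID and a qlogic HBA), the line might look like the following; rebuild the RAM disk afterwards with SUSE's mkinitrd:

# /etc/sysconfig/kernel -- ips listed before the HBA driver
INITRD_MODULES="ips qla2300"

# regenerate the initial RAM disk so the new ordering takes effect
mkinitrd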

Installing pre- and post-reboot scripts

Because each environment and cluster is different, you might need to apply some post-installation scripting to customize the operating system installation to your specific requirements. You can do this either before or after the first reboot into a newly installed system. This can be particularly useful for configuring secondary network adapters, and CSM provides an example script for this purpose. A secondary adapter configuration is required for the example cluster because of the dual network setup: one compute network and one storage network to each node. Follow these steps for secondary adapter configuration:

  1. Copy the default adapter configuration script CSM provides into the installprereboot script execution directory so that it runs at installation time, and ensure it is executable, as follows:
    cp /csminstall/csm/scripts/adaptor_config_Linux \
       /csminstall/csm/scripts/installprereboot/100_adapter_config._LinuxNodes
    chmod 755 /csminstall/csm/scripts/installprereboot/100_adapter_config._LinuxNodes

  2. Generate the adapter stanza file /csminstall/csm/scripts/data/Linux_adapter_stanza_file by writing a header such as the following:
    default:
    machine_type=secondary
    network_type=eth
    interface_name=eth1
    DEVICE=eth1
    STARTMODE=onboot
    ONBOOT=yes
    BROADCAST=192.168.1.255
    NETMASK=255.255.255.0
    MTU=9000

  3. This configures all secondary (eth1) adapters to start on boot, and it takes the default settings for the broadcast address, network mask, and MTU size. You can configure the computer-specific network details in additional stanza lines, in a similar way to the node definition files, as shown here:
    for node in $(lsnodes)
    do
        ip=$(grep $node /etc/hosts | head -n 1 | awk '{print $1}')
        echo -e "$node:\n  IPADDR=$ip" >> Linux_adapter_stanza_file
    done

  4. This appends output similar to the following to the adapter stanza file, configuring each computer with a different IP address:
    node001.cluster.com:
      IPADDR=192.168.1.1
    node002.cluster.com:
      IPADDR=192.168.1.2
    node003.cluster.com:
      IPADDR=192.168.1.3

Installing

There are two main shell environment variables applicable during node installation: CSM_FANOUT and CSM_FANOUT_DELAY. The former controls how many nodes are sent CSM instructions simultaneously, such as how many nodes are rebooted from the management server at once. The latter controls how long (in seconds) CSM waits before rebooting the next set of nodes to be installed. By default, these variables are set to a fanout of 16 nodes and a delay of 20 minutes. These default values are acceptable for most installations but can be increased for large clusters.
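
For example, to widen the fanout to a full rack of 32 nodes while keeping the 20-minute delay, you might export the variables in the shell from which you run the installation commands (the values are illustrative):

# send CSM instructions to 32 nodes at a time
export CSM_FANOUT=32

# wait 20 minutes (1200 seconds) before rebooting the next set of nodes
export CSM_FANOUT_DELAY=1200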

To install the cluster in the classic way, complete the following steps:

  1. Configure the installation and install the compute nodes as follows:
    csmsetupks -N ComputeNodes -k /opt/csm/install/your.kickstart.file.nodes -x
    installnode -N ComputeNodes

  2. Configure the installation and install the user nodes as follows:
    csmsetupks -N UserNodes -k /opt/csm/install/your.kickstart.file.user -x
    installnode -N UserNodes

  3. Configure the installation and install the scheduler nodes as follows:
    csmsetupks -N SchedulerNodes -k /opt/csm/install/your.kickstart.file.schd -x
    installnode -N SchedulerNodes

  4. Configure the installation and install the storage nodes as follows:
    csmsetupks -N StorageNodes -k /opt/csm/install/your.kickstart.file.stor -x
    installnode -N StorageNodes

For large cluster installations, use installation servers to stage and parallelize the installation process, as follows:

  1. Set the InstallServer attribute in CSM. For each node you want to install from an installation server, set the InstallServer attribute to the hostname of the installation server to use for that node. Any node without this attribute set defaults to installing from the central management server. In a large cluster environment where, for example, you have 32 nodes per rack, you could select the bottom node in each rack to be an installation server for the cluster. In this case, to configure node002 through node032 in rack 1 to install from node001 and have node001 install from the management server, use this command:
    chnode -n node002-node032 InstallServer=node001

  2. Create a dynamic node group containing all installation servers and another containing the clients as follows:
    nodegrp -w "InstallServer like '_%'" InstallServers
    nodegrp -w "InstallServer not like '_%'" InstallClients

  3. Configure the installation, and install the installation servers as follows:
    csmsetupks -N InstallServers -x
    installnode -N InstallServers

  4. Increase the CSM fanout value to reboot more nodes concurrently and take advantage of the increased bandwidth that using installation servers provides. In the 32-nodes-per-rack example, the most efficient value for the CSM fanout is 32 multiplied by the number of installation servers (or racks, if there is one installation server per rack). In the example, you could also increase the number of NFS threads on each installation server to 32 to scale NFS a little better within each rack. Using this method, you can install hundreds or thousands of machines concurrently.

  5. Configure the installation and install the installation clients as follows:
    csmsetupks -N InstallClients -x
    installnode -N InstallClients
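
Whichever approach you use, you can watch the installation progress from the management server. The sketch below assumes your CSM release includes the monitorinstall command:

# show the installation status of each node defined in CSM
monitorinstall

# optionally, watch an individual node's console during its installation
rconsole -n node001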

Conclusion

After completing all the steps detailed in the first two parts of this series, you have completed the hardware and software setup for your cluster, including setting up the systems management software and completing the node installation. The concluding parts of the series will guide you through setting up the storage back-end; specifically, performing storage hardware configuration and installing and configuring the IBM shared file system, General Parallel File System (GPFS).


Resources


About the authors

Graham White is a systems management specialist in the Linux Integration Centre within Emerging Technology Services at the IBM Hursley Park office in the United Kingdom. He is a Red Hat Certified Engineer, and he specializes in a wide range of open-source, open-standard, and IBM technologies. Graham's areas of expertise include LAMP, Linux, security, clustering, and all IBM Systems hardware platforms. He received a BSc with honors in Computer Science with Management Science from Exeter University in 2000.

Mandie Quartly is an IT specialist with the IBM UK Global Technology Services team. Mandie performs a cross-brand role, with current experience in both Intel and POWER® platform implementations as well as AIX and Linux (Red Hat and Suse). She specializes in the IBM product General Parallel File System (GPFS). She received a PhD in astrophysics from the University of Leicester in 2001.
