jdmail Dual-Machine Hot Standby Configuration (3)

Date: 2010-06-09 Source: libo20100322

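4.2.1 Hot Standby Software

The cluster configuration used in this deployment is shown in Figure 2. The setup consists of a pair of clustered servers (ha1 and ha2), both of which can access a disk enclosure containing multiple physical disks; the servers run in cold-standby mode. Application data must reside on a shared device that both nodes can access. The device can be a shared disk or a network file system. To prevent data corruption, the device itself should be mirrored or otherwise protected. This configuration is often called a shared-disk cluster, although it is really a shared-nothing architecture, because at any given moment each disk is accessed by only one node.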

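Figure 2. heartbeat cluster configuration in a production environment

In the test setup I used NFS as the shared-disk mechanism, as shown in Figure 3, but the option shown in Figure 2 is recommended, especially in a production environment. A cable connected directly between the serial ports of the two systems carries the heartbeat between the two nodes.

Figure 3. heartbeat cluster configuration using NFS as the shared file system

To suit Red Hat 9, install the heartbeat-1.0.4 RPM packages built for Red Hat 9. There are three main components: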

heartbeat-pils-1.0.4-2.rh.9.i386.rpm

heartbeat-stonith-1.0.4-2.rh.9.i386.rpm

heartbeat-1.0.4-2.rh.9.i386.rpm

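Installation may fail with dependency errors; the --nodeps option can be used to install anyway. In any case, install all the RPM packages that the error messages report as required (they are in the dependancies directory). The remaining RPM packages in the dependancies directory can be skipped if you do not need the related features.

4.2.2 Installing Heartbeat

Log in to the Red Hat Linux command line as root and enter: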

# rpm -ivh heartbeat-pils-1.0.4-2.rh.9.i386.rpm

# rpm -ivh heartbeat-stonith-1.0.4-2.rh.9.i386.rpm

# rpm -ivh heartbeat-1.0.4-2.rh.9.i386.rpm

Note that the packages must be installed in this order.
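
The ordering requirement can be captured in a small shell helper. This is a minimal sketch and not part of the original procedure: the function name install_in_order and the RPM_CMD override are assumptions introduced here so that the sequence can be rehearsed with echo before it is run for real as root.

```shell
#!/bin/sh
# Install packages in the order given, stopping at the first failure.
# RPM_CMD is a hypothetical override (not an rpm feature); set it to
# "echo" to preview the sequence without installing anything.
RPM_CMD="${RPM_CMD:-rpm -ivh}"

install_in_order() {
    for pkg in "$@"; do
        # $RPM_CMD is deliberately unquoted so "rpm -ivh" splits into words.
        $RPM_CMD "$pkg" || { echo "failed: $pkg" >&2; return 1; }
    done
}

# Usage (as root), in the order listed in this section:
# install_in_order heartbeat-pils-1.0.4-2.rh.9.i386.rpm \
#                  heartbeat-stonith-1.0.4-2.rh.9.i386.rpm \
#                  heartbeat-1.0.4-2.rh.9.i386.rpm
```

Stopping at the first failure keeps a later package from being attempted when the one it depends on did not install.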

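4.2.3 Configuring Heartbeat

Three files need to be configured: ha.cf, haresources (which must be identical on every node), and authkeys. They should be placed in the /etc/ha.d directory. Sample configurations are in the /usr/share/doc/heartbeat-1.0.4 directory; you can edit them and copy them into /etc/ha.d.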
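
The copy step might be scripted as follows. This is a sketch under stated assumptions: the function name install_ha_config is invented for illustration, and the chmod reflects the common requirement that authkeys be readable by root only, which this article does not itself discuss.

```shell
#!/bin/sh
# Copy the three sample files into the heartbeat configuration directory.
# Both directories are parameters so the copy can be rehearsed in a
# scratch directory before touching /etc/ha.d.
install_ha_config() {
    src="$1"    # e.g. /usr/share/doc/heartbeat-1.0.4
    dst="$2"    # e.g. /etc/ha.d
    for f in ha.cf haresources authkeys; do
        cp "$src/$f" "$dst/$f" || return 1
    done
    # Assumption: authkeys should be readable and writable by its owner only.
    chmod 600 "$dst/authkeys"
}

# Usage (as root):
# install_ha_config /usr/share/doc/heartbeat-1.0.4 /etc/ha.d
```

The copied files still need to be edited for the local cluster, as described in the following subsections.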

4.2.3.1. Configuring /etc/ha.d/ha.cf

debugfile /var/log/ha-debug

logfile /var/log/ha-log

logfacility local0

keepalive 2

deadtime 10

warntime 8

initdead 120

#nice_failback on

#hopfudge 1

#baud 19200

# serial serialportname ...

#serial /dev/ttyS0 # Linux

#serial /dev/cuaa0 # FreeBSD

#serial /dev/cua/a # Solaris

udpport 694

#bcast eth0 # Linux

bcast eth1 # Linux

##mcast eth1 225.0.0.1 694 1 0

##ucast eth0 192.168.0.112

watchdog /dev/watchdog

# node nodename ... -- must match uname -n

node ha-1

node ha-2

ping 192.168.2.1

This configuration file tells heartbeat which media to use and how to configure them. The options you are likely to use in ha.cf are described below:

serial /dev/ttyS0

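Use a serial heartbeat. If you do not use a serial heartbeat, you must choose another medium, such as an Ethernet broadcast (bcast) heartbeat. If your serial heartbeat uses a different port, replace /dev/ttyS0 with that serial device.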

watchdog /dev/watchdog

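Optional: the watchdog function lets a system that is still minimally functioning, but no longer providing a heartbeat, reboot itself about one minute after the failure. This helps avoid the case where a machine resumes its heartbeat after it has already been declared dead. To use this feature, you must load the "softdog" kernel module and create the actual device file. First, run "insmod softdog" to load the module. Then run "grep misc /proc/devices" and note the number shown (it should be 10). Next, run "cat /proc/misc | grep watchdog" and note the number in the output (it should be 130). Now create the device file with this command: "mknod /dev/watchdog c 10 130".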
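
Rather than reading the two numbers by eye, the lookup can be scripted. The sketch below only prints the mknod command and does not run it; the function name watchdog_mknod_cmd and its optional file arguments are assumptions added for illustration, so that the lookup can also be tried against saved copies of the two /proc files.

```shell
#!/bin/sh
# Print the mknod command for /dev/watchdog, taking the major number of
# the "misc" driver from /proc/devices and the minor number of
# "watchdog" from /proc/misc. Run "insmod softdog" first.
watchdog_mknod_cmd() {
    devices="${1:-/proc/devices}"
    misc="${2:-/proc/misc}"
    major=$(awk '$2 == "misc" { print $1; exit }' "$devices")
    minor=$(awk '$2 == "watchdog" { print $1; exit }' "$misc")
    if [ -z "$major" ] || [ -z "$minor" ]; then
        echo "watchdog not registered; is the softdog module loaded?" >&2
        return 1
    fi
    echo "mknod /dev/watchdog c $major $minor"
}

# Usage: review the printed command, then run it as root.
# watchdog_mknod_cmd
```

Printing the command instead of executing it leaves the final, privileged step under the administrator's control.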

bcast eth1

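Specifies the network interface used for broadcast heartbeats, here eth1 (change it to eth0, eth2, or whichever interface you use).

keepalive 2

Sets the interval between heartbeats to 2 seconds.

warntime 10

How long to wait without a heartbeat before a "late heartbeat" warning is written to the log. (The sample ha.cf above uses 8.)

deadtime 30

Declares the node dead after 30 seconds without a heartbeat. (The sample ha.cf above uses 10.)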

initdead 120

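In some configurations, the network takes some time to come up after a node reboots. This interval is separate from "deadtime" and must be handled on its own. It should be at least twice the normal deadtime.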
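
The "at least twice deadtime" rule is easy to check mechanically. The following sketch assumes both values are plain integers in seconds, as in the sample ha.cf above (deadtime 10, initdead 120); the function names ha_directive and check_initdead are invented here for illustration.

```shell
#!/bin/sh
# Print the value of the last uncommented occurrence of directive $2
# in the ha.cf-style file $1.
ha_directive() {
    awk -v key="$2" '$1 == key { val = $2 } END { print val }' "$1"
}

# Succeed only if initdead is at least twice deadtime.
# Assumes both values are whole numbers of seconds.
check_initdead() {
    dead=$(ha_directive "$1" deadtime)
    init=$(ha_directive "$1" initdead)
    if [ -z "$dead" ] || [ -z "$init" ]; then
        echo "deadtime or initdead is not set" >&2
        return 2
    fi
    [ "$init" -ge $((dead * 2)) ]
}

# Usage:
# check_initdead /etc/ha.d/ha.cf && echo "initdead is large enough"
```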

hopfudge 1

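Optional: for ring topologies, the number of hops allowed in addition to the number of nodes in the cluster.

baud 19200

The serial port speed (bps).

udpport 694

The port number used for bcast and ucast communication. 694 is the default and the port officially registered with IANA.

nice_failback on

Optional: for those familiar with Tru64 Unix, heartbeat by default behaves like the "favored member" mode. The primary node holds all resources until it goes down, at which point the backup node takes over. Once the primary node is working again, it reclaims all resources from the backup node. This option prevents the primary node from reclaiming the cluster resources after it recovers from a failure.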

node linuxha1.linux-ha.org

Mandatory: the machine name of a cluster node, as shown by `uname -n`.

node linuxha2.linux-ha.org

Mandatory: the machine name of a cluster node, as shown by `uname -n`.
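
Because a node name that does not match `uname -n` is an easy mistake to make, it may be worth checking before starting heartbeat. This is a minimal sketch; the function name node_listed is an assumption, and the optional second argument exists only so that a name other than the local one can be checked.

```shell
#!/bin/sh
# Succeed if the given host name (default: this machine's `uname -n`)
# appears in an uncommented "node" directive of the ha.cf file $1.
node_listed() {
    cf="$1"
    name="${2:-$(uname -n)}"
    awk -v n="$name" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$cf"
}

# Usage:
# node_listed /etc/ha.d/ha.cf || echo "this host is not listed in ha.cf" >&2
```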

respawn userid cmd

Optional: lists a command to be spawned and monitored. For example, to spawn the ccm daemon, add the following:

respawn hacluster /usr/lib/heartbeat/ccm

This tells heartbeat to run the command as a trusted userid (hacluster in our example) and to monitor the health of the process, restarting it if it dies. For ipfail, the line is:
respawn hacluster /usr/lib/heartbeat/ipfail

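NOTE: If the process dies with exit code 100, it will not be respawned.

ping ping1.linux-ha.org ping2.linux-ha.org ....

Optional: specifies ping nodes. These nodes are not members of the cluster. They are used to check network connectivity so that modules such as ipfail can run.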

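4.2.3.2. Configuring haresources

Once ha.cf is configured, the next step is the haresources file, which specifies the services the cluster provides and which node is the default primary. Note that this file must be identical on all nodes.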

# This is a list of resources that move from machine to machine as

# nodes go down and come up in the cluster. Do not include

# "administrative" or fixed IP addresses in this file.

# <VERY IMPORTANT NOTE>

# The haresources files MUST BE IDENTICAL on all nodes of the cluster.

# The node names listed in front of the resource group information

# is the name of the preferred node to run the service. It is

# not necessarily the name of the current machine. If you are running

# nice_failback OFF then these services will be started

# up on the preferred nodes - any time they're up.

# If you are running with nice_failback ON, then the node information

# will be used in the case of a simultaneous start-up.

# BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.

# If your files are different then almost certainly something

# won't work right.

# </VERY IMPORTANT NOTE>

# We refer to this file when we're coming up, and when a machine is being

# taken over after going down.

# You need to make this right for your installation, then install it in

# /etc/ha.d

# Each logical line in the file constitutes a "resource group".

# A resource group is a list of resources which move together from

# one node to another - in the order listed. It is assumed that there

# is no relationship between different resource groups. These

# resources in a resource group are started left-to-right, and stopped

# right-to-left. Long lists of resources can be continued from line

# to line by ending the lines with backslashes ("\").

# These resources in this file are either IP addresses, or the name

# of scripts to run to "start" or "stop" the given resource.

# The format is like this:

#node-name resource1 resource2 ... resourceN

# If the resource name contains an :: in the middle of it, the

# part after the :: is passed to the resource script as an argument.

# Multiple arguments are separated by the :: delimiter

# In the case of IP addresses, the resource script name IPaddr is

# implied.

# For example, the IP address 135.9.8.7 could also be represented

# as IPaddr::135.9.8.7

# The given IP address is directed to an interface which has a route

# to the given address. This means you have to have a net route

# set up outside of the High-Availability structure. We don't set it

# up here -- we key off of it.

# The broadcast address for the IP alias that is created to support

# an IP address defaults to the highest address on the subnet.

# The netmask for the IP alias that is created defaults to the same

# netmask as the route that it selected in the step above.

# The base interface for the IPalias that is created defaults to the

# same interface as the route that it selected in the step above.

# If you want to specify that this IP address is to be brought up

# on a subnet with a netmask of 255.255.255.0, you would specify

# this as IPaddr::135.9.8.7/24 .

# If you wished to tell it that the broadcast address for this subnet

# was 135.9.8.210, then you would specify that this way:

# IPaddr::135.9.8.7/24/135.9.8.210

# If you wished to tell it that the interface to add the address to

# is eth0, then you would need to specify it this way:

# IPaddr::135.9.8.7/24/eth0

# And this way to specify both the broadcast address and the

# interface:

# IPaddr::135.9.8.7/24/eth0/135.9.8.210

# The IP addresses you list in this file are called "service" addresses,

# since they're the publicly advertised addresses that clients

# use to get at highly available services.

# For a hot/standby (non load-sharing) 2-node system with only

# a single service address,

# you will probably only put one system name and one IP address in here.

# The name you give the address to is the name of the default "hot"

# system.

# Where the nodename is the name of the node which "normally" owns the

# resource. If this machine is up, it will always have the resource

# it is shown as owning.

# The string you put in for nodename must match the uname -n name

# of your machine. Depending on how you have it administered, it could

# be a short name or a FQDN.

#-------------------------------------------------------------------

# Simple case: One service address, default subnet and netmask

# No servers that go up and down with the IP address

#just.linux-ha.org 135.9.216.110

#-------------------------------------------------------------------

# Assuming the administrative addresses are on the same subnet...

# A little more complex case: One service address, default subnet

# and netmask, and you want to start and stop http when you get

# the IP address...

#just.linux-ha.org 135.9.216.110 http

#-------------------------------------------------------------------

# A little more complex case: Three service addresses, default subnet

# and netmask, and you want to start and stop http when you get

# the IP address...

#just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd

#-------------------------------------------------------------------

# One service address, with the subnet, interface and bcast addr

# explicitly defined.

#just.linux-ha.org 135.9.216.3/28/eth0/135.9.216.12 httpd

#-------------------------------------------------------------------

# An example where a shared filesystem is to be used.

# Note that multiple arguments are passed to this script using

# the delimiter '::' to separate each argument.

#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2

# Regarding the node-names in this file:

# They must match the names of the nodes listed in ha.cf, which in turn

# must match the `uname -n` of some node in the cluster. So they aren't

# virtual in any sense of the word.

ha-1 192.168.2.50 jdmail
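
The file's comments state that the resources in a group are started left-to-right and stopped right-to-left. The sketch below illustrates that ordering for a single haresources line; the function name resource_order is invented for illustration, and the function only prints the order without starting or stopping anything.

```shell
#!/bin/sh
# Given one haresources line, print the preferred node, the resources
# in start order (left-to-right), and in stop order (right-to-left).
resource_order() {
    set -- $1              # split the line on whitespace
    node=$1
    shift
    start="$*"
    stop=""
    for r in "$@"; do
        stop="$r${stop:+ $stop}"   # prepend, which reverses the list
    done
    echo "node: $node"
    echo "start: $start"
    echo "stop: $stop"
}

# Usage, with the shared-filesystem example from the comments above:
# resource_order "node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2"
```

For that example line, the service address comes up before the filesystem is mounted, and on takeover the filesystem is released before the address.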

Heartbeat searches the following paths for a startup script of the same name:
/etc/ha.d/resource.d
/etc/rc.d/init.d

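The service scripts here follow standard init script syntax, so Heartbeat can conveniently start and stop the standard service daemons under /etc/rc.d/init.d.

For this project, a startup script for Jindi Mail, named jdmail, needs to be added under /etc/ha.d/resource.d/: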

/jdmail/startjd.sh &

/jdmail/web/bin/startup.sh &
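
Heartbeat calls a resource script with a start or stop argument, so a script that launches the service regardless of its argument cannot release the resource during a failover. The following is a sketch of an init-style wrapper, not the author's script: the two start commands come from the lines above, while the two stop commands are hypothetical placeholders that must be replaced with whatever the jdmail installation actually provides.

```shell
#!/bin/sh
# Sketch of /etc/ha.d/resource.d/jdmail with start/stop dispatch.
# Each command can be overridden through the environment.
JD_START="${JD_START:-/jdmail/startjd.sh}"
WEB_START="${WEB_START:-/jdmail/web/bin/startup.sh}"
JD_STOP="${JD_STOP:-/jdmail/stopjd.sh}"               # HYPOTHETICAL path
WEB_STOP="${WEB_STOP:-/jdmail/web/bin/shutdown.sh}"   # HYPOTHETICAL path

jdmail_ctl() {
    case "$1" in
        start)
            # Launch both services in the background, as the original does.
            "$JD_START" &
            "$WEB_START" &
            ;;
        stop)
            # Stop in the reverse of the start order.
            "$WEB_STOP"
            "$JD_STOP"
            ;;
        *)
            echo "Usage: jdmail {start|stop}" >&2
            return 1
            ;;
    esac
}

# When installed as the resource script, dispatch on the first argument:
# jdmail_ctl "$1"
```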