
A Strange Problem with LVS + Keepalived + Nginx

2012-07-23 


Recently our project's server architecture needed an upgrade. For high availability, we decided to use keepalived to run an active/standby pair of LVS servers, with LVS acting as the load balancer for both the DB and the front-end Nginx.

My environment:
VIP 10.8.12.200
PostgreSql RealServer1 10.8.12.208
PostgreSql RealServer2 10.8.12.209
Tomcat 1 10.8.12.203
Tomcat 2 10.8.12.204
LVS Server1 & Nginx RealServer1 10.8.12.201
LVS Server2 & Nginx RealServer2 10.8.12.202

gateway 10.8.12.254

All of the servers above have a single NIC and run Ubuntu 11.04 Server.

These are all VMware virtual machines. Since the number of servers available in production is limited, the LVS servers and the Nginx RealServers share the same machines: ipvsadm and keepalived are installed on 10.8.12.201 (LVS Server1 & Nginx RealServer1) and 10.8.12.202 (LVS Server2 & Nginx RealServer2).


I prepared two schemes, as follows:
Scheme 1:
The front end uses Nginx as a reverse proxy that also separates static and dynamic content, load balancing to the back-end tomcat cluster and web servers. On the back end, LVS serves as the load balancer for the PostgreSql servers, and keepalived provides active/standby failover.
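The Nginx side of this scheme is not shown here; a minimal sketch of such a reverse proxy with static/dynamic separation might look like the following (the upstream name tomcat_cluster and the /static/ path are illustrative assumptions, not taken from the actual setup):

    # Hypothetical Nginx sketch: serve static files locally, proxy the rest to tomcat.
    upstream tomcat_cluster {
        server 10.8.12.203:8080;
        server 10.8.12.204:8080;
    }

    server {
        listen 80;

        # Static content answered directly by Nginx (path is assumed).
        location /static/ {
            root /var/www;
        }

        # Dynamic requests go to the tomcat back end.
        location / {
            proxy_pass http://tomcat_cluster;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }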

The configuration file on the keepalived master:

global_defs {
    router_id Nginx_Id_1
}

vrrp_script Monitor_Nginx {
    script "/usr/local/keepalived/etc/keepalived/scripts/monitor_nginx.sh"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 33
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # VIP
    virtual_ipaddress {
        10.8.12.200
    }
    track_script {
        Monitor_Nginx
    }
}

virtual_server 10.8.12.200 5432 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 0
    protocol TCP

    real_server 10.8.12.208 5432 {
        weight 1
        TCP_CHECK {
            connect_port 5432
            connect_timeout 10
        }
    }
    real_server 10.8.12.209 5432 {
        weight 1
        TCP_CHECK {
            connect_port 5432
            connect_timeout 10
        }
    }
}
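The monitor_nginx.sh health-check script referenced by vrrp_script is not included here; a typical script for this job might look like the following sketch (an assumption, not the actual script from this setup):

    #!/bin/bash
    # Hypothetical monitor_nginx.sh: if nginx has died, try to restart it once;
    # if the restart fails, stop keepalived so the VIP fails over to the peer.
    if [ -z "$(pidof nginx)" ]; then
        /etc/init.d/nginx start
        sleep 2
        if [ -z "$(pidof nginx)" ]; then
            killall keepalived
        fi
    fi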


The keepalived backup configuration file is omitted here...

The LVS virtual server table on the LVS Server (as reported by ipvsadm):
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.8.12.200:5432 rr
  -> 10.8.12.208:5432             Route   1      0          0
  -> 10.8.12.209:5432             Route   1      0          0


The LVS real-server script on the PostgreSql RealServers:

#!/bin/bash
# Description: LVS RealServer start/stop script (DR mode)
VIP=10.8.12.200    # the virtual IP, matching the keepalived VIP and the lo route below
LVS_TYPE=DR
. /lib/lsb/init-functions
case "$1" in
    start)
        echo "start LVS of REALServer"
        # Bind the VIP to a loopback alias so this host accepts DR-forwarded traffic.
        /sbin/ifconfig lo:0 $VIP broadcast $VIP netmask 255.255.255.255 up
        /sbin/route add -host $VIP dev lo:0
        # Suppress ARP for the VIP so only the director answers ARP requests.
        echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
        ;;
    stop)
        route del -host $VIP dev lo:0
        /sbin/ifconfig lo:0 down
        echo "close LVS RealServer"
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
exit 0
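After the script is run with start, the lo:0 alias and the ARP sysctls can be double-checked; a quick verification, assuming the script is saved as /etc/init.d/realserver (the path is hypothetical):

    /etc/init.d/realserver start           # bind the VIP on lo:0
    ifconfig lo:0                          # should show 10.8.12.200 with a /32 mask
    sysctl net.ipv4.conf.all.arp_ignore    # should print 1
    sysctl net.ipv4.conf.all.arp_announce  # should print 2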



The routing table on the DB RealServers:
Kernel IP routing table
Destination   Gateway       Genmask          Flags Metric Ref Use Iface
10.8.12.200   *             255.255.255.255  UH    0      0   0   lo
10.8.12.0     *             255.255.255.0    U     0      0   0   eth0
default       10.8.12.254   0.0.0.0          UG    100    0   0   eth0


This scheme tested out fine.



Scheme 2:
The front end uses LVS as the load balancer for Nginx; Nginx then acts as the reverse proxy, again doing static/dynamic separation and load balancing to the back-end tomcat cluster and web servers, with keepalived providing active/standby failover for LVS. On the back end, LVS load balances the PostgreSql servers. The LVS server for PostgreSql and the LVS server for Nginx are one and the same, just listening on different ports. The LVS servers and the Nginx RealServers share the same machines; the PostgreSql RealServers are two separate machines.
Compared with scheme 1, scheme 2 merely adds an LVS load-balancing layer in front of Nginx; the role of Nginx itself is unchanged.

The configuration file on the keepalived master:

global_defs {
    router_id Nginx_Id_1
}

vrrp_script Monitor_Nginx {
    script "/usr/local/keepalived/etc/keepalived/scripts/monitor_nginx.sh"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 33
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # VIP
    virtual_ipaddress {
        10.8.12.200
    }
    track_script {
        Monitor_Nginx
    }
}

virtual_server 10.8.12.200 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 60
    protocol TCP

    real_server 10.8.12.201 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 10
        }
    }
    real_server 10.8.12.202 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 10
        }
    }
}

virtual_server 10.8.12.200 5432 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 0
    protocol TCP

    real_server 10.8.12.208 5432 {
        weight 1
        TCP_CHECK {
            connect_port 5432
            connect_timeout 10
        }
    }
    real_server 10.8.12.209 5432 {
        weight 1
        TCP_CHECK {
            connect_port 5432
            connect_timeout 10
        }
    }
}

As you can see, the only difference from scheme 1 is the added LVS configuration for 10.8.12.200 port 80.
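To see which node currently holds the VIP after keepalived starts (useful later when testing failover), the interface addresses and the IPVS table can be checked on each node, for example:

    ip addr show eth0 | grep 10.8.12.200   # the VIP appears only on the active node
    ipvsadm -Ln                            # dump the current virtual server table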


At this point, the LVS virtual server table looks like this:
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.8.12.200:80 rr persistent 60
  -> 10.8.12.201:80               Route   1      0          0
  -> 10.8.12.202:80               Route   1      0          0
TCP  10.8.12.200:5432 rr
  -> 10.8.12.208:5432             Route   1      0          0
  -> 10.8.12.209:5432             Route   1      0          0


Since the LVS servers double as the Nginx RealServer nodes, the following LVS script (the same real-server script as in scheme 1) was also created on 10.8.12.201 (LVS Server1 & Nginx RealServer1) and 10.8.12.202 (LVS Server2 & Nginx RealServer2):

#!/bin/bash
# Description: LVS RealServer start/stop script (DR mode)
VIP=10.8.12.200    # the virtual IP, matching the keepalived VIP and the lo route below
LVS_TYPE=DR
. /lib/lsb/init-functions
case "$1" in
    start)
        echo "start LVS of REALServer"
        # Bind the VIP to a loopback alias so this host accepts DR-forwarded traffic.
        /sbin/ifconfig lo:0 $VIP broadcast $VIP netmask 255.255.255.255 up
        /sbin/route add -host $VIP dev lo:0
        # Suppress ARP for the VIP so only the director answers ARP requests.
        echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
        ;;
    stop)
        route del -host $VIP dev lo:0
        /sbin/ifconfig lo:0 down
        echo "close LVS RealServer"
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
        echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
exit 0

The routing table on the Nginx RealServers:
Kernel IP routing table
Destination   Gateway       Genmask          Flags Metric Ref Use Iface
10.8.12.200   *             255.255.255.255  UH    0      0   0   lo
10.8.12.0     *             255.255.255.0    U     0      0   0   eth0
default       10.8.12.254   0.0.0.0          UG    100    0   0   eth0


The DB RealServer nodes are configured exactly as in scheme 1, so that part is omitted here.


The symptoms:
Right after the LVS servers start, accessing 10.8.12.200 works fine.
After the servers have run for a while (really just a few minutes, with no page requests in between), accessing 10.8.12.200 fails with a 502, and repeated refreshes don't help. The LVS server's virtual server table is unchanged, and its routing table is also normal. I then tried accessing nginx on 10.8.12.201 directly, which worked, and accessing 10.8.12.203:8080 (a back-end tomcat) directly, which also worked. In other words, LVS load balancing to the DB RealServers was still fine at that point; only load balancing to the Nginx RealServers was broken. After stopping keepalived on the master, the backup took over correctly, and once it had, 10.8.12.200 was reachable again. After starting keepalived on the master again, the master reclaimed the VIP, but 10.8.12.200 once more became unreachable. After an ipvsadm -C, access was normal again (a chance discovery; I really couldn't figure out why).
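When chasing this kind of failure, inspecting the live IPVS connection table alongside the rule table helps narrow things down; some commands worth running in this situation (a general diagnostic sketch, not taken from the original debugging session):

    ipvsadm -Ln                   # virtual server rules, forwarding method, weights
    ipvsadm -Lnc                  # live connection entries, including persistence templates
    curl -I http://10.8.12.201/   # bypass LVS and hit one Nginx RealServer directly
    ipvsadm -C                    # flush all rules (the step that accidentally restored access here)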


Thoughts:
Tutorials online say an LVS server and a RealServer node can perfectly well share one machine, but here only the co-located Nginx RealServers are unreachable while the DB RealServers work fine, and I genuinely can't tell where the problem lies.


The problem is solved; it was a keepalived configuration issue. Since nginx is already responsible for load balancing the port-80 traffic, the LVS configuration for port 80 was redundant, and deleting it fixed everything. A likely explanation for the earlier failures: in DR mode each director was also a real server of the same port-80 virtual service, so a director could schedule an incoming request to its peer, whose own identical IPVS rules would match the packet (still addressed to the VIP) and try to schedule it yet again instead of letting the local nginx answer it.
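With the port-80 virtual_server block removed, the load-balancing portion of the keepalived configuration shrinks back to the PostgreSql entry from scheme 1:

    virtual_server 10.8.12.200 5432 {
        delay_loop 6
        lb_algo rr
        lb_kind DR
        persistence_timeout 0
        protocol TCP

        real_server 10.8.12.208 5432 {
            weight 1
            TCP_CHECK {
                connect_port 5432
                connect_timeout 10
            }
        }
        real_server 10.8.12.209 5432 {
            weight 1
            TCP_CHECK {
                connect_port 5432
                connect_timeout 10
            }
        }
    }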

