Symptoms:
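After the system was upgraded from Fedora 8 (FC8) to Fedora 17 (FC17), LVS in tunnel mode stopped working correctly, even though the LVS configuration and the architecture were unchanged.
Specific symptoms (FC17, kernel 3.3.7):
1. Pinging the VIP from the client works.
2. Telnet to the application port 80 on the VIP times out.
3. Exactly the same architecture and configuration works correctly under FC8 with kernel 2.6.26.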
System architecture diagram:
2 The complete data flow in the diagram above should be:
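2.1 The client accesses the LVS VIP. At this point the packet's source address is the client address 192.168.91.196 and its destination address is the LVS VIP 192.168.91.204.
2.2 Using its configured scheduling algorithm, LVS decides which real server should receive the request. Because tunnel mode is used, the packet forwarded downstream is encapsulated: the source address is the internal address of the LVS director, 192.168.91.209, and the destination address is the real server's eth0 address, 192.168.91.78 (assuming the request is assigned to the real server on the left of the diagram).
2.3 The real server receives the packet addressed to its own IP on eth0 and passes it up the stack. The protocol stack inspects the payload, finds an IPIP-encapsulated packet, and hands the decapsulated packet to tunl0. The packet seen on tunl0 is therefore the decapsulated one, with the real client address 192.168.91.196 as source and the VIP 192.168.91.204 as destination. tunl0 accepts the packet and passes it up to TCP.
2.4 The reply to this request takes the following path: the application generates the response and hands it to the TCP/IP layer, and the IP layer builds the packet. Its source address is the VIP 192.168.91.204 and its destination address is the real client address 192.168.91.196.
2.5 The packet is then routed according to the matching route. In the diagram above this is the route 192.168.0.0/16 via 192.168.91.65 dev eth0, so the packet is sent through eth0 to the gateway C-Router 192.168.91.65 (if the kernel allows it), goes straight out to the outside world, and is finally received by the real client.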
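The address changes in steps 2.1 to 2.5 can be traced with a small model. This is only an illustrative sketch: the `IPPacket` class and the helper functions are hypothetical names invented here, and they model nothing but the source and destination addresses from the walkthrough, not real IPIP headers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IPPacket:
    """Hypothetical model of a packet: addresses only, no real headers."""
    src: str
    dst: str
    # For an IPIP packet the payload is another IP packet; None means plain payload.
    inner: Optional["IPPacket"] = None


def encapsulate(packet: IPPacket, director_ip: str, real_server_ip: str) -> IPPacket:
    """Step 2.2: the director wraps the client packet in an outer IP header."""
    return IPPacket(src=director_ip, dst=real_server_ip, inner=packet)


def decapsulate(packet: IPPacket) -> IPPacket:
    """Step 2.3: the real server strips the outer header; tunl0 sees the inner packet."""
    if packet.inner is None:
        raise ValueError("not an IPIP packet")
    return packet.inner


def reply_to(packet: IPPacket) -> IPPacket:
    """Steps 2.4 and 2.5: the reply swaps source and destination."""
    return IPPacket(src=packet.dst, dst=packet.src)


CLIENT, VIP = "192.168.91.196", "192.168.91.204"
DIRECTOR, REAL_SERVER = "192.168.91.209", "192.168.91.78"

request = IPPacket(src=CLIENT, dst=VIP)                 # step 2.1
on_eth0 = encapsulate(request, DIRECTOR, REAL_SERVER)   # step 2.2
on_tunl0 = decapsulate(on_eth0)                         # step 2.3
reply = reply_to(on_tunl0)                              # steps 2.4 and 2.5
```

Note that the reply carries the VIP as its source and never passes back through the director, which is why the return path depends entirely on the real server's own routing table.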
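Analysis
1 Pinging the VIP 192.168.91.204 from the client succeeds, which shows that the VIP is up.
2 Telnet repeatedly from the client to port 80 on 192.168.91.204 to simulate application access, and at the same time capture packets on the real server on the left of the diagram. The results:
. On eth0, the encapsulated IP packets sent from the LVS director to the real server are captured, with source and destination addresses 192.168.91.209 and 192.168.91.78.
. On tunl0, the decapsulated IP packets are captured, with source and destination addresses 192.168.91.196 and 192.168.91.204.
So the packets from the client are correctly encapsulated by the LVS director and forwarded to the back-end real server, and the IP stack on the real server correctly receives the encapsulated packets, decapsulates them, and hands them to the IPIP stack. However, no reply packets are captured on eth0, so the problem lies on the return path.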
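3 After confirming that the TCP port on the real server was listening normally, suspicion fell on the reply packets: either they were not being routed correctly or they were simply being dropped. The kernel parameter rp_filter quickly came to mind (for what rp_filter is, see http://en./wiki/Reverse_path_filtering). In short, when rp_filter is enabled and the server receives a packet on some interface, the kernel checks that a reply addressed to that packet's source would be sent out through the same interface. If the route used for the reply has a different outgoing interface, the check fails and the kernel drops the received packet, so no reply is ever sent.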
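To make the check concrete, here is a simplified model of strict reverse path filtering. It is a sketch under stated assumptions, not the kernel's actual implementation: `lookup_interface` and `strict_rp_check` are hypothetical helpers, the routing table is reduced to (prefix, interface) pairs, and the only route used is the 192.168.0.0/16 route on eth0 mentioned in step 2.5.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (destination prefix, outgoing interface)


def lookup_interface(routes: List[Route], address: str) -> Optional[str]:
    """Longest-prefix match: return the interface used to reach `address`."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


def strict_rp_check(routes: List[Route], src: str, in_iface: str) -> bool:
    """A packet passes only if the route back to its source uses the
    interface it arrived on (simplified strict mode)."""
    return lookup_interface(routes, src) == in_iface


# Assumed routing table of the real server: only the route from step 2.5.
ROUTES = [("192.168.0.0/16", "eth0")]

# Encapsulated packet from the director, received on eth0: passes.
outer_ok = strict_rp_check(ROUTES, "192.168.91.209", "eth0")
# Decapsulated client packet, received on tunl0: fails, because the
# route back to the client goes out through eth0.
inner_ok = strict_rp_check(ROUTES, "192.168.91.196", "tunl0")
```

Under these assumptions the outer packet passes on eth0 while the inner packet fails on tunl0, which matches a capture that shows packets arriving but no replies leaving.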
So check the kernel settings on the real server:
$ sudo /sbin/sysctl -a|fgrep .rp_filter
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.tunl0.rp_filter = 1
<--- rp_filter is indeed enabled on tunl0 and needs to be turned off.
#### The command to turn it off:
echo 0 > /proc/sys/net/ipv4/conf/tunl0/rp_filter
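4 After changing the rp_filter setting, connecting from the client to port 80 on the VIP works again, so the problem is solved. One question remains unexplained, though: why did the same configuration cause no trouble under FC8?
5 A quick search online turned up the following:
Between Linux kernel 2.6.30 and kernel 2.6.31, both the definition of the rp_filter kernel parameter and the algorithm for computing its effective value changed.
The changes are: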
Before kernel 2.6.30 [2]:
rp_filter - BOOLEAN
    1 - do source validation by reversed path, as specified in RFC1812
        Recommended option for single homed hosts and stub network
        routers. Could cause troubles for complicated (not loop free)
        networks running a slow unreliable protocol (sort of RIP),
        or using static routes.

    0 - No source validation.

    conf/all/rp_filter must also be set to TRUE to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.
Since kernel 2.6.31 [3]:
rp_filter - INTEGER
    0 - No source validation.
    1 - Strict mode as defined in RFC3704 Strict Reverse Path
        Each incoming packet is tested against the FIB and if the interface
        is not the best reverse path the packet check will fail.
        By default failed packets are discarded.
    2 - Loose mode as defined in RFC3704 Loose Reverse Path
        Each incoming packet's source address is also tested against the FIB
        and if the source address is not reachable via any interface
        the packet check will fail.

    Current recommended practice in RFC3704 is to enable strict mode
    to prevent IP spoofing from DDos attacks. If using asymmetric routing
    or other complicated routing, then loose mode is recommended.

    conf/all/rp_filter must also be set to non-zero to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.
Before kernel 2.6.31 :
Actual rp_filter for <interface> = net.ipv4.conf.<interface>.rp_filter AND net.ipv4.conf.all.rp_filter
I.e. reverse path filtering is enabled in strict mode if rp_filter=1 for both "all" and the interface.
Since kernel 2.6.31 :
Actual rp_filter for <interface> = MAX(net.ipv4.conf.<interface>.rp_filter, net.ipv4.conf.all.rp_filter)
I.e. reverse path filtering is enabled in strict mode if rp_filter=1 for either "all" or the interface.
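As the text above shows, in versions before 2.6.31 whether rp_filter is in effect on an interface is decided by ANDing that interface's rp_filter with net.ipv4.conf.all.rp_filter, while in later versions it is the maximum of that interface's rp_filter and net.ipv4.conf.all.rp_filter.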
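The two rules can be written down as a few lines of code. This is a minimal sketch of the formulas quoted above; the function name `effective_rp_filter` and the `max_semantics` flag are made up for illustration and do not correspond to any kernel interface.

```python
def effective_rp_filter(all_value: int, iface_value: int, max_semantics: bool) -> int:
    """Effective rp_filter for one interface.

    max_semantics=False models the older rule (logical AND of "all" and
    the interface); max_semantics=True models the newer rule (MAX of the two).
    """
    if max_semantics:
        return max(all_value, iface_value)
    return 1 if (all_value and iface_value) else 0


# Values from the sysctl output: all = 0, tunl0 = 1.
old_kernel = effective_rp_filter(0, 1, max_semantics=False)  # AND rule
new_kernel = effective_rp_filter(0, 1, max_semantics=True)   # MAX rule
```

With all = 0 and tunl0 = 1, the AND rule yields 0 (filtering off) and the MAX rule yields 1 (strict mode on), so identical sysctl values behave differently on the two kernels.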
6 Now look again at the system's default parameters.
FC8 and FC17 both ship with the following default kernel parameters:
$ sudo /sbin/sysctl -a|fgrep .rp_filter
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.tunl0.rp_filter = 1
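So the effective value of net.ipv4.conf.tunl0.rp_filter is computed as follows:
- Kernel 2.6.26: ANDing net.ipv4.conf.all.rp_filter = 0 with net.ipv4.conf.tunl0.rp_filter = 1 gives 0, so rp_filter is not enabled on tunl0. The reply packets can therefore be sent out through eth0, and LVS works correctly.
- Kernel 3.3.7: taking the maximum of net.ipv4.conf.all.rp_filter = 0 and net.ipv4.conf.tunl0.rp_filter = 1 gives 1, which means rp_filter is enabled on tunl0. In other words, for a packet received on tunl0, the reply (whose destination is the received packet's source and whose source is the received packet's destination) would have to be routed out through tunl0; otherwise the packet is dropped.
In this case the route selects eth0 rather than tunl0, so the packets are dropped and LVS does not work.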
Summary:
To be safe, set the kernel parameters uniformly as follows whenever LVS tunnel mode is configured:
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.eth0.rp_filter = 0
net.ipv4.conf.tunl0.rp_filter = 0
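To verify that the settings actually took effect, a small script can read the values back from procfs. The sketch below assumes the standard /proc/sys/net/ipv4/conf layout and the MAX rule of newer kernels; `read_rp_filter` and `interfaces_still_filtering` are hypothetical helper names, and the directory is a parameter so the functions can be tried against a test directory.

```python
from pathlib import Path
from typing import Dict, List


def read_rp_filter(conf_root: str = "/proc/sys/net/ipv4/conf") -> Dict[str, int]:
    """Read rp_filter for every entry (all, default, eth0, ...) under conf_root."""
    values = {}
    for entry in sorted(Path(conf_root).iterdir()):
        rp_file = entry / "rp_filter"
        if rp_file.is_file():
            values[entry.name] = int(rp_file.read_text().strip())
    return values


def interfaces_still_filtering(values: Dict[str, int]) -> List[str]:
    """Interfaces whose effective value, MAX(all, interface), is non-zero."""
    all_value = values.get("all", 0)
    return sorted(
        name
        for name, value in values.items()
        if name not in ("all", "default") and max(all_value, value) > 0
    )
```

On the real server, `interfaces_still_filtering(read_rp_filter())` should return an empty list once every value listed above is 0.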
References:
- [1]. http://en./wiki/Reverse_path_filtering
- [2]. http://lxr./source/Documentation/networking/ip-sysctl.txt?v=2.6.29#L702
- [3]. http://lxr./source/Documentation/networking/ip-sysctl.txt?v=2.6.30#L702
- [4]. http://www./lists/linux-net/msg17162.html
- [5]. http://www./lists/netfilter/msg47124.html
- [6]. http://patchwork./patch/23513/
- edit by andy.chou