Transcript
Netmanias Technical Document: IP Network Failure Recovery Techniques [3] High Availability
IP Network Failure Recovery Techniques [3]: HA (High Availability)
May 28, 2009
NMC Consulting Group(tech@netmanias.com)
2
Improvement in recovery from node failure
- High availability (HA)
  “When you say that you want a highly reliable network, you are saying that you want your network to work all the time. You want it to be highly available.”
- From a system-architecture perspective, more advanced node-failure recovery mechanisms have been introduced into backbone routers:
  1. Non-stop Forwarding (NSF): avoid service discontinuity in the data plane.
  2. Graceful Restart (G/R + NSF): avoid service discontinuity in the data plane and the control plane.
  3. Non-stop Routing: hide the node failure from neighbors and recover transparently.
3
High Availability (1) Non-stop forwarding (NSF)
[Figure: a router whose RP holds the RIB in the control plane, with FIBs in the data plane; the RP crashes.]
Non-Stop Forwarding: a technique in which the data plane keeps forwarding by using its FIB information even when the control plane fails.
- Traffic forwarding continues.
- Protocol sessions with the neighboring routers are interrupted.
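The split between control plane and data plane can be illustrated with a minimal sketch. The `Router` class, its method names, and the single route are hypothetical and exist only for illustration; a real router downloads the FIB to line-card hardware rather than copying a dictionary.

```python
class Router:
    """Toy router with a separate control plane (RIB) and data plane (FIB)."""

    def __init__(self):
        self.rib = {}                 # prefix -> next hop, owned by the RP
        self.fib = {}                 # copy installed in the data plane
        self.control_plane_up = True

    def learn_route(self, prefix, next_hop):
        if not self.control_plane_up:
            raise RuntimeError("control plane is down; no routing updates")
        self.rib[prefix] = next_hop
        self.fib[prefix] = next_hop   # RIB-to-FIB download

    def crash_control_plane(self):
        self.control_plane_up = False
        self.rib.clear()              # RIB state is lost together with the RP

    def forward(self, prefix):
        # Forwarding consults only the FIB, so it survives an RP crash.
        return self.fib.get(prefix)


router = Router()
router.learn_route("10.0.0.0/8", "R2")
router.crash_control_plane()
next_hop = router.forward("10.0.0.0/8")   # still resolved from the FIB
```

After the crash the RIB is empty and no new routes can be learned, yet `forward` still answers from the FIB. That is the NSF property, and also the reason the FIB can go stale.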
4
High Availability (1) Non-stop forwarding (NSF) (cont)
1. RP failure
- Protocol sessions with the neighboring routers are interrupted.
2. RP restart
- If the whole OS has crashed, the RP is rebooted automatically by the watchdog timer on the RP board, or manually by the administrator.
3. RP boot complete; routing protocol resumes
- The router re-establishes adjacencies with the neighboring routers and retrieves routing information (LSPs).
- ... update has not been performed yet.
4. RIB computation complete; FIB update resumes
- LSP exchange with the neighboring routers and SPF computation are complete.
[Figure, repeated at each step: RP (RIB) in the control plane; FIBs in the data plane.]
5
Timeline of NSF operation
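[Figure: timeline with markers T0 to T6; non-stop forwarding is in effect throughout the restart.]
- RP failure occurs.
- The failure is detected and the RP restarts.
- RP booting completes.
- The routing protocol resumes and routing information is collected. The RIB is not built until the SPF computation finishes.
- RIB-to-FIB update completes.
- Topology change somewhere in the network: the neighbors update their RIB/FIB, but the restarting router keeps forwarding on its old FIB.
If the restart takes longer than the holdtime, the neighbors go through protocol reconvergence.
A routing loop or routing blackhole may occur (see the next slides).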
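The two quantities that matter on this timeline can be sketched as follows: whether the neighbors time the router out, and how long the router forwards on a FIB that may be stale. The `NsfRestart` class and the numbers (a 300-second boot, a 30-second holdtime, 30 more seconds until the FIB is updated) are illustrative assumptions, not values from the timeline itself.

```python
from dataclasses import dataclass


@dataclass
class NsfRestart:
    failure_at: float       # RP failure (seconds)
    boot_done_at: float     # RP booting complete, Hellos resume
    fib_updated_at: float   # RIB-to-FIB update complete
    holdtime: float         # neighbors' holdtime for this router

    def neighbors_time_out(self):
        # Simplification: no Hellos are sent between failure and boot completion.
        return (self.boot_done_at - self.failure_at) > self.holdtime

    def stale_fib_window(self):
        # Interval during which forwarding relies on the pre-failure FIB.
        return self.fib_updated_at - self.failure_at


restart = NsfRestart(failure_at=0, boot_done_at=300, fib_updated_at=330, holdtime=30)
timed_out = restart.neighbors_time_out()
window = restart.stale_fib_window()
```

With these assumed values the neighbors do time the router out, and any topology change within the 330-second window is invisible to the restarting router's data plane.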
6
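Reference: Routing Loop
The destination IP subnet is 10.0.0.0/8; within the core network, the source router is R1 and the destination router is R6.
[Figure: topology of ER1, ER2 and R1 to R7 with link costs, shown before and after the link failure. R1 has crashed but is still forwarding (non-stop forwarding). The figure marks the shortest path to R6 as seen by R2.]
- A link failure occurs between R2 and R6.
- R1's FIB has not been updated, so R1 forwards to R2.
- From R2's point of view, the shortest-path next hop toward R6 is R1, so R2 sends the packet back to R1.
- A routing loop results.
Timeline (time labels on the figure include 1 s, 9 s, 30 s and 38 s; convergence is assumed to take 8 seconds in this example):
- R1 fails and starts rebooting.
- The link fails and an LS Update is generated.
- Reconvergence completes, but R1 does not know the new path.
- Hold time: R1 sends no Hellos, so its neighbors time it out. The routing loop persists in the meantime!
- Reconvergence completes; a new path that excludes R1 is formed.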
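The loop can be reproduced with a small hop-by-hop trace. This is a sketch only: the `trace` function and the two next-hop tables are hypothetical, reduced to the entries that matter here (R1's stale entry pointing at R2, and R2's post-failure entry pointing at R1).

```python
def trace(fibs, source, destination):
    """Follow next hops from source; report delivery, a loop, or a blackhole."""
    path = [source]
    node = source
    while node != destination:
        next_hop = fibs.get(node, {}).get(destination)
        if next_hop is None:
            return path, "blackhole"          # no usable entry: packet dropped
        if next_hop in path:
            return path + [next_hop], "loop"  # revisiting a router: routing loop
        path.append(next_hop)
        node = next_hop
    return path, "delivered"


# R1 is restarting and still holds its pre-failure FIB entry (via R2).
# R2 has reconverged after the R2-R6 link failure and now points at R1.
fibs = {
    "R1": {"R6": "R2"},
    "R2": {"R6": "R1"},
}
path, outcome = trace(fibs, "R1", "R6")
```

The trace bounces between R1 and R2 and is reported as a loop. It clears only once R2 stops using R1 as a next hop, which in the slide's scenario happens after the hold time expires and the neighbors reconverge.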
7
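Reference: Routing Blackhole
The destination IP subnet is 10.0.0.0/8. R2 has failed, but it keeps forwarding thanks to its NSF capability.
[Figure: topology of ER1 to ER4 and R1 to R7 with link costs, shown before and after the line-card failure; 10.0.0.0/8 is attached at ER3. R2 has crashed but is still forwarding (non-stop forwarding).]
- A line card fails in R3 (the line card that carries the link to R2).
- R3's RP receives a “H/W fault” signal and detects the failure.
- R3 generates an LS Update that removes its link from the overall topology.
- R2's FIB could not be updated, so R2 still forwards to R3: packet loss! The NSF data plane malfunctions.
Timeline (time labels on the figure include 1 s, 9 s, 30 s and 38 s; convergence is assumed to take 8 seconds in this example):
- R2 fails and starts rebooting.
- R3's line card fails and an LS Update is generated.
- Reconvergence completes, but R2 does not know the new path.
- Hold time: R2 sends no Hellos, so its neighbors time it out. The routing blackhole persists in the meantime!
- Reconvergence completes; a new path that excludes R2 is formed.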
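How long the blackhole lasts can be estimated with simple arithmetic. The sketch below assumes a 30-second hold time and a line-card failure 1 second after the RP failure; only the 8-second convergence time comes from the slide. The function name `loss_window` is hypothetical, and it simplifies by starting the hold timer at the moment of failure rather than at the last Hello received.

```python
def loss_window(failure_at, topology_change_at, holdtime, convergence):
    """Return (start, end) of the interval in which stale forwarding loses traffic.

    Loss starts when the topology changes behind the restarting router's back.
    It ends once the neighbors have timed the router out and reconverged,
    so that traffic no longer passes through it.
    """
    start = topology_change_at
    end = failure_at + holdtime + convergence
    return start, max(start, end)


# Assumed values: hold time 30 s, line-card failure at t = 1 s.
# The 8 s convergence time is the one assumed on the slide.
start, end = loss_window(failure_at=0, topology_change_at=1, holdtime=30, convergence=8)
duration = end - start
```

Under these assumptions traffic toward 10.0.0.0/8 is lost from roughly t = 1 s to t = 38 s, far longer than the 8 seconds an ordinary reconvergence would take.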
8
Topology Change during NSF operation
[Figure: four snapshots of the topology (ER1, ER2 and R1 to R7, with link costs).]
1. RP failure in R3: non-stop forwarding continues while the RP restarts.
2. The holding timer for R3 expires: neighbor time-out (R3).
※ Booting takes more than five minutes and no Hello messages are sent in the meantime, so R2, R4 and R6 time R3 out as a neighbor.
※ The RIBs and FIBs of R2, R4 and R6 have not been updated yet, so forwarding in the data plane continues.
3. Protocol reconvergence completes.
4. The routing protocol starts again in R3: once R3 has finished booting and its routing protocol restarts, the routing topology changes yet again.
When the whole RP reboots, the neighbors reroute during the reboot!
9
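Problems with an IGP Process Restart
Even when only the IGP process restarts, the neighbors reroute.
1. Adjacencies are established.
[Figure: routers R1 to R4. R1 sends “Hello, my neighbor = R2”; R2 sends “Hello, my neighbor = R1”.]
- In OSPF, each router must see its own address in the other router's Hello messages for the adjacency to be formed and maintained.
- In IS-IS (assuming RFC 3373, “Three-way Handshake for IS-IS”, is in use), the IS-IS Hello PDU must carry the “Three-way handshake adjacency” information, its “Neighbor system ID” must show the peer's ID, and its “Adjacency State” must be “Up” for the adjacency to be maintained.
2. On restart...
[Figure: R2 restarts and sends “Hello, my neighbor = none”.]
- In IS-IS, the restarted router starts with the adjacency state in its IIH PDUs set to “Down”; in OSPF, R2's Hello message no longer lists R1.
- R1 concludes that its adjacency with R2 has been torn down, excludes R2, and attempts to reroute. (The same goes for R3 and R4.)
3. Protocol reconvergence occurs.
In other words, even before the holdtime (dead interval) expires, the adjacency can be torn down and rerouting can occur.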
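The OSPF half of this check can be sketched in a few lines. The function name and the router IDs are illustrative assumptions; a real OSPF implementation drives this through its neighbor state machine, where a Hello that does not list the receiver triggers a 1-Way event.

```python
def ospf_adjacency_kept(my_router_id, neighbors_listed_in_hello):
    """The adjacency survives only if the peer's Hello still lists this router."""
    return my_router_id in neighbors_listed_in_hello


# Before the restart, R2's Hello lists R1 (and its other neighbors).
before_restart = ospf_adjacency_kept("R1", ["R1", "R3", "R4"])

# Right after R2's IGP process restarts, its Hello lists no neighbors at all.
after_restart = ospf_adjacency_kept("R1", [])
```

The second call returns `False` as soon as the first post-restart Hello arrives. This is why rerouting can begin well before the dead interval would have expired.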
10
High Availability (2) Graceful Restart
[Figure: a router whose RP holds the RIB in the control plane, with FIBs in the data plane; the RP crashes.]
Graceful Restart: a technique that not only performs non-stop forwarding in the data plane but also maintains the routing-protocol adjacencies, thereby ensuring continuity in the control plane as well.
- Traffic forwarding continues.
- Protocol sessions continue with no break in adjacency.
11
High Availability (2) Graceful Restart (cont)
- Purpose
  - Maintain the adjacencies with the neighbors even when the IGP process restarts.
  - Avoid routing topology changes triggered by the neighbors.
  - Recover the pre-restart link-state information with the neighbors' help.
- RFC 3847 (“Restart Signaling for IS-IS”)
  - Defines a new Restart TLV (type 211).
  - Every router that supports graceful restart must include the Restart TLV in its IS-IS Hello messages. (Always!)
If adjacency with the neighbors is not maintained: 1) neighbor timeout, 2) topology change, 3) the router comes back after the restart (the topology changes again).
If adjacency with the neighbors is maintained: no topology change.
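As a rough illustration of what the Restart TLV looks like on the wire, the sketch below encodes it as type, length, a one-octet flags field, and an optional two-octet remaining time. Only the type value 211 comes from the slide. The flag bit positions (`FLAG_RR`, `FLAG_RA`) and the field layout are assumptions that should be checked against RFC 3847 before being relied on.

```python
import struct

RESTART_TLV_TYPE = 211   # from the slide (RFC 3847)
FLAG_RR = 0x01           # Restart Request (assumed bit position)
FLAG_RA = 0x02           # Restart Acknowledgement (assumed bit position)


def restart_tlv(flags, remaining_time=None):
    """Encode type, length, flags and an optional 2-octet remaining time."""
    value = struct.pack("!B", flags)
    if remaining_time is not None:
        value += struct.pack("!H", remaining_time)
    return struct.pack("!BB", RESTART_TLV_TYPE, len(value)) + value


# A restarting router asks its neighbor to keep the adjacency up.
request = restart_tlv(FLAG_RR)

# The neighbor acknowledges and reports the remaining holding time in seconds.
ack = restart_tlv(FLAG_RA, remaining_time=30)
```

Because the TLV is carried in every Hello, a neighbor can distinguish "this router is restarting, keep the adjacency" from an ordinary new Hello that would otherwise reset it.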
12
Graceful Restart Operations (Planned Restart)
Router A (restarting router) and Router B (helper):
1. The operator issues the restart command on Router A.
2. Router A originates a Grace-LSA (a link-local LSA, so it is not flooded):
   Grace period = 1800 s (30 min)
   Restart reason = 1
   IP interface address = 10.10.10.1
3. On receiving the Grace-LSA, Router B enters helper mode (it maintains the adjacency for the duration of the grace period).
4. DB Description Packet
5. LS Request Packet
6. LS Update Packet
7. LS Ack
8. Graceful Restart mode ends.
9. Flushing the Grace-LSA (Grace-LSA with “LS age” = maximum value)
10. Helper mode ends.
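The three fields carried in the Grace-LSA of step 2 can be encoded as TLVs. This sketch follows the TLV numbering of RFC 3623 (1 = grace period, 2 = restart reason, 3 = IP interface address), with each value padded to a four-octet boundary. The helper names `tlv` and `grace_lsa_body` are hypothetical, and the LSA header itself is omitted.

```python
import ipaddress
import struct


def tlv(tlv_type, value):
    """Encode one TLV: 2-octet type, 2-octet length, value padded to 4 octets."""
    padding = (-len(value)) % 4
    return struct.pack("!HH", tlv_type, len(value)) + value + b"\x00" * padding


def grace_lsa_body(grace_period, reason, interface_ip):
    """Build the TLV body of a Grace-LSA (LSA header not included)."""
    return (
        tlv(1, struct.pack("!I", grace_period))               # grace period, seconds
        + tlv(2, struct.pack("!B", reason))                   # restart reason code
        + tlv(3, ipaddress.IPv4Address(interface_ip).packed)  # IP interface address
    )


# Values from the slide: 1800 s grace period, restart reason 1, 10.10.10.1.
body = grace_lsa_body(1800, 1, "10.10.10.1")
```

The resulting body is 24 octets long. The one-octet restart reason occupies a full eight-octet TLV because of the padding.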
13
High Availability (2) Graceful Restart (Planned)
1. RP restart (planned restart/reload)
2. RP restart
- An individual process or the whole RP is restarted by the administrator.
3. RP boot complete; the routing protocol restarts “gracefully”
- After the restart or reboot completes, the router retrieves routing information (LSAs) while the existing adjacencies remain intact.
- ... update has not been performed yet.
4. RIB computation complete; FIB update resumes
- LSA collection from the neighboring routers is complete.
- Graceful Restart mode ends (the Grace-LSA is flushed).
[Figure, repeated at each step: RP (RIB) in the control plane; FIBs in the data plane.]
14
High Availability (2) Graceful Restart (Unplanned)
1. RP failure
- Protocol sessions with the neighboring routers are interrupted.
2. RP restart
- The RP is rebooted automatically by the watchdog timer on the RP board, or manually by the administrator.
3. RP boot complete; the routing protocol restarts “gracefully”
- The router starts the graceful restart procedure with its neighboring routers. (It originates a Grace-LSA, recording “Restart reason = 0 (unknown)”.)
- The RP boot time is usually longer than the dead interval, so the existing adjacencies are very likely to have been torn down. In that case a topology change occurs and G/R mode ends.
4. RIB computation complete; FIB update resumes
- LSA collection from the neighboring routers is complete.
- Graceful Restart mode ends (the Grace-LSA is flushed).
[Figure, repeated at each step: RP (RIB) in the control plane; FIBs in the data plane.]
15
Timeline of Graceful Restart Operation (Unplanned)
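[Figure: timeline with markers T0 to T6; non-stop forwarding is in effect throughout the restart.]
- RP failure occurs.
- The failure is detected and the RP restarts.
- RP booting completes and graceful restart begins.
- Graceful restart in progress: routing information is collected again while the adjacencies are maintained. The RIB is not built until the SPF computation finishes.
- RIB-to-FIB update completes.
If the restart takes longer than the holdtime (dead interval), the neighbors go through protocol reconvergence.
Routing changes that occur while the RP is rebooting are not reflected in traffic forwarding until the FIB update completes! (A routing loop or blackhole is possible.)
If another topology change occurs during graceful restart, the graceful restart procedure is cancelled and normal protocol reconvergence must take place.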
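The cancellation rule can be sketched from the helper's point of view. The function name, the event names, and the outcome strings are hypothetical. Treating expiry of the grace period as a cancellation is an assumption that follows from the helper keeping the adjacency only for the duration of the grace period.

```python
def helper_outcome(events, grace_period):
    """Decide how helper mode ends.

    events: iterable of (seconds_since_grace_lsa, event_name) pairs.
    """
    for seconds, name in sorted(events):
        if seconds > grace_period:
            break                      # anything after expiry is too late
        if name == "topology_change":
            return "cancelled: fall back to normal reconvergence"
        if name == "grace_lsa_flushed":
            return "completed: adjacency kept"
    return "cancelled: grace period expired"


clean = helper_outcome([(40, "grace_lsa_flushed")], grace_period=1800)
disturbed = helper_outcome(
    [(10, "topology_change"), (40, "grace_lsa_flushed")], grace_period=1800
)
late = helper_outcome([(2000, "grace_lsa_flushed")], grace_period=1800)
```

Only the first case keeps the adjacency. In the other two the neighbors reconverge as if the router had simply failed.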
16
Topology Change during Unplanned Graceful Restart
[Figure: four snapshots of the topology (ER1, ER2 and R1 to R7, with link costs).]
1. RP failure in R3: non-stop forwarding continues while the RP restarts.
2. The holding timer for R3 expires: neighbor time-out (R3).
※ Booting takes more than five minutes and no Hello messages are sent in the meantime, so R2, R4 and R6 time R3 out as a neighbor.
3. Protocol reconvergence completes.
4. The routing protocol starts again in R3: once R3 has finished booting and its routing protocol restarts, the routing topology changes yet again.
When the whole RP reboots, the neighbors reroute during the reboot!
17
Graceful Restart Operation in Dual RP system
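[Figure: a dual-RP router. The primary RP (RIB) and the backup RP (RIB') sit in the control plane, linked by a health check and RIB polling; the line cards (FIB, I/O) and the switching fabric form the data plane.]
Timeline (T0 to T5: polling interval):
- T1: topology change somewhere in the network. The RIB is updated and the change is reflected in the FIB; RIB' has not been updated yet.
- T2: the primary RP (P-RP) fails.
- T3: switchover to the backup RP (B-RP).
- T4: graceful restart: 1) signal the RP restart, 2) update the routing database. Non-stop forwarding continues.
Because the switchover to the backup RP happens immediately, the neighboring routers do not time out.
RIB polling: a periodic update that keeps the backup RP's RIB in sync with the primary RP's RIB. Besides RIB information, it carries a good deal of other system-dependent data.
Synchronizing every change on the primary RP to the backup RP at all times is known to be hard to implement, which is why this polling approach is used.
(How does each vendor synchronize primary and backup? By polling, or by updating the backup on every change on the primary?)
Health check: the state is checked periodically (Hello-message style) so that the backup can take over immediately if a problem occurs with the primary RIB.
(What health-check methods do vendors use? What is the exact interval?)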
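The consequence of polling-based synchronization is that the backup RIB can lag by up to one polling interval at the moment of switchover. A minimal sketch follows, assuming polls happen at integer multiples of the interval and that an update arriving exactly at a poll instant is included in that poll. The function name, the prefixes, and the times are hypothetical.

```python
def routes_missing_on_backup(updates, poll_interval, failure_at):
    """Return prefixes learned by the primary RP after the last RIB poll.

    updates: list of (time, prefix) pairs recorded on the primary RP.
    Polls are assumed to happen at t = 0, poll_interval, 2 * poll_interval, ...
    """
    last_poll = (failure_at // poll_interval) * poll_interval
    return [prefix for time, prefix in updates if last_poll < time <= failure_at]


updates = [(5, "10.1.0.0/16"), (23, "10.2.0.0/16")]
missing = routes_missing_on_backup(updates, poll_interval=10, failure_at=27)
```

Here the last poll before the failure was at t = 20, so the route learned at t = 23 is absent from RIB'. Recovering such routes is what the "update routing database" step of graceful restart is for.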
18
Graceful Restart Signaling in Dual RP system
Router A (with dual route processors) and Router B (peer):
1. Router A's primary RP fails. The router switches over to the backup RP and the P-RP restarts.
2. Router A originates a Grace-LSA (a link-local LSA, so it is not flooded):
   Grace period = 1800 s (30 min)
   Restart reason = 3 (switch to redundant RP)
   IP interface address = 10.10.10.1
3. On receiving the Grace-LSA, Router B enters helper mode (it maintains the adjacency for the duration of the grace period).
4. DB Description Packet
5. LS Request Packet
6. LS Update Packet
7. LS Ack
8. Graceful Restart mode ends.
9. Flushing the Grace-LSA (Grace-LSA with “LS age” = maximum value)
10. Helper mode ends.
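The restart reason values used across these slides (0 for an unplanned restart, 1 for a planned restart, 3 for an RP switchover) can be collected in one place. Value 2 does not appear in the slides and is taken from RFC 3623. The enum, the function, and its two boolean parameters are hypothetical names for illustration.

```python
from enum import IntEnum


class RestartReason(IntEnum):
    UNKNOWN = 0                     # unplanned restart
    SOFTWARE_RESTART = 1            # planned restart
    SOFTWARE_RELOAD_OR_UPGRADE = 2  # per RFC 3623; not used in the slides
    SWITCH_TO_REDUNDANT_RP = 3      # dual-RP switchover


def restart_reason(planned, switched_to_backup_rp):
    """Pick the reason code to put in the Grace-LSA."""
    if switched_to_backup_rp:
        return RestartReason.SWITCH_TO_REDUNDANT_RP
    if planned:
        return RestartReason.SOFTWARE_RESTART
    return RestartReason.UNKNOWN


dual_rp = restart_reason(planned=False, switched_to_backup_rp=True)
operator = restart_reason(planned=True, switched_to_backup_rp=False)
crash = restart_reason(planned=False, switched_to_backup_rp=False)
```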
19
Conclusion
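- Control-plane failures can be recovered by the router itself thanks to advances in router architecture:
  - Non-stop Forwarding
  - Graceful Restart
  - Non-stop Routing
- Data-plane (forwarding-plane) failures additionally require fast recovery mechanisms, so the following methods have been proposed (they cover both link failures and line-card failures within a node):
  - MPLS protection
  - MPLS Fast-Reroute
  - IP Fast-Reroute (see the next chapter)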
20
End of Document