Chapter 2 OSPF

A router on a broadcast segment takes part in a DR election. On point-to-point segments, use interface-type p2p to eliminate the DR election.

user@router# set protocols ospf area 0.0.0.0 interface ge-0/0/0.0 interface-type p2p

IP protocol number 89
All OSPF routers listen on 224.0.0.5
All DR/BDR routers listen on 224.0.0.6
The DR is elected on the basis of the highest priority
Dijkstra is the algorithm used by OSPF

The 5 OSPF packet types:
Hello - type 1
Database description - type 2
Link-state request - type 3
Link-state update - type 4
Link-state acknowledgment - type 5

Hello: Sent by each router to form and maintain adjacencies with its neighbors.
Database description: Used by the router during the adjacency formation process. It contains the header information for the contents of the LSDB on the router.
Link-state request: Used by the router to request an updated copy of a neighbor’s LSA.
Link-state update: Used by the router to advertise LSAs into the network.
Link-state acknowledgment: Used by the router to ensure the reliable flooding of LSAs throughout the network.

• Area border router (ABR): An OSPF router with links in two or more areas. The ABR is responsible for connecting OSPF areas to the backbone, and it transmits network information between the backbone and other areas.
• Autonomous system boundary router (ASBR): An OSPF router that injects routing information from outside the OSPF autonomous system (AS). An ASBR is typically located in the backbone, but the OSPF specification allows an ASBR to be in other areas as well.
• Backbone router: Any OSPF router with a link to Area 0 (the backbone). This router can be completely internal to Area 0 or an ABR, depending on whether it has links to other, nonbackbone areas.
• Internal router: An OSPF router with all its links within one area. If that router is located within the backbone area (Area 0.0.0.0), it is also known as a backbone router.
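The router roles defined above overlap (an ABR is always a backbone router, an internal router in Area 0 is also a backbone router). The following Python sketch makes that overlap explicit. The function name router_roles and its parameters are hypothetical and exist only for illustration; the role rules are taken from the definitions above.

```python
BACKBONE = "0.0.0.0"  # Area 0 written in dotted notation, as in the notes

def router_roles(areas, injects_external=False):
    """Return the set of OSPF roles for a router.

    areas            -- set of area IDs in which the router has links
    injects_external -- True if the router injects routes from outside
                        the OSPF AS (hypothetical flag for this sketch)
    """
    roles = set()
    if len(areas) > 1:
        roles.add("ABR")        # links in more than one area
    if injects_external:
        roles.add("ASBR")       # injects external routing information
    if BACKBONE in areas:
        roles.add("backbone")   # any link to Area 0
    if len(areas) == 1:
        roles.add("internal")   # all links within a single area
    return roles

print(router_roles({"0.0.0.0", "0.0.0.1"}))
print(router_roles({"0.0.0.0"}))
print(router_roles({"0.0.0.1"}, injects_external=True))
```

The three calls correspond to an ABR, an internal backbone router, and an ASBR located in a nonbackbone area.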
user@router# set protocols ospf area 0.0.0.0 interface ge-0/0/0.0 interface-type p2p metric 100

!When an interface has more than one IP address, it is possible to advertise only one of them
user@router# set interfaces lo0 unit 0 family inet address 172.16.1.1/32
user@router# set interfaces lo0 unit 0 family inet address 172.16.2.1/32
user@router# set interfaces lo0 unit 0 family inet address 172.16.3.1/32
user@router# set protocols ospf area 0.0.0.0 interface lo0.0

user@router# run show ospf database detail

    OSPF database, Area 0.0.0.0
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Router  *172.16.1.1       172.16.1.1       0x80000002    34  0x22 0x8ec6  60
  bits 0x0, link count 3
  id 172.16.3.1, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0
  id 172.16.2.1, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0
  id 172.16.1.1, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0

user@router# set protocols ospf area 0.0.0.0 interface 172.16.1.1

user@router# run show ospf database detail

    OSPF database, Area 0.0.0.0
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Router  *172.16.1.1       172.16.1.1       0x80000004     5  0x22 0xf4fe  36
  bits 0x0, link count 1
  id 172.16.1.1, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0

Multiarea OSPF Configuration

user@router# set protocols ospf area 0.0.0.0 interface ge-0/0/0.0
user@router# set protocols ospf area 0.0.0.0 interface ge-0/0/1.0
user@router# set protocols ospf area 0.0.0.1 nssa interface ge-0/0/3.0
user@router# set protocols ospf area 0.0.0.2 stub interface ge-0/0/4.0

Link-State Update Packets

A single update can carry more than one LSA.
Packets consist of:
(24-byte) OSPF header
(4-byte) Number of advertisements
(Variable) LSAs

LSA Types

Router LSAs (Type 1)
Network LSAs (Type 2)
Summary LSAs (Type 3 and 4)
AS External LSAs (Type 5)
Group membership LSAs (Type 6)
NSSA LSAs (Type 7)
External attributes LSAs (Type 8)
Opaque LSAs (Type 9, 10 and 11)
LSA types 6, 8 and 11 are not supported.

LSA Header

Link-state age (2 bytes): Measured in seconds, the LS age is the time since the LSA was first originated. Each router increments this field prior to reflooding the LSA.
Options (1 byte): Indicates support for OSPF options. Within the context of an individual LSA, the E bit (position 7) is set in all external LSAs and the P bit (position 5) is set in all NSSA external LSAs.
Link-state type (1 byte): Encodes the specific LSA type.
Link-state ID (4 bytes): Describes various portions of the OSPF domain. Each LSA type uses this field in a different manner.
Advertising router (4 bytes): The router ID of the router that first originated the LSA.
Link-state sequence number (4 bytes): Verifies that each router has the most recent version of an LSA. This field is incremented each time a new version is generated. Values range from 0x80000001 to 0x7FFFFFFF (0x80000000 is reserved).
Link-state checksum (2 bytes): The checksum of the entire LSA contents, minus the LS age field. This field is used to ensure data integrity in the LSDB.
Length (2 bytes): The entire length of the individual LSA, including the header.

Router LSA

V, E, and B bits (1 byte): Following five bits set to a value of 0, the V, E, and B bits represent the characteristics of the originating router. The V bit is set when a virtual link is established. An ASBR sets the E bit. An ABR sets the B bit.
Reserved (1 byte): Reserved field. Value is always 0.
Number of links (2 bytes): Gives the total number of links represented by the remaining six fields.
Link ID (4 bytes): Represents what the far end of the link is connected to.
Link data (4 bytes): Represents what the near side of the link is connected to.
Link type (1 byte): Describes the type of link. Used with the Link ID and Link data fields.
Number of type of service (ToS) metrics (1 byte): Lists the number of type of service metrics encoded. Only a value of 0 is supported.
Metric (2 bytes): Contains the cost to transmit data out of the interface.
Additional ToS data (4 bytes): This field is unused.

Link ID and Link Data Fields

Point-to-point (Type 1): On a point-to-point interface, an OSPF router always forms an adjacency with its peer over an unnumbered connection. As such, the link ID field contains the neighbor’s router ID. The link data field contains the IP address of the interface on the local router.
Transit (Type 2): A connection to a broadcast segment is always noted as a transit link. The link ID field contains the interface IP address of the segment’s DR. The link data field contains the interface IP address of the local router.
Stub (Type 3): A router advertises a stub network when a subnet does not connect to any OSPF neighbors. Advertising a stub network occurs for the loopback interface and any passive interfaces. In addition, the IP subnet for any point-to-point interface is advertised as a stub because the adjacency was formed over an unnumbered interface. The link ID field for a stub network contains the IP network number and the link data field contains the subnet mask.
Virtual link (Type 4): A virtual link operates between an ABR connected to Area 0 and an ABR that is not connected to Area 0. Once established, the virtual link appears in the Area 0 router LSA of each endpoint. The link ID field contains the neighbor’s router ID, and the link data field contains the interface IP address of the local router.
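To make the per-link layout of a router LSA concrete, here is a minimal Python sketch that decodes one 12-byte link entry (link ID, link data, link type, number of ToS metrics, metric) using the field sizes listed above. The function name parse_link and the MEANING table are hypothetical helpers for this illustration; the sample values mirror a transit link with id 10.1.23.3, data 10.1.23.2 and metric 1.

```python
import ipaddress
import struct

# What the link ID and link data fields carry for each link type,
# summarized from the descriptions above.
MEANING = {
    1: ("point-to-point", "neighbor router ID", "local interface IP"),
    2: ("transit", "DR interface IP", "local interface IP"),
    3: ("stub", "IP network number", "subnet mask"),
    4: ("virtual link", "neighbor router ID", "local interface IP"),
}

def parse_link(entry: bytes) -> dict:
    """Decode one router-LSA link entry (12 bytes, network byte order)."""
    link_id, link_data, link_type, tos_count, metric = struct.unpack(
        "!4s4sBBH", entry[:12]
    )
    name, id_meaning, data_meaning = MEANING.get(
        link_type, ("unknown", "?", "?")
    )
    return {
        "type": name,
        "link_id": str(ipaddress.IPv4Address(link_id)),
        "link_id_is": id_meaning,
        "link_data": str(ipaddress.IPv4Address(link_data)),
        "link_data_is": data_meaning,
        "tos_count": tos_count,
        "metric": metric,
    }

# Build a sample entry by hand, then decode it again.
sample = struct.pack(
    "!4s4sBBH",
    ipaddress.IPv4Address("10.1.23.3").packed,  # link ID
    ipaddress.IPv4Address("10.1.23.2").packed,  # link data
    2,                                          # Transit
    0,                                          # no ToS metrics
    1,                                          # metric
)
parsed = parse_link(sample)
print(parsed)
```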
root@R2# run show ospf database router extensive

    OSPF database, Area 0.0.0.0
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Router  *10.10.10.2       10.10.10.2       0x80000017  2753  0x22 0x474b  36
  bits 0x1, link count 1
  id 10.1.23.3, data 10.1.23.2, Type Transit (2)
    Topology count: 0, Default metric: 1
  Topology default (ID 0)
    Type: Transit, Node ID: 10.1.23.3
      Metric: 1, Bidirectional
  Gen timer 00:04:06
  Aging timer 00:14:07
  Installed 00:45:53 ago, expires in 00:14:07, sent 00:45:47 ago
  Last changed 02:36:04 ago, Change count: 11, Ours
Router   10.10.10.3       10.10.10.3       0x80000008  3535  0x22 0x6c02  48
  bits 0x0, link count 2
  id 10.1.23.3, data 10.1.23.3, Type Transit (2)
    Topology count: 0, Default metric: 1
  id 10.10.10.3, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0
  Topology default (ID 0)
    Type: Transit, Node ID: 10.1.23.3
      Metric: 1, Bidirectional
  Aging timer 00:01:04
  Installed 00:58:54 ago, expires in 00:01:05, sent 00:58:54 ago
  Last changed 00:58:54 ago, Change count: 5

    OSPF database, Area 0.0.0.1
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Router   10.10.10.1       10.10.10.1       0x8000000a  2423  0x20 0x5737  48
  bits 0x0, link count 2
  id 10.1.12.1, data 10.1.12.1, Type Transit (2)
    Topology count: 0, Default metric: 1
  id 10.10.10.1, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0
  Topology default (ID 0)
    Type: Transit, Node ID: 10.1.12.1
      Metric: 1, Bidirectional
  Aging timer 00:19:37
  Installed 00:40:18 ago, expires in 00:19:37
  Last changed 02:32:51 ago, Change count: 2
Router  *10.10.10.2       10.10.10.2       0x8000000f   128  0x20 0x6420  48
  bits 0x1, link count 2
  id 10.1.12.1, data 10.1.12.2, Type Transit (2)
    Topology count: 0, Default metric: 1
  id 10.10.10.2, data 255.255.255.255, Type Stub (3)
    Topology count: 0, Default metric: 0
  Topology default (ID 0)
    Type: Transit, Node ID: 10.1.12.1
      Metric: 1, Bidirectional
  Gen timer 00:47:52
  Aging timer 00:57:52
  Installed 00:02:08 ago, expires in 00:57:52, sent 00:02:05 ago
  Last changed 02:32:46 ago, Change count: 3, Ours

[edit]

Router LSA Example

This router is both an ABR and an ASBR. We see this by the setting of bits 0x3. Recall that position 7 (0x2) is for the E bit, which is set when the originating router is an ASBR. Bit position 8 (0x1) is for the B bit, which is set when the originating router is an ABR. Combining these two fields results in a value of 0x3, which we see in the database capture.

This router has three links connected to Area 0, which we can determine because of two factors. First, the link count field is set to a value of 3. Second, the LSA is shown in the database within the Area 0.0.0.0 section. Recall that a router LSA has area scope, so a separate LSA is generated for each area, representing the links only within that area.

A single point-to-point link exists, and two links are connected to stub networks. This fact is clearly visible from the information in the type fields.

This router LSA was originated by the same router from which the capture was taken. Notice the asterisk (*) next to the link-state ID value of 192.168.16.1. Also note that the last line of the capture states that this LSA is Ours.

The router LSA was installed 15 minutes and 47 seconds ago. If not refreshed, the LSA will expire in 44 minutes and 13 seconds, when its 3600-second maximum age is exceeded. The LSA was last flooded 15 minutes and 47 seconds ago. These details are shown in the Installed, expires, and sent fields, and they are present for every LSA in the show ospf database extensive output.

Network LSA

Generated by the DR on a broadcast link.

Network mask (4 bytes): This field denotes the IP subnet mask for the interface connected to the broadcast network.
Attached router (4 bytes): This field is repeated for each router connected to the broadcast network. The value of each instance is the router ID of the attached router. You can deduce the total number of routers listed from the length of the LSA.
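The deduction mentioned for the attached router field is simple arithmetic: subtract the 20-byte LSA header (the sum of the header fields listed earlier) and the 4-byte network mask from the LSA length, then divide by 4 bytes per router ID. A minimal Python sketch follows; the function name attached_router_count is hypothetical.

```python
LSA_HEADER_LEN = 20   # age 2 + options 1 + type 1 + ID 4 + adv rtr 4 + seq 4 + cksum 2 + len 2
NETMASK_LEN = 4       # network mask field of the network LSA
ROUTER_ID_LEN = 4     # one attached router entry

def attached_router_count(lsa_length: int) -> int:
    """Number of attached routers in a network LSA of the given length."""
    body = lsa_length - LSA_HEADER_LEN - NETMASK_LEN
    if body < 0 or body % ROUTER_ID_LEN != 0:
        raise ValueError(f"not a valid network LSA length: {lsa_length}")
    return body // ROUTER_ID_LEN

# A 32-byte network LSA describes a segment with two attached routers.
print(attached_router_count(32))
print(attached_router_count(36))
```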
Summary LSA (Type 3)

Network mask (4 bytes): This field represents the subnet mask associated with the network advertised. It is used in conjunction with the link-state ID field, which encapsulates the network address in a Type 3 LSA.
Metric (3 bytes): This field provides the cost of the route to the network destination. When the summary LSA represents an aggregated route (using the area-range command), this field is set to the largest current metric of the contributing routes.
ToS (1 byte): This field describes any optional type of service information encoded within the network described. The Junos OS does not use this field.
ToS metric (3 bytes): This field is not used.

ASBR Summary LSA (Type 4)

Network mask (4 bytes): This field has no meaning in a Type 4 LSA and is set to 0.0.0.0. The address of the ASBR is encoded in the link-state ID field.
Metric (3 bytes): This field provides the cost of the route to the ASBR.
ToS (1 byte): This field describes any optional type of service information used to reach the ASBR described. This field is not used.
ToS metric (3 bytes): This field is not used.

AS External LSA (Type 5)

Network mask (4 bytes): This field represents the subnet mask associated with the network advertised. It is used in conjunction with the link-state ID field, which encapsulates the network address in a Type 5 LSA.
E bit (1 byte): The E bit determines the type of external metric represented by the metric field. It is followed by 7 bits, all set to 0, to make up the entire byte. A value of 1, the Junos default, indicates a Type 2 external metric: any local router uses the encoded metric alone as the total cost of the route when performing an SPF calculation. A value of 0 indicates a Type 1 external metric: the encoded metric is added to the cost to reach the advertising ASBR, and the sum represents the total cost of the route.
Metric (3 bytes): This field represents the cost of the network as set by the ASBR.
Forwarding address (4 bytes): This field provides the address toward which packets should be sent to reach the network. A value of 0.0.0.0 represents the ASBR itself.
External route tag (4 bytes): This 32-bit value can be assigned to the external route. OSPF does not use this value, but it might be interpreted by other protocols.
Optional ToS fields (4 bytes): These fields are unused.

NSSA External LSA (Type 7)

Originated by the ASBR in an NSSA.
The ABR translates the Type 7 LSA into a Type 5 LSA and propagates it to the remaining areas.
When there is more than one ABR, the one with the highest RID performs the translation.
The difference between the two LSAs is that the Type 7 LSA carries the forwarding address.
By definition, only NSSA routers can interpret Type 7 LSAs.
When a router is both ABR and ASBR for an NSSA, the Type 7 LSA is exported into the NSSA by default. To disable this export, use:
set protocols ospf no-nssa-abr

Opaque LSA (Types 9-11)

Allows future extensibility of OSPF:
type 9 graceful restart capability
type 10 MPLS traffic engineering
type 11 not supported
Flooding scope:
type 9 has link-local scope
type 10 has area scope
type 11 has domain scope
RFC 2370 defines the OSPF Opaque LSA.

Opaque LSA Format

Uses the standard LSA format followed by a number of octets with application-specific information.
The link-state ID field is split into a 1-byte opaque type field and a 3-byte opaque ID field.
IANA is responsible for assigning new opaque type codes.

OSPF Database Protection

Limits the number of LSAs in the database:
set protocols ospf database-protection maximum-lsa 1

SPF Algorithm

Based on Dijkstra. It uses the following databases in the calculation:
link-state database
candidate database
tree database
Runs per area on each router.

Dijkstra Algorithm:
The local router moves its own tuple to the tree database and all tuples for its links to the candidate database. The following steps are then repeated until the candidate database is empty:
1. Each tuple in the candidate database is evaluated, and tuples whose neighbor ID is already in the tree database and whose cost to the root is higher than the existing tree entry are removed.
2. For each new entry in the candidate database, the tuple with the lowest cost from the root to the neighbor ID is chosen and moved to the tree database. If several tuples have the same cost, one is chosen at random.
3. If a new neighbor ID appears in the tree database, all LSDB tuples whose router ID equals that neighbor ID are moved to the candidate database.

OSPF can prevent routes from Type 5 and Type 7 LSAs from being installed in the routing table, but the LSAs remain in the database.

SPF can run 3 times before the hold-down takes effect. The hold-down keeps the network stable during a change; it is a 5-second timer that can now be configured in the range 2000 to 20000 ms.
The delay between back-to-back SPF runs is 200 ms (range 50 to 8000 ms), changed with the spf-options delay command.

OSPF Cost

Default cost: 10^8 / bandwidth (bps)
Links with a bandwidth of 100 Mbps or more have a cost of 1.

!Setting the cost per interface
set protocols ospf area 0 interface ge-0/0/1.0 metric 12
!Reference bandwidth; accepts the suffixes k (kilobits per second), m (megabits per second), g (gigabits per second)
set protocols ospf reference-bandwidth 10g

Overload Settings

The router is used for transit traffic only if no other path is available.
Sets the metric to 65535 in the router LSA on all transit links.
Useful for maintenance, without affecting current traffic forwarding.
Can be set permanently or with a timeout value.
The timer ranges from 60 to 1800 seconds.
The timer starts only when rpd starts.
The overload concept is inherited from IS-IS and is not native to OSPF (modified software).
Setting the maximum metric ensures that the router is not used for transit.
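The effect of overload on the SPF result can be illustrated with a small Python sketch. It is a simplified Dijkstra run using a candidate list and a tree database in the spirit of the steps described earlier, not the Junos implementation. The five-router topology, the function names spf, path and overload, and the link costs are all assumptions made for this example.

```python
import heapq

MAX_METRIC = 65535  # metric placed on transit links of an overloaded router

def spf(lsdb, root):
    """Simplified SPF. lsdb maps router -> {neighbor: metric}.
    Returns the tree database as router -> (cost from root, parent)."""
    tree = {}
    candidates = [(0, root, None)]          # candidate database
    while candidates:
        cost, node, parent = heapq.heappop(candidates)  # lowest cost first
        if node in tree:
            continue                        # already in the tree at lower or equal cost
        tree[node] = (cost, parent)
        for neighbor, metric in lsdb.get(node, {}).items():
            if neighbor not in tree:
                heapq.heappush(candidates, (cost + metric, neighbor, node))
    return tree

def path(tree, dest):
    """Walk parent pointers back to the root."""
    hops = []
    while dest is not None:
        hops.append(dest)
        dest = tree[dest][1]
    return hops[::-1]

def overload(lsdb, router):
    """Copy of the LSDB in which 'router' advertises MAX_METRIC on its links."""
    copy = {r: dict(links) for r, links in lsdb.items()}
    copy[router] = {n: MAX_METRIC for n in copy[router]}
    return copy

# Hypothetical topology: R1-R2-R4 is cheap, R1-R3-R4 is expensive,
# and R5 is reachable only through R2.
lsdb = {
    "R1": {"R2": 1, "R3": 5},
    "R2": {"R1": 1, "R4": 1, "R5": 1},
    "R3": {"R1": 5, "R4": 5},
    "R4": {"R2": 1, "R3": 5},
    "R5": {"R2": 1},
}

normal = spf(lsdb, "R1")
maint = spf(overload(lsdb, "R2"), "R1")
print(path(normal, "R4"), normal["R4"][0])
print(path(maint, "R4"), maint["R4"][0])
print(path(maint, "R5"), maint["R5"][0])
```

With R2 overloaded, traffic from R1 to R4 moves to the path through R3, while R5 is still reached through R2 because no alternative path exists.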
Unlike IS-IS, the overloaded router is still used for transit traffic if no alternative path to the destination exists.

set protocols ospf overload

If the timer is omitted, the metric values change after the commit and return to normal only when overload is removed.
If the timer is set, the metrics are not modified automatically. When the timer expires, the metrics return to normal, but the configuration still contains the overload option.

Once rpd has started, the RID is chosen from a (non-Martian) loopback IP address.

!Set the RID statically; the stub route is then no longer advertised by default
set routing-options router-id 192.168.1.1

After version 8.5, Junos no longer advertises the loopback in the LSDB when that interface is not configured in OSPF.

OSPF Authentication

3 forms of authentication: none, simple and MD5

set protocols ospf area 0.0.0.20 interface ge-0/0/2.0 authentication simple-password ihjkn81y3497yasdad7123971y24nad

The default method is none.
Simple is a plain-text password.

MD5 Authentication

Each interface requires an authentication key; keys are always stored encrypted in the configuration.
Each key requires an ID in the range 0-255.

set protocols ospf area 0.0.0.20 interface ge-0/0/2.0 authentication md5 30 key poaidpoaskdpoikasdpokasppoaskd

MD5 authentication allows multiple keys; the one with the highest ID is used by default.
To ease a key transition, assign a start time to each key ID.

set protocols ospf area 0.0.0.1 interface ge-0/0/0.0 authentication md5 1 key "$9$MeAWNbHkP36AGD0IEhKv"
set protocols ospf area 0.0.0.1 interface ge-0/0/0.0 authentication md5 2 key "$9$JpZHmpu1ylM/CX-Vboa369CtO1Ic" start-time "2012-01-20.12:00:00 +0000"

OSPF starts using the new key from the start-time onward. The keyword now can also be used; it sets the start-time to the local date and time.

OSPFv3

Defined in RFC 5340
RFC 5187 OSPFv3 Graceful Restart
RFC 4552 OSPFv3 Authentication/Confidentiality

Differences from OSPFv2

Protocol processing is per link rather than per subnet.
Addressing on the interfaces can differ.
Router LSAs and Network LSAs carry no addressing (no prefixes).
Flooding scope is generalized.
Multiple instances per link are supported, using the instance ID in the OSPF header.
A new intra-area-prefix LSA carries the IPv6 addressing information.
Authentication is handled at the IPv6 level, using the IPsec framework.
The V6 bit and the R bit were added.
The ASBR Summary LSA is named inter-area-router.
Unknown LSA handling is controlled by 1 bit: the LSA is either treated as having local scope only, or stored and forwarded as if it were understood.
The options field was expanded from 8 to 24 bits.
The V6 bit indicates whether the router or link should be excluded from routing calculations.
The R bit is used like the IS-IS overload bit and indicates whether the originator is an active router. If the R bit is not set (that is, 0) in the OSPF options field, the router can participate in OSPF without being used for transit.

LSA Type
1. Router
2. Network
3. Inter-Area-Prefix
4. Inter-Area-Router
5. AS-External
6. Group Membership
7. Type-7
8. Link
9. Intra-Area-Prefix

Supports:
Virtual links, stub, NSSA, and totally stubby areas
Graceful restart
External prefix limits
BFD

Stub Areas - the ABR does not flood Type 4 and 5 LSAs
Totally Stub Area - the ABR does not flood Type 3, 4 and 5 LSAs
NSSA - the ABR does not flood Type 4 and 5 LSAs

Stub Areas

The ABR does not flood Type 4 and 5 LSAs into the stub area.
Virtual links are not possible.

Stub Areas with No Summaries

The ABR does not inject Type 3, 4 and 5 LSAs.
The ABR can inject a default route; this step is done manually.

!Inject the default route manually
set protocols ospf area 0.0.0.1 stub default-metric 10
!Stop Type 3, 4 and 5 LSAs
set protocols ospf area 0.0.0.1 stub no-summaries

OSPF Not-so-Stubby Areas

The ASBR injects Type 7 LSAs into the NSSA.
The ABR converts Type 7 LSAs to Type 5 toward the backbone.
A default route is available, but only manually, advertised as a Type 3 LSA (optionally as a Type 7 LSA).
Virtual links cannot be used.
When the router is both ABR and ASBR for the NSSA and the no-nssa-abr keyword is configured, Type 7 LSAs are not sent.

OSPF NSSA No Summaries

The ABR does not inject Type 3, 4 and 5 LSAs.
The ABR converts Type 7 to Type 5 and injects it into the backbone.
The ASBR injects Type 7.

!Inject the default route as a Type 3 LSA
set protocols ospf area 0.0.0.1 nssa default-lsa default-metric 10
!Inject the default route as a Type 7 LSA
set protocols ospf area 0.0.0.1 nssa default-lsa default-metric 10 type-7

The injected default route has metric type 1.

Summarizing Routes

Applies only to ABRs.

!Affects only routes from Type 7 LSAs
set protocols ospf area 1 nssa area-range 192.168.0.0/29

The restrict keyword restricts (suppresses) the routes propagated as Type 5 toward the backbone.

set protocols ospf area 1 nssa area-range 192.168.0.0/29 restrict

IGP Transition

Overlay - run the 2 protocols in parallel and set the old protocol as preferred
Route Distribution
Integrated - build a new network with the new protocol and interconnect it with the old one

Overlay

Configure all routers to prefer the old protocol.
Configure the new IGP on the routers.
Remove the old IGP.

set protocols rip group peer-routers preference 7

!Removing the old IGP
root@R1# deactivate protocols rip
root@R1# commit confirmed 5
root@R1# run show route
root@R1# commit
root@R1# delete protocols rip
root@R1# commit and-quit

Multiarea Adjacencies

Each multiarea adjacency is announced in a Type 1 LSA as a point-to-point link.
No Type 3 link (stub) is advertised for a multiarea adjacency.
One logical interface is treated as primary; the others are designated as secondary.
By default an interface can belong to only one area.
RFC 5185 OSPF Multi-Area Adjacency - ABRs establish multiple adjacencies in different areas over the same logical interface.
Each multiarea adjacency is announced as a point-to-point unnumbered link in the configured area by the routers connected to the link.

OSPF External Reachability

OSPF default export policy: routers do not redistribute between protocols by default.
Any export policy affects only Type 5 and Type 7 LSAs.

!Redistribute with metric type 1
set policy-options policy-statement all from protocol static
set policy-options policy-statement all then external type 1
set policy-options policy-statement all then accept
set protocols ospf external-preference 90

Import filters can be used only on external routes.

!Limit the number of prefixes exported
set protocols ospf prefix-export-limit 500

When this limit is reached, the router enters the overload condition and sets its local links to metric 65535.

Virtual Links

Does not tunnel data packets.
Creates a virtual ABR between the routers.

!Virtual link through area 1
set protocols ospf area 0 virtual-link neighbor-id 10.10.10.1 transit-area 0.0.0.1

root@R1# run show ospf neighbor
Address          Interface              State     ID               Pri  Dead
10.1.12.2        em0.0                  Full      10.10.10.2       128    33
10.1.12.2        vl-10.10.10.2          Full      10.10.10.2         0    34

root@R1# run show ospf interface
Interface           State   Area            DR ID           BDR ID          Nbrs
em0.0               BDR     0.0.0.1         10.10.10.2      10.10.10.1         1
vl-10.10.10.2       PtToPt  0.0.0.0         0.0.0.0         0.0.0.0            1

BGP

RFC 4271

BGP Neighbor States

Idle state: The Idle state is the initial state, in which all incoming BGP connections are refused. A start event is required for the local system to initialize BGP resources and prepare for a transport connection with the other BGP peer.
Connect state: In the Connect state, BGP is waiting for the transport protocol connection to be completed. If the transport protocol connection succeeds, the local system sends an OPEN message and transitions to the OpenSent state. If the transport protocol connection fails, the local system restarts the ConnectRetryTimer, listens for a connection initiated by the remote BGP peer, and changes its state to Active.
Active state: In the Active state, BGP is trying to acquire a peer by initiating a transport protocol connection.
If the transport protocol connection succeeds, the local system sends an OPEN message to its peer and transitions to the OpenSent state. If the local system’s BGP state remains in the Active state, you should check physical connectivity as well as the configuration on both peers.
OpenSent state: In the OpenSent state, BGP waits for an OPEN message from its peer. When an OPEN message is received, it is checked and verified to ensure that no errors exist. If an error is detected, the system transitions back to the Idle state. If no errors are detected, BGP sends a KEEPALIVE message.
OpenConfirm state: In the OpenConfirm state, BGP waits for a KEEPALIVE or NOTIFICATION message. If no KEEPALIVE message is received before the negotiated hold timer expires, the local system sends a NOTIFICATION message stating that the hold timer has expired and changes its state to Idle. Likewise, if the local system receives a NOTIFICATION message, it changes its state to Idle. If the local system receives a KEEPALIVE message, it changes its state to Established.
Established state: In the Established state, BGP can exchange UPDATE, NOTIFICATION, and KEEPALIVE messages with its peer. When the local system receives an UPDATE or KEEPALIVE message, and when the negotiated hold timer value is nonzero, it restarts its hold timer. When the keepalive timer expires, the local system sends out a KEEPALIVE message and restarts the keepalive timer. If the hold timer expires, the local system sends a NOTIFICATION message and returns to Idle.

BGP Message Types

Open message: The open message is sent once the TCP three-way handshake is complete. The open message initiates the BGP session and contains details about the BGP neighbor and information about supported and negotiated options.
Update message: BGP uses update messages to transport routing information between BGP peers. Depending on the receiving device’s routing policy, this routing information is either added to the routing table or ignored.
Keepalive message: BGP does not use keepalives at the Transport Layer.
TCP fills this need. Instead, peers exchange keepalives as often as needed to ensure that the hold timer does not expire.
Notification message: BGP uses notification messages to signal when something is wrong with the BGP session. A notification is sent when an unsupported option is sent in an open message and when a peer fails to send an update or keepalive. When an error is detected, the BGP session is closed.
Refresh: Normally a BGP speaker cannot be made to readvertise routes that have already been sent and acknowledged (using TCP). The route refresh message supports soft clearing of BGP sessions by allowing a peer to readvertise routes that have already been sent. This soft clearing has some very specific uses when working with MPLS-based VPNs and adding new customer sites to existing customer VPN structures.

Common BGP Attributes

Next-hop
Local Preference
AS Path
Origin
MED
Community

!Configuring a BGP neighbor
set protocols bgp group external type external
set protocols bgp group external neighbor 10.1.23.2 peer-as 23

If the type is not defined, it is detected during the BGP session.
The local-as keyword can be used at the group or neighbor level; the most specific level in the hierarchy is used.
local-address keyword - source of the BGP session

BGP Authentication

set protocols bgp group internal neighbor 10.1.12.2 authentication-key "$9$ccKyrKMWX7NV"

BGP using Key Chain

set security authentication-key-chains key-chain bgp key 1 secret "$9$eZ6vX7"
set security authentication-key-chains key-chain bgp key 1 start-time "2014-10-15.04:02:16 +0000"
set security authentication-key-chains key-chain bgp key 2 secret "$9$g/aDi"
set security authentication-key-chains key-chain bgp key 2 start-time "2014-11-15.04:02:21 +0000"
set protocols bgp group internal authentication-key-chain bgp

BGP Operation

Adjacency-RIB-IN - Contains all routes received from each peer
RIB-LOCAL - Contains the local routes used to forward traffic
Adjacency-RIB-OUT - Contains all advertised routes sent to each peer

Only BGP routes that are active in the routing table are advertised, and only the best path is advertised.
With the advertise-inactive keyword, the best BGP path is advertised even when it is not the active route; only that single best inactive BGP path is advertised.

Hidden BGP Routes

Reasons for routes not being installed in the RIB-LOCAL:
Martian
Import policy
Unresolvable next hop

show route hidden extensive

IBGP Next-Hop Propagation

By default an IBGP router does not change the next hop of routes received from EBGP peers.

set protocols bgp export bgp
set policy-options policy-statement bgp from protocol static
set policy-options policy-statement bgp then next-hop self
set policy-options policy-statement bgp then accept

BGP Next-Hop Resolution

Next-hop self
Export direct routes into the IGP
IGP passive interface
Static routes
IGP adjacency formed on the inter-AS links toward the EBGP peers

root@R2# run show route protocol bgp terse

inet.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

A Destination        P Prf   Metric 1   Metric 2  Next hop        AS path
* 1.1.1.1/32         B 170        100            >10.1.12.1       I
* 1.1.1.2/32         B 170        100            >10.1.12.1       I
* 1.1.1.3/32         B 170        100            >10.1.12.1       I

BGP Multipath

With multipath configured, the route selection algorithm ignores the router ID and peer ID comparison.
Junos also uses the link bandwidth extended community for unequal-cost load balancing.
Multipath allows multiple copies of a route from the same router or from different routers in the same AS.

!Without multipath
root@R3# run show route protocol bgp terse

inet.0: 8 destinations, 11 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

A Destination        P Prf   Metric 1   Metric 2  Next hop        AS path
* 1.1.1.1/32         B 170        100            >10.1.23.2       12 I
                     B 170        100            >10.1.233.2      12 I

!With multipath
root@R3# run show route protocol bgp terse

inet.0: 8 destinations, 11 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

A Destination        P Prf   Metric 1   Metric 2  Next hop        AS path
* 1.1.1.1/32         B 170        100             10.1.23.2       12 I
                                                 >10.1.233.2
                     B 170        100            >10.1.233.2      12 I

root@R3# run show route 1.1.1.1/32 detail

inet.0: 8 destinations, 11 routes (8 active, 0 holddown, 0 hidden)
1.1.1.1/32 (2 entries, 1 announced)
        *BGP    Preference: 170/-101
                Next hop type: Router
                Address: 0x9374010
                Next-hop reference count: 3
                Source: 10.1.23.2
                Next hop: 10.1.23.2 via em0.0
                Next hop: 10.1.233.2 via em0.0, selected
                State:
                Local AS:    23 Peer AS:    12
                Age: 1:17:03
                Task: BGP_12.10.1.23.2+179
                Announcement bits (1): 0-KRT
                AS path: 12 I
                Accepted Multipath
                Localpref: 100
                Router ID: 10.10.10.2
         BGP    Preference: 170/-101
                Next hop type: Router, Next hop index: 553
                Address: 0x93342e0
                Next-hop reference count: 5
                Source: 10.1.233.2
                Next hop: 10.1.233.2 via em0.0, selected
                State:
                Inactive reason: Not Best in its group - Update source
                Local AS:    23 Peer AS:    12
                Age: 2:57
                Task: BGP_12.10.1.233.2+179
                AS path: 12 I
                Accepted MultipathContrib
                Localpref: 100
                Router ID: 10.10.10.2

Multihop Peering

set interfaces em0 unit 0 family inet address 10.1.23.3/24
set interfaces em0 unit 0 family inet address 10.1.233.3/24
set routing-options static route 10.10.10.2/32 next-hop [10.1.23.2 10.1.233.2]
set routing-options autonomous-system 23
set protocols bgp group external neighbor 10.10.10.2 multihop ttl 2
set protocols bgp group external neighbor 10.10.10.2 local-address 10.10.10.3
set protocols bgp group external neighbor 10.10.10.2 peer-as 12

root@R3# run show route

inet.0: 12 destinations, 18 routes (12 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1.1.1.1/32         *[BGP/170] 00:03:06, localpref 100
                      AS path: 12 I
                    > to 10.1.23.2 via em0.0
                    [BGP/170] 00:03:10, localpref 100
                      AS path: 12 I
                    > to 10.1.233.2 via em0.0
                    [BGP/170] 00:02:14, localpref 100, from 10.10.10.2

If no TTL is defined for multihop, the value 64 is used by default.
The routes are labeled as Indirect even when they share the same subnet.

Accepting Remote Next Hops
accept-remote-nexthop option - accepts a route whose next hop is remote, that is, not on a shared subnet; the default behaviour is to discard the route.
Removes the need to configure multihop.
accept-remote-nexthop and multihop cannot be configured on the same peer.
Can be used to make an IPv4 BGP peer accept routes with an IPv6 next hop.
Multihop and multipath produce routes with several next hops; by default Junos load-balances across them per prefix.

!Change the default Junos behaviour so that both next hops are installed (2 entries in the forwarding table)
set routing-options forwarding-table export bgp
set protocols bgp group external multipath
set policy-options policy-statement bgp from protocol static
set policy-options policy-statement bgp then load-balance per-packet
set policy-options policy-statement bgp then accept

Traffic is now distributed across the 2 next hops using a microflow hashing algorithm. The default inputs to the algorithm are: incoming router interface, source IP and destination IP.
!Also use the source and destination port as input
root@R3# set forwarding-options hash-key family inet layer-4

Selecting the Active BGP Route
1. High local-preference
2. Short AS-path length
3. Low origin value (I [IGP] < E [EGP] < ? [Incomplete])
4. Low MED value (the absence of a MED is interpreted as 0)
5. EBGP > IBGP
6. Best exit from AS
7. For EBGP routes, prefer the current active route; otherwise prefer routes from the peer with the lowest RID
8. Prefer the shortest cluster length
9. Prefer routes from the peer with the lowest peer ID

1. The router compares routes for the highest local preference (the only choice based on a higher, rather than lower, value).
2. The router evaluates the AS-path attribute next, where a shorter path is preferred. This attribute is often a common tiebreaker for routes.
3. The router evaluates the origin code. The lowest origin code is preferred: (I [IGP] < E [EGP] < ? [Incomplete]).
4. If any of the remaining routes are advertised from the same neighboring AS, the router checks the multiple exit discriminator (MED) attributes for the lowest value. The absence of a MED value is interpreted as a MED of 0.
5. If multiple routes remain, the router prefers any routes learned through an EBGP peer over routes learned through an IBGP peer. If all remaining routes were learned through EBGP, the router skips to Step 7.
6. If the remaining routes were learned through IBGP, use the path with the lowest IGP cost to the IBGP peer. For each IBGP peer, install a physical next hop(s) based on the following three rules:
   a. BGP examines both the inet.0 and the inet.3 routing tables for the BGP next-hop value. The physical next hop(s) of the instance with the lowest Junos OS preference is used. Often, this means that BGP uses the inet.3 version of the next hop, through an MPLS label-switched path.
   b. Should the preference values in the inet.0 and the inet.3 routing tables tie, the physical next hop(s) of the instance in inet.3 is used.
   c. When a preference tie exists and the instances are in the same routing table, the number of equal-cost paths of each instance is examined. The physical next hop(s) of the instance with more paths is installed. This tie might occur when the traffic-engineering bgp-igp option is used for MPLS.
7. BGP then uses the route advertised from the peer with the lowest router ID (usually the loopback IP address). When comparing external routes from two distinct neighboring ASs, if the routes are equal up to the router ID comparison step, the currently active route is preferred. This preference helps prevent issues with MED-related route oscillation. The external-router-id command overrides this behavior and prefers the external route with the lowest router ID, regardless of which route is currently active.
8. The router then examines the cluster-list attribute for the shortest length. The cluster list is similar in function to an AS path.
9. The router prefers routes from the router with the lowest peer ID.

Peer Configuration Options
!Does not send Open messages
set protocols bgp group external neighbor 10.1.23.2 passive
!Accept Open messages from the defined range
set protocols bgp group external allow 10.10/16
The allow option also does not send Open messages to the remote router; in addition it defines the address range from which connections are accepted.
!The peer-as must be defined
set protocols bgp group external peer-as 12
set protocols bgp group external allow 1.1.1.1/32

It is possible to cap the number of routes received from a neighbor and, when the cap is exceeded, to tear down the BGP session for a set time.
!Set the maximum number of prefixes to receive
set protocols bgp group external family inet unicast prefix-limit maximum 10
set protocols bgp group external family inet unicast prefix-limit teardown 100 idle-timeout 1
teardown - percentage of the maximum at which log messages are generated; once the maximum is exceeded the BGP session is reset
idle-timeout - the session between the routers stays down for the configured time; the forever keyword requires manual intervention

Hold Time
set protocols bgp group external hold-time 45
The BGP session negotiates the hold time; the Junos default is 90 seconds.

By default Junos does not advertise routes back to the neighbor that sent them, nor to a neighbor in the same AS the routes came from.
This behaviour can be changed with advertise-peer-as.
set protocols bgp group external peer-as 12
set protocols bgp group external advertise-peer-as
set protocols bgp group external neighbor 10.10.10.1

Graceful Restart
Neighbors keep forwarding traffic to the restarting router.
Neighbors do not remove routes whose destination is the restarting router.
An End-of-RIB marker is sent for each NLRI; it notifies the neighbor that all routing information has been sent.
The local router delays the selection algorithm until it receives the End-of-RIB marker.
set routing-options graceful-restart
disable: This option stops the local router from participating in any graceful restart function.
restart-time: This negotiable timer sets the amount of time that can elapse for the peering session to reestablish. The default value for this timer is 120 seconds, and its range is between 1 and 600 seconds.
stale-routes-time: This timer specifies the amount of time that routes from the restarting peer can be used before they are withdrawn. The default value for this timer is 300 seconds, and its range is between 1 and 600 seconds.

!Check the routes advertised to neighbor 10.1.23.3
root@R2# run show route advertising-protocol bgp 10.1.23.3
!Check the routes received from neighbor 10.1.23.3
root@R2# run show route receive-protocol bgp 10.1.23.3

Modifying Local Preference
set protocols bgp group internal local-preference 200
A local-preference set in a policy takes priority over this configured value.

Remove Private AS
set protocols bgp group external remove-private

Modifying AS Path
set protocols bgp group external local-as 100
set protocols bgp group external local-as 100 private

Coordinating MED and IGP Metrics
!Set the MED statically
set protocols bgp group external metric-out 10
!The MED is the IGP metric to the IBGP peer; the MED is updated whenever the IGP metric changes
set protocols bgp group external metric-out igp
!Tie the MED to the lowest IGP metric seen so far
set protocols bgp group external metric-out minimum-igp
With minimum-igp the MED can go down when the IGP metric decreases, but it stays the same when the IGP metric increases.
!Both options also accept an offset to increment/decrement the MED
set protocols bgp group external metric-out igp +5
set protocols bgp group external metric-out minimum-igp +5

Path Selection and MEDs
By default Junos uses the deterministic MED scheme to compare routes from the same AS.
always-compare-med - compares the MED regardless of whether the neighbor AS is the same
cisco-non-deterministic - compares paths in the order in which they were received
set protocols bgp path-selection always-compare-med

Policy and BGP
BGP stores routes in 3 RIB memory tables. Only active routes are advertised to neighbors.
Adjacency-RIB-IN - Contains all received routes from each peer
RIB-LOCAL - Contains local routes used to forward traffic
Adjacency-RIB-OUT - Contains all advertised routes sent to each peer
The import policy is enforced between RIB-IN and RIB-LOCAL.
The export policy is enforced between RIB-LOCAL and RIB-OUT.

BGP Attributes
Each attribute falls into one of 4 categories:
Well-known mandatory - must be supported by all speakers and present in every BGP update
Well-known discretionary - must be supported by all speakers and may or may not be present in BGP updates
Optional transitive - optional attribute that may not be understood by all speakers, but is passed along even by speakers that do not understand it
Optional nontransitive - optional attribute that may not be understood by all speakers; a speaker that does not understand it ignores it and does not send it to other speakers

Well-known mandatory - Origin, AS Path, Next-Hop
Well-known discretionary - Local-Preference
Optional transitive - Community
Optional nontransitive - MED
Of these, only Next-Hop and Community are not used in the route decision.

Attribute Origin
I - Internal, learned by IGP
E - External, learned by EGP
? - Incomplete, prefix found by some other means
Preference order: I, then E, then ?
Junos sets the Origin to I by default; this value can be changed via policy.
set policy-options policy-statement set-origin term 1 then origin incomplete
set policy-options policy-statement set-origin term 1 then accept
set protocols bgp export set-origin

Attribute AS Path
The as-override keyword rewrites the peer AS in the AS path so that the route is not discarded by loop detection.
Junos supports regular expressions in policies.
!Prepending ASs
set policy-options policy-statement prepend-aspath term 1 then as-path-prepend "1 1"
set policy-options policy-statement prepend-aspath term 1 then accept
set protocols bgp export prepend-aspath

Operator - Match Definition
{m,n} - At least m and at most n repetitions of term. Both m and n must be positive integers, and m must be smaller than n.
{m} - Exactly m repetitions of term. m must be a positive integer.
{m,} - m or more repetitions of term. m must be a positive integer.
* - Zero or more repetitions of term. This is equivalent to {0,}.
+ - One or more repetitions of term. This is equivalent to {1,}.
? - Zero or one repetition of term. This is equivalent to {0,1}.
| - One of two terms on either side of the pipe.
- - Between a starting and ending range, inclusive.
^ - A character at the beginning of a community attribute regular expression. This character is added implicitly; therefore, the use of it is optional.
$ - A character at the end of a community attribute regular expression. This character is added implicitly; therefore, the use of it is optional.
( ) - A group of terms that are enclosed in the parentheses. Intervening space between the parentheses and the terms is ignored. If a set of parentheses is enclosed in quotation marks with no intervening space "()", it indicates a null path.
[ ] - Set of AS numbers. One AS number from the set must match. To specify the start and end of a range, use a hyphen (-). A caret (^) may be used to indicate that it does not match a particular AS number in the set, for example [^123].
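The operators above apply to whole AS numbers (a "term"), not to individual characters, which is easy to forget when coming from ordinary regular expressions. The following is a rough Python sketch of that idea, not Junos code: the helper names as_path_regex and as_path_matches are made up for illustration, and only a small subset is handled (plain AS numbers, "." for any AS number, and the *, +, ? and {m,n} quantifiers).

```python
import re

def as_path_regex(pattern):
    """Translate a small subset of AS-path operators into a Python regex.

    Assumption: each space-separated term is an AS number or "." (any AS
    number), optionally followed by one quantifier. Groups, sets and pipes
    are not handled in this sketch.
    """
    # ^ and $ are implicit, so drop them if they were written explicitly.
    pattern = pattern.strip().lstrip("^").rstrip("$")
    parts = []
    for term in pattern.split():
        m = re.fullmatch(r"(\d+|\.)(\*|\+|\?|\{\d+(?:,\d*)?\})?", term)
        if m is None:
            raise ValueError("unsupported term: %r" % term)
        atom, quantifier = m.group(1), m.group(2) or ""
        as_number = r"\d+" if atom == "." else atom
        # Every AS number in the path text is followed by one space,
        # so a quantifier repeats whole AS numbers, never single digits.
        parts.append("(?:%s )%s" % (as_number, quantifier))
    return re.compile("".join(parts))

def as_path_matches(pattern, as_path):
    """Return True if the whole AS path (a list of ints) matches the pattern."""
    text = "".join("%d " % asn for asn in as_path)
    return as_path_regex(pattern).fullmatch(text) is not None

print(as_path_matches("^3 .* 15$", [3, 7, 9, 15]))  # path starts at AS 3, ends at AS 15
print(as_path_matches("^3 .* 15$", [3, 115]))       # 115 is a different AS number than 15
```

Because each term is a full AS number, a path ending in AS 115 does not match a pattern ending in 15, which a character-based regex would get wrong.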
set policy-options policy-statement MATCH-AS term 1 from as-path REGEX-TEST
set policy-options policy-statement MATCH-AS term 1 then local-preference 1000
set policy-options policy-statement MATCH-AS term 1 then accept
set policy-options as-path REGEX-TEST "^3 .* 15$"

Attribute Next-Hop
By default the next hop is changed when a route is sent over an EBGP session; on IBGP sessions, if the route was originated by an EBGP peer, the next hop is not modified.

Attribute MED
The MED is never passed on from one AS to a further AS.
By default the MED is compared only between routes from the same AS.
The lowest value is preferred, and by default the MED value is based on the IGP metric.

Attribute Community
Lets operators implement administrative routing policies.
32 bits, where 16 bits represent the AS and the remaining 16 bits represent an arbitrary number, for example 65003:999
no-export - not advertised to another AS
no-advertise - not advertised to any other neighbor
no-export-subconfed - advertised to a neighboring sub-AS in a network using confederations, but not advertised any further

Community Regular Expressions
"^26749:.{2,3}$" - matches a community where the AS is 26749 and the community value is any 2 or 3 digits
"^3:.*$" - matches a community where the AS is 3 and the community value is any combination of digits
"64555:1[148]9$" - matches a community where the AS is 64555 and the community value is 119, 149 or 189

set policy-options community no-adv members no-export
set policy-options policy-statement export-as1 term 1 from protocol static
set policy-options policy-statement export-as1 term 1 from route-filter 10.0.0.0/8 exact
set policy-options policy-statement export-as1 term 1 then community add no-adv
set policy-options policy-statement export-as1 term 1 then accept

Chapter 7: Enterprise Routing Policies

BGP Strengths
Diverse Administrative Control
Handling Large Prefix Counts

BGP Weaknesses
Increased Convergence Time
Increased Complexity

Aggregate Routes
set routing-options aggregate route 1.1.1.0/25
set policy-options policy-statement bgp term agg from protocol aggregate
set policy-options policy-statement bgp term agg from route-filter 1.1.1.0/25 exact
set policy-options policy-statement bgp term agg then accept
!Add the same AS to the AS path 5 times
set policy-options policy-statement bgp term agg from protocol aggregate
set policy-options policy-statement bgp term agg from route-filter 1.1.1.0/25 exact
set policy-options policy-statement bgp term agg then as-path-expand last-as count 5
set policy-options policy-statement bgp term agg then accept

External Traffic
Enterprise Routing Policies (Inbound/Outbound)
Topology Driven
Primary/Secondary
Load-shared per prefix

Common ISP Routing Policies
Use local preference to prefer certain routes - use communities
Filter all routes by length - do not accept routes longer than /24 from customers or peers
Filter customer routes by prefix, AS path, or both

Topology-Driven
Accept all routes without attribute modification. The BGP path selection algorithm looks primarily at topological factors (such as AS path, multiple exit discriminator (MED), and the IGP metric) to determine the best route to send the traffic.

Primary/Secondary Routing Policies
Strict primary/secondary: always prefers the primary connection, with no traffic on the secondary connection
Loose primary/secondary: prefers the primary connection, and sends some types of traffic on the secondary connection
Equal-bandwidth links with strict primary/secondary provide assured redundancy.

Strict Primary/Secondary Outbound Routing Policy
To enforce strict primary/secondary, receive only a default route.

Loose Primary/Secondary Outbound Routing Policy
To allow loose primary/secondary, receive a default route from both ISPs, but also allow specific routes from the secondary that you want to prefer.

Primary/Secondary Outbound Config
Strict Example Config:
set protocols bgp group primary-isp import [localpref-80 default-only]
set protocols bgp group secondary-isp import [localpref-70 default-only]
set policy-options policy-statement localpref-80 then local-preference 80
set policy-options policy-statement localpref-70 then local-preference 70
set policy-options policy-statement default-only term match-default from route-filter 0.0.0.0/0 exact
set policy-options policy-statement default-only term match-default then accept
set policy-options policy-statement default-only then reject

Loose Config
set protocols bgp group primary-isp import [localpref-80 default-only]
set protocols bgp group secondary-isp import [localpref-70 isp-b-customers default-only]
set policy-options policy-statement localpref-80 then local-preference 80
set policy-options policy-statement localpref-70 then local-preference 70
set policy-options policy-statement isp-b-customers from community isp-b-customer-routes
set policy-options policy-statement isp-b-customers then accept
set policy-options policy-statement default-only term match-default from route-filter 0.0.0.0/0 exact
set policy-options policy-statement default-only term match-default then accept
set policy-options policy-statement default-only then reject
set policy-options community isp-b-customer-routes members 65002:8000

Primary/Secondary Inbound
set protocols bgp group primary-isp export routes-to-ISP
set protocols bgp group secondary-isp export [set-backup routes-to-ISP]
set policy-options prefix-list announce-to-ISP 172.31.128.0/20
set policy-options policy-statement routes-to-ISP from prefix-list announce-to-ISP
set policy-options policy-statement routes-to-ISP then accept
set policy-options policy-statement set-backup then community set ISP-B-localpref-70
set policy-options policy-statement set-backup then as-path-prepend "65501 65501 65501"
set policy-options community ISP-B-localpref-70 members 65002:70

Load-Shared Per-Prefix Routing Policies
A variation on the primary/secondary routing policy, but on a per-prefix basis.
This model sacrifices the 1:1 redundancy found in the strict primary/secondary model and the performance found in the topology-driven model.
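To make the per-prefix idea concrete, here is a small Python sketch (not router configuration) of how an import policy could make each ISP primary for one half of the IPv4 space. The group names isp-b and isp-c, the 0.0.0.0/1 and 128.0.0.0/1 split and the 80/70 local-preference values follow the example configuration in these notes; the function names are hypothetical.

```python
import ipaddress

# Each ISP is primary for one half of the IPv4 address space.
PRIMARY_RANGE = {
    "isp-b": ipaddress.ip_network("0.0.0.0/1"),
    "isp-c": ipaddress.ip_network("128.0.0.0/1"),
}
PRIMARY_PREF = 80
SECONDARY_PREF = 70

def local_preference(isp, prefix):
    """Local preference the import policy for `isp` would assign to `prefix`."""
    net = ipaddress.ip_network(prefix)
    # Rough equivalent of "route-filter <range> orlonger": the prefix
    # must sit entirely inside the ISP's primary range.
    if net.subnet_of(PRIMARY_RANGE[isp]):
        return PRIMARY_PREF
    return SECONDARY_PREF

def preferred_isp(prefix):
    """ISP whose copy of the route wins on local preference alone."""
    return max(PRIMARY_RANGE, key=lambda isp: local_preference(isp, prefix))

print(preferred_isp("10.0.0.0/8"))       # lower half, isp-b is primary
print(preferred_isp("172.31.128.0/20"))  # upper half, isp-c is primary
```

If one ISP fails, its routes disappear and the other ISP's copy, carrying the lower local preference, takes over, which is where the reduced redundancy guarantee of this model comes from.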
Load-Shared Per-Prefix Outbound
set protocols bgp group isp-b import isp-b-import
set protocols bgp group isp-c import isp-c-import
set policy-options policy-statement isp-b-import term primary from route-filter 0.0.0.0/1 orlonger
set policy-options policy-statement isp-b-import term primary then local-preference 80
set policy-options policy-statement isp-b-import term primary then accept
set policy-options policy-statement isp-b-import then local-preference 70
set policy-options policy-statement isp-b-import then accept
set policy-options policy-statement isp-c-import term primary from route-filter 128.0.0.0/1 orlonger
set policy-options policy-statement isp-c-import term primary then local-preference 80
set policy-options policy-statement isp-c-import term primary then accept
set policy-options policy-statement isp-c-import then local-preference 70
set policy-options policy-statement isp-c-import then accept

Load-Shared Per-Prefix Inbound
set protocols bgp group isp-b export [isp-b-export accept-aggregates reject-all]
set protocols bgp group isp-c export [isp-c-export accept-aggregates reject-all]
set policy-options prefix-list aggregates 172.31.128.0/20
set policy-options prefix-list isp-b-specifics 172.31.128.0/21
set policy-options prefix-list isp-c-specifics 172.31.136.0/21
set policy-options policy-statement accept-aggregates from prefix-list aggregates
set policy-options policy-statement accept-aggregates then accept
set policy-options policy-statement isp-b-export from prefix-list isp-b-specifics
set policy-options policy-statement isp-b-export then accept
set policy-options policy-statement isp-c-export from prefix-list isp-c-specifics
set policy-options policy-statement isp-c-export then accept
set policy-options policy-statement reject-all then reject

Chapter 8 Introduction to Multicast
Load on source servers is reduced.

Multicast terms
Source
Multicast IP Packet
Group Address: Range 224.0.0.0/4
Receivers
Designated Router (DR): Router closest to the source or receiver that forwards multicast IP packets
Group Membership Protocol: IGMP for IPv4 and MLD for IPv6
Multicast Routing Protocol: PIM and DVMRP

Multicast State (S,G)
Routers along the forwarding path maintain the (S,G) state.
It consists of:
Known source IP address
Group IP address
Incoming interface
Outgoing interface list

Multicast State (*,G)
Some routers maintain the (*,G) state.
It consists of:
Unknown source IP address (any)
Group IP address
Incoming interface
Outgoing interface list
The tree built from the receiver to the RP is called the shared tree (shared by all sources). The forwarding path from the receiver's router to the RP is kept as (*,G) forwarding state.

Multicast address space - 224.0.0.0/4
The base address 224.0.0.0 is reserved; the range is Class D.
224.2.0.0/16 - SDP/SAP addressing, RFC 2974
232.0.0.0/8 - SSM addressing, RFC 4607
233.0.0.0/8 - GLOP addressing, RFC 3180. Each AS is statically assigned a /24 (256 addresses) of globally assigned multicast addresses: 233 (8 bits) . AS number (16 bits) . locally assigned bits (8 bits)
239.0.0.0/8 - Administratively scoped addresses, RFC 2365, similar to the RFC 1918 address space

IP Multicast to Ethernet mapping
224.10.8.5
MAC address (48 bits) = 25-bit fixed prefix + low-order 23 bits of the group address
01-00-5e-0A-08-05

Multicast Routing Protocol Characteristics
Unicast routing decisions are based on the destination.
Multicast routing decisions are based on the source.
RPF - directs traffic away from its source
Distribution trees - shared or source specific
Each router maintains an inbound interface list and an outbound interface list for each multicast group.

RPF
The RPF check helps guarantee that the distribution tree is loop free. If the RPF check is successful the packet is forwarded, otherwise it is dropped.
The RPF check can be done using inet.0 and normal unicast routing protocols, or RPF can use MBGP to put unicast routes in inet.2.

Distribution Trees
The two basic types of multicast trees are source trees and shared trees. The simplest form of multicast distribution is a source tree.
The source tree is also referred to as a shortest-path tree (SPT).
Unlike source trees, which have their root at the source, shared trees use a single, common root placed at some chosen point in the network, the rendezvous point (RP).

RPF
Multicast uses unicast routes to determine the path back to the source; this determines the upstream, or incoming, interface.
Packets do not loop because they are never flooded back towards their source.
RPF checks can be done using inet.0 and normal unicast routing protocols (default Junos behavior). MBGP can also be used to put unicast routes in inet.2 for RPF.
When using Protocol Independent Multicast (PIM), RPF checks are performed against the main routing table (inet.0) by default. Distance Vector Multicast Routing Protocol (DVMRP), on the other hand, requires the presence of interface routes in the inet.2 table for this purpose. Normally, inet.2 is only used in conjunction with PIM when you want unique unicast and multicast topologies; when RPF checks are performed against the main routing instance, you effectively have the same topology for both unicast and multicast.

Dense-Mode Routing Protocols
Initially assumes everyone wants to receive multicasts
Considerable overhead because every router must maintain multicast state for each active source
Eventually creates a source-based distribution tree
Examples: DVMRP and PIM-DM
Dense-mode philosophy: flood first and prune later
The dense-mode protocols include DVMRP, which builds its own routing table and has a maximum metric of 32 hops.
DVMRP was the first multicast routing protocol.
Every router creates an (S,G) entry for each source/group pair.
Routers without local receivers periodically prune themselves from the SPT to avoid receiving multicast traffic.

Pruning Unwanted Traffic
The router sends a prune message upstream if:
It has no attached receivers
It receives a prune message from a downstream PIM neighbor
Routers resend prune messages periodically for as long as they still do not need the multicast traffic.
When a new group membership is detected, the router sends a Join message towards the SPT to allow multicast to flow for that group.
The Join message is sometimes called a graft message; it avoids the delay of waiting for the previously sent prune to time out.
Flood and prune happens every 3 minutes; the (S,G) entry expires every 3 minutes.
Routers keep the (S,G) entry in case a receiver is added where none existed before.

An SPT
A shortest-path multicast forwarding tree is the final result of any multicast routing protocol (PIM and DVMRP).
A source distribution tree is an SPT between the sender and the receiver.
An SPT is identified by the presence of (S,G) state, whereas a shared tree is identified by (*,G).

IGMP
Manages group membership between hosts and routers
IGMP message exchange
Router queries - sends query messages to solicit group membership
Host messages - report/leave-group messages
Junos uses IGMPv2 by default and supports all 3 versions of IGMP.
IGMPv1 does not support explicit leaves.

IGMP Message Exchange
The router can be a querier or a non-querier. The querier periodically sends query messages.

IGMPv1 RFC 1112
Query messages are sent by the router to the group address 224.0.0.1 with TTL = 1.
Queries are sent every 60 seconds.
In IGMPv1, the multicast routing protocol determines the querier election.

IGMPv3 RFC 3376
Introduces support for group-source report messages. With these messages, a host can choose to receive traffic only from specific sources of a multicast group. This capability accommodates SSM.

IGMPv2 Query-Response Process
1. The querier router sends a general query to all hosts using the multicast group 224.0.0.1 (all-hosts)
2. Host2 is the first to send a report for the group 224.10.1.1
3. Host1 hears the response from Host2 and suppresses its own report
4. Host3 sends a report for the group 224.20.1.1

Query Response Model
The primary function of IGMP is to inform multicast routers of the groups that have active listeners.
IGMPv1 supports only general queries (without a specific group), which cover all groups.

IGMPv2 Group Leave
1. Host2 sends a leave message for the group 224.10.1.1 to the all-routers multicast group 224.0.0.2
2. The querier router sends a group-specific query to 224.10.1.1
3. If no report is received within approximately 3 seconds, the group 224.10.1.1 is removed

IGMPv2 Join Process
When a host wants to join a multicast group, it sends a membership report to the group in question. This report is sometimes called unsolicited, since it was not generated in response to a membership query.

IGMPv3 and SSM
Host1 wants to receive from S=172.16.20.1 but not from S=192.168.30.1.
Host1 sends a group-source include message for S=172.16.20.1.
Host1 sends a group-source exclude message for S=192.168.30.1.
224.0.0.22 (all IGMPv3 routers)
One of the benefits of SSM is that it supports sparse mode without an RP.

IGMP Protocol Configuration
root@R3# set protocols igmp ?
Possible completions:
  <[Enter]>                    Execute this command
  accounting                   Enable join and leave event notification
+ apply-groups                 Groups from which to inherit configuration data
+ apply-groups-except          Don't inherit configuration data from these groups
> interface                    Interface options for IGMP
  maximum-transmit-rate        Maximum transmission rate (packets per second)
  query-interval               When to send host query messages (1..1024 seconds)
  query-last-member-interval   When to send group query messages (seconds)
  query-response-interval      How long to wait for a host query response (seconds)
  robust-count                 Expected packet loss on a subnet (2..10)
> traceoptions                 Trace options for IGMP
  |                            Pipe through a command

root@R3# set protocols igmp interface em0.0 ?
Possible completions:
  <[Enter]>                    Execute this command
  accounting                   Enable join and leave event notification
+ apply-groups                 Groups from which to inherit configuration data
+ apply-groups-except          Don't inherit configuration data from these groups
  disable                      Disable IGMP on this interface
  group-limit                  Maximum number of (source,group) per interface (1..32767)
+ group-policy                 Group filter applied to incoming IGMP report messages
  immediate-leave              Group is removed immediately without sending query for last membership
  no-accounting                Don't enable join and leave event notification
+ oif-map                      Output interface map
> passive                      Suppress sending and receiving IGMP messages
  promiscuous-mode             Accept IGMP messages coming from different subnet
  ssm-map                      Map for SSM translation of IGMPv1 or IGMPv2 messages
+ ssm-map-policy               SSM map policy name
> static                       Static group or source membership
  version                      Set IGMP version number on this interface (1..3)
  |                            Pipe through a command

Sample IGMP Config
edit protocols igmp
set query-interval 125
set query-response-interval 10
set query-last-member-interval 1
set robust-count 2
set maximum-transmit-rate 500
set interface ge-0/0/8.0 version 3
set interface ge-0/0/8.0 static group 224.8.8.8 source 192.168.100.10

root@R2# run show igmp interface
Interface: em0.0
    Querier: 10.1.23.2
    State: Up
    Timeout: None
    Version: 2
    Groups: 2
    Immediate leave: Off
    Promiscuous mode: Off
    Passive: Off
Configured Parameters:
IGMP Query Interval: 125.0
IGMP Query Response Interval: 10.0
IGMP Last Member Query Interval: 1.0
IGMP Robustness Count: 2
Derived Parameters:
IGMP Membership Timeout: 260.0
IGMP Other Querier Present Timeout: 255.0

show igmp group
show igmp statistics
clear igmp membership
clear igmp statistics

Chapter 9: Multicast Routing Protocols and SSM

Multicast Service Models
ASM
Receivers do not specify a source to receive traffic from.
Routers learn the source of the multicast traffic from the source address of the multicast packets.
Supports one source to many receivers and many sources to many receivers.
Original multicast service model, RFC 1112
IPTV uses this model.
SSM
Receivers choose a specific source from which they want to receive traffic.
Routers learn the source of the multicast traffic from the receivers.
Supports only one-to-many forwarding.

Multicast Routing Protocol Modes
Dense mode
Traffic is initially flooded throughout the network.
Relies on the flood/prune process to maintain (S,G) state on every router in the network.
Always builds an SPT between the source and the receivers.
Includes the protocols MOSPF, DVMRP and PIM-DM.
Junos supports DVMRP and PIM-DM but not MOSPF.
Sparse mode
Traffic is forwarded only to the (interested) receivers and to the RP.
Initially creates an RPT between the source and the receiver.
When the router closest to the receiver learns the source of the multicast traffic, it starts building the SPT.
Only the routers along the SPT need to maintain (S,G) state. Each router along the forwarding path sends Join messages indicating that it wants to receive traffic. When the receiver no longer needs the traffic, the router sends Prune messages to its upstream neighbor.
There are 2 possible forwarding trees: SPT and RPT.
The SPT is the final result of any multicast protocol:
Dense mode: (S,G) on every router
Sparse mode: (S,G) only on the routers along the SPT

RPT
The RPT is a multicast tree used during the initial flow of a new source in a PIM-SM network.
Routers along an RPT maintain the (*,G) state.

RP Tree
An RPT can only be found in a PIM-SM network. The RP is responsible for knowing every combination of active sources and groups in the network.
This tree is used temporarily until the receivers' routers learn the source of the multicast traffic. Once these routers learn the source, they build an SPT directly to the source designated router (DR).

PIM-SM
PIM-SM trees
The router closest to the receiver initially joins the shared RPT.
After receiving multicast traffic, the receiver DR initiates an SPT from the source to the receiver.
Design considerations
Placement/reliability of RP router, need for Tunnel Services PIC on some hardware
PIM allows PIM sparse-dense mode, using sparse mode for the sparse groups and dense mode for the dense groups.
Using an RP requires a pd interface on the RP and a pe interface on every router directly connected to a multicast source. The pd and pe virtual interfaces require the presence of a PIC.

Sparse-Dense mode
Sparse-dense mode is typically seen when Auto-RP is used.

PIM Register Messages
PIM-SM requires the source DR to encapsulate the multicast traffic inside register messages, sent as unicast to the group's RP.
On receipt, the RP decapsulates the traffic and forwards it down the shared tree as native multicast, if the tree exists.
To provide tunnel services, some Junos routers require an Adaptive Services or Tunnel Services PIC.

PIM-SM Adding a Receiver
Adding a receiver to a PIM-SM network (ASM model) requires an RPT to be built from the RP to the receiver.

PIM-SM: The Shared RPT
In sparse mode, upstream routers do not forward multicast traffic until they receive a Join message from a downstream router, unless group members are directly connected.

PIM-SM: Switch to SPT
The switch to the SPT is triggered by the receiver DR after it receives multicast traffic.
Receiver's designated router: the router closest to the receiver with the highest PIM priority.
The receiver DR sends a Join message toward the source, and prunes the (S,G) state from the shared RP tree when it determines that the traffic did not arrive over the optimal path (between the source and the receiver).
In Junos, the switch from the shared tree to the SPT is generally triggered by the receiver side on the first packet from the new source on the shared tree.
The first-hop router keeps sending periodic null register messages to the RP, so that the RP continues to know about the source.

RP Options
Three RP election methods:
BSR
Auto-RP
Static configuration
Preference: BSR over auto-RP over static
A candidate RP requires local configuration in case it is elected as the RP:
set protocols pim rp local address 192.168.10.1 group-ranges 234.0.0.0/8
If group-ranges is omitted, 224.0.0.0/4 is used.

Static RP Configuration
!Defining the RP
set protocols pim rp local address 192.168.10.1 group-ranges 224.0.0.0/4
set protocols pim interface ge-0/0/1.0 mode sparse
set protocols pim interface lo0.0 mode sparse
!Defining the other PIM routers
set protocols pim rp static address 192.168.10.1
set protocols pim interface ge-0/0/8.0 mode sparse
set protocols pim interface lo0.0 mode sparse
The benefit of a static RP is that it works in a PIMv1 or PIMv2 network.

root@R3# run show pim rps extensive
Instance: PIM.master
Address family INET
RP: 10.10.10.2
Learned via: static configuration
Mode: Sparse
Time Active: 00:03:55
Holdtime: 0
Device Index: 25
Subunit: 32769
Interface: pime.32769
Static RP Override: Off
Group Ranges:
    224.0.0.0/4
Active groups using RP:
    224.1.1.1
    total 1 groups active
Address family INET6

Auto-RP
Lets a router learn the RP address dynamically.
PIMv1 had no way to learn the RP dynamically.
Auto-RP is a nonstandard protocol.
Two dense-mode groups are required:
224.0.1.39 (Announce): used to learn which routers are candidate RPs
224.0.1.40 (Discovery): lets PIM routers learn the group-to-RP mappings
Allows backup RPs for failover.
Selects the RP for a multicast group address range based on the candidate IP address (the highest IP address is chosen).
One advantage of Auto-RP is that a backup RP can be kept in the network. The Auto-RP mapping agent controls this capability.
There can be only one RP for each multicast group.
The mapping agent sends Discovery messages into the network announcing the RP for the respective multicast groups.

Auto-RP Message Format
Version and type (1 byte): This octet is split into two 4-bit fields. The first 4 bits represent the current version of auto-RP being used and are set to a constant value of 1. The second 4 bits represent the actual type of auto-RP message encoded. The possible values are the following:
- 1: RP announcement message; and
- 2: RP mapping message.
RP count (1 byte): This field displays the number of distinct RP addresses present in the message. For each address present, the RP address field, as well as its associated fields, is repeated.
Hold time (2 bytes): This field displays the amount of time, in seconds, for which the RP message is valid.
Reserved (4 bytes): This field is not used and is set to a constant value of 0x00000000.
RP address (4 bytes): This field displays the first RP address included in the message. The remaining fields in the message are repeated for each unique RP address.
Reserved and RP version (1 byte): This byte begins with a 6-bit reserved field, which is set to all zeros. The final 2 bits are used to display the version of PIM supported on the RP. The possible values include the following:
- 00: Version unknown;
- 01: Version 1 only;
- 10: Version 2 only; and
- 11: Both versions 1 and 2.
Group count (1 byte): This field displays the number of groups associated with the RP address.
Encoded group address (6 bytes): This field contains three separate portions used to describe the group address associated with the RP. The entire 6 bytes are repeated based on the value in the group count field. The field portions include the following:
- Reserved and N bit (1 byte): This field contains seven bits set to a value of 0, followed by the N bit. The N bit denotes whether the group address is positive or negative. A value of 0 represents a positive group address, which should use sparse mode PIM forwarding. A value of 1 represents a negative group address, which should use dense mode PIM forwarding.
- Mask length (1 byte): This field displays the length of the group prefix to follow.
- Group address (4 bytes): This field displays the multicast group address that the RP supports.
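The field layout above can be turned into a small decoder. The following is a minimal Python sketch that assumes exactly the byte layout described in these notes (it has not been checked against a packet capture); the function name parse_auto_rp and the sample payload are hypothetical, built by hand for illustration.

```python
import socket
import struct


def parse_auto_rp(payload: bytes) -> dict:
    """Decode an Auto-RP payload using the field layout from the notes above."""
    # Fixed header: version/type (1), RP count (1), hold time (2), reserved (4).
    ver_type, rp_count, hold_time, _reserved = struct.unpack_from("!BBHI", payload, 0)
    message = {
        "version": ver_type >> 4,   # first 4 bits
        "type": ver_type & 0x0F,    # 1 = announcement, 2 = mapping
        "hold_time": hold_time,
        "rps": [],
    }
    offset = 8
    for _ in range(rp_count):
        # Per RP: address (4), reserved + PIM version bits (1), group count (1).
        rp_addr, rp_ver, group_count = struct.unpack_from("!4sBB", payload, offset)
        offset += 6
        groups = []
        for _ in range(group_count):
            # Per group: reserved + N bit (1), mask length (1), group address (4).
            flags, mask_len, group = struct.unpack_from("!BB4s", payload, offset)
            offset += 6
            groups.append({
                "negative": bool(flags & 0x01),  # N bit: 1 means dense mode
                "prefix": f"{socket.inet_ntoa(group)}/{mask_len}",
            })
        message["rps"].append({
            "address": socket.inet_ntoa(rp_addr),
            "pim_version_bits": rp_ver & 0x03,  # 0b10 = version 2 only
            "groups": groups,
        })
    return message


# Hand-built sample: one announcement for RP 10.10.10.2 covering 224.0.0.0/4.
sample = (
    bytes([0x11, 1])                      # version 1, type 1, one RP
    + struct.pack("!HI", 150, 0)          # hold time 150 s, reserved
    + socket.inet_aton("10.10.10.2")      # RP address
    + bytes([0b10, 1])                    # PIM version 2 only, one group
    + bytes([0, 4])                       # positive group, mask length 4
    + socket.inet_aton("224.0.0.0")       # group address
)
parsed = parse_auto_rp(sample)
print(parsed)
```

The nested loops mirror the repetition rules in the notes: the RP block repeats RP count times, and the 6-byte encoded group address repeats group count times inside each RP block.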
Configuring Auto-RP
Each router is configured with one auto-rp option:
discovery: listens for mapping messages
announce: advertises the router as a candidate RP
mapping: elects the RP for each multicast group and advertises it in mapping messages

!RP Config
set protocols pim dense-groups 224.0.1.39/32
set protocols pim dense-groups 224.0.1.40/32
set protocols pim rp local address 10.10.10.2 group-ranges 224.0.0.0/4
set protocols pim rp auto-rp announce
set protocols pim interface all mode sparse-dense

!Mapping Agent Config
set protocols pim dense-groups 224.0.1.39/32
set protocols pim dense-groups 224.0.1.40/32
set protocols pim rp auto-rp mapping
set protocols pim interface all mode sparse-dense

!Other PIM Router
set protocols pim dense-groups 224.0.1.39/32
set protocols pim dense-groups 224.0.1.40/32
set protocols pim rp auto-rp discovery
set protocols pim interface all mode sparse-dense

show pim rps extensive

BSR Election
The bootstrap mechanism is part of PIMv2.
Originally defined in RFC 2362, now in RFC 5059.
One BSR is elected based on the highest priority, or on the highest IP address (of the loopbacks) if the priorities are equal.
If the loopback has multiple IP addresses, the lowest one is used unless the primary option is set.
Routers with priority 0 are excluded from the selection process.
Multiple candidate BSRs can be configured.
A router can be both a candidate RP and a candidate BSR.
At a minimum, one candidate RP and one candidate BSR must exist for the mechanism to work.

BSR Functions
Once the BSR is known in the PIM domain, each candidate RP sends a C-RP-Adv message to the BSR.
The BSR collects all C-RP-Adv messages:
It can use a local policy to restrict the candidate RPs.
The bootstrap message sends the RP-set to the domain, containing all the advertised RP routers.
Each PIM router uses the RP-set information to calculate which RP to use for each multicast group:
Automatic load balancing across the candidate RPs.
A fixed set of rules ensures that every router makes the same decision.

BSR Advertises to the Domain
This process allows different multicast groups to be served by different RPs:
1. Locate all RPs associated with the most specific group range for the group in the PIM Join message.
2. Identify the RPs with the best priority.
3. From the resulting RPs, compute a hash using the group address, the RP address, and the hash mask length advertised in the bootstrap message. Select the RP with the highest hash value.
4. If several RPs remain, select the RP with the highest IP address.

BSR Message
Fragment tag (2 bytes): It is possible that an individual bootstrap message might be too large for transmission in the network without fragmentation. This field includes a randomly generated number designed to ensure that all fragmented packets are associated with the same bootstrap message. Each fragment receives the same value in this field.
Hash mask length (1 byte): This field displays the length, in bits, that each router should use when calculating the BSR hash algorithm. For IPv4 messages, a value of 30 is recommended.
BSR priority (1 byte): This field displays the priority value of the current network BSR.
BSR address (6 bytes): The address of the domain's BSR is placed in this field using the encoded unicast address format.
Group address (8 bytes): This field, as well as its associated fields, can be repeated multiple times in a single bootstrap message. It contains the multicast group address using the encoded group address format.
RP count (1 byte): This field displays the number of RPs included in the RP-set for the associated group address.
Fragment RP count (1 byte): When a bootstrap message is fragmented, this field is used to display the number of RPs in the RP-set included in this fragment for the group address.
Reserved (2 bytes): This field is not used and is set to a constant value of 0x0000.
RP address (6 bytes): This field is repeated based on the value in the RP count field for the associated group address. The address of the RP is displayed using the encoded unicast address format.
The RP hold time, RP priority, and reserved fields following the RP address are associated with this single address value.
RP hold time (2 bytes): This field displays the amount of time, in seconds, for which the associated RP in the preceding field is valid.
RP priority (1 byte): This field displays the priority of the associated RP. It is used in the hash algorithm for deciding which RP should be used for a particular group address.
Reserved (1 byte): This field is not used and is set to a constant value of 0x00. It is associated with the RP address in the preceding field.

Candidate RP Advertisement Message
Prefix count (1 byte): This field displays the number of distinct group address ranges the local RP supports. A value of 0 in this field means that the RP supports all possible groups (224.0.0.0/4).
Priority (1 byte): The priority of the RP for its advertised group address is placed into this field. Lower numerical values translate into a higher priority. The Junos OS uses a priority value of 0 by default.
Hold time (2 bytes): This field displays the amount of time the BSR should retain knowledge of the local RP and its supported group addresses.
RP address (6 bytes): The address of the local RP is placed in this field using the encoded unicast address format.
Group address (8 bytes): This field is repeated based on the value in the prefix count field. Each unique group address range supported by the local RP is placed here using the encoded group address format.

BSR Election Process
1. The BSR sends messages to the other routers announcing the BSR IP address.
2. Candidate RPs send advertisements listing a group range.
3. The BSR collects the advertisements from the candidate RPs and advertises them to the remaining routers.
4. Each PIM router elects an RP for the group range.

Configuring BSR
!Config BSR
set protocols pim rp bootstrap-priority 50
set protocols pim interface all mode sparse
!RP Config
set protocols pim rp local address 10.10.10.2 group-ranges 224.0.0.0/4
set protocols pim interface all mode sparse
!Other PIM Router
set protocols pim interface all mode sparse
The default BSR priority in Junos is zero.

lab@exC-2# run show pim bootstrap
Instance: PIM.master
BSR           Pri  Local address  Pri  State       Timeout
10.10.10.1    50   10.10.10.2     0    InEligible  80
None          0    (null)         0                0

lab@exC-2# run show pim rps
Instance: PIM.master
Address family INET
RP address    Type       Holdtime  Timeout  Groups  Group prefixes
10.10.10.2    bootstrap  150       125      0       224.0.0.0/4
10.10.10.2    static     0         None     0       224.0.0.0/4
Address family INET6

lab@exC-2# run show pim interfaces
Instance: PIM.master
Name          Stat  Mode    IP  V  State  NbrCnt  JoinCnt(sg)  JoinCnt(*g)  DR address
ge-0/0/12.0   Up    Sparse  4   2  NotDR  1       0            0            172.25.1.2
ge-0/0/13.0   Down  Sparse  4   2  DR     0       0            0
ge-0/0/8.0    Up    Sparse  4   2  DR     0       0            0            192.168.100.11

show pim join extensive
clear pim join
show multicast route extensive
show multicast rpf
!View RPF lookup cache
show route table inet.1
After the router performs the RPF check, the lookup result is placed in the inet.1 cache.
The entry can also be found in the forwarding table.

Multicast Routing Protocols and SSM
                     ASM                      SSM
Address Identifier   G                        S,G
Address Designation  Group                    Channel
Receiver Operations  Join, Leave              Subscribe, Unsubscribe
Group Address Range  224/4 excluding 232/8    224/4 (guaranteed only for 232/8)

Guarantees for the 232/8 Range
The router closest to the receiver never initiates a (*,G) Join message for 232/8.
Backbone routers never propagate a (*,G) Join message for 232/8.
RPs do not accept PIM Register messages or (*,G) Join messages for 232/8.
The source DR does not send PIM Register messages to the RP for 232/8.

PIM-SM and SSM
The receivers and the querier router must support IGMPv3.
The RP is bypassed, because the receiver informs the network of the channel (the combination of source and group address).
ASM and SSM can be supported simultaneously if an RP exists in the PIM-SM network.
(S,G) state is established on every router along the SPT. The RP is bypassed because the source is already known.
PIM-SM using SSM always uses the SPT.
The receiver's DR holds only (S,G) state, never (*,G) state.

Config PIM-SM with SSM
!Receiver's DR
set protocols igmp interface ge-0/0/8.0 version 3
set protocols pim interface all
!Other PIM Routers
set protocols pim interface all mode sparse

Configure ssm-map
Use ssm-map to let IGMPv1 and IGMPv2 hosts operate in an SSM-modeled network.
!DR config
set policy-options policy-statement group-224.7.7.7 term 10 from route-filter 224.7.7.7/32 exact
set policy-options policy-statement group-224.7.7.7 term 10 then accept
set protocols igmp interface ge-0/0/8.0 version 2 ssm-map example
set protocols pim interface all mode sparse

Chapter 10 Class of Service
CoS components:
Traffic classification
Policing
Queuing
Scheduling
Rewrite rules

show interfaces ge-0/0/8 extensive | find "Queue counters"
show class-of-service classifier type inet-precedence name ipprec-compatibility

!Create a DSCP classifier and import the DSCP default
set class-of-service classifiers dscp entvoip import default
!Apply the classifier to the interface input
set class-of-service interfaces ge-0/0/8 unit 0 classifiers dscp entvoip

By default only Queue 0 and Queue 1 have reserved bandwidth.

lab@srxA-2# ...how interfaces ge-0/0/8 extensive | find "Queue counters"
Queue counters:     Queued packets  Transmitted packets  Dropped packets
  0 best-effort     123             123                  0
  1 expedited-fo    9               9                    0
  2 assured-forw    0               0                    0
  3 network-cont    12267           12267                0
Queue number:       Mapped forwarding classes
  0                 best-effort
  1                 expedited-forwarding
  2                 assured-forwarding
  3                 network-control

[edit]
lab@srxA-2# run show interfaces ge-0/0/8 extensive | find "CoS info"
CoS information:
  Direction : Output
  CoS transmit queue    Bandwidth          Buffer        Priority  Limit
                        %    bps           %    usec
  0 best-effort         95   950000000     95   0        low       none
  3 network-control     5    50000000      5    0        low       none
Interface transmit statistics: Disabled

Bandwidth reservations are configured through schedulers.
set class-of-service schedulers expedited-forwarding priority high
set class-of-service schedulers best-effort transmit-rate percent 80 priority low
set class-of-service schedulers best-effort transmit-rate percent 8 priority low
set class-of-service interfaces ge-0/0/0 scheduler-map sch-map

set class-of-service scheduler-maps sch-map forwarding-class assured-forwarding scheduler besteffor
set class-of-service scheduler-maps sch-map forwarding-class expedited-forwarding scheduler expeditforwading
set class-of-service schedulers expeditforwading priority high
set class-of-service schedulers besteffor transmit-rate percent 80
set class-of-service schedulers besteffor priority low

juniper@vjx1# run show interfaces ge-0/0/0 extensive ....
CoS information:
  Direction : Output
  CoS transmit queue        Bandwidth          Buffer        Priority  Limit
                            %    bps           %    usec
  1 expedited-forwarding    r    r             r    0        high      none
  2 assured-forwarding      80   800000000     r    0        low       none
Interface transmit statistics: Disabled
The r means the queue is given the remainder of the bandwidth, which is 20% in this config.

clear interfaces statistics

To prevent the VoIP queue from consuming all resources, a policer can be configured that moves the excess traffic to best-effort.
!Create the policer
set firewall policer voice-overflow if-exceeding bandwidth-limit 2097152
set firewall policer voice-overflow if-exceeding burst-size-limit 25k
set firewall policer voice-overflow then forwarding-class best-effort
!Create the filter and reference the policer
set firewall filter voice-overflow term ef from dscp ef
set firewall filter voice-overflow term ef then policer voice-overflow
!Apply to the interface
set interfaces ge-0/0/8.0 family inet filter input voice-overflow

On SRX devices a high-priority queue can starve the other priorities unless it is rate-limited.

!CoS Processing on SRX Devices
Ingress -> BA Classifier -> Multifield Classifier -> Policer -> Forwarding Policy -> Route Lookup -> Policing (Egress) -> Rewrite/Marker -> Queue/Scheduler -> WRED -> Egress

Ingress CoS Processing
1. BA classification: Packets arriving at the router are first subjected to the BA classification stage. This stage sets the forwarding class and packet loss priority (PLP) using any of the supported BA classifier types, including IP precedence, DSCP, Institute of Electrical and Electronics Engineers (IEEE) 802.1P, and so on.
2. Multifield classification: The next processing stage is multifield classification. Here a firewall filter can be defined to match against numerous packet fields, incoming interfaces, and so on, to set the forwarding class or PLP, or to override the values set during BA classification.
3. Ingress policing: When desired, a firewall or interface-level policer can be applied to limit matching traffic by discarding, by reclassification, or by marking excess traffic with a loss priority of high. This means that, in the event of congestion, a random early detection (RED) profile can be used to more aggressively drop PLP high traffic.
4. Forwarding policy: The last ingress processing stage is forwarding policy. This policy can alter the existing forwarding class or PLP setting, and it can be used to select a forwarding next hop based on a forwarding class, a feature called class-of-service-based forwarding (CBF).

Egress CoS Processing
1. Egress policing: After the route lookup, a packet begins its journey toward the selected egress interface. The first egress CoS processing stage is output policing, which is again based on either a firewall or an interface-level policer. Once again, excess traffic can be discarded or marked with a loss priority for later discard in the event of congestion.
2. Rewrite marker: The rewrite marker stage allows you to alter one packet field, or in some cases multiple packet fields, as the packet is transmitted to downstream nodes. Normally, you rewrite packet fields to accommodate downstream BA-based classification. Rewrite markers are indexed by protocol family and by forwarding class; for example, writing a 001 pattern into the precedence field of all family inet packets that are classified as best effort (BE).
3. Queuing and scheduling: The queuing stage involves placing packet notifications into the corresponding forwarding class queue, where they are serviced by a scheduler that factors priority and configured weight to determine when a packet should be dequeued from a given queue.
4. RED/Congestion control: The final CoS processing stage involves a weighted random early detection (WRED) drop decision, based on protocol, loss priority, and average queue fill level.
RED tends to operate at the head of the queue, and a RED decision is made against each packet selected for transmission by the scheduler stage.

Behavior Aggregate Classification
Available Classification
juniper@vjx1# set class-of-service classifiers ?
Possible completions:
+ apply-groups          Groups from which to inherit configuration data
+ apply-groups-except   Don't inherit configuration data from these groups
> dscp                  Differentiated Services code point classifier
> dscp-ipv6             Differentiated Services code point classifier IPv6
> exp                   MPLS EXP classifier
> ieee-802.1            IEEE-802.1 classifier
> ieee-802.1ad          IEEE-802.1ad (DEI) classifier
> inet-precedence       IPv4 precedence classifier

The classifier can be populated with default values:
juniper@vjx1# edit class-of-service classifiers inet-precedence test
[edit class-of-service classifiers inet-precedence test]
juniper@vjx1# set import ?
Possible completions:
                        Include this classifier in this definition
  default               Default classifier for this code point type
  test

Behavior Aggregate Classification
set class-of-service classifiers inet-precedence test forwarding-class best-effort loss-priority low code-points 001
set class-of-service interfaces ge-0/0/0 unit 0 classifiers inet-precedence test

Multifield Classification
Implemented through a firewall filter.
set firewall filter mf term 1 from protocol udp
set firewall filter mf term 1 from destination-port 17000
set firewall filter mf term 1 then loss-priority low
set firewall filter mf term 1 then forwarding-class assured-forwarding
set interfaces ge-0/0/0 unit 0 family inet filter input mf
When both a BA classifier and a multifield classifier apply and they conflict, the forwarding class set by the multifield classifier takes precedence.

Rewrite Marker
Marking is usually done at the edge of the network.
The rewrite table can be imported from a default set.
set class-of-service rewrite-rules dscp test forwarding-class best-effort loss-priority low code-point 000001
set class-of-service interfaces ge-0/0/0 unit 2 rewrite-rules dscp test
show class-of-service rewrite-rules

Scheduling and Queuing
SRX Series devices implement the modified deficit round-robin (MDRR) scheduler, defined by three variables:
Buffer size
Quantum
Priority
By default a scheduler is applied to a physical interface. The per-unit-scheduler option allows schedulers to be associated with logical units instead.

Buffer size: This is the delay buffer for the queue that allows it to accommodate traffic bursts. You can configure a buffer size as a percentage of the output interface's total buffer capacity or as a temporal value from 1 to 200,000 microseconds, which simply represents buffer size as a function of delay, rather than bytes.
Quantum: The quantum is the number of credits added to a queue every unit of time and is a function of the queue's transmit weighting. The queue's transmit rate specifies the amount of bandwidth allocated to the queue and can be set based on bits per second or as a percentage of egress interface bandwidth. By default, a queue can be serviced when in negative credit, as long as no other queues with the same priority have traffic pending. When desired, you can rate-limit a queue to its configured transmit rate with inclusion of the exact option. MDRR uses a deficit counter to determine whether a queue has enough credits to transmit a packet. It is initialized to the queue's quantum, which is a function of its transmit rate, and it is the number of credits that are added to the queue every quantum.
Priority: The priority can be low, medium-low, medium-high, high, or strict-high, and it determines the sequence in which queues are serviced. The scheduler services high-priority queues before it addresses any low-priority queues.
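To make the quantum and deficit-counter idea concrete, here is a minimal Python sketch of classic deficit round-robin. It is a simplification and not the Junos MDRR implementation: priorities and the negative-credit behavior described above are deliberately left out, and the function name drr_schedule, the queue names, and the quantum values are all hypothetical.

```python
from collections import deque


def drr_schedule(queues, quanta, rounds):
    """Classic deficit round-robin over packet sizes in bytes.

    queues: dict of queue name -> deque of packet sizes
    quanta: dict of queue name -> credits (bytes) added per round
    Returns the transmit order as a list of (queue name, packet size).
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficit[name] = 0  # an idle queue does not bank credit
                continue
            deficit[name] += quanta[name]
            # Transmit while the head packet fits in the available credit.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                sent.append((name, size))
            if not queue:
                deficit[name] = 0
    return sent


queues = {
    "ef": deque([200, 200, 200]),   # small voice-like packets
    "be": deque([1500, 1500]),      # full-size data packets
}
quanta = {"ef": 500, "be": 1000}
order = drr_schedule(queues, quanta, rounds=2)
print(order)
```

In the first round the be queue cannot send its 1500-byte packet with only 1000 credits, so the credit carries over; in the second round it has 2000 credits and sends one packet. That carry-over is the "deficit" in the name.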
set class-of-service schedulers test transmit-rate percent 50
set class-of-service schedulers test buffer-size percent 30
set class-of-service schedulers test priority high
set class-of-service schedulers test drop-profile-map loss-priority low protocol any drop-profile high-drop
set class-of-service scheduler-maps test-map forwarding-class expedited-forwarding scheduler test
set class-of-service interfaces scheduler-map test-map
The high-drop drop profile is applied to any packets marked with loss-priority low.

Configuring Scheduler Transmission Rate
Rate
Percentage
Remainder
set class-of-service schedulers sched-best transmit-rate percent 40
set class-of-service schedulers sched-best buffer-size percent 40
set class-of-service schedulers sched-exped transmit-rate remainder
set class-of-service schedulers sched-exped buffer-size remainder
set class-of-service schedulers sched-exact transmit-rate percent 40 exact
set class-of-service schedulers sched-exact buffer-size percent 40
A transmit-rate given as a rate is in bps.
transmit-rate rate: Transmission rate, in bits per second. The rate can be from 3200 through 160,000,000,000 bps.
transmit-rate percentage: Percentage of transmission capacity.
transmit-rate remainder: Use remaining rate available.
The remainder and exact options cannot be combined.

Configuring Scheduler Buffer Size
Percentage
Remainder
Temporal value
The delay-buffer bandwidth provides packet buffer space to absorb burst traffic up to the specified duration of delay. Once the specified delay buffer becomes full, packets with 100 percent drop probability are dropped from the head of the buffer.
A percentage of the total buffer: The total buffer per queue is based on microseconds and differs by platform type. Use the percent percentage option.
The remaining buffer available: The remainder is the buffer percentage that is not assigned to other queues.
In the example on the slide, we have assigned 40 percent of the delay buffer to the sched-best scheduler, allowed sched-network to keep the default allotment of 5 percent, and assigned the remainder to sched-exped. With this configuration, the sched-exped scheduler will use approximately 55 percent of the delay buffer.
A temporal value, in microseconds: For the temporal setting, the queuing algorithm starts dropping packets when it queues more than a computed number of bytes. This maximum is computed by multiplying the logical interface speed by the configured temporal value. This value differs by platform; please refer to the documentation for your particular device.

Congestion Control with WRED
By default, Junos applies a 100% drop probability only at 100% fill, which effectively disables RED on the queue.
WRED is configured with drop-profiles, plus a reference to the drop profile in the scheduler.
set class-of-service drop-profiles high-drop fill-level 40 drop-probability 0
set class-of-service drop-profiles high-drop fill-level 50 drop-probability 10
set class-of-service drop-profiles high-drop fill-level 70 drop-probability 20
Juniper supports up to four drop profiles; they can be referenced by traffic type (TCP and/or UDP) together with a loss priority (PLP) of high or low.
The result is that the RED drop actions are weighted by traffic type.

Default CoS configuration:
Queue  Forwarding class      Priority  Transmit rate  Drop profile
0      Best effort           Low       95%            Tail drop
1      Expedited forwarding  none      none           drop profile
2      Assured forwarding    none      none           drop profile
3      Network control       Low       5%             Tail drop
In the default config, input BA classification is done by the ipprec-compatibility table.
No rewrite is performed with the default CoS config; packets are sent with the same marking they were received with.
show class-of-service classifier
show class-of-service rewrite-rule

Junos OS Policing
Token Based
The policers implemented are token based.
Bandwidth is measured as the average number of bits over a one-second interval.
Burst size implements the policer's "token-based" behavior:
When a packet is sent, its bytes (tokens) are removed from the bucket; if there are not enough tokens, the packet is policed.
The bucket is then refilled at the bandwidth rate.
The burst size defines the initial and maximum size of the bucket in bytes (tokens).

Policer Actions
The packet is dropped if it exceeds the configured rate; this is known as hard policing.
The packet can be marked with a loss priority (PLP); this is known as soft policing.
The packet can be classified into a forwarding class; this is also known as soft policing.
set firewall policer port80 if-exceeding bandwidth-limit 5242880
set firewall policer port80 if-exceeding burst-size-limit 62500
set firewall policer port80 then discard
set firewall family inet filter remotesites term port80 from destination-port 80
set firewall family inet filter remotesites term port80 then policer port80
The burst-size-limit setting always seems to be something of a mystery...
Setting this value too low can cause potentially every packet to be policed; setting it too high goes to the other extreme, and almost nothing is policed.
The burst-size-limit should never be less than 10 times the maximum MTU. The recommended value is the amount of traffic the interface can send in 5 milliseconds.
So on a FastEthernet interface the minimum is 15000 bytes (10 * 1500), and the recommended value is 62500 bytes (12500 bytes/ms * 5 ms).
set firewall policer voice if-exceeding bandwidth-limit 10485760
set firewall policer voice if-exceeding burst-size-limit 62500
set firewall policer voice then loss-priority high
set firewall policer voice then forwarding-class best-effort
set firewall family inet filter remotesites term voice from dscp ef
set firewall family inet filter remotesites term voice then policer voice
set firewall policer all if-exceeding bandwidth-limit 15728640
set firewall policer all if-exceeding burst-size-limit 62500
set firewall policer all then loss-priority high
set firewall policer all then forwarding-class best-effort
set firewall family inet filter remotesites term all then policer all
set firewall family inet filter remotesites term all then accept
set interfaces ge-0/0/0.0 family inet filter input remotesites

Virtual Channels
Used with Frame Relay technologies.
Ensures that the central site does not overrun slower remote sites.
Configuration is done at the high-speed central site.
set class-of-service virtual-channels siteA-default
set class-of-service virtual-channels siteB
set class-of-service virtual-channel-groups cloud-group siteA-default scheduler-map cos_scheduler
set class-of-service virtual-channel-groups cloud-group siteA-default shaping-rate 128k
set class-of-service virtual-channel-groups cloud-group siteA-default default
set class-of-service virtual-channel-groups cloud-group siteB scheduler-map cos_scheduler
set class-of-service virtual-channel-groups cloud-group siteB shaping-rate 256k
set class-of-service interfaces t1-0/0/0 unit 0 virtual-channel-group cloud-group
It is not possible to commit with any shaping or scheduler-map configuration.
set firewall filter vc_select term 1 from destination-address 10.0.0.3/32
set firewall filter vc_select term 1 then virtual-channel siteB accept
set firewall filter vc_select term default then virtual-channel siteA-default accept
set class-of-service interfaces t1-0/0/0 unit 0 virtual-channel-group cloud-group
set interfaces t1-0/0/0 per-unit-scheduler
set interfaces t1-0/0/0 unit 0 family inet filter output vc_select

Real-Time Performance Monitoring (RPM)
Also known as SLA probes.
Measures round-trip delays.
Data such as transit delay and jitter can be collected from these probes, and this data can be used to provide an approximation of the delay and jitter experienced by live traffic in the network. Different live traffic metrics like round-trip time (RTT), positive egress jitter, negative egress jitter, positive ingress jitter, negative ingress jitter, positive round-trip jitter, and negative round-trip jitter can be gleaned from the results. RPM calculates the minimum, maximum, average, peak-to-peak, standard deviation, and sum for each of these measurements.
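The summary values RPM reports can be reproduced from a list of samples. This is a minimal Python sketch, not the Junos implementation: the function name summarize and the sample RTT values are hypothetical, and the use of the population standard deviation is an assumption, since these notes do not say which variant RPM uses.

```python
import statistics


def summarize(samples_usec):
    """Summary statistics for a list of RTT samples in microseconds."""
    minimum = min(samples_usec)
    maximum = max(samples_usec)
    return {
        "samples": len(samples_usec),
        "minimum": minimum,
        "maximum": maximum,
        "average": statistics.mean(samples_usec),
        "peak_to_peak": maximum - minimum,
        # Assumption: population standard deviation.
        "stddev": statistics.pstdev(samples_usec),
        "sum": sum(samples_usec),
    }


rtts_usec = [20000, 22000, 24000, 30000]  # made-up samples
summary = summarize(rtts_usec)
print(summary)
```

Peak to peak is simply maximum minus minimum, which matches the probe-results output shown later in these notes (for example 29443 - 20365 = 9078 usec).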
RPM Config Example
!Client Side
set services rpm probe test_probe test udp_test probe-type udp-ping
set services rpm probe test_probe test udp_test target address 1.1.2.1
set services rpm probe test_probe test udp_test probe-count 15
set services rpm probe test_probe test udp_test probe-interval 2
set services rpm probe test_probe test udp_test test-interval 1
set services rpm probe test_probe test udp_test destination-port 52000
set services rpm probe test_probe test udp_test source-address 1.1.2.2
set services rpm probe test_probe test udp_test history-size 100
set services rpm probe test_probe test udp_test dscp-code-points ef
set services rpm probe test_probe test udp_test data-size 128
set services rpm probe test_probe test udp_test thresholds total-loss 5
Each test sends 15 probes, one every 2 seconds, with 1 second between tests.
!Server Side
set services rpm probe-server udp port 52000

juniper@VJX2# run show services rpm ?
Possible completions:
  active-servers       Show configured servers
  history-results      Show history results
  probe-results        Show probe results

juniper@VJX2# run show services rpm probe-results
Owner: test_probe, Test: udp_test
Target address: 1.1.2.1, Source address: 1.1.2.2, Probe type: udp-ping, Test size: 15 probes
Probe results:
  Response received, Fri Oct 31 10:12:22 2014, No hardware timestamps
  Rtt: 20732 usec
Results over current test:
  Probes sent: 11, Probes received: 11, Loss percentage: 0
  Measurement: Round trip time
    Samples: 11, Minimum: 20365 usec, Maximum: 29443 usec, Average: 21366 usec, Peak to peak: 9078 usec, Stddev: 2558 usec, Sum: 235024 usec
Results over last test:
  Probes sent: 15, Probes received: 15, Loss percentage: 0
  Test completed on Fri Oct 31 10:12:01 2014
  Measurement: Round trip time
    Samples: 15, Minimum: 20439 usec, Maximum: 30387 usec, Average: 21295 usec, Peak to peak: 9948 usec, Stddev: 2433 usec, Sum: 319430 usec
Results over all tests:
  Probes sent: 56, Probes received: 37, Loss percentage: 33
  Measurement: Round trip time
    Samples: 37, Minimum: 20365 usec, Maximum: 30387 usec, Average: 21115 usec, Peak to peak: 10022 usec, Stddev: 2111 usec, Sum: 781245 usec

!On server side
juniper@VJX1# run show services rpm active-servers
Protocol: UDP, Port: 52000

Timestamps are necessary to account for latency.
Applied on the client, the server, or both.
Probe types:
udp-ping-timestamp: UDP timestamp requests sent to target address.
udp-ping: UDP packets sent to target.
tcp-ping: TCP packets sent to target.
icmp-ping: Internet Control Message Protocol (ICMP) echo requests sent to target address.
icmp-ping-timestamp: ICMP timestamp requests sent to target address.
http-get: HTTP get requests sent to target URL.
http-metadata-get: HTTP get request for metadata to target URL.
icmp-ping is the default probe type.

Timestamps
Timestamping is needed to account for any latency in packet communication.
All nodes must be NTP-synchronized (preferably Stratum 3).
Probe types that support timestamps: icmp-ping, icmp-ping-timestamp, udp-ping, and udp-ping-timestamp.
The timestamping activity consists of the source (client) node applying a timestamp to the RPM packets with the time at which they leave the node. The destination (server) node applies a second timestamp when it receives the probe and a third timestamp when the probe leaves the destination back to the source. The source receives the response and applies a fourth timestamp. Different metrics are calculated based on these timestamps collected from a series of probes.
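As an illustration of how the four timestamps can be combined, here is a minimal Python sketch. The formulas are an assumption about how such metrics are commonly derived, not a statement of the exact RPM calculation; the function name probe_metrics and the timestamp values are hypothetical. The one-way delays are only meaningful when both clocks are synchronized, which is why NTP is required.

```python
def probe_metrics(t1, t2, t3, t4):
    """Derive delay metrics (usec) from the four probe timestamps.

    t1: client sends probe      t2: server receives probe
    t3: server sends response   t4: client receives response
    """
    server_time = t3 - t2               # processing time on the server
    return {
        "egress_delay": t2 - t1,        # client -> server, needs synced clocks
        "ingress_delay": t4 - t3,       # server -> client, needs synced clocks
        "server_time": server_time,
        # Round trip with the server's processing time removed.
        "rtt": (t4 - t1) - server_time,
    }


# Made-up timestamps in microseconds since an arbitrary common epoch.
metrics = probe_metrics(t1=0, t2=10_400, t3=10_900, t4=21_000)
print(metrics)
```

Note that the RTT only uses differences taken on the same clock (t4 - t1 on the client, t3 - t2 on the server), so it stays valid even when the two clocks disagree; the one-way delays do not.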
snmpwalk -mALL -v2c -c sla srx-1 DISMAN-PING-MIB::pingCtlEntry

BGP Route Reflection
Scaling BGP
An IBGP full mesh has the n(n-1)/2 problem: the number of sessions grows quadratically with the number of routers.
There are two scaling mechanisms:
Route reflection (RFC 4456)
Confederations (RFC 5065, which obsoletes RFC 3065)

Route Reflection Concepts
Allows an IBGP speaker to readvertise an IBGP-learned route to another IBGP neighbor.
The route reflector advertises only the active route and by default does not change the IBGP attributes.
Two new attributes prevent loops:
Cluster list: contains one or more cluster ID values
Originator ID

Route Reflection Attributes
Cluster list
Contains the sequence of cluster IDs.
The RR drops routes that have previously transited its cluster.
Originator ID
Identifies the first router (by router ID) that injected the route into the RR network.
The originator ID also works as a loop-prevention mechanism when the cluster list does not prevent the loop.

!Enable the RR by configuring the cluster ID
set protocols bgp group int type internal
set protocols bgp group int local-address 10.10.10.2
set protocols bgp group int cluster 10.10.10.2
set protocols bgp group int peer-as 12
set protocols bgp group int neighbor 10.10.10.1
set protocols bgp group int neighbor 10.10.10.4

RR Client Full Mesh
If the clients are fully meshed, each client always receives two copies of a route (one from the RR and one from the other client). With the no-client-reflect command the RR only sends routes learned from outside the cluster.
Clients can peer with other members of the RR cluster.
To avoid unnecessary advertisements, configure no-client-reflect on the RR.

Modifying Attributes on the RR
The RR can modify any BGP attribute using a routing policy.
The presence of an RR should not affect forwarding paths; using next-hop self can result in inefficient forwarding paths.
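Returning to the n(n-1)/2 scaling problem noted at the start of the route reflection notes, a short Python sketch shows how quickly the full mesh grows. The comparison case is an assumption chosen for simplicity: a single route reflector with every other router as its client, so each client needs only one IBGP session. The function names are hypothetical.

```python
def full_mesh_sessions(routers: int) -> int:
    """IBGP sessions needed for a full mesh of n routers: n(n-1)/2."""
    return routers * (routers - 1) // 2


def single_rr_sessions(routers: int) -> int:
    """Sessions with one RR and all other routers as clients (assumed design)."""
    return routers - 1


for n in (4, 10, 100):
    print(n, full_mesh_sessions(n), single_rr_sessions(n))
```

A real design would normally use at least two route reflectors for redundancy, which roughly doubles the client sessions but keeps the growth linear.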