Application of CSS Cluster Technology in Seismic Core Network

Abstract With the development of network virtualization technology and the growth of seismic system services, the core layer of the existing seismic industry network has become increasingly unable to meet performance requirements. To ensure reliable and efficient network operation while meeting the zero-interruption requirement of the seismic industry network, a new generation of CSS switch cluster technology was adopted, with intermediate equipment temporarily taking over traffic during the cutover. The seismic core network was upgraded smoothly, and the upgrade simplified the network structure and improved network performance.

Keywords: CSS; network virtualization; China Rescue Equipment Network

Introduction

In recent years, a series of catastrophic earthquakes has raised public demand for and expectations of the earthquake industry, which in turn has driven rapid growth in seismic systems of all kinds. Expanding service content and stricter service requirements have rapidly increased network load and made routing increasingly complicated. This has put heavy pressure on the core layer of the Shaanxi seismic network system, which was built many years ago and can no longer provide effective support. It is therefore imperative to upgrade and transform the core network into a more secure, stable, reliable, and efficient network environment.

With the rapid development of network technology, network virtualization has become an emerging technology and research hotspot in recent years, and many network equipment vendors have released their latest results. CSS cluster technology is one of them. By virtualizing multiple switches into one, it simplifies the network structure, eases maintenance and management, and runs stably and efficiently, which meets the urgent operational needs of seismic system networks. This article applies it to the upgrade of the seismic core network, which solved the existing network problems and improved network performance.

1 Existing network analysis

1.1 The basic situation of the existing network

The seismic network system of Shaanxi Province adopts a three-tier architecture of core layer, convergence layer, and access layer. The network topology is shown in Figure 1. The access layer works in Layer 2 forwarding mode. At the convergence layer, two switches are stacked to improve availability and reliability, and the user gateways are configured on the convergence-layer switches. At the core layer, two HUAWEI S6503 switches are interconnected by a heartbeat link and form a dual-system hot standby pair through the VRRP protocol. Each core switch connects to some of the seismic system servers, connects downstream to the convergence switches over fiber, and connects upstream through electrical interfaces. On one side the uplink goes to the core router, which provides interconnection with the China Seismological Bureau and other provincial bureaus (Zhao Jun et al., 2009); on the other side it goes to the egress devices, which provide interconnection with the Internet. The core switches and their upstream and downstream devices all use the OSPF protocol to learn routes and achieve network interworking.

Figure 1 Network topology before upgrading

1.2 Analysis of Existing Network Defects

The seismic system services have run on the existing network structure for nearly 10 years. As network equipment has been added, the network has grown, and the service content of the seismic system has expanded, this network has gradually revealed several deficiencies in operation.

(1) Hidden risks caused by aging equipment

The existing core switches were put into service in 2007 and have been running for 7 years. After such a long period of continuous operation, the availability and reliability of the equipment have declined to some extent, and the probability of failure has risen accordingly. Although the dual-system hot standby mode keeps links working when a single core switch fails, the servers attached to the two core switches carry different services. When one switch fails, all servers connected to it can no longer provide service, which directly affects the normal operation of the seismic system services.

(2) Excessive recovery time caused by the VRRP protocol

The two core switches form a dual-system hot standby pair through the VRRP protocol. Under normal conditions, the traffic of the floors is split between them by configuration: the traffic of some floors takes core switch 1 as its primary path, and the traffic of the remaining floors takes core switch 2. If one core switch goes down, routing reconverges and the other switch carries all services, keeping the links working. However, because of the limitations of the VRRP protocol itself, its failure convergence time is on the order of seconds, which can hardly meet the efficiency requirements of the current network.

(3) Reduced forwarding efficiency caused by the use of OSPF

In the existing network structure, the core switches and the upstream and downstream devices all learn routes through OSPF to achieve link interworking. However, owing to the characteristics of the earthquake industry, the provincial bureau must communicate not only with hundreds of nodes in the province but also with the China Seismological Bureau and other provincial seismological bureaus (Zhao Jun et al., 2009), so the learned routing table is large. For the convergence-layer switches, the routes actually needed are simple and clear, so this huge routing table reduces the forwarding efficiency of the network to some extent.

( 4 ) Low bandwidth utilization

Although the convergence switches on every floor are dual-homed upstream to the core switches, the VRRP protocol running between the core switches makes the two links redundant rather than load-sharing. Only one of the links carries traffic at any given time, so the available bandwidth is not fully used.

( 5 ) Complicated management and maintenance

In the network, the two core switches appear as two separate entities, each with its own IP address and configuration file. When the network is expanded or adjusted, both devices must be configured and modified. When troubleshooting, the running status of each device, and of the interaction between them, must be checked on both devices. This increases the difficulty of routine network management and maintenance.

2 CSS Cluster Technology Overview

2.1 Definition and Characteristics of CSS Cluster Technology

CSS (Cluster Switch System), also known as clustering, is the switch virtualization technology used by Huawei. It combines multiple switches that support the clustering feature into one logical switching device. In terms of physical connection, clusters are divided into cluster card clusters and service port clusters. A cluster card cluster is connected through the cluster ports of CSS cards, while a service port cluster is connected through the service ports of specific service boards. The typical features of CSS are (Figure 2):

(1) Many-to-one virtualization: the CSS appears externally as one logical switch, with a unified control plane and unified management.

(2) Unified forwarding plane: the forwarding planes of the physical devices in the CSS are unified, and forwarding information is shared and synchronized in real time.

( 3 ) Inter-device link aggregation: Links across physical devices within the CSS are aggregated into a trunk port and interconnected with downstream devices.

Figure 2 CSS cluster diagram

2.2 CSS Cluster Establishment Process

2.2.1 Cluster device cable connection

After preliminary testing and comparison, the HUAWEI S7712 was selected as the new core switch, and the cluster card mode was used. One VSTSA cluster card is installed on each main control board, and each cluster card has four cluster ports. Each of the two devices must be equipped with two main control boards. To virtualize the two S7712 switches into one more powerful device through clustering, the cluster ports must be connected with dedicated cluster cables according to specific rules. The connections are shown in Figure 3.

Figure 3 HUAWEI S7712 trunk cable connection

2.2.2 Role election

When a cluster is established, the member switches send competition packets to each other. Through this election, one switch becomes the master switch (Master), which is responsible for managing the entire cluster system, and the other becomes the cluster backup switch (Standby). The master switch election rules are:

(1) Running status: the switch that starts first and enters the cluster running state is preferred as the master switch.

(2) Cluster priority: the switch with the higher cluster priority is preferred as the master switch.

(3) MAC address: when the devices start at the same time and have the same cluster priority, the switch with the smaller MAC address is preferred as the master switch.

( 4 ) Cluster ID comparison: When two switches are started at the same time and the cluster priority and MAC address are the same, the switch with the smaller cluster ID becomes the master switch. It is worth noting that the two member switches in a cluster must have different cluster IDs , and two switches with the same ID cannot be clustered.
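The four rules listed above amount to an ordered comparison, which can be sketched as a sort key. This is only an illustration of the rule order given in this section, not the switch's actual implementation. The `Member` class, its field names, and the sample values are hypothetical, and "started at the same time" is simplified to exact equality of the start timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One candidate switch; all field names are illustrative assumptions."""
    name: str
    started_at: float  # time at which the switch entered the cluster running state
    priority: int      # cluster priority, higher is preferred
    mac: str           # MAC address, for example "00e0-fc00-0001"
    css_id: int        # cluster ID, must differ between the two members


def mac_value(mac: str) -> int:
    """Convert a MAC string to an integer so that 'smaller' is well defined."""
    return int(mac.replace("-", "").replace(":", "").replace(".", ""), 16)


def election_key(m: Member):
    """Apply rules (1) to (4) in order; the member with the smallest key wins."""
    return (m.started_at, -m.priority, mac_value(m.mac), m.css_id)


def elect_master(a: Member, b: Member) -> Member:
    """Return the member that becomes master under the rules listed above."""
    if a.css_id == b.css_id:
        raise ValueError("members with the same cluster ID cannot form a cluster")
    return min(a, b, key=election_key)


sw1 = Member("sw1", started_at=0.0, priority=100, mac="00e0-fc00-0002", css_id=1)
sw2 = Member("sw2", started_at=0.0, priority=150, mac="00e0-fc00-0001", css_id=2)
print(elect_master(sw1, sw2).name)  # sw2: same start time, higher priority
```

Because the start time is compared first, a switch that is restarted earlier wins regardless of priority, which is consistent with the later advice on restart order.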

By default, the cluster ID and cluster priority of every switch are both 1, and the cluster function is not enabled. The switches must therefore be configured manually when a cluster is set up. The steps are:

(1) Run the set css id new-id command in the system view to set the cluster IDs of the two member switches to 1 and 2 respectively. After the cluster is established, the cluster ID of a switch must not be changed arbitrarily; otherwise the cluster splits.

( 2 ) Run the set css priority new-priority command to set the cluster priority of the device.

(3) Run the css enable command to enable the cluster function and restart the switch as prompted. The device intended to be the master switch can be restarted first.

2.2.3 Configuration synchronization

The cluster has a strict configuration file synchronization mechanism to ensure that multiple switches in the cluster can work in the same network as a single device. When a cluster is established, member switches are started using their respective configuration files during the start-up phase. After the startup is complete, the standby switch merges the cluster-related configuration of the device into the configuration file of the master switch to form the configuration file of the cluster system. After the cluster operates normally, the master switch serves as the management hub of the cluster system and is responsible for synchronizing the user configuration to the standby switch, so that the configurations of member switches in the cluster can be consistent at any time. With instant synchronization, all member switches in the cluster maintain the same configuration. Even if the master switch fails, the standby switch can perform the functions in the same configuration. The synchronization process of the configuration is performed by the switch itself without human intervention.
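The two phases described above, a merge at startup followed by continuous synchronization from the master, can be modelled with plain dictionaries. This is a toy model under stated assumptions, not the device's mechanism: the key names (`css.id`, `css.priority`, `sysname`, `vlan.31`), the `standby.` prefix, and the `ClusterConfig` class are all hypothetical.

```python
CLUSTER_KEYS = {"css.id", "css.priority"}  # hypothetical names for cluster-related settings


def merge_at_startup(master_cfg: dict, standby_cfg: dict) -> dict:
    """Form the cluster configuration: the master's file plus the standby's cluster-related entries."""
    cluster_cfg = dict(master_cfg)
    for key in CLUSTER_KEYS:
        if key in standby_cfg:
            # Keep the standby's own value under a separate, prefixed name.
            cluster_cfg["standby." + key] = standby_cfg[key]
    return cluster_cfg


class ClusterConfig:
    def __init__(self, master_cfg: dict, standby_cfg: dict):
        self.master = merge_at_startup(master_cfg, standby_cfg)
        self.standby = dict(self.master)  # initial full synchronization

    def configure(self, key: str, value) -> None:
        """Apply a user change on the master and copy it to the standby at once."""
        self.master[key] = value
        self.standby[key] = value


cfg = ClusterConfig(
    {"sysname": "CORE", "css.id": 1, "css.priority": 150},
    {"sysname": "OLD-NAME", "css.id": 2, "css.priority": 100},
)
cfg.configure("vlan.31", "servers")
print(cfg.master == cfg.standby)  # True
print(cfg.master["sysname"])      # CORE
```

In this model the standby's non-cluster settings (here its old `sysname`) are discarded, which reflects the statement that only cluster-related configuration is merged into the master's file.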

2.2.4 Cluster status check

After the cluster is established, its status can be checked in two ways. The first is to look at the indicators on the cluster cards. A successfully established cluster shows the following: of the 4 cluster cards, only one has its MASTER indicator lit green; on the two cluster cards of the switch with cluster ID 1, CSS ID indicator 1 is steady green, and on the two cluster cards of the other switch, CSS ID indicator 2 is steady green; the ACT/LINK indicators of the cluster cards are green. The second is to use the command line:

( 1 ) Run the display device command to check the board status (Figure 4 ).

Figure 4 Board status after a cluster is successfully established

( 2 ) Run the display css status all command to check the status of the cluster system (Figure 5 ).

( 3 ) Run the display css channel command to check the CSS link status (Figure 6 ).

3 Key issues in applying CSS technology to the upgrade

3.1 Zero-interruption smooth upgrade

Given the special nature of the earthquake industry, networks for services such as seismic network monitoring and rapid earthquake reporting must run 24 hours a day. Constrained by the equipment room environment, replacing the core switches involves removing the original core equipment from the network, then racking and starting the new core switches and connecting the uplink and downlink lines, all of which takes a long time. If an earthquake occurred during this period, the consequences would be irreversible. How to complete the cutover in the shortest possible time while protecting the important services of the seismic network was therefore the most important issue in the implementation.

After research and discussion, the author finally decided to go through the following steps:

(1) Before the two new core switches are racked, complete all configuration except the access control policies, and power them on to test whether the state and duration of the cluster establishment process match expectations. Then test the configured core switches against every upstream and downstream device they will connect to (Zhang Ying et al., 2011) to verify that both sides communicate normally in the cluster state, so that link failures do not appear after the new core switches are installed.

(2) Prepare a temporary Layer 3 switch, configure its uplinks and downlinks in advance according to the needs of the important seismic network services, and test it. While the old switches are being replaced with the new ones, temporarily connect the lines required by the seismic network services to this switch. The operation is simple and quick; it keeps important seismic network data flowing and reduces the pressure on the switch replacement work (Figure 7).

(3) Remove the original core switches, rack the new equipment, and connect all lines and servers except those of the seismic network services. After testing confirms that the network is working, move the seismic network service cables to the new switches to restore all services. At this point, the network is operating in the all-pass state, with no access control policies applied.

(4) Gradually add the access control policies and configure dual-NIC bonding on the servers. This completes the upgrade of the core network.

3.2 Network-wide routing redesign

In the original network structure, the core switches and the upstream and downstream devices all achieved link interworking through OSPF route learning. As described in section 1.2, this method reduces the forwarding efficiency of the network. To solve this problem, the routing of the entire network was reviewed and redesigned according to actual needs during the core switch upgrade. The modified routing scheme is as follows. First, the OSPF domain contains only the upstream ports of the core switch and the ports of the upstream devices. Second, static routes are used between the core switch and the downstream devices, that is, between the core layer and the convergence layer, and also between the convergence layer and the access layer. Third, the static routes are imported into OSPF on the devices configured with the OSPF routing protocol. Fourth, the core router is configured with a default route so that intranet users can access the Internet.
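The effect of replacing OSPF-learned routes with static routes at the convergence layer can be illustrated with a small longest-prefix-match lookup. This is a sketch under assumptions: the prefixes, next-hop labels, and the use of a single static default route toward the core are hypothetical and are not taken from the actual configuration.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the longest matching prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in table:
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


# Hypothetical convergence-layer table: local user subnets plus one static
# default route toward the core cluster, instead of a large OSPF-learned table.
convergence_table = [
    ("10.1.10.0/24", "local"),
    ("10.1.20.0/24", "local"),
    ("0.0.0.0/0", "core-cluster"),
]

print(lookup(convergence_table, "10.1.20.7"))   # local
print(lookup(convergence_table, "172.16.5.9"))  # core-cluster
print(len(convergence_table))                   # 3
```

With such a table, every non-local destination resolves through one entry, so the table size at the convergence layer no longer depends on how many external nodes the core learns about.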

3.3 Application of Link Aggregation Technology

3.3.1 Link Aggregation Technology Definition

Link aggregation bundles a group of physical interfaces into one logical interface in order to increase bandwidth. The resulting logical interface is also called a load-sharing group (Load Sharing Group) or a link aggregation group (Link Aggregation Group).
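How a load-sharing group spreads traffic, and what happens when one member link is lost, can be sketched as follows. This is a minimal model, assuming per-flow hashing: real switches compute the hash in hardware according to their configured load-balancing mode, and `zlib.crc32` is only a stand-in. The flow addresses are hypothetical, and the interface names are borrowed from the Eth-Trunk 31 example in section 3.3.2.

```python
import zlib


def pick_member(flow, members):
    """Map a flow to one active member link with a deterministic hash."""
    if not members:
        raise RuntimeError("no active member links in the aggregation group")
    key = "|".join(str(field) for field in flow).encode()
    return members[zlib.crc32(key) % len(members)]


# Hypothetical (source IP, destination IP) flows.
flows = [("10.1.10.%d" % i, "10.0.0.5") for i in range(1, 9)]

both_up = ["GigabitEthernet1/11/0/12", "GigabitEthernet2/11/0/12"]
one_up = ["GigabitEthernet2/11/0/12"]  # the member on the other device has failed

for flow in flows:
    # Each flow always maps to the same link while membership is unchanged;
    # after a failure, every flow falls back to the remaining link.
    print(flow, pick_member(flow, both_up), pick_member(flow, one_up))
```

Hashing per flow keeps the packets of one conversation on one link, so ordering is preserved, while different flows can use different member links at the same time.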

3.3.2 Application of cluster-based link aggregation

CSS supports cross-chassis Eth-Trunk link aggregation: physical Ethernet ports on different member devices can be configured into one aggregation port. Even if one of the devices hosting the member ports fails, the aggregated link does not fail completely, because the member devices that are still working continue to manage and maintain the remaining member ports. This increases capacity and also provides service backup between devices, which improves reliability. The two upgraded core switches are clustered using CSS, and each upstream or downstream device is connected to the cluster over two physical links bundled by link aggregation. The specific implementation method is:

( 1 ) Create an Eth-Trunk in the cluster system and add member interfaces to the Eth-Trunk . Examples are as follows:

<SXDZJ_S7712>system-view

Enter system view, return user view with Ctrl+Z.

[SXDZJ_S7712]interface eth-trunk 31

[SXDZJ_S7712-Eth-Trunk31]quit

[SXDZJ_S7712]interface GigabitEthernet 1/11/0/12

[SXDZJ_S7712-GigabitEthernet1/11/0/12]eth-trunk 31

[SXDZJ_S7712-GigabitEthernet1/11/0/12]quit

[SXDZJ_S7712]interface GigabitEthernet 2/11/0/12

[SXDZJ_S7712-GigabitEthernet2/11/0/12]eth-trunk 31

[SXDZJ_S7712-GigabitEthernet2/11/0/12]quit

(2) On the upstream and downstream devices, the link aggregation configuration differs by device model. Taking a downstream convergence switch as an example, the method is to create a link aggregation group and add member interfaces to it. Examples are as follows:

<SXDZJ_HUIJU>system-view

Enter system view, return user view with Ctrl+Z.

[SXDZJ_HUIJU]link-aggregation group 1 mode manual

[SXDZJ_HUIJU]interface GigabitEthernet 1/1/1

[SXDZJ_HUIJU-GigabitEthernet1/1/1]port link-aggregation group 1

[SXDZJ_HUIJU-GigabitEthernet1/1/1]quit

[SXDZJ_HUIJU]interface GigabitEthernet 2/1/1

[SXDZJ_HUIJU-GigabitEthernet2/1/1]port link-aggregation group 1

[SXDZJ_HUIJU-GigabitEthernet2/1/1]quit

After the above changes, the upgrade of the seismic core network was complete. Figure 8 shows the network topology after the upgrade.

Figure 8 Network topology after the upgrade

4 Network advantages after applying CSS clustering technology

4.1 The network structure is simpler

Based on the concept of “multiple virtual ones” based on CSS technology, the two core switches after transformation and upgrading are