Summary:
Due to a bug in the V100R010C03 NE software, the lower order ingress bus table is not recovered from the lowxcbus.cfg file after the standby system control board becomes the active system control board. As a result, the matrix configuration on the active system control board and that on the cross-connect board become inconsistent. If services are configured after the switchover and a lower order SNCP switchover then occurs, the lower order SNCP services may be interrupted.
[Problem Description]
Fault symptoms:
Lower order SNCP services are interrupted after an SNCP switchover.
Trigger conditions:
Lower order SNCP services may be interrupted after an SNCP switchover if services were configured, or boards were added or deleted, after a switchover between the active and standby system control boards.
Identification method:
The problem occurs if all of the following conditions are met:
- The NE is an NG SDH NE running V100R010C03SPC203 (5.21.20.54/5.36.20.54) or an earlier V100R010C03 version.
- Lower order SNCP services have been configured.
- A switchover between the active system control board and standby system control board has occurred.
- Services or protection have been configured, or boards have been added or deleted, after the switchover, which triggers verification.
- A switchover of lower order SNCP services has occurred.
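The identification conditions above can be sketched as a small check. This is only an illustration: the `NeState` record, its field names, and the assumed version string layout (`V100R010C03` optionally followed by `SPCnnn`) are hypothetical and are not part of the Inspector tool or the NE software. The sketch also ignores hot patches, so an NE on V100R010C03SPC203 with V100R010C03SPH205 loaded would need to be excluded by hand.

```python
import re
from dataclasses import dataclass

# Assumed layout of the version string: base version, optional SPC number.
_VERSION_RE = re.compile(r"^(V\d{3}R\d{3}C\d{2})(?:SPC(\d{3}))?$")

AFFECTED_BASE = "V100R010C03"
LAST_AFFECTED_SPC = 203  # V100R010C03SPC203 is the last affected version


@dataclass
class NeState:
    """Hypothetical record of facts gathered manually for one NE."""
    version: str
    has_lower_order_sncp: bool
    scc_switchover_occurred: bool
    config_changed_after_switchover: bool
    sncp_switchover_occurred: bool


def version_is_affected(version: str) -> bool:
    """Return True for V100R010C03SPC203 or an earlier V100R010C03 version."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    base, spc = match.groups()
    if base != AFFECTED_BASE:
        return False
    # No SPC suffix is treated as the base release, which is also affected.
    return spc is None or int(spc) <= LAST_AFFECTED_SPC


def is_affected(ne: NeState) -> bool:
    """The problem occurs only if all listed conditions are met."""
    return all((
        version_is_affected(ne.version),
        ne.has_lower_order_sncp,
        ne.scc_switchover_occurred,
        ne.config_changed_after_switchover,
        ne.sncp_switchover_occurred,
    ))
```

An NE that fails any one condition is not reported as affected, which mirrors the "all of the following conditions" wording.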
[Root Cause]
After the active and standby system control boards on an NE of the involved version have switched over, the lower order bus table on the new active system control board is not recovered from the lowxcbus.cfg file due to a bug in the NE software. After the switchover, the current active system control board (the original standby system control board) still uses the original bus table. The table is not updated until the system control board is reset. As a result, the data on the active system control board differs from the data in the lowxcbus.cfg file, and the lower order ingress numbers stored on the active system control board differ from those stored on the cross-connect board. After a switchover of lower order SNCP services, the services may be interrupted.
[Impact and Risks]
Lower order SNCP services may be interrupted.
[Measures and Solutions]
Recovery measures: Perform the following two steps:
- Deactivate and then activate the interrupted lower order SNCP services on the NMS.
- Warm reset the active system control board, the standby cross-connect board, and the active cross-connect board one by one. Reset the next board only after the previous board has come back online. If the NE is an ASON NE, observe the precautions in the appendix.
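The one-by-one reset rule can be sketched as a loop that does not move on until the previous board is back online. The `warm_reset` and `is_online` callbacks are hypothetical placeholders for whatever the operator uses on the NMS; the polling interval and timeout are arbitrary assumptions, not values from this notice.

```python
import time
from typing import Callable, Sequence

# Reset order as stated in the recovery measures.
RESET_ORDER = (
    "active system control board",
    "standby cross-connect board",
    "active cross-connect board",
)


def warm_reset_in_order(
    warm_reset: Callable[[str], None],   # hypothetical: triggers a warm reset
    is_online: Callable[[str], bool],    # hypothetical: True once board is online
    boards: Sequence[str] = RESET_ORDER,
    poll_seconds: float = 10.0,          # assumed polling interval
    timeout_seconds: float = 900.0,      # assumed upper bound per board
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Reset each board in turn, waiting for it to come online first."""
    for board in boards:
        warm_reset(board)
        waited = 0.0
        while not is_online(board):
            if waited >= timeout_seconds:
                # Stop here so that the next board is never reset early.
                raise TimeoutError(f"{board} did not come back online")
            sleep(poll_seconds)
            waited += poll_seconds
```

Raising on timeout rather than continuing keeps the remaining boards untouched if one board fails to recover.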
Workaround:
Use the Inspector tool to check the entire network. If a problematic NE is found, perform the recovery measures and then upgrade the NE to V100R010C03SPC203+V100R010C03SPH205 or a later version.
Version of the Inspector: SmartKit V200R008C00SPC100 or later
Tool upgrade packages:
- Common_Inspector_V200R008_ON_20130111154652680.exe
- Inspector_V200R008_ON_OptiX 10G MADM,OptiX 155_622,+_20130111154652680.exe
Upgrade the SmartKit automatically by downloading the upgrade package as prompted.
Name of the test case: Checking the SNCP configuration whether is consistency between XCS board and BUS
Directory of the test case: OptiX OSN 1500/2500/3500/7500 configuration data check/SNCP configuration check
Solution:
Perform the workaround before the upgrade, and then upgrade the target NE to V100R010C03SPC203+V100R010C03SPH205 or a later version.
Material handling after replacement:
None
[Rectification Scope]
Not involved
[Rectification Guide]
None
[Appendix]
Precautions for resetting system control boards on ASON NEs:
- Back up the databases of all ASON NEs on the network. If any exception occurs after a reset, restore services using the backed-up databases.
- Use the PMI tool to check the NE before a reset. The check covers only the ASON data, which can be found by choosing Intelligence_Data_Checking > ASON_Data_Analyse_Before_Reset. If any error is reported, escalate to the designated contact person.
- Run the following commands to manually synchronize databases:
:dbms-copy-all:mdb,drdb;
:dbms-copy-all:drdb,fdb0;
:dbms-copy-all:drdb,fdb1;
- Query the synchronization status between active and standby system control boards by issuing the :hbu-get-backup-info; command. A warm reset can be performed only when the returned value is 0x00000003.
- Warm reset the active Huawei system control board on the NMS.
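The readiness rule for the warm reset (a returned value of 0x00000003) can be sketched as follows. The commands are copied from the steps above; the helper name and the assumption that the operator extracts the returned value by hand as a hexadecimal string are illustrative only, since the exact output format of :hbu-get-backup-info; is not given in this notice.

```python
# Manual database synchronization commands, in the order listed above.
SYNC_COMMANDS = (
    ":dbms-copy-all:mdb,drdb;",
    ":dbms-copy-all:drdb,fdb0;",
    ":dbms-copy-all:drdb,fdb1;",
)

QUERY_COMMAND = ":hbu-get-backup-info;"

# A warm reset is allowed only when the query returns this value.
READY_FOR_RESET = 0x00000003


def backup_ready_for_reset(returned_value: str) -> bool:
    """Return True if the hand-copied hex value permits a warm reset.

    Hypothetical helper: accepts strings such as "0x00000003" or "3".
    """
    return int(returned_value.strip(), 16) == READY_FOR_RESET
```

Any other returned value means that the active and standby system control boards are not yet synchronized, so the warm reset should be postponed.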