Summary: In recent years, long in-transit times for spare parts have become a major cause of long fault recovery times when major faults occur on unprotected WDM networks. To ensure that live network faults are rectified in a timely manner, representative offices are required to assess the service robustness and spare part availability of WDM networks that contain unprotected long-span links, unprotected links traversing multiple countries, unprotected subnets or sites in remote areas, unprotected high-rate services, or unprotected ROADM sites. Representative offices are also required to identify high-risk networks that do not satisfy the service robustness and fault recovery SLA requirements, report them to the related regional departments and network product maintenance departments, and determine network enhancement measures together with these departments.
1. Criteria for Identifying High-Risk WDM Networks
No. | Identification Criteria | Remarks |
1 | No protection is configured for services on the WDM network. | If protection is configured for the WDM network or client services, identify whether the protection can effectively prevent faults on single boards from impairing services. |
2 | Key services cannot be switched to backup links in a timely manner in case of a fault. | |
3 | Some subnets or links are deployed in remote areas. In case of a protection failure or faults on multiple boards, services will be impaired for a long time. | Remote areas refer to islands, deserts, or country borders. Identify the areas where spare parts can be delivered only by means of flights, ferries, or international transportation. The arrival time of delivery in such modes is generally uncontrollable. |
4 | According to the service contract, the representative office is responsible for delivering spare parts to sites in case of a major fault. | |
5 | In case of a major fault, the spare part delivery time is counted into the fault recovery time. | |
6 | In case of a major fault, the maximum spare part delivery time does not satisfy the fault recovery SLA requirements. | |
7 | In case of a major fault, the maximum spare part delivery time does not satisfy the spare part service SLA requirements. | |
8 | Even if the above-mentioned SLA requirements are satisfied, the impact of service impairment caused by a major fault is uncontrollable. | Key industries, such as international business, finance, healthcare, and ticket booking, have high requirements on service robustness. |
Representative offices can determine whether a network is high-risk based on the
preceding criteria and any similar information not listed here. If a network is
identified as high-risk, communicate the risks to the customer and encourage the
customer to configure protection for services, reserve sufficient spare parts, and ensure
that the spare parts are available in a timely manner.
In addition, report the high-risk networks to the WDM maintenance contact persons of
the regional departments and network product maintenance departments so that the
networks are put on record, and determine network enhancement measures together
with these departments.
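The criteria in the table above can be sketched as a simple checklist. The following is a minimal illustration only, not an official assessment tool: the class name, field names, and the use of hours as the time unit are all assumptions introduced here, and criterion 8 is approximated by a single "key industry" flag. The sketch only lists which criteria a network matches; the final high-risk decision remains with the representative office.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NetworkAssessment:
    """Hypothetical record of one WDM network; all field names are illustrative."""
    protection_configured: bool      # criterion 1
    timely_switchover: bool          # criterion 2
    remote_area: bool                # criterion 3
    office_delivers_spares: bool     # criterion 4
    delivery_in_recovery_time: bool  # criterion 5
    max_delivery_hours: float        # used by criteria 6 and 7
    recovery_sla_hours: float        # fault recovery SLA
    spare_sla_hours: float           # spare part service SLA
    key_industry: bool               # stand-in for criterion 8


def matched_criteria(n: NetworkAssessment) -> List[int]:
    """Return the numbers of the identification criteria that the network matches."""
    checks = {
        1: not n.protection_configured,
        2: not n.timely_switchover,
        3: n.remote_area,
        4: n.office_delivers_spares,
        5: n.delivery_in_recovery_time,
        6: n.max_delivery_hours > n.recovery_sla_hours,
        7: n.max_delivery_hours > n.spare_sla_hours,
        8: n.key_industry,
    }
    return [number for number, hit in checks.items() if hit]


# Example with made-up values: an unprotected island link whose spare parts
# arrive by ferry and take up to 72 hours.
example = NetworkAssessment(
    protection_configured=False,
    timely_switchover=False,
    remote_area=True,
    office_delivers_spares=True,
    delivery_in_recovery_time=True,
    max_delivery_hours=72,
    recovery_sla_hours=8,
    spare_sla_hours=24,
    key_industry=False,
)
print(matched_criteria(example))  # [1, 2, 3, 4, 5, 6, 7]
```

The source does not state how many criteria must match before a network counts as high-risk, so the sketch deliberately returns the list of matches instead of a yes/no verdict.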
2. List of High-Risk Boards That Are Prone to Single-Point Failures and Cannot Be Easily Resolved
- Top 1: Optical amplifier (OA) boards in unprotected networks
− Long-span dedicated boards such as HBA, ROP, or CRPC/RAU on unprotected links
− Two levels of OA boards such as OAU, OBU, or OPU at the receive end of the main optical path
− High-power OA boards such as OAU105 or OBU205 on the main optical path
For the specific critical boards in scenarios without protection, see the attachment List of Risky Boards in Scenarios Without Protection.
[Impact and Risk]
If a critical board in an unprotected WDM network is faulty and the spare part cannot be provided in a timely manner, the fault may not be resolved within the time specified in the fault recovery SLA signed with the customer, and the resulting negative impact is difficult to measure.
[Measures and Solutions]
Recovery measures:
Replace faulty parts in a timely manner.
Preventive measures:
- Identify high-risk networks according to the above-mentioned criteria, and report them to the WDM maintenance contact persons of the regional departments and maintenance departments so that the networks are put on record.
- Encourage customers to configure protection for services, reserve sufficient spare parts, and ensure that the spare parts are available in a timely manner.