Pay attention to POWER_FAIL and THUNDERALM Alarms on the OptiX OSN 6800

Summary: The NE software does not provide a filtering and anti-jitter function for the detection of overvoltage, undervoltage, and lightning protection failure signals. As a result, interference from the fan-tray assembly can, with a small probability, cause these signals to be detected by mistake, and the POWER_FAIL and THUNDERALM alarms are then reported by mistake.

[Problem Description]
Trigger conditions:
1. The product version is earlier than OptiX OSN 6800 V100R004C04SPC800.
2. One of the following three situations occurs:
− A fan in the fan-tray assembly fails or rotates abnormally.
− The air filter on the fan-tray assembly is blocked.
− The equipment is struck by lightning.

Fault symptom:

At least one of the following symptoms occurs:

1. The TN11PIU reports a POWER_FAIL alarm.
2. The alarm parameter is 0X41 or 0X42 for OptiX OSN 6800 versions earlier than V100R004C01.
3. The alarm parameter is 0X53 or 0X54 for OptiX OSN 6800 V100R004C01 and later versions.
4. The TN11PIU reports a THUNDERALM alarm and the alarm parameter is 0X01.
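The parameter values in the symptom list can be checked mechanically when reviewing exported alarm records. The following is a minimal sketch only: the function name and the `before_c01` flag are hypothetical, and it assumes the alarm name and parameter have already been extracted from the alarm record by some other means.

```python
def matches_symptom(alarm: str, parameter: int, before_c01: bool) -> bool:
    """Return True if the alarm and parameter match a symptom listed in this notice.

    before_c01 is True when the NE runs a version earlier than V100R004C01.
    """
    if alarm == "THUNDERALM":
        return parameter == 0x01
    if alarm == "POWER_FAIL":
        # The POWER_FAIL parameter values differ by software version.
        expected = {0x41, 0x42} if before_c01 else {0x53, 0x54}
        return parameter in expected
    # Any other alarm is outside the scope of this notice.
    return False
```

For example, `matches_symptom("POWER_FAIL", 0x53, before_c01=False)` matches, whereas the same parameter on a version earlier than V100R004C01 does not.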
Identification method:
1. Check whether the product version is earlier than OptiX OSN 6800 V100R004C04SPC800.
2. Check whether either or both of the POWER_FAIL and THUNDERALM alarms are reported.
3. Check whether the THUNDERALM alarm clears after it is generated.
4. Perform a cold reset on the SCC board in the subrack where the TN11PIU board is located. If the subrack is configured with active and standby SCC boards, reset both the active and standby SCC boards. Then, check whether the alarms are cleared after the cold reset. If the alarms are cleared, it indicates that the alarms are reported by mistake.
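Steps 1, 2, and 4 of the identification method can be summarized as a simple checklist. The sketch below is illustrative only: the function names are hypothetical, and it assumes that version strings follow the `VxxxRxxxCxx[SPCxxx]` pattern used in this notice and can be ordered field by field, with a missing SPC field treated as 0. Step 3 is omitted because the notice does not state which outcome indicates a false alarm.

```python
import re

# Assumed version pattern, for example "V100R004C01" or "V100R004C04SPC800".
_VERSION_RE = re.compile(r"^V(\d+)R(\d+)C(\d+)(?:SPC(\d+))?$")

# Version threshold from the trigger conditions in this notice.
AFFECTED_BEFORE = "V100R004C04SPC800"


def parse_version(text: str) -> tuple:
    """Convert a version string into a comparable (V, R, C, SPC) tuple."""
    match = _VERSION_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing SPC field is treated as 0 (an assumption of this sketch).
    return tuple(int(part) if part else 0 for part in match.groups())


def likely_false_alarm(version: str, active_alarms, cleared_after_cold_reset: bool) -> bool:
    """Apply steps 1, 2, and 4 of the identification method."""
    affected_version = parse_version(version) < parse_version(AFFECTED_BEFORE)
    relevant_alarm = bool({"POWER_FAIL", "THUNDERALM"} & set(active_alarms))
    return affected_version and relevant_alarm and cleared_after_cold_reset
```

A call such as `likely_false_alarm("V100R004C01SPC100", ["POWER_FAIL"], True)` returns True under these assumptions. The result is only an indication and does not replace the on-site checks.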

[Root Cause]
Neither the logic nor the software of the TN11PIU board provides a filtering and anti-jitter function for the detection of alarm signals. In addition, interference from the fan-tray assembly adds interference signals to the alarm signals. In this case, the alarm signals exceed the detection value.
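To illustrate what an anti-jitter function does in general, the sketch below asserts an alarm only after several consecutive samples exceed the detection value, so that a single interference spike is ignored. This is a generic debounce technique and not the actual logic of the TN11PIU board or of the corrected software; the threshold, the sample values, and the required count are made up for illustration.

```python
def debounce(samples, threshold, required_consecutive=3):
    """Return one alarm state per sample.

    The alarm is asserted only while at least `required_consecutive`
    samples in a row have exceeded `threshold`.
    """
    run = 0  # number of consecutive samples above the threshold
    states = []
    for value in samples:
        run = run + 1 if value > threshold else 0
        states.append(run >= required_consecutive)
    return states


# A single spike (second sample) is ignored; a sustained excursion is reported.
readings = [0, 9, 0, 0, 9, 9, 9, 0]
print(debounce(readings, threshold=5))
# [False, False, False, False, False, False, True, False]
```

Without such filtering, the isolated spike in the second sample would already be detected as an alarm condition.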

[Impact and Risks]
1. If only one TN11PIU board detects the overvoltage signals by mistake, the power supply module of the TN11PIU board is shut down and the POWER_FAIL alarm is reported but services are not affected.
2. If the power supply module of one TN11PIU board is shut down due to overvoltage and the power supply module of the other TN11PIU board fails, both boards report undervoltage alarms and the fans in the fan-tray assembly stop working. In this case, services may be affected if the faulty TN11PIU boards are not replaced in time.
3. If both TN11PIU boards detect overvoltage by mistake, only the power supply module of the TN11PIU board in slot 19 is shut down and the POWER_FAIL alarm is reported. However, services are not affected.
4. When a TN11PIU board detects undervoltage by mistake, it reports a POWER_FAIL alarm with the alarm parameter of 0X41 or 0X53. However, services are not affected.
5. When a TN11PIU board detects lightning protection failure signals by mistake, it reports a THUNDERALM alarm with the alarm parameter of 0X01. However, services are not affected.

[Measures and Solutions]
Recovery measures:
Perform a cold reset on the SCC board in the subrack where the TN11PIU board is located. If the subrack is configured with active and standby SCC boards, reset both the active and standby SCC boards.
Workarounds:
None.
Solutions:
Solution to the POWER_FAIL alarm: Replace the TN11PIU boards that were manufactured after July 2010. In addition, upgrade the software of the NE housing a TN11PIU board to the OptiX OSN 6800 V100R005C00SPC900 version or later.
Solution to the THUNDERALM alarm: Upgrade the software of the NE housing a TN11PIU board to the OptiX OSN 6800 V100R005C00SPC900 version or later.
Solution of material handling after replacement:
None.
