Alarm codes


Page 1: Alarm codes

ALARM CODES

Type

The following explains the values that can appear in the Type field:

A “Communications” alarm indicates a problem related to communication (for example, protocol errors).

A “qualityOfService” alarm indicates a problem related to quality of service (for example, crossing thresholds).

A “processing” alarm indicates a problem related to processing data (for example, a memory problem).

An “Equipment” alarm indicates a problem with the physical equipment (for example, a processor failure).

A security alarm indicates a problem related to security (for example, an unauthorized access).

An “Operator” alarm indicates that some event was caused by the operator (for example, locking a component).

A “debug” alarm indicates that the event was for debugging purposes.

An “unknown” alarm indicates that the reason for the event is not known.

What causes alarms

Generally speaking, alarms occur in the following situations:

• Degradation/quality of service conditions (for example, the onset of severe congestion)
• Processing errors (for example, protocol errors)
• Engineering alarms (for example, insufficient memory for a component)
• Out-of-service conditions (for example, hardware failures such as a functional processor or power supply failure)
• Software errors (that is, an unexpected condition has been detected in the software)
• Administrative conditions (such as using the lock command to temporarily lock a component)
• Security violations

AD: Alarm display.

insv (in service)
oos (out of service)
trb (troubled)
unk (unknown)
nex (non-existent)

State change notifications (SCN)


Alarm format

The alarm information in this document is divided into the following 7 fields:

Component: The full name of the component, or a statement indicating that the alarm is a common alarm that applies to all components.

Severity: One of critical, major, minor, warning, indeterminate, or cleared.

Status: One of message, set, clear, or set/clear.

Legend: Describes the possible values for anything in the component field.

Details: Provides details on the cause of the alarm and, where applicable, the impact that the alarm will have on the system or on other components.

Remedial action: Suggests an action for the operator to correct the fault (if possible).

Alarm number: Each alarm heading consists of an eight-digit number that identifies the alarm.

The alarm number is the principal identifier and provides the means by which you can find your alarm quickly. Alarms appear sequentially in this book for easy access. The alarm number is made up of an IndexGroup (the first group of four digits) and a SubIndex (the second group of four digits). It is also referred to as the NTP index.

• The IndexGroup is a four-digit number that represents a logical grouping of alarms. For example, it can represent:

- a service application
- an internal subsystem
- a component type
- a component class
- a software module
- a similar event

For a complete list of all alarm IndexGroups, refer to table “Passport IndexGroups” (page 37).

• The SubIndex is a four-digit number which has significance only within the IndexGroup.
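The split of an alarm number into its two halves can be sketched in a few lines of Python. This is only an illustration: the names `AlarmIndex` and `split_alarm_index` are hypothetical and are not part of any Passport tool. The sketch assumes the number is given as text, optionally with a space between the two groups.

```python
from typing import NamedTuple


class AlarmIndex(NamedTuple):
    index_group: str  # first group of four digits
    sub_index: str    # second group of four digits


def split_alarm_index(alarm_number: str) -> AlarmIndex:
    """Split an eight-digit alarm number into IndexGroup and SubIndex."""
    # Tolerate a space between the two groups, e.g. "7041 0150".
    digits = alarm_number.replace(" ", "")
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected eight digits, got {alarm_number!r}")
    return AlarmIndex(index_group=digits[:4], sub_index=digits[4:])
```

Both halves are kept as strings so that leading zeros (as in IndexGroup 0000 or 0999) are preserved.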

Alarm status

Passport implements this attribute as a read-only set-valued attribute. Possible values for the alarm status attribute are:

• Empty set: the attribute value appears as empty.
• Under repair: the resource is being repaired. The operational state can be either enabled or disabled.
• Critical: one or more critical alarms indicating a fault or failure have been detected and have not been cleared. The operational state can be either enabled or disabled.
• Major: one or more major alarms indicating a fault have been detected and have not been cleared. These faults can be disabling.
• Minor: one or more minor alarms indicating a fault have been detected and have not been cleared. These faults can be disabling.
• Alarm outstanding: one or more alarms have been detected and have not been cleared. The condition may or may not be disabling. If the operational state is enabled, additional component-specific attributes may indicate the nature and cause of the condition.

Attribute             Values
Operational state     enabled, disabled
Usage state           idle, active, busy
Administrative state  unlocked, locked, shutting down
Alarm status          empty, under repair, critical, major, minor, alarm outstanding
Procedural status     empty, initialization required, not initialized, initializing, reporting, terminating
Availability status   empty, in test, failed, power off, off line, off duty, dependency, degraded, not installed, log full
Control status        empty, subject to test, part of services locked, reserved for test, suspended
Standby status        not set, hot standby, cold standby, providing service
Unknown status        true or false

What causes alarms

Generally speaking, alarms occur in the following situations:

• Degradation/quality of service conditions (for example, the onset of severe congestion)
• Processing errors (for example, protocol errors)
• Engineering alarms (for example, insufficient memory for a required component)
• Out-of-service conditions (for example, hardware failures such as a functional processor or power supply failure)
• Software errors (that is, an unexpected condition has been detected in software)
• Administrative conditions (such as using the lock command to temporarily lock a component)
• Security violations


Type

The following explains the values that can appear in the Type field:

• Communications alarm indicates a problem related to communication (for example, protocol errors)
• qualityOfService alarm indicates a problem related to quality of service (for example, crossing thresholds)
• processing alarm indicates a problem related to processing data (for example, a memory problem)
• Equipment alarm indicates a problem with the physical equipment (for example, a processor failure)
• Security alarm indicates a problem related to security (for example, an unauthorized access)
• Operator alarm indicates that some event was caused by operator error (for example, locking a component)
• debug alarm indicates that the event was for debugging purposes
• unknown alarm indicates that the reason for the event is not known

IndexGroup  Group
0000  Common alarms
0999  Preside Multiservice Data Manager-generated alarms
7000  Component administration system
7001  Virtual circuit
7002  Bus control system
7003  Data collection system
7004  Module interconnection link
7005  Module interconnection transport
7006  Network management interface system
7007  Frame relay service
7008  File system
7009  Routing
7010  RID/MID system and External Address Plan
7011  Port management system
7012  Processor control system
7013  Traffic management
7014  Memory management
7015  Network time synchronization
7016  Destination call routing
7017  Network clock synchronization
7018  Path Oriented Routing Service
7019  Voice Transparent Data Service, including Bit Transparent Data Service and HDLC Transparent Data Service
7020  Virtual router
7021  Internet protocol (IP)
7022  Bridge
7023  Novell internetwork packet exchange (IPX)
7026  LAN port management system
7027  Simple network management protocol
7028  Packet control facility
7029  X.25 DTE
7030  LAPB
7031  Point-to-point protocol
7032  Frame Relay DTE
7035  Statistics Management System
7036  APPN Protocol Code
7037  SNA Common Tools
7038  LLC2 Protocol Code
7039  ATM
7040  Source route end station
7041  ATM Networking
7042  ATM AAL1
7043  Trace
7044  SNA
7046  SNA GvcIf
7047  SNA
7048  Frame Relay ISDN
7049  Voice Networking
7050  Remote Service Agent (Rsa)
7052  LAN Emulation Client (Lec)
7053  Multiservice Cut Through Switching (Mcs)
7054  Sparing Management
7056  Voice Server Processor (VSP)/Narrowband Service Trunk over ATM (Nsta)
7058  Hunt Group (Hg)
7060  LP Eng Arc
7062  WirelessPCU project
7064  MPLS

For information on Preside Multiservice Data Manager-generated alarms (IndexGroup 0999), see 241-6001-501 Preside MDM Proxy Alarms Reference Guide.

Component field


The component field contains the name of the component needing repair or detecting the fault.

The component field always contains the abbreviated form of the component name. To find component abbreviations, refer to 241-5701-060 Passport 7400, 15000 Components.

Common component fields

In cases where the alarm is common to all or most components, the field contains <component_name> or <component_type>.

Severity field

Severity is always one of: indeterminate, critical, major, minor, warning, or cleared. These values and their definitions correspond to those defined by OSI in ITU-T X.733.

Note: For Passport common alarms (alarms with an IndexGroup 0000), the severity is dependent on the component. To reflect this, the severity field says <severity> to indicate that the value may change with different components.

Following are explanations of the different types of severity:

• indeterminate: the system cannot determine the level of severity.
• critical: requires you to react immediately to the failure. Usually it implies that the resource is completely disabled and that service is affected.
• major: requires that you take immediate corrective measures. The resource is severely disabled and service is affected.
• minor: corrective action should be taken to prevent a more serious fault. The resource is partially disabled but service is not affected.
• warning: action should be taken to diagnose and correct a problem. Some problem has been detected but the resource is not disabled and service is not affected.
• cleared: all previous alarms on this component are cleared. Alarms that have a status of clear always have a severity of cleared.

Status field

Status is always one of message, set, clear, or set/clear. In situations where the same alarm generates both a set and a clear, this document represents it with set/clear.

Following are explanations of the different types of status:

• message: a message alarm indicates a condition in which you may be interested. All software alarms have the status of message.
• set: indicates that a fault or failure has occurred and that an operator action may be required to correct the problem.
• clear: when the fault is repaired, a clear alarm is generated to indicate that the condition has returned to normal. Alarms that have a status of clear always have a severity of cleared.
• set/clear: sometimes an alarm must be both set and cleared. In this situation, the alarm is issued twice, once to set the alarm and once to clear the alarm. The alarm appears with the same alarm index number in both cases. This occurrence is indicated in this document by set/clear in the status field.
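The way set and clear alarms pair up can be illustrated with a small bookkeeping sketch in Python. The function name `apply_alarm` and the dictionary layout are hypothetical. The sketch assumes that a clear is matched to its set by component name plus alarm index; the text above only states that the alarm index is the same in both cases.

```python
from typing import Dict, Tuple

# Key: (component, eight-digit alarm index). Value: severity of the set alarm.
ActiveAlarms = Dict[Tuple[str, str], str]


def apply_alarm(active: ActiveAlarms, component: str, index: str,
                severity: str, status: str) -> None:
    """Update the table of outstanding alarms for one received alarm."""
    status = status.lower()
    key = (component, index)
    if status == "set":
        # A fault has occurred: remember it until the matching clear arrives.
        active[key] = severity.lower()
    elif status == "clear":
        # The condition has returned to normal: drop the outstanding entry.
        active.pop(key, None)
    elif status == "message":
        # Informational only: nothing to track.
        pass
    else:
        raise ValueError(f"unexpected status {status!r}")
```

After a set followed by a clear for the same component and index, the table is empty again; message alarms never change it.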

Details field

The details field contains the following information:

• what has caused the alarm
• how the alarm impacts the network, service, and other components

Note: Alarm descriptions that include mention of the control processor (CP) are generally applicable to a control and function processor (CFP1).

Remedial action field

The action field contains information telling the operator what actions to take to correct the problem. The remedial action might include one of the following:

• issuing an operator command
• replacing hardware
• waiting until the alarm clears itself (no action is required)
• opening a service request (SR)


CODES

70071000

CRITICAL SET 70071000 06-01-26 17:04:05 EM/BCO4013 FRUNI/34 LMI
ID: 03000318 TYPE: communications CAUSE: commProtocolError
RAW: oos ADMIN: unlocked OPER: enabled USAGE: busy AVAIL: PROC: CNTRL: ALARM: STBY: notSet UNKNW:
REL COMP: EM/BCO4013 LP/3
INT: 3/0/2/20969;frsBaseLmiHandler.cc;865;PCR6.1.58;

Details

This alarm is set when the number of frame relay Local Management Interface (LMI) procedure errors within the last eventCount events has exceeded the errorEventThreshold attribute. (Both the eventCount and errorEventThreshold attributes are provisionable.) In this situation, the local interface is declared insane.

For FrUni, FrNni and FrAtm components, both the local and remote user equipment are signalled by the LMI asynchronous status report message if asynchronous notification is supported at their local interface. For the FrMux component, all local Applications are signalled. Data transfer for all connections associated with the local DLCIs is suspended.

A clear is issued after a fixed number (provisioning parameter) of correct message exchanges between the inter-operating LMI entities has occurred. Data transfer for all connections associated with the local DLCIs is resumed.
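The set and clear conditions described above amount to a sliding-window error counter. The following Python sketch is an illustration only, not the Passport implementation. The class name `LmiMonitor` and the parameter `clear_after` are hypothetical. It also assumes that the correct exchanges needed for a clear must be consecutive, and it reads "exceeded" literally as strictly greater than errorEventThreshold.

```python
from collections import deque
from typing import Optional


class LmiMonitor:
    def __init__(self, event_count: int, error_event_threshold: int,
                 clear_after: int) -> None:
        # Window over the last event_count events; True marks an error.
        self.events = deque(maxlen=event_count)
        self.error_event_threshold = error_event_threshold
        self.clear_after = clear_after  # assumed: consecutive good exchanges
        self.good_run = 0
        self.alarm_set = False

    def record(self, is_error: bool) -> Optional[str]:
        """Record one LMI event and return "set", "clear", or None."""
        self.events.append(is_error)
        self.good_run = 0 if is_error else self.good_run + 1
        errors = sum(self.events)
        if not self.alarm_set and errors > self.error_event_threshold:
            self.alarm_set = True
            return "set"
        if self.alarm_set and self.good_run >= self.clear_after:
            self.alarm_set = False
            self.events.clear()  # start a fresh window after recovery
            return "clear"
        return None
```

With a window of 4 events, a threshold of 2, and 3 good exchanges required for a clear, three errors in a row produce a set and three subsequent good exchanges produce a clear.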

Remedial action

Verify that the other side of the interface has the LMI protocol enabled. Verify that the LMI parameters set on the other side are compatible with those on this side. Turn off the LMI protocol if the other side does not support the LMI protocol.

70112001

CRITICAL SET 70112001 06-01-26 17:05:02 EM/BCO4013 LP/3 V35/4
ID: 03000319 TYPE: communications CAUSE: dteDceInterfaceError
CO: LineState -> dce: rfs dsr dcd ~rts does not match provisioned, or clock is not available. Check the cabling or the far end device.
RAW: oos ADMIN: unlocked OPER: disabled USAGE: idle AVAIL: PROC: CNTRL: ALARM: STBY: notSet UNKNW:
INT: 3/0/2/21147;PmsHwProcessHandler_Actor.cc;524;PCR6.1.58;

Details

If the status is set, the link is in a state that renders the port disabled.

On a V35 or an X21 port, if the incoming line state is not consistent with that described in the provisionable attribute readyLineState, the port is disabled.

A clear will be issued when the condition has been cleared.

Probable Cause

dteDce interface error

Type

Communications

Remedial action

Issue the display command on the component that has issued the alarm to discover the cause of the problem.

For a V35 or X21 port, check if the readyLineState attribute is set up as expected and verify that the cable is connected properly.

09990012

MAJOR SET 09990012 06-01-30 09:16:19 EM/IGU4024 LP/0 OAMENET/0
ID: FE839F77 TYPE: unknown CAUSE: unknown
CO: Proxy alarm generated as a result of OSI Notification. Please refer to EM/IGU4024 LP/0 and subcomponents for possible causes of the problem.
RAW: oos ADMIN: unlocked OPER: disabled USAGE: AVAIL: PROC: CNTRL: ALARM: STBY: UNKNW:
INT: ;;;;

Details

The state of the component has been changed due to a state change notification (SCN) received from the network. When the status of the alarm is a set, it means that the SCN indicated that the component is down, and there are no corresponding active alarms (generated by the switch) on that component to indicate that it is down. Therefore Preside Multiservice Data Manager created a proxy alarm to replace the missing alarm. Preside Multiservice Data Manager will mark the component Out of service.

When the status of the alarm is a clear, it means that the SCN indicated that the component is up, and there were active alarms on that component to indicate that it is down. Therefore Preside Multiservice Data Manager generates a proxy alarm to replace the missing clear. Preside Multiservice Data Manager will mark the component In service.

The comment text of this alarm contains information on the SCN that was received that triggered Preside Multiservice Data Manager to create the proxy alarm.

This alarm is generated completely within Preside Multiservice Data Manager. It is never spooled to a Passport disk and never appears on the text interface device.

Remedial action

This alarm is issued either because the SCN was received before the alarm issued by the switch, which would put the component in the proper state (and the proxy alarm will be cleared when the real alarm is received), or because the alarm issued by the switch has been lost.

Treat proxy alarms as you would treat regular alarms and use them in debugging network problems.

Component            Severity        Status
EM/<component id>    major/cleared   set/clear

00001000

CRITICAL SET 00001000 06-01-27 13:56:46 EM/POB4008 LP/13 E1/2 CHAN/31
ID: 0D002A92 TYPE: operator CAUSE: operationalCondition
CO: The component is locked
RAW: oos ADMIN: locked OPER: disabled USAGE: idle AVAIL: offLine PROC: CNTRL: ALARM: STBY: notSet UNKNW:
REL COMP: EM/POB4008 LP/13
INT: 13/1/2/24403;osiState.cc;670;PCR6.1.58;

Details

When the status is set, the component has gone into a locked state or a shutting down state. The locked component is no longer permitted to provide service. As a result, dependent components may be operationally disabled as well.

When the status is clear, the component is unlocked.

The OSI administrative state attribute in the alarm specifies the new administrative state for the component.

Probable Cause

Denial of service, Loss of signal, Operational condition

Type

Operator, Communication

Remedial action

When the status is set, issue the unlock command to attempt to unlock the component. When the status is clear, no remedial action is required.

70115003

CRITICAL SET 70115003 06-01-27 16:24:40 EM/ENV4011 LP/6 E1/1
ID: 0600001E TYPE: communications CAUSE: lossOfSignal
CO: Loss of Signal condition has been detected (losAlarm). Check the cabling and termination panel.
RAW: oos ADMIN: locked OPER: disabled USAGE: idle AVAIL: offLine PROC: initializing CNTRL: ALARM: STBY: notSet UNKNW:
INT: 6/1/2/16524;PmsPriLinkStateHandler_Actor.cc;476;PCR6.1.58;

Details

If the status is set, the link has been in a Loss of Signal (LOS) state for greater than 2 seconds.

A clear will be issued when the LOS condition has been cleared for more than 10 seconds.
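The two timers above (more than 2 seconds of LOS to set, more than 10 seconds without LOS to clear) form a simple debounce. A minimal Python sketch follows, assuming the raw LOS indication is sampled periodically with a timestamp in seconds. The class name `LosDebouncer` and its method are hypothetical and only illustrate the timing rule.

```python
from typing import Optional


class LosDebouncer:
    SET_AFTER_S = 2.0     # LOS must persist longer than this to set
    CLEAR_AFTER_S = 10.0  # LOS must be absent longer than this to clear

    def __init__(self) -> None:
        self.alarm_set = False
        self.los = False
        self.since: Optional[float] = None  # time of the last raw change

    def sample(self, now: float, los: bool) -> Optional[str]:
        """Feed one raw LOS sample and return "set", "clear", or None."""
        if self.since is None or los != self.los:
            # Raw state changed (or first sample): restart the timer.
            self.los, self.since = los, now
        held = now - self.since
        if los and not self.alarm_set and held > self.SET_AFTER_S:
            self.alarm_set = True
            return "set"
        if not los and self.alarm_set and held > self.CLEAR_AFTER_S:
            self.alarm_set = False
            return "clear"
        return None
```

Because each raw change restarts the timer, a brief glitch in either direction neither sets nor clears the alarm.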

Probable Cause

Loss of Signal

Type

Communications

Remedial action

Check the cabling between this port and the far end port.

70150002

This alarm is generated when the time difference between the Passport UTC (moduleTime minus offset) and the network time server is greater than 1000 seconds.

The synchronization status of Passport XNTP is changed to unsynchronized and the main server is set to NULL. This allows the operator to correct the time by setting the moduleTime manually.

A clear is issued when the network time is corrected to within 1000 seconds of the network time server.
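The comparison behind this alarm is a single subtraction. The Python sketch below works through it, assuming moduleTime and the server time are available as `datetime` values and the offset as a `timedelta`. The function name `is_unsynchronized` is hypothetical.

```python
from datetime import datetime, timedelta

MAX_DIFFERENCE = timedelta(seconds=1000)


def is_unsynchronized(module_time: datetime, offset: timedelta,
                      server_utc: datetime) -> bool:
    """Return True when the alarm condition described above holds."""
    passport_utc = module_time - offset  # moduleTime minus offset
    return abs(passport_utc - server_utc) > MAX_DIFFERENCE
```

For example, with an offset of -5 hours, a moduleTime of 07:20:00 corresponds to 12:20:00 UTC. Against a server time of 12:00:00 UTC the difference is 1200 seconds, so the condition holds.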

Probable Cause

Remote transmission error

Type

Environmental

Remedial action

Check the time of the network time servers and the time of the Passport module, and correct the time manually by setting the moduleTime of the Passport module. After a few minutes, the Passport XNTP synchronizes with the network time server.

70410150

CRITICAL SET 70410150 06-01-31 11:47:58 EM/CON4014 ATMIF/41 UNI SIG
ID: 040000B7 TYPE: communications CAUSE: commProtocolError
CO: Signalling channel down! QSAAL received disconnect
RAW: oos ADMIN: unlocked OPER: disabled USAGE: idle AVAIL: PROC: CNTRL: ALARM: STBY: notSet UNKNW:
REL COMP: EM/CON4014 LP/4
INT: 4/0/2/19261;hjSigLayerMgr.cc;1834;PCR5.2.65;

Details

If the status is set, the alarm indicates that the signalling channel is down. If the status is clear, the alarm indicates that the signalling channel is up.

Probable Cause

Protocol error

Type

Communications

Remedial action

If the alarm is on during system start-up, it is possible that:
- the “side” attributes for both ends of the signalling channel are not set properly (that is, both ends may have been set to user-to-user or network-to-network).
- the vpi/vci values for the signalling channel do not match at both ends.
- one end of the signalling channel was down and the switch may need to be restarted.

If the <atmif type> is Pnni, it is possible that the Rcc channel is not up.

70070000

Details

This alarm is generated if there is a DLCI in a troubled condition under the FrAtm Interface. A troubled condition exists when there is not enough bandwidth available for a DLCI on a FrAtm interface. This condition must be cleared before a connection can be enabled. The alarm is set only once, when the initial DLCI on the interface experiences this troubled condition.

The alarm is cleared when all the DLCIs on the interface are no longer in a troubled condition.

Remedial action

Determine the amount of bandwidth and the bandwidth pool that is being requested by the connection by displaying the equivalentBitRate and assignedBandwidthPool attributes respectively, under the interworking function. Ensure there is sufficient bandwidth to accommodate the request by displaying the Connection Administrator (CA).

Increase the percentage of bandwidth allocated in the bandwidth pool used by the troubled DLCI.

If CAC is not required, then it can be turned off.

70030001

Details

If the status is set, a data collection system (DCS) Agent’s queue has reached a 75% full threshold. If the queue becomes full then subsequent records will be discarded.

There are three possible reasons for this to occur. First, if there are no requestors for this particular DCS data type then the records will be held in the Agent queues. Second, the provisioned queue size may be insufficient for the amount of DCS data traffic. Third, a requestor of this particular DCS data type could be slowing or blocking the flow of DCS records. For example, if spooling is provisioned to on, and the spooler is the only requestor of data, then if the spooler is locked or not spooling because of a recoverable condition detected by the file system (for example, disk full), data flow will be blocked and the Agent queues will fill up.

A clear will be issued when the queue size drops below 50% full and at least 5 minutes have elapsed since the set was issued. This delay provides a throttling mechanism so that bursts of records for relatively small queue sizes do not cause too many alarms within a short time. The clear will indicate how many records were discarded (if any) in the event that data arrived after the queue became 100% full.
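The 75% set threshold, the 50% clear threshold, and the 5-minute hold described above form a hysteresis rule. The following Python sketch illustrates it under stated assumptions: the class `QueueAlarm` and its methods are hypothetical, time is passed in as seconds, and the caller reports discarded records itself.

```python
from typing import Optional


class QueueAlarm:
    SET_AT = 0.75        # set when the queue reaches 75% full
    CLEAR_BELOW = 0.50   # clear only once the queue drops below 50% full
    MIN_HOLD_S = 5 * 60  # and at least 5 minutes after the set

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.set_time: Optional[float] = None  # None means no alarm is set
        self.discarded = 0

    def record_discard(self, count: int = 1) -> None:
        """Count records dropped because the queue was 100% full."""
        self.discarded += count

    def update(self, now: float, depth: int) -> Optional[str]:
        """Evaluate the queue depth and return "set", "clear", or None."""
        fill = depth / self.capacity
        if self.set_time is None:
            if fill >= self.SET_AT:
                self.set_time = now
                self.discarded = 0  # start counting for this alarm
                return "set"
            return None
        held = now - self.set_time
        if fill < self.CLEAR_BELOW and held >= self.MIN_HOLD_S:
            self.set_time = None
            return "clear"
        return None
```

The gap between the two thresholds, together with the hold time, is what keeps a queue hovering around one level from producing a stream of set and clear alarms. After a clear, `discarded` holds the count that the clear would report.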

Probable Cause

Threshold crossed

Type

Quality of service

Remedial action

If you do not wish to monitor or collect this particular type of DCS data, then you may want to provision the queue size to be zero for this data type. If this is done, then all records of this type will be discarded.

Verify that the queue size is provisioned to an adequate size for the expected amount of traffic.

Verify that there are no requestors of this DCS data type holding up the flow of records. For example, if spooling is provisioned to on, and the spooler is locked, then unlock it. If the spooler is not spooling because of a recoverable condition detected by the file system, then clear the file system problem. For information on resolving such problems, refer to the Troubleshooting chapter in 241-5701-605 Passport 7400, 15000 User Access Guide.

Alarm fields (examples are from “Alarm on a text interface device”, page 22)

Field: type
Description: This is a general explanation of why the alarm was generated. Possible values for this field include communications, quality of service, processing, equipment, environmental, security, operator, debug, or unknown. For further details, refer to “Type” (page 26).
Example: equipment
M/O: M

Field: cause
Description: This provides another level of detail of why the alarm was generated. It is information given in addition to the general explanation given in the Type field. For further details, refer to Appendix “Alarm causes” (page 775).
Example: processorProblem
M/O: M

Field: Alarm index
Description: An eight-digit number which is the principal alarm identifier. It consists of an IndexGroup and SubIndex. For further details, see “Alarm index” (page 26).
Example: 11010001

Field: ADMIN, OPER, USAGE
Description: These fields describe the possible OSI states, that is, the administrative state, operational state, and usage state of the component. For information on OSI states, refer to “OSI states” (page 27). For component-specific OSI state combinations, refer to the appropriate Appendix in 241-5701-520 Passport 7400, 15000 Troubleshooting Guide. Some of this information is also included in the user guides for the various services.
Example: unlocked/disabled/idle
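The fields described above can be pulled out of an alarm's text with a short parser. The Python sketch below is illustrative only. It assumes the alarm text is split into lines, with the first line holding severity, status, alarm index, date, time, and component, and with later lines holding `KEY: value` pairs in which a value may be empty. The function names `parse_header` and `parse_fields` are hypothetical.

```python
import re
from typing import Dict

HEADER_RE = re.compile(
    r"^(?P<severity>\w+)\s+(?P<status>\w+)\s+(?P<index>\d{8})\s+"
    r"(?P<date>\d{2}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<component>.+)$"
)


def parse_header(line: str) -> Dict[str, str]:
    """Parse the first line of an alarm into its named parts."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an alarm header: {line!r}")
    return match.groupdict()


def parse_fields(line: str) -> Dict[str, str]:
    """Parse "KEY: value KEY: value ..." where a value may be empty."""
    fields: Dict[str, str] = {}
    key = None
    for token in line.split():
        if token.endswith(":") and token[:-1].isupper():
            key = token[:-1]  # a new single-word upper-case key starts here
            fields[key] = ""
        elif key is not None:
            fields[key] = (fields[key] + " " + token).strip()
    return fields
```

Applied to the line `RAW: oos ADMIN: unlocked OPER: enabled USAGE: busy AVAIL: PROC:`, `parse_fields` yields the three OSI states under `ADMIN`, `OPER`, and `USAGE`, with `AVAIL` and `PROC` left empty. Keys that contain a space, such as `REL COMP`, are outside the scope of this sketch.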