
Ethernet Packet Loss

Use these sections to identify and correct Ethernet packet loss:

See "Ethernet Packet Loss Reference" for additional conceptual and problem analysis detail.


If your Ethernet network shows signs of congestion, it may be experiencing packet loss. When your network is congested, utilization is usually high, packets are discarded because buffers are full, and collision rates are up. Problems related to "Collisions" are often at the heart of packet loss.

Understanding the Problem

Collisions are normal in Ethernet networks. In many cases, Collision rates of 50 percent do not cause a large decrease in throughput. The Collision rate helps mark the upper limit on your network (the maximum percentage of collisions that your network can bear), which is usually around 70 percent. If Collisions increase above this upper limit, your network can become unreliable.
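The thresholds above can be sketched as a simple check. This is only an illustration, not part of any management application: the function names are hypothetical, and it assumes that the Collision rate is the number of collisions expressed as a percentage of packets seen on the segment during the same sampling interval. The 50 and 70 percent figures are the ones quoted above; substitute the upper limit that applies to your own network.

```python
def collision_rate(collisions: int, packets: int) -> float:
    """Collisions as a percentage of packets (assumed definition)."""
    if packets <= 0:
        return 0.0
    return 100.0 * collisions / packets


def classify_collision_rate(rate_percent: float, upper_limit: float = 70.0) -> str:
    """Map a Collision rate onto the rough bands described in the text."""
    if rate_percent > upper_limit:
        return "unreliable"   # above the upper limit the network can bear
    if rate_percent >= 50.0:
        return "watch"        # often tolerable, but close to the limit
    return "normal"           # collisions are a normal part of Ethernet


if __name__ == "__main__":
    rate = collision_rate(collisions=150, packets=200)
    print(rate, classify_collision_rate(rate))
```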

When Collision rates increase, so do "Excessive Collisions", which delay the transmission of data. An increase in Collisions also indicates that network utilization and network errors, such as "FCS Errors", are probably increasing.

The real packet problems to watch for, however, are undetected collisions that show up as "Late Collisions".

If small packets are colliding, you do not necessarily see a rise in utilization, but you may still have a problem. Capture packets to determine their size.

Identifying the Problem

To identify that your network's problem is related to packet loss, verify that frames are being dropped on your network by examining this packet loss data:

The process of identifying the problem is discussed in "Searching for Packet Loss".

Solving the Problem

If you notice that packet loss data is consistently high, then your network is too congested. In this case, segment your network with the appropriate network device (such as a switch or router). If Collision data shows increases but your network's utilization is the same, then your network may have a physical problem, such as cabling that is too long. Other problems that packet loss data can indicate include:

Possible solutions to these problems are explained in the procedures in "Searching for Packet Loss".
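The distinction drawn above between congestion and a physical problem can be sketched as a comparison of two samples. This is a hedged illustration only: the function name, the sample format, and the 5-point tolerance used to decide whether a value has "increased" are all assumptions, not values taken from any management tool.

```python
def diagnose(prev_util: float, curr_util: float,
             prev_coll: float, curr_coll: float,
             tolerance: float = 5.0) -> str:
    """Compare utilization and Collision rate (both in percent) across two samples.

    tolerance is a hypothetical noise margin, in percentage points.
    """
    collisions_rising = (curr_coll - prev_coll) > tolerance
    utilization_rising = (curr_util - prev_util) > tolerance

    if collisions_rising and not utilization_rising:
        # Collisions up, utilization flat: suspect cabling or another physical fault.
        return "possible physical problem"
    if collisions_rising and utilization_rising:
        # Both up together: suspect congestion; consider segmenting the network.
        return "possible congestion"
    return "no significant change"


if __name__ == "__main__":
    print(diagnose(prev_util=30, curr_util=31, prev_coll=10, curr_coll=40))
```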


Searching for Packet Loss

When you look for packet loss, use the following applications:

Status Watch

Status Watch monitors:

Follow these steps:

1. Determine if the thresholds for the Alignment Errors tool and FCS Errors tool are being exceeded.

Table 16 identifies the problems that this data can indicate and your possible actions. For information about problems related to a nonstandard Ethernet implementation, see "Nonstandard Ethernet Problems".

Table 16 Alignment Errors, FCS Errors, and CRC Errors Data

Possible Problem: Faulty cabling

Possible Action: Examine the cable and cable connections for breaks or damage.

Possible Problem: Network noise

Possible Action: Look for improper cabling, faulty cable, faulty network equipment, or cables that are too close to equipment that emits electromagnetic interference (lamps, for example).

Possible Problem: Faulty transceiver

Possible Action: Use an analyzer to identify the problematic transceiver. If necessary, replace the transceiver, network adapter, or station.

Possible Problem: Fault at the transmitting end station

Possible Action:

1. Locate the source of the errors by looking at the module and port statistics.

2. Verify the correct operation of the transceiver or adapter card of the device that is connected to the problem port.

3. If the card appears to be operating correctly, examine the cable and cable connections for breaks or damage.

Possible Problem: Station powering up or down

Possible Action: None required.

Early implementations of Ethernet transceivers generate a significant amount of in-band noise when powering up; they frequently cause Alignment Errors and FCS Errors in an otherwise stable network.

When powering up, some software drivers for Ethernet controllers also initiate Time Domain Reflectometry (TDR) tests to test the Ethernet media. Network monitors report TDR tests as Alignment Errors and FCS Errors.

Possible Problem: Faulty adapter

Possible Action: Replace the adapter.

2. Determine if the Excessive Collisions tool threshold is being exceeded.

Table 17 identifies the problems that this data can indicate and your possible actions.

Table 17 Collisions and Excessive Collisions Data

Possible Problem: Busy network

Possible Action: Use a bridge, router, or switch to reconfigure your network into segments with fewer stations.

Possible Problem: Faulty device (adapter, switch, hub, and the like) that does not listen before broadcasting. This problem increases the incidence of all types of collisions.

Possible Action: Isolate each adapter to see if the problem stops.

Possible Problem: Network loop

Possible Action: Ensure that no redundant connections to the same station have both connections active simultaneously.

3. Determine if the thresholds for the Receive Discards tool and Transmit Discards tool are being exceeded.

If these errors are high in conjunction with the data that you learned in steps 1 and 2, then your network is overloaded. Segment your network.
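The three steps above can be summarized as a short decision routine. This is a sketch under stated assumptions: the counter names, the dictionary format, and the function names are hypothetical and do not correspond to a Status Watch interface, and the routine interprets "in conjunction with" as discards being high together with at least one of the findings from steps 1 and 2.

```python
FRAME_ERROR_TOOLS = {"alignment_errors", "fcs_errors"}
DISCARD_TOOLS = {"receive_discards", "transmit_discards"}


def exceeded_tools(counters: dict, thresholds: dict) -> set:
    """Return the names of the tools whose value is above its threshold."""
    return {
        name
        for name, value in counters.items()
        if name in thresholds and value > thresholds[name]
    }


def assess(counters: dict, thresholds: dict) -> list:
    """Walk steps 1 to 3 and return a list of findings."""
    over = exceeded_tools(counters, thresholds)
    step1 = bool(over & FRAME_ERROR_TOOLS)        # Alignment or FCS Errors
    step2 = "excessive_collisions" in over        # Excessive Collisions
    step3 = bool(over & DISCARD_TOOLS)            # Receive or Transmit Discards

    findings = []
    if step1:
        findings.append("frame errors: see Table 16")
    if step2:
        findings.append("excessive collisions: see Table 17")
    if step3 and (step1 or step2):
        findings.append("network overloaded: segment the network")
    return findings
```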

LANsentry Manager Network Statistics Graph

Use the LANsentry Manager Network Statistics graph to view data for:

Follow these steps:

1. Display a Network Statistics graph for the local Ethernet segment on which users have reported poor performance.

This graph shows the most recent trend in Collision rates. If you have set up a History sample, you can also look at the historical trend. If a number of segments are connected by repeaters, examine the graph for each Ethernet segment.

2. Analyze Utilization and Collision rates to determine whether collisions are caused by an overloaded segment or a faulty component.

3. Examine the CRC Errors and Late Collisions, which often indicate cabling or component problems.

Table 16 identifies the problems that CRC Errors can indicate and your possible actions. Table 18 identifies the problems that Late Collisions data can indicate and your possible actions.

Table 18 Late Collisions Data

Possible Problem

Possible Action

Cabling problems:

Correct the cabling problem by doing one or more of the following:

Component problems:

Correct the component problem by doing one of the following:

4. Trace Too Short Errors and Too Long Errors to the sender.

These errors often indicate faulty routers or LAN drivers and transceiver problems. Table 19 identifies the problems that this data can indicate and your possible actions.

Table 19 Too Long Errors and Too Short Errors Data

Possible Problem: A transceiver on your network is adding bits to the packets that are transmitted by the attached station.

Possible Action:

1. Use a network analyzer to identify the problematic transceiver.

2. If necessary, replace the transceiver, network adapter, or station.

Possible Problem: The jabber protection mechanism on a transceiver has failed; it can no longer protect the network from the jabbering produced by the attached station.

Possible Action: Replace the network card.

Possible Problem: Excessive noise on the cable

Note: Some 10/100 Mbps cards that autodetect the network speed may connect to the network at the wrong speed, causing excessive noise.

Possible Action: Check for improper cabling, faulty cable, faulty network equipment, or cables too close to noisy electronic equipment (lamps, for example).

If your network card autodetects the network speed, and you have ruled out other problems, manually configure the network speed.

Possible Problem: Faulty routers (two different network types are connected and the router is not enforcing proper frame size restrictions)

Possible Action: Notify the manufacturer.

Possible Problem: Faulty LAN driver

Possible Action: Replace the driver.

Possible Problem: A normal condition on a LinkSwitch® 1000, LinkSwitch® 3000, or CoreBuilder 5000 FastModule

Possible Action: If you use maximum-sized Ethernet frames (1518 bytes), the device's VLT-enabled ports add a 4-byte frame tag, resulting in a misleading Too Long Frame error.

These frames are passed successfully but still generate the Too Long Frame error message.

To eliminate the error message, reduce the maximum size of your Ethernet frames by 4 bytes.

Device View

Device View allows you to display a variety of port and device-level statistics relevant to Ethernet packet loss. Table 20 describes these statistics and their use in troubleshooting.

Table 20 Activity and Error Statistics in Device View

Statistics Group

Description

Use in Troubleshooting

Activity

Displays the total network activity and errors on the selected port.

This data shows readable packets, broadcast packets, "Collisions", total errors, and runts, which cause "Too Short Errors". You can interpret this data in the following ways:

Errors

Displays the number of frames with errors on the selected port.

The significance of errors depends on accompanying errors and prevailing network conditions. See the following error data for more information:


To display Activity and Errors statistics for a device or port, follow these steps:

1. Select the required port or device.

2. From the shortcut menu, select Activity or Errors.

The statistics available depend on the type of port or device selected. See Table 20 for troubleshooting information.

You may not be able to access these statistics on some devices using Device View. See the Device View documentation for additional information.


Ethernet Packet Loss Reference

This section explains terms that are relevant to Ethernet packet loss and provides additional conceptual and problem analysis detail.

Alignment Errors

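An Alignment Error indicates a received frame in which both of the following are true: the frame is not an integral number of octets long, and the frame does not pass the FCS check.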

Alignment Errors often result from MAC layer packet formation problems, cabling problems that cause corrupted or lost data, and packets that pass through more than two cascaded multiport transceivers. See "FCS Errors" for more information about interpreting Alignment Errors.

Collisions

Collisions indicate that two devices detect that the network is idle and try to send packets at nearly the same time (within one round-trip delay). Because only one device can transmit at a time, both devices must stop sending and attempt to retransmit. Collisions are detected by the transmitting stations.

The retransmission algorithm helps to ensure that the packets do not retransmit at the same time. However, if the two devices retry at nearly the same time, packets can collide again; the process repeats until either the packets finally pass onto the network without collisions, or 16 consecutive collisions occur and the packets are discarded.
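The retry behavior described above can be sketched in a few lines. The text does not spell out the algorithm; the sketch assumes the truncated binary exponential backoff defined by IEEE 802.3, in which a station waits a random number of slot times between 0 and 2^k - 1 after its nth collision, where k is the smaller of n and 10, and gives up after 16 attempts. The function names and the collides callback are hypothetical, and the delay is computed but not actually waited.

```python
import random

MAX_ATTEMPTS = 16    # after 16 consecutive collisions the packet is discarded
BACKOFF_LIMIT = 10   # the random range stops doubling after 10 collisions


def backoff_slots(collision_count: int, rng: random.Random) -> int:
    """Number of slot times to wait after the given consecutive collision."""
    if not 1 <= collision_count < MAX_ATTEMPTS:
        raise ValueError("collision_count must be between 1 and 15")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)   # uniform in 0 .. 2**k - 1


def transmit(collides, rng=None):
    """Try to send one packet.

    collides(attempt) -> bool reports whether that attempt collided.
    Returns the attempt number that succeeded, or None if the packet
    was discarded after 16 consecutive collisions.
    """
    rng = rng or random.Random()
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if not collides(attempt):
            return attempt
        if attempt < MAX_ATTEMPTS:
            backoff_slots(attempt, rng)   # a real station would wait this long
    return None
```

Because each station draws its delay independently, two stations that collide are unlikely to pick the same delay repeatedly, which is why a discard usually signals sustained congestion rather than bad luck.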

CRC Errors

A Cyclic Redundancy Check (CRC) Error is an RMON statistic that combines "FCS Errors" and "Alignment Errors". These errors indicate that packets were received with:

CRC Errors can cause an end station to freeze. If a large number of CRC Errors are attributed to a single station on the network, replace the station's network interface board. Typically, a CRC Error rate of more than 1 percent of network traffic is considered excessive.

Excessive Collisions

Excessive Collisions indicate that 16 consecutive collisions have occurred, usually a sign that the network is becoming congested. For each excessive collision count (or after 16 consecutive collisions), a packet is dropped. If you know the normal rate of excessive collisions, then you can determine when the rate of packet loss is affecting your network's performance. See "Knowing Your Network's Configuration" for more information.

FCS Errors

Frame Check Sequence (FCS) Errors, a type of CRC Error, indicate that frames received by an interface are an integral number of octets long but do not pass the FCS check. The FCS is a mathematical way to verify that all of a frame's bits are correct without having the system compare each bit to the original. Packets with Alignment Errors also generate FCS Errors.
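As an illustration of the FCS check, the following sketch appends and verifies a 32-bit CRC on a frame held in memory. It relies on the fact that the Ethernet FCS is a CRC-32 and that Python's zlib.crc32 uses the same polynomial. The helper names are hypothetical, and storing the four FCS octets least significant byte first is an assumption about how the captured frame is represented in software.

```python
import struct
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Return the frame with its 4-octet FCS appended."""
    fcs = zlib.crc32(frame_without_fcs)
    return frame_without_fcs + struct.pack("<I", fcs)


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over the frame body and compare it to the trailing FCS."""
    if len(frame) < 4:
        return False
    body, received_fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == received_fcs


if __name__ == "__main__":
    good = append_fcs(bytes(60))                  # 60 octets + 4-octet FCS
    bad = bytes([good[0] ^ 0x01]) + good[1:]      # flip one bit
    print(fcs_ok(good), fcs_ok(bad))
```

A receiver that finds a mismatch counts an FCS Error; it cannot tell which bit was damaged, only that at least one was.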

Both Alignment Errors and FCS Errors can be caused by equipment powering up or down or by interference (noise) on unshielded twisted-pair (10BASE-T) segments. In a network that complies with the Ethernet standard, FCS or Alignment Errors indicate bit errors during a transmission or reception. A very low rate is acceptable. Although Ethernet allows a 1 in 10^8 bit error rate, typical Ethernet performance is 1 in 10^12 or better.
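To relate a bit error rate to the frame errors that a monitor reports, the sketch below estimates the probability that a frame contains at least one bit error. It assumes that bit errors are independent, which real noise bursts often violate, so treat the result as a rough guide. The function name is hypothetical.

```python
import math


def frame_error_probability(bit_error_rate: float, frame_octets: int = 1518) -> float:
    """Probability that at least one bit of the frame is in error.

    Computes 1 - (1 - BER) ** bits using log1p/expm1 so that very
    small bit error rates do not lose precision.
    """
    bits = frame_octets * 8
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    for ber in (1e-8, 1e-12):
        print(ber, frame_error_probability(ber))
```

For a maximum-sized 1518-octet frame (12,144 bits), a bit error rate of 1 in 10^8 corresponds to roughly one damaged frame in every 8,000, while 1 in 10^12 corresponds to roughly one in every 80 million.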

Late Collisions

Late Collisions indicate that two devices have transmitted at the same time, but cabling errors (most commonly, excessive network segment length or repeaters between devices) prevent either transmitting device from detecting a collision. Neither device detects a collision because the time to propagate the signal from one end of the network to the other is longer than the time to put the entire packet on the network. As a result, neither of the devices that cause the late collision senses the other's transmission until the entire packet is on the network.

Although late collisions occur for small packets, the transmitter cannot detect them. As a result, a network suffering measurable Late Collisions for large packets is losing small packets as well.
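The timing argument above can be made concrete with a small calculation. This sketch assumes that a transmitter can detect a collision only if it is still sending when the colliding signal reaches it, so it compares the time needed to transmit the frame with the round-trip delay of the segment. The function names and the 60 µs delay in the example are illustrative assumptions, not measurements.

```python
def transmit_time_us(frame_octets: int, mbps: float = 10.0) -> float:
    """Time to put a frame on the wire, in microseconds."""
    return frame_octets * 8 / mbps


def collision_detectable(round_trip_delay_us: float, frame_octets: int,
                         mbps: float = 10.0) -> bool:
    """True if the sender is still transmitting when the collision signal returns."""
    return transmit_time_us(frame_octets, mbps) >= round_trip_delay_us


if __name__ == "__main__":
    delay = 60.0   # hypothetical round-trip delay on an overlong segment
    for size in (64, 1518):
        print(size, transmit_time_us(size), collision_detectable(delay, size))
```

At 10 Mbps a minimum-sized 64-octet frame takes 51.2 µs to send. On a segment whose round-trip delay exceeds that, the sender has finished before the collision arrives and never learns that the small packet was lost, whereas a 1518-octet frame is still being sent and the collision is reported as late.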

Nonstandard Ethernet Problems

Table 21 lists the symptoms that typically occur if a system violates the Ethernet standard.

Table 21 Symptoms of Common Ethernet Network Problems

Symptoms

Problem

Notes

"FCS Errors" and "Alignment Errors" increase significantly.

Network cabling is too long.

If you use a promiscuous network monitor, the number of Late Collisions reported by stations should correlate with the FCS and Alignment Errors reported by the monitor.

FCS and Alignment Errors increase proportionally with interference (sometimes referred to as noise hits).

Network segment is noisy.

Typically observed on a 10BASE-T network segment in a noisy environment. If you use multiple promiscuous monitors, the FCS and Alignment Errors among the monitors will not correlate.

If the monitor can track runts, also called "Too Short Errors", the number of runt packets should be significantly higher than normal.

FCS and Alignment Errors are much higher than normal.

Networks do not conform to the access scheme of Carrier Sense Multiple Access with Collision Detect (CSMA/CD).

Occurs when some implementations of Ethernet in the segment are not entirely compatible with IEEE 802.3 repeaters.

Collision fragments linger on the network long enough to collide with retry packets at the minimum interpacket gap (IPG). The IPG is smaller on one side of the repeated network, causing a lost packet.

Ethernet controllers cannot receive packets that are separated by 4.7 µs or less. Some controllers cannot sustain receptions of packets separated by as much as 9.6 µs. If runt packets are received one after another and are followed by a collision fragment, Ethernet controllers that cannot sustain reception will lose packets.

Receive Discards

Receive Discards indicate that received packets could not be delivered to a high-layer protocol because of congestion or packet errors.

Too Long Errors

A Too Long Error indicates that a packet is longer than 1518 octets (including FCS octets) but otherwise well formed. Too Long Errors are often caused by a bad transceiver, a malfunction of the jabber protection mechanism on a transceiver, or excessive noise on the cable.

Too Short Errors

A Too Short Error, also called a runt, indicates that a packet is fewer than 64 octets long (including FCS octets) but otherwise well formed.
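The size limits in the Too Long Errors and Too Short Errors definitions can be expressed as a simple classifier. This is a sketch only: the function name and the tag_octets parameter are hypothetical. The parameter models a device that adds a tag to each frame, such as the 4-byte VLT tag described in Table 19, so that a tagged maximum-sized frame is not flagged when you choose to allow for the tag.

```python
MIN_FRAME_OCTETS = 64     # including FCS octets
MAX_FRAME_OCTETS = 1518   # including FCS octets


def classify_frame_length(octets: int, tag_octets: int = 0) -> str:
    """Classify an otherwise well-formed frame by its length in octets."""
    if octets < MIN_FRAME_OCTETS:
        return "too short"    # a runt
    if octets > MAX_FRAME_OCTETS + tag_octets:
        return "too long"
    return "ok"


if __name__ == "__main__":
    print(classify_frame_length(1522))                 # tagged frame, no allowance
    print(classify_frame_length(1522, tag_octets=4))   # allowance for a 4-byte tag
```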

Transmit Discards

Transmit Discards indicate that packets were not transmitted because of network congestion.
