These sections describe the steps that you can take to effectively troubleshoot your network when the need arises:
When you design your network for troubleshooting, you can still reach key devices when the network is experiencing connectivity or performance problems. Adequate management access depends on these design criteria:
The following sections discuss how to design your network with the preceding criteria in mind:
In a typical LAN, locate your management station directly off the backbone where it can conduct SNMP polling and manage network devices. The backbone is usually the optimum location for the management station because:
Make sure that the capacity of your backbone can accommodate the SNMP traffic that the management applications generate.
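As a rough way to check whether the backbone can absorb polling traffic, the following sketch estimates the average load that periodic SNMP polling adds. All figures (device count, requests per device, bytes per request/response exchange, polling interval, and the 100 Mbps backbone speed) are hypothetical assumptions chosen for illustration, not measurements from any management application.

```python
def snmp_polling_load_bps(devices, requests_per_device, bytes_per_exchange, interval_s):
    """Average bits per second generated by polling every device once per interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    total_bytes = devices * requests_per_device * bytes_per_exchange
    return total_bytes * 8 / interval_s


def backbone_share(load_bps, backbone_bps):
    """Fraction of backbone capacity consumed by the polling load."""
    return load_bps / backbone_bps


# Hypothetical figures: 200 devices, 20 request/response exchanges each,
# about 200 bytes per exchange, polled every 30 seconds on a 100 Mbps backbone.
load = snmp_polling_load_bps(devices=200, requests_per_device=20,
                             bytes_per_exchange=200, interval_s=30)
share = backbone_share(load, backbone_bps=100_000_000)

print(f"{load / 1000:.1f} kbps")   # 213.3 kbps
print(f"{share:.3%}")              # 0.213%
```

The estimate is an average only. Polling is bursty, so the short-term peak can be considerably higher than this figure when all devices are polled at the same moment.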
Figure 2 shows a management station that is set up at the network backbone and polling network devices.
Figure 2 SNMP Management at the Backbone
Although SNMP management from the backbone is a good way to keep track of what is happening on your network, do not rely on it exclusively. Because SNMP management occurs in-band (that is, SNMP traffic shares network bandwidth with data traffic), network troubleshooting using SNMP can become a problem in these ways:
To reduce the amount of SNMP traffic on your network, set up one or more probes to collect Remote Monitoring (RMON) data from the network devices. In the distributed model illustrated in Figure 3, the management station uses SNMP polling to collect data from the probes rather than from all the network devices. Distributing management over the network ensures that some data collection continues even if you have network problems.
Many management applications support data from MIBs other than the RMON MIBs. For this reason, even if you are using RMON probes, some SNMP polling to individual devices from a key management station is always useful for a complete picture of your network.
Figure 3 Management at the Backbone with an Attached Probe
To extend your remote monitoring capabilities, use embedded RMON probes or roving analysis (monitoring one port for a period of time, moving on to another port for a while, and so on). However, with roving analysis, you cannot see a historical analysis of the ports because the probe is moving from one port to another.
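The trade-off with roving analysis can be made concrete with a small sketch. It models a probe that dwells on each port in turn and computes what fraction of the observation window each port was actually watched. The port names, dwell time, and window length are hypothetical, and the model is a simplification: it does not represent how any particular device implements roving analysis.

```python
from itertools import cycle, islice


def roving_schedule(ports, dwell_s, total_s):
    """Return (start_time_s, port) pairs for a probe that visits ports round-robin."""
    slots = total_s // dwell_s
    return [(i * dwell_s, port)
            for i, port in enumerate(islice(cycle(ports), slots))]


def coverage(ports, dwell_s, total_s):
    """Fraction of the window during which each port was being monitored."""
    seen = {port: 0 for port in ports}
    for _, port in roving_schedule(ports, dwell_s, total_s):
        seen[port] += dwell_s
    return {port: watched / total_s for port, watched in seen.items()}


# Hypothetical example: four ports, 60 seconds on each, over an 8-minute window.
ports = ["port1", "port2", "port3", "port4"]
schedule = roving_schedule(ports, dwell_s=60, total_s=480)
share = coverage(ports, dwell_s=60, total_s=480)
```

With four ports, each port is observed only a quarter of the time, which is why the data cannot serve as a continuous history for any single port.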
Some probes, like 3Com's Enterprise Monitor, are designed to support the large number of interfaces that are found in switched environments. The probe's high port density supports this multi-segmented switched environment. You can also use the probe's interfaces to monitor mirror (or copy) ports on the switch, which means that all data received and transmitted on a port is also sent to the probe.
Probes do not indicate which port has caused an error. Only a managed hub (a hub or switch with an onboard management module) can provide that level of detail. Probes and a hub's own management module complement each other.
On business-critical networks, you need to increase your level of management by dedicating probes to the essential areas of your network. For detailed network management, it is not enough to gather raw performance figures; you also need to know, at the network and conversation level, what is generating the traffic and when it is being generated. For this type of analysis, use reporting tools, such as Traffix Manager, and low-level fault diagnostic tools, such as LANsentry Manager®.
The three critical areas to monitor on this type of network are discussed in these sections and shown in Figure 4:
Figure 4 Probes Monitoring a Business-critical Network
On the FDDI backbone, you need to continually monitor whether it is being overutilized, and, if so, by what type of traffic. By placing the SuperStack® II Enterprise Monitor with an FDDI media module directly at the backbone, you can gather utilization and host matrix information. Traffix Manager uses these data to provide regular segment utilization reports and Top-N host reports. In addition, the probe provides a full range of FDDI performance statistics that LANsentry Manager can record or that SNMP traps can report to the management station.
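To illustrate the kind of calculation behind a segment utilization figure and a Top-N host report, here is a minimal sketch. It assumes the host matrix has already been retrieved as a mapping from (source, destination) pairs to octet counts. The data structure, host names, and counts are hypothetical, and this is not how Traffix Manager is implemented.

```python
from collections import Counter


def top_n_hosts(host_matrix, n):
    """Rank hosts by total octets sent plus received across all conversations."""
    totals = Counter()
    for (src, dst), octets in host_matrix.items():
        totals[src] += octets
        totals[dst] += octets
    return totals.most_common(n)


def utilization(octets, interval_s, link_bps):
    """Fraction of link capacity used during the sampling interval."""
    return octets * 8 / (interval_s * link_bps)


# Hypothetical host matrix: octets exchanged per conversation in one interval.
matrix = {
    ("hostA", "hostB"): 5000,
    ("hostA", "hostC"): 3000,
    ("hostB", "hostC"): 1000,
}
report = top_n_hosts(matrix, 2)

# 375,000,000 octets in 60 seconds on a 100 Mbps FDDI ring.
ring_use = utilization(375_000_000, interval_s=60, link_bps=100_000_000)
```

Counting both directions for each host is one possible design choice. A report that ranks only transmitters or only receivers would sum a single side of each pair instead.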
To ensure management access to the probe, provide a direct connection to the probe from your management station. This connection lets you access probe data even if the ring is unusable, and it keeps management traffic off the main ring.
The Internet link is a concern for dedicated network management because it:
As with the FDDI backbone, Traffix Manager reports can indicate whether you are paying for too much bandwidth or whether you need to purchase more. Traffix Manager can also indicate the level of use on a workgroup basis for internal billing and highlight the top sites that users visit. Similarly, you can monitor for unexpected conversations and protocols.
You also need to know the error rates on this link and whether you are experiencing congestion because of circumstances on the Internet provider's network. LANsentry Manager can record and display these statistics and provide a detailed real-time view.
The third area of interest in this network is the large number of switch-to-end station links. When detailed analysis of these devices is required (for example, if one of the ports on the network suddenly reports much higher traffic than normal), you need to track the source of the problem and decide whether you can optimize the traffic path. In this case, you need a way to view the traffic on the switch port at a conversation level.
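One simple way to decide that a port is reporting "much higher traffic than normal" is to compare its current rate against a baseline built from earlier samples. The sketch below flags ports whose current rate exceeds a multiple of their historical average. The port names, sample values, and the threshold factor of 3 are assumptions for illustration only.

```python
from statistics import mean


def ports_above_baseline(history, current, factor=3.0):
    """Return ports whose current rate exceeds factor times their average past rate."""
    flagged = []
    for port, rate in current.items():
        past = history.get(port)
        if not past:
            continue  # no baseline yet, so nothing to compare against
        baseline = mean(past)
        if baseline > 0 and rate > factor * baseline:
            flagged.append(port)
    return sorted(flagged)


# Hypothetical per-port rates in kilobits per second.
history = {"1/1": [100, 120, 110], "1/2": [500, 480, 520]}
current = {"1/1": 900, "1/2": 510, "1/3": 40}
suspects = ports_above_baseline(history, current)
```

A flagged port is only a starting point. The conversation-level view from a probe is still needed to find out which stations and protocols are responsible.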
By placing a SuperStack II Enterprise Monitor in a central location, you can easily attach it to the switches that have the most Ethernet ports as the need arises. By using the roving analysis feature of many 3Com devices, you can copy data from a monitored port to the port on the switch that is connected to the SuperStack II Enterprise Monitor. When a problem arises, activate roving analysis on the affected switch so that LANsentry Manager or Traffix Manager can collect the data from the SuperStack II Enterprise Monitor. These applications can then monitor the network data for the devices that are connected to that switch.
To minimize your dependency on SNMP management, set up a way to reach the console of your key networking devices. Through the console, you can often view Ethernet, FDDI, Asynchronous Transfer Mode (ATM), and token ring statistics, view routing and bridging tables, and determine and modify device configurations.
Out-of-band console connections (that is, management over a dedicated line to a device) are also key to network troubleshooting. If the network goes down, your console connections are still available.
The types of console connections include:
Figure 5 shows management of a device through the serial line and modem ports.
Figure 5 Out-of-band Management Using the Serial and Modem Ports
Sometimes, direct access to network devices through out-of-band management is the only way to examine a network problem. For example, if your network connections are down, you can use the console connection to one of your key routers to examine its routing table. The routing table lists the networks that the router can reach, which allows you to narrow the area of the problem. You can also ping from this device to further investigate which areas of the network are down.
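Narrowing the problem area from a routing table amounts to comparing the routes that the router currently holds against the networks you expect it to reach. The following sketch shows that comparison, assuming the routes have already been copied from the console into a list of prefixes. The prefixes are hypothetical, and the check is an exact prefix match only, so it does not account for summary or default routes.

```python
import ipaddress


def missing_networks(expected, routing_table):
    """Return expected prefixes that have no exactly matching route."""
    present = {ipaddress.ip_network(prefix) for prefix in routing_table}
    return [prefix for prefix in expected
            if ipaddress.ip_network(prefix) not in present]


# Hypothetical prefixes: what the router should know versus what it reports.
expected = ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16"]
routing_table = ["10.1.0.0/16", "10.3.0.0/16"]
missing = missing_networks(expected, routing_table)
```

Here the missing 10.2.0.0/16 route points to the part of the network to investigate first, for example by pinging addresses in that range from the router.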
Although out-of-band management keeps you in contact with a particular device during a network problem, it does not inform you about all the areas of your network from a central point. You must access each device separately. To manage devices more centrally, you can set up a communications server (often called a comm server). See Figure 6.
Figure 6 Out-of-band Management with a Communications Server
For optimal benefit, provide two management connections to the comm server:
To ensure that a management station can always access the backbone, set up a redundant management system. In this setup, management applications (often different ones) run on separate management workstations, which are connected to the backbone through separate network devices or through a network card.
This setup allows the management workstations to monitor each other and report any problems with their attached network devices. The redundancy system also provides a backup management connection to your network if one management station loses connectivity.
This section provides some additional tips for designing your network for troubleshooting.