
Optimizing Windows NT

Last Updated 2/3/2009 3:42:59 PM


Abstract


This chapter discusses a number of strategies and techniques for troubleshooting and recovering Windows NT.

THIS CHAPTER ASSISTS in resolving some of the problems that can occur with your Windows NT installation. There are a number of different events that can cause problems on your system, including hard drive failure or corruption, a buggy application or application installer, a problematic Registry modification, or a virus attack. There are also a number of problems that are directly caused by bugs or incompatibilities related to the Windows NT operating system itself.

The information in this chapter may also prove useful should you encounter a situation where your NT system refuses to boot. Assuming that you’ve taken the precautionary steps outlined in Chapter 12 to prepare yourself for such situations, there is an excellent chance that you’ll be able to recover your system to its original state. Even if you haven’t made the preparations outlined in Chapter 12, you may still be lucky enough to perform a successful system recovery. However, the chances are much better when you have your “Break Glass in Case of Emergency” materials at the ready; these materials should include at least a full system backup and an updated Emergency Repair Disk. An even better set of disaster recovery tools includes several generations of the Emergency Repair Disk, a Windows NT startup disk, a backup of the Master Boot Record (MBR) and partition tables of every hard drive on the system, and extra copies of the Windows NT Registry hive files on a removable storage device.


SERVICE OR DRIVER FAILING TO LOAD

The most commonly seen problem in Windows NT is the failure of a service or driver at system startup. This circumstance results in the appearance of the dialog box shown in Figure 13-1.

This situation often occurs after a hardware change of some kind such as the installation of a new video or SCSI card or changing your CD-ROM drive from an IDE version to a SCSI version. It can also occur after an application has been added to or removed from the system or if a device fails to initialize properly when its driver loads.

The first thing to do in such situations is identify exactly which driver(s) or service(s) are failing. This is accomplished with the Windows NT Event Viewer application. After running Event Viewer, examine the System and Application logs to see which services or drivers have red stop signs next to events related to them. For example, let’s say you just replaced one SCSI card in your system with a different SCSI card and successfully installed the driver for the new card. However, on starting the system you receive the “At least one service or driver failed during system startup. Use Event Viewer to examine the event log for details” message (Figure 13-1). An examination of the Event Viewer System log would reveal something similar to the screen shown in Figure 13-2.

NOTE: The System log displays events related to system services and drivers, whereas the Application log displays user application-related events. The third log is the Security log.

Here, we see that a service is failing to load at startup (as indicated by the red Stop icon and the event source of Service Control Manager next to it). Double-clicking the entry brings up the Event Detail window shown in Figure 13-3, which provides additional information about the problem.

Examining this Event Detail window, we recognize the driver name, aic78xx, as the driver for the SCSI card we just removed. Now that we’ve identified the source of the boot-up error message, we can diagnose the problem further and determine its cause. In this case, we forgot to tell NT that we’re no longer using the old card, so it’s still trying to load the card’s driver at boot. Because we know this card is no longer in the system, we know that we can safely remove its driver without causing boot problems. Normally, you remove the driver for the card using the SCSI Adapters Control Panel. However, even when you have properly removed a device’s driver, Windows NT may continue trying to load it at startup. This happens when the driver’s startup type is not changed to Disabled or Manual, so NT attempts to initialize it during each system boot.
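The filtering step described above (looking only at System log errors raised by the Service Control Manager) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the EventRecord class, the startup_failures function, and the sample records are all hypothetical and are not read from a real event log.

```python
from dataclasses import dataclass


@dataclass
class EventRecord:
    log: str         # "System", "Application", or "Security"
    event_type: str  # "Error", "Warning", or "Information"
    source: str      # component that logged the event
    message: str     # event detail text


def startup_failures(events):
    """Return System-log error events logged by the Service Control Manager."""
    return [
        e for e in events
        if e.log == "System"
        and e.event_type == "Error"
        and e.source == "Service Control Manager"
    ]


# Hypothetical sample records for illustration only (not real log output).
sample = [
    EventRecord("System", "Information", "EventLog", "Illustrative informational event."),
    EventRecord("System", "Error", "Service Control Manager",
                "Illustrative message: driver aic78xx failed to load."),
    EventRecord("Application", "Error", "ExampleApp", "Illustrative application error."),
]

for event in startup_failures(sample):
    print(event.source, "-", event.message)
```

Only the second record survives the filter, which mirrors what you do by eye in Event Viewer: ignore informational entries and application events, then read the detail text for the driver or service name.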

In these situations, the remedy is to manually disable the appropriate device driver or service in either the Services or Devices Control Panel (whichever is appropriate). Once there, locate the service or device in question and use the Startup... button to reconfigure the device/service’s startup type to either Manual or Disabled. This configuration dialog is shown in Figure 13-4.
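For readers who prefer to see the startup types as data, the sketch below maps the numeric Start value that, to the best of my knowledge, Windows NT stores for each service and driver under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services to the names shown in the Startup dialog. The function names are hypothetical, and read_start_value assumes a Windows system with Python available; treat it as an illustration rather than a supported tool.

```python
# Start values and the startup types they correspond to.
START_TYPES = {
    0: "Boot",
    1: "System",
    2: "Automatic",
    3: "Manual",
    4: "Disabled",
}


def start_type_name(value):
    """Translate a numeric Start value into its startup type name."""
    return START_TYPES.get(value, "Unknown")


def loads_at_startup(value):
    """Boot, System, and Automatic entries are started during system boot."""
    return value in (0, 1, 2)


def read_start_value(service_name):
    """Read the Start value for a service or driver (Windows only)."""
    import winreg  # imported here because the module exists only on Windows

    path = "SYSTEM\\CurrentControlSet\\Services\\" + service_name
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "Start")
    return value


for value in sorted(START_TYPES):
    print(value, start_type_name(value), loads_at_startup(value))
```

A driver left at Boot, System, or Automatic is the kind that NT keeps trying to initialize at every boot; setting it to Manual or Disabled in the Control Panel is what stops the startup error.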

Although this example involved a driver for a device that was no longer in the system, there are many other types of service and driver failures. Common causes include the following:
  • Service logon failure. This can occur when the service’s logon account password has changed, when the logon account has been deleted altogether, or when the logon account lacks the required permissions or has not been assigned the special “Log on as a service” user right in User Manager.
  • One or more of the files related to a service or device driver was moved, deleted, corrupted, or is for a different version of Windows NT.
  • A service or device driver on which a service is dependent has failed to load.
  • There is a resource conflict between the device the service or driver relates to and another device in the system (such as a hardware IRQ, I/O Base address, Upper Memory address, or DMA channel conflict).

TIP: If you receive the “At least one service or driver failed during system startup. Use Event Viewer to examine the event log for details” message for a device driver whose device you’re certain is no longer installed, you can stop the error message from occurring by setting the device’s Startup type to Disabled in the Devices Control Panel.
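When several failures appear in the log at once, the dependency cause listed above suggests starting with the failures that are not explained by another failure. The following sketch shows that idea; the root_failures function and the service names are hypothetical, and the dependency map is supplied by hand rather than read from the system.

```python
def root_failures(failed, depends_on):
    """Return failed services none of whose direct dependencies also failed.

    failed:     iterable of names that failed to start
    depends_on: mapping of name -> list of names it depends on
    """
    failed = set(failed)
    return sorted(
        name for name in failed
        if not failed.intersection(depends_on.get(name, ()))
    )


# Hypothetical names for illustration only.
depends_on = {
    "ExampleServer": ["ExampleNetDriver"],
    "ExampleNetDriver": [],
}
failed = ["ExampleServer", "ExampleNetDriver"]

print(root_failures(failed, depends_on))
```

Here ExampleServer failed only because ExampleNetDriver did, so the driver is the one worth investigating first.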

If you believe the problem could be related to a hardware resource conflict of some kind, try assigning different resources (e.g., IRQ, I/O Base Address, DMA channel, or Upper Memory Address range) to one or more cards on the system. After this, reboot the system and see if the error message repeats. Also try removing cards one by one and restarting the system after each removal to see if the service(s) or driver(s) in question begins to work properly.
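Before pulling cards, it can help to write down each card's settings and look for duplicates. The sketch below does this for a hand-entered inventory. The find_conflicts function, the card names, and the resource values are hypothetical, and the check is deliberately simple: it compares exact values only and does not detect overlapping address ranges.

```python
from collections import defaultdict


def find_conflicts(cards):
    """Return {(resource kind, value): [card names]} for values used more than once.

    cards: mapping of card name -> {resource kind: value}
    """
    users = defaultdict(list)
    for name, resources in cards.items():
        for kind, value in resources.items():
            users[(kind, value)].append(name)
    return {key: names for key, names in users.items() if len(names) > 1}


# Hypothetical inventory for illustration only.
cards = {
    "Network card": {"irq": 10, "io_base": 0x300},
    "Sound card": {"irq": 5, "io_base": 0x220, "dma": 1},
    "SCSI card": {"irq": 10, "io_base": 0x330},
}

for (kind, value), names in find_conflicts(cards).items():
    print(kind, value, "shared by", ", ".join(names))
```

In this made-up inventory the network and SCSI cards both claim IRQ 10, so one of them would be the first candidate for reassignment.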

There are many different types of service and device driver-related errors you might see when examining the Event logs, ranging from informational to extremely serious in nature. Because Windows NT automatically notifies you about any device driver or service (with an Automatic startup type) that failed to load, you always know at boot when something has gone awry. The next step is to open Event Viewer to get additional information about exactly which services or drivers failed and why. More often than not, you’ll get specific information that will allow you to diagnose and solve the problem.


RESOLVING APPLICATION CRASHES WITH DR. WATSON

If you’ve used Windows NT for any length of time, it’s likely that you’ve encountered one or more application crashes that resulted in the appearance of a special Windows NT utility called Dr. Watson. Dr. Watson is a Windows NT application error debugger that Microsoft ships with Windows NT. This utility’s primary job is to detect application errors when they occur and to help you, Microsoft, and/or the application’s developer to diagnose and determine the cause of the error. It does this by automatically, at the time of the crash, creating an error log file and, optionally, a binary application “dump” file. These files log diagnostic information that provides helpful clues about why the crash occurred. When an application is consistently crashing, a programmer or technical support representative at the software vendor’s organization can often use this information to resolve the problem.

Windows NT automatically configures Dr. Watson as the system’s default crash debugger during installation. When a user application crash occurs, a Dr. Watson dialog box automatically appears and informs you that a log file is being generated. By default, the log file is named DRWTSN32.LOG and stored in the %SYSTEMROOT% (Windows NT installation) folder; for example, C:\WINNT. Once the log file is generated, you can either examine it yourself or forward it to technical support personnel at Microsoft or the application’s vendor for further examination.

Although Dr. Watson is automatically installed for you, it is possible to customize the program’s operation by running it manually. The program file is DRWTSN32.EXE and is located in the %SYSTEMROOT%\SYSTEM32 folder. To run it, go to an NT command prompt or the Start Menu Run option, type DRWTSN32, and press Enter. The Dr. Watson main dialog, shown in Figure 13-5, should appear.

The options found on this screen are:
  • Dump Symbol Table: This option determines whether Dr. Watson for Windows NT includes the symbol table for each involved module in the log file. Symbol table dumps contain the address and name for each symbol and require that the kernel debugger symbol files have first been installed (see the “Configuring STOP Error Behavior” section of Chapter 12 for information on installing these files). This option can provide a greater level of detail regarding the problem, but also increases the overall size of the resulting DRWTSN32.LOG file. This option is off by default.
  • Dump All Thread Contexts: This option controls which application threads Dr. Watson will dump information on. If checked, Dr. Watson will dump information on each thread in the application causing the error; otherwise, it will only log information about the thread that caused the error. This option is on by default.
  • Append to Existing Log File: Determines whether Dr. Watson appends log file information to the end of the existing DRWTSN32.LOG file (if any exists) or creates a new log file for each new application error. As with the symbol table option, be aware that this can significantly increase the size of the log file over time; if selected, be sure to periodically check the log file and delete it if necessary to save disk space. This option is on by default.
  • Visual Notification: Determines whether Dr. Watson displays a pop-up window with an OK button after an application error occurs. Even if selected, the box automatically disappears if the OK button isn’t selected within five minutes of the error being issued. This option is on by default.
  • Sound Notification: This option determines whether Dr. Watson plays a sound (.WAV file) when an application error occurs. The sound played is either the file specified in the Wave File option, or a standard beep if no .WAV file is selected. This option is off by default.
  • Create Crash Dump File: This option determines whether Dr. Watson creates a binary crash dump file. A support technician might request this file when diagnosing a problem, or it can be loaded into a debugging utility for examination. If you mark this checkbox you must also specify a filename for the crash dump file in the Crash Dump option box. This option is on by default.

To ensure that Dr. Watson generates a binary memory dump file in addition to the DRWTSN32.LOG file, make sure that at least the Create Crash Dump File option is selected and that a valid path and filename are listed in the Crash Dump field. By default, this file is named USER.DMP. This file should not be confused with the MEMORY.DMP file, which is a different memory dump file that is generated when a system STOP blue screen error occurs (see the next section for additional information on this topic).
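Because the Append to Existing Log File option lets DRWTSN32.LOG grow without limit, a periodic size check is worthwhile. The sketch below shows one way to do it. The log_needs_attention function is hypothetical, the 1 MB threshold is an arbitrary assumption, and the path falls back to C:\WINNT only when the SYSTEMROOT environment variable is not set.

```python
import os


def log_needs_attention(path, max_bytes):
    """Return True if the file exists and is larger than max_bytes."""
    try:
        return os.path.getsize(path) > max_bytes
    except OSError:
        # Missing or unreadable file: nothing to clean up.
        return False


# Default log location described in the text.
system_root = os.environ.get("SYSTEMROOT", r"C:\WINNT")
log_path = os.path.join(system_root, "DRWTSN32.LOG")

ONE_MEGABYTE = 1024 * 1024  # arbitrary threshold chosen for this sketch

if log_needs_attention(log_path, ONE_MEGABYTE):
    print("Consider archiving or deleting", log_path)
else:
    print("No action needed for", log_path)
```

The sketch only reports; deleting or archiving the log is left to you, since a support technician may still want the older entries.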

Dr. Watson is normally configured to start automatically whenever an application crashes. However, if Dr. Watson doesn’t appear on your system, it is possible that it was disabled for some reason. To reset Dr. Watson as the default debugger on your system, run the utility with the /I switch as follows:
DRWTSN32 /I
After this command is run (e.g., from a command prompt session or the Start Menu Run dialog box), Dr. Watson is reinstated as the default system debugger.
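To confirm which debugger is registered, you can inspect the debugger command line that Windows NT keeps in the Registry. To the best of my knowledge this is the Debugger value under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug, but treat that location as an assumption and verify it on your own system. The sketch below only parses such a command line once you have it. The function names are hypothetical, and the parser assumes that a program path containing spaces is enclosed in quotes.

```python
import ntpath


def debugger_program(debugger_value):
    """Extract the program name (no folder, no extension) from a command line."""
    text = debugger_value.strip()
    if text.startswith('"'):
        # Quoted program path: take everything up to the closing quote.
        program = text[1:].split('"', 1)[0]
    else:
        # Unquoted: the program is the first whitespace-separated token.
        program = text.split(None, 1)[0] if text else ""
    name = ntpath.basename(program)
    return ntpath.splitext(name)[0].lower()


def is_dr_watson_default(debugger_value):
    """Return True if the debugger command line launches DRWTSN32."""
    return debugger_program(debugger_value or "") == "drwtsn32"


print(is_dr_watson_default("drwtsn32 -p %ld -e %ld -g"))
print(is_dr_watson_default(r'"C:\tools\otherdbg.exe" -p %ld -e %ld'))
```

If the check reports that another program is registered, running DRWTSN32 /I as described above is the way to restore Dr. Watson.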


STOP ERRORS (A.K.A. “BLUE SCREENS OF DEATH”)

It seems that every operating system has its dreaded error message that users never want to see. In MS-DOS, it is the EMM386 Exception Errors 12 and 13; in Windows 3.1, it is the “General Protection Fault.” Novell system administrators are loath to see crash-indicative server “Abend” errors, and Linux users will tell of their experiences with “Kernel Panic” error messages. For Windows NT users, the dreaded message is the infamous STOP error, also affectionately known as a “Blue Screen of Death.” These are operating system–generated bug checks that deliberately bring the system to a halt when severe error conditions are encountered. The resulting blue text screen includes some type of STOP error message and often other information as well, such as one or more hexadecimal codes and a list of all system drivers active in memory at the time of the STOP error. An example of a STOP error/bug check is shown in Figure 13-6.


