debugging windbg kernel-mode

WinDbg loses connection when debugging over the network, and the target machine freezes


I'm trying to get WinDbg kernel debugging over the network to work, but it always loses the connection after I break into the debugger (Debug->Break) and then try to resume the target (Debug->Go). If I never break into the debugger, the connection appears to stay stable for some period of time, and I can even see debug print statements in WinDbg as I use the target system. The connection also seems to be good while the target is in debug break, because I can gather information from the target system: I use "!ustr srv!SrvComputerName" to get the target computer name, and it returns the correct name. Any help would be much appreciated.

Setting up the systems: I followed the instructions on the MSDN website to set up my target and host systems.

Debugging: Below are my attempts to resolve this issue.

  1. Disabling Flow Control, and using Half Duplex mode, on the network adapter. I tried this after reading this post: WinDbg, host machine lose network if test machine is on the same switch
  2. Buying new network adapters. According to this webpage, my network adapter should support network kernel debugging. However, further investigation shows that vendors have a bad habit of not updating their device IDs, so I decided to rule out this possibility by buying new adapters from different vendors.
  3. Changing the network port. I've tried a handful of different network ports (49152-65535) just in case one of them was being used for a different purpose.
  4. Unplugging the Ethernet cable, and then plugging it back in. Once the connection had been lost, I tried this hoping it would re-establish the connection.
  5. Rebooting the target system. Same reason as #4.
  6. Changing PCIe ports. I'm running out of options.
  7. Moving the host system to a different network switch. No change.
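For attempt #3, one way to rule out a port clash on the host is to try binding the UDP port yourself before starting WinDbg. The following is only a minimal sketch, assuming Python is available on the host; the helper names `udp_port_is_free` and `first_free_udp_port` are made up for illustration and are not part of WinDbg or the WDK. It only tells you whether something else on the host currently holds the port, nothing about the target side.

```python
import socket
from typing import Optional


def udp_port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Bind failed: the port is in use, reserved, or not permitted.
            return False
        return True


def first_free_udp_port(start: int = 49152, end: int = 65535,
                        host: str = "0.0.0.0") -> Optional[int]:
    """Return the first port in [start, end] that can be bound, or None."""
    for port in range(start, end + 1):
        if udp_port_is_free(port, host):
            return port
    return None
```

Run it with WinDbg closed, since WinDbg itself listens on the debug port while a session is open and would make the port look busy.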

Observation:

  1. Wireshark shows that the target system sends UDP packets to the host system as soon as it boots up, but the host system does not respond until WinDbg is launched. More interestingly, the target system continues sending UDP packets to the host even after the target has become unresponsive. Unfortunately, I don't understand the data in the UDP packets.
  2. WinDbg can consistently re-establish the connection with the target system if WinDbg is restarted. The target system seems to be stuck in debug break.
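As a cross-check on the Wireshark observation, a small listener on the host can confirm whether the target's UDP packets actually reach a user-mode socket on the debug port (as opposed to only being visible to the capture driver). This is a rough sketch, assuming Python on the host and WinDbg closed so the port is free; `capture_udp` is a hypothetical helper name, and the sketch makes no attempt to interpret the packet payload.

```python
import socket
import time


def capture_udp(port, duration=5.0, host="0.0.0.0", max_packets=100):
    """Collect (source_ip, source_port, payload_length) for UDP datagrams
    arriving on `port` until `duration` seconds pass or `max_packets` arrive."""
    seen = []
    deadline = time.monotonic() + duration
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while len(seen) < max_packets:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            # Never block past the overall deadline.
            sock.settimeout(remaining)
            try:
                data, (src_ip, src_port) = sock.recvfrom(65535)
            except socket.timeout:
                break
            seen.append((src_ip, src_port, len(data)))
    return seen
```

For example, `capture_udp(50000, duration=10.0)` would listen for ten seconds on port 50000 (a placeholder; use the port configured for the debug session). If the list stays empty while Wireshark still shows traffic from the target, something on the host, such as a firewall rule, may be dropping the packets before they reach the application.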

System Info: The host system is running Windows 8.1 Pro. The target system is running Windows 8.1 Enterprise Evaluation (8 GB of RAM).

WinDbg print out:

Microsoft (R) Windows Debugger Version 6.3.9600.17237 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Using NET for debugging
Opened WinSock 2.0
Waiting to reconnect...
Connected to target **.**.*.*** on port ***** on local IP **.**.*.***
Connected to Windows 8 9600 x64 target at (Fri Mar 27 18:58:06.217 2015 (UTC - 7:00)), ptr64 TRUE
Kernel Debugger connection established.

************* Symbol Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows 8 Kernel Version 9600 MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 9600.17031.amd64fre.winblue_gdr.140221-1952
Machine Name:
Kernel base = 0xfffff801`00e70000 PsLoadedModuleList = 0xfffff801`0113a2d0
Debug session time: Fri Mar 27 18:58:06.918 2015 (UTC - 7:00)
System Uptime: 0 days 0:47:15.869
Break instruction exception - code 80000003 (first chance)
*******************************************************************************
*                                                                             *
*   You are seeing this message because you pressed either                    *
*       CTRL+C (if you run console kernel debugger) or,                       *
*       CTRL+BREAK (if you run GUI kernel debugger),                          *
*   on your debugger machine's keyboard.                                      *
*                                                                             *
*                   THIS IS NOT A BUG OR A SYSTEM CRASH                       *
*                                                                             *
* If you did not intend to break into the debugger, press the "g" key, then   *
* press the "Enter" key now.  This message might immediately reappear.  If it *
* does, press "g" and "Enter" again.                                          *
*                                                                             *
*******************************************************************************
nt!DbgBreakPointWithStatus:
fffff801`00fcab90 cc              int     3
0: kd> g
... Retry sending the same data packet for 64 times.
The transport connection between host kernel debugger and target Windows seems lost.
please try resync with target, recycle the host debugger, or reboot the target Windows.
... Retry sending the same data packet for 128 times.
... Retry sending the same data packet for 192 times.

At this point WinDbg is no longer responsive and continues sending data packets. The target system is also unresponsive.


Solution

  • I finally solved this problem by switching the host system. In the beginning, I thought the target system was the problem, because MSDN only puts the NIC debug requirement on the target system. It appears that there might be requirements placed on the host system as well.

    New host system: Desktop (Identical to target system)

    Previous host system: Laptop

    NOTE: I don't really know the root cause. Both NICs are on the Supported Ethernet NICs list, I used the same WinDbg version (the one that came with the WDK) on both hosts, and all systems are on the same switch.