Archive for the ‘Postmortem Debugger’ Category

ASP.NET Worker Process Recycle/.NET Application crash - What not to do in Finalizer - includes asp.net 2.0 debugging lab

Monday, March 23rd, 2009

Click to Download debugging lab

Debugging Labs

SFTSRC.StockTrader asp.net 2.0 web application will be used to introduce you to a series of debugging labs. We will start with a simple asp.net application which crashes and follow up with memory leak, memory fragmentation, application performance and application hang.

The first lab in series will show you how an application can crash if you are not being careful while implementing finalizer.

Issue Description

Website user gets “Server Application Unavailable error” message from time to time.

ASP.NET Debugging Lab

This lab is prepared based on a production issue resolved at a customer site. Please note that you can download this debugging lab from here. Web project/solution has been created using visual studio 2008.

What will you learn

1. How to debug asp.net application

2. How to identify what is causing asp.net application pool to recycle

3. WinDbg / sos basic commands to analyze a crash dump

Application Architecture

architecture

Breaking down your thought process

1.  Since website user gets to see a generic error handler page, you will need to identify why http request is failing.

2. As shown in the above diagram, website and webservice are hosted on two different servers, so you need to identify the server where request is failing

3. You should look at the http error logs/iis logs and the events logs on each server to identify the issue

4. Logs will also help you determine if there is a network issue

Analysis

After looking at the server logs, we have determined that worker process is getting recycled on a sever hosting Sftsrc Stock Trading WebService. A worker process can get recycled depending on the configuration, low on virtual memory, crash due to unhandled exception etc. Let’s assume that you don’t have any information on what has caused worker process to recycle

Let’s look at the lab which is prepared based on the above issue description.

Lab

We will analyze this issue using Lab01- Download it from here.

1. Once you download the lab, create 2 virtual directory “StockTrader” and “TradeWS”

2. “StockTrader” will be set to folder “\Lab01\SFTSRC.StockTrader”

3. “TradeWS” will be set to folder “\Lab01\SFTSRC.UnmanagedService”

4. Browse to http://localhost/StockTrader/Default.aspx

You will see the website as shown below
debugging lab 01 snapshot

5. Enter Stock Symbol, Quantity, Price and click on Submit Order

6. Once you submit an order, you will see “Server Unavailable Error” page in a few seconds.

7. If you look at the iis log, you will see http 503 error

please visit http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for description on HTTP 1.1 status code

8. Event log will have the error message “asp.net process recycle or stopped unexpectedly.”

9. Attach a debugger to your worker process

10. we will use adplus configuration file to get a crash dump on first chance access violation.

11. download accessviolation.cfg from here

12. go to your windbg folder and run the following command

cscript.exe adplus.vbs -c accessviolation.cfg

13. Browse to http://localhost/StockTrader/Default.aspx again

14. Once you get “server unavailable error”, you should have the dump files created under your windbg folder

15. Please note that adplus configuration file has  <ProcessName>aspnet_wp.exe</ProcessName> by default, if you are using windows server then change it to <ProcessName>w3wp.exe</ProcessName>. You may have multiple dumps created if you have more than one application pool. You can identify the worker process id for the application pool hosting TradeWS and then you can create a dump based on process id.

16. Open the dump file( *****.EXE__1st_chance_AccessViolation__full_*****c.dmp) using windbg

17.  Run following command to load sos dll

0:009> .loadby sos mscorwks

18. You should always run !analyze -v when you are doing a crash dump analysis, most of the time this one command is enough to tell you what has caused the crash

0:009> !analyze -v

Attempt to read from address 00000000

DEFAULT_BUCKET_ID:  NULL_POINTER_READ

PROCESS_NAME:  aspnet_wp.exe

MANAGED_STACK:
0195F948 05CD0848 SFTSRC_UnmanagedService!SFTSRC.UnmanagedService.UnmanagedService.GetTLS()+0×20
0195F950 05CD07E6 SFTSRC_UnmanagedService!SFTSRC.UnmanagedService.UnmanagedService.Finalize()+0×1e

I have omitted most of the output for the sake of brevity.

19. You will notice that application has crashed while executing Finalize

20. On a side note you can look at all the Exception objects on managed heap with the following command

0:009> !dumpheap -type Exception

21. disassemble managed method Finalize using IP

0:009> !u 05cd07e6

05cd07e0 ff15c005cc05    call    dword ptr ds:[5CC05C0h] (SFTSRC.UnmanagedService.UnmanagedService.GetTLS(), mdToken: 06000028)
>>> 05cd07e6 8bce            mov     ecx,esi

· You will notice that Exception is occurring while calling GetTLS, where it tries to assign the return value

22. On a side note, you can use View->Registers window to look at registers for the current thread or use r command

23. disassemble managed method GetTLS using IP

0:009> !u 05cd0848
05cd0841 e81a085b73      call    mscorlib_ni+0×1c1060 (79281060) (System.Threading.Thread.GetData(System.LocalDataStoreSlot), mdToken: 060012f3)
05cd0846 8bc8            mov     ecx,eax
>>> 05cd0848 8b01            mov     eax,dword ptr [ecx]

24. You will notice that this is occurring right while calling to System.Threading.Thread.GetData() and ecx register is null

25. SOS dll has powerful command to look at clr data structure. We are going to look at GetTLS() implementation by dumping the IL(intermediate language) of this method

26. in order to dumpil, you have to first convert instruction pointer IP to Method Descriptor

0:009> !ip2md 05cd0848
MethodDesc: 05cc05b8
Method Name: SFTSRC.UnmanagedService.UnmanagedService.GetTLS()
Class: 05c54628

0:009> !dumpil 05cc05b8
ilAddr = 05c72250
IL_0000: ldstr “ThreadSpecificData”
IL_0005: call System.Threading.Thread::GetNamedDataSlot
IL_000a: stloc.0
IL_000b: ldloc.0
IL_000c: call System.Threading.Thread::GetData
IL_0011: callvirt System.Object::ToString
IL_0016: callvirt System.String::get_Length
IL_001b: ret

27. As shown in above IL, it has crashed while converting System.Threading.Thread::GetData object to String so that means System.Threading.Thread::GetData is returning NULL

Conclusion

Application Pool crashed on finalizer thread while executing Thread.GetData(). If you are not familiar with Thread Local Storage, please read it on MSDN http://msdn.microsoft.com/en-us/library/ms686749.aspx . Thread.GetData get the allocated data from thread slot but the issue here is, thread data slots are unique per thread. No other thread (not even a child thread) can get that data. Please see more details on MSDN http://msdn.microsoft.com/en-us/library/system.threading.thread.getdata.aspx . We all know that Finalizer runs on a separate thread and a finalizer thread wakes up when it the Q is notified.

You will see that in SFTSRC.UnmanagedService.UnmanagedService.CreateTLS() method, we are allocating the data slot which is in a totally different thread that’s why Finalizer thread doesn’t have access to thread specific data and resulting in crash.

You can run the following command to look at all the thread stacks

0:009> ~*e!clrstack

OS Thread Id: 0×11cc (20)
ESP       EIP
05c4ec64 7c90e4f4 [HelperMethodFrame: 05c4ec64] System.GC.WaitForPendingFinalizers()
05c4ecb4 05cd07aa SFTSRC.UnmanagedService.UnmanagedService.CreateTLS()
05c4ecc0 05cd06e8 TradeWebService.OrderService.SendOrder(System.String, UInt32, Int32, System.String, System.String)

You will notice that, while sending the order, we create a TLS and then in order to force GC and Finalizer, CreateTLS() method calls GC.Collect and GC.WaitForPendingFinalizers.

Lesson Learnt

Never access a thread specific data in Finalizer. This lab uses Thread Local Storage to show you the thread specific data but it could be anything. For example, in windows application, thread which creates a handle or control owns it so you can’t access those in finalizer.

Exercise

1. use !tls command to look at Thread Local Storage slot

2. use !teb to get familiar with thread environment block and how to use it with !tls

3. ~ command to display all the threads and get their corresponding TEB

4. Search memory to find which thread has created the slot

Download debugging lab

Share/Save/Bookmark

Analyze DrWatson minidump using WinDbg

Tuesday, November 4th, 2008

This is the analysis of a recent DrWatson(postmortem debugger) mini crash dump we received

Debugging Steps using WinDBG

  1. Open the .dmp file using WinDBG
  2. Make sure your symbol server path is set correctly(File->Symbol File Path), below is what I have set it to srv*C:\Temp\Symbols*http://msdl.microsoft.com/download/symbols;
  3. You should save the workspace to save your symbol path information unless you are using _NT_SYMBOL_PATH otherwise you will loose the information next time you start WinDBG. Remember to save it without opening a dump file to use it in next debugging session
  4. Run !analyze -v
  5. 0:182> !analyze -v

FAULTING_IP:
kernel32!RaiseException+53
77e4bee7 5e pop esi
EXCEPTION_RECORD: ffffffff — (.exr 0xffffffffffffffff)
ExceptionAddress: 77e4bee7 (kernel32!RaiseException+0×00000053)
ExceptionCode: e0434f4d (CLR exception)
ExceptionFlags: 00000001
NumberParameters: 1
Parameter[0]: 8007000e
DEFAULT_BUCKET_ID: CLR_EXCEPTION
ERROR_CODE: (NTSTATUS) 0xe0434f4d - <Unable to get error code text>
MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0xdd0 (182)
TEB information is not available so a stack size of 0xFFFF is assumed
Current frame:
ChildEBP RetAddr Caller,Callee
ADDITIONAL_DEBUG_TEXT: Followup set via attribute from Frame ffffffff on thread ffffffff
FAULTING_THREAD: 00000dd0
PRIMARY_PROBLEM_CLASS: CLR_EXCEPTION
BUGCHECK_STR: APPLICATION_FAULT_CLR_EXCEPTION
LAST_CONTROL_TRANSFER: from 79f071ac to 77e4bee7
STACK_TEXT: 15d9f418 79f071ac e0434f4d 00000001 00000001 kernel32!RaiseException+0×53
15d9f478 79f9293a 75c9cb08 00000000 00000000 mscorwks!RaiseTheExceptionInternalOnly+0×2a8
15d9f4b0 7a00e74a 15d9f508 7a3c441c 9ae216b2 mscorwks!UnwindAndContinueRethrowHelperAfterCatch+0×70
15d9f558 793a858c 0874cd34 00000000 793af628 mscorwks!JIT_NewFast+0×105
15d9f578 79e7c74b 00000000 15d9f5e0 15d9f608 mscorlib_ni+0×2e858c
15d9f588 79e7c6cc 00000000 00000000 15d9f714 mscorwks!CallDescrWorker+0×33
15d9f608 79f00eca 00000000 00000000 15d9f714 mscorwks!CallDescrWorkerWithHandler+0xa3
15d9f628 79f00e75 00000000 00000000 15d9f714 mscorwks!DispatchCallBody+0×1e
15d9f68c 79f00f03 00000000 00000000 15d9f714 mscorwks!DispatchCallDebuggerWrapper+0×3d
15d9f6c0 79f02978 00000000 00000000 15d9f714 mscorwks!DispatchCallNoEH+0×51
15d9f720 79ef3207 15d9f870 7a3b57d8 00000000 mscorwks!AddTimerCallback_Worker+0×79
15d9f734 79ef31a3 15d9f810 15d9f7bc 79f91478 mscorwks!Thread::DoADCallBack+0×32a
15d9f7c8 79ef30c3 15d9f810 9ae21bee 7a3b57d8 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
15d9f804 79ef4826 15d9f810 00000001 00000000 mscorwks!Thread::ShouldChangeAbortToUnload+0×30a
15d9f82c 79ef48cf 00000001 79f02928 15d9f870 mscorwks!Thread::ShouldChangeAbortToUnload+0×33e
15d9f844 79f02901 00000001 79f02928 15d9f870 mscorwks!ManagedThreadBase::ThreadPool+0×13
15d9f88c 79f0288c 106bcc38 00000001 00000001 mscorwks!AddTimerCallbackEx+0×83
15d9f8a0 79f02850 106bcc38 00000001 9ae21b36 mscorwks!AddTimerCallback+0×10
15d9f8dc 79f024cf 2a023e70 9ae21ac2 79f02030 mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0×64
15d9f928 79f0202a 00000000 00000000 77e619d1 mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0xb2
15d9f93c 79f021a0 9ae21a7e 00000000 79f0207d mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
15d9f994 79f95a2e 00000000 7c81c5f0 15d9fa48 mscorwks!ThreadpoolMgr::WorkerThreadStart+0×223
15d9ffb8 77e64829 106f61d8 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0×49
15d9ffec 00000000 79f959e8 106f61d8 00000000 kernel32!BaseThreadStart+0×34

6) RaiseTheExceptionInternalOnly method signature is RaiseTheExceptionInternalOnly(OBJECTREF, BOOL, BOOL)
7) 2nd parameter evaluates to whether the exception is re-thrown and the 3rd parameter tells you if this is a stack overflow exception but in this case both of them are false
8 ) ExceptionCode: e0434f4d (CLR exception) which applies to any unhandled CLR exception e.g DivideByZero
9) mini dump reports 1204 threads, as shown below

0:182> ~
0 Id: 968.1128 Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: 968.c34 Suspend: 1 Teb: 7ffde000 Unfrozen
2 Id: 968.8e0 Suspend: 1 Teb: 7ffdd000 Unfrozen
3 Id: 968.a90 Suspend: 1 Teb: 7ffdc000 Unfrozen
…………………………………………………………………………..
1200 Id: 968.24b0 Suspend: 1 Teb: 7fbd6000 Unfrozen
1201 Id: 968.1bc4 Suspend: 1 Teb: 7fae4000 Unfrozen
1202 Id: 968.25ac Suspend: 1 Teb: 7faba000 Unfrozen
1203 Id: 968.1b7c Suspend: 1 Teb: 7fab8000 Unfrozen
10) More than 200 threads have the following stack trace
0:182> ~*k
1201 Id: 968.1bc4 Suspend: 1 Teb: 7fae4000 Unfrozen
ChildEBP RetAddr
7b22ebec 7c827d0b ntdll!KiFastSystemCallRet
7b22ebf0 77e61d1e ntdll!NtWaitForSingleObject+0xc
7b22ec60 79e789c6 kernel32!WaitForSingleObjectEx+0xac
7b22eca4 79e7898f mscorwks!PEImage::LoadImage+0×1af
7b22ecf4 79e78944 mscorwks!CLREvent::WaitEx+0×117
7b22ed08 79fbc82a mscorwks!CLREvent::Wait+0×17
7b22ed1c 79fbc6da mscorwks!SVR::GCHeap::WaitUntilGCComplete+0×34
7b22ed58 79fc51bf mscorwks!Thread::RareDisablePreemptiveGC+0×1a0
7b22ed8c 79fc529b mscorwks!Thread::UserSleep+0xb1
7b22ee2c 0deb1401 mscorwks!ThreadNative::Sleep+0xce
WARNING: Frame IP not in any known module. Following frames may be wrong.
7b22ef18 793b0d1f 0xdeb1401
7b22ef20 793740ab mscorlib_ni+0×2f0d1f
7b22ef30 79e7c74b mscorlib_ni+0×2b40ab
7b22ef40 79e7c6cc mscorwks!CallDescrWorker+0×33
7b22efc0 79e7c8e1 mscorwks!CallDescrWorkerWithHandler+0xa3
7b22f0fc 79e7c783 mscorwks!MethodDesc::CallDescr+0×19c
7b22f118 79e7c90d mscorwks!MethodDesc::CallTargetWorker+0×1f
7b22f12c 79eb300f mscorwks!MethodDescCallSite::Call_RetArgSlot+0×18
7b22f2f8 79eb2f31 mscorwks!ExecuteCodeWithGuaranteedCleanupHelper+0×9b
7b22f3a8 79373ff7 mscorwks!ReflectionInvocation::ExecuteCodeWithGuaranteedCleanup+0xf9

11) There are 2 GC threads (2 and 3) since this process is running on Windows Server 2003 with 2 CPUs so one GC thread per CPU.
0:182> ~2 kb
ChildEBP RetAddr Args to Child
00b7fd88 7c827d0b 77e61d1e 000000e8 00000000 ntdll!KiFastSystemCallRet
00b7fd8c 77e61d1e 000000e8 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
00b7fdfc 79e789c6 000000e8 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
00b7fe40 79e7898f 000000e8 ffffffff 00000000 mscorwks!PEImage::LoadImage+0×1af
00b7fe90 79e78944 ffffffff 00000000 00000000 mscorwks!CLREvent::WaitEx+0×117
00b7fea4 79efa6c3 ffffffff 00000000 00000000 mscorwks!CLREvent::Wait+0×17
00b7fee8 79efb5b7 001568f8 00000006 00000000 mscorwks!SVR::t_join::join+0xae
00b7ff28 79efa9ba 00000002 00000000 001568f8 mscorwks!SVR::gc_heap::mark_phase+0×16f
00b7ff54 79efaf60 ffffffff 00000000 001568f8 mscorwks!SVR::gc_heap::gc1+0×46
00b7ff74 79efa72f 00000002 00000000 001568f8 mscorwks!SVR::gc_heap::garbage_collect+0×246
00b7ff98 79fc8583 79fc8559 77e64590 79fc857c mscorwks!SVR::gc_heap::gc_thread_function+0×6a
00b7ffb8 77e64829 001568f8 00000000 00000000 mscorwks!SVR::gc_heap::gc_thread_stub+0×92
00b7ffec 00000000 79fc8538 001568f8 00000000 kernel32!BaseThreadStart+0×34
0:182> ~3 kb
ChildEBP RetAddr Args to Child
00c7fc9c 79efd1d0 55b4de68 00000001 001568f8 mscorwks!SVR::gc_heap::mark_object_simple1+0xdf
00c7fce4 79efba46 0108ffcc 00000001 09063c94 mscorwks!SVR::gc_heap::mark_object_simple+0×254
00c7fd10 79fbb49d 09063c94 0108ffcc 00000000 mscorwks!SVR::GCHeap::Promote+0×115
00c7fd28 79fc3a38 00c7fdd0 09063c94 00000000 mscorwks!GcEnumObject+0×2d
00c7fd48 79fc39e2 00153050 79fbb475 00c7fdd0 mscorwks!Module::EnumStaticGCRefs+0×59
00c7fd74 79fc3ada 79fbb475 00c7fdd0 00174ea8 mscorwks!DomainAssembly::EnumStaticGCRefs+0×38
00c7fd98 79fc3a9b 79fbb475 00c7fdd0 2a03d8b0 mscorwks!AppDomain::EnumStaticGCRefs+0×29
00c7fdb4 79fc399b 79fbb475 00c7fdd0 79efba5a mscorwks!SystemDomain::EnumAllStaticGCRefs+0×4e
00c7fddc 79efb559 00000000 00000002 00000002 mscorwks!CNameSpace::GcScanRoots+0×1a2
00c7fe28 79efa9ba 00000002 00000000 00157d30 mscorwks!SVR::gc_heap::mark_phase+0×111
00c7fe54 79efaf60 ffffffff 00000000 00157d30 mscorwks!SVR::gc_heap::gc1+0×46
00c7fe74 79efa72f 00000002 00000000 00157d30 mscorwks!SVR::gc_heap::garbage_collect+0×246
00c7fe98 79fc8583 00000000 7c8276bf 00c7fef8 mscorwks!SVR::gc_heap::gc_thread_function+0×6a
00c7ffb8 77e64829 00157d30 00000000 00000000 mscorwks!SVR::gc_heap::gc_thread_stub+0×92
00c7ffec 00000000 79fc8538 00157d30 00000000 kernel32!BaseThreadStart+0×34
12) Mark_Phase signature is mark_phase (int gen_number, BOOL mark_only_p
parameters passed are 2, 0, that means GC was triggered for generation 2

13) Thread # 345 has triggered garbage collection for generation 2 with a request to allocate memory from GC Heap(see highleted in red)
0:182> ~345 kb 100
ChildEBP RetAddr Args to Child
2484e0bc 7c827d0b 77e61d1e 00000100 00000000 ntdll!KiFastSystemCallRet
2484e0c0 77e61d1e 00000100 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
2484e130 79e789c6 00000100 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
2484e174 79e7898f 00000100 ffffffff 00000000 mscorwks!PEImage::LoadImage+0×1af
2484e1c4 79e78944 ffffffff 00000000 00000000 mscorwks!CLREvent::WaitEx+0×117
2484e1d8 79efafc5 ffffffff 00000000 00000000 mscorwks!CLREvent::Wait+0×17
2484e1fc 79efad3d ffffffff 71ca0000 00157dd4 mscorwks!SVR::gc_heap::wait_for_gc_done+0×62
2484e224 79fd1cf3 00000002 00000000 00157d30 mscorwks!SVR::GCHeap::GarbageCollectGeneration+0×1b5
2484e2b4 79ef98cf 14fa94e0 0000000c 00000000 mscorwks!SVR::gc_heap::try_allocate_more_space+0×464
2484e2d0 79ef9b9c 14fa94e0 0000000c 00000000 mscorwks!SVR::gc_heap::allocate_more_space+0×2e
2484e2f0 79e7510e 14fa94e0 0000000c 00000000 mscorwks!SVR::GCHeap::Alloc+0×54
2484e30c 79e85806 0000000c 00000000 00000000 mscorwks!Alloc+0×60
2484e34c 79e858cb 79102290 abbf001a 000007d8 mscorwks!FastAllocateObject+0×38
2484e3f0 793ab271 01061034 2484e484 00000000 mscorwks!JIT_NewFast+0×9e
2484e444 793aaef0 f1dcc000 ffffffbc 00000000 mscorlib_ni+0×2eb271
2484e618 793b0d1f 3a5a2074 793740ab 00000000 mscorlib_ni+0×2eaef0
2484e620 793740ab 00000000 14fa94a0 2484e640 mscorlib_ni+0×2f0d1f
2484e630 79e7c74b 00000000 00000000 2484e6c0 mscorlib_ni+0×2b40ab
2484e640 79e7c6cc 2484e710 00000000 2484e6e0 mscorwks!CallDescrWorker+0×33
2484e6c0 79e7c8e1 2484e710 00000000 2484e6e0 mscorwks!CallDescrWorkerWithHandler+0xa3
2484e7fc 79e7c783 796c01c0 2484e92c 2484e84c mscorwks!MethodDesc::CallDescr+0×19c
2484e818 79e7c90d 796c01c0 2484e92c 2484e84c mscorwks!MethodDesc::CallTargetWorker+0×1f
2484e82c 79eb300f 2484e84c abbf0a12 14fa94a0 mscorwks!MethodDescCallSite::Call_RetArgSlot+0×18
2484e9f8 79eb2f31 2484ea8c abbf0942 09060054 mscorwks!ExecuteCodeWithGuaranteedCleanupHelper+0×9b
2484eaa8 79373ff7 2484ea4c 408301c8 0109dd28 mscorwks!ReflectionInvocation::ExecuteCodeWithGuaranteedCleanup+0xf9
2484eac0 79373ede 4a34402c 2484eaf0 2484eb1c mscorlib_ni+0×2b3ff7
00000000 00000000 00000000 00000000 00000000 mscorlib_ni+0×2b3ede

    Analysis  

  1. Application has thrown exception at the top of stack
  2. In CLR 2.0, unhandled exceptions at the top of the stack on any thread will terminate the application
  3. Exception has occurred after GC was triggered
  4. Mark Phase was triggered for Generation 2 that means it was under severe memory pressure
  5. since there are more than 1200 threads that itself amounts to 1.2 GB of virtual memory for just the user mode stack and each thread has user mode plus kernel mode stack although not much but kernel stack will also add up to another 3 virtual pages(12KB) 1200*12KB of physical memory, kernel stack is physical memory resident. There will be around 100 MB or so in MEM_IMAGE itself
  6. Since this exception has occurred while starting a worker thread from managed thread pool and at the same GC was also triggered. So that means there was not enough memory
  7. There is no thread leak, these many threads are by design(one thread per request, I don’t know why)
  8. Some of the interesting points

There are more than 200 threads with preemptive GC disabled what that means is these threads can’t be suspended and GC Threads will have to wait for these threads to return to preemptive GC mode in order to claim the memory. Managed threads should not be in a state with preemptive GC disabled because this is very rare as interpreted from the call stack Thread::RareDisablePreemptiveGC

Below is the description of Preemptive GC from http://blogs.msdn.com/yunjin/archive/2005/08/30/457756.aspx

Preemptive GC: also very important. In Rotor, this is m_fPreemptiveGCDisabled field of C++ Thread class. It indicates what GC mode the thread is in: “enabled” in the table means the thread is in preemptive mode where GC could preempt this thread at any time; “disabled” means the thread is in cooperative mode where GC has to wait the thread to give up its current work (the work is related to GC objects so it can’t allow GC to move the objects around). When the thread is executing managed code (the current IP is in managed code), it is always in cooperative mode; when the thread is in Execution Engine (unmanaged code), EE code could choose to stay in either mode and could switch mode at any time; when a thread are outside of CLR (e.g, calling into native code using interop), it is always in preemptive mode.

CLR rotor source code has the following comments before calling Thread::RareDisablePreemptiveGC

We must do the following in this order, because otherwise we would be constructing the exception for the abort without synchronizing with the GC. Also, we have no CLR SEH set up, despite the fact that we may throw a ThreadAbortException.

Most likely it crashed because of not having enough memory to start a worker thread since it crashed with the call stack pointing to WokerThreadStart

The real challenge is how will you find out the exception detail for sure?How will you make sure that this is indeed a out of memory exception?

Since this is a mini dump with no information on CLR Data structures. Effective managed debugging requires a mini dump with full virtual memory because managed heap is created with virtualalloc but Dr. Watson as a postmortem debugger with default options creates a minidump which includes only all thread contexts and 10 instructions.

Share/Save/Bookmark

Akismet - Spam Blocked
September 2010
M T W T F S S
« Aug    
 12345
6789101112
13141516171819
20212223242526
27282930  
Categories