Posts Tagged ‘WinDbg’

WinDbg meta-command tip to display all the extension commands exported by WinDbg extension

Tuesday, July 7th, 2009

Usually, A WinDbg extension will have the !help command in case you need to look at the supported commands in an extension. However, not all commands may be documented or no documentation at all. In that case you can use Depends or any dissembler to look at the Export section.

But, with .extmatch command, you can achieve the same right in the debugger itself as shown below.

Below is the executed command to display all the extension commands supported by loaded SOS in CLR 4.0

remember the space between sos and *

0:020> .extmatch /D /e c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos *
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.AnalyzeOOM
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.BPMD
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.CLRStack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.COMState
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ClrStack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpArray
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpAssembly
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpClass
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpDomain
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpHeap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpIL
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpLog
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpMD
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpMT
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpModule
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpObj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpRuntimeTypes
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpSig
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpSigElem
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpStack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpStackObjects
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.DumpVC
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Dumplog
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Dumpruntimetypes
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.EEHeap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.EEStack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.EEVersion
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.EHInfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Ehinfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.FinalizeQueue
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.FindAppDomain
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.FindRoots
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Findappdomain
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCHandleLeaks
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCHandleleaks
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCHandles
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCHeapStat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCInfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCRoot
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GCWhere
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GcHeapStat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.GcWhere
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Gchandleleaks
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HandleCLRN
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HeapStat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Help
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistClear
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistInit
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistObj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistObjFind
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistRoot
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.HistStats
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.IP2MD
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ListNearObj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.MinidumpMode
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Minidumpmode
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Name2EE
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ObjSize
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.PrintException
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Printexception
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ProcInfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.RCWCleanupList
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Rcwcleanuplist
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.SOSFlush
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.SaveModule
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.StopOnException
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Stoponexception
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.SyncBlk
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ThreadPool
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ThreadState
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Threads
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Token2EE
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.TraverseHeap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Traverseheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.U
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.VMMap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.VMStat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.VerifyHeap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.VerifyObj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.VerifyStackTrace
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.Verifyheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.WatsonBuckets
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.analyzeoom
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ao
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.bpmd
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.clrstack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.comstate
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.da
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.do
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dso
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumparray
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpassembly
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpclass
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpdomain
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpil
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumplog
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpmd
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpmodule
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpmt
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpobj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpruntimetypes
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpsig
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpsigelem
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpstack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpstackobjects
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.dumpvc
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.eeheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.eestack
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.eeversion
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ehinfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.finalizequeue
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.findappdomain
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.findroots
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.fq
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gchandleleaks
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gchandles
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gcheapstat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gcinfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gcroot
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.gcwhere
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.heapstat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.help
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histclear
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histinit
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histobj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histobjfind
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histroot
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.histstats
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.hof
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.ip2md
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.listnearobj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.lno
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.minidumpmode
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.name2ee
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.objsize
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.pe
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.printexception
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.procinfo
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.rcwcleanuplist
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.savemodule
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.soe
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.sosflush
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.stoponexception
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.syncblk
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.t
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.threadpool
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.threads
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.threadstate
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.token2ee
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.tp
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.traverseheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.u
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.verifyheap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.verifyobj
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.vh
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.vmmap
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.vmstat
!c:\WINDOWS\Microsoft.NET\Framework\v4.0.20506\sos.vo

Share/Save/Bookmark

Interesting WinDbg Extension SOS commands in CLR 4.0/.NET Framework 4.0 CTP, .NET runtime dll renamed and sos commands just got richer

Tuesday, July 7th, 2009

We will review the WinDbg Extension SOS.dll in .NET Framework 4.0 CTP. CLR 4.0 has renamed runtime dll from mscorwks.dll to CLR.DLL, that’s really helpful.

loading SOS dll depending on the location of .net 4.0 runtime aka CLR.DLL, execute the following command

.loadby sos clr

1.  DML Support - YES, finally.  SOS supports DML in .NET 1.1 but it was gone in clr 2.0.  Silverlight CoreCLR supports DML and now .NET framework 4.0 supports it as well.

Execute the following command to turn on DMLfor every command or use /D option

0:003> .prefer_dml 1
DML versions of commands on by default

0:003> !dumpheap /D -type Exception -stat

For people new to WinDbg, Why am I so excited about DML support in SOS?
DML Snapshot

If you look at the above snapshot, you have the link for each MethodTable address which you can just click on to execute the command. No need to type, however not every commands will have the DML support but !dumpobject is another important one, you can just click on object address to dump an object from GC Heap.

2. The following additional extension commands are added

Examining code and stacks

!ThreadState

Examining CLR data structures

!DumpSigElem

Diagnostic Utilities

!VerifyObj
!FindRoots
!HeapStat
!GCWhere
!ListNearObj (lno)
!AnalyzeOOM (ao)

Examining the GC history

!HistInit
!HistStats
!HistRoot
!HistObj
!HistObjFind
!HistClear

!ThreadState Command

When you execute !threads command, you will see the similar output as shown below

PreEmptive   GC Alloc           Lock
ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
0    1  310 00161438      a020 Enabled  013b4c64:013b5fe8 00159230     1 MTA
2    2  8c4 0016dab0      b220 Enabled  00000000:00000000 00159230     0 MTA (Finalizer)

First column is your debugger thread id and the second column ID is ManagedThread ID, OSID column is OS thread ID so that means OSID column will be 0 or some garbage when a runtime uses Fiber.

You will see the State column which is a bit flag as shown below(taken from Shared CLI)

TS_Unknown                = 0×00000000,    // threads are initialized this way

TS_AbortRequested         = 0×00000001,    // Abort the thread
TS_GCSuspendPending       = 0×00000002,    // waiting to get to safe spot for GC
TS_UserSuspendPending     = 0×00000004,    // user suspension at next opportunity
TS_DebugSuspendPending    = 0×00000008,    // Is the debugger suspending threads?
TS_GCOnTransitions        = 0×00000010,    // Force a GC on stub transitions (GCStress only)

TS_SuspendUnstarted       = 0×00400000,    // latch a user suspension on an unstarted thread

TS_ThreadPoolThread       = 0×00800000,    // is this a threadpool thread?
TS_TPWorkerThread         = 0×01000000,    // is this a threadpool worker thread?

TS_Interruptible          = 0×02000000,    // sitting in a Sleep(), Wait(), Join()
TS_Interrupted            = 0×04000000,    // was awakened by an interrupt APC. !!! This can be moved to TSNC

TS_CompletionPortThread   = 0×08000000,    // Completion port thread
………………………………………………………………………..

SOS in CLR4.0 has !threadstate command, which tells you exactly the state of the thread given the bit field, the following output shows you the threadstate bit for Worker Thread, Completion Port Thread and Finalizer Thread

0:000> !ThreadState 1009220
Legal to Join
Background
CLR Owns
In Multi Threaded Apartment
Thread Pool Worker Thread
0:000> !ThreadState 800a220
Legal to Join
Background
CoInitialized
In Multi Threaded Apartment
Completion Port Thread
0:000> !ThreadState b220
Legal to Join
Background
CLR Owns
CoInitialized
In Multi Threaded Apartment

Other Important Commands

!findroots - This is a very powerful and interesting command, because it allows you to break into debugee when CLR garbage collect generational objects.

!GCWhere - tells you the generation number along with the GC heap segment, you no longer need to map the object address with the GC heap segment or use any other extension dll

!HeapStat- This is another cool command, this command displays the stat on generational heap including generation sizes

!AnalyzeOOM - displays the detailed informatin on Last System.OutOfMemoryException

I can’t do justice on detailed documentation for each of these commands because SOS !help documentation has done a very good job. You can either look at !help documentation  or read below. I am just copying and pasting from SOS Help documentation

0:020> !help ThreadState
——————————————————————————-
!ThreadState value

The !Threads command outputs, among other things, the state of the thread.
This is a bit field which corresponds to various states the thread is in.
To check the state of the thread, simply pass that bit field from the
output of !Threads into !ThreadState.

Example:
0:003> !Threads
ThreadCount:      2
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
PreEmptive   GC Alloc           Lock
ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
0    1  250 0019b068      a020 Disabled 02349668:02349fe8 0015def0     0 MTA
2    2  944 001a6020      b220 Enabled  00000000:00000000 0015def0     0 MTA (Finalizer)
0:003> !ThreadState b220
Legal to Join
Background
CLR Owns
CoInitialized
In Multi Threaded Apartment

Possible thread states:
Thread Abort Requested
GC Suspend Pending
User Suspend Pending
Debug Suspend Pending
GC On Transitions
Legal to Join
Yield Requested
Hijacked by the GC
Blocking GC for Stack Overflow
Background
Unstarted
Dead
CLR Owns
CoInitialized
In Single Threaded Apartment
In Multi Threaded Apartment
Reported Dead
Task Reset
Sync Suspended
Debug Will Sync
Stack Crawl Needed
Suspend Unstarted
Aborted
Thread Pool Worker Thread
Interruptible
Interrupted
Completion Port Thread
Abort Initiated
Finalized
Failed to Start
Detached
0:020> !help DumpSigElem
——————————————————————————-
!DumpSigElem <sigaddr> <moduleaddr>

This command dumps a single element of a signature object.  For most circumstances,
you should use !DumpSig to look at individual signature objects, but if you find a
signature that has been corrupted in some manner you can use !DumpSigElem to read out
the valid portions of it.

If we look at a valid signature object for a method we see the following:
0:000> !dumpsig 0×000007fe`ec20879d 0×000007fe`eabd1000
[DEFAULT] [hasThis] Void (Boolean,String,String)

We can look at the individual elements of this object by adding the offsets into the
object which correspond to the return value and parameters:
0:000> !dumpsigelem 0×000007fe`ec20879d+2 0×000007fe`eabd1000
Void
0:000> !dumpsigelem 0×000007fe`ec20879d+3 0×000007fe`eabd1000
Boolean
0:000> !dumpsigelem 0×000007fe`ec20879d+4 0×000007fe`eabd1000
String
0:000> !dumpsigelem 0×000007fe`ec20879d+5 0×000007fe`eabd1000
String

We can do something similar for fields.  Here is the full signature of a field:
0:000> !dumpsig 0×000007fe`eb7fd8cd 0×000007fe`eabd1000
[FIELD] ValueClass System.RuntimeTypeHandle

Using !DumpSigElem we can find the type of the field by adding the offset of it (1) to
the address of the signature:
0:000> !dumpsigelem 0×000007fe`eb7fd8cd+1 0×000007fe`eabd1000
ValueClass System.RuntimeTypeHandle

!DumpSigElem will also work with generics.  Let a function be defined as follows:
public A Test(IEnumerable<B> n)

The elements of this signature can be obtained by adding offsets into the signature
when calling !DumpSigElem:

0:000> !dumpsigelem 00000000`00bc2437+2 000007ff00043178
__Canon
0:000> !dumpsigelem 00000000`00bc2437+4 000007ff00043178
Class System.Collections.Generic.IEnumerable`1<__Canon>

The actual offsets that you should add are determined by the contents of the
signature itself.  By trial and error you should be able to find various elements
of the signature.

0:020> !help VerifyObj
——————————————————————————-
!VerifyObj <object address>

!VerifyObj is a diagnostic tool that checks the object that is passed as an
argument for signs of corruption.

0:002> !verifyobj 028000ec
object 0×28000ec does not have valid method table

0:002> !verifyobj 0680017c
object 0×680017c: bad member 00000001 at 06800184

0:020> !help FindRoots
——————————————————————————-
!FindRoots -gen <N> | -gen any | <object address>

The “-gen” form causes the debugger to break in the debuggee on the next
collection of the specified generation.  The effect is reset as soon as the
break occurs, in other words, if you need to break on the next collection you
would need to reissue the command.

The last form of this command is meant to be used after the break caused by the
other forms has occurred.  Now the debuggee is in the right state for
!FindRoots to be able to identify roots for objects from the current condemned
generations.

!FindRoots is a diagnostic command that is meant to answer the following
question:

“I see that GCs are happening, however my objects have still not been
collected. Why? Who is holding onto them?”

The process of answering the question would go something like this:

1. Find out the generation of the object of interest using the !GCWhere
command, say it is gen 1:
!GCWhere <object address>

2. Instruct the runtime to stop the next time it collects that generation using
the !FindRoots command:
!FindRoots -gen 1
g

3. When the next GC starts, and has proceeded past the mark phase a CLR
notification will cause a break in the debugger:
(fd0.ec4): CLR notification exception - code e0444143 (first chance)
CLR notification: GC - end of mark phase.
Condemned generation: 1.

4. Now we can use the !FindRoots <object address> to find out the cross
generational references to the object of interest.  In other words, even if the
object is not referenced by any “proper” root it may still be referenced by an
older object (from an older generation), from a generation that has not yet been
scheduled for collection.  At this point !FindRoots will search those older
generations too, and report those roots.
0:002> !findroots 06808094
older generations::Root:  068012f8(AAA.Test+a)->
06808094(AAA.Test+b)

0:020> !help HeapStat
——————————————————————————-
!HeapStat [-inclUnrooted | -iu]

This command shows the generation sizes for each heap and the total, how much free
space there is in each generation on each heap.  If the -inclUnrooted option is
specified the report will include information about the managed objects from the
GC heap that are not rooted anymore.

Sample output:

0:002> !heapstat
Heap     Gen0         Gen1         Gen2         LOH
Heap0    177904       12           306956       8784
Heap1    159652       12           12           16
Total    337556       24           306968       8800

Free space:                                                 Percentage
Heap0    28           12           12           64          SOH:  0% LOH:  0%
Heap1    104          12           12           16          SOH:  0% LOH:100%
Total    132          24           24           80

0:002> !heapstat -inclUnrooted
Heap     Gen0         Gen1         Gen2         LOH
Heap0    177904       12           306956       8784
Heap1    159652       12           12           16
Total    337556       24           306968       8800

Free space:                                                 Percentage
Heap0    28           12           12           64          SOH:  0% LOH:  0%
Heap1    104          12           12           16          SOH:  0% LOH:100%
Total    132          24           24           80

Unrooted objects:                                           Percentage
Heap0    152212       0            306196       0           SOH: 94% LOH:  0%
Heap1    155704       0            0            0           SOH: 97% LOH:  0%
Total    307916       0            306196       0

The percentage column contains a breakout of free or unrooted bytes to total bytes.

0:020> !help GCWhere
——————————————————————————-
!GCWhere <object address>

!GCWhere displays the location in the GC heap of the argument passed in.

0:002> !GCWhere 02800038
Address  Gen Heap segment  begin    allocated size
02800038 2    0   02800000 02800038 0282b740  12

When the argument lies in the managed heap, but is not a valid *object* address
the “size” is displayed as 0:

0:002> !GCWhere 0280003c
Address  Gen Heap segment  begin    allocated size
0280003c 2    0   02800000 02800038 0282b740  0

0:020> !help ListNearObj
——————————————————————————-
!ListNearObj <object address>

!ListNearObj is a diagnostic tool that displays the object preceeding and
succeeding the address passed in:

The command looks for the address in the GC heap that looks like a valid
beginning of a managed object (based on a valid method table) and the object
following the argument address.

0:002> !ListNearObj 028000ec
Before: 0×28000a4           72 (0×48      ) System.StackOverflowException
After:  0×2800134           72 (0×48      ) System.Threading.ThreadAbortException
Heap local consistency confirmed.

0:002> !ListNearObj 028000f0
Before: 0×28000ec           72 (0×48      ) System.ExecutionEngineException
After:  0×2800134           72 (0×48      ) System.Threading.ThreadAbortException
Heap local consistency confirmed.

The command considers the heap as “locally consistent” if:
prev_obj_addr + prev_obj_size = arg_addr && arg_obj + arg_size = next_obj_addr
OR
prev_obj_addr + prev_obj_size = next_obj_addr

When the condition is not satisfied:

0:002> !lno 028000ec
Before: 0×28000a4           72 (0×48      ) System.StackOverflowException
After:  0×2800134           72 (0×48      ) System.Threading.ThreadAbortException
Heap local consistency not confirmed.

0:020> !help AnalyzeOOM
——————————————————————————-
!AnalyzeOOM

!AnalyzeOOM displays the info of the last OOM occured on an allocation request to
the GC heap (in Server GC it displays OOM, if any, on each GC heap).

To see the managed exception(s) use the !Threads command which will show you
managed exception(s), if any, on each managed thread. If you do see an
OutOfMemoryException exception you can use the !PrintException command on it.
To get the full callstack use the “kb” command in the debugger for that thread.
For example, to display thread 3’s stack use ~3kb.

OOM exceptions could be because of the following reasons:

1) allocation request to GC heap
in which case you will see JIT_New* on the call stack because managed code called new.
2) other runtime allocation failure
for example, failure to expand the finalize queue when GC.ReRegisterForFinalize is
called.
3) some other code you use throws a managed OOM exception
for example, some .NET framework code converts a native OOM exception to managed
and throws it.

The !AnalyzeOOM command aims to help you with investigating 1) which is the most
difficult because it requires some internal info from GC. The only exception is
we don’t support allocating objects larger than 2GB on CLR v2.0 or prior. And this
command will not display any managed OOM because we will throw OOM right away
instead of even trying to allocate it on the GC heap.

There are 2 legitimate scenarios where GC would return OOM to allocation requests -
one is if the process is running out of VM space to reserve a segment; the other
is if the system is running out physical memory (+ page file if you have one) so
GC can not commit memory it needs. You can look at these scenarios by using performance
counters or debugger commands. For example for the former scenario the “!address
-summary” debugger command will show you the largest free region in the VM. For
the latter scenario you can look at the “Memory\% Committed Bytes In Use” see
if you are running low on commit space. One important thing to keep in mind is
when you do this kind of memory analysis it could an aftereffect and doesn’t
completely agree with what this command tells you, in which case the command should
be respected because it truly reflects what happened during GC.

The other cases should be fairly obvious from the callstack.

Sample output:

0:011> !ao
———Heap 2 ———
Managed OOM occured after GC #28 (Requested to allocate 1234 bytes)
Reason: Didn’t have enough memory to commit
Detail: SOH: Didn’t have enough memory to grow the internal GC datastructures (800000 bytes) -
on GC entry available commit space was 500 MB
———Heap 4 ———
Managed OOM occured after GC #12 (Requested to allocate 100000 bytes)
Reason: Didn’t have enough memory to allocate an LOH segment
Detail: LOH: Failed to reserve memory (16777216 bytes)

0:020> !help FAQ
——————————————————————————-
>> Where can I get the right version of SOS for my build?

If you are running version 1.1 or 2.0 of the CLR, SOS.DLL is installed in the
same directory as the main CLR dll (CLR.DLL). Newer versions of the
Windows Debugger provide a command to make it easy to load the right copy of
SOS.DLL:

“.loadby sos clr”

That will load the SOS extension DLL from the same place that CLR.DLL is
loaded in the process. You shouldn’t attempt to use a version of SOS.DLL that
doesn’t match the version of CLR.DLL. You can find the version of
CLR.DLL by running

“lmvm clr”

in the debugger.  Note that if you are running CoreCLR (e.g. Silverlight)
then you should replace “clr” with “coreclr”.

If you are using a dump file created on another machine, it is a little bit
more complex. You need to make sure the mscordacwks.dll file that came with
that install is on your symbol path, and you need to load the corresponding
version of sos.dll (typing .load <full path to sos.dll> rather than using the
.loadby shortcut). Within the Microsoft corpnet, we keep tagged versions
of mscordacwks.dll, with names like mscordacwks_<architecture>_<version>.dll
that the Windows Debugger can load. If you have the correct symbol path to the
binaries for that version of the Runtime, the Windows Debugger will load the
correct mscordacwks.dll file.

>> I have a chicken and egg problem. I want to use SOS commands, but the CLR
isn’t loaded yet. What can I do?

In the debugger at startup you can type:

“sxe clrn”

Let the program run, and it will stop with the notice

“CLR notification: module ‘mscorlib’ loaded”

At this time you can use SOS commands. To turn off spurious notifications,
type:

“sxd clrn”

>> I got the following error message. Now what?

0:000> .loadby sos clr
0:000> !DumpStackObjects
Failed to find runtime DLL (clr.dll), 0×80004005
Extension commands need clr.dll in order to have something to do.
0:000>

This means that the CLR is not loaded yet, or has been unloaded. You need to
wait until your managed program is running in order to use these commands. If
you have just started the program a good way to do this is to type

bp clr!EEStartup “g @$ra”

in the debugger, and let it run. After the function EEStartup is finished,
there will be a minimal managed environment for executing SOS commands.

>> I have a partial memory minidump, and !DumpObj doesn’t work. Why?

In order to run SOS commands, many CLR data structures need to be traversed.
When creating a minidump without full memory, special functions are called at
dump creation time to bring those structures into the minidump, and allow a
minimum set of SOS debugging commands to work. At this time, those commands
that can provide full or partial output are:

CLRStack
Threads
Help
PrintException
EEVersion

For a minidump created with this minimal set of functionality in mind, you
will get an error message when running any other commands. A full memory dump
(obtained with “.dump /ma <filename>” in the Windows Debugger) is often the
best way to debug a managed program at this level.

>> What other tools can I use to find my bug?

Turn on Managed Debugging Assistants. These enable additional runtime diagnostics,
particularly in the area of PInvoke/Interop. Adam Nathan has written some great
information about that:

http://blogs.msdn.com/adam_nathan/

>> Does SOS support DML?

Yes.  SOS respects the .prefer_dml option in the debugger.  If this setting is
turned on, then SOS will output DML by default.  Alternatively, you may leave
it off and add /D to the beginning of a command to get DML based output for it.
Not all SOS commands support DML output.

0:020> !help HistInit
——————————————————————————-
!HistInit

Before running any of the Hist - family commands you need to initialize the SOS
structures from the stress log saved in the debuggee.  This is achieved by the
HistInit command.

Sample output:

0:001> !HistInit
Attempting to read Stress log
STRESS LOG:
facilitiesToLog  = 0xffffffff
levelToLog       = 6
MaxLogSizePerThread = 0×10000 (65536)
MaxTotalLogSize = 0×1000000 (16777216)
CurrentTotalLogChunk = 9
ThreadsWithLogs  = 3
Clock frequency  = 3.392 GHz
Start time         15:26:31
Last message time  15:26:56
Total elapsed time 25.077 sec
……………………………….
—————————- 2407 total entries —————————–

SUCCESS: GCHist structures initialized

0:020> !help HistStats
——————————————————————————-
!HistStats

HistStat provides a number of garbage collection statistics obtained from the
stress log.

Sample output:

0:003> !HistStats
GCCount    Plugs Promotes   Relocs
———————————–
2296        0       35       86
2295        0       34       85
2294        0       34       85

2286        0       32       83
2285        0       32       83
322        0       23       55
0        0        0        0
Root 01e411b8 relocated multiple times in gc 322
Root 01e411bc relocated multiple times in gc 322

Root 01e413f8 relocated multiple times in gc 322
Root 01e413fc relocated multiple times in gc 322

0:020> !help histroot
——————————————————————————-
!HistRoot <root>

The root value obtained from !HistObjFind can be used to track the movement of
an object through the GCs.

HistRoot provides information related to both promotions and relocations of the
root specified as the argument.

0:003> !HistRoot 01e411b8
GCCount    Value       MT Promoted?                Notes
———————————————————
2296 028970d4 5b6c5cd8       yes
2295 028970d4 5b6c5cd8       yes
2294 028970d4 5b6c5cd8       yes
2293 028970d4 5b6c5cd8       yes
2292 028970d4 5b6c5cd8       yes
2291 028970d4 5b6c5cd8       yes
2290 028970d4 5b6c5cd8       yes
2289 028970d4 5b6c5cd8       yes
2288 028970d4 5b6c5cd8       yes
2287 028970d4 5b6c5cd8       yes
2286 028970d4 5b6c5cd8       yes
2285 028970d4 5b6c5cd8       yes
322 028970e8 5b6c5cd8       yes Duplicate promote/relocs

0:020> !help HistObj
——————————————————————————-
!HistObj <obj_address>

This command examines all stress log relocation records and displays the chain
of GC relocations that may have led to the address passed in as an argument.
Conceptually the output is:

GenN    obj_address   root1, root2, root3,
GenN-1  prev_obj_addr root1, root2,
GenN-2  prev_prev_oa  root1, root4,

Sample output:
0:003> !HistObj 028970d4
GCCount   Object                                    Roots
———————————————————
2296 028970d4 00223fc4, 01e411b8,
2295 028970d4 00223fc4, 01e411b8,
2294 028970d4 00223fc4, 01e411b8,
2293 028970d4 00223fc4, 01e411b8,
2292 028970d4 00223fc4, 01e411b8,
2291 028970d4 00223fc4, 01e411b8,
2290 028970d4 00223fc4, 01e411b8,
2289 028970d4 00223fc4, 01e411b8,
2288 028970d4 00223fc4, 01e411b8,
2287 028970d4 00223fc4, 01e411b8,
2286 028970d4 00223fc4, 01e411b8,
2285 028970d4 00223fc4, 01e411b8,
322 028970d4 01e411b8,
0 028970d4

0:020> !help HistObjFind
——————————————————————————-
!HistObjFind <obj_address>

To examine log entries related to an object whose present address is known one
would use this command. The output of this command contains all entries that
reference the object:

0:003> !HistObjFind 028970d4
GCCount   Object                                  Message
———————————————————
2296 028970d4 Promotion for root 01e411b8 (MT = 5b6c5cd8)
2296 028970d4 Relocation NEWVALUE for root 00223fc4
2296 028970d4 Relocation NEWVALUE for root 01e411b8

2295 028970d4 Promotion for root 01e411b8 (MT = 5b6c5cd8)
2295 028970d4 Relocation NEWVALUE for root 00223fc4
2295 028970d4 Relocation NEWVALUE for root 01e411b8

0:020> !help HistClear
——————————————————————————-
!HistClear

This command releases any resources used by the Hist-family of commands.
Generally there’s no need to call this explicitly, as each HistInit will first
cleanup the previous resources.

Share/Save/Bookmark

visual studio 2008 memory leak/memory issue on x86 - the operation could not be completed.Not enough storage is available to complete this operation

Friday, July 3rd, 2009

And No I don’t have a solution for it and probably the only workaround is to make your visual studio Large Address Aware(3GB switch) on x86.

vs2008 error message

Steps to re-create

1. download and unzip http://debuggingblog.com/resources/transcripts.zip

2. open, close the xml file and try to load it the second time

3. If you load the xml file using IE8, you will see the followings once you close it

——————– State SUMMARY ————————–
TotSize (      KB)   Pct(Tots)  Usage
19e6f000 (  424380) : 20.24%   : MEM_COMMIT
f3b4000 (  249552) : 11.90%   : MEM_FREE
56dcd000 ( 1423156) : 67.86%   : MEM_RESERVE

Almost 1.4 GB Memory allocated in GC Segements for xml file is still reserved even after unloading the xml file.

However, visual studio 2008 is another story

0:000> !eeheap -gc
ephemeral segment allocation context: none
segment    begin allocated     size
01830000 01831000  027ecadc 0×00fbbadc(16497372)
12860000 12861000  137616c4 0×00f006c4(15730372)
………………………………………………………………………………..

We have bunch of 16MB GC segments and most of the objects are in gen 2.

0c55d1ec   739459     85777244 Microsoft.XmlEditor.XmlElement
0c559858  1496448     89786880 Microsoft.XmlEditor.Identifier
001f1918   105303     97472784      Free
793308ec  2369315    387475460 System.String
Total 9375571 objects

we have 90+ MB of free blocks and 380+MB in System.String. There are 2.36 million string objects, yeah so you don’t wanna pick each one of the string object to find GC root unless Microsoft or someone is paying you a dime to dump each object and aha a dump a day will make your day for sure.

0:000> !dumpheap -mt 0c9d4134
Address       MT     Size
018f241c 0c9d4134       68
5f610108 0c9d4134       68
total 2 objects
Statistics:
MT    Count    TotalSize Class Name
0c9d4134        2          136 Microsoft.XmlEditor.XmlDocumentProperties
Total 2 objects
0:000> !objsize 018f241c
sizeof(018f241c) =    507372388 (  0×1e3de364) bytes (Microsoft.XmlEditor.XmlDocumentProperties)
0:000> !objsize 5f610108
sizeof(5f610108) =    507371128 (  0×1e3dde78) bytes (Microsoft.XmlEditor.XmlDocumentProperties)

Did you just see that almost 1GB of virtual memory rooted in Microsoft.XmlEditor.XmlDocumentProperties? That’s just outrageous, I mean why would microsoft visual studio take up 1.2 GB of virtual memory to open a 58MB file, although It does make use of schema context cache.

0:000> !gcroot -nostacks 018f241c
DOMAIN(001EC570):HANDLE(RefCnt):16d1b20:Root:018f241c(Microsoft.XmlEditor.XmlDocumentProperties)

GCHandle of type RefCnt is keeping reference to Microsoft.XmlEditor.XmlDocumentProperties

There is an OutOfMemoryException thrown with the following callstack

Exception object: 5ed00a34
Exception type: System.OutOfMemoryException
Message: Insufficient memory to continue the execution of the program.
InnerException: <none>
StackTrace (generated):
SP       IP       Function
0012F5A0 0C97E8B3 Microsoft_VisualStudio_Package_LanguageService_9_0!Microsoft.VisualStudio.NativeMethods.ThrowOnFailure(Int32, Int32[])+0×3b
0012F5AC 0C9E94BB Microsoft_VisualStudio_Package_LanguageService_9_0!Microsoft.VisualStudio.Package.Source.GetText()+0×3c
0012F5DC 0C9E9360 Microsoft_VisualStudio_Package_LanguageService_9_0!Microsoft.VisualStudio.Package.Source.BeginParse()+0×55
0012F644 0C9ECF38 Microsoft_VisualStudio_Package_LanguageService_9_0!Microsoft.VisualStudio.Package.Source.OnIdle(Boolean)+0×80
0012F654 0C9ECE28 Microsoft_VisualStudio_Package_LanguageService_9_0!Microsoft.VisualStudio.Package.LanguageService.OnIdle(Boolean)+0xd8
0012F674 0C9ECCDD Microsoft_XmlEditor!Microsoft.XmlEditor.XmlLanguageService.OnIdle(Boolean)+0×35
0012F684 0C9ECC34 Microsoft_XmlEditor!Microsoft.XmlEditor.Package.FDoIdle(UInt32)+0xc4

Conclusion

I hope this is fixed in Visual Studio 2010, I do need to try it out.

Share/Save/Bookmark

Silverlight App not working as expected in FireFox, IE shows error message “Unhandled Error in Silverlight 2 Application - Element is already the child of another element”

Monday, June 8th, 2009

Problem Description

This exception is thrown in a Silverlight 2 App while trying to show and hide System.Windows.Controls.TabItem in System.Windows.Controls.TabControl

Steps to Recreate

browse to http://debuggingblog.com/sl/project1/default.html and click on button “Show Angelina”  3 times

Debugging Silverlight app using WinDbg

We will be analyzing a memory dump of IE on System.InvalidOperationException

1. We have the following two InvalidOperationException exception objects on managed heap

0:007> !dumpheap -type System.InvalidOperationException
Address       MT     Size
1013c79c 0f1ff824       72
1015addc 0f1ff824       72
total 2 objects
Statistics:
MT    Count    TotalSize Class Name
0f1ff824        2          144 System.InvalidOperationException
Total 2 objects

2. Lets get the stack trace when this exception occured

0:007> !pe 1013c79c
Exception object: 1013c79c
Exception type:   System.InvalidOperationException
Message:          Element is already the child of another element.
InnerException:   <none>
StackTrace (generated):
SP       IP       Function
021FF640 0F40F06A !MS.Internal.XcpImports.CheckHResult(UInt32)+0×32
021FF64C 0E3A4CE9 !MS.Internal.XcpImports.SetValue(MS.Internal.INativeCoreTypeWrapper, System.Windows.DependencyProperty, System.Windows.DependencyObject)+0xa9
021FF680 0E3A4118 !MS.Internal.XcpImports.SetValue(MS.Internal.INativeCoreTypeWrapper, System.Windows.DependencyProperty, System.Object)+0×100
021FF71C 0E3A3FBC !System.Windows.DependencyObject.SetObjectValueToCore(System.Windows.DependencyProperty, System.Object)+0×1c4
021FF760 0E3A1CBB !System.Windows.DependencyObject.SetValueInternal(System.Windows.DependencyProperty, System.Object, Boolean, Boolean, Boolean)+0×503
021FF834 0E3A17A1 !System.Windows.DependencyObject.SetValueInternal(System.Windows.DependencyProperty, System.Object)+0×21
021FF848 0E3A1763 !System.Windows.DependencyObject.SetValue(System.Windows.DependencyProperty, System.Object)+0×1b
021FF85C 0E3D68D7 !System.Windows.Controls.ContentControl.set_Content(System.Object)+0×37
021FF870 0E3212CF !SilverlightApplication1.Page.ShowContent()+0×57

021FF880 0E3208BE !SilverlightApplication1.Page.ShowHideAngelina_Click(System.Object, System.Windows.RoutedEventArgs)+0×76
021FF898 0E4099B5 !System.Windows.Controls.Primitives.ButtonBase.OnClick()+0×5d
021FF8B0 0E40993F !System.Windows.Controls.Button.OnClick()+0×47
021FF8C0 0E40986D !System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(System.Windows.Input.MouseButtonEventArgs)+0×85
021FF8D0 0E4097D1 !System.Windows.Controls.Control.OnMouseLeftButtonUp(System.Windows.Controls.Control, System.EventArgs)+0×41
021FF8E0 0E36C887 !MS.Internal.JoltHelper.FireEvent(IntPtr, IntPtr, Int32, System.String)+0×1b7

3. This exception occured while trying to Set the Content of a Control, so lets find out what is this content and who holds this UIElement, lets get the IL of method SilverlightApplication1.Page.ShowContent()

0:007> !dumpil 0f224060
ilAddr = 0f4607f8
IL_0000: ldarg.0
IL_0001: newobj System.Windows.Controls.TabItem::.ctor  // A new TabItem object is created
IL_0006: stfld SilverlightApplication1.Page::item
IL_000b: ldarg.0
IL_000c: ldfld SilverlightApplication1.Page::item
IL_0011: ldstr “Angelina Jolie”
IL_0016: callvirt System.Windows.Controls.TabItem::set_Header
IL_001b: ldarg.0
IL_001c: ldfld SilverlightApplication1.Page::item//Get the reference to a  TabItem object created earlier
IL_0021: ldarg.0
IL_0022: ldfld SilverlightApplication1.Page::content
IL_0027: callvirt System.Windows.Controls.ContentControl::set_Content //Set_Content is called  with SilverlightApplication1.Page object’s data member content
IL_002c: ldarg.0
IL_002d: ldfld SilverlightApplication1.Page::TabList
IL_0032: callvirt System.Windows.Controls.ItemsControl::get_Items
IL_0037: ldarg.0
IL_0038: ldfld SilverlightApplication1.Page::item

We have 2 interesting objects TabItem and SilverlightApplication1.Page::content here

4. First get the address of SilverlightApplication1.Page::content object which is being set in TabItem because that’s what resulting in exception

0:007> !do 1003b3c0
Name:        SilverlightApplication1.Page
Fields:
MT    Field   Offset                 Type VT     Attr    Value Name
0f42e730  400000d       58 …Windows.UIElement  0 instance 1011c0e8 content

0:007> !dumpheap -mt 0e396648
Address       MT     Size
1003ed3c 0e396648      140
101276a0 0e396648      140
1013c050 0e396648      140
total 3 objects
Statistics:
MT    Count    TotalSize Class Name
0e396648        3          420 System.Windows.Controls.TabItem
Total 3 objects

We have 3 TabItem objects on heap and the address of the content object(UIElement) is 1011c0e8

5. The next step to find out is which TabItem object has this UIElement and why is this TabItem object still around.

0:007> !dumpheap -mt 0e396648
Address       MT     Size
1003ed3c 0e396648      140
101276a0 0e396648      140
1013c050 0e396648      140
total 3 objects

0:007> !do 101276a0
Name:        System.Windows.Controls.TabItem
MethodTable: 0e396648
EEClass:     0e3bb768
Size:        140(0×8c) bytes
File:        System.Windows.Controls, Version=2.0.5.0, Culture=neutral
Fields:
MT    Field   Offset                 Type VT     Attr    Value Name

0ec444e8  40002b4       34        System.Object  0 instance 1011c0e8 _treeContent

0:007> !gcroot 101276a0
Note: Roots found on stacks may be false positives. Run “!help gcroot” for
more info.
Scan Thread 7 OSTHread 1710
Scan Thread 27 OSTHread 1200
Scan Thread 28 OSTHread 1278
Scan Thread 31 OSTHread 1488

6. TabItem object with address = 0e396648 has the reference to content object 1011c0e8 and this object is not rooted so that means this object is ready to be garbage collected.

There is already a TabItem object on managed heap holding reference to the same content object which is being assigned to new TabItem object on SilverlightApplication1.Page.ShowHideAngelina_Click(), that’s why we get System.InvaildOperationException with the error message “Element is already the child of another element”

7. You can get the address of Method Description for each of the methods in SilverlightApplication1.Page

0:007> !dumpmt -md 0f224094
————————————–
MethodDesc Table
Entry       MethodDesc      JIT Name
0e320848   0f224038 JIT SilverlightApplication1.Page.ShowHideAngelina_Click(System.Object,
0e321300   0f224058      JIT SilverlightApplication1.Page.HideContent()
0e321278   0f224060      JIT SilverlightApplication1.Page.ShowContent()

Let’s look at the implementation of ShowHideAngelina_Click

0:007> !dumpil 0f224038
ilAddr = 0f46071c
IL_0012: ldstr “Show”
IL_0017: callvirt System.String::Contains
IL_001c: brfalse.s IL_003f
IL_001e: ldloc.0
IL_0039: call SilverlightApplication1.Page::ShowContent
IL_003e: ret
IL_003f: ldloc.0
IL_004a: ldarg.0
IL_004b: call SilverlightApplication1.Page::HideContent
IL_0050: ret

This method checks the Text of the button and calls ‘ShowContent’ if the text =“Show” otherwise calls ‘HideContent’, so let’s look at HideContent implementation

0:007> !dumpil 0f224058
ilAddr = 0f4607d6
IL_0000: ldarg.0
IL_0001: ldfld SilverlightApplication1.Page::item
IL_0006: brfalse.s IL_001f
IL_0008: ldarg.0
IL_0009: ldfld SilverlightApplication1.Page::TabList
IL_000e: callvirt System.Windows.Controls.ItemsControl::get_Items
IL_0013: ldarg.0
IL_0014: ldfld SilverlightApplication1.Page::item
IL_0019: callvirt class [System.Windows]System.Windows.PresentationF◆䷒¥⽫Ż�::Remove
IL_001e: pop
IL_001f: ret

HideContent calls TabList.get_Items and remove the TabItem from System.Windows.Control.TabControl

Resolution

HideContent() method removes the TabItem from TabControl.Items so unless TabItem is Garbage Collected  removed TabItem will still hold the reference to Content UIElement. You can quickly fix this issue by assiging Content = null, when that item is removed as shown below in HideContent

//OLD Buggy Implementation

private void HideContent()
{
if (item != null)
this.TabList.Items.Remove(item);
}

//new implementation to make sure Content is set to null

private void HideContent()
{
item.Content = null; //assign TabItem Content = NULL
if (item != null)
this.TabList.Items.Remove(item);
}

Share/Save/Bookmark

High CPU Usage and Windows Forms application hang with SQLCE database and the SqlCeLockTimeoutException

Thursday, May 7th, 2009

Problem Description

Windows froms application on a user machine is consuming high cpu resource and application appears to be hung.

Analysis

Usually, the cause of high cpu hang is

1. infinite loop

2. machine is busy

3. High CPU in GC

4. Thread is spinning with sleep time in 100s of milliseconds

We will start with collecting the user mode memory dump of windows forms application. We should always collect more than one dump to analyze hang scenario. I prefer to use adplus script with -r 3 300″ to collect where 3 specifies the number of times ADPlus will run in hang mode and 300 is the interval in seconds between each run.

1. Lets open the memory dump using WinDbg

2. The first command to run to analyze hang is !runaway extension to find out how long each of the threads have been running for

0:000> !runaway 3
User Mode Time
Thread       Time
0:218       0 days 0:08:38.359
8:a50       0 days 0:02:29.968
Kernel Mode Time
Thread       Time
8:a50       0 days 0:03:23.875
0:218       0 days 0:02:06.046

I have listed the interesting threads. Thread # 0 has consumed 2 mins in kernel mode and more than 8 mins in user mode and thread # 8 has consumed nearly equal amount of time around 2:xx in user and kernel mode.

3. Lets look at thread stack of each of thread # 0 and thread # 8

0:000> ~8e!clrstack
OS Thread Id: 0xa50 (8)
ESP       EIP
06d8f604 7c90e4f4 [InlinedCallFrame: 06d8f604] System.Windows.Forms.UnsafeNativeMethods.WaitMessage()
06d8f600 7b1d8e48 System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)
06d8f69c 7b1d8937 System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)
06d8f6f0 7b1d8781 System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)
06d8f720 7b195911 System.Windows.Forms.Application.Run(System.Windows.Forms.Form)
06d8f734 05893609 MyApp.Splasher.ShowThread()

We can ignore thread 8 because this is displaying splasher screen and pumping the message.

0:000> ~0e!clrstack
OS Thread Id: 0×218 (0)
ESP       EIP
0012e4f0 7c90e4f4 [NDirectMethodFrameStandalone: 0012e4f0] System.Data.SqlServerCe.NativeMethods.CompileQueryPlan(IntPtr, System.String, System.Data.SqlServerCe.ResultSetOptions, IntPtr[], IntPtr, Int32, IntPtr ByRef, IntPtr)
0012e518 07d72c57 System.Data.SqlServerCe.SqlCeCommand.CompileQueryPlan()
System.Data.SqlServerCe.SqlCeCommand.ExecuteCommand(System.Data.CommandBehavior, System.String,

In Thread # 0 ExecuteCommand transtions the instruction to Unmanaged SQLCE code to compile query plan(SqlCeCommand.CompileQueryPlan())

We will now look at unmanaged stack

0:000> ~0kb
ChildEBP RetAddr  Args to Child
0012cd18 7c802455 0000007f 00000000 0012cd44 kernel32!SleepEx+0×61
0012cd28 7d57b50e 0000007f 04c105a8 0012cda4 kernel32!Sleep+0xf
0012cd44 7d57ffa0 00001388 00000002 04c105a8 sqlcese35!Session::_WaitForLock+0×5f
0012cd80 7d580b6b 0176a388 01764778 00000001 sqlcese35!Session::_Lock+0×280
0012cdcc 7d5571bb 0012cdf8 0000000c 00000003 sqlcese35!Session::LockRow+0xba
0012ce20 7d56c74e 004ad00c 04c105a8 003ff978 sqlcese35!BaseMover::_SetCurrentBookmark+0×88
0012ce34 7d56cad0 00000000 00000001 04c105a8 sqlcese35!IndexMover::_SetCurrentRow+0×80
0012ceb0 7d56cb52 0000000d 003ffa98 00000001 sqlcese35!IndexMover::_Seek+0×191
0012ced4 7d565e5c 00000000 00000000 04c105a8 sqlcese35!IndexMover::Seek+0×5a
0012cef4 7d57be24 003ff04c 00000000 00000000 sqlcese35!Cursor::Seek+0×88
0012cf24 7d7ba8fa 04bd4788 0180c8ce 0181963c sqlcese35!Session::OpenHistogram+0xcb
0012cf80 7d7b31f2 00000000 0180c8ce 0181963c sqlceqp35!StatMan::LoadHistogram+0×53
0012cfac 7d7b3558 04b9b31c 0180c8a4 0180f874 sqlceqp35!QPIndexSchemaInfo::LoadHistogramHelper+0×57
0012d524 7d79dab5 04b9b31c 0180c8a4 0180c8a4 sqlceqp35!QPIndexSchemaInfo::LoadHistogram+0×62
0012d54c 7d7b8b23 04b9b31c 00024f1c 00000069 sqlceqp35!QPIScanAccessPlan::CostIndexScan+0×56
0012d5fc 7d79dea3 04b9b31c 0012d818 0180c8a4 sqlceqp35!QPIndexList::Matches+0×265
0012d6a8 7d79e339 04b9b31c 0012d818 01802274 sqlceqp35!RelBaseBlockPlan::CreateIScanPlan+0×7d
0012d6f8 7d79f9ff 04b9b31c 0012d818 0012d728 sqlceqp35!RelBaseBlockPlan::CreatePlan+0×130

Let’s understand what’s going on in this call stack

Reader thread calls command.ExecuteReader(CommandBehavior.SingleRow)

ExecuteReader makes a call to ExecuteCommand

ExecuteCommand transtions the instruction to Unmanaged SQLCE code to compile query plan(SqlCeCommand.CompileQueryPlan())

The below steps are getting executed in umanaged

  • Unmanaged code creates the query plan sqlceqp35!RelBaseBlockPlan::CreatePlan
  • Reader Thread’s Query plan has created a query plan
  • Cursor is performing index seek on predicate column = @columnId
  • Index seeker sets the current row during seek
  • sql ce session tries to lock the current row
  • Thread just spins on lock with Sleep time of 127 milliseconds

Since thread sleep time is only 127 milliseconds so that explains the high cpu usage. So why is it spinning on lock forever.

4. Since this is a managed application, we should always look at all the exception objects on managed heap

0:000> !dumpheap -type Exception
Address       MT     Size
………………………………..
173e4aec 07d84d68       76
Statistics:
MT Count TotalSize Class Name
……………………………………………………
07d84d68       32         2432
System.Data.SqlServerCe.SqlCeLockTimeoutException
Total 44 objects

0:000> !pe 173e4aec
Exception object: 173e4aec
Exception type: System.Data.SqlServerCe.SqlCeLockTimeoutException
Message:
InnerException:
StackTrace (generated):
SP IP Function
0012E4A0 07D72D63 System_Data_SqlServerCe_ni!System.Data.SqlServerCe.SqlCeCommand.CompileQueryPlan()+0×24b
0012E560 07D73D4C System_Data_SqlServerCe_ni!System.Data.SqlServerCe.SqlCeCommand.ExecuteCommand(System.Data.CommandBehavior, System.String, System.Data.SqlServerCe.ResultSetOptions)+0×23c
0012E5A0 07D73EE6 System_Data_SqlServerCe_ni!System.Data.SqlServerCe.SqlCeCommand.ExecuteReader(System.Data.CommandBehavior)+0×16
0012E5A4 04A8C52D GlobalPlanning_Sync!GlobalPlanning.Sync.PlanetSyncManager.OnSyncProgress(System.Object, Microsoft.Synchronization.Data.SyncProgressEventArgs)+0×81d

Dumping the exception object confirms the same call stack as thread 0

You can get the error details by dumping the heap for

0:000> !dumpheap -type SqlServerCe.SqlCeError
Address       MT     Size
173e4a74 07d826dc       36
total 64 objects
Statistics:
MT    Count    TotalSize Class Name
……………………………………………………………………………………….
07d826dc       32         1152 System.Data.SqlServerCe.SqlCeError
Total 64 objects

You can get the formatted error message by dumping System.Data.SqlServerCe.SqlCeError.formattedMessage

0:000> !do 173762b0
SQL Server Compact timed out waiting for a lock. The default lock time is 2000ms for devices and 5000ms for desktops. The default lock timeout can be increased in the connection string using the ssce: default lock timeout property. [ Session id = 2,Thread id = 536,Process id = 3564,Table name = __SysObjects,Conflict type = s lock (x blocks),Resource = RID: 1197:12 ]

In above error message, you will notice that Resource = RID which is a row identifier that means a single row in a table is locked. Conflict type = s lock which means that this is in shared lock mode for read access.

This is confirmed by dumping SQLCECommand object on managed heap that select statement is being executed.

Another point to note is, with default isolation level SELECT statement in SQL doesn’t require S lock unless it is in a transaction.

SQL CE compact edition online book says that

“When a statement has waited longer than the LOCK_TIMEOUT setting, the blocked statement is canceled automatically, and the error message SSCE_M_LOCKTIMEOUT, “The system timed out waiting for a lock,” is returned to the application.However, SQL Server Compact 3.5 does not roll back or cancel any transaction that contains the statement. The application must have an error handler that can trap the error message SSCE_M_LOCKTIMEOUT. If an application does not trap the error, it can proceed without knowing that an individual statement within a transaction has been canceled. Errors can occur because statements later in the transaction might depend on the statement that was not executed.”

However, if your database file is fragmented or large its possible that your select query may not run with default time out of 5 seconds in desktop so it will keep on throwing locktimeout exception every 5 seconds for every select query.

There is only one connection object on managed heap and no other threads are executing the SQLCECommand so this is most likely a fragmented database because this issue gets resolved after database is compacted.

However, we should also look at thread stack memory to find any hidden exception to make sure there is no other handled exception resulting in not leaving the spinlock

you can get the Thread Environment Block stack address using !teb command and use dds command to dump words and symbols

0:000> !teb
TEB at 7ffdf000
ExceptionList:        0012cd08
StackBase:            00130000
StackLimit:           00126000

0:000> dds 00126000 00130000

………………………………………………………………………………………….

0012df94  7d58272d sqlcese35!SpinLock::Enter+0xd
0012dfa8  7d55bc0b sqlcese35!AutoSpinLock::AutoSpinLock+0×17
0012dfac  7d55bc48 sqlcese35!AutoSpinLock::Leave+0×12
0012dfb4  7d57c150 sqlcese35!Session::GetSyncSession+0×4c
0012dfc8  7d58ebbd sqlcese35!TrackingUtils::GetSyncSessionInfo+0×19
0012dfcc  7d58ebf4 sqlcese35!TrackingUtils::GetSyncSessionInfo+0×50

……………………………………………………………………………………………….

Only interesting bits are shown above for brevity but there is no exception dispatcher found on raw stack memory so that confirms that no exception has been thrown in this thread call stack

Resolution and Workaround

1. There is no way for you to determine in SQL CE whether a resource is locked before locking it, except to attempt to access the data catch the time out exception, so one workaround is to have a configurable SQL CE lock timeout so that when SQL CE database is fragmented over time and when you get an exception you can increase the timeout value

2.  SQL CE database is file based loaded in process’s address space so  SQL CE database can become fragmented over time and fragmentation causes some serious performance issue. To avoid fragmentation,  you can compact the database when locktimeout exception is thrown

However, if the lock is due to coding error in that case review your code path but in this case there is only one connection object with a command object on one thread so this was a fragmentation issue and locktimeout issue did get resolved after compacting the database.

Share/Save/Bookmark

Debugging compilation failure(cl.exe) using windbg - fatal error C1902: Program database manager mismatch; please check your installation

Sunday, May 3rd, 2009

Problem Description:

while trying to build a win32 console project from visual studio 2008, the error message is “fatal error C1902: Program database manager mismatch; please check your installation”

I decided to use WinDbg to find the root cause.

Steps to Debug

1. Since this fails during compilation, so we have to attach WinDbg to “cl.exe”and since visual studio 2008 is launching cl.exe so you have to auto-attach it (you can directly execute cl.exe)

2. In order to auto-attach a process to debugger, you can use “gflags.exe”(exists in WinDbg installation folder)

3. See the snapshot below on how to use gflags to auto attach to a process cl.exe when launched

Gflags Snapshot

4. Build your project and you will see WinDbg getting launched and attached to cl.exe

5. hit f5 and execute getlasterror command to see the details when stopped on exception
0:000> !gle
LastErrorValue: (Win32) 0 (0) - The operation completed successfully.
LastStatusValue: (NTSTATUS) 0xc0000135 - {Unable To Locate Component} This application has failed to start because %hs was not found. Re-installing the application may fix this problem.

6. Below is the loader message matching the last error description

LDR: LdrGetDllHandle, searching for C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\mspdbsrv.exe from ……
kLDR: LdrpCheckForLoadedDll - Unable To Locate C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\mspdbsrv.exe ……………..

7. Basically, build is failing because cl.exe(compiler) can’t find mspdbsrv.exe for symbols.

You can copy exe in the path(C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\) or change the registry to point to the correct location. It appears that visual studio 2005 upgrade may have caused this issue.

Share/Save/Bookmark

The call to LoadLibrary(sos2) failed, Win32 error 0n14001, “This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem.”

Thursday, April 30th, 2009

Mike Semikin has reported this issue while executing windbg command .load sos2

The call to LoadLibrary(sos2) failed, Win32 error 0n14001
“This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem.”
Please check your debugger configuration and/or network access.

Resolution
This is because visual studio 2008 is used as a development environment for sos2 and CRT is installed in assembly cache(winSxS) folder so dependent dll will be missing

As an immediate workaround, please install visual c++ 2008 runtime redistributable from Microsoft,  download from http://www.microsoft.com/Downloads/details.aspx?familyid=9B2DA534-3E03-4391-8A4D-074B9F2BC1BF&displaylang=en

Share/Save/Bookmark

.NET Crash/OutofMemoryException/Memory Leak - .NET windows forms and infragistics datagrid and why is System.Drawing.Image object not getting finalized??

Monday, March 2nd, 2009

Issue Description
Windows forms application has crashed with OOM exception. Before application crashes, cpu is almost pegged at 100% for a few minutes

Root Cause Analysis using WinDbg

Collect full memory dump at set intervals

  • You could get a crash dump and analyze the managed heap to find out rooted objects. But, since we have access to the system I prefer to get a dump at set intervals and compare the managed heap statistics because that makes it a little easier to find the objects which are surviving GC over a period of time.
  • We will use ADPlus to automate this task
  • I will run the script to get a full memory dump 4 times every 2 minutes
  • Command to automate this task is “cscript.exe adplus.vbs -hang -pn <myapp.exe> -quiet -r 4 120″
  • First Dump file size is around 800MB which also indicates process’s memory usage at that time
  • Second Dump file size is around 1.2 GB
  • Third Dump file size is around 1.6 GB and a little later application has crashed.

This is a pure .net application, so we are going to jump ahead and look at the managed heap stats, gc handles and the objects in finalize queue. We will use sos2.dll copied under the same folder as windbg executable, we will dump only pinned and strong gchanldes to identify gc handles increasing over the time because these handles could cause memory leak. Please note that,!gcht (gchandles by type) command is only available in our windbg extension sos2.dll. You could use sos.dll!gchandles to dump gchandles but it won’t give you the objects and their stats by type and you will have to figure out yourself probably by looking at the root.
GCHandles Stats from First Dump
0:000> .load sos2
0:000> !gcht -t p

Pinned GC Handle Statistics:
Pinned Handles: 60
Statistics:
………………………………………………….
Total 60 objects
0:000> !gcht -t s
Strong GC Handle Statistics:
Strong Handles: 185
Statistics:
………………………………………………………
Total 185 objects
GCHandles Stats from Second Dump
0:000> !gcht -t p
Pinned GC Handle Statistics:
Pinned Handles: 60
Statistics:
………………………………………………………………
Total 60 objects
0:000> !gcht -t s
Strong GC Handle Statistics:
Strong Handles: 186
Statistics:
……………………………………………………………..
Total 186 objects

Lets move over since we don’t see anything interesting with gchandles, no. of pinned gchandles remain same and strong gc handles count has increased only by one.

  • We will compare finalize queue stats in dumps, I am only including the interesting objects and the interesting comments for the sake of brevity

Finalize Queue in first dump

0:000> !finalizequeue
generation 2 has 9433 finalizable objects (05501508->0550a86c)
Ready for finalization 0 objects (0550af4c->0550af4c)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     1907 45768 System.Drawing.Bitmap
…………………………………………….
Total 9873 objects

Finalize Queue in second dump

0:000> !finalizequeue
generation 2 has 10545 finalizable objects (05501508->0550b9cc)
Ready for finalization 0 objects (0550bdac->0550bdac)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     2951 70824 System.Drawing.Bitmap
…………………………………………….
Total 10793 objects

Aha, Do we see something interesting here???? Of course, numbers of finalizable objects in generation 2 have increased by almost 1000 and on top of that number of objects ready to be finalized is 0. So why are these objects not getting finalized?

  • We have to find out why System.Drawing.Bitmap is not getting finalized.

As shown in above step,  generation 2 has 9433 finalizable objects (05501508->0550a86c).
We have finalizable objects starting from memory address 05501508 and ending at 0550a86c. You don’t want to dumpheap by type(System.Drawing.Bitmap) to look at the roots to this object, you will have to dump too many objects unless you get lucky. The better way is probably to display the memory and get the address of an object. Size of the System.Drawing.Bitmap object is 24 Bytes so we may be able to get the object address by specifying the address range ending with finalize queue @ 0550a86c. We will subtract 24*4 = 96 bytes(60) from 0550a86c which is 550A80C.
First column is the finalize queue address and the rest are the memory addresses of the objects
0:000> dd 550A80C 0550a86c

0550a80c  17b6e074 17b6e11c 17b6e1c4 17b6e26c
…………………………………………………………………………………….
0550a86c  17b76734
0:000> !do 17b6e074

Name: System.Drawing.Bitmap —-> Make sure this is System.Drawing.Bitmap
MethodTable: 7ae3c9f8
EEClass: 7ade4014
Size: 24(0×18) bytes

0:000> !gcroot -nostacks 17b6e704
DOMAIN(001581B0):HANDLE(Strong):ff11f8:Root:01981b64(System.Threading.Thread)->
………………………………………………………………………………………
01d00f54(MyApp.MyForm)->
160875cc(MyApp.Controls.MyControl)->
1618b578(Infragistics.Win.UltraWinGrid.UltraGridRow)->
14f1a7c0(Infragistics.Win.UltraWinGrid.CellsCollection)->
…………………………………………………………………………………….
17b6e674(Infragistics.Win.UltraWinGrid.UltraGridCell)->
17b6e6f4(Infragistics.Win.UltraWinGrid.UnBoundData)->
17b6e704(System.Drawing.Bitmap)

This is rooted in some strong handles so this is not rooted in finalization queue what that means is object is not ready to be finalized yet as we saw in finalizeQ stats. I am hiding the customer data so basically, we have a windows forms containing user control with infragistics UltraGrid and the System.Drawing.Bitmap is being set in a cell.

Let’s look at the sample code
foreach (UltraGridRow row in rows)
{
row.Cells[someindex] =<bitmap object>
}
This is where we have the problem because if there are let’s say 5000 rows then we are creating 5000 bitmap objects and as long as form is alive these objects will never be disposed. System.Drawing.Bitmap uses unmanaged GDIPlus library and this is not a lightweight object that’s why it was crashing with outofmemory exception and only in a particular scenario but this may go un-noticed during test cycle by QA team unless the test case covers this very particular scenario.
Resolution
I am sure there are many ways to fix it but one easy way to fix is create the drawing objects for rows visible in the client area and handle scroll/resize events to set the image and dispose the objects not in use.

Share/Save/Bookmark

Beginner guide to WinDbg - Part 2

Monday, February 23rd, 2009

Thank you all for your response to Beginner guide to WinDbg - Part 1. Now is the time for Beginner guide to WinDbg - Part 2 and I would be more than happy to get your suggestions on what would you like to see in part 2

thanks

Share/Save/Bookmark

Memory Access Violation in SQL Server Compact Edition(CE)

Wednesday, February 18th, 2009

Scenario

Windows Forms Application is throwing first chance memory access violation exception. Windows Forms application implements Application.ThreadException to log any unhandled exceptions in UI Threads. Log file always have the call stack  SqlCeCommand.ExecuteResultSet->SqlCeCommand.CompileQueryPlan->[NativeMethod]CompileQueryPlan(native sql ce dll). This application is using only one thread to execute commands in SQL CE so any multithreading issue is ruled out.

Some Ranting

Windows forms application in question is a pure .net application with no interop layer so the immediate suspicion was on upgraded SQL CE 3.5 SP1. There has been a few memory access violation in the past which apparently got resolved after installing 3.5. That made it very easy to blame it on Microsoft SQL CE dll. However, I do have to agree that Microsoft SQL CE exception handling is not as good as other .NET libraries. You will get “some weird - not making any sense” exception when you have a parameter missing or the column data type in your parametrized sql query.

Steps to Resolution using WinDbg

1. Get a full memory dump and share with microsoft support team

2. Since this is a first chance memory access violation that means you need to get a full dump on first chance exception

3. Create a adplus configuration file as shown below

<ADPlus>
<Settings>
<RunMode>CRASH</RunMode>
<Option>Quiet</Option>
<ProcessName>MyApp.exe</ProcessName>
</Settings>
<Exceptions>
<Option>NoDumpOnFirstChance</Option>
<Config>
<Code>clr;av</Code><!–to get the full dump on clr access violation–>
<Actions1>FullDump</Actions1>
<ReturnAction1>gn</ReturnAction1>
</Config>
</Exceptions>
</ADPlus>

4. run cscript.exe adplus.vbs -c <config file name>

5. Analyze the memory dump with 1st chance access violation using windbg

FAULTING_IP:
kernel32!InterlockedExchange

and if we dump the managed stack and look at other threads nothing unusual and the manged stack is just pointing to SqlCeCommand.CompileQueryPlan

It didn’t make much sense to us and we don’t have enough time to dig into it without private symbols for sql ce native dll.

We may be getting wrong call stack because of Heap corruption, so let’s get another full memory after enabling the full page heap. You can google on full page heap in case you have never used it or for quick reference visit http://msdn.microsoft.com/en-us/library/ms220938(VS.80).aspx

You can enable full page heap with the following command gflags /p /enable myapp.exe /full which adds an entry into your system registry and one should always disable it once done, visit http://msdn.microsoft.com/en-us/library/cc265936.aspx for more information, gflags gets installed with debugging tools for windows.

6. Get a full memory dump after enabling the full page heap

7. Analyze the dump, lets look at the call stack on exception thread . Run sos!dumpstack command without -EE , if you want to look at the managed and unmanaged both at the same time.Exception is thrown while Executing the resultset

0:000> !dumpstack
OS Thread Id: 0×12bc (0)
Current frame: sqlceme35!ME_GetKeyInfo+0×36
ChildEBP RetAddr  Caller,Callee
0012e0f8 79ec6feb mscorwks!DoSpecialUnmanagedCodeDemand+0×65, calling mscorwks!_EH_epilog3
0012e124 08326091 08326091
0012e154 085d935c (MethodDesc 0×857973c +0×1dc System.Data.SqlServerCe.SqlCeDataReader.FillMetaData(System.Data.SqlServerCe.SqlCeCommand)), calling 08588150
0012e1cc 085e38e4 (MethodDesc 0×857a7fc +0×44 System.Data.SqlServerCe.SqlCeCommand.InitializeDataReader(System.Data.SqlServerCe.SqlCeDataReader, Int32)), calling 08588cec
0012e1dc 085e3c6e (MethodDesc 0×857a7b8 +0×15e System.Data.SqlServerCe.SqlCeCommand.ExecuteCommand(System.Data.CommandBehavior, System.String, System.Data.SqlServerCe.ResultSetOptions)), calling 085897d4
0012e218 085e3e8b (MethodDesc 0×857a7a8 +0×2b System.Data.SqlServerCe.SqlCeCommand.ExecuteResultSet(System.Data.SqlServerCe.ResultSetOptions, System.Data.SqlServerCe.SqlCeResultSet)), calling 085897b4

8. Let’s look at all the call stack from all the threads, i will only show the interesting thread for brevity

0:000> ~*e!clrstack

OS Thread Id: 0×12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
02cffbe4 085db597 System.Data.SqlServerCe.SqlCeConnection.Close(Boolean)
02cffbf0 085dbcdf System.Data.SqlServerCe.SqlCeConnection.Close()
02cffbfc 085d80b3 System.Data.SqlServerCe.SqlCeDataReader.Dispose(Boolean)
02cffc18 085d7f25 System.Data.SqlServerCe.SqlCeDataReader.Finalize()

I see that finalizer thread is disposing the SqlCeConnection object.

9. How many connection object do we have on managed heap?

0:000> !dumpheap -type System.Data.SqlServerCe.SqlCeConnection
Address       MT     Size
0347ac10 085f2de4       96
035bf8c8 085f2de4       96
total 2 objects
Statistics:
MT    Count    TotalSize Class Name
085f2de4        2          192 System.Data.SqlServerCe.SqlCeConnection
Total 2 objects

so hey, we have 2 SqlCeConnection objects on managed heap one of them is getting finalized so nothing to worry.

How about we look at the connection object address on exeption thread and finalizer thread.

0:000> ~2e!clrstack -p
OS Thread Id: 0×12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
PARAMETERS:
this = 0×0347ac10

dump the command object from thread executing resultset

0:000> !do 0×03805988
Name: System.Data.SqlServerCe.SqlCeCommand
MethodTable: 085f3194
EEClass: 0857561c
Size: 120(0×78) bytes
(C:\WINDOWS\assembly\GAC_MSIL\System.Data.SqlServerCe\3.5.1.0__89845dcd8080cc91\System.Data.SqlServerCe.dll)
Fields:
MT    Field   Offset                 Type VT     Attr    Value Name
085f2de4  40000f6       28 …e.SqlCeConnection  0 instance 0347ac10 connection

Guess what, ExecuteResultSet is using the same connection object as the one getting finalized. Now, everything is making sense and of course you are going to get the memory access violation when sql ce is in middle of executing your query in native library. But, why is this connection object getting disposed?

10. find the module name from exception thread

0:000> lmv m MyApp_SqlCe*
start    end        module name
06e10000 0700c000   MyApp_SqlCe   (deferred)

11. Save the module and browse using reflector

0:000> !savemodule 06e10000 c:\temp\myappsqlce.dll
3 sections in file
section 0 - VA=2000, VASize=1f5794, FileAddr=1000, FileSize=1f6000
section 1 - VA=1f8000, VASize=3c0, FileAddr=1f7000, FileSize=1000
section 2 - VA=1fa000, VASize=c, FileAddr=1f8000, FileSize=1000

12. The culprit

SqlCeDataReader reader = command.ExecuteReader(CommandBehavior.CloseConnection);
Basically, at some point before calling ExecuteResultSet, the above line was getting executed on the same connection object with CommandBehavior = CloseConnection. This was disposing the connection object that’s why depending on memory pressure and finalizer queue, memory access violation was thrown randomly.

You may get disposed exception or other managed exception but in this case the timing was such that connection object always ended up in getting finalized while query is getting executed in native sql ce dll.

Share/Save/Bookmark

sos2.dll version 1.2 released

Tuesday, November 18th, 2008

Version 1.2 supports a command to dump GC Handles by type. Actually,  support for this command was suggested by Ilya Ryzhenkov (ReSharper Product Manager at JetBrains) .

Download it from here

To Do List is to dump all the handles for a type along with stats with an option to specify the # of handles. For example, if there are 100 strong gchandles, you may want to print just 10 of them or so

GCHandlesByType(gcht)

————————————————
Get GC Handles Stat by Type
————————————————
!GCHandlesByType provides statistics about GCHandles by type.
Supported Types are
1. Pinned(p)
2. AsyncPinned(ap)
3. Strong(s)
4. WeakLong(wl)
5. WeakShort(ws)
6. RefCount(r)
Example Syntax
!gcht [-perdomain] -t <type>
-perdomain option, will display the stat broken down by AppDomain.
type specified is not case sensitive, for example command syntax to print the
stat for Strong handle type is
0:003> !gcht -t strong
||
0:003> !gcht -t s
||
0:003> !gcht -t Strong
Strong GC Handle Statistics:
Strong Handles: 15
Statistics:
MT    Count    TotalSize Class Name
793040bc        1           16 System.Object[]
79330fb8        1           28 System.SharedStatics
79331e38        2           48 System.Reflection.Assembly
79330ec0        1           56 System.Threading.Thread
79330c30        1           72 System.ExecutionEngineException
79330ba0        1           72 System.StackOverflowException
79330b10        1           72 System.OutOfMemoryException
793310cc        1          100 System.AppDomain
793325b0        4          144 System.Security.PermissionSet
79330cc0        2          144 System.Threading.ThreadAbortException
Total 15 objects
——————————

Download it from here

Share/Save/Bookmark

ADPlus Configuration File to the rescue

Saturday, November 15th, 2008

Click to Download access violation adplus configuration file

ADPlus Configuration file to the rescue

Someone asked me about getting a memory dump on breakpoints in production environment.

Usually, you won’t have the luxury of attaching a debugger and inserting a breakpoint in production environment. However, you can still get a memory dump under different conditions or execute a command line option using ADPlus configuration file. AdPlus does support -hang switch to take a memory dump of a process anytime but that’s not good enough if you need to take a memory dump on a particular first chance exception or even when a breakpoint is hit.

ADPlus script supports configuration file with -c switch to create a memory dump of a user mode win32 process. You should read more on ADPlus configuration file on WinDbg help.

Below is the example of a ADPlus configuration file, which will create a memory dump under the following conditions

  1. When the application throws an unhandled exception with the exception code 0×80000001, a guard page exception which occurs when you access for example a stack’s guard page.
  2. Creates a full dump, when breakpoint hits the function kernel32.dll!UnhandledExceptionFilter
  3. Creates a mini dump. When breakpoint hits the function kernel32.dll!SetUnhandledExceptionFilter

<ADPlus>

<!– RunMode could be crash or hang, Quiet suppresses the warning message box–>

<Settings>

<RunMode> CRASH </RunMode>

<Option> Quiet </Option>

<ProcessName> <process name><!–e.g. cmd.exe–> </ProcessName>

</Settings>

<!–

PreCommands is included to change the symbol path for kernel32.dll, the first command .sympath sets the symbol path to c:\windows\system32(kernel32.dll location and the 2nd command reload the kernel32.dll defaulting to export symbols.

The reason for loading the export symbols has to do with setting a breakpoint in kernel32.dll functions as described in my last blog entry

–>

<PreCommands>

<Cmd> .sympath c:\windows\system32 </Cmd>

<Cmd> .reload /f kernel32.dll </Cmd>

</PreCommands>

<Exceptions>

<Config>

<Code>0×80000001</Code>

<Actions1> MiniDump </Actions1>

<Actions2> FullDump </Actions2>

</Config>

</Exceptions>

<Breakpoints>

<NewBP>

<Type> BM </Type>

<Address> kernel32.dll!UnhandledExceptionFilter </Address>

<Actions> FullDump</Actions>

<CustomActions> r </CustomActions>

</NewBP>

<NewBP>

<Type> BM </Type>

<Address> kernel32.dll!SetUnhandledExceptionFilter </Address>

<Actions> MiniDump </Actions>

<CustomActions> r </CustomActions>

</NewBP>

</Breakpoints>

</ADPlus>

adplus command to execute configuration file(exception.cfg)

cscript.exe adplus.vbs -c exception.cfg

Share/Save/Bookmark

Breakpoint gotcha with kernel32.dll microsoft public symbols

Thursday, November 13th, 2008

While debugging crash dump generation issue as described in blog on Dr Watson gotcha, I noticed that you can’t set a breakpoint on kernerl32 functions since microsoft symbols server gives you access to stripped public symbols only. This is one of those scenario where you would rather have export symbols.

Scenario

While doing live debugging or attaching a debugger to generate a dump when it hits a breakpoint on kernel32!SetUnhandledExceptionFilter

Steps using WinDbg

Run the following command

0:021> bm kernel32!SetUnhandledExceptionFilter

You can use bm to set a symbol breakpoint that matches the pattern.

Gotcha

If you have symbol server path set correctly pointing to microsoft public symbol server, WinDbg will display the following message and it suggest you to switch to export symbols

No matching code symbols found, no breakpoints set.

If you are using public symbols, switch to full or export symbols.

How to Switch to Export Symbols to set a breakpoint?

Run the following commands

  • 0:000> .sympath c:\windows\system32

Symbol search path is: c:\windows\system32

This will set your symbol path to kernel32.dll which should be under your windows system folder in my case it is “c:\windows\system32″

  • 0:000> .reload /f kernel32.dll

*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Windows\system32\kernel32.dll -

Don’t worry about ERROR message because this is what we want, we want it be to set to export symbols. .reload command will reload the symbols for kernerl32.dll defaulting to export symbols

  • 0:000> bm kernel32!SetUnhandledExceptionFilter

breakpoint 6 redefined

6: 7617d16f @!”kernel32!SetUnhandledExceptionFilter”

And your breakpoint is set using export symbols and of course you can use depends for all the exported symbols.

Share/Save/Bookmark

Akismet - Spam Blocked
March 2010
M T W T F S S
« Feb    
1234567
891011121314
15161718192021
22232425262728
293031  
Categories