Debugging a compilation failure (cl.exe) using WinDbg - fatal error C1902: Program database manager mismatch; please check your installation

Problem Description:

While trying to build a Win32 console project in Visual Studio 2008, the build fails with the error message “fatal error C1902: Program database manager mismatch; please check your installation”.

I decided to use WinDbg to find the root cause.

Steps to Debug

1. Because the failure happens during compilation, we have to attach WinDbg to cl.exe. Visual Studio 2008 launches cl.exe itself, so the debugger has to be attached automatically when the process starts (alternatively, you can run cl.exe directly under the debugger).

2. To auto-attach a debugger to a process, you can use gflags.exe (it is in the WinDbg installation folder).

3. Use gflags to configure WinDbg to auto-attach to cl.exe when it is launched.

[Figure: Gflags snapshot]

4. Build your project and you will see WinDbg getting launched and attached to cl.exe

5. Press F5 to continue and, when the debugger stops on the exception, run the !gle (get last error) command to see the details:
0:000> !gle
LastErrorValue: (Win32) 0 (0) - The operation completed successfully.
LastStatusValue: (NTSTATUS) 0xc0000135 - {Unable To Locate Component} This application has failed to start because %hs was not found. Re-installing the application may fix this problem.

6. The loader messages below match the last status description:

LDR: LdrGetDllHandle, searching for C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\mspdbsrv.exe from ……
LDR: LdrpCheckForLoadedDll - Unable To Locate C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\mspdbsrv.exe ……………..

7. In short, the build fails because cl.exe (the compiler) can’t find mspdbsrv.exe, which it needs for symbols.

You can copy mspdbsrv.exe into that path (C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\) or change the registry to point to the correct location. It appears that a Visual Studio 2005 upgrade may have caused this issue.
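Steps 2 and 3 come down to one registry value: gflags stores the debugger command line in a Debugger value under the Image File Execution Options key for the image name. The Python sketch below only builds the text of a .reg file for that value so you can inspect it; it does not touch the registry. The function name and the WinDbg path used in the example are assumptions for illustration, so adjust the path to your own installation.

```python
# Key under which Windows looks up a per-image debugger (used by gflags).
IFEO_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
    r"\CurrentVersion\Image File Execution Options"
)


def ifeo_reg_text(image_name, debugger_path):
    """Return .reg file text that sets the Debugger value for image_name.

    image_name is the executable name only (for example cl.exe).
    debugger_path is a hypothetical location of windbg.exe.
    """
    # String values in .reg files need backslashes and quotes escaped.
    escaped = debugger_path.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{IFEO_KEY}\\{image_name}]",
        f'"Debugger"="{escaped}"',
        "",
    ])


# Example with an assumed install path; print it to review before importing.
print(ifeo_reg_text("cl.exe", r"C:\Debuggers\windbg.exe"))
```

Remember to remove the Debugger value (or clear it in gflags) once you are done; otherwise every later launch of cl.exe will start under the debugger.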


The call to LoadLibrary(sos2) failed, Win32 error 0n14001, “This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem.”

Mike Semikin reported this issue when running the WinDbg command .load sos2:

The call to LoadLibrary(sos2) failed, Win32 error 0n14001
“This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem.”
Please check your debugger configuration and/or network access.

Resolution
This happens because sos2 was built with Visual Studio 2008, whose C runtime (CRT) is installed in the side-by-side assembly cache (WinSxS) folder; on a machine without that runtime, the dependent DLL is missing.

As an immediate workaround, install the Visual C++ 2008 runtime redistributable from Microsoft: http://www.microsoft.com/Downloads/details.aspx?familyid=9B2DA534-3E03-4391-8A4D-074B9F2BC1BF&displaylang=en


.NET Crash/OutOfMemoryException/Memory Leak - .NET Windows Forms and the Infragistics data grid: why is the System.Drawing.Image object not getting finalized?

Issue Description
A Windows Forms application crashed with an OutOfMemoryException (OOM). Before the application crashes, the CPU is pegged at almost 100% for a few minutes.

Root Cause Analysis using WinDbg

Collect full memory dump at set intervals

  • You could get a crash dump and analyze the managed heap to find the rooted objects. But since we have access to the system, I prefer to take dumps at set intervals and compare the managed heap statistics, because that makes it a little easier to find the objects that survive GC over a period of time.
  • We will use ADPlus to automate this task.
  • I will run the script to take 4 full memory dumps, one every 2 minutes.
  • The command to automate this task is “cscript.exe adplus.vbs -hang -pn <myapp.exe> -quiet -r 4 120” (4 dumps, 120 seconds apart).
  • The first dump file is around 800 MB, which also indicates the process’s memory usage at that time.
  • The second dump file is around 1.2 GB.
  • The third dump file is around 1.6 GB, and a little later the application crashed.

This is a pure .NET application, so we will jump ahead and look at the managed heap stats, the GC handles, and the objects in the finalize queue. We will use sos2.dll, copied into the same folder as the WinDbg executable. We will dump only the pinned and strong GC handles to see whether they increase over time, because such handles could cause a memory leak. Please note that the !gcht (GC handles by type) command is only available in our WinDbg extension sos2.dll. You could use sos.dll!gchandles to dump the GC handles, but it won’t give you the objects and their stats by type, and you would have to work that out yourself, probably by looking at the roots.

GCHandles Stats from First Dump
0:000> .load sos2
0:000> !gcht -t p

Pinned GC Handle Statistics:
Pinned Handles: 60
Statistics:
………………………………………………….
Total 60 objects
0:000> !gcht -t s
Strong GC Handle Statistics:
Strong Handles: 185
Statistics:
………………………………………………………
Total 185 objects
GCHandles Stats from Second Dump
0:000> !gcht -t p
Pinned GC Handle Statistics:
Pinned Handles: 60
Statistics:
………………………………………………………………
Total 60 objects
0:000> !gcht -t s
Strong GC Handle Statistics:
Strong Handles: 186
Statistics:
……………………………………………………………..
Total 186 objects

Let’s move on, since there is nothing interesting in the GC handles: the number of pinned handles stays the same, and the strong handle count has increased by only one.

  • We will compare the finalize queue stats across the dumps. For brevity, I am including only the interesting objects.

Finalize Queue in first dump

0:000> !finalizequeue
generation 2 has 9433 finalizable objects (05501508->0550a86c)
Ready for finalization 0 objects (0550af4c->0550af4c)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     1907 45768 System.Drawing.Bitmap
…………………………………………….
Total 9873 objects

Finalize Queue in second dump

0:000> !finalizequeue
generation 2 has 10545 finalizable objects (05501508->0550b9cc)
Ready for finalization 0 objects (0550bdac->0550bdac)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     2951 70824 System.Drawing.Bitmap
…………………………………………….
Total 10793 objects

Aha, do we see something interesting here? Of course: the number of finalizable objects in generation 2 has increased by more than 1000 (from 9433 to 10545), and on top of that the number of objects ready to be finalized is 0. So why are these objects not getting finalized?
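Comparing the statistics by eye gets tedious once the output has hundreds of rows. As a minimal sketch, assuming you have saved the !finalizequeue output of each dump as text, the Python below parses the statistics rows and reports the growth per class. The function names are made up for illustration, and the sample text contains only the rows quoted in this post.

```python
import re

# A statistics row looks like: "7ae3c9f8     1907 45768 System.Drawing.Bitmap"
ROW = re.compile(r"^\s*([0-9a-fA-F]{8,16})\s+(\d+)\s+(\d+)\s+(\S.*)$")


def parse_stats(text):
    """Map class name -> (count, total size) from !finalizequeue output."""
    stats = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            _mt, count, size, name = match.groups()
            stats[name.strip()] = (int(count), int(size))
    return stats


def growth(before, after):
    """Return (class name, count delta, size delta), largest growth first."""
    rows = []
    for name, (count, size) in after.items():
        old_count, old_size = before.get(name, (0, 0))
        rows.append((name, count - old_count, size - old_size))
    return sorted(rows, key=lambda row: row[1], reverse=True)


FIRST_DUMP = """\
generation 2 has 9433 finalizable objects (05501508->0550a86c)
Ready for finalization 0 objects (0550af4c->0550af4c)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     1907 45768 System.Drawing.Bitmap
Total 9873 objects
"""

SECOND_DUMP = """\
generation 2 has 10545 finalizable objects (05501508->0550b9cc)
Ready for finalization 0 objects (0550bdac->0550bdac)
Statistics:
MT    Count    TotalSize Class Name
7ae3c9f8     2951 70824 System.Drawing.Bitmap
Total 10793 objects
"""

for name, count_delta, size_delta in growth(parse_stats(FIRST_DUMP), parse_stats(SECOND_DUMP)):
    print(f"{name}: +{count_delta} objects, +{size_delta} bytes")
```

For the two dumps above this reports 1044 additional System.Drawing.Bitmap objects between the first and second dump.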

  • We have to find out why System.Drawing.Bitmap is not getting finalized.

As shown in the step above, generation 2 has 9433 finalizable objects (05501508->0550a86c): the finalizable objects are tracked from memory address 05501508 to 0550a86c. You don’t want to run !dumpheap by type (System.Drawing.Bitmap) and look at the roots of each object; you would have to dump too many objects unless you get lucky. A better way is to display the finalize queue memory and pick an object address from it. The size of a System.Drawing.Bitmap object is 24 bytes, so we may be able to get an object address by displaying an address range that ends at the end of the finalize queue, 0550a86c. We subtract 24*4 = 96 bytes (0x60) from 0550a86c, which gives 0550a80c; since each queue entry is a 4-byte pointer, this range covers roughly the last 24 entries.
In the output, the first column is the finalize queue address and the remaining columns are the memory addresses of the objects.
0:000> dd 550A80C 0550a86c

0550a80c  17b6e074 17b6e11c 17b6e1c4 17b6e26c
…………………………………………………………………………………….
0550a86c  17b76734
0:000> !do 17b6e074

Name: System.Drawing.Bitmap ---> Make sure this is System.Drawing.Bitmap
MethodTable: 7ae3c9f8
EEClass: 7ade4014
Size: 24(0x18) bytes
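The address arithmetic behind the dd range is easy to get wrong by hand. The Python sketch below redoes it, assuming a 32-bit process (the addresses in these dumps are 8 hex digits, so each finalize queue slot is a 4-byte pointer). The helper names are hypothetical; only the addresses come from the dumps above.

```python
POINTER_SIZE = 4  # assumption: 32-bit process, one 4-byte object pointer per slot


def slot_count(queue_start, queue_end):
    """Number of pointer-sized slots between two finalize queue addresses."""
    return (queue_end - queue_start) // POINTER_SIZE


def dd_range(queue_end, slots):
    """Build a dd command covering the last `slots` entries before queue_end."""
    start = queue_end - slots * POINTER_SIZE
    return f"dd {start:08x} {queue_end:08x}"


# Generation 2 range reported by !finalizequeue in the first dump.
print(slot_count(0x05501508, 0x0550A86C))  # matches the 9433 finalizable objects
# Step back 24 slots (96 bytes, 0x60) from the end of the range.
print(dd_range(0x0550A86C, 24))
```

The slot count also works as a sanity check: the address range reported for generation 2 divided by the pointer size gives exactly the number of finalizable objects that !finalizequeue prints.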

0:000> !gcroot -nostacks 17b6e704
DOMAIN(001581B0):HANDLE(Strong):ff11f8:Root:01981b64(System.Threading.Thread)->
………………………………………………………………………………………
01d00f54(MyApp.MyForm)->
160875cc(MyApp.Controls.MyControl)->
1618b578(Infragistics.Win.UltraWinGrid.UltraGridRow)->
14f1a7c0(Infragistics.Win.UltraWinGrid.CellsCollection)->
…………………………………………………………………………………….
17b6e674(Infragistics.Win.UltraWinGrid.UltraGridCell)->
17b6e6f4(Infragistics.Win.UltraWinGrid.UnBoundData)->
17b6e704(System.Drawing.Bitmap)

This object is rooted in a strong handle, not in the finalization queue, which means it is not ready to be finalized yet, as we saw in the finalize queue stats. I am hiding the customer data, but basically we have a Windows Form containing a user control with an Infragistics UltraGrid, and a System.Drawing.Bitmap is being set in a cell.

Let’s look at the sample code:

foreach (UltraGridRow row in rows)
{
    row.Cells[someindex] = <bitmap object>;
}

This is where we have the problem: if there are, say, 5000 rows, then we create 5000 bitmap objects, and as long as the form is alive these objects will never be disposed. System.Drawing.Bitmap uses the unmanaged GDI+ library and is not a lightweight object, which is why the application crashed with an OutOfMemoryException. It only happens in a particular scenario, so it may go unnoticed during the QA test cycle unless a test case covers that scenario.

Resolution
I am sure there are many ways to fix this, but one easy way is to create the drawing objects only for the rows visible in the client area, and to handle the scroll/resize events to set the images and dispose of the objects no longer in use.
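The idea in the resolution can be modeled independently of WinForms. The Python sketch below is only an illustration of the bookkeeping, not Infragistics or System.Drawing code: FakeImage stands in for an expensive drawing object, and on_viewport_changed is a hypothetical handler you would call from the scroll/resize events with the first and last visible row.

```python
class FakeImage:
    """Stand-in for an expensive image that must be disposed explicitly."""

    def __init__(self, row):
        self.row = row
        self.disposed = False

    def dispose(self):
        self.disposed = True


class VisibleRowImages:
    """Keeps images alive only for the rows currently visible."""

    def __init__(self, create_image=FakeImage):
        self._create_image = create_image
        self._images = {}        # row index -> image
        self.disposed_total = 0  # how many images were released so far

    def on_viewport_changed(self, first_row, last_row):
        visible = range(first_row, last_row + 1)
        # Release the images of rows that scrolled out of view.
        for row in [r for r in self._images if r not in visible]:
            self._images.pop(row).dispose()
            self.disposed_total += 1
        # Create images only for rows that just became visible.
        for row in visible:
            if row not in self._images:
                self._images[row] = self._create_image(row)

    @property
    def live_count(self):
        return len(self._images)


cache = VisibleRowImages()
cache.on_viewport_changed(0, 19)       # initial view: rows 0..19
cache.on_viewport_changed(4980, 4999)  # user scrolls to the end of 5000 rows
print(cache.live_count, cache.disposed_total)
```

With 5000 rows and 20 visible at a time, the number of live images stays at the viewport size instead of growing with the row count.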


Memory Access Violation in SQL Server Compact Edition(CE)

Scenario

A Windows Forms application is throwing a first chance memory access violation exception. The application implements Application.ThreadException to log any unhandled exceptions on UI threads. The log file always shows the call stack SqlCeCommand.ExecuteResultSet->SqlCeCommand.CompileQueryPlan->[NativeMethod]CompileQueryPlan (native SQL CE DLL). The application uses only one thread to execute commands in SQL CE, so a multithreading issue is ruled out.

Some Ranting

The Windows Forms application in question is a pure .NET application with no interop layer, so the immediate suspicion fell on the upgraded SQL CE 3.5 SP1. There had been a few memory access violations in the past, which apparently got resolved after installing 3.5. That made it very easy to blame the Microsoft SQL CE DLL. However, I do have to agree that SQL CE exception handling is not as good as that of other .NET libraries. You will get some weird exception that makes no sense when a parameter is missing or a column data type is wrong in your parameterized SQL query.

Steps to Resolution using WinDbg

1. Get a full memory dump and share it with the Microsoft support team

2. Since this is a first chance memory access violation, you need to take the full dump on the first chance exception

3. Create an ADPlus configuration file as shown below

<ADPlus>
  <Settings>
    <RunMode>CRASH</RunMode>
    <Option>Quiet</Option>
    <ProcessName>MyApp.exe</ProcessName>
  </Settings>
  <Exceptions>
    <Option>NoDumpOnFirstChance</Option>
    <Config>
      <Code>clr;av</Code><!-- to get the full dump on clr access violation -->
      <Actions1>FullDump</Actions1>
      <ReturnAction1>gn</ReturnAction1>
    </Config>
  </Exceptions>
</ADPlus>

4. Run cscript.exe adplus.vbs -c <config file name>

5. Analyze the memory dump taken on the first chance access violation using WinDbg

FAULTING_IP:
kernel32!InterlockedExchange

If we dump the managed stack and look at the other threads, nothing is unusual, and the managed stack just points to SqlCeCommand.CompileQueryPlan.

It didn’t make much sense to us, and we didn’t have enough time to dig into it without private symbols for the SQL CE native DLL.

We may be getting a wrong call stack because of heap corruption, so let’s get another full memory dump after enabling full page heap. If you have never used full page heap, search for it or, for a quick reference, visit http://msdn.microsoft.com/en-us/library/ms220938(VS.80).aspx

You can enable full page heap with the command gflags /p /enable myapp.exe /full, which adds an entry to the system registry; always disable it once you are done. Visit http://msdn.microsoft.com/en-us/library/cc265936.aspx for more information. gflags is installed with Debugging Tools for Windows.

6. Get a full memory dump after enabling full page heap

7. Analyze the dump. Let’s look at the call stack on the exception thread. Run the sos!dumpstack command without -EE if you want to see the managed and unmanaged frames at the same time. The exception is thrown while executing the result set:

0:000> !dumpstack
OS Thread Id: 0x12bc (0)
Current frame: sqlceme35!ME_GetKeyInfo+0x36
ChildEBP RetAddr  Caller,Callee
0012e0f8 79ec6feb mscorwks!DoSpecialUnmanagedCodeDemand+0x65, calling mscorwks!_EH_epilog3
0012e124 08326091 08326091
0012e154 085d935c (MethodDesc 0x857973c +0x1dc System.Data.SqlServerCe.SqlCeDataReader.FillMetaData(System.Data.SqlServerCe.SqlCeCommand)), calling 08588150
0012e1cc 085e38e4 (MethodDesc 0x857a7fc +0x44 System.Data.SqlServerCe.SqlCeCommand.InitializeDataReader(System.Data.SqlServerCe.SqlCeDataReader, Int32)), calling 08588cec
0012e1dc 085e3c6e (MethodDesc 0x857a7b8 +0x15e System.Data.SqlServerCe.SqlCeCommand.ExecuteCommand(System.Data.CommandBehavior, System.String, System.Data.SqlServerCe.ResultSetOptions)), calling 085897d4
0012e218 085e3e8b (MethodDesc 0x857a7a8 +0x2b System.Data.SqlServerCe.SqlCeCommand.ExecuteResultSet(System.Data.SqlServerCe.ResultSetOptions, System.Data.SqlServerCe.SqlCeResultSet)), calling 085897b4

8. Let’s look at the call stacks of all the threads; I will only show the interesting thread for brevity

0:000> ~*e!clrstack

OS Thread Id: 0x12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
02cffbe4 085db597 System.Data.SqlServerCe.SqlCeConnection.Close(Boolean)
02cffbf0 085dbcdf System.Data.SqlServerCe.SqlCeConnection.Close()
02cffbfc 085d80b3 System.Data.SqlServerCe.SqlCeDataReader.Dispose(Boolean)
02cffc18 085d7f25 System.Data.SqlServerCe.SqlCeDataReader.Finalize()

I see that the finalizer thread is disposing the SqlCeConnection object.

9. How many connection objects do we have on the managed heap?

0:000> !dumpheap -type System.Data.SqlServerCe.SqlCeConnection
Address       MT     Size
0347ac10 085f2de4       96
035bf8c8 085f2de4       96
total 2 objects
Statistics:
MT    Count    TotalSize Class Name
085f2de4        2          192 System.Data.SqlServerCe.SqlCeConnection
Total 2 objects

So we have 2 SqlCeConnection objects on the managed heap, and one of them is being finalized, which at first glance is nothing to worry about.

How about we look at the connection object address on the exception thread and on the finalizer thread?

0:000> ~2e!clrstack -p
OS Thread Id: 0x12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
PARAMETERS:
this = 0x0347ac10

Dump the command object from the thread executing the result set:

0:000> !do 0x03805988
Name: System.Data.SqlServerCe.SqlCeCommand
MethodTable: 085f3194
EEClass: 0857561c
Size: 120(0x78) bytes
(C:\WINDOWS\assembly\GAC_MSIL\System.Data.SqlServerCe\3.5.1.0__89845dcd8080cc91\System.Data.SqlServerCe.dll)
Fields:
MT    Field   Offset                 Type VT     Attr    Value Name
085f2de4  40000f6       28 …e.SqlCeConnection  0 instance 0347ac10 connection

Guess what: ExecuteResultSet is using the same connection object as the one being finalized. Now everything makes sense; of course you are going to get a memory access violation when the connection is closed while SQL CE is in the middle of executing your query in the native library. But why is this connection object getting disposed?

10. Find the module name from the exception thread

0:000> lmv m MyApp_SqlCe*
start    end        module name
06e10000 0700c000   MyApp_SqlCe   (deferred)

11. Save the module and browse it using Reflector

0:000> !savemodule 06e10000 c:\temp\myappsqlce.dll
3 sections in file
section 0 - VA=2000, VASize=1f5794, FileAddr=1000, FileSize=1f6000
section 1 - VA=1f8000, VASize=3c0, FileAddr=1f7000, FileSize=1000
section 2 - VA=1fa000, VASize=c, FileAddr=1f8000, FileSize=1000

12. The culprit

SqlCeDataReader reader = command.ExecuteReader(CommandBehavior.CloseConnection);
Basically, at some point before calling ExecuteResultSet, the line above was executed on the same connection object with CommandBehavior.CloseConnection. When that reader was later finalized (see the finalizer thread stack above), it closed the shared connection; that is why, depending on memory pressure and the finalizer queue, the memory access violation was thrown randomly.

You may get an object-disposed exception or another managed exception instead, but in this case the timing was such that the connection object always ended up being finalized while the query was executing in the native SQL CE DLL.
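The failure mode is easier to see in a stripped-down model. The Python sketch below is not SQL CE code: the classes are hypothetical stand-ins, and an explicit close() call plays the role of the finalizer thread. It only shows how a reader created with a "close the connection with me" flag takes down a connection that another command still uses.

```python
class Connection:
    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


class Reader:
    def __init__(self, connection, close_connection):
        self._connection = connection
        self._close_connection = close_connection

    def close(self):
        # Mirrors the CloseConnection behavior: closing the reader
        # also closes the connection it was created on.
        if self._close_connection:
            self._connection.close()


class Command:
    def __init__(self, connection):
        self.connection = connection

    def execute_reader(self, close_connection=False):
        return Reader(self.connection, close_connection)

    def execute_result_set(self):
        if not self.connection.is_open:
            raise RuntimeError("connection was closed underneath the command")
        return "rows"


def run_scenario(close_connection):
    shared = Connection()
    leaked_reader = Command(shared).execute_reader(close_connection)
    leaked_reader.close()  # stands in for the finalizer running at some later time
    try:
        return Command(shared).execute_result_set()
    except RuntimeError as exc:
        return f"failed: {exc}"


print(run_scenario(close_connection=True))
print(run_scenario(close_connection=False))
```

In the model the second command fails cleanly; in the real application the close happened on another thread while native code was running, which is why it surfaced as an access violation. Either closing the reader deterministically or not using CloseConnection on a shared connection avoids the problem.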


sos2.dll version 1.2 released

Version 1.2 adds a command to dump GC handles by type. Support for this command was suggested by Ilya Ryzhenkov (ReSharper Product Manager at JetBrains).

Download it from here

The to-do list includes dumping all the handles for a type along with the stats, with an option to specify the number of handles. For example, if there are 100 strong GC handles, you may want to print just 10 of them or so.

GCHandlesByType(gcht)

------------------------------------------------
Get GC Handles Stat by Type
------------------------------------------------
!GCHandlesByType provides statistics about GCHandles by type.
Supported Types are
1. Pinned(p)
2. AsyncPinned(ap)
3. Strong(s)
4. WeakLong(wl)
5. WeakShort(ws)
6. RefCount(r)
Example Syntax
!gcht [-perdomain] -t <type>
-perdomain option, will display the stat broken down by AppDomain.
type specified is not case sensitive, for example command syntax to print the
stat for Strong handle type is
0:003> !gcht -t strong
||
0:003> !gcht -t s
||
0:003> !gcht -t Strong
Strong GC Handle Statistics:
Strong Handles: 15
Statistics:
MT    Count    TotalSize Class Name
793040bc        1           16 System.Object[]
79330fb8        1           28 System.SharedStatics
79331e38        2           48 System.Reflection.Assembly
79330ec0        1           56 System.Threading.Thread
79330c30        1           72 System.ExecutionEngineException
79330ba0        1           72 System.StackOverflowException
79330b10        1           72 System.OutOfMemoryException
793310cc        1          100 System.AppDomain
793325b0        4          144 System.Security.PermissionSet
79330cc0        2          144 System.Threading.ThreadAbortException
Total 15 objects
------------------------------
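The help text says the type argument is not case sensitive and accepts either the short or the long name. As a rough sketch of that lookup (this is not the sos2 source; the function and table names are invented for illustration), it could be modeled in Python like this:

```python
# Short aliases and long names as listed in the !gcht help text.
HANDLE_TYPES = {
    "p": "Pinned",
    "ap": "AsyncPinned",
    "s": "Strong",
    "wl": "WeakLong",
    "ws": "WeakShort",
    "r": "RefCount",
}
_BY_LONG_NAME = {name.lower(): name for name in HANDLE_TYPES.values()}


def resolve_handle_type(arg):
    """Resolve a -t argument (short or long, any case) to the handle type name."""
    key = arg.strip().lower()
    if key in HANDLE_TYPES:
        return HANDLE_TYPES[key]
    if key in _BY_LONG_NAME:
        return _BY_LONG_NAME[key]
    raise ValueError(f"unsupported GC handle type: {arg!r}")


for arg in ("strong", "s", "Strong"):
    print(arg, "->", resolve_handle_type(arg))
```

All three spellings from the example syntax resolve to the same Strong handle type.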



ADPlus Configuration File to the rescue

Click to Download access violation adplus configuration file

Someone asked me about getting a memory dump on breakpoints in a production environment.

Usually, you won’t have the luxury of attaching a debugger and inserting a breakpoint in a production environment. However, you can still get a memory dump under different conditions, or execute a command, using an ADPlus configuration file. ADPlus does support the -hang switch to take a memory dump of a process at any time, but that’s not good enough if you need a memory dump on a particular first chance exception or when a breakpoint is hit.

The ADPlus script accepts a configuration file with the -c switch to create memory dumps of a user-mode Win32 process. You can read more about the ADPlus configuration file in the WinDbg help.

Below is an example of an ADPlus configuration file, which creates a memory dump under the following conditions:

  1. When the application throws an unhandled exception with the exception code 0x80000001, a guard page exception, which occurs when you access, for example, a stack’s guard page.
  2. Creates a full dump when a breakpoint is hit on the function kernel32.dll!UnhandledExceptionFilter.
  3. Creates a mini dump when a breakpoint is hit on the function kernel32.dll!SetUnhandledExceptionFilter.

<ADPlus>
  <!-- RunMode could be crash or hang; Quiet suppresses the warning message box -->
  <Settings>
    <RunMode> CRASH </RunMode>
    <Option> Quiet </Option>
    <ProcessName> <process name><!-- e.g. cmd.exe --> </ProcessName>
  </Settings>
  <!--
    PreCommands is included to change the symbol path for kernel32.dll. The first command, .sympath, sets the symbol path to c:\windows\system32 (the kernel32.dll location), and the second command reloads kernel32.dll, defaulting to export symbols.
    The reason for loading the export symbols has to do with setting a breakpoint in kernel32.dll functions, as described in my last blog entry.
  -->
  <PreCommands>
    <Cmd> .sympath c:\windows\system32 </Cmd>
    <Cmd> .reload /f kernel32.dll </Cmd>
  </PreCommands>
  <Exceptions>
    <Config>
      <Code>0x80000001</Code>
      <Actions1> MiniDump </Actions1>
      <Actions2> FullDump </Actions2>
    </Config>
  </Exceptions>
  <Breakpoints>
    <NewBP>
      <Type> BM </Type>
      <Address> kernel32.dll!UnhandledExceptionFilter </Address>
      <Actions> FullDump </Actions>
      <CustomActions> r </CustomActions>
    </NewBP>
    <NewBP>
      <Type> BM </Type>
      <Address> kernel32.dll!SetUnhandledExceptionFilter </Address>
      <Actions> MiniDump </Actions>
      <CustomActions> r </CustomActions>
    </NewBP>
  </Breakpoints>
</ADPlus>

ADPlus command to run with the configuration file (exception.cfg):

cscript.exe adplus.vbs -c exception.cfg
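If you need several such files (one per process or per set of breakpoints), generating them avoids copy-and-paste mistakes such as unbalanced tags. The Python sketch below builds the Settings, PreCommands and Breakpoints parts of the example above with the standard library; the element names are taken from this post's example, the function name is made up, and the Exceptions section is left out to keep it short.

```python
import xml.etree.ElementTree as ET


def build_adplus_config(process_name, breakpoints, pre_commands=()):
    """Return ADPlus configuration XML as a string.

    breakpoints is a list of (address, action) pairs, for example
    ("kernel32.dll!UnhandledExceptionFilter", "FullDump").
    """
    root = ET.Element("ADPlus")

    settings = ET.SubElement(root, "Settings")
    ET.SubElement(settings, "RunMode").text = "CRASH"
    ET.SubElement(settings, "Option").text = "Quiet"
    ET.SubElement(settings, "ProcessName").text = process_name

    if pre_commands:
        pre = ET.SubElement(root, "PreCommands")
        for command in pre_commands:
            ET.SubElement(pre, "Cmd").text = command

    bps = ET.SubElement(root, "Breakpoints")
    for address, action in breakpoints:
        bp = ET.SubElement(bps, "NewBP")
        ET.SubElement(bp, "Type").text = "BM"
        ET.SubElement(bp, "Address").text = address
        ET.SubElement(bp, "Actions").text = action
        ET.SubElement(bp, "CustomActions").text = "r"

    return ET.tostring(root, encoding="unicode")


config = build_adplus_config(
    "cmd.exe",
    [
        ("kernel32.dll!UnhandledExceptionFilter", "FullDump"),
        ("kernel32.dll!SetUnhandledExceptionFilter", "MiniDump"),
    ],
    pre_commands=[r".sympath c:\windows\system32", ".reload /f kernel32.dll"],
)
print(config)
```

Save the printed text as exception.cfg and pass it to ADPlus with the -c switch as shown above.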


Breakpoint gotcha with kernel32.dll microsoft public symbols

While debugging the crash dump generation issue described in the blog post on the Dr Watson gotcha, I noticed that you can’t set a breakpoint on kernel32 functions, because the Microsoft symbol server gives you access to stripped public symbols only. This is one of those scenarios where you would rather have export symbols.

Scenario

You are doing live debugging, or attaching a debugger to generate a dump when it hits a breakpoint on kernel32!SetUnhandledExceptionFilter.

Steps using WinDbg

Run the following command

0:021> bm kernel32!SetUnhandledExceptionFilter

You can use bm to set a symbol breakpoint on every symbol that matches the pattern.

Gotcha

If your symbol path correctly points to the Microsoft public symbol server, WinDbg displays the following message and suggests that you switch to export symbols:

No matching code symbols found, no breakpoints set.

If you are using public symbols, switch to full or export symbols.

How to Switch to Export Symbols to set a breakpoint?

Run the following commands

  • 0:000> .sympath c:\windows\system32

Symbol search path is: c:\windows\system32

This sets your symbol path to the folder containing kernel32.dll, which should be your Windows system folder; in my case it is “c:\windows\system32”.

  • 0:000> .reload /f kernel32.dll

*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Windows\system32\kernel32.dll -

Don’t worry about the ERROR message, because this is what we want: the symbols are now set to export symbols. The .reload command reloads the symbols for kernel32.dll, defaulting to export symbols.

  • 0:000> bm kernel32!SetUnhandledExceptionFilter

breakpoint 6 redefined

6: 7617d16f @!"kernel32!SetUnhandledExceptionFilter"

Your breakpoint is now set using export symbols. You can use Dependency Walker (depends) to list all the exported symbols.
