Memory Access Violation in SQL Server Compact Edition(CE)

Scenario

Windows Forms Application is throwing first chance memory access violation exception. Windows Forms application implements Application.ThreadException to log any unhandled exceptions in UI Threads. Log file always have the call stack  SqlCeCommand.ExecuteResultSet->SqlCeCommand.CompileQueryPlan->[NativeMethod]CompileQueryPlan(native sql ce dll). This application is using only one thread to execute commands in SQL CE so any multithreading issue is ruled out.

Some Ranting

Windows forms application in question is a pure .net application with no interop layer so the immediate suspicion was on upgraded SQL CE 3.5 SP1. There has been a few memory access violation in the past which apparently got resolved after installing 3.5. That made it very easy to blame it on Microsoft SQL CE dll. However, I do have to agree that Microsoft SQL CE exception handling is not as good as other .NET libraries. You will get “some weird – not making any sense” exception when you have a parameter missing or the column data type in your parametrized sql query.

Steps to Resolution using WinDbg

1. Get a full memory dump and share with microsoft support team

2. Since this is a first chance memory access violation that means you need to get a full dump on first chance exception

3. Create a adplus configuration file as shown below

<ADPlus>
<Settings>
<RunMode>CRASH</RunMode>
<Option>Quiet</Option>
<ProcessName>MyApp.exe</ProcessName>
</Settings>
<Exceptions>
<Option>NoDumpOnFirstChance</Option>
<Config>
<Code>clr;av</Code><!–to get the full dump on clr access violation–>
<Actions1>FullDump</Actions1>
<ReturnAction1>gn</ReturnAction1>
</Config>
</Exceptions>
</ADPlus>

4. run cscript.exe adplus.vbs -c <config file name>

5. Analyze the memory dump with 1st chance access violation using windbg

FAULTING_IP:
kernel32!InterlockedExchange

and if we dump the managed stack and look at other threads nothing unusual and the manged stack is just pointing to SqlCeCommand.CompileQueryPlan

It didn’t make much sense to us and we don’t have enough time to dig into it without private symbols for sql ce native dll.

We may be getting wrong call stack because of Heap corruption, so let’s get another full memory after enabling the full page heap. You can google on full page heap in case you have never used it or for quick reference visit http://msdn.microsoft.com/en-us/library/ms220938(VS.80).aspx

You can enable full page heap with the following command gflags /p /enable myapp.exe /full which adds an entry into your system registry and one should always disable it once done, visit http://msdn.microsoft.com/en-us/library/cc265936.aspx for more information, gflags gets installed with debugging tools for windows.

6. Get a full memory dump after enabling the full page heap

7. Analyze the dump, lets look at the call stack on exception thread . Run sos!dumpstack command without -EE , if you want to look at the managed and unmanaged both at the same time.Exception is thrown while Executing the resultset

0:000> !dumpstack
OS Thread Id: 0x12bc (0)
Current frame: sqlceme35!ME_GetKeyInfo+0×36
ChildEBP RetAddr  Caller,Callee
0012e0f8 79ec6feb mscorwks!DoSpecialUnmanagedCodeDemand+0×65, calling mscorwks!_EH_epilog3
0012e124 08326091 08326091
0012e154 085d935c (MethodDesc 0x857973c +0x1dc System.Data.SqlServerCe.SqlCeDataReader.FillMetaData(System.Data.SqlServerCe.SqlCeCommand)), calling 08588150
0012e1cc 085e38e4 (MethodDesc 0x857a7fc +0×44 System.Data.SqlServerCe.SqlCeCommand.InitializeDataReader(System.Data.SqlServerCe.SqlCeDataReader, Int32)), calling 08588cec
0012e1dc 085e3c6e (MethodDesc 0x857a7b8 +0x15e System.Data.SqlServerCe.SqlCeCommand.ExecuteCommand(System.Data.CommandBehavior, System.String, System.Data.SqlServerCe.ResultSetOptions)), calling 085897d4
0012e218 085e3e8b (MethodDesc 0x857a7a8 +0x2b System.Data.SqlServerCe.SqlCeCommand.ExecuteResultSet(System.Data.SqlServerCe.ResultSetOptions, System.Data.SqlServerCe.SqlCeResultSet)), calling 085897b4

8. Let’s look at all the call stack from all the threads, i will only show the interesting thread for brevity

0:000> ~*e!clrstack

OS Thread Id: 0x12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
02cffbe4 085db597 System.Data.SqlServerCe.SqlCeConnection.Close(Boolean)
02cffbf0 085dbcdf System.Data.SqlServerCe.SqlCeConnection.Close()
02cffbfc 085d80b3 System.Data.SqlServerCe.SqlCeDataReader.Dispose(Boolean)
02cffc18 085d7f25 System.Data.SqlServerCe.SqlCeDataReader.Finalize()

I see that finalizer thread is disposing the SqlCeConnection object.

9. How many connection object do we have on managed heap?

0:000> !dumpheap -type System.Data.SqlServerCe.SqlCeConnection
Address       MT     Size
0347ac10 085f2de4       96
035bf8c8 085f2de4       96
total 2 objects
Statistics:
MT    Count    TotalSize Class Name
085f2de4        2          192 System.Data.SqlServerCe.SqlCeConnection
Total 2 objects

so hey, we have 2 SqlCeConnection objects on managed heap one of them is getting finalized so nothing to worry.

How about we look at the connection object address on exeption thread and finalizer thread.

0:000> ~2e!clrstack -p
OS Thread Id: 0x12c4 (2)
ESP       EIP
02cffbc8 7c90e4f4 [NDirectMethodFrameStandalone: 02cffbc8] System.Data.SqlServerCe.NativeMethods.CloseStore(IntPtr)
02cffbd8 085dbbf6 System.Data.SqlServerCe.SqlCeConnection.ReleaseNativeInterfaces()
PARAMETERS:
this = 0x0347ac10

dump the command object from thread executing resultset

0:000> !do 0×03805988
Name: System.Data.SqlServerCe.SqlCeCommand
MethodTable: 085f3194
EEClass: 0857561c
Size: 120(0×78) bytes
(C:\WINDOWS\assembly\GAC_MSIL\System.Data.SqlServerCe\3.5.1.0__89845dcd8080cc91\System.Data.SqlServerCe.dll)
Fields:
MT    Field   Offset                 Type VT     Attr    Value Name
085f2de4  40000f6       28 …e.SqlCeConnection  0 instance 0347ac10 connection

Guess what, ExecuteResultSet is using the same connection object as the one getting finalized. Now, everything is making sense and of course you are going to get the memory access violation when sql ce is in middle of executing your query in native library. But, why is this connection object getting disposed?

10. find the module name from exception thread

0:000> lmv m MyApp_SqlCe*
start    end        module name
06e10000 0700c000   MyApp_SqlCe   (deferred)

11. Save the module and browse using reflector

0:000> !savemodule 06e10000 c:\temp\myappsqlce.dll
3 sections in file
section 0 – VA=2000, VASize=1f5794, FileAddr=1000, FileSize=1f6000
section 1 – VA=1f8000, VASize=3c0, FileAddr=1f7000, FileSize=1000
section 2 – VA=1fa000, VASize=c, FileAddr=1f8000, FileSize=1000

12. The culprit

SqlCeDataReader reader = command.ExecuteReader(CommandBehavior.CloseConnection);
Basically, at some point before calling ExecuteResultSet, the above line was getting executed on the same connection object with CommandBehavior = CloseConnection. This was disposing the connection object that’s why depending on memory pressure and finalizer queue, memory access violation was thrown randomly.

You may get disposed exception or other managed exception but in this case the timing was such that connection object always ended up in getting finalized while query is getting executed in native sql ce dll.