Starting Down In-Memory Lane – SQL Server 2014 In-Memory OLTP Beginnings Part 3

In Part 1 and Part 2 of this series I talked about how to set up memory optimized tables and use natively compiled stored procedures with SQL Server In-Memory OLTP. Now I want to talk about how you monitor and troubleshoot this feature, especially around memory usage and possible memory exhaustion.

Performance Monitor logging is built into Windows and can be very useful for monitoring SQL Server, as SQL Server exposes a number of counter objects to Perfmon. With In-Memory OLTP, SQL Server added a number of counters with XTP in the name.

XTP 

The following counter objects are present for each SQL Server 2014 instance on the machine:

XTP Cursors
XTP Garbage Collection
XTP Phantom Processor
XTP Storage
XTP Transaction Log
XTP Transactions

You can also query counters straight from SQL Server, like:

SELECT * FROM sys.dm_os_performance_counters WHERE object_name LIKE '%xtp%'

The results on my machine are as expected:

xtp2

 

Some more details on the new XTP counter objects:

  • XTP Cursors. The XTP Cursors performance object contains counters related to internal XTP engine cursors. Cursors are the low-level building blocks the XTP engine uses to process Transact-SQL queries. As such, you do not typically have direct control over them.
  • XTP Garbage Collection. The XTP Garbage Collection performance object contains counters related to the XTP engine’s garbage collector.
  • XTP Phantom Processor. The XTP Phantom Processor performance object contains counters related to the XTP engine’s phantom processing subsystem. This component is responsible for detecting phantom rows in transactions running at the SERIALIZABLE isolation level.
  • XTP Storage. The XTP Storage performance object contains counters related to XTP storage in SQL Server.
  • XTP Transaction Log. The XTP Transaction Log performance object contains counters related to XTP transaction logging in SQL Server.
  • XTP Transactions. The XTP Transactions performance object contains counters related to XTP engine transactions in SQL Server.

The In-Memory OLTP engine is fully integrated into the SQL Server memory manager, so new values were added to familiar SQL Server DMVs to reflect XTP memory usage. Here is an example using the good old memory clerks DMV:
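If you prefer a query to a screenshot, here is a minimal sketch that pulls just the XTP clerk. The filter on the MEMORYCLERK_XTP clerk type is my assumption based on the clerk name visible in the output, so adjust it if your build names it differently:

SELECT [type], name, memory_node_id, pages_kb
FROM sys.dm_os_memory_clerks
WHERE [type] = 'MEMORYCLERK_XTP';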

xtp3

The first row shows the memory allocated by system threads. The second row, named DB_ID_8, represents the consumers in that database's objects, and the third row, with memory node id 64, represents memory allocated to the DAC (Dedicated Admin Connection).

There are also newly introduced XTP DMVs to monitor In-Memory usage. These are documented in SQL BOL.

Some of the more interesting ones to me:

  • sys.dm_db_xtp_checkpoint_stats. This DMV returns information about In-Memory OLTP checkpoint operations in the current database. By using it you can explore the log statistics since SQL Server started up for databases that contain In-Memory OLTP tables: log blocks written, wait statistics, LSN information, and more. If this DMV is run against a database that doesn't contain any In-Memory OLTP tables, it returns no data.
  • sys.dm_db_xtp_checkpoint_files. This DMV shows information about In-Memory OLTP checkpoint files. With it you can see details about the DATA and DELTA files associated with your In-Memory tables, which is very useful for estimating the storage space allocated to your memory optimized tables or determining the number of inserted and deleted rows (see the sample query after this list).
  • sys.dm_db_xtp_memory_consumers. This DMV reports the amount of memory allocated and used per memory_consumer_id. With this view you can identify the amount of allocated versus used memory for a given database object.
  • sys.dm_db_xtp_table_memory_stats.

    This DMV shows you the amount of memory used by your In-Memory OLTP tables, exposing memory information for both user and system tables. By using it you can see the amount of space allocated and used for both row data and index data. Here is a sample query that returns the amount of space allocated and used for In-Memory OLTP objects in a database:

    SELECT OBJECT_NAME(object_id) AS ObjectName,
           object_id,
           SUM(memory_allocated_for_indexes_kb + memory_allocated_for_table_kb) AS memory_allocated_object_in_kb,
           SUM(memory_used_by_indexes_kb + memory_used_by_table_kb) AS memory_used_object_in_kb
    FROM sys.dm_db_xtp_table_memory_stats
    GROUP BY object_id;

  • sys.dm_xtp_gc_stats. This DMV provides information on In-Memory OLTP garbage collection. For example, running SELECT * FROM sys.dm_xtp_gc_stats gets you the number of rows examined by the garbage collection subsystem since your instance of SQL Server started, and more.
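As mentioned for sys.dm_db_xtp_checkpoint_files above, a rough storage-estimate sketch might look like the query below. Treat the column names as assumptions: they are the SQL Server 2014 BOL names and changed between CTP builds, so verify against your build before relying on it.

SELECT file_type_desc,
       state_desc,
       COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1048576 AS size_mb,
       SUM(file_size_used_in_bytes) / 1048576 AS used_mb
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY file_type_desc, state_desc;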

A number of DMVs have columns labeled "dusty corner...". So what is a "dusty corner"? I was wondering that as well until I found an answer on Bob Beauchemin's blog (he rocks as usual) – "The term has to do with how the In-Memory OLTP feature stores data in memory and how its garbage collection works. Because a single copy of a row's data is stored along with multiple index pointers (both hash and BwTree indexes are ultimately pointer-based), all of the index pointers must be "unlinked" before an old row can be garbage collected. Since threads can unlink old pointers while running queries, any index ranges with a lot of activity will quickly unlink the appropriate pointers. However, if an index or index range is rarely used, special scans by the system garbage collector will be needed to find these pointers. They're "hiding" in the dusty corners (apparently dust forms on unused index ranges; I envision Carol Burnett, duster in hand).

So the presence of dusty corner scans means some index ranges aren’t being used much. If, by looking at index usage, you can determine that an entire index is almost never being used (Database Engine Tuning Advisor doesn’t tune indexes for in-memory tables, that I’m aware of), that index would be a candidate for removal. However, in-memory tables don’t support filtered indexes so, if another part of the index range is frequently used, you’ll have to decide if it’s worth letting old versions hang around for longer. Until those dusty corner scans unlink that last pointer.”
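A quick way to check whether dusty corner scans are happening on your instance is to look for the related XTP counters. This is just a sketch; I believe the dusty corner counters live under the XTP Cursors, XTP Garbage Collection and XTP Phantom Processor objects, but the filter below will find them wherever they are:

SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE '%dusty%';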

Finally, SQL Server introduced a number of new Extended Events (xEvents) for In-Memory OLTP as well. Extended Events is a highly scalable, configurable, and lightweight event infrastructure that uses very few performance resources, and as you may know it is destined to replace SQL Server Profiler, which has been deprecated since SQL Server 2012. To begin with, there are three new packages, all belonging to the new In-Memory OLTP engine (XTP), split across three DLLs:

  • XtpRuntime – Extended events for the XTP runtime – module {InstanceDir}\MSSQL\Binn\hkruntime.dll
  • XtpEngine – Extended events for the XTP engine – module {InstanceDir}\MSSQL\Binn\hkengine.dll
  • XtpCompile – Extended events for the XTP compiler – module {InstanceDir}\MSSQL\Binn\hkcompile.dll
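You can enumerate the individual events these packages expose straight from the Extended Events DMVs; here is a minimal sketch (package names per the table above):

SELECT p.name AS package_name,
       o.name AS event_name,
       o.description
FROM sys.dm_xe_packages AS p
JOIN sys.dm_xe_objects AS o
    ON o.package_guid = p.guid
WHERE o.object_type = 'event'
  AND p.name LIKE 'Xtp%'
ORDER BY p.name, o.name;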

The number of Extended Events increased from 625 in SQL Server 2012 (SP1) to 870 in SQL Server 2014 (RTM).   For more on xEvents see SQL BOL.

Starting Down In-Memory Lane – SQL Server 2014 In-Memory OLTP Beginnings Part 2

Continuing from my first post on this subject, where we looked at creating memory optimized tables and querying them with T-SQL, let's look at creating natively compiled stored procedures. So what are these new natively compiled stored procedures? Stored procedures that are marked with NATIVE_COMPILATION are natively compiled, which means the Transact-SQL statements in the procedure are all compiled to native code for efficient execution of performance-critical business logic. These are destined for heavy OLTP operations, such as heavy inserts and updates, rather than reporting/DSS style queries, which require joins and heavy aggregations.

These stored procedures contain what is known as an ATOMIC block, a new concept in SQL Server 2014 that can be used only with native procs. The basic thing an ATOMIC block gives you is that all statements within the block either succeed or fail atomically, which means you don't have to explicitly do BEGIN TRAN…COMMIT…ROLLBACK inside natively compiled procs.

As with all transactions that touch memory-optimized tables, with natively compiled stored procedures you need to consider retry logic to deal with potential failures such as write conflicts (error 41302) or dependency failures (error 41301). You can implement this logic in two ways:

  • On the client side. The client application catches the error thrown by SQL Server and retries the call based on the error codes above.
  • On the server side, using a “wrapper stored procedure”. The client would call an interpreted T-SQL stored procedure that calls the native proc. The wrapper procedure then uses try/catch logic to catch the error and retry the procedure call if needed. You do need to consider in this case the possibility that results are returned to the client before the failure, and the client would not know to discard them. Therefore, to be safe, it is best to use this method only with native procs that do not return any result sets to the client. In addition, consider that using a wrapper stored procedure does impact performance. For optimal performance, use client-side retries.

For more on retry logic guidelines see SQL BOL article
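To make the server-side option concrete, here is a minimal sketch of such a wrapper. The procedure name and retry count are made up for illustration, and it assumes the dbo.InsertOrderItem native proc created below; it returns no result sets, per the guidance above.

-- Hypothetical interpreted T-SQL wrapper that retries the native proc on transient errors
CREATE PROCEDURE dbo.usp_InsertOrderItemWithRetry
(@OrderItemID int, @ItemId int, @OrderId int)
AS
BEGIN
    DECLARE @retries int = 10;
    WHILE @retries > 0
    BEGIN
        BEGIN TRY
            EXEC dbo.InsertOrderItem @OrderItemID, @ItemId, @OrderId;
            RETURN; -- success, no retry needed
        END TRY
        BEGIN CATCH
            -- Retry only on write conflict (41302) or dependency failure (41301)
            IF ERROR_NUMBER() IN (41301, 41302) AND @retries > 1
                SET @retries -= 1;  -- transient conflict: loop around and try again
            ELSE
                THROW;              -- anything else, or retries exhausted, goes to the caller
        END CATCH
    END
END
GO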

So the simplest example based on our test memory optimized tables is here:

-- Drop stored procedure if it already exists
IF OBJECT_ID('dbo.InsertOrderItem') IS NOT NULL
DROP PROCEDURE dbo.InsertOrderItem
GO

CREATE PROCEDURE dbo.InsertOrderItem
(@OrderItemID int,@ItemId int, @OrderId int)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
INSERT INTO dbo.OrderItems (Order_Item_Id, Item_Id, Order_Id) VALUES (@OrderItemID, @ItemId, @OrderId)
END

Some interesting syntax items to note:

  • SCHEMABINDING.

    A natively compiled stored procedure must be bound to the schema of the objects it references, so tables referenced by the procedure cannot be dropped.

  • BEGIN ATOMIC. That is the atomic block I referenced above.

    This ensures that if the stored procedure is invoked outside an active transaction context, it will start a new transaction. An ATOMIC block has two required options: TRANSACTION ISOLATION LEVEL and LANGUAGE.

  • TRANSACTION ISOLATION LEVEL. I will not go into SQL Server isolation levels here, but you can see more in the SQL BOL article. Also see my previous post on some findings with isolation levels with WebSphere AS and JDBC.

Avoid implicit conversions by matching the types of parameters passed in with the parameter types in the procedure and the memory optimized table schema. The EXISTS clause is one of the T-SQL constructs that is not supported in natively compiled procedures, but you can simulate it like this:

DECLARE @exists BIT = 0
SELECT TOP 1 @exists = 1 FROM MyTable WHERE …
IF @exists = 1

In the next post I will try to move into monitoring and troubleshooting in-Memory OLTP.

Starting Down In-Memory Lane – SQL Server 2014 In-Memory OLTP Beginnings Part 1

Starting from the announcement some time ago and the related blog post by David Campbell – The coming in-memory database tipping point – I was really excited to get my hands on the SQL Server 2014 CTP and experiment with the new In-Memory OLTP (aka Hekaton) feature. I had spoken about it at a high level to a number of customers, but not long ago, after downloading SQL 2014 CTP2, I started actually playing with it. So how do you create memory optimized tables and query them?
First let's touch on what memory optimized tables are in SQL 2014:
Memory-optimized tables are fully ACID-compliant data storage structures that store their data mainly within the system's memory, allowing rapid access to that data in a transactional (OLTP) environment; users can read and write to this data structure without running into transactional consistency issues.
This feature has been achieved by SQL Server 2014 through taking the following approach:

  • In-Memory Row-Versioning: Memory-optimized tables maintain multiple versions of the same row dispersed over the table's in-memory data structure, which allows rows to be read and written concurrently without transactional inconsistency issues. That being said, and as with the SNAPSHOT isolation level, row-versioning handles conflicts by throwing an error to the application, expecting the application to implement some sort of retry logic for failed transactions.
  • Data Durability is guaranteed through transaction logging
  • Checkpoint and Delta files: placed on a special FileGroup for Memory-Optimized storage, these files are written to sequentially (and so optimised for spinning-disk), and are used mainly during SQL Server start-up to load in the Memory-Optimized tables in a database.
  • More on architecture on In-Memory OLTP can be found here as well – Architectural Overview of SQL Server 2014’s In-Memory OLTP Technology

So let's fire up SQL 2014 SSMS and create a database called Orders. In order to create memory optimized tables in that database we will have to add a memory optimized filegroup to the database and a file to that filegroup, essentially enabling In-Memory OLTP in that database; then we can create memory optimized tables, insert some data, and query them with T-SQL. In the upcoming Part 2 of this post I will create a simple natively compiled stored procedure as well and get data that way.

  • Create Database and enable In-Memory OLTP:

    --CREATE DATABASE
    create database orders
    go
    --ADD MEMORY OPTIMIZED FILEGROUP
    alter database orders
    add filegroup orders_in_memory_optimized_filegroup contains memory_optimized_data
    go
    --ADD FILE TO THAT FILEGROUP
    alter database orders
    add file (name ='orders_in_memory_oltp_1', filename ='C:\Program Files\Microsoft SQL Server\MSSQL12.GENNADYKSQL2K14\MSSQL\DATA\Orders_InMemory_OLTP_1')
    To filegroup orders_in_memory_optimized_filegroup;

    go

    --SET THAT FILEGROUP TO DEFAULT
    IF NOT EXISTS (SELECT name FROM sys.filegroups
    WHERE is_default=1 AND name = N'orders_in_memory_optimized_filegroup')
    ALTER DATABASE orders MODIFY FILEGROUP orders_in_memory_optimized_filegroup DEFAULT

    go

  • Now let's create some memory optimized tables:

    create table Orders
    (
    Order_Id int not null primary key nonclustered hash with (bucket_count =20000) ,
    OrderDate_Time datetime2 not null

    )
    with (memory_optimized=on, durability =schema_and_data)

    GO

    create table Items
    (
    Item_Id int not null primary key nonclustered hash with (bucket_count =20000) ,
    ItemDesc varchar (200) not null

    )
    with (memory_optimized=on, durability =schema_and_data)

    GO

    create table OrderItems

    (
    Order_Item_Id int not null primary key nonclustered hash with (bucket_count =20000),
    Item_Id int not null ,
    Order_Id int not null

    )
    with (memory_optimized=on, durability =schema_and_data)
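Before moving on, a quick sanity check (just a sketch) confirms the tables really were created as memory optimized and with the expected durability:

SELECT name, is_memory_optimized, durability_desc
FROM sys.tables
WHERE is_memory_optimized = 1;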

A couple of interesting items above are worth stopping and looking at more deeply.

  • Note the durability option I used creating these tables – SCHEMA_AND_DATA. This is actually the default setting. It ensures that data is recovered into the memory-optimized table when SQL Server is restarted or recovers from a crash. If I had picked the other setting – SCHEMA_ONLY – the memory-optimized table would be truncated if/when SQL Server restarts or recovers from a crash; the table would be re-created as a blank table at the end of the restart/recovery operation. Alas, my data in the table wouldn't persist.
  • That new, weird bucket_count index parameter. The bucket_count parameter on a nonclustered hash index dictates the size of the hash table allocated for the index. In the real world, unlike my demo, there is a case for careful capacity planning here: a bucket count that is too high leads to larger memory utilization and longer scans, while one that is too low leads to performance degradation on lookups and inserts. Microsoft recommends the bucket_count be about twice the maximum number of unique index keys (see the DMV query after this list for a way to sanity-check your choice).
  • Indexing. Only 8 indexes are allowed on a memory optimized table, and they cannot be altered; they need to be dropped and recreated. All indexes are covering indexes, i.e. they include all columns of the table.
  • Constraints. A primary key is a hard requirement for a memory optimized table. No foreign key constraints are supported.
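As referenced in the bucket_count item above, once data is loaded you can sanity-check your bucket counts with the hash index stats DMV; a minimal sketch is below. Roughly speaking, lots of empty buckets suggest the count is too high, while long average chains suggest it is too low.

SELECT OBJECT_NAME(his.object_id) AS table_name,
       his.total_bucket_count,
       his.empty_bucket_count,
       his.avg_chain_length,
       his.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS his;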

Now let's load some data into our tables:

insert Orders values (1,GETUTCDATE())
insert Orders values (2,GETUTCDATE())
insert Orders values (3,GETUTCDATE())
insert Orders values (4,GETUTCDATE())

go

insert Items values (1, 'Bike')
insert Items values (2,'Chair' )

go

insert OrderItems values (1,1,2)
insert orderitems values (2,2,1)
insert orderitems values (3,1,3)
insert orderitems values (4,2,4)

go

Now let's get some data from these tables via T-SQL; for example, let's get all of the orders for chairs:

select o.order_id, o.OrderDate_Time from orders o
inner join OrderItems oi on o.Order_Id=oi.Order_Id
inner join Items i on oi.Item_Id= i.item_id
where i.ItemDesc ='Chair'

Here is the easy result as expected:

chair_query

And execution plan:

chair_exec_plan

There is also a new report added to SSMS that does a great job of reporting memory usage by memory optimized objects from the DMVs. Note that the tables we created are there:

memory_usage

Next in part 2 and 3 I plan to move into natively compiled procs, In-Memory OLTP DMVs, perfmon counters and troubleshooting.

A Cursor By Any Other Name

Recently I again had to give a customer DBA team advice on the best-performing JDBC settings for data access from SQL Server via the Microsoft JDBC Driver. Unfortunately, the majority of "SQL people", after looking at wait stats and traces from SQL Server and seeing some old-fashioned data access cursors with commands like sp_cursorprepare, sp_cursorfetch, etc., just refuse to go deeper there. As for the majority of Java folks, they are unfortunately fairly oblivious to database performance, working on the front end and happy just to get the data they need from the database.
So when they run into performance issues they tend to blame the driver first. Oh, if we could only change that JDBC driver…
My statement to that is always:

The driver is just one part of the stack. A number of options are controlled from WebSphere, Oracle TopLink settings, etc. In my quick drawing below you can see that it is the application that controls driver properties and the way the driver fetches data, not vice versa:

driver

JDBC Cursor Types and SQL Server
SQL Server supports a number of cursor types; the full list is available in "Understanding Cursor Types".
Two of these perform best (TYPE_SS_DIRECT_FORWARD_ONLY and TYPE_SS_SERVER_CURSOR_FORWARD_ONLY), especially TYPE_SS_DIRECT_FORWARD_ONLY, which corresponds to a SQL Server forward-only/read-only cursor. The cursor type is one of the fields on the SQLServerResultSet class, so if I wanted to use the TYPE_SS_DIRECT_FORWARD_ONLY cursor:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

import com.microsoft.sqlserver.jdbc.SQLServerResultSet;

public static void executeStatement(Connection con) {
    try {
        String SQL = "SELECT TOP 10 * FROM Person.Contact";
        // Direct forward-only, read-only cursor - the fastest way to fetch rows
        Statement stmt = con.createStatement(
                SQLServerResultSet.TYPE_SS_DIRECT_FORWARD_ONLY,
                SQLServerResultSet.CONCUR_READ_ONLY);

        ResultSet rs = stmt.executeQuery(SQL);

        while (rs.next()) {
            System.out.println(rs.getString(4) + " " + rs.getString(6));
        }
        rs.close();
        stmt.close();
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}

Now, the above forward-only, read-only cursor isn't updateable, as you understand, but it is the fastest at fetching data. As far as updatable cursors go, using server-side cursors typically can improve performance, but server-side cursors cannot be used with scroll-insensitive result sets or with scroll-sensitive result sets that are not generated from a database table that contains a primary key. To use server-side cursors, set the UseServerSideUpdatableCursors property to true.
InsensitiveResultSetBufferSize
To improve performance when using scroll-insensitive result sets, the driver can cache the result set data in memory instead of writing it to disk. By default, the driver caches 2 MB of insensitive result set data in memory and writes any remaining result set data to disk. Performance can be improved by increasing the amount of memory used by the driver before writing data to disk or by forcing the driver to never write insensitive result set data to disk. The maximum cache size setting is 2 GB.

Set sendStringParametersAsUnicode to false when querying against VARCHAR data types in SQL Server to eliminate implicit conversions. Most JDBC drivers pass string parameters to SQL Server as Unicode by default. The problem is that if the Unicode parameters reference VARCHAR key columns in an index, the SQL Server engine will not use the appropriate index for query resolution, thereby causing unwanted table scans.

Use the PreparedStatement or CallableStatement interface rather than the Statement interface in JDBC. You will generally get better performance executing prepared statements or stored procedures through these interfaces than through Statement. A PreparedStatement query is compiled and placed in the SQL Server procedure cache; in that way it is very similar to the performance boost and cache reuse you get with parameterized sp_executesql calls versus standard EXEC in SQL Server.
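To illustrate the SQL Server side of that analogy, here is a sketch of a parameterized sp_executesql call whose plan can be cached and reused across parameter values (the Person.Contact table and LastName column simply reuse the AdventureWorks-style names from the JDBC example above):

EXEC sp_executesql
     N'SELECT TOP 10 * FROM Person.Contact WHERE LastName = @LastName',
     N'@LastName nvarchar(50)',
     @LastName = N'Smith';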

Hey, What Happened to My Isolation Level?

With one of my customers, who does a lot of J2EE on IBM WebSphere AS with SQL Server, I observed an interesting phenomenon while troubleshooting excessive blocking.

First, let's remind everyone about SQL Server isolation levels. SQL Server supports six isolation levels: READ COMMITTED (the default), READ UNCOMMITTED, REPEATABLE READ, SNAPSHOT, SERIALIZABLE and READ COMMITTED SNAPSHOT. As you may understand, using the correct isolation level has a big impact on performance and concurrency, because:
1. It controls whether locks are taken, what kinds of locks, and how long they are held.
2. It controls whether a transaction allows "dirty reads" (uncommitted modifications by another transaction), reads only committed changes, or even places an exclusive lock and blocks until the row is freed.
More on this is in SQL BOL –
The default isolation level in SQL Server is READ COMMITTED, which means that a query retrieves the committed version of the row that existed at the time the statement or transaction started.

As the customer reported database blocking problems, I fired up my old trusty blocking DMV query and noted that, for some odd reason, despite the database's default isolation level being READ COMMITTED, my head blocker was running under REPEATABLE READ.
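For reference, here is a sketch of how you can check the isolation level each session is running under from the DMVs (not the exact query behind the screenshot below, but it gets you the same answer):

SELECT s.session_id,
       s.host_name,
       s.program_name,
       CASE s.transaction_isolation_level
            WHEN 0 THEN 'Unspecified'
            WHEN 1 THEN 'ReadUncommitted'
            WHEN 2 THEN 'ReadCommitted'
            WHEN 3 THEN 'RepeatableRead'
            WHEN 4 THEN 'Serializable'
            WHEN 5 THEN 'Snapshot'
       END AS isolation_level,
       r.blocking_session_id
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_exec_requests AS r
    ON r.session_id = s.session_id
WHERE s.is_user_process = 1;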

iso2

As per SQL Server BOL, READ COMMITTED – "Specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data." Versus REPEATABLE READ – "Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes. Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. If the current transaction then retries the statement it will retrieve the new rows, which results in phantom reads. Because shared locks are held to the end of a transaction instead of being released at the end of each statement, concurrency is lower than the default READ COMMITTED isolation level. Use this option only when necessary."
Obviously REPEATABLE READ is a stricter isolation level and is therefore prone to more blocking.
So why? To get to the bottom of this I had to do some research on the WebSphere front, mainly into something called WebSphere access intent policies. As per IBM – "WebSphere Application Server access intent policies provide a consistent way of defining the isolation level for CMP bean data across the different relational databases in your environment. Within a deployed application, the combination of an access intent policy concurrency definition and access type signifies the isolation level value that Application Server sets on a database connection"
And it also looks like the default with SQL Server is REPEATABLE READ, as referenced in the IBM docs.

As soon as the customer changed the policy from the WebSphere default, wsPessimisticUpdateWeakestLockAtLoad, to wsOptimisticRead, which translates to READ COMMITTED in the SQL Server world, blocking in the system subsided dramatically. So, a note to all SQL DBAs troubleshooting blocking and concurrency performance issues from J2EE/IBM WebSphere AS: check the isolation level; don't just assume that because the database is set to READ COMMITTED by default, that's what your client connections are running under. If you see an unusually restrictive isolation level (REPEATABLE READ, SERIALIZABLE) that you don't expect, check with your development team on the access intent policy in WebSphere.