Artisan.Orm or How To Reinvent the Wheel


This article is a continuation and development of the theme raised in the article:

The next article:

Introduction

Have you read the article “Don’t write your own ORM”?

I have. And I was very depressed by the end of the article.

But not for long, only until the first comment on it:

Coders: if you want to write an ORM please do so!

Nicholas de Lioncourt, you are my inspirer and hero!

So I did, I wrote my own ORM! The wheel was reinvented again!

Of course, it is not a full-featured, sophisticated ORM like EF; it is just a tiny micro-ORM which only works with SQL Server and has basic functionality for storing and reading data. But at least it does what is expected and does it well!

This article is about why I did that and how.

Why?

Mostly because I have not found an elegant way to do what I want with the existing ORM frameworks. I am aware that I may have overlooked something.

My main requirement of an ORM is performance. I have heard the opinion that the speed of development is more important than a slight drop in system performance. And I have seen the consequences: “Our system has matured and become slow”. So I cannot relax until everything is done to ensure the best performance. Let it be a little more work for my fingers, but I will be able to sleep well afterwards.

Second, neither the database-first approach nor code-first is good enough: one side usually suffers. The best solution is to leave the application domain model to the OOP world and the database to the relational world, and let the ORM eliminate the difference.

Third, I still believe that an application should work with a database through stored procedures. Yes, I know that we are at the end of 2016, but old school is cool! There are many reasons for that: performance, security, maintainability. Note that I did not say a word about keeping business logic in stored procedures. One more reason is that a stored procedure is an efficient way to save and retrieve object graphs at once.

Fourth, in continuation of the previous paragraph, I believe that CRUD commands written as SQL text in application code are evil. Not absolute evil, of course, but at least a terrible sin. The right place to query a database is a repository. And even there, call a stored procedure! That is enough for my auto-da-fé, I guess.

Fifth, existing ORMs usually assume that the application domain model and the database are developed synchronously from the beginning, so that objects map to tables and property names and types correspond to column names and types. But in the world where I live, it is not always so. Rather often, I find myself in a situation where the application domain model is already created, it has a complex OO structure, and it is necessary to build persistent storage for it. Or another case: a new UI is being created for a very old legacy database. In such cases, I would like to have an instrument which helps me integrate the two different parts with less effort.

The above assertions are my point of view and of course not the ultimate truth.

You are free to have another opinion, even the opposite one! 🙂

To make a long story short, here are my initial targets:

  • The best performance
  • Full control over data transformation in mappers
  • Preferably the repository pattern
  • Database interaction mostly through stored procedures
  • Convenient multiple result set reading
  • Easy table-valued parameter creation
  • Overall code reduction

The Way to My Own ORM

So I started from the beginning, I mean from pure ADO.NET.

For example, here is how to get a User by Id with ADO.NET:

public static User GetUserById(int id)
{
    using (var connection = new SqlConnection(connectionString))
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandType = CommandType.Text;
        cmd.CommandText = "select Id, Login, Name, Email from dbo.Users where Id = @Id";

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@Id",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Int,
            Value = id
        });

        User user = null;

        connection.Open();

        using (var dr = cmd.ExecuteReader())
        {
            if (dr.Read())
            {
                user = new User
                {
                    Id = (int)dr["Id"],
                    Login = (string)dr["Login"],
                    Name = (string)dr["Name"],
                    Email = (string)dr["Email"]
                };
            }
        }

        connection.Close();

        return user;
    }
}

That is a lot of code for such a simple example. Imagine what it would take to save a whole object graph!

Data-to-object Mapping

Almost all of this code can be refactored into extension methods on the SqlConnection, SqlCommand and SqlDataReader classes, except the part inside the using..ExecuteReader block, I mean this one:

using (var dr = cmd.ExecuteReader())
{
    if (dr.Read())
    {
        user = new User
        {
            Id = (int)dr["Id"],
            Login = (string)dr["Login"],
            Name = (string)dr["Name"],
            Email = (string)dr["Email"]
        };
    }
}

What if we refactor it a little and extract this Func delegate:

Func<SqlDataReader, User> createUser = (dr) =>
{
    return new User
    {
        Id = (int)dr["Id"],
        Login = (string)dr["Login"],
        Name = (string)dr["Name"],
        Email = (string)dr["Email"]
    };
};

Then the simplified code for the using..ExecuteReader block turns into:

using (var dr = cmd.ExecuteReader())
{
    if (dr.Read())
        user = createUser(dr);
}

Why is this part important? Because ORMs mostly differ in how they create and use that Func<SqlDataReader, T> delegate to create the <T> object.

The article “The Tale of Three Monkeys and A Wolf” demonstrates three main approaches to this task.

Dapper, for example, generates that Func delegate with DynamicMethod and keeps it in a cache. This gives the same performance as if the Func delegate were handwritten. It is the best solution for auto-mapping, I think.

But I am paranoid about black boxes for such critical things as data. And I want to have more power and control over how the data is transformed.

Therefore, for my ORM, I prefer a handwritten Func delegate. That means I prepare and keep the code for data-to-object transformation somewhere, whereas other ORMs autogenerate it.

Also, for SqlDataReader, I prefer ordinal access to access by column name:

Func<SqlDataReader, User> createUser = (dr) =>
{
    var i = 0;

    return new User
    {
        Id = dr.GetInt32(i),
        Login = dr.GetString(++i),
        Name = dr.GetString(++i),
        Email = dr.GetString(++i)
    };
};

Why is ordinal access better? Because it is faster, and because in that case the column names don’t even matter.

Of course, there is always a risk of mixing up the order, but to be on the safe side, you can create, for example, a view:

create view dbo.vwUsers
as
select
    Id,
    [Login],
    Name,
    Email
from
    dbo.Users;

and always use select * from dbo.vwUsers when fetching a User.
Yes, I know that select * is bad practice, but we are here to break the rules, dude.

Object-to-data Mapping

Besides the data-to-object mapping direction, I need the reverse one: object-to-data.

The object graph saving technique requires creating a user-defined table type for each application domain type that is to be persisted.

So I’ve got two more mapping delegates for a domain type: one to create a DataTable and another to transform an object into a DataRow.

Static Mapper Class

Where should those Func delegates be kept? My solution is to make a static class with static methods and decorate the class with a custom MapperFor attribute, like this:

[MapperFor(typeof(User))]
public static class UserMapper
{
    public static User CreateObject(SqlDataReader dr)
    {
        var i = 0;

        return new User
        {
            Id = dr.GetInt32(i),
            Login = dr.GetString(++i),
            Name = dr.GetString(++i),
            Email = dr.GetString(++i)
        };
    }

    public static DataTable CreateDataTable()
    {
        return new DataTable("UserTableType")
            .AddColumn<Int32>("Id")
            .AddColumn<String>("Login")
            .AddColumn<String>("Name")
            .AddColumn<String>("Email");
    }

    public static object[] CreateDataRow(User obj)
    {
        return new object[]
        {
            obj.Id,
            obj.Login,
            obj.Name,
            obj.Email
        };
    }
}

When the application starts, a special MappingManager iterates over the MapperFor attributes, converts the static methods to Func delegates and caches them in dictionaries keyed by the mapped type, for instance a Dictionary<Type, Delegate> where each value is a Func<SqlDataReader, T>.
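A minimal sketch of what such a mapping manager could look like. The attribute shape and cache layout here are my illustrative assumptions, not the actual Artisan.Orm source:

```csharp
using System;
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class)]
public class MapperForAttribute : Attribute
{
    public Type MappedType { get; }
    public MapperForAttribute(Type mappedType) { MappedType = mappedType; }
}

public static class MappingManager
{
    // Cached CreateObject delegates, keyed by the mapped domain type
    private static readonly ConcurrentDictionary<Type, Delegate> CreateObjectFuncs =
        new ConcurrentDictionary<Type, Delegate>();

    public static void Initialize(Assembly assembly)
    {
        foreach (var mapperClass in assembly.GetTypes())
        {
            var attr = mapperClass.GetCustomAttribute<MapperForAttribute>();
            if (attr == null) continue;

            var method = mapperClass.GetMethod("CreateObject",
                BindingFlags.Public | BindingFlags.Static);
            if (method == null) continue;

            // Convert the static method into a Func<SqlDataReader, T> delegate
            var funcType = typeof(Func<,>)
                .MakeGenericType(typeof(SqlDataReader), attr.MappedType);
            CreateObjectFuncs[attr.MappedType] = method.CreateDelegate(funcType);
        }
    }

    // Returns the cached mapper for T, e.g. Func<SqlDataReader, User>
    public static Func<SqlDataReader, T> GetCreateObjectFunc<T>() =>
        (Func<SqlDataReader, T>)CreateObjectFuncs[typeof(T)];
}
```

The one-time reflection cost is paid at startup; after that, every lookup is a dictionary access plus a delegate invocation.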

Extension Methods

Once the application has this cache of Func delegates, retrievable by type, all the routine SqlCommand and SqlDataReader execution calls can be refactored into generic extension methods, for example:

cmd.ReadTo<User>();

cmd.ReadToList<User>();

cmd.AddTableParam("@Records", records);

dr.ReadTo<string>()

dr.ReadToArray<int>()

dr.ReadToDictionary<int, User>()
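For illustration, one of these extension methods could be sketched like this. The names and the delegate parameter are mine; in the library, a parameterless overload would fetch the mapper from the MappingManager cache instead:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class SqlCommandExtensions
{
    // Reads all rows of the first result set into a list,
    // applying the given (or cached) Func<SqlDataReader, T> mapper to each row
    public static IList<T> ReadToList<T>(this SqlCommand cmd,
        Func<SqlDataReader, T> createObject)
    {
        var list = new List<T>();

        using (var dr = cmd.ExecuteReader())
        {
            while (dr.Read())
                list.Add(createObject(dr));
        }

        return list;
    }
}
```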

Auto-mapping

When the property and column names, their quantity and their types match between an object and a database query, it is safe to use auto-mapping and not write a static mapper. My ORM has several extension methods for those cases:

cmd.ReadAs<User>();

cmd.ReadAsList<User>();

dr.ReadAsArray<Record>()

The ReadAs methods use auto-mapping based on expression tree compilation:

  • When SqlDataReader reads the first row of a result set, it calls the expression tree builder.
  • The expression tree builder looks up the column names in SqlDataReader and matches them to the target object’s properties.
  • The expression tree is then compiled into a Func<SqlDataReader, T> mapping delegate.
  • This mapping delegate is cached in a dictionary with a composite string key.
  • The composite key consists of the SqlCommand text and the full name of the target object type.
  • Therefore, every subsequent SqlDataReader read uses the compiled and cached mapping delegate.

So auto-mapping performance is almost the same as the performance of handwritten mappers.
Dammit, I have become a creator of a black box. Am I slipping down to the dark side?
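The expression tree technique outlined above can be sketched as follows. This is a simplified version under my own assumptions: it ignores DBNull and nullable types and requires a parameterless constructor, which a real implementation must handle:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq.Expressions;
using System.Reflection;

public static class AutoMapper
{
    // Builds Func<SqlDataReader, T>: matches reader columns to T's
    // writable properties by name and compiles the member bindings
    public static Func<SqlDataReader, T> Build<T>(SqlDataReader dr)
    {
        var reader = Expression.Parameter(typeof(SqlDataReader), "dr");
        var bindings = new List<MemberBinding>();

        for (var i = 0; i < dr.FieldCount; i++)
        {
            var prop = typeof(T).GetProperty(dr.GetName(i),
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);
            if (prop == null || !prop.CanWrite) continue;

            // dr.GetFieldValue<TProp>(i) performs the type-specific read
            var getValue = typeof(SqlDataReader)
                .GetMethod(nameof(SqlDataReader.GetFieldValue))
                .MakeGenericMethod(prop.PropertyType);

            bindings.Add(Expression.Bind(prop,
                Expression.Call(reader, getValue, Expression.Constant(i))));
        }

        // new T { Prop1 = dr.GetFieldValue<...>(0), Prop2 = ... }
        var body = Expression.MemberInit(Expression.New(typeof(T)), bindings);
        return Expression.Lambda<Func<SqlDataReader, T>>(body, reader).Compile();
    }
}
```

Compilation happens once per command/type pair; the compiled delegate is then as fast as a handwritten one.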

Read-Methods

The most frequent interaction with a database is data fetching. So the collection of mapper-based and auto-mapping Read-methods became the main part of my ORM:

Read-method Description
ReadTo<T> to read a single value or object using an existing mapper
ReadToAsync<T> to read a single value or object asynchronously using an existing mapper
ReadAs<T> to read a single object using automapping
ReadAsAsync<T> to read a single object asynchronously using automapping
ReadToList<T> to read a list of values or objects using an existing mapper
ReadToListAsync<T> to read a list of values or objects asynchronously using an existing mapper
ReadAsList<T> to read a list of objects using automapping
ReadAsListAsync<T> to read a list of objects asynchronously using automapping
ReadToArray<T> to read an array of values or objects using an existing mapper
ReadToArrayAsync<T> to read an array of values or objects asynchronously using an existing mapper
ReadAsArray<T> to read an array of objects using automapping
ReadAsArrayAsync<T> to read an array of objects asynchronously using automapping
ReadToObjectRow<T> to read a single ObjectRow using an existing mapper
ReadToObjectRowAsync<T> to read a single ObjectRow asynchronously using an existing mapper
ReadAsObjectRows to read ObjectRows using automapping
ReadAsObjectRowsAsync to read ObjectRows asynchronously using automapping
ReadToDictionary<TKey, TValue> to read a dictionary of objects with first column as a key using an existing mapper
ReadToDictionaryAsync<TKey, TValue> to read a dictionary of objects with first column as a key asynchronously using an existing mapper
ReadAsDictionary<TKey, TValue> to read a dictionary of objects with first column as a key using automapping
ReadAsDictionaryAsync<TKey, TValue> to read a dictionary of objects with first column as a key asynchronously using automapping
ReadToEnumerable<T> to read an IEnumerable of objects using an existing mapper (sync method only)
ReadAsEnumerable<T> to read an IEnumerable of objects using automapping (sync method only)
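As an illustration of the dictionary-reading variants, a first-column-as-key reader could be sketched like this (my own simplified signature, not the library's verbatim code):

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class SqlDataReaderExtensions
{
    // Reads a result set into a dictionary: the first column becomes the key,
    // the mapper builds the value object from the whole row
    public static Dictionary<TKey, TValue> ReadToDictionary<TKey, TValue>(
        this SqlDataReader dr, Func<SqlDataReader, TValue> createValue)
    {
        var dictionary = new Dictionary<TKey, TValue>();

        while (dr.Read())
            dictionary.Add((TKey)dr.GetValue(0), createValue(dr));

        return dictionary;
    }
}
```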

Read more about:

Inline Mappers

All of the above ReadTo extension methods accept a Func<SqlDataReader, T> as a parameter, so besides the separate mappers and auto-mapping, these functions can be used with so-called inline mappers:

var user = cmd.ReadTo(dr => new User
{
    Id = dr.GetInt32(0),
    Login = dr.GetString(1),
    Name = dr.GetString(2),
    Email = dr.GetString(3)
});

Repository Base

Because I decided to use the repository pattern, I created a base class for all custom repositories. This RepositoryBase class takes responsibility for the connection, for command creation, and for the transaction, if one is required. One repository can have one connection and one transaction at a time, and can create many commands.

In order to encapsulate all the logic of disposing the expensive resources (closing the connection and disposing the command), the RepositoryBase class has methods which create a SqlCommand and pass it as an argument to a Func or Action parameter:

public T GetByCommand<T>(Func<SqlCommand, T> func)

public void RunCommand(Action<SqlCommand> action)

public int ExecuteCommand(Action<SqlCommand> action)
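The core of such a base class could be sketched as follows. This is my simplified version (no transaction handling), assuming the repository owns one connection:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public abstract class RepositoryBase : IDisposable
{
    private readonly SqlConnection _connection;

    protected RepositoryBase(string connectionString)
    {
        _connection = new SqlConnection(connectionString);
    }

    // Creates a command, hands it to the caller's func, and guarantees
    // that the command is disposed and the connection returned to its state
    public T GetByCommand<T>(Func<SqlCommand, T> func)
    {
        using (var cmd = _connection.CreateCommand())
        {
            var wasClosed = _connection.State == ConnectionState.Closed;
            if (wasClosed) _connection.Open();
            try
            {
                return func(cmd);
            }
            finally
            {
                if (wasClosed) _connection.Close();
            }
        }
    }

    public void Dispose() => _connection.Dispose();
}
```

A concrete repository then only fills in the command (procedure name, parameters) and the read logic.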

Repository Methods

Inheritance from the RepositoryBase class, the handwritten mappers, and the extension methods on SqlCommand and SqlDataReader make the repository methods more compact and the code more readable and self-explanatory.

Compare the ORM version with the plain ADO.NET version below to see the difference:

public User GetUserById(int id)
{
    return ReadTo<User>("dbo.GetUserById", new SqlParameter("Id", id));
}
public static User GetUserById(int id)
{
    using (var connection = new SqlConnection(connectionString))
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandText = "dbo.GetUserById";

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@UserId",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Int,
            Value = id
        });

        User user = null;

        connection.Open();

        using (var dr = cmd.ExecuteReader())
        {
            if (dr.Read())
            {
                user = new User
                {
                    Id = (int)dr["Id"],
                    Login = (string)dr["Login"],
                    Name = (string)dr["Name"],
                    Email = (string)dr["Email"]
                };
            }
        }

        connection.Close();

        return user;
    }
}

This is how to save a User and read it back:

public User SaveUser(User user)
{
    return GetByCommand(cmd =>
    {
        cmd.UseProcedure("dbo.SaveUser");

        cmd.AddTableRowParam("@User", user);

        return cmd.ReadTo<User>();
    });
}
public static User SaveUser(User user)
{
    using (var connection = new SqlConnection(connectionString))
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandText = "dbo.SaveUser";

        var userTable = new DataTable("UserTableType");

        userTable.Columns.Add("Id", typeof(Int32));
        userTable.Columns.Add("Login", typeof(String));
        userTable.Columns.Add("Name", typeof(String));
        userTable.Columns.Add("Email", typeof(String));

        userTable.Rows.Add(new object[] {
            user.Id,
            user.Login,
            user.Name,
            user.Email
        });

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@User",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Structured,
            TypeName = userTable.TableName,
            Value = userTable
        });

        User savedUser = null;

        connection.Open();

        using (var dr = cmd.ExecuteReader())
        {
            if (dr.Read())
            {
                savedUser = new User
                {
                    Id = (int)dr["Id"],
                    Login = (string)dr["Login"],
                    Name = (string)dr["Name"],
                    Email = (string)dr["Email"]
                };
            }
        }

        connection.Close();

        return savedUser;
    }
}

And here is the example from the article about graph saving, rewritten to use the ORM:

public IList<GrandRecord> SaveGrandRecords(IList<GrandRecord> grandRecords)
{
    var records = grandRecords.SelectMany(gr => gr.Records);
    var childRecords = records.SelectMany(r => r.ChildRecords);

    return GetByCommand(cmd =>
    {
        cmd.UseProcedure("dbo.SaveGrandRecords");

        cmd.AddTableParam("@GrandRecords", grandRecords);
        cmd.AddTableParam("@Records", records);
        cmd.AddTableParam("@ChildRecords", childRecords);

        return cmd.GetByReader(dr =>
        {
            var savedGrandRecords = dr.ReadToList<GrandRecord>();
            var savedRecords = dr.ReadToList<Record>();
            var savedChildRecords = dr.ReadToList<ChildRecord>();

            dr.Close();

            savedGrandRecords.MergeJoin(
                savedRecords,
                (gr, r) => gr.Id == r.GrandRecordId,
                (gr, r) => { r.GrandRecord = gr; gr.Records.Add(r); },

                savedChildRecords,
                (r, cr) => r.Id == cr.RecordId,
                (r, cr) => { cr.Record = r; r.ChildRecords.Add(cr); }
            );

            return savedGrandRecords;
        });
    });
}
public static IList<GrandRecord> SaveGrandRecords(IList<GrandRecord> grandRecords)
{
    var id = int.MinValue;

    foreach (var grandRecord in grandRecords)
    {
        if (grandRecord.Id == 0)
            grandRecord.Id = id++;

        foreach (var record in grandRecord.Records)
        {
            if (record.Id == 0)
                record.Id = id++;

            record.GrandRecordId = grandRecord.Id;

            foreach (var childRecord in record.ChildRecords)
            {
                if (childRecord.Id == 0)
                    childRecord.Id = id++;

                childRecord.RecordId = record.Id;
            }
        }
    }

    var connectionString =
        @"Data Source=.\SQLEXPRESS;Initial Catalog=ObjectGraphs;Integrated Security=True;";

    using (var connection = new SqlConnection(connectionString))
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandText = "dbo.SaveGrandRecords";


        var grandRecordTable = new DataTable("GrandRecordTableType");

        grandRecordTable.Columns.Add("Id", typeof(Int32));
        grandRecordTable.Columns.Add("Name", typeof(String));

        foreach (var grandRecord in grandRecords)
        {
            grandRecordTable.Rows.Add(new object[] {
                grandRecord.Id,
                grandRecord.Name
            });
        }

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@GrandRecords",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Structured,
            TypeName = grandRecordTable.TableName,
            Value = grandRecordTable
        });


        var recordTable = new DataTable("RecordTableType");

        recordTable.Columns.Add("Id", typeof(Int32));
        recordTable.Columns.Add("GrandRecordId", typeof(Int32));
        recordTable.Columns.Add("Name", typeof(String));

        var records = grandRecords.SelectMany(gr => gr.Records);

        foreach (var record in records)
        {
            recordTable.Rows.Add(new object[] {
                record.Id,
                record.GrandRecordId,
                record.Name
            });
        }

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@Records",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Structured,
            TypeName = recordTable.TableName,
            Value = recordTable
        });


        var childRecordTable = new DataTable("ChildRecordTableType");

        childRecordTable.Columns.Add("Id", typeof(Int32));
        childRecordTable.Columns.Add("RecordId", typeof(Int32));
        childRecordTable.Columns.Add("Name", typeof(String));

        var childRecords = records.SelectMany(r => r.ChildRecords);

        foreach (var childRecord in childRecords)
        {
            childRecordTable.Rows.Add(new object[] {
                childRecord.Id,
                childRecord.RecordId,
                childRecord.Name
            });
        }

        cmd.Parameters.Add(new SqlParameter
        {
            ParameterName = "@ChildRecords",
            Direction = ParameterDirection.Input,
            SqlDbType = SqlDbType.Structured,
            TypeName = childRecordTable.TableName,
            Value = childRecordTable
        });


        var savedGrandRecords = new List<GrandRecord>();
        var savedRecords = new List<Record>();
        var savedChildRecords = new List<ChildRecord>();

        connection.Open();

        using (var dr = cmd.ExecuteReader())
        {
            while (dr.Read())
            {
                savedGrandRecords.Add(new GrandRecord
                {
                    Id = dr.GetInt32(0),
                    Name = dr.GetString(1),
                    Records = new List<Record>()
                });
            }

            dr.NextResult();

            while (dr.Read())
            {
                savedRecords.Add(new Record
                {
                    Id = dr.GetInt32(0),
                    GrandRecordId = dr.GetInt32(1),
                    Name = dr.GetString(2),
                    ChildRecords = new List<ChildRecord>()
                });
            }

            dr.NextResult();

            while (dr.Read())
            {
                savedChildRecords.Add(new ChildRecord
                {
                    Id = dr.GetInt32(0),
                    RecordId = dr.GetInt32(1),
                    Name = dr.GetString(2)
                });
            }
        }

        connection.Close();

        var recordEnumerator = savedRecords.GetEnumerator();
        var savedRecord = recordEnumerator.MoveNext()
            ? recordEnumerator.Current
            : null;

        var childRecordEnumerator = savedChildRecords.GetEnumerator();
        var savedChildRecord = childRecordEnumerator.MoveNext()
            ? childRecordEnumerator.Current
            : null;


        foreach (var savedGrandRecord in savedGrandRecords)
        {
            while (savedRecord != null && savedRecord.GrandRecordId == savedGrandRecord.Id)
            {
                while (savedChildRecord != null && savedChildRecord.RecordId == savedRecord.Id)
                {
                    savedRecord.ChildRecords.Add(savedChildRecord);

                    savedChildRecord = childRecordEnumerator.MoveNext()
                        ? childRecordEnumerator.Current
                        : null;
                }

                savedGrandRecord.Records.Add(savedRecord);

                savedRecord = recordEnumerator.MoveNext()
                    ? recordEnumerator.Current
                    : null;
            }
        }

        return savedGrandRecords;
    }
}
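The MergeJoin helper used in the ORM version performs the same single-pass linking that the plain ADO.NET version does by hand. A simplified one-level sketch of the idea (the real helper also takes a second child set for three-level graphs, and both lists are assumed to be sorted by the join key, as they come from the stored procedure):

```csharp
using System;
using System.Collections.Generic;

public static class MergeJoinExtensions
{
    // Single-pass merge join over two lists sorted by the join key:
    // for each parent, attach the consecutive run of matching children
    public static IList<TParent> MergeJoin<TParent, TChild>(
        this IList<TParent> parents,
        IList<TChild> children,
        Func<TParent, TChild, bool> isMatch,
        Action<TParent, TChild> link)
    {
        var childIndex = 0;

        foreach (var parent in parents)
        {
            while (childIndex < children.Count && isMatch(parent, children[childIndex]))
            {
                link(parent, children[childIndex]);
                childIndex++;
            }
        }

        return parents;
    }
}
```

Because each list is walked exactly once, the join is O(n + m) instead of the O(n * m) of a naive nested lookup.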

About Artisan.Orm

The above approach has been tested in several projects and has proved its efficiency, so I decided to make it a separate library.

I named my project Artisan.Orm because it allows you to tune the DAL neatly and accurately.
Ford’s conveyor is cool, but Ferraris are assembled by hand! 🙂

If you are interested in the project, please visit Artisan.Orm GitHub page and its documentation wiki.


Artisan.Orm is also available as NuGet Package.


About the Source Code

The attached archive contains a copy of the solution from GitHub, created in Visual Studio 2015, which consists of three projects:

  • Artisan.Orm – a DLL project which contains the Artisan.Orm classes
  • Database – an SSDT project which creates a SQL Server 2016 database to provide data for testing
  • Tests – a test project with examples of the code use

In order to install the database and run the tests, change the connection strings in the Artisan.publish.xml and App.config files to your own.

History

  • 16th November, 2016

    • Initial publication
  • 23rd November, 2016
    • Added section about auto-mapping via expression trees
    • Source code updated version 1.0.5
  • 30th December, 2016
    • Added section about Read-methods
    • Source code updated to version 1.0.7
  • 31st January, 2017
    • Added ReadToDictionary and ReadAsDictionary extension methods
    • Source code updated to version 1.0.8
  • 13th June, 2017
    • Added paragraph about Inline Mappers
    • Source code updated to version 1.1.0
    • Added link to the article “Artisan Way of Data Reply”
