Code contracts – Where invariant method called?

I assume you already know code contracts and have heard about invariant methods. This post will demonstrate how simple code compiles with invariant method.
This simple class declares a method as invariant method:

public class InvariantExample
{
public void Method()
{
Console.WriteLine("In Method");
}

[
ContractInvariantMethod]
protected void InvariantMethod()
{
Contract.Invariant(2 > 1);
}
}

Using reflector we can see that Method compiles to:

public void Method()
{
Console.WriteLine("In Method");
this.$InvariantMethod$();
}

As we can see the compiled method ends with a call to our invariant method. Assuming we had more methods in our class we would have seen similar call in each one of them. The invariants is less explicit than most contracts features but it’s a powerful part of it, use wisely!

Easy Wins Optimizations – HashSet

HashSet is a data structure introduced in .Net 3.5 framework. The main advantage of HashSet is that it adds and fetches items in very low complexity (constant amortized time). We’ll try to see how much faster than List it can be. In the example we’ll go over code that stores all historical lottery results and checks if some results appear among the historical ones.

Registering results

public abstract class ResultsRegistrarBase
{
protected ICollection<long> codes;

public void AddResult(int num1, int num2, int num3, int num4, int num5, int num6)
{
long code = ConvertToCode(num1, num2, num3, num4, num5, num6);

codes.Add(code);
}

public bool ContainsResult(int num1, int num2, int num3, int num4, int num5, int num6)
{
long code = ConvertToCode(num1, num2, num3, num4, num5, num6);

return codes.Contains(code);
}

private static long ConvertToCode(int num1, int num2, int num3, int num4, int num5, int num6)
{
return num1 + (long) Math.Pow(10, 0) + num2 * (long) Math.Pow(10, 2) +
num3 * (
long) Math.Pow(10, 4) + num4 * (long) Math.Pow(10, 6) +
num5 * (
long) Math.Pow(10, 8) + num6 * (long) Math.Pow(10, 10);
}
}

This class is in charge of storing and finding the results. It assumes it gets the numbers sorted and the ConvertToCode method maps all the numbers to one “big” number”. For example: 1,5,20,30,45,49 will be mapped to 494530200501. The codes collection will be initialized in a base class:
The naive code:

public class NaiveResultsRegistrar : ResultsRegistrarBase
{
public NaiveResultsRegistrar()
{
codes =
new List<long>();
}
}

The code based on HashSet:

public class HashSetResultsRegistrar : ResultsRegistrarBase
{
public HashSetResultsRegistrar()
{
codes =
new HashSet<long>();
}
}

How much faster is it than the naive code?

The difference is huge. In a simple case where we have 250,000 historical results and we’re querying the collection 50,000 times the average running time of the naive code is 1:28 minutes while the code based on HashSet takes only 0.2 seconds. The HashSet code run ~360 times faster than the naive code. The reason for this major difference is that List has to iterate over all the items in order to determine if it contains the requested item. HashSet uses a hash method, without getting to implementation details, it enables the Contains method look in very few entries in order to determine whether or not an item is contained in the collection.
This major improvement was received with minimal change in the code. So in cases where we query a large collection about whether some items are in it, usage of HashSet can give us major running time  improvement with minimal change.

How to pass LINQ-To-SQL data objects with associations through WCF service?

Sometimes we need to pass a simple data object through WCF service. There are a few simple steps to performing this task.

Make data objects declare columns as Data Members

This step is very simple: all that’s needed is to define in the DBML editor that the serialization mode is Unidirectional.
image
By defining this serialization mode the generated code will be decorated with DataContract and DataMember attributes on classes and columns. By doing so the objects are compatible with WCF service and declared as part of the contract.

Visit the objects associations properties before returning them from the service

For example, if we’re dealing with this simple schema and our service returns a customer with requested ID.
image
Assuming we have this naive code:

public class StoreService : IStoreService
{
public Customer GetCustomer(int id)
{
using (var store = new StoreDataContext())
{
return store.Customers.First(dbCustomer => dbCustomer.ID == id);
}
}
}

The client will get the correct customer, but it won’t have the Orders property filled, even if the customer has orders the property value will be null. The reason for this behavior is that the generated code doesn’t execute the query against the database when it’s serializing. The solution is to access all data members before it serializes.

public class StoreService : IStoreService
{
public Customer GetCustomerWithRelations(int id)
{
using (var store = new StoreDataContext())
{
Customer customer = store.Customers.First(dbCustomer => dbCustomer.ID == id);
VisitDataMembers(customer);
return customer;
}
}

public static void VisitDataMembers(object dataMember)
{
if (dataMember == null)
{
return;
}

bool isEnumerable = dataMember is IEnumerable;
if (isEnumerable)
{
foreach (var item in (IEnumerable)dataMember)
{
VisitDataMembers(item);
}
}

var properties = dataMember.GetType().GetProperties();
IEnumerable<PropertyInfo> dataMemberProperties =
properties.Where(property => property.IsDefined(
typeof(DataMemberAttribute), true));
foreach (PropertyInfo dataMemberProperty in dataMemberProperties)
{
var dataMemberValue = dataMemberProperty.GetValue(dataMember, null);
VisitDataMembers(dataMemberValue);
}
}
}

This VisitDataMembers method ensures all data members’ properties are accessed at least once before serialization and the client will be able to use all the data which is accessible for the data object tree. For example in the sample schema, the client will be able to get from customer to orders to products to materials.

Easy Wins Optimizations – Sorted List

This is the first post of easy wins optimizations posts. The need for these posts comes from the fact the most of us hardly ever need it 🙂 Sounds weird? It should. What it means is that we don’t need it most of the time; the most naive code does the best work for us. When we encounter the case that requires optimization we hardly remember the simplest ways of doing it.
In this post on sorted list we’ll go over an example of optimizing a simple case.

How to find a person in a collection whose age is closest to mine?

Assume we have a person class with age defined on it:

public class Person
{
public double Age { get; private set; }

public Person(double age)
{
Age = age;
}
}

What we want is to collect many people into a list, and any time we asked about an age, we find the closest person to this age. We can think of it as an application for a dating service that matches people by age.

The naive solution

So what can be simpler than collecting all the people in a list? Probably nothing, this is how we do it:

public class NaivePersonFinder : IPersonFinder
{
private readonly List<Person> persons = new List<Person>();

public void Add(Person person)
{
persons.Add(person);
}

public Person FindPeopleInRange(double requiredAge)
{
Person closestPerson = null;
double smallestDiffrence = double.MaxValue;
foreach (var person in persons)
{
double difference = Math.Abs(person.Age - requiredAge);
if (difference < smallestDiffrence)
{
smallestDiffrence = difference;
closestPerson = person;
}
}

return closestPerson;
}
}

The code iterates over the collection in the most straightforward way there is and updates the closest person whenever it encounter a closer person. Simple! Isn’t it?

But I need faster results!

The sorted list solution
How does sorted list help us here? Sorted list allow us to search items using Binary Search. Binary search saves a lot of time since it has to query very few items in the collection in order to find the closest one. For example, to find the closest item in a collection of 1,000 items it’ll have to query ~10 and saves iteration over the hole 1,000.

Some code

public class SortedPersonFinder : IPersonFinder
{
private readonly SortedList<double, Person> persons = new SortedList<double, Person>();

public void Add(Person person)
{
persons.Add(person.Age, person);
}

public Person FindPeopleInRange(double requiredAge)
{
if (persons.ContainsKey(requiredAge))
{
return persons[requiredAge];
}

IList<double> sortedAges = persons.Keys;

int index = BinarySearchIndex(requiredAge);

var closestPersons = new List<Person>(3);

if (index > 0)
{
closestPersons.Add(persons[sortedAges[index -
1]]);
}

if (index < sortedAges.Count - 1)
{
closestPersons.Add(persons[sortedAges[index +
1]]);
}

closestPersons.Add(persons[sortedAges[index]]);

return closestPersons.OrderBy(person => Math.Abs(person.Age - requiredAge)).First();
}

private int BinarySearchIndex(double requiredAge)
{
var sortedAges = persons.Keys;

var scopeStart = 0;
var scopeEnd = sortedAges.Count - 1;

int index = 0;
while (scopeEnd > scopeStart )
{
index = (scopeStart + scopeEnd) /
2;
double ageAtIndex = sortedAges[index];
if (requiredAge > ageAtIndex)
{
scopeStart = index +
1;
}
else
{
scopeEnd = index;
}
}
return index;
}
}

The code is more complex than the naive version, but it finds the results much faster. It uses .Net Sorted List collection. This is a collection that updates a collection of  keys whenever the collection changes. The keys are kept ordered and we can easily access them and binary search them.

How much faster is it?

In a collection of 100,000 people and 50,000 queries the running time of the sorted list is much more faster than the naive:
The naive solution did it in average time of ~100 seconds
The sorted list solution had average time of ~6 seconds. This is a major improvement; it means that it ran more the 15 times faster than the naive solution.

Conclusion

If we have a collection in which order can be applied AND we also need quick results for queries based on the order, sorted list can easily improve running time. We must always remember that the running time is usually negligible even in the most naive code. If no optimization required – don’t optimize, more gain will be earned from simple and maintainable code.