Visual Studio 2010 code complexity extension

An Alpha version of code complexity addin for Visual Studio 2010 is available. The extension can be found at project page at CodePlex.
The extension shows the complexity of the methods in the IDE near the method and measures the method “health”.
For example, view of healthy methods (low complexity):
GoodMethods
And example of too complex method:
BadMethod
Currently, the complexity metric shown is simple complexity (defined by Code Complete). This metric counts the number of possible paths in the method. A healthy method is one with low complexity; a method with 10 paths is not so good and is worth simplifying.

Keep it green!

Intercepting unmanaged call in managed code

This post will demonstrate how to intercept unmanaged calls in the executing process. There are many reasons for intercepting unmanaged calls, among them monitoring, debugging and some other hacks.
This post will demonstrate how to intercept calls to CreateFile from Kernel32 library. The CreateFile method is called for opening an existing file and creating new one. For example, calls to File.OpenText will initiate a call to CreateFile.

Hooking CreateFile in unmanaged code

We’ll define a function pointer type for CreateFile:

typedef HANDLE (WINAPI *FileCreateFunction)(
LPCWSTR lpFileName,
DWORD dwDesiredAccess,
DWORD dwShareMode,
LPSECURITY_ATTRIBUTES lpSecurityAttributes,
DWORD dwCreationDisposition,
DWORD dwFlagsAndAttributes,
HANDLE hTemplateFile);

Afterwards, we’ll store the original CreateFile method and create a hook which will redirect calls:

FileCreateFunction OriginalCreateFile = (FileCreateFunction)GetProcAddress(GetModuleHandle(L"kernel32"), "CreateFileW");
HANDLE WINAPI CreateFileHook(
LPCWSTR lpFileName,
DWORD dwDesiredAccess,
DWORD dwShareMode,
LPSECURITY_ATTRIBUTES lpSecurityAttributes,
DWORD dwCreationDisposition,
DWORD dwFlagsAndAttributes,
HANDLE hTemplateFile)
{
bool hasListener = createFileCallback != NULL;
if(hasListener)
{
createFileCallback(lpFileName);
}

return OriginalCreateFile(
lpFileName,
dwDesiredAccess,
dwShareMode,
lpSecurityAttributes,
dwCreationDisposition,
dwFlagsAndAttributes,
hTemplateFile);
}

For now, ignore the code about the callback; we’ll use this code later for notifying the managed when a file is created.

As we can see, the hook function has the same signature as the original one. This is obvious since the function caller will not be changed, just the target method. The hook observes the function arguments and forwards the call to the original call.

Now, we’ll forward the calls from CreateFile to the new hook using mhook library:

BOOL APIENTRY DllMain( HMODULE hModule,
DWORD ul_reason_for_call,
LPVOID lpReserved)
{
switch (ul_reason_for_call)
{
case DLL_PROCESS_ATTACH:
Mhook_SetHook((PVOID*)&OriginalCreateFile, CreateFileHook);
break;
case DLL_PROCESS_DETACH:
createFileCallback = NULL;
Mhook_Unhook((PVOID*)&OriginalCreateFile);
break;
case DLL_THREAD_ATTACH:
case DLL_THREAD_DETACH:
break;
}

return TRUE;
}

This code will hook the CreateFile method (OriginalCreateFile) to our new defined hook when the library is loaded into the process.

Using the library in managed code

In order to load the unmanaged library we’ll use P/invoke calls:

[DllImport("kernel32", SetLastError = true)]
static extern IntPtr LoadLibrary(string lpFileName);

[
DllImport("kernel32.dll", SetLastError = true)]
static extern bool FreeLibrary(IntPtr hModule);

Loading the library:

static void Main(string[] args)
{
IntPtr library = LoadLibrary(@"..\..\..\Debug\InterceptionLibrary.dll");
FreeLibrary(library);
}

Right now, all the calls will behave the same, but all will be routed through our new hook (a user of the software will experience no difference).

Preparing a callback in unmanaged code

We would like to know what file is being created, so we’ll define a callback function pointer that matches it:

typedef void (*NotifyCallbackFunction)(const TCHAR* fileName);

The managed code will register a callback using the method:

extern "C" __declspec(dllexport) void RegisterFileCreateListener(
NotifyCallbackFunction callback )
{
createFileCallback = callback;
}

This method is exposed so it can be used by the managed code using:

extern "C" __declspec(dllexport)

Registering the callback through managed code

First, in order to call the register method, we’ll need to define a delegate into the register method will be loaded (ignore the callback delegate for now):

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
private delegate void RegisterFileListenerDel(OnFileCreatedDel callback);

In order to load the register method into the managed assembly, we’ll need to use:

[DllImport("kernel32", SetLastError = true)]
static extern IntPtr GetProcAddress(IntPtr hModule, string procName);

Loading the method is done by:

IntPtr pAddressOfFunctionToCall = GetProcAddress(library, "RegisterFileCreateListener");
var registerListener = (RegisterFileListenerDel)
Marshal.GetDelegateForFunctionPointer(
pAddressOfFunctionToCall,
typeof (RegisterFileListenerDel));

Now, we’ll have to define a callback delegate and method:

[UnmanagedFunctionPointer(CallingConvention.Cdecl, CharSet = CharSet.Unicode)]
private delegate void OnFileCreatedDel(string fileName);
private static void OnFileCreated(string fileName)
{
Console.WriteLine("File opened: {0}", fileName);
}

Now, all that’s needed is to register the callback using the method loaded in the previous step:

registerListener(OnFileCreated);

That’s all, as long as the library is loaded our callback we’ll be notified on every CreateFile call.

Example

This is our final version of the main method in the managed code:

static void Main(string[] args)
{
IntPtr library = LoadLibrary(@"..\..\..\Debug\InterceptionLibrary.dll");

IntPtr pAddressOfFunctionToCall = GetProcAddress(library, "RegisterFileCreateListener");
var registerListener = (RegisterFileListenerDel)
Marshal.GetDelegateForFunctionPointer(
pAddressOfFunctionToCall,
typeof (RegisterFileListenerDel));
registerListener(OnFileCreated);

File.OpenText(@"C:\Development\wow.txt").Dispose();
File.CreateText(@"C:\Development\wow2.txt").Dispose();

FreeLibrary(library);
}

Running this code will notify these files created:
image

Summary

In order to be notified about a native call in managed code:

  1. Create an unmanaged library
  2. Hook the requested method using mhook
  3. Create and expose a callback registration method
  4. In managed code load the library
  5. Register a callback

You can download the source code example here.

Alternatives

There several alternatives. For example, another possible way to intercept calls in unmanaged calls is Detours, a library developed by Microsoft. Another possible solution is EasyHook, a library that allows intercepting unmanaged calls directly from managed code.

Automatic generation of View-Model – test drive

In the previous post I tried to present an approach for automatic View-Model generation. When trying to use it in a real life project as simple as a registration form, many missing features were revealed.

Here’s a list of issues which were noticed on simple implementation:

  • There’s no way to access the model from the abstract View-Model
  • There’s no way no pass arguments to the constructor
  • No built-in way to define the order of validations
  • No way to raise event of PropertyChanged other than the property being set
  • No way to declare additional errors on properties which are not mapped to the model

Actual attempt to use the framework in registration form

image
This is the simple registration – all fields are mandatory, email must match some regular expression and password verification must be the same as the password. The model has 3 properties – Email, Name and Password. These properties are mapped to the View-Model. Let’s see how this code looks like in the View-Model using the new framework:

public abstract class UserRegistrationViewModel : 
INotifyPropertyChanged, IDataErrorInfo, IUserViewModel
{
private static readonly ViewModelGenerator viewModelsGenerator = new ViewModelGenerator();

public static UserRegistrationViewModel CreateUserRegistrationViewModel(User user)
{
return viewModelsGenerator.Generate<UserRegistrationViewModel>(user);
}

protected UserRegistrationViewModel() { }

private const string MANDATORY_FIELD_ERROR_MESSAGE = "This field is mandatory";
private const string PASSWORD_VERIFICATION_PROPERTY_NAME = "PasswordVerification";

[
Model]
private readonly User user;
private ICommand save;

public event PropertyChangedEventHandler PropertyChanged = (sender, args) => { };

[
Validation(typeof (MandatoryValidator))]
public abstract string Name { get; set; }

[
Validation(typeof(EmailValidator))]
[
Validation(typeof (MandatoryValidator))]
public abstract string Email { get; set; }

[
Validation(typeof(MandatoryValidator))]
[
RelatedProperty(PASSWORD_VERIFICATION_PROPERTY_NAME)]
public abstract string Password { get; set; }

private string passwordVerification;
public string PasswordVerification
{
get { return passwordVerification; }
set
{
passwordVerification =
value;
}
}

private string GetPasswordVerificationValidation()
{
if(string.IsNullOrEmpty(passwordVerification))
{
return MANDATORY_FIELD_ERROR_MESSAGE;
}

if (user.Password != passwordVerification)
{
return "Password and Verification must be same";
}

return null;
}

public virtual string this[string columnName]
{
get
{
if (columnName == PASSWORD_VERIFICATION_PROPERTY_NAME)
{
return GetPasswordVerificationValidation();
}

return null;
}
}

public string Error { get { return null; } }

public ICommand Save
{
get
{
if (save == null)
{
save =
new SaveCommand(user);
}

return save;
}
}
}

Let’s see how this code demonstrates the solutions to some of the missing features.

Accessing the model – in order to get an instance of the model, a new attribute is presented [Model]. Decorating a field with it will make sure that field is initialized with the model instance.

Raising PropertyChaned event on other property than the one being set – in order to deal with it a new attribute is presented [RelatedProperty]. When the property is being set, the PropertyChanged event will be raised for all properties – the one being set and the related properties.

Defining additional errors for non-mapped properties – the generated View-Model tries to resolve errors for mapped properties, if it find none, it forwards the call the base View-Model (the abstract class) and checks for additional errors.

Download

The framework source can be found in CodePlex. It is still very partial and contains many bugs but it already works in the basic scenarios 🙂

Automatic generation of View-Model – first attempt

I recently found myself writing the same code again and again. As can be guessed, I’m writing a WPF application based on MVVM architecture. In order to avoid writing the same code again I am trying to generate the View-Model automatically using Castle Dynamic Proxy.

Simplest scenario – forwarding calls to model

This is probably the simplest case we encounter. This is so simple we are tempted to bind the view directly to the model.

Naive implementation

public class Model
{
public object Prop { get; set; }
}

public class ViewModel
{
private readonly Model model;

public ViewModel(Model model)
{
this.model = model;
}

public object Prop
{
get { return model.Prop; }
set { model.Prop = value; }
}
}

The generated alternative

So, what we’d like to achieve is skipping the forwarding implementation. The View-Model can look like:

public abstract class ViewModel
{
public abstract object Prop { get; set; }
}

So far, it’s very simple 🙂 Let’s see how it’s being used with the Model:

[Test]
public void Generate_SetPropertyValue_ModelPropertyUpdated()
{
var viewModelGenerator = new ViewModelGenerator();
var model = new Model();

ViewModel generatedViewModel = viewModelGenerator.Generate<ViewModel>(model);

object valueToAssign = new object();
generatedViewModel.Prop = valueToAssign;

Assert.That(model.Prop, Is.SameAs(valueToAssign));
}

This example shows that we’ve created a View-Model based on a Model instance, simulated a call to a property setter and the new value automatically reflected the Model. That was simple, wasn’t it?

Next scenario – Implementing INotifyPropertyChanged

This is a very common scenario, which has a very common implementation. It’s so common I’ll skip the naive implementation and jump to the generated version directly:

The generated alternative

public abstract class ViewModel : INotifyPropertyChanged
{
public abstract object Prop { get; set; }
public event PropertyChangedEventHandler PropertyChanged;
}

[
Test]
public void Generate_SetPropertyValue_PropertyChangedRaised()
{
var viewModelGenerator = new ViewModelGenerator();
var model = new Model();
ViewModel generatedViewModel = viewModelGenerator.Generate<ViewModel>(model);

bool wasRaised = false;
generatedViewModel.PropertyChanged += (sender, args) => wasRaised =
true;

generatedViewModel.Prop =
new object();

Assert.That(wasRaised, Is.True);
}

In this case we’ve generated a View-Model that implements INotifyPropertyChanged. We had to do nothing but declare the View-Model implements INotifyPropertyChanged. Whenever a property value is changed the event will be raised with the property name.

Last scenario – Implementing IDataErrorInfo

This case is trivial too so I’ll skip again the naive solution.

The generated alternative

First we’ll take a look at the format of the abstract View-Model and the validation declaration:

public abstract class ViewModel : IDataErrorInfo
{
public abstract string this[string columnName] { get; }
public abstract string Error { get; }

[
Validation(typeof(DummyValidator))]
public abstract object Prop { get; set; }
}

public class DummyValidator : IValidator
{
public string Validate(object value)
{
return "error";
}
}

The view model is now implementing the IDataErrorInfo interface. The interfaces require implementation of indexer that maps from property to error message. The automatic View-Model generation should take care of it. In order to declare the required validations we’ll use the Validation attribute. The attribute defines which validator to use on the property. The validation result will be mapped to the property name.
For example:

[Test]
public void Generate_SetInvalidPropertyValue_PropertyErrorIsCorrect()
{
var viewModelGenerator = new ViewModelGenerator();
var model = new Model();

ViewModel generatedViewModel = viewModelGenerator.Generate<ViewModel>(model);

generatedViewModel.Prop =
new object();

Assert.That(generatedViewModel["Prop"], Is.EqualTo("error"));
}

Conclusion

MVVM is used many times in common scenarios. The code is usually a template which we can generate dynamically. This way the code duplication will be drastically reduced and allow uniform implementation.
The examples here were simplified but the concept should be clear. There’s much more work on the framework in order to cover more real life scenarios, soon to come 🙂 I’ll upload the framework source code to CodePlex before the next post.

Worker thread using parallel tasks

Worker thread is a known pattern – there’s work to do, it needs to be done asynchronously and we want to get all the work results when it’s ready. What we’re going to see is an implementation of it as an alternative to the common implementations. This implementation will take advantage of the new parallel tasks library.
To formalize the requirements:

  • The worker queues items to process
  • The items are processed asynchronously
  • Only one item can be processed at a time
  • The items are processed in the order they were queued
  • The worker will store the processed results in the order they were processed

The worker class

public class Worker
{
private readonly IItemsProcessor itemsProcessor;
private Task lastTask;

public IList ProcessedItems { get; private set; }

public Worker(IItemsProcessor itemsProcessor)
{
this.itemsProcessor = itemsProcessor;
ProcessedItems =
new List();
InitializeNullTask();
}

private void InitializeNullTask()
{
lastTask =
new Task(() => default(TResult));
lastTask.Start();
}

public void ProcessItem(TItem item)
{
var nextTask = lastTask
.ContinueWith(task =>
{
var processItem = itemsProcessor.ProcessItem(item);
ProcessedItems.Add(processItem);
});
lastTask = nextTask;
}

public void WaitForPendingItems()
{
using (var sync = new ManualResetEvent(false))
{
lastTask.ContinueWith(task => sync.Set());
sync.WaitOne();
}
}
}

The worker creates a task for each item which needs to be processed. Each task is executed in the thread pool, the point where we ensure that the tasks are run in the correct order is the ContinueWith call. ContinueWith takes care of the order of the tasks’ execution.

The InitializeNullTask creates a task that, surprisingly, does nothing but set a head to the tasks queue. This task helps us avoid in ProcessItem to check if this is the first item to process. The first task starts with call to Start while all the others start with ContinueWith.

WaitForPendingItems also enqueues a task. This time, the task is only waiting to be executed, which means all other items were already processed. When the task starts it releases the enqueuing thread.

Usage example

In this example we’ll download a list of web pages and check print their sizes. The downloader implements the IItemsProcessor we’ve seen the worker expects.

public class WebUrlsDownloader : IItemsProcessor<string, byte[]>
{
public byte[] ProcessItem(string url)
{
using (var webClient = new WebClient())
{
return webClient.DownloadData(url);
}
}
}

And the actual usage:

public void DownloadFiles()
{
var worker = new Worker<string, byte[]>(new WebUrlsDownloader());
worker.ProcessItem(
@"http://msdn.microsoft.com/en-us/library/dd537608.aspx");
worker.ProcessItem(
@"http://msdn.microsoft.com/en-us/library/dd537609.aspx");
worker.ProcessItem(
@"http://msdn.microsoft.com/en-us/library/dd997405.aspx");
worker.WaitForPendingItems();
Console.WriteLine("Finished downloading files:");
foreach (var processedItem in worker.ProcessedItems)
{
Console.WriteLine("Downloaded file with size: {0}", processedItem.Length);
}
}

Unit testing F#

F# is a cool and exciting language. It’s much more than an academic language; it can solve many real life (coding) problems in its functional manner. As to production code, it must be covered with unit tests. I’ll show a simple example of a unit test written in C# with moq against F# code.
The code under test is a simple lottery calculator: it takes a list of participants and says how much every participant won – 5 times the number of hits. Simple, isn’t it?
Let’s take a look at the code under test, it could probably be written better, but ignore this for now 🙂

type LotteryCalculator(winningNumbers:System.Collections.Generic.IList) = 
let calculatePrize(participatingNumbers) =
let numOfHits = participatingNumbers |>
Seq.filter (
fun participatingNumber -> winningNumbers.Contains participatingNumber) |>
Set.ofSeq |>
Set.count
numOfHits *
5

member this.CalculatePrizes(participants:System.Collections.Generic.IEnumerable) =
participants
|> Seq.map (
fun participant -> (participant, calculatePrize(participant.GetTicket())))

So, we have an API method, CalculatePrizes(), which take a collection of participants, look at their tickets and returns a tuples list of participant and prize. So we have something like this: CalculatePrizes: System.Collections.Generic.IEnumerable –> seq

After dealing with some issues, which I’ll write about in a future post, I wrote this test:

[Test]
public void CalculatePrizes_RecievesTwoParticipant_MatchCorrectPrizes()
{
var firstParticipantMock = new Moq.Mock<IParticipant>();
IParticipant firstParticipant = firstParticipantMock.Object;
firstParticipantMock.Setup(fake => fake.GetTicket()).Returns(
new List<int> {1, 4, 5});

var secondParticipantMock = new Moq.Mock<IParticipant>();
IParticipant secondParticipant = secondParticipantMock.Object;
secondParticipantMock.Setup(fake => fake.GetTicket()).Returns(
new List<int> { 1, 2, 4 });

var lotteryCalculator = new LotteryCalculator(new List<int> {1, 2, 3});
var participantPrizes = lotteryCalculator.CalculatePrizes(new[] { firstParticipant, secondParticipant });

var prizes = participantPrizes.ToDictionary(tuple => tuple.Item1, tuple => tuple.Item2);

Assert.That(prizes[firstParticipant], Is.EqualTo(5));
Assert.That(prizes[secondParticipant], Is.EqualTo(10));
}

It’s nice that we can keep our unit tests written in C#, it’s also important in cases where the users of the code (most likely us in another module) will use it in C# as well.

Code contracts – Where invariant method called?

I assume you already know code contracts and have heard about invariant methods. This post will demonstrate how simple code compiles with invariant method.
This simple class declares a method as invariant method:

public class InvariantExample
{
public void Method()
{
Console.WriteLine("In Method");
}

[
ContractInvariantMethod]
protected void InvariantMethod()
{
Contract.Invariant(2 > 1);
}
}

Using reflector we can see that Method compiles to:

public void Method()
{
Console.WriteLine("In Method");
this.$InvariantMethod$();
}

As we can see the compiled method ends with a call to our invariant method. Assuming we had more methods in our class we would have seen similar call in each one of them. The invariants is less explicit than most contracts features but it’s a powerful part of it, use wisely!

Easy Wins Optimizations – HashSet

HashSet is a data structure introduced in .Net 3.5 framework. The main advantage of HashSet is that it adds and fetches items in very low complexity (constant amortized time). We’ll try to see how much faster than List it can be. In the example we’ll go over code that stores all historical lottery results and checks if some results appear among the historical ones.

Registering results

public abstract class ResultsRegistrarBase
{
protected ICollection<long> codes;

public void AddResult(int num1, int num2, int num3, int num4, int num5, int num6)
{
long code = ConvertToCode(num1, num2, num3, num4, num5, num6);

codes.Add(code);
}

public bool ContainsResult(int num1, int num2, int num3, int num4, int num5, int num6)
{
long code = ConvertToCode(num1, num2, num3, num4, num5, num6);

return codes.Contains(code);
}

private static long ConvertToCode(int num1, int num2, int num3, int num4, int num5, int num6)
{
return num1 + (long) Math.Pow(10, 0) + num2 * (long) Math.Pow(10, 2) +
num3 * (
long) Math.Pow(10, 4) + num4 * (long) Math.Pow(10, 6) +
num5 * (
long) Math.Pow(10, 8) + num6 * (long) Math.Pow(10, 10);
}
}

This class is in charge of storing and finding the results. It assumes it gets the numbers sorted and the ConvertToCode method maps all the numbers to one “big” number”. For example: 1,5,20,30,45,49 will be mapped to 494530200501. The codes collection will be initialized in a base class:
The naive code:

public class NaiveResultsRegistrar : ResultsRegistrarBase
{
public NaiveResultsRegistrar()
{
codes =
new List<long>();
}
}

The code based on HashSet:

public class HashSetResultsRegistrar : ResultsRegistrarBase
{
public HashSetResultsRegistrar()
{
codes =
new HashSet<long>();
}
}

How much faster is it than the naive code?

The difference is huge. In a simple case where we have 250,000 historical results and we’re querying the collection 50,000 times the average running time of the naive code is 1:28 minutes while the code based on HashSet takes only 0.2 seconds. The HashSet code run ~360 times faster than the naive code. The reason for this major difference is that List has to iterate over all the items in order to determine if it contains the requested item. HashSet uses a hash method, without getting to implementation details, it enables the Contains method look in very few entries in order to determine whether or not an item is contained in the collection.
This major improvement was received with minimal change in the code. So in cases where we query a large collection about whether some items are in it, usage of HashSet can give us major running time  improvement with minimal change.

How to pass LINQ-To-SQL data objects with associations through WCF service?

Sometimes we need to pass a simple data object through WCF service. There are a few simple steps to performing this task.

Make data objects declare columns as Data Members

This step is very simple: all that’s needed is to define in the DBML editor that the serialization mode is Unidirectional.
image
By defining this serialization mode the generated code will be decorated with DataContract and DataMember attributes on classes and columns. By doing so the objects are compatible with WCF service and declared as part of the contract.

Visit the objects associations properties before returning them from the service

For example, if we’re dealing with this simple schema and our service returns a customer with requested ID.
image
Assuming we have this naive code:

public class StoreService : IStoreService
{
public Customer GetCustomer(int id)
{
using (var store = new StoreDataContext())
{
return store.Customers.First(dbCustomer => dbCustomer.ID == id);
}
}
}

The client will get the correct customer, but it won’t have the Orders property filled, even if the customer has orders the property value will be null. The reason for this behavior is that the generated code doesn’t execute the query against the database when it’s serializing. The solution is to access all data members before it serializes.

public class StoreService : IStoreService
{
public Customer GetCustomerWithRelations(int id)
{
using (var store = new StoreDataContext())
{
Customer customer = store.Customers.First(dbCustomer => dbCustomer.ID == id);
VisitDataMembers(customer);
return customer;
}
}

public static void VisitDataMembers(object dataMember)
{
if (dataMember == null)
{
return;
}

bool isEnumerable = dataMember is IEnumerable;
if (isEnumerable)
{
foreach (var item in (IEnumerable)dataMember)
{
VisitDataMembers(item);
}
}

var properties = dataMember.GetType().GetProperties();
IEnumerable<PropertyInfo> dataMemberProperties =
properties.Where(property => property.IsDefined(
typeof(DataMemberAttribute), true));
foreach (PropertyInfo dataMemberProperty in dataMemberProperties)
{
var dataMemberValue = dataMemberProperty.GetValue(dataMember, null);
VisitDataMembers(dataMemberValue);
}
}
}

This VisitDataMembers method ensures all data members’ properties are accessed at least once before serialization and the client will be able to use all the data which is accessible for the data object tree. For example in the sample schema, the client will be able to get from customer to orders to products to materials.

Easy Wins Optimizations – Sorted List

This is the first post of easy wins optimizations posts. The need for these posts comes from the fact the most of us hardly ever need it 🙂 Sounds weird? It should. What it means is that we don’t need it most of the time; the most naive code does the best work for us. When we encounter the case that requires optimization we hardly remember the simplest ways of doing it.
In this post on sorted list we’ll go over an example of optimizing a simple case.

How to find a person in a collection whose age is closest to mine?

Assume we have a person class with age defined on it:

public class Person
{
public double Age { get; private set; }

public Person(double age)
{
Age = age;
}
}

What we want is to collect many people into a list, and any time we asked about an age, we find the closest person to this age. We can think of it as an application for a dating service that matches people by age.

The naive solution

So what can be simpler than collecting all the people in a list? Probably nothing, this is how we do it:

public class NaivePersonFinder : IPersonFinder
{
private readonly List<Person> persons = new List<Person>();

public void Add(Person person)
{
persons.Add(person);
}

public Person FindPeopleInRange(double requiredAge)
{
Person closestPerson = null;
double smallestDiffrence = double.MaxValue;
foreach (var person in persons)
{
double difference = Math.Abs(person.Age - requiredAge);
if (difference < smallestDiffrence)
{
smallestDiffrence = difference;
closestPerson = person;
}
}

return closestPerson;
}
}

The code iterates over the collection in the most straightforward way there is and updates the closest person whenever it encounter a closer person. Simple! Isn’t it?

But I need faster results!

The sorted list solution
How does sorted list help us here? Sorted list allow us to search items using Binary Search. Binary search saves a lot of time since it has to query very few items in the collection in order to find the closest one. For example, to find the closest item in a collection of 1,000 items it’ll have to query ~10 and saves iteration over the hole 1,000.

Some code

public class SortedPersonFinder : IPersonFinder
{
private readonly SortedList<double, Person> persons = new SortedList<double, Person>();

public void Add(Person person)
{
persons.Add(person.Age, person);
}

public Person FindPeopleInRange(double requiredAge)
{
if (persons.ContainsKey(requiredAge))
{
return persons[requiredAge];
}

IList<double> sortedAges = persons.Keys;

int index = BinarySearchIndex(requiredAge);

var closestPersons = new List<Person>(3);

if (index > 0)
{
closestPersons.Add(persons[sortedAges[index -
1]]);
}

if (index < sortedAges.Count - 1)
{
closestPersons.Add(persons[sortedAges[index +
1]]);
}

closestPersons.Add(persons[sortedAges[index]]);

return closestPersons.OrderBy(person => Math.Abs(person.Age - requiredAge)).First();
}

private int BinarySearchIndex(double requiredAge)
{
var sortedAges = persons.Keys;

var scopeStart = 0;
var scopeEnd = sortedAges.Count - 1;

int index = 0;
while (scopeEnd > scopeStart )
{
index = (scopeStart + scopeEnd) /
2;
double ageAtIndex = sortedAges[index];
if (requiredAge > ageAtIndex)
{
scopeStart = index +
1;
}
else
{
scopeEnd = index;
}
}
return index;
}
}

The code is more complex than the naive version, but it finds the results much faster. It uses .Net Sorted List collection. This is a collection that updates a collection of  keys whenever the collection changes. The keys are kept ordered and we can easily access them and binary search them.

How much faster is it?

In a collection of 100,000 people and 50,000 queries the running time of the sorted list is much more faster than the naive:
The naive solution did it in average time of ~100 seconds
The sorted list solution had average time of ~6 seconds. This is a major improvement; it means that it ran more the 15 times faster than the naive solution.

Conclusion

If we have a collection in which order can be applied AND we also need quick results for queries based on the order, sorted list can easily improve running time. We must always remember that the running time is usually negligible even in the most naive code. If no optimization required – don’t optimize, more gain will be earned from simple and maintainable code.