Lazy ConcurrentDictionary

Well, that's an interesting title, but what does it mean? I assume you know about the .NET System.Collections.Concurrent namespace. It provides several thread-safe collection classes, one of which is ConcurrentDictionary<TKey, TValue>. The standard Dictionary<TKey, TValue> class is not thread-safe: if one thread adds to it while another is enumerating it, the enumerator throws an InvalidOperationException (Collection was modified; enumeration operation may not execute), and unsynchronized concurrent writes can corrupt its internal state. A ConcurrentDictionary can be read from and written to simultaneously from multiple threads.
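To make the difference concrete, here is a minimal sketch. It is an assumption-laden stand-in for the real multi-threaded case: it adds a key during enumeration on a single thread, which triggers the same check deterministically. The class and method names (EnumerationDemo, PlainDictionaryThrows, ConcurrentDictionaryThrows) are made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class EnumerationDemo
{
    // Returns true if adding a key while enumerating a plain Dictionary throws.
    public static bool PlainDictionaryThrows()
    {
        var plain = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 };
        try
        {
            foreach (var pair in plain)
            {
                // Adding a new key invalidates the live enumerator.
                plain["extra"] = 0;
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Returns true if the same pattern throws on a ConcurrentDictionary.
    public static bool ConcurrentDictionaryThrows()
    {
        var concurrent = new ConcurrentDictionary<string, int>();
        concurrent["a"] = 1;
        concurrent["b"] = 2;
        try
        {
            foreach (var pair in concurrent)
            {
                // Safe: the enumerator tolerates concurrent modification.
                concurrent.TryAdd("extra", 0);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Calling EnumerationDemo.PlainDictionaryThrows() should return true and EnumerationDemo.ConcurrentDictionaryThrows() should return false. Note that the ConcurrentDictionary enumerator does not give you a snapshot, so it may or may not see entries added while you iterate.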

This post specifically concerns two methods of the ConcurrentDictionary class: AddOrUpdate and GetOrAdd. Both have overloads that accept a delegate for generating the value to be added or updated. So what happens when multiple threads call GetOrAdd with the same key? This may occur when you use the dictionary as a simple cache: two threads need the same cached value, and each tries to add it when it isn't available yet.

When two threads call GetOrAdd with the same key at the same time and the value is produced by a delegate, the delegate may run on both threads. Only one of the results is stored, and both callers get that stored value back. This is documented in the remarks section, but it may not always be what you want. The operation to get the value could be CPU or I/O intensive, so ideally you want to run it just once for a particular dictionary key. In a project I'm currently working on, obtaining the value involves traversing a directory structure looking for a certain file, which is something I'd like to do just once.
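The following sketch reproduces the double execution on purpose. It is only a demonstration under stated assumptions: the Barrier is a test device I added to hold each factory call until both threads are inside the delegate (normally the overlap depends on timing), and the names GetOrAddRaceDemo, Run and Factory are made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class GetOrAddRaceDemo
{
    // Returns how often the factory ran, how many entries were stored,
    // and whether both callers received the same value.
    public static (int FactoryCalls, int Entries, bool SameValue) Run()
    {
        var dictionary = new ConcurrentDictionary<string, string>();
        var factoryCalls = 0;

        // Demo-only device: releases once both threads are inside the factory.
        using var bothInside = new Barrier(2);

        string Factory(string key)
        {
            var call = Interlocked.Increment(ref factoryCalls);
            // The timeout avoids hanging forever if the calls never overlap.
            bothInside.SignalAndWait(TimeSpan.FromSeconds(5));
            return "value from call " + call;
        }

        var results = new string[2];
        var t1 = new Thread(() => results[0] = dictionary.GetOrAdd("key", Factory));
        var t2 = new Thread(() => results[1] = dictionary.GetOrAdd("key", Factory));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();

        return (factoryCalls, dictionary.Count, results[0] == results[1]);
    }
}
```

With both threads forced to overlap, GetOrAddRaceDemo.Run() should report two factory calls, a single stored entry, and the same value for both callers. The dictionary itself stays consistent; the wasted work is the second run of the delegate.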

So this is where the 'lazy' part of this post comes into play. Suppose we have a ConcurrentDictionary<string, string> where the value takes some time to compute. For example:

var dictionary = new ConcurrentDictionary<string, string>();  
var value = dictionary.GetOrAdd("key", _ =>  
{
    // Perform a CPU or I/O intensive operation to obtain the value.
    return "value";
});

How do we prevent the operation from running twice? The framework offers the Lazy<T> class for that. It is meant for lazy initialization of values, but it also offers handy support for our case. First we change the dictionary to have lazy values:

var dictionary = new ConcurrentDictionary<string, Lazy<string>>();  

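Next, we move the value generator delegate to the lazy instance and specify a LazyThreadSafetyMode of ExecutionAndPublication. This means only one thread is allowed to initialize the value. It is also the default mode for the Lazy<T>(Func<T>) constructor, so spelling it out mainly documents the intent: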

var lazy = new Lazy<string>(() =>  
{
    // Perform a CPU or I/O intensive operation to obtain the value.
    return "value";
}, LazyThreadSafetyMode.ExecutionAndPublication);

var lazyValue = dictionary.GetOrAdd("key", lazy);  
var value = lazyValue.Value;  

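Two things happen here:

  • We add the Lazy<string> instance to the concurrent dictionary in a thread-safe manner. Every thread creates its own Lazy<string>, which is cheap, but GetOrAdd returns the one instance that is actually stored, so all threads end up with the same one.
  • Only the first thread that calls Lazy<T>.Value actually executes the value generator. Other threads simply wait for the value to become available.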
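The two steps above can be wrapped in a small helper. This is a sketch, not a framework API: GetOrAddLazy, LazyCache and CountFactoryCalls are hypothetical names I chose, and the helper is limited to the string types used in this post. CountFactoryCalls is only there to show that the generator runs once, however many callers ask for the same key.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class LazyCache
{
    // Hypothetical helper: stores a Lazy<string> and returns its value.
    public static string GetOrAddLazy(
        this ConcurrentDictionary<string, Lazy<string>> dictionary,
        string key,
        Func<string, string> valueFactory)
    {
        // Several Lazy instances may be created under contention,
        // but only the one that gets stored is ever asked for its Value.
        var lazy = dictionary.GetOrAdd(
            key,
            k => new Lazy<string>(
                () => valueFactory(k),
                LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }

    // Demo: many parallel callers request the same key.
    // Returns how often the expensive generator actually ran.
    public static int CountFactoryCalls(int callers)
    {
        var dictionary = new ConcurrentDictionary<string, Lazy<string>>();
        var factoryCalls = 0;

        Parallel.For(0, callers, _ =>
        {
            dictionary.GetOrAddLazy("key", key =>
            {
                Interlocked.Increment(ref factoryCalls);
                Thread.Sleep(50); // stands in for the CPU or I/O intensive work
                return "value";
            });
        });

        return factoryCalls;
    }
}
```

LazyCache.CountFactoryCalls(8) should return 1. One thing to keep in mind with ExecutionAndPublication: if the generator throws, the Lazy<T> instance caches that exception and rethrows it on every later access to Value, so a failed entry stays failed until you remove it from the dictionary.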

I wrote this post mainly because I wasn't aware that the ConcurrentDictionary methods that accept a delegate are thread-safe but still allow their delegates to run simultaneously. Maybe you didn't know this either, and this post gives you a simple way of preventing it.