Some multi-threading gotchas

Most programmers know that they must make their code thread-safe when accessing shared data from multiple threads at the same time.
However, there are some cases which are not so obvious and which might cause very subtle bugs that appear only in some particular circumstances.

Below is a list of such cases that I have started to ‘collect’. Some programmers will find nothing new here, but I found these examples unexpected when I first read about them:

Example 1:

    class Foo
    {
        int _answer;
        bool _complete;

        public void A()
        {
            _answer = 123;
            _complete = true;
        }

        public void B()
        {
            if (_complete) Console.WriteLine(_answer);
        }
    }

Seems harmless and thread-safe, right? Well, not always :) If A() and B() run in parallel on different threads, in some particular cases Console.WriteLine will display… 0 (zero).
Source and explanation: http://www.albahari.com/threading/part4.aspx#_Memory_Barriers_and_Volatility – in short, the compiler, the CLR or the processor may reorder the two assignments in A() (or the two reads in B()).
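
One possible fix, sketched here under the assumption that .NET 4.5 or later is available (for the Volatile class), is to publish the flag with release semantics and read it with acquire semantics. TryGetAnswer is a hypothetical helper that is not part of the original example; it is there only so the result can be checked without the console. A plain lock around the bodies of both methods would work as well.

```csharp
using System;
using System.Threading;

class Foo
{
    private int _answer;
    private bool _complete;

    public void A()
    {
        _answer = 123;
        // Release: the write to _answer above cannot be moved after this write.
        Volatile.Write(ref _complete, true);
    }

    public void B()
    {
        int answer;
        if (TryGetAnswer(out answer)) Console.WriteLine(answer);
    }

    // Hypothetical helper (not in the original example).
    public bool TryGetAnswer(out int answer)
    {
        // Acquire: the read of _answer below cannot be moved before this read.
        bool complete = Volatile.Read(ref _complete);
        answer = complete ? _answer : 0;
        return complete;
    }
}
```

With this version, a thread that sees _complete as true should also see _answer as 123.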

Example 2:

    class Program
    {
        static void Main()
        {
            bool complete = false;
            var t = new Thread(() =>
                {
                    bool toggle = false;
                    while (!complete) // waiting for the flag to be set...
                    {
                        toggle = !toggle; // do something :) 
                    }
                });
            t.Start();

            Thread.Sleep(1000); // wait a bit to let the thread run
            complete = true; // 'signal' the thread

            t.Join();        // wait for the thread to complete - blocks indefinitely (!)
        }
    }

The thread in the above code will never complete if the code is compiled in release mode with optimizations on, because the value of complete may be cached in a register and never read again inside the loop.
Details and a solution again at: http://www.albahari.com/threading/part4.aspx#_Memory_Barriers_and_Volatility
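
A minimal sketch of one way out, assuming the flag can be moved from a captured local variable into a field (locals cannot be declared volatile). The class name VolatileFlagDemo and the Run method are made up for this illustration.

```csharp
using System;
using System.Threading;

static class VolatileFlagDemo
{
    // volatile: every iteration really re-reads the field,
    // so the JIT cannot hoist the read out of the loop.
    private static volatile bool _complete;

    // Returns true if the worker thread finished within the timeout.
    public static bool Run()
    {
        _complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!_complete)
            {
                toggle = !toggle;
            }
        });
        t.Start();

        Thread.Sleep(1000);   // let the worker spin for a while
        _complete = true;     // signal the worker

        return t.Join(5000);  // wait up to 5 seconds for it to exit
    }
}
```

Calling VolatileFlagDemo.Run() from Main should return true even in a release build. A ManualResetEventSlim or a CancellationToken would express the same intent without busy-waiting.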

Example 3:

    class Test
    {
        private bool _flag = true;
        public void Run()
        {
            // Set _flag to false on another thread
            new Thread(() => { _flag = false; }).Start();
            // Poll the _flag field until it is set to false
            while (_flag) ;
            // The loop might never terminate!
        }
    }

Notice the ‘might’ – sometimes the loop terminates, and sometimes the .NET JIT compiler decides to read _flag only once and ‘cache’ it, as if the code were:
if (_flag) { while (true); }
I managed to reproduce the never-ending loop on 1 out of 3 tries, compiling in release mode with optimizations enabled.
Source: http://msdn.microsoft.com/en-us/magazine/jj883956.aspx

Example 4:

    class Program
    {
        class Test
        {
            private decimal _x;

            public void Write(decimal newValue)
            {
                _x = newValue; // just a 'simple' write
            }

            public void Read()
            {
                decimal v = _x; // read the written value
                Console.WriteLine(v);
            }

        }

        static void Main()
        {
            Test f = new Test();

            const int numThreads = 100;
            Thread[] threads = new Thread[numThreads];
            for (int i = 0; i < numThreads; i++)
            {
                if (i % 2 == 0)
                {
                    threads[i] = new Thread(() => f.Write(8999999999999999999999999999m));
                }
                else
                {
                    threads[i] = new Thread(f.Read);
                }
            }
            // some threads write while others read from the same memory location
            for (int i = 0; i < numThreads; i++)
            {
                threads[i].Start();
            }

            // wait for all threads to complete
            for (int i = 0; i < numThreads; i++)
            {
                threads[i].Join();
            }
            Console.ReadLine();
        }
    }

Is it possible to read a value that was never written? Yes :)
Source and explanation: http://www.bluebytesoftware.com/blog/2006/02/08/ThreadsafetyTornReadsAndTheLike.aspx
Why? The assignment _x = newValue; is not an atomic operation, even on 64-bit processors, because a decimal is stored on 128 bits. A reader can therefore observe a ‘torn’ value, made of part of the old value and part of the new one.
Even a write to a long field, such as x = 1231243124214423;, might be non-atomic on 32-bit processors.
An example that better reproduces the issue described above can be found at: http://stackoverflow.com/questions/11360645/should-i-always-synchronize-access-to-all-double-field-property-variables-that
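
As a hedged sketch of how torn reads can be avoided: a lock makes the 128-bit decimal access atomic with respect to other code that takes the same lock, and the Interlocked class covers 64-bit values on 32-bit processors. The class and member names below (SafeValues, _gate, _counter) are invented for this example.

```csharp
using System.Threading;

class SafeValues
{
    private readonly object _gate = new object();
    private decimal _x;
    private long _counter;

    public void Write(decimal newValue)
    {
        lock (_gate) { _x = newValue; }   // all 128 bits are written under the lock
    }

    public decimal Read()
    {
        lock (_gate) { return _x; }       // readers never see a half-written value
    }

    public void WriteCounter(long value)
    {
        Interlocked.Exchange(ref _counter, value);   // atomic 64-bit write, also on 32-bit
    }

    public long ReadCounter()
    {
        return Interlocked.Read(ref _counter);       // atomic 64-bit read, also on 32-bit
    }
}
```

The lock only helps if every reader and every writer goes through it; a single unsynchronized access brings the problem back.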

The conclusion? Try to avoid sharing state and instances between multiple threads, but if you can’t, make sure you read (twice) the excellent article on this topic by Joseph Albahari: http://www.albahari.com/threading/
Do not assume the code will always run on the x86/x64 architecture, where you can make some assumptions about memory ordering. Even if .NET 4.5 no longer supports Itanium processors, .NET code has recently started to run on ARM processors, which have a different instruction set and a weaker memory model.

Posted in .NET, C#

Tommy


Tommy, a set on Flickr.

Via Flickr:
My 2nd cat (one year old)

Posted in Uncategorized

Event aggregator in .NET

A few days ago I had to choose an event aggregator implementation for a project (ASP.NET in this case). Of course, there are plenty of alternatives, probably the most famous being the one from Prism. Since that one has dependencies on WPF (or Silverlight), I had to look elsewhere for ASP.NET.

After a bit of digging, I realized that the answer was in front of me: for recent .NET Framework versions (4.0 and above at least) we already have a library from Microsoft that comes with an event aggregator implementation: Reactive Extensions (RX).
It’s free and open source (https://rx.codeplex.com/ - Apache license).

The class I was looking for is Subject<T>: http://msdn.microsoft.com/en-us/library/hh229173(v=vs.103).aspx . Despite its name, it contains everything an event aggregator needs.

If we come back to the definition, the event aggregator pattern is very simple: ‘Channel events from multiple objects into a single object to simplify registration for clients. An Event Aggregator is a simple element of indirection. In its simplest form you have it register with all the source objects you are interested in, and have all target objects register with the Event Aggregator. The Event Aggregator responds to any event from a source object by propagating that event to the target objects.’

public sealed class Subject<T> : ISubject<T>, ISubject<T, T>, IObserver<T>, IObservable<T>, IDisposable

Let’s take a look at Subject<T> from RX: ‘A subject acts similar to a proxy. It performs as both a subscriber and a publisher. This is accomplished by supporting the IObserver and IObservable interfaces. The IObserver interface can be used to subscribe the subject to multiple streams or sequences of data. The data is then published through its IObservable interface to all subscribed observers.’

If we replace ‘subscribe’ with ‘register’, ‘observable’ with ‘source objects’, ‘observer’ with ‘target objects’, and ‘streams or sequence of data’ with ‘events’, we get an event aggregator, despite the different terminology.

We are interested in two methods:
- public void OnNext(T value) – used by the source to ‘trigger’ (publish) an event
- public static IDisposable Subscribe<T>(this IObservable<T> source, Action<T> onNext) – used by the target object to register (subscribe)

where T is the (custom) event type. In this context ‘event’ does not mean a .NET Framework event, but the generic concept of an event. T carries just the event payload and can be any type; in that respect it is similar to EventArgs in .NET.

How can we use the Subject<T> class as an event aggregator? Easy.
The only precondition is to have a way to pass the same Subject<T> instance to all interested parties (using a DI container or some other way). Building the subject is easy:

Subject<MyCustomEventType> subject = new Subject<MyCustomEventType>();

Subscribing is just as easy, from anywhere in the application:

...
var subscription = subject.AsObservable().Subscribe(ev =>
{
    // ... do something with the event
});
...
// unsubscribe when we are no longer interested, or before our class is disposed
subscription.Dispose();

Trigger the event from a very different and distant class:

// ...
subject.OnNext(new MyCustomEventType(...));

Since we have access to RX source code (http://rx.codeplex.com/SourceControl/changeset/view/2b5dbddd740b#Rx/NET/Source/System.Reactive.Linq/Reactive/Subjects/Subject.cs)
we can verify what’s really happening behind the scenes:

   public IDisposable Subscribe(IObserver<T> observer)
   {
       // ...
       var obs = oldObserver as Observer<T>;
       if (obs != null)
       {
          newObserver = obs.Add(observer);
       }
       else
       {
          newObserver = new Observer<T>(new ImmutableList<IObserver<T>>(new[] { oldObserver, observer }));
       }
       // ...
       return new Subscription(this, observer);
    }

- each time an observer is added, it is added to a list maintained by the (internal) Observer<T> class. Inside Observer<T> we have:

internal class Observer<T> : IObserver<T>
    {
        private readonly ImmutableList<IObserver<T>> _observers;
    // ...
    }

ImmutableList is just a thread-safe collection that builds a new array each time an element is added to it.
Microsoft will probably release a similar collection publicly soon: https://nuget.org/packages/Microsoft.Bcl.Immutable
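
To illustrate the ‘new array on every add’ idea, here is a minimal copy-on-write list. It is not the RX implementation: the CopyOnWriteList<T> name and the compare-and-swap retry loop are assumptions made for this sketch.

```csharp
using System;
using System.Threading;

public sealed class CopyOnWriteList<T>
{
    // The current array is never modified in place, only replaced.
    private T[] _items = new T[0];

    // Readers get a stable snapshot they can iterate without locking.
    public T[] Snapshot
    {
        get { return Volatile.Read(ref _items); }
    }

    public void Add(T item)
    {
        while (true)
        {
            T[] current = Volatile.Read(ref _items);
            var next = new T[current.Length + 1];
            Array.Copy(current, next, current.Length);
            next[current.Length] = item;

            // Publish the new array only if nobody replaced _items in the meantime;
            // otherwise retry against the newer array.
            if (Interlocked.CompareExchange(ref _items, next, current) == current)
            {
                return;
            }
        }
    }
}
```

Because a snapshot never changes after it is published, iterating over it while other threads add elements is safe, at the price of one array allocation per Add.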

When an event is triggered/published, the Subject class just iterates this collection:

public void OnNext(T value)
        {
            foreach (var observer in _observers.Data)
                observer.OnNext(value);
        }

To sum up, Subject<T> can be used as a basic event aggregator, even if it does not have some fancy features from Prism, like publishing the events on the subscriber UI thread.
Instead, we can use the full power of Reactive Extensions to filter the ‘stream’ of events in any way we want:

    using (subject.AsObservable()
                  .Where(se => se.Status == 1)
                  .Subscribe(se =>
                  {
                      eventWasRaised = true;
                      // ...
                  }))
    {
      // ...
    }

Of course, a quick Google search reveals that the above idea is nothing new and has been used by other people before:
http://machadogj.com/2011/3/yet-another-event-aggregator-using-rx.html or https://github.com/shiftkey/Reactive.EventAggregator
The only disadvantage is that it brings a dependency on Reactive Extensions.
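
Putting the pieces together, a thin wrapper could look roughly like the sketch below. It assumes the Reactive Extensions package is referenced; the EventAggregator class and its GetEvent and Publish methods are names chosen for this example, not part of RX. A single Subject<object> carries all events, and OfType filters them by payload type.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public sealed class EventAggregator : IDisposable
{
    // One subject for every event type; subscribers filter by payload type.
    private readonly Subject<object> _subject = new Subject<object>();

    // Target objects subscribe to the returned observable.
    public IObservable<TEvent> GetEvent<TEvent>()
    {
        return _subject.OfType<TEvent>();
    }

    // Source objects publish an event to whoever is subscribed.
    public void Publish<TEvent>(TEvent ev)
    {
        _subject.OnNext(ev);
    }

    public void Dispose()
    {
        _subject.Dispose();
    }
}
```

A subscriber would call aggregator.GetEvent<MyCustomEventType>().Subscribe(ev => ...) and dispose the returned subscription when it is no longer interested; a publisher only needs aggregator.Publish(new MyCustomEventType(...)).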

Posted in .NET, C#

Entity Framework async – behind the magic

As many devs have probably found out (http://blogs.msdn.com/b/adonet/archive/2012/10/30/ef6-alpha-1-available-on-nuget.aspx), the next Entity Framework version (6.0) will support the task-based asynchronous patterns that were introduced in .NET 4.5 (async and all the stuff).

I won’t go into detail on why async is useful or what problems it solves. Very briefly, and oversimplified: it keeps the UI responsive in client applications and, in web applications under heavy load, it avoids tying up the limited number of thread-pool threads while they wait for blocking operations that are not CPU-bound.
For a very good (but deep) tutorial on async in C# 5.0, read the series of posts on Eric Lippert’s blog: http://blogs.msdn.com/b/ericlippert/archive/tags/async/.
There I learned that the new async keyword is just syntactic sugar, but a very powerful one: implementing the equivalent code with the existing language features is possible, but quickly becomes painful in non-trivial cases.

I also won’t go into detail on why asynchronous DB calls aren’t always the best idea (http://blogs.msdn.com/b/rickandy/archive/2009/11/14/should-my-database-calls-be-asynchronous.aspx), even in a busy web application (in desktop apps that access the DB directly, a background thread can do the job just fine).

On a practical note, why was async support added to EF (not an easy task)? Probably the main reason was future support for EF inside WinRT, where each call longer than 50ms must be asynchronous (sure, most WinRT applications won’t need a local database, but still).

When I first tried the async support in EF 6 alpha 1, I was curious to know what really happens when an async DB call is made through EF, something that is not explained in many blog posts: is a new thread spawned, or..? Fortunately, the source code of most of the stack (EF6 and many parts of the .NET Framework) is now available, so I can see what’s happening.
For EF6 we can even do a
git clone https://git01.codeplex.com/entityframework.git
and have a local copy of the entire source code.
For the .NET Framework source code, the easiest way is to use ReSharper, which with a simple F12 transparently fetches the original source code from http://referencesource.microsoft.com/netframework.aspx.

So, what’s really happening behind a nice call like this?

        private static async Task SaveInput(string question, string answer)
        {
            using (var ctx = new AnswersContext())
            {
                var answerObj = new Answer() { Content = answer };
                // ...
                await ctx.SaveChangesAsync();
            }
        }

After skipping over several layers of Entity Framework code, at the bottom of the EF implementation I found this, somewhere deep inside the DynamicUpdateCommand class:

 
internal override async Task<long> ExecuteAsync(
//...
rowsAffected = await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(continueOnCapturedContext: false);
//...

where ‘command’ is a DbCommand.
So, yes, EF6 just delegates the async call to the ADO.NET provider; no new thread has been created yet.
No ReSharper needed so far, since I had the source code of EF6 alpha.

Does ADO.NET have task-based async calls? Yes, it does, since .NET 4.5.
What does this mean? That all companies that offer ADO.NET providers will have to update them in order to offer true task-based async calls.
Fortunately, if a provider does not override the method, the base class (DbCommand) falls back to executing the synchronous version:

public virtual Task<int> ExecuteNonQueryAsync(CancellationToken cancellationToken)
// ...
return Task.FromResult<int>(this.ExecuteNonQuery());
// ... 
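
The practical consequence of that fallback can be shown with a small stand-in. SyncOnlyCommand is a hypothetical class written for this sketch (it is not an ADO.NET type); it mimics a provider that only has the synchronous implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a provider command that does not override the async method.
class SyncOnlyCommand
{
    public int ExecuteNonQuery()
    {
        Thread.Sleep(200);   // simulates a blocking database round trip
        return 1;            // pretend one row was affected
    }

    // Same shape as the fallback shown above: run synchronously, then wrap the result.
    public Task<int> ExecuteNonQueryAsync()
    {
        return Task.FromResult(ExecuteNonQuery());
    }
}
```

By the time ExecuteNonQueryAsync returns, the task is already completed: the calling thread was blocked for the whole round trip, so awaiting it frees nothing. The code still works, it just is not asynchronous.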

Since we are using the ADO.NET provider for SQL Server, this method is hopefully truly async. If we dig inside SqlCommand from .NET Framework 4.5 (with a decompiler), we find this:

public override Task<int> ExecuteNonQueryAsync(CancellationToken cancellationToken)
{ // ...
Task<int>.Factory.FromAsync(
  new Func<AsyncCallback, object, IAsyncResult>(this.BeginExecuteNonQueryAsync), 
  new Func<IAsyncResult, int>(this.EndExecuteNonQueryAsync), (object) null)
         .ContinueWith((Action<Task<int>>) (t => // ...
// ...
}
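
The FromAsync call above adapts an older Begin/End method pair to a Task. The same technique can be tried on any type that exposes such a pair; the sketch below uses Stream.BeginRead and Stream.EndRead only because they are easy to run, and FromAsyncDemo and ReadAsTask are names made up for the example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class FromAsyncDemo
{
    // Wraps the Begin/End pair of a Stream into a Task<int>.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead,            // starts the operation
            stream.EndRead,              // completes it and returns the number of bytes read
            buffer, 0, buffer.Length,    // arguments passed on to BeginRead
            null);                       // no state object
    }
}
```

FromAsync itself starts no thread; it only completes the task when the End method can be called, which is why the behaviour of the underlying Begin/End implementation is what matters.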

Going further down to the bottom of the SqlClient asynchronous implementation, we reach the same code that is executed when BeginExecuteNonQuery is called, which has been present in ADO.NET for a long time (since .NET 2.0).
Even if the code has been refactored since then, the description of how this is achieved remains valid (from ‘Asynchronous Command Execution in ADO.NET 2.0’):
‘ADO.NET/SqlClient asynchronous command execution support is based on true asynchronous network I/O under the covers (or non-blocking signaling in the case of shared memory)’

More details on what happens at the TDS level (the protocol used to communicate with SQL Server) are described at:
http://blogs.msdn.com/b/adonet/archive/2012/04/20/using-sqldatareader-s-new-async-methods-in-net-4-5-beta.aspx

Below that is only the TDS implementation and the support offered by Windows for shared memory or async network I/O, a very interesting subject in itself, for those who like to understand the building blocks.
For a more high-level explanation of why async in C# 5.0 does not require any extra thread (except those already created by the operating system and shared by many applications), see: http://blogs.msdn.com/b/ericlippert/archive/2010/11/04/asynchrony-in-c-5-0-part-four-it-s-not-magic.aspx

The conclusion: trust the implementation :) , but read an article like http://msdn.microsoft.com/en-us/magazine/hh456402.aspx on async performance before going in that direction.
Also, if somebody is looking for a fire-and-forget async call into SQL Server, where the result does not matter, the async calls from ADO.NET are not the answer. Remus Rusanu has an interesting article on this: http://rusanu.com/2009/08/05/asynchronous-procedure-execution/

Posted in .NET, Entity Framework, Uncategorized

Debugging a performance issue

One more post in the “back to basics” series.. :) These days I was playing with a generic HTTP handler (.ashx) in ASP.NET 4.0 (it could just as well be a web service; it doesn’t matter in this context).
The HTTP handler tried to do a simple task: return several objects to the client, encoded as JSON. Usually there were around 50 objects, each with 200 string properties (don’t ask why). The list of objects was encoded using Newtonsoft Json.NET and sent to the browser to be displayed using jQuery – nothing special here.

When I first ran the code, surprise: it took the server around 1-2 seconds to send the response to the browser, even though everything ran on a fast quad-core machine, the processor was not busy with anything else, and the browser was on the same computer (so no network latency..).

The code was also pretty basic:

    public void ProcessRequest(HttpContext context)
    {
      // ...
      context.Response.ContentType = "application/json";
      context.Response.BufferOutput = false;
      // ...
      ResultsList rows = GenerateResultsList(...); // create ~ 50 'rows', each with 200 string properties
      // serialize and write the list of objects directly to the HttpResponse, as JSON
      JsonSerializer jsonSerializer = new JsonSerializer();
      using (JsonWriter writer = new JsonTextWriter(context.Response.Output))
      {
        jsonSerializer.Serialize(writer, rows);
      }

      context.Response.Flush();
    }

Quite low-level, but simple.
Probably you already spotted the problem in the above code, but since the real code was split into several separate classes and methods, it took me a few more steps to discover the issue.

Trying to figure out what was happening, I compiled in release mode, fired up Fiddler and got this:
[Screenshot: Fiddler statistics]
- over 2s and ~ 800KB of content.

Taking a closer look at the raw response, I saw this:
[Screenshot: Fiddler - raw response]
- tiny chunks of data, compressed with GZIP..

Something was not quite right..

Since a local download of 800KB of data should be much faster, I fired up Redgate ANTS Profiler to see where the bottleneck was:
[Screenshot: ANTS Profiler - the culprit]
Pretty clear: even though I call JsonSerializer.Serialize only once, the response is flushed in tiny bits, 20,000 times (digging deeper with Reflector: once for every JSON property and value, which matches 50 objects × 200 properties × 2).
Indeed, the browser receives the response immediately, as soon as it’s generated, but at what cost..

The fix was simple:
context.Response.BufferOutput = true;
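
The effect can be reproduced outside ASP.NET with a rough analogy: a StreamWriter with AutoFlush enabled pushes every small write straight to the underlying stream, much like an unbuffered response, while the default buffered mode groups them. CountingStream and BufferingDemo are hypothetical helpers written for this sketch, and the written fragment is just a placeholder for one JSON property.

```csharp
using System;
using System.IO;

// Hypothetical sink that only counts how many Write calls reach it.
sealed class CountingStream : Stream
{
    public int WriteCalls { get; private set; }

    public override void Write(byte[] buffer, int offset, int count) { WriteCalls++; }
    public override void Flush() { }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

static class BufferingDemo
{
    // Writes 20,000 small fragments and reports how many writes hit the sink.
    public static int CountWrites(bool autoFlush)
    {
        var sink = new CountingStream();
        using (var writer = new StreamWriter(sink) { AutoFlush = autoFlush })
        {
            for (int i = 0; i < 20000; i++)
            {
                writer.Write("\"p\":\"v\",");
            }
        }
        return sink.WriteCalls;
    }
}
```

Comparing BufferingDemo.CountWrites(true) with BufferingDemo.CountWrites(false) should show far fewer writes reaching the sink in the buffered case, which is the same trade-off as BufferOutput.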

Now the profiling looks much nicer:
[Screenshot: ANTS Profiler 2]

The amount of data transferred over the wire also looks better:
[Screenshot: Fiddler statistics final]
- the response is almost eight times smaller.

The raw HTTP response also has only one gzipped chunk:
[Screenshot: Fiddler raw final]

Note to self: don’t try to be clever by thinking that sending the response immediately to the browser will always improve the ‘responsiveness’ of the application; don’t be quick to blame the JSON serializer for performance issues when the cause might be in my own code. :) When you see a performance problem, measure and profile the code, don’t try to guess the cause, unless it’s obvious.

Posted in .NET, Web

A Romanian “Yellowstone”


A Romanian "Yellowstone", a set on Flickr.

In and around Bodoc mountains, Balvanyos and St. Ana lake

Posted in Photos

MSTest and continuous integration without Visual Studio

I recently had to set up a CI server (Jenkins, but this applies to any other, like CC.Net or TFS) and had to automate the execution of unit tests after each build.
The project uses MSTest, and everywhere I searched I read that MSTest can be installed only by installing Visual Studio too, or by copying the bits by hand in a painful manual process (http://www.shunra.com/shunrablog/index.php/2009/04/23/running-mstest-without-visual-studio/ , http://sparethought.wordpress.com/2011/07/12/mstest-2010-on-the-build-server-without-vs2010-installed/ etc.).
Since I don’t enjoy having to manually register assemblies in the GAC or set registry entries, I searched for a better way.

After thinking for a while, I realized that Microsoft does not ask us to install a full-blown Visual Studio on each TFS build agent, especially when we have several of them. I searched and found this: Visual Studio Test Agents 2010. It’s an installer that lets me install just the VS Test Agent part, without having to keep the agent running all the time.
Sure, it contains more than just MSTest, but it is much smaller (less than 300MB) than a full VS2010 installation, which doesn’t let me install just MSTest.
The test agent installed MSTest where I expected it to be: C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE\…

Of course, before using this in a real-world project, make sure you check the licensing restrictions – probably each developer using the CI server must have a valid VS2010 Premium, Ultimate or Test Professional license.

Posted in .NET, C#