Cleaning up .NET with IDisposable and finalizers

One of the most common mistakes I see in .NET code is the misuse of finalizers.

Finalizers should only be used to clean up unmanaged resources. There is no guarantee of the order in which finalizers will be called, and they are only given a short time to complete. Finalizers should never call Dispose on anything, and should never do things like flush a stream. If you get to the point where a finalizer needs to flush a stream, you have a bug elsewhere: the object was not properly flushed and disposed.

If you do not have any unmanaged resources to clean up, do not implement a finalizer. Cleaning up managed resources should be done by implementing IDisposable. This is also where large references may be set to null in order for the garbage collector to collect them even if a reference to the object itself still exists. For example, MemoryStream does not set the reference to its buffer to null, so disposing it does not free the memory it consumes. In order to get the memory back from a MemoryStream, you need to remove the reference to it. Luckily, declaring the variable in a using statement will also mean that it goes out of scope after it is disposed and may be garbage collected.
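A small sketch of that last point (the `Process` method and `data` parameter are just examples for illustration):

```csharp
byte[] Process(byte[] data)
{
    byte[] result;
    using (var ms = new MemoryStream())
    {
        ms.Write(data, 0, data.Length);
        result = ms.ToArray();
    } // ms is disposed here and goes out of scope, so the GC can reclaim
      // it (and its internal buffer) even though `result` lives on
    return result;
}
```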

When implementing a finalizer and IDisposable on the same class, we always want the finalizer to be run, but we also want to be able to clean up both managed and unmanaged resources early if Dispose is called.

This is where the Dispose(bool disposing) pattern comes from, in which managed resources should only be disposed if disposing is true, which is where a lot of people go wrong.

For this reason, I suggest being explicit and creating two virtual methods rather than one, telling people exactly what they should be doing in those methods. Overriding methods in derived classes should also ensure that they always call their base class’ implementation.
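Here is one sketch of what I mean (the names `DisposeManaged` and `DisposeUnmanaged` are my own suggestion, not a standard API; the `IntPtr` handle is a stand-in for a real unmanaged resource):

```csharp
public class Resource : IDisposable
{
    private Stream stream;  // example managed resource
    private IntPtr handle;  // example unmanaged resource

    public void Dispose()
    {
        // Early, explicit cleanup: both managed and unmanaged.
        DisposeManaged();
        DisposeUnmanaged();
        GC.SuppressFinalize(this);
    }

    ~Resource()
    {
        // Finalizer: unmanaged only. Managed objects we reference may
        // already have been finalized, so we must not touch them here.
        DisposeUnmanaged();
    }

    protected virtual void DisposeManaged()
    {
        // Overrides should call base.DisposeManaged().
        stream?.Dispose();
        stream = null;
    }

    protected virtual void DisposeUnmanaged()
    {
        // Overrides should call base.DisposeUnmanaged().
        // e.g. CloseHandle(handle); (hypothetical)
    }
}
```

Because each method has a single, clearly named responsibility, a derived class cannot accidentally dispose managed resources from the finalizer path the way it can by mishandling the `disposing` flag.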

See also: Implementing Finalize and Dispose to Clean Up Unmanaged Resources (MSDN)

Problems with Transient Fault Handling Application Block (Topaz)

Microsoft’s Transient Fault Handling Application Block (Topaz) is a good way of getting all retry behaviour into the same shape, but it has some problems.

Problems with RetryPolicy.ExecuteAsync
One problem is that the CancellationToken is not passed to Task.Delay, so you can’t cancel mid-delay, which is a major problem if you use delays of more than a few seconds.

Another problem is that if you do cancel, the last (failed) task is returned, so if you cancel before the first exception you get a cancellation, but if you cancel after the first exception you get the last exception thrown. This is inconsistent behaviour, and if you need to know earlier exceptions, there’s already a Retrying event which contains them. If an operation is cancelled, you shouldn’t need to care what the previous exception was, because the final cause of incompletion was cancellation, not an exception.

Also, if you cancel, you have to handle OperationCanceledException in your detection and retry strategies, which gets old fast. It should be safe to treat a cancellation as a non-transient exception at the RetryPolicy level, because a CancellationToken can’t be un-cancelled. The only way you could get a different result after a cancellation is if you were throwing your own OperationCanceledException or using a new CancellationTokenSource inside each retry, both of which I would say are code smells.

The recursive nature of AsyncExecution also means that using ExecuteAsync will cause memory to slowly grow for each retry until the retry loop completes. That’s fine if you have a few retries, but not if you want to retry indefinitely.

Problems with API design
The API is relatively messy and convoluted for the simple functionality it provides. RetryStrategy has an abstract GetShouldRetry method which returns a ShouldRetry delegate, rather than just having an abstract ShouldRetry method to implement.

RetryPolicy has some methods marked as virtual, such as ExecuteAction, but not others, such as ExecuteAsync. There should never be a reason to derive from RetryPolicy anyway, when all of the behaviour you need to manipulate is contained in its dependencies: ITransientErrorDetectionStrategy and RetryStrategy.

There is no need for all of the overloaded constructors of RetryPolicy, most of which just use the extra parameters to construct a RetryStrategy, which could have been done more cleanly by the caller.

There is some redundancy between ITransientErrorDetectionStrategy and RetryStrategy, as both are asked whether the operation should be retried. A library provider may wish to define a detection strategy only, but this should be used by the consumer’s RetryStrategy rather than the RetryPolicy asking both.

RetryStrategy.FastFirstRetry is something which should be used by the RetryStrategy as part of ShouldRetry rather than used by the RetryPolicy.

Other problems
There is some weird behaviour, like calling Task.Delay(...).Wait() rather than Thread.Sleep in ExecuteAction, and some potential bugs, like not saving the Retrying event handler to a local variable in OnRetrying.

A solution
Here’s an implementation, based loosely on the RetryPolicy source, which I believe is an improvement:
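The original code block is no longer embedded here, so the following is a minimal sketch of the shape I have in mind, not the original source. The `IRetryStrategy` interface is my own simplification (a single abstract `ShouldRetry` rather than Topaz’s delegate-returning `GetShouldRetry`):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public interface IRetryStrategy
{
    bool ShouldRetry(int retryCount, Exception lastException, out TimeSpan delay);
}

public class RetryPolicy
{
    private readonly IRetryStrategy strategy;

    public RetryPolicy(IRetryStrategy strategy) => this.strategy = strategy;

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, CancellationToken cancellationToken)
    {
        // A loop rather than recursion, so memory does not grow per retry.
        for (int retryCount = 0; ; retryCount++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                throw; // cancellation is never transient
            }
            catch (Exception ex)
            {
                if (!strategy.ShouldRetry(retryCount, ex, out TimeSpan delay))
                    throw;

                // The token is passed to Task.Delay, so cancellation can
                // interrupt a pending delay instead of waiting it out.
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            }
        }
    }
}
```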

The same behaviour is provided for both ExecuteAction and ExecuteAsync and cancellation is handled correctly.

The following CompatibilityRetryStrategy can be used to migrate existing implementations of ITransientErrorDetectionStrategy and RetryStrategy. You could also use something similar to make use of an existing ITransientErrorDetectionStrategy, such as SqlDatabaseTransientErrorDetectionStrategy.
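That adapter could look something like this sketch, assuming the simplified `IRetryStrategy` interface above and Topaz’s public `GetShouldRetry`/`IsTransient` members:

```csharp
public class CompatibilityRetryStrategy : IRetryStrategy
{
    private readonly ITransientErrorDetectionStrategy detection;
    private readonly ShouldRetry shouldRetry;

    public CompatibilityRetryStrategy(ITransientErrorDetectionStrategy detection, RetryStrategy strategy)
    {
        this.detection = detection;
        this.shouldRetry = strategy.GetShouldRetry();
    }

    public bool ShouldRetry(int retryCount, Exception lastException, out TimeSpan delay)
    {
        // Ask the detection strategy first, so the policy itself no longer
        // needs to consult both objects.
        if (!detection.IsTransient(lastException))
        {
            delay = TimeSpan.Zero;
            return false;
        }
        return shouldRetry(retryCount, lastException, out delay);
    }
}
```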

At the time of writing, the reference source was last updated 18 Aug 2015, but the NuGet package was last updated 26 April 2013, so it’s possible that it’s just not being distributed via NuGet any more. However, the only fix I can see in there is passing the CancellationToken to Task.Delay.

Update:
Enterprise Library is no longer being developed so if there are any problems with libraries such as those described here, you should just migrate away from the library. However, using a consistent and reusable approach to retrying is useful. The code here can be used as a starting point to develop your own retry behaviour.

Remove UAC from a specific application.

This will remove the UAC prompt from a specific application executable.

It assumes that “Run this program as an administrator” is already unchecked on the compatibility tab and the executable still requests elevated privileges.

This should only be done if the executable doesn’t actually need elevated access but has been incorrectly set to prompt every time, if it only needs elevated access for a feature that you do not use, or if you don’t trust the executable with elevated access. Some features of the application may not work without elevated access.

1) Make a backup copy of your executable.

2) Create a manifest file:
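For example, a minimal manifest that requests no elevation (save it next to the executable; the description is up to you):

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level="asInvoker" uiAccess="false"/>
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>
```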

3) Embed the manifest file into the executable (requires Windows SDK):
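Assuming the manifest was saved as myapp.exe.manifest (the file names here are examples), run this from a Windows SDK command prompt:

```
mt.exe -manifest myapp.exe.manifest -outputresource:myapp.exe;#1
```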

Thanks to Karan.

Compile ffmpeg 64 bit on Windows with MSYS/MinGW-w64

Set up the MSYS environment

Download MSYS from MinGW-builds. Extract it to a path with no spaces, to which you have write permissions. For example, D:\msys. This already includes useful features like pkg-config, Autotools and Git.

Download a pre-built MinGW-w64 from drangon.org and extract it inside the MSYS directory. This already includes useful features like Yasm.

Start MSYS with msys.bat

In MSYS, run:
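The original commands are no longer embedded here; the gist of them is to make the extracted MinGW-w64 toolchain visible to MSYS, something like the following (the D:\msys path is an example from above; adjust to your own extraction directory):

```
mount 'D:\msys\mingw64' /mingw
export PATH=/mingw/bin:$PATH
```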

Check that it has worked with gcc -v

Compile ffmpeg and libraries

These steps are similar to the Linux/BSD version of this guide.

NOTE: configure on ffmpeg in mingw is slow. Be patient. You should also check for success after each library has compiled.
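The original build commands are no longer embedded here, but the ffmpeg step has roughly this shape (flags assumed from the Linux/BSD version of the guide this post references; the library list is an example):

```
cd ffmpeg
./configure --enable-gpl --enable-libx264 \
    --disable-shared --extra-ldflags=-static
make
```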

This should build ffmpeg.exe as a 64 bit static executable that can be run directly in Windows x64, with H.264 and AAC support. It does not need to be run from MSYS. In my testing, the 64 bit version is approx. 10% faster than the 32 bit version.

If you want it to use DLLs instead of creating a static executable, change --disable-shared to --enable-shared and remove the -static from the ldflags in the ffmpeg configure.

You may also want to use --enable-avisynth (64 bit port). SDL is required for ffplay.

Park or Unpark your CPU cores the easy way.

CPU cores that are not under heavy load get parked to save on power, reduce heat, etc.

Some users (not me) may get a performance benefit out of disabling this parking and keeping their cores unparked.

There is a commonly used tool to do this, but it is slow and somewhat convoluted. All this tool does is modify a few values in the registry.

The different keys it modifies are actually different control sets. As CurrentControlSet is a pointer, it is the only one you need to modify. In fact, as the numbered control sets store the “last known good configuration” that you may see when you recover from a crash, you probably shouldn’t change them directly.

This means that all you need to do is:

Step 1:

Run regedit as Administrator
(type regedit in start menu, right click, Run as Administrator).

Step 2:

Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\PowerSettings\
54533251-82be-4824-96c1-47b60b740d00\0cc5b647-c1df-4637-891a-dec35c318583

(you can just go to “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet”, right click, find, search for 0cc5b647-c1df-4637-891a-dec35c318583)

Step 3:

To allow parking, set ValueMax to 100 decimal (0x64 hex).
To unpark, set ValueMax to 0.

Alternatively, set Attributes to 0 and you will be able to change the setting easily from Power Options in the Control Panel.

Step 4:

Reboot.
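The same registry edit can be made from an elevated command prompt in one line; for example, to unpark (this assumes ValueMax is a REG_DWORD, which is what regedit shows):

```
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Power\PowerSettings\54533251-82be-4824-96c1-47b60b740d00\0cc5b647-c1df-4637-891a-dec35c318583" /v ValueMax /t REG_DWORD /d 0 /f
```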


Core parking tool

Alternatively, I made a very simple tool to do it in one click. Admin rights and .NET 2.0 are required to run it, and a reboot is required for the setting to take effect. Source code is in the archive.

See also: BitSum ParkControl.

You can also use the PowerCfg command line.
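For example (run from an elevated prompt; CPMINCORES is the alias for the “minimum unparked cores” setting, so 100 keeps every core unparked):

```
powercfg /setacvalueindex scheme_current sub_processor CPMINCORES 100
powercfg /setactive scheme_current
```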

YouTube My Subscriptions grid view

YouTube has finally removed the old grid view my_subscriptions page. If you don’t like the frankly broken and unusable /feed/subscriptions page, there is a Chrome extension that will fix it for you.

Better Youtube Subscriptions Page
“Transforms Youtube’s broken feed subscription page and restore it to it’s former grid glory.”

In fact, I actually prefer it over the old grid page because it fills the grid to the width of the page, showing more on wider resolutions. It doesn’t seem to hide watched videos, but I didn’t like the grid page doing that anyway because it also hid partially watched videos (anything viewed for more than a couple of seconds).

This extension does not require strange permissions or an OAuth login (as YouTube Video Deck does) or anything like that; it just alters your subscription feed to appear as a grid.

As a side note, the YouTube API actually doesn’t need authentication to get channel or video information. You should only need to give login auth if it tracks what you have watched, automatically gets your subscriptions from your YouTube account or anything else that involves your account directly. Although it’s better than a third-party website saving your password, you should still be very careful who you give your OAuth authentication to, as it essentially logs them into your Google account for you.

It’s also compatible with the YouTube Ratings Preview extension, which shows the green/red rating bar on the thumbnail before you view the video, although this seems to only show ratings previews on the first 30 or so videos on each page.

Introduction to MFC

The Microsoft Foundation Class library is a framework for Windows GUIs in C++, based on the Win32 API C library.

To create a simple window, you need to create an app and a window.

The app class handles most of the behind the scenes stuff, including program entry. There must be one globally defined instance of the app for the program to run.

Here is a very basic MFC application. It just shows a blank window.
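The original listing is no longer embedded here, so this is a reconstruction of the classic minimal form (class names are my own):

```cpp
#include <afxwin.h>

// The window: a bare frame created in the constructor.
class CMainWindow : public CFrameWnd
{
public:
    CMainWindow() { Create(NULL, _T("Hello MFC")); }
};

// The app: handles program entry and the message loop behind the scenes.
class CMyApp : public CWinApp
{
public:
    virtual BOOL InitInstance()
    {
        m_pMainWnd = new CMainWindow();
        m_pMainWnd->ShowWindow(m_nCmdShow);
        m_pMainWnd->UpdateWindow();
        return TRUE;
    }
};

// The one globally defined app instance the framework requires.
CMyApp theApp;
```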

If you want to create a form with buttons, text boxes etc, you should probably look at CFormView. Alternatively, create a dialog-based application in Visual Studio rather than single document (SDI).

The solution to most of the US’ political problems

In a US election, most states are pretty much guaranteed to vote a certain way. Amazingly, these political parties are actually distinguishable from one another (this is not the case in my country, in which all politicians are equally corrupt, greedy liars that may as well be in a single political party). These parties have been arguing over the same points for decades.

The solution is obvious. Split the Republican states and the Democratic states into two separate countries. Swing states can either pick and choose, redefine their borders or emigrate. A couple of suggestions for the name of the resulting Republican country are Acirema and ‘merica.

This way, the Democratic country can have free healthcare, while the Republican country can have all the guns they want and so on.

P.S. This post is tongue in cheek. Don’t take it too seriously (unless you’re a US president and think it’s a good idea!)

SQL Server random unique identifiers

A common method of producing random unique identifiers in SQL Server is a GUID field, calling newid() to generate the data. For the most part, this works because it’s 128 bits worth of random data, which means there is a very low probability of duplicate records for most databases.

However, it is also common to combine this with the checksum() function to reduce it to a 32 bit integer. This makes collisions much more likely, even in relatively small databases. For example, the GUIDs 28258F69-6536-4198-BE37-94960ABF054F and 49B60D4B-DC4A-4E18-825E-B4C99713D011 both checksum to 0xC3AD13D3. With a table of around 100,000 rows, the birthday paradox means collisions will start to occur frequently.

Using this maths, we can see that using a 32 bit random number, the probability of getting at least one collision is 50% at around 77,500 rows and 99% at 200,000 rows. We can also see that if we increase this to a 53 bit number, 10 million rows gives a 0.55% chance of getting at least one collision and 100 million rows gives a 42.5% chance of getting at least one collision, so 64 bit should be plenty.

For higher precision numbers, we can use mpmath
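The original snippet is no longer embedded here; a sketch of the idea, using the standard birthday-paradox approximation P ≈ 1 − e^(−n(n−1)/2N) at high working precision:

```python
from mpmath import mp, mpf, exp

mp.dps = 50  # 50 decimal digits of working precision

def collision_probability(rows, bits):
    """Approximate P(at least one collision) among `rows` uniformly
    random `bits`-bit values, via the birthday-paradox approximation."""
    n = mpf(rows)
    space = mpf(2) ** bits
    return 1 - exp(-n * (n - 1) / (2 * space))

print(collision_probability(10**9, 64))   # ~0.026 (2.6%)
print(collision_probability(10**10, 64))  # ~0.93  (93%)
```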

From here you can see that a 64 bit number has a 2.6% chance of getting a single collision in a 1 billion row table, and a 93% chance in a 10 billion row table.

A compromise of both is to simply truncate the GUID at 64 bits and optionally convert to a bigint.
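One hedged sketch of that truncation in T-SQL (converting a uniqueidentifier to a shorter binary keeps the leading bytes; verify the truncation behaviour and byte order on your SQL Server version before relying on it):

```sql
-- Keep the first 8 bytes of a GUID and reinterpret them as a bigint.
SELECT CONVERT(bigint, CONVERT(binary(8), NEWID())) AS ShortId;
```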

If you leave it as binary and don’t need to convert to an integer type, this does not have to be 8 bytes. For example, you could have a 5 or a 10 byte code.

None of these are perfect but the probability of a collision decreases with more bits. If 128 bit is too long for you (e.g. to display to users) but 32 bit generates too many collisions, try a compromise such as 64 bit.

If you are consistent enough, you may even be able to store the original GUID and just display the truncated form, which could allow you to change the length displayed later without changing the probability of collisions. This is more flexible but may lead to confusion among users and consistency is required (differing lengths could lead to bugs).