The Hour-Long SQL Query, 2023 Edition

Back in May of 2015, I found an odd performance issue in the SQL libraries for .NET Framework. I blogged about it at the time, but essentially, reading data from a SQL column by calling IDataReader.GetChars(...) gave us O(N²) performance. This means that as the data grows, the time it takes to read it grows quadratically: double the data and the read takes four times as long.

The other night I decided to give it another shot. Fortunately, past me left just about all the code I needed in the original blog post (linked above), so I only had a few tweaks to make to get the original test up and running.

Yet at the same time, the code was running on a very different system. Back in 2015 I was on .NET Framework, but since then Microsoft have copied, ported, and refactored code as they brought features to .NET Core and then delivered performance improvement after performance improvement after performance improvement after performance improvement after performance improvement.

(If you have a spare few days I recommend reading those very long posts; there is some very interesting low-level performance stuff in there.)

They also discontinued the old System.Data.SqlClient libraries and introduced the newer Microsoft.Data.SqlClient libraries (also on GitHub), so although I expect that Microsoft mostly renamed the old code, we could be in for something completely different.

The other thing that has changed in all this time is that the .NET Framework bug tracker on Microsoft Connect - and indeed, all of Microsoft Connect - has been shut down and wiped from the internet, so there is no trace left of my original bug report. Fortunately Microsoft now operate on GitHub Issues, so at least it's not like they don't have a bug tracker at all.

I updated the original sample code to reproduce this so that I could now run it on .NET Framework or .NET Core, and so that I could choose whether to run it against the older System.Data.SqlClient libraries or the newer Microsoft.Data.SqlClient libraries.

Armed with this, I set up a brand new Windows Server 2022 VM, installed SQL Server 2022 Developer Edition, and ran the tool in all four configurations.

Screenshot: a Windows command prompt that ran the sample code three times, showing that it took 1 second to read 1MB of data, 4 seconds to read 2MB of data, and 9 seconds to read 3MB of data. Caption: I have never seen such a perfect O(N²) example before in my life.

In all four configurations, the sample application showed a very clear O(N²) performance characteristic - such a perfect representation that I don't even think I need to graph it this time. On this particular VM, the amount of time it takes to read N megabytes of XML is almost exactly N² seconds.
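
As a quick sanity check on that claim, here is a minimal Python sketch that takes the three timings from the screenshot, confirms that seconds divided by megabytes squared stays constant, and then extrapolates. The function names are made up for illustration, and the 60MB figure is an extrapolation that assumes the quadratic trend keeps holding; it was not measured.

```python
# Timings read off the screenshot above: megabytes read -> seconds taken.
measured = {1: 1.0, 2: 4.0, 3: 9.0}


def quadratic_constant(samples):
    """Return seconds / MB^2 for each sample; near-equal values suggest O(N^2)."""
    return [seconds / (mb ** 2) for mb, seconds in sorted(samples.items())]


def predicted_seconds(mb, k=1.0):
    """Extrapolate with t = k * N^2 (assumes the quadratic trend continues)."""
    return k * mb ** 2


if __name__ == "__main__":
    # Ratio of seconds to MB^2 at each measured size.
    print(quadratic_constant(measured))
    # Hypothetical 60MB read, converted from seconds to minutes.
    print(predicted_seconds(60) / 60, "minutes for 60MB")
```

On these numbers the ratio is 1 second per MB² at every size, so a hypothetical 60MB document would take roughly 3,600 seconds, which is where the hour in the title comes from if the trend holds.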

The original workarounds from 2015 all still work, and I have now escalated this as a new GitHub Issue.

Back in 2015, I concluded with:

"Hopefully this will be fixed in a future .NET Framework release."

Unfortunately for my hopes, Microsoft announced in 2019 that "future investments in .NET will happen" for .NET Core, and that .NET Framework will basically only get "bug-, reliability- and security fixes" so there is now almost no chance of that happening.

Hopefully though this will at least get fixed in Microsoft.Data.SqlClient for all platforms that it supports. Perhaps. One day.
