German and .NET System.Globalization.CultureInfo - string sorting, it is obvious, now

Sometimes when programming, you have to check your sanity by writing a small test application to confirm that the behaviour you expect is the behaviour you get. I needed to do this a few moments ago, to understand sort ordering in a .NET application. More pointedly, a bug had been logged in JIRA, and my task was to diagnose the problem, prioritize it, and fix it. The reported bug was that information about a trade was incorrectly ordered/sorted.

The observation is best explained with the code snippet below, which I wrote to confirm how .NET orders strings under different cultures.

using System;
using System.Globalization;

namespace ConsoleApp4792
{
    class Program
    {
        static void Main(string[] args)
        {
            // https://msdn.microsoft.com/en-gb/library/ee825488(v=cs.20).aspx
            var cultures = new[]
            {
                new CultureInfo("de-DE_phoneb"), // German - Germany, phone book sort order
                new CultureInfo("de-DE"),        // German - Germany, default sort order
                new CultureInfo("de-CH"),        // German - Switzerland
                new CultureInfo("en-GB")         // English - United Kingdom
            };

            var pairs = new[]
            {
                new { s1 = "Duesseldorf", s2 = "Düsseldorf" },
                new { s1 = "Strasse", s2 = "Straße" },
                new { s1 = "aaa", s2 = "bbb" },
                new { s1 = "aaa", s2 = "aaa" },
                new { s1 = "bbb", s2 = "aaa" }
            };

            foreach (var s in pairs)
            {
                foreach (var ci in cultures)
                {
                    // Negative: s1 sorts before s2; zero: same sort position; positive: s1 sorts after s2.
                    Console.WriteLine($"String.Compare(.. {ci} ..) == {String.Compare(s.s1, s.s2, ci, CompareOptions.None)}\ts1={s.s1}, s2={s.s2}");
                }
                Console.WriteLine();
            }
        }
    }
}

The console output from application execution is shown below:

String.Compare(.. de-DE_phoneb ..) == 0  s1=Duesseldorf, s2=Düsseldorf
String.Compare(.. de-DE ..) == -1        s1=Duesseldorf, s2=Düsseldorf
String.Compare(.. de-CH ..) == -1        s1=Duesseldorf, s2=Düsseldorf
String.Compare(.. en-GB ..) == -1        s1=Duesseldorf, s2=Düsseldorf

String.Compare(.. de-DE_phoneb ..) == 0  s1=Strasse, s2=Straße
String.Compare(.. de-DE ..) == 0         s1=Strasse, s2=Straße
String.Compare(.. de-CH ..) == 0         s1=Strasse, s2=Straße
String.Compare(.. en-GB ..) == 0         s1=Strasse, s2=Straße

String.Compare(.. de-DE_phoneb ..) == -1 s1=aaa, s2=bbb
String.Compare(.. de-DE ..) == -1        s1=aaa, s2=bbb
String.Compare(.. de-CH ..) == -1        s1=aaa, s2=bbb
String.Compare(.. en-GB ..) == -1        s1=aaa, s2=bbb

String.Compare(.. de-DE_phoneb ..) == 0  s1=aaa, s2=aaa
String.Compare(.. de-DE ..) == 0         s1=aaa, s2=aaa
String.Compare(.. de-CH ..) == 0         s1=aaa, s2=aaa
String.Compare(.. en-GB ..) == 0         s1=aaa, s2=aaa

String.Compare(.. de-DE_phoneb ..) == 1  s1=bbb, s2=aaa
String.Compare(.. de-DE ..) == 1         s1=bbb, s2=aaa
String.Compare(.. de-CH ..) == 1         s1=bbb, s2=aaa
String.Compare(.. en-GB ..) == 1         s1=bbb, s2=aaa

I fully appreciate the regular substitution of ü and ue, ä and ae, ö and oe, and ß and ss on German and Swiss-German keyboards and in their alphabets (I am currently working in Zürich, so this issue applies directly to my day job). Even so, I was surprised that, purely for the purposes of sorting (which is what String.Compare is all about; it is not about string equality), ß and ss are given the same sort position. It makes sense, of course, but I was still surprised. The biggest surprise was yet to come, however: I had not realized that German has two different collation/sort orders. See the German phone book results (de-DE_phoneb) above, where Duesseldorf and Düsseldorf compare as equal, whereas the default de-DE order puts Duesseldorf first. This is soooooooo German: why have just one sort order when you can have two 🙂 * The more I learn about German, the more I love it! Anyway, I rejected the bug/fix request. There was no bug: the default sort order was correct, although the user was expecting German phone book ordering.
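Had the user's expectation been accepted as a requirement, one way to honour it would be to sort with a comparer built from the phone book culture. The snippet below is only a minimal sketch of that idea, not the fix applied to the real application: the class name SortOrderSketch, the helper SortWith, and the list of place names are all made up for illustration. It also assumes the platform's collation data recognises the de-DE_phoneb alternate sort name, as it did in the test above.

```csharp
using System;
using System.Globalization;

static class SortOrderSketch
{
    // Hypothetical helper: returns a sorted copy of 'names' using the
    // collation rules of the named culture (for example "de-DE_phoneb").
    public static string[] SortWith(string[] names, string cultureName)
    {
        var comparer = StringComparer.Create(new CultureInfo(cultureName), ignoreCase: false);
        var copy = (string[])names.Clone();
        Array.Sort(copy, comparer);
        return copy;
    }

    static void Main()
    {
        // Illustrative data only; not taken from the application in the bug report.
        var names = new[] { "Duisburg", "Düsseldorf", "Duesseldorf", "Dudweiler" };

        Console.WriteLine("de-DE:        " + string.Join(", ", SortWith(names, "de-DE")));
        Console.WriteLine("de-DE_phoneb: " + string.Join(", ", SortWith(names, "de-DE_phoneb")));

        // Sort order is not equality: an ordinal comparison still tells the two spellings apart.
        Console.WriteLine(string.Equals("Strasse", "Straße", StringComparison.Ordinal)); // False
    }
}
```

The sorted lines are left without expected output on purpose, because the exact order depends on the collation data of the machine running the code. The last line makes the other point from the paragraph above: a String.Compare result of 0 says two strings share a sort position, while an ordinal comparison is what actually answers whether they are the same string.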

My German friends will look at the code above and may reply with a single word: “selbstverständlich” (“of course”). I know that now, too.

* There is probably something similar in English to handle, for example, McDonald and MacDonald, and the glorious apostrophe that some very lucky people have in their surnames.

— Published by Mike, 15:20:11 03 December 2017 (CET)

One Response to “German and .NET System.Globalization.CultureInfo – string sorting, it is obvious, now”

  1. And Mike O-Apostroph-Shea, what very “lucky people” might you be writing about here?
