Does Your Code Pass the Turkey Test? (2008)

  • There's one more test: the Thailand test. I've been bitten by issues in Java where Calendar.getInstance() will return a Thai Buddhist calendar, which is exactly the same as the Gregorian calendar except 543 years in the future.
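
    A minimal sketch of the arithmetic behind this, assuming modern dates where the two calendars differ only in the year number, as described above. It is written in Python rather than Java, and the function names are hypothetical:

```python
BUDDHIST_ERA_OFFSET = 543  # years between Buddhist Era and Gregorian year numbers


def to_buddhist_year(gregorian_year: int) -> int:
    """Convert a Gregorian year number to the Buddhist Era year number."""
    return gregorian_year + BUDDHIST_ERA_OFFSET


def to_gregorian_year(buddhist_year: int) -> int:
    """Convert a Buddhist Era year number to the Gregorian year number."""
    return buddhist_year - BUDDHIST_ERA_OFFSET


print(to_buddhist_year(2008))   # 2551
print(to_gregorian_year(2551))  # 2008
```

    Code that stores or compares bare year numbers probably wants to normalise to one era at the point where the value enters the system.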

  • There’s another fun thing we can call the Britain test.

    The short form for September is Sept in en-GB, the only month abbreviated with four letters. It’s Sep in en-US and other en- locales. All other parsing is identical.

    For example, if you’re parsing abbreviated date formats on AWS, this parsing fails only on eu-west-1 and -2 servers, and only in September.
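
    One way to sidestep this, sketched in Python under the assumption that the input looks like "12 Sept 2024": parse month abbreviations from a fixed table instead of the process locale. The table and function names are hypothetical:

```python
from datetime import date

# Hypothetical fixed table: accepts both "Sep" and "Sept" whatever the locale is.
MONTH_NUMBERS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "sept": 9, "oct": 10, "nov": 11, "dec": 12,
}


def parse_short_date(text: str) -> date:
    """Parse 'D Mon YYYY' using the fixed table above, not the process locale."""
    day, month, year = text.split()
    key = month.rstrip(".").lower()
    if key not in MONTH_NUMBERS:
        raise ValueError(f"unknown month abbreviation: {month!r}")
    return date(int(year), MONTH_NUMBERS[key], int(day))


print(parse_short_date("12 Sept 2024"))  # 2024-09-12
print(parse_short_date("12 Sep 2024"))   # 2024-09-12
```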

  • A lot of these seem to be the "not America" test.

  • As for date and time interchange, use ISO 8601 (largest to smallest order).

    The EU and many other countries' ordering of day, month, year (smallest to largest order) also makes sense, but is sadly ambiguous due to

    The US format of month, day, year (middle-endian order) https://9gag.com/gag/a2KEqOe

    The US, not Turkey, is the outlier.
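
    A small Python sketch of the ambiguity and of the ISO 8601 alternative, using only the standard `datetime` module. The sample strings are made up for illustration:

```python
from datetime import date, datetime

ambiguous = "03/04/2024"

# The same text yields two different dates depending on the assumed order.
as_day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()    # 3 April 2024
as_month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # 4 March 2024
print(as_day_first, as_month_first)  # 2024-04-03 2024-03-04

# ISO 8601 calendar dates are year-month-day, so there is only one reading.
iso_text = as_day_first.isoformat()
round_tripped = date.fromisoformat(iso_text)
print(iso_text, round_tripped == as_day_first)  # 2024-04-03 True

# Largest-to-smallest order also means plain string sorting is chronological.
stamps = ["2024-04-03", "2023-12-31", "2024-03-04"]
print(sorted(stamps))  # ['2023-12-31', '2024-03-04', '2024-04-03']
```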

  • Of course, the Turkey test itself now fails the Turkey test as Turkey has been updated to Türkiye.

  • Most of these problems have incorrect solutions. For instance, the actual solution to parsing portrait or landscape is to not use a string for this. It should never have been a string! Other, better solutions apply to the rest too.
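
    A rough Python sketch of that idea, assuming the value still arrives as text from somewhere (a config file, say). The `Orientation` type and `parse_orientation` helper are hypothetical names; the point is that text is converted once at the boundary and the rest of the program compares enum members, not strings:

```python
from enum import Enum


class Orientation(Enum):
    PORTRAIT = "portrait"
    LANDSCAPE = "landscape"


def parse_orientation(raw: str) -> Orientation:
    """Convert external text to an Orientation once, at the input boundary."""
    try:
        # Python's str.lower() does not consult the process locale.
        return Orientation(raw.strip().lower())
    except ValueError:
        raise ValueError(f"unknown orientation: {raw!r}") from None


page = parse_orientation(" Landscape ")
print(page is Orientation.LANDSCAPE)  # True
```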

  • Am I missing something? The realisation comes down to "yes, there are regional differences in formatting", and doesn’t strike me as a particularly profound insight.

  • Some extra tests...

    Central Europe has a lot of accented Latin characters: ěščřžýáíéů, etc.

    Cyrillic letters are also worth trying, especially if you are trying to set up an international e-shop and need to print out labels with addresses.

    Spanish-speaking people tend to have very long full names, which matters if a complete name is needed. Picasso was, in fact, Pablo Diego José Francisco de Paula Juan Nepomuceno María de los Remedios Cipriano de la Santísima Trinidad Ruiz y Picasso.

    Hungarians write surnames first, e.g. Orbán Viktor. Doing it otherwise looks unprofessional and may lead to confusion, because some first names can also be surnames.

    Most of Europe writes the street name first and the house number second, e.g. Friedrichstrasse 52, which is the other way round from what Americans and Brits are used to.

  • Turkiye would like people to call it Turkiye.

    Even Old New York was once New Amsterdam. Why? Maybe people just liked it better that way, and that's nobody's business but the Turks.

    https://www.youtube.com/watch?v=Uqnb_nU7RBE

    (Istanbul is the traditional Turkic name for the city, basically a borrowed/altered pronunciation of Constantinople)

  • > Or use the RegexOptions.ECMAScript option. In ECMAScript, “\d” means [0-9] which gives us: [...]

    Wow, the default in the Windows API does not do that? I would have 100% been bitten by that at some point doing any development there, coming from a Unix background where I believe the "ECMAScript" behavior is pretty commonplace (and in fact it seems to be a subset of PCREs).
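
    For what it's worth, the same surprise exists outside .NET. A short Python sketch: for `str` patterns the standard `re` module lets `\d` match any Unicode decimal digit unless `re.ASCII` is passed, so an explicit `[0-9]` is the safer spelling when only ASCII digits are wanted. The sample string is just an illustration:

```python
import re

arabic_indic = "\u0661\u0662\u0663"  # Arabic-Indic digits one, two, three

unicode_match = re.fullmatch(r"\d+", arabic_indic)          # default for str patterns
ascii_match = re.fullmatch(r"\d+", arabic_indic, re.ASCII)  # \d restricted to [0-9]
explicit_match = re.fullmatch(r"[0-9]+", arabic_indic)      # explicit ASCII class

print(unicode_match is not None)   # True
print(ascii_match is not None)     # False
print(explicit_match is not None)  # False
```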

  • This is from 2008

  • I think the main problem here is that the behavior of the programming language is locale-sensitive by default. What languages other than C# behave like this?
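
    As one data point, Python 3 goes the other way: `str.upper()` and `str.lower()` apply the default Unicode case mappings and do not consult the process locale. A brief sketch of what that means for the Turkish letters (variable names are only illustrative):

```python
dotted_capital_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
dotless_small_i = "\u0131"   # LATIN SMALL LETTER DOTLESS I

# Default Unicode mappings, independent of the process locale:
print("title".upper() == "TITLE")             # True, never the dotted capital
print(dotless_small_i.upper() == "I")         # True
print(dotted_capital_i.lower() == "i\u0307")  # True: "i" plus a combining dot above
```

    The trade-off is that text which really is Turkish does not get Turkish casing for free, so that has to be handled explicitly where it matters.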

  • OT: I like how he refers to a Hanselman post about great interview questions, and how around 40% of the questions in that post are obsolete. It shows how quickly our field evolves, for better or worse.

  • Example ticket: https://github.com/ROCm/ROCm/issues/2888

  • One of my first ever public projects actually had a bug report that ended up being the Turkish i problem. Didn’t know there were more!

  • The normal units test

  • It's infuriating that most calculator apps, including the quick ones in spotlight, force me to use a comma instead of a period for a decimal delimiter.

  • Ah yes, ye olde m/d/y vs. d/m/y debate. One is clearly wrong but the Americans won’t listen…