java date locale localdate

What's the point of Locale in DateTimeFormatter?


Given the following code


LocalDate localDateEpoch = LocalDate.parse(
        "01-Jan-2017", DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.FRENCH));

System.out.println("localDateEpoch: " + localDateEpoch);

I picked French because French uses Latin digits and letters, so in theory the parser should simply interpret this input (including the letters in "Jan"). Nevertheless, I got this error:

Exception in thread "main" java.time.format.DateTimeParseException: 
                             Text '01-Jan-2017' could not be parsed at index 3

My questions

I've done a bit of research on my end. It seems that to work around the lack of languages mentioned above you can just call Locale.forLanguageTag("fa"); but then again, why not have Locale.Iran/Persian? I saw a post about converting numerical values to Persian, so I tried to do the opposite.

    LocalDate localDateEpoch = LocalDate.parse(
        "۰۱-Jan-۲۰۱۷", DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.forLanguageTag("fa")));

And again I got an error.

Exception in thread "main" java.time.format.DateTimeParseException: 
                             Text '۰۱-Jan-۲۰۱۷' could not be parsed at index 0

Also, in case you're curious, I tried it without MMM and still got the same error.


Solution

  • tl;dr

    Just ignore the legacy constants like Locale.CANADA & Locale.CANADA_FRENCH.

    Instead, use Locale.of static factory methods.

    Locale.of( "fr" , "FR" )  // French, France
    Locale.of( "fr" , "CA" )  // French, Canada
    

    Details

    What's the point of Locale in DateTimeFormatter?

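    The locale defines two things needed for localization: the human language to use for translation, and the cultural norms that decide matters such as abbreviation, punctuation, and the order of elements.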

    CLDR

    Modern Java by default sources its locale information from the Common Locale Data Repository (CLDR) published by the Unicode Consortium. A copy of the CLDR is bundled within a JDK/JRE. Updates to the JDK/JRE often come with updates to the CLDR. The rules within the CLDR do change as human language & cultural norms evolve.

    Older versions of Java came with a simpler, more limited set of locale information and rules. The CLDR provides nearly worldwide coverage, and also defines many sub-cultures.

    Regarding DateTimeFormatter, the locale is used in translating the name of the month, the name of the day, and so on. The locale also determines how to abbreviate those names if needed. Plus, the locale has rules for how to employ punctuation: for example, some cultures use a FULL STOP (.) to terminate an abbreviation while other cultures do not.
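
    As a rough illustration of the translation and abbreviation roles described above, here is a minimal sketch that asks java.time for the name of a month in a few locales. The class name MonthNameDemo and the helper monthName are made up for this example. It uses Locale.forLanguageTag so that it also runs on Java versions older than 19; the exact abbreviations printed depend on the CLDR data bundled with your JDK.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

// Hypothetical demo class, not part of the original answer.
final class MonthNameDemo {

    // Returns the localized name of a month in the requested text style.
    static String monthName(Month month, TextStyle style, String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return month.getDisplayName(style, locale);
    }

    public static void main(String[] args) {
        // Same month, different locales: the language drives the translation,
        // while the locale's cultural norms drive the abbreviation.
        for (String tag : new String[] {"fr-FR", "fr-CA", "en-US"}) {
            System.out.println(tag + " full:  " + monthName(Month.JULY, TextStyle.FULL, tag));
            System.out.println(tag + " short: " + monthName(Month.JULY, TextStyle.SHORT, tag));
        }
    }
}
```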

    Ignore the Locale constants

    What's up with the lack of countries/languages?

    Apparently you mean the short list of a couple dozen constants defined on Locale.

    Those constants were defined in the early days of Java history. They were intended to be a convenience. But I doubt they would be added to the Java API today, for multiple reasons. The Java team later recommended ignoring those constants. Another Question here asked how the Locale constants were chosen, but no satisfying answer was posted there. I would advise you to remember that Java was rushed to market during the Web mania days, with some questionable decisions made in haste. Fortunately, Java evolves today with much more consideration.

    See The Java Tutorials by Oracle (free-of-cost) for info on instantiating a Locale object: Creating a Locale. And read the Javadoc for Locale. Notice that Java 19+ offers the of methods as static factories for Locale objects.

    Why are there locales for languages/countries (like French and France)?

    (A) As noted above, you should ignore the particularities of the Locale constants.

    (B) Specifying a country gets you a particular set of cultural norms, in addition to the language. For example, people in France and Québec both use the French language but have evolved different cultural norms.

    Example code

    You said:

    DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.FRENCH)

    Do not hard-code a format when localizing. The entire format may vary by culture. Instead, let java.time determine the appropriate format per locale.

    To soft-code the formats when localizing, call the DateTimeFormatter.ofLocalized… methods.

    System.out.println( Runtime.version( ) );
    
    List < Locale > locales =
            List.of(
                    Locale.of( "fr" , "FR" ) ,  // French, France
                    Locale.of( "fr" , "CA" ) ,  // French, Canada
                    Locale.of( "en" , "US" ) ,  // English, United States
                    Locale.of( "fa" , "IR" )    // Farsi, Iran
            );
    
    LocalDate today = LocalDate.now( ZoneId.of( "America/Edmonton" ) );
    DateTimeFormatter formatter = DateTimeFormatter.ofLocalizedDate( FormatStyle.MEDIUM );
    for ( Locale locale : locales )
    {
        String output = today.format( formatter.withLocale( locale ) );
        System.out.println( locale + " = " + output );
    }
    

    When run, notice how the French Canada localization abbreviates the name of the month of July differently than does the French France localization.

    And notice how the English United States localization re-orders the elements. And the Farsi Iran localization uses yet a third ordering.

    22.0.1+10
    fr_FR = 2 juil. 2024
    fr_CA = 2 juill. 2024
    en_US = Jul 2, 2024
    fa_IR = 2 ژوئیه 2024
    

    By the way, the examples here use an ISO 8601 version of the Gregorian calendar defined in the IsoChronology bundled with the java.time framework in Java. Your app may need to use a different chronology. For example, for real work in Farsi Iran, you might use a Chronology implementing an Islamic calendar. You will find five Chronology implementations bundled with Java. You can find ten more in the ThreeTen-Extra library. Additional third-party implementations may exist as well.
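
    To give a feel for what switching chronology means, here is a small sketch that converts one ISO date into three of the chronologies bundled with Java. The class name ChronologyDemo and its helper methods are invented for illustration. Which chronology suits a given country is a separate question that this sketch does not try to answer.

```java
import java.time.LocalDate;
import java.time.chrono.HijrahDate;
import java.time.chrono.MinguoDate;
import java.time.chrono.ThaiBuddhistDate;
import java.time.temporal.ChronoField;

// Hypothetical demo class, not part of the original answer.
final class ChronologyDemo {

    // Year number of the given ISO date in the Thai Buddhist calendar.
    static int thaiBuddhistYear(LocalDate isoDate) {
        return ThaiBuddhistDate.from(isoDate).get(ChronoField.YEAR);
    }

    // Year number of the given ISO date in the Minguo (Republic of China) calendar.
    static int minguoYear(LocalDate isoDate) {
        return MinguoDate.from(isoDate).get(ChronoField.YEAR);
    }

    public static void main(String[] args) {
        LocalDate iso = LocalDate.of(2024, 7, 2);
        // The same day on the timeline, expressed in different calendar systems.
        System.out.println("ISO:           " + iso);
        System.out.println("Thai Buddhist: " + ThaiBuddhistDate.from(iso));
        System.out.println("Minguo:        " + MinguoDate.from(iso));
        System.out.println("Hijrah:        " + HijrahDate.from(iso));
    }
}
```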

    Your code errors

    java.time.format.DateTimeParseException: Text '01-Jan-2017' could not be parsed at index 3

    The exception reports index 3, which is where Jan begins, so the day-of-month parsed without trouble. When parsing, the single d code accepts one or two digits, so the leading padding zero in 01 is not the problem. (Use dd when you want exactly two digits, with a padding zero for single-digit numbers such as 1.)

    Your formatting pattern fails to match your input because of the locale.

    "01-Jan-2017", DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.FRENCH)

    Your text input has Jan for the name of the month. But in French the abbreviated name of the month would be janv., in lowercase and terminated by a FULL STOP. The French formatter cannot match Jan against janv., so the parsing exception is thrown at that position.
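
    The following sketch tries the question's input and pattern against two locales and reports where parsing fails. The class name ParseLocaleDemo and the helpers parse and errorIndex are hypothetical names chosen for this example; Locale.forLanguageTag is used so the sketch also runs on Java versions older than 19.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical demo class, not part of the original answer.
final class ParseLocaleDemo {

    // The pattern used in the question.
    static final String PATTERN = "d-MMM-yyyy";

    // Parses the text with the question's pattern, localized by the given language tag.
    static LocalDate parse(String text, String languageTag) {
        DateTimeFormatter formatter =
                DateTimeFormatter.ofPattern(PATTERN, Locale.forLanguageTag(languageTag));
        return LocalDate.parse(text, formatter);
    }

    // Returns the index at which parsing failed, or -1 if parsing succeeded.
    static int errorIndex(String text, String languageTag) {
        try {
            parse(text, languageTag);
            return -1;
        } catch (DateTimeParseException e) {
            return e.getErrorIndex();
        }
    }

    public static void main(String[] args) {
        String input = "01-Jan-2017";
        // English month abbreviations include "Jan", so this locale can read the input.
        System.out.println("en-US parsed: " + parse(input, "en-US"));
        // The French abbreviation differs, so parsing stops where the month name starts.
        System.out.println("fr-FR error index: " + errorIndex(input, "fr-FR"));
    }
}
```

    If the input really is English text, matching the formatter's locale to the language of the input is the simplest fix.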

    Avoid parsing localized text. To exchange date-time values textually, use only standard ISO 8601 formats. These formats were expressly made for data-exchange, designed to be easy to parse by machines as well as easy to read by humans across cultures.

    Conveniently, the java.time classes use the ISO 8601 standard formats by default when parsing/generating text. So no need to define any formatting pattern for standard inputs/outputs.
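
    For instance, a minimal sketch of exchanging a date as ISO 8601 text, with no formatter and no locale involved. The class name IsoDemo and the helper parseIso are made up for this example.

```java
import java.time.LocalDate;

// Hypothetical demo class, not part of the original answer.
final class IsoDemo {

    // Parses standard ISO 8601 date text (YYYY-MM-DD) using the default formatter.
    static LocalDate parseIso(String text) {
        return LocalDate.parse(text);
    }

    public static void main(String[] args) {
        LocalDate date = parseIso("2017-01-01");
        // toString generates the same standard ISO 8601 text.
        System.out.println(date);
    }
}
```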