ruby encoding utf-8 ruby-1.9

hash strings get improperly encoded


I have a simple constant hash with string keys defined:

MY_CONSTANT_HASH = {
  'key1' => 'value1'
}

Now, I've noticed that encoding.name on the key is US-ASCII, even though Encoding.default_internal is set to UTF-8 beforehand. Why isn't the key encoded as UTF-8? I can't call force_encoding later, because the string is frozen at that point, so I get this error:

can't modify frozen String

P.S.: I'm using ruby 1.9.3p0 (2011-10-30 revision 33570).


Solution

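  • The default internal and external encodings are aimed at IO operations; they don't affect string literals. In Ruby 1.9, a string literal takes the encoding of its source file, which is US-ASCII unless the file declares otherwise.

    The easiest thing for you to do is to add a # encoding=utf-8 comment on the first line of the file (or the second, after a shebang) to tell Ruby that the source file is UTF-8 encoded. For example, if you run this: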

    # encoding=utf-8
    H = { 'this' => 'that' }
    puts H.keys.first.encoding
    

    as a stand-alone Ruby script you'll get UTF-8, but if you run this:

    H = { 'this' => 'that' }
    puts H.keys.first.encoding
    

    you'll probably get US-ASCII on Ruby 1.9. (Ruby 2.0 and later default the source encoding to UTF-8.)
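
If you can't change the source file, another option is to build a converted copy of the hash rather than modifying the frozen strings in place. String#encode returns a new string, so it should work even when the original is frozen. The following is only a sketch: utf8_copy is a hypothetical helper name, it assumes every key and value is a String, and the US-ASCII hash is simulated with force_encoding so the example behaves the same on any Ruby version.

```ruby
# Hypothetical helper (not part of Ruby): returns a new hash whose keys and
# values are UTF-8 copies. Assumes all keys and values are Strings.
def utf8_copy(hash)
  hash.each_with_object({}) do |(key, value), result|
    # String#encode returns a new string, so frozen originals are left untouched.
    result[key.encode(Encoding::UTF_8)] = value.encode(Encoding::UTF_8)
  end
end

# Simulate literals that came from a US-ASCII source file.
ascii_key   = 'key1'.dup.force_encoding(Encoding::US_ASCII).freeze
ascii_value = 'value1'.dup.force_encoding(Encoding::US_ASCII).freeze
ascii_hash  = { ascii_key => ascii_value }

converted = utf8_copy(ascii_hash)

puts ascii_hash.keys.first.encoding   # encoding of the original key
puts converted.keys.first.encoding    # encoding of the converted key
```

The original hash is left as it was, so any code that still holds a reference to it keeps seeing the US-ASCII strings; only the copy carries UTF-8 keys and values.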