Given text in Shift_JIS encoding, how can I decode it into Elixir's native UTF-8 encoding, and vice versa?
The Codepagex library supports this. You just need to figure out what it calls SHIFT_JIS.
Codepagex uses the mappings available from unicode.org. There is one for Shift_JIS, but it's marked as OBSOLETE, so it is not available in Codepagex. However, Microsoft's CP932 is available, which is effectively SHIFT_JIS, so you can use that.
It's not enabled by default, so you need to enable it in your config (and re-compile with mix deps.compile codepagex --force if necessary):
config :codepagex, :encodings, [
"VENDORS/MICSFT/WINDOWS/CP932"
]
iex(1)> shift_jis = "VENDORS/MICSFT/WINDOWS/CP932"
"VENDORS/MICSFT/WINDOWS/CP932"
iex(2)> test = Codepagex.from_string!("テスト", shift_jis)
<<131, 101, 131, 88, 131, 103>>
iex(3)> Codepagex.to_string!(test, shift_jis)
"テスト"
I made an example repo where you can see it in action.
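If you would rather handle failures than raise, the non-bang variants Codepagex.from_string/2 and Codepagex.to_string/2 return tagged tuples instead. The following is a minimal sketch as a standalone script, not a definitive implementation. It assumes Elixir 1.13+ (for the :config option of Mix.install/2) and network access to Hex; the ShiftJIS module and its function names are hypothetical and exist only for illustration.

```elixir
# Standalone script sketch. Assumes Elixir 1.13+ and access to Hex.
# The :config option passes the compile-time encodings setting to Codepagex,
# mirroring the config.exs entry shown above.
Mix.install(
  [{:codepagex, "~> 0.1"}],
  config: [codepagex: [encodings: ["VENDORS/MICSFT/WINDOWS/CP932"]]]
)

defmodule ShiftJIS do
  # Hypothetical helper module; the name and functions are illustrative only.
  @encoding "VENDORS/MICSFT/WINDOWS/CP932"

  # Shift_JIS (CP932) bytes -> UTF-8 string.
  # Returns {:ok, string} or {:error, reason}.
  def decode(bytes) when is_binary(bytes) do
    Codepagex.to_string(bytes, @encoding)
  end

  # UTF-8 string -> Shift_JIS (CP932) bytes.
  # Returns {:ok, binary} or {:error, reason}.
  def encode(string) when is_binary(string) do
    Codepagex.from_string(string, @encoding)
  end
end

# Round trip: encode, then decode, handling the error case explicitly.
with {:ok, bytes} <- ShiftJIS.encode("テスト"),
     {:ok, string} <- ShiftJIS.decode(bytes) do
  IO.inspect(bytes, label: "encoded")
  IO.puts("decoded: " <> string)
else
  {:error, reason} -> IO.inspect(reason, label: "conversion failed")
end
```

Wrapping the encoding name in one module keeps the long "VENDORS/MICSFT/WINDOWS/CP932" string in a single place, so callers only deal with encode/1 and decode/1.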