sml smlnj polyml mlton

Polymorphic coercion to Word64 in Standard ML


I would like to create a polymorphic function that converts 8-, 16-, and 32-bit words into a 64-bit word. How can I do it?

UPDATE1

In the Basis Library, all word structures have the functions toLarge and fromLarge to convert to and from LargeWord, which, as far as I understand, is just a synonym for Word32.

UPDATE2

According to the spec, the word size must be a power of two, but in SML/NJ I have

Standard ML of New Jersey v110.84 [built: Mon Dec 03 10:23:14 2018]
- Word.wordSize;
val it = 31 : int
- Word32.wordSize;
val it = 32 : int
- Word.toLarge;
val it = fn : word -> Word32.word
- LargeWord.wordSize;
val it = 32 : int

while in PolyML

Poly/ML 5.7.1 Release
> Word.wordSize;
val it = 63: int
> Word64.wordSize;
val it = 64: int
> Word.toLarge;
val it = fn: word -> ?.word
> LargeWord.wordSize;
val it = 64: int

How can that be? Why is Word.wordSize not a power of two? And why does the Word representation differ between these SML implementations?

UPDATE3

Actually, I want to be able to "promote" smaller words into larger ones using the (<<) operator, but I cannot figure out how to do it.

UPDATE4

It seems that Word and LargeWord depend on the architecture and represent a machine word. Because SML/NJ does not support 64-bit architectures, it has a different word size.


Solution

  • You are right that the types Word8.word, Word32.word and Word64.word only share the common type 'a, which cannot generally be converted to a Word64.word via parametric polymorphism.

    The exact function you are looking for could (and should) have been:

    Word<N>.toLargeWord : word -> LargeWord.word
    

    Unfortunately, as you have discovered, it appears that LargeWord.word is an alias for Word32.word and not Word64.word in SML/NJ. The Basis does not appear to require this, but that is the reality. In Poly/ML it appears that LargeWord.wordSize is 126, and in Moscow ML there is no LargeWord structure at all! Sigh. But at least in Poly/ML it can contain a Word64.word.

    In light of this, I'd suggest one of two things:

    1. You can use ad-hoc polymorphism: all three modules share the signature WORD, and this signature contains, among other things:

      val toLargeInt : word -> LargeInt.int
      

      So a hack may be to convert up to a LargeInt.int and then down to a Word64.word. You can build a functor that takes one module with the WORD signature and returns a structure that contains the conversion to Word64.

      functor ToWord64 (WordN : WORD) = struct
          fun toWord64 (n : WordN.word) : Word64.word =
              Word64.fromLargeInt (WordN.toLargeInt n)
      end
      

      You can then instantiate this functor for each of your cases:

      structure Word8ToWord64 = ToWord64(Word8)
      
      val myWord64 = Word8ToWord64.toWord64 myWord8
      

      This is a bit messy, and the existing module hierarchy that includes LargeWord was meant to avoid it.

    2. Alternatively, if you'd rather avoid this extra functor and arbitrary-precision integers as an intermediate representation, since this is both inefficient and unnecessary, you could change your standard library's LargeWord :> WORD so that it assumes the use of Word64 instead.
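    A possible middle ground, as a minimal sketch: the question only needs to widen 8-, 16- and 32-bit words, so going through LargeWord should lose nothing whenever the source type is no wider than LargeWord, even where LargeWord is only 32 bits. The functor name ToWord64ViaLarge and the fallback branch are illustrative assumptions, not part of the Basis Library; toLarge, fromLarge and wordSize come from the WORD signature.

```sml
(* Hypothetical functor: widen any WORD structure to Word64 without
   going through LargeInt when the value fits in LargeWord. *)
functor ToWord64ViaLarge (WordN : WORD) = struct
    fun toWord64 (n : WordN.word) : Word64.word =
        if WordN.wordSize <= LargeWord.wordSize
        then Word64.fromLarge (WordN.toLarge n)        (* zero-extends *)
        else Word64.fromLargeInt (WordN.toLargeInt n)  (* fallback via LargeInt *)
end

structure Word8To64  = ToWord64ViaLarge(Word8)
structure Word32To64 = ToWord64ViaLarge(Word32)

val a = Word8To64.toWord64 0wxFF
val b = Word32To64.toWord64 0wxFFFFFFFF
```

    This still needs one functor instance per source width, but the arbitrary-precision detour is only taken when the source word is wider than LargeWord.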

    This could have been avoided if the standard library had been written in a functorial style with LargeWord having/being a parameter fixed somewhere where you could override it. But it would also make the standard library more complex.
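    To illustrate what a functorial style might look like, here is a small sketch in which both the source and the target word structures are parameters, so the "large" type is chosen at the point of use rather than fixed by the library. The names Widen, From, To and widen are hypothetical and do not exist in the Basis Library.

```sml
(* Hypothetical functor: both the source and the target width are parameters. *)
functor Widen (structure From : WORD
               structure To   : WORD) = struct
    (* Goes through LargeInt, reading the source word as unsigned. *)
    fun widen (n : From.word) : To.word =
        To.fromLargeInt (From.toLargeInt n)
end

structure W8toW64 = Widen(structure From = Word8 structure To = Word64)
structure W8toW32 = Widen(structure From = Word8 structure To = Word32)

val x = W8toW64.widen 0wx2A
val y = W8toW32.widen 0wxFF
```

    The price is the one mentioned above: every pair of widths needs its own instance.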

    With regards to ML module system design, I think the choice of placing toLargeWord in the WORD signature is one approach which is very convenient because you don't need a lot of functor instances, but, as you have witnessed, not very extensible. You can see the different philosophies applied in Jane Street's OCaml libraries Base and Core, where in Core you have e.g. Char.Map.t (convenient) and in Base you have Map.M(Char).t (extensible).

    I've assumed that your words are all unsigned.
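    The assumption matters because the WORD signature offers both a zero-extending conversion (toLargeInt) and a sign-extending one (toLargeIntX). A short sketch of the difference for a single byte; the value names are made up for illustration:

```sml
val byte : Word8.word = 0wxFF

(* Zero-extension: the byte is read as the unsigned value 255. *)
val unsigned64 = Word64.fromLargeInt (Word8.toLargeInt byte)
(* unsigned64 = 0wxFF *)

(* Sign-extension: the byte is read as the two's-complement value ~1. *)
val signed64 = Word64.fromLargeInt (Word8.toLargeIntX byte)
(* signed64 = 0wxFFFFFFFFFFFFFFFF *)
```

    If your words carry signed quantities, swapping toLargeInt for toLargeIntX in the functor above would preserve the sign instead.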