scala playframework protocol-buffers scalapb

Convert ByteArray to String and back to ByteArray


I want to convert a byte array to a string and then convert that string back to a byte array, but the values change during the conversion. Can someone help me solve this problem?

person.proto:

syntax = "proto3";
  message Person{
    string name = 1;
    int32 age = 2;
  }

After sbt compile, this produces a case class Person (generated by ScalaPB from the proto definition at compile time).

My MainClass:

val newPerson = Person(
  name = "John Cena",
  age = 44
)
// The output of each println is shown in the trailing comment.
println(newPerson.toByteArray)          // [B@50da041d
val l = newPerson.toByteArray.toString
println(l)                              // [B@7709e969
val l1 = l.getBytes
println(l1)                             // [B@f44b405

Why did the values change? How do I convert correctly?


Solution

  • [B@... is the format that a JVM byte array's .toString returns: [B (the JVM's name for the byte array class), an @, and the array's identity hash code in hexadecimal. That hex string is loosely analogous to the memory address at which the array resides (I'm deliberately not calling it a pointer, but it's similar; the precise relationship between that hex string and a memory address, if any, is JVM-dependent and could be affected by things like which garbage collector is in use). The important thing is that two different arrays with the same bytes in them will almost always have different .toStrings. Note that in some places (e.g. the REPL), Scala will print something like Array(-127, 0, 0, 1) rather than calling .toString, which may cause confusion.

    It appears that toByteArray emits a new array each time it's called. So the first time you call newPerson.toByteArray, you get an array at a location corresponding to 50da041d. The second time you call it you get a byte array with the same contents at a location corresponding to 7709e969 and you save the string [B@7709e969 into the variable l. When you then call getBytes on that string (saving it in l1), you get a byte array which is an encoding of the string "[B@7709e969" at the location corresponding to f44b405.

    So at the locations corresponding to 50da041d and 7709e969 you have two different byte arrays which happen to contain the same elements (those elements being the bytes in the proto representation of newPerson). At the location corresponding to f44b405 you have a byte array whose bytes encode the string "[B@7709e969" in the platform's default character set (getBytes without an argument uses the default charset, which is UTF-8 on most modern JVMs).
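The difference between an array's identity and its contents can be seen in a small sketch. It uses hand-written sample bytes as a stand-in for newPerson.toByteArray (an assumption, so that it runs without the generated Person class); the object and value names are made up for illustration.

```scala
object ArrayIdentityDemo {
  // Sample bytes standing in for newPerson.toByteArray (illustrative assumption).
  val first: Array[Byte] = Array[Byte](10, 9, 74, 111, 104, 110, 32, 67, 101, 110, 97, 16, 44)
  // A second, distinct array object with identical contents.
  val second: Array[Byte] = first.clone()

  def main(args: Array[String]): Unit = {
    // Each toString is "[B@" plus a hex identity hash, so the two strings
    // will almost always differ even though the contents are equal.
    println(first.toString)
    println(second.toString)

    // == on arrays compares references, not contents.
    println(first == second)             // false
    // sameElements compares the contents element by element.
    println(first.sameElements(second))  // true
    // To actually see the bytes, render the elements rather than the object.
    println(java.util.Arrays.toString(first))
  }
}
```

Comparing with sameElements (or java.util.Arrays.equals) is the way to check whether two byte arrays hold the same data.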

    Because a proto isn't really a string, there's no general way to get a useful string (depending on what definition of useful you're dealing with). You could try interpreting a byte array from toByteArray as a string with a given character encoding, but there's no guarantee that any given proto will be valid in an arbitrary character encoding.
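To illustrate why an arbitrary character encoding is not safe, here is a minimal sketch that decodes bytes as UTF-8 and encodes them again. The sample bytes are hand-written and only meant to resemble a small serialized message containing a byte above 0x7F (an assumption for illustration); the names are hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object Utf8RoundTripDemo {
  // Hand-written sample bytes; 0xC8 followed by 0x01 is not valid UTF-8.
  val original: Array[Byte] = Array(0x0A, 0x01, 0x41, 0x10, 0xC8, 0x01).map(_.toByte)

  // Decoding replaces the malformed byte with U+FFFD (the replacement character).
  val asText: String = new String(original, UTF_8)

  // Re-encoding writes U+FFFD as three bytes, so the original byte is lost.
  val roundTripped: Array[Byte] = asText.getBytes(UTF_8)

  def main(args: Array[String]): Unit = {
    println(original.mkString(", "))
    println(roundTripped.mkString(", "))
    println(original.sameElements(roundTripped)) // false
  }
}
```

Messages that happen to contain only ASCII bytes would survive this round trip, which is why the problem can go unnoticed until a field value produces a byte outside that range.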

    An encoding which is purely 8-bit, like ISO-8859-1, is guaranteed to at least be decodable from a byte array, but there could be non-printable or control characters, so it's not likely to be that useful:

    val iso88591Representation = new String(newPerson.toByteArray, java.nio.charset.StandardCharsets.ISO_8859_1)
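If the goal is to get the original bytes back, the string has to be encoded with the same charset it was decoded with. A minimal sketch, using every possible byte value as stand-in data rather than a real Person (an assumption so the example is self-contained):

```scala
import java.nio.charset.StandardCharsets.ISO_8859_1

object Latin1RoundTripDemo {
  // All 256 byte values, standing in for arbitrary serialized data.
  val original: Array[Byte] = (0 to 255).map(_.toByte).toArray

  // ISO-8859-1 maps each byte to exactly one character...
  val asText: String = new String(original, ISO_8859_1)

  // ...and each of those characters back to the same byte,
  // provided the charset is passed explicitly to getBytes.
  val restored: Array[Byte] = asText.getBytes(ISO_8859_1)

  def main(args: Array[String]): Unit = {
    println(asText.length)                   // 256
    println(original.sameElements(restored)) // true
  }
}
```

Calling getBytes without a charset here would use the platform default instead, which on a UTF-8 JVM would not reproduce the original bytes.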
    

    Alternatively, you might want a representation like how the Scala REPL will (sometimes) render it:

    "Array(" + newPerson.toByteArray.mkString(", ") + ")"
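If what is needed is a printable string that can be turned back into exactly the same bytes, Base64 is a common choice. This is a different technique from the ones above, shown only as a sketch: it uses java.util.Base64 from the JDK and hand-written sample bytes in place of newPerson.toByteArray (an assumption), and the helper names encode and decode are made up for illustration.

```scala
import java.util.Base64

object Base64RoundTripDemo {
  // Hand-written sample bytes standing in for newPerson.toByteArray.
  val original: Array[Byte] = Array(0x0A, 0x01, 0x41, 0x10, 0xC8, 0x01).map(_.toByte)

  // Bytes to a printable ASCII string.
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // Printable string back to the original bytes.
  def decode(text: String): Array[Byte] =
    Base64.getDecoder.decode(text)

  def main(args: Array[String]): Unit = {
    val text = encode(original)
    val restored = decode(text)
    println(text)
    println(original.sameElements(restored)) // true
  }
}
```

With the real message, the decoded bytes could then presumably be passed to Person.parseFrom (the parser that ScalaPB generates on the companion object) to rebuild the Person.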