numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.reshape.
Is there a way to serialize a numpy array to JSON format while preserving this information?
Note: The arrays may contain ints, floats or bools. It's reasonable to expect a transposed array.
Note 2: this is being done with the intent of passing the numpy array through a Storm topology using streamparse, in case such information ends up being relevant.
pickle.dumps or numpy.save encode all the information needed to reconstruct an arbitrary NumPy array, even in the presence of endianness issues, non-contiguous arrays, or weird structured dtypes. Endianness issues are probably the most important; you don't want array([1]) to suddenly become array([16777216]) because you loaded your array on a big-endian machine. pickle is probably the more convenient option, though save has its own benefits, given in the npy format rationale.
I'm giving options for serializing to JSON or a bytestring, because the original questioner needed JSON-serializable output, but most people coming here probably don't.
The pickle way:
import pickle
a = # some NumPy array
# Bytestring option
serialized = pickle.dumps(a)
deserialized_a = pickle.loads(serialized)
# JSON option
# latin-1 maps byte n to unicode code point n
serialized_as_json = json.dumps(pickle.dumps(a).decode('latin-1'))
deserialized_from_json = pickle.loads(json.loads(serialized_as_json).encode('latin-1'))
numpy.save uses a binary format, and it needs to write to a file, but you can get around that with io.BytesIO:
a = # any NumPy array
memfile = io.BytesIO()
numpy.save(memfile, a)
serialized = memfile.getvalue()
serialized_as_json = json.dumps(serialized.decode('latin-1'))
# latin-1 maps byte n to unicode code point n
And to deserialize:
memfile = io.BytesIO()
# If you're deserializing from a bytestring:
memfile.write(serialized)
# Or if you're deserializing from JSON:
# memfile.write(json.loads(serialized_as_json).encode('latin-1'))
memfile.seek(0)
a = numpy.load(memfile)