Is there a way to use one string.format() call to format a runtime-determined number of items?
I wrote a program that generates a million md5sums and converts them to strings with
md5.getDigest(input).toHexString();
It took 3 minutes on my laptop, but only 50 seconds if I modify the Crypto package's toHexString() as follows to use fewer separate .format() calls to generate the string (I'm assuming the difference comes from the string append that follows each call). I take this to mean formatting the string takes at least twice as long as generating the md5sum in the first place.
@@ -239,9 +239,17 @@ module Crypto {
      */
     proc toHexString() throws {
       var buffHexString: string;
-      for i in this.buffDomain do {
+      var next = this.buffDomain.first;
+      for i in this.buffDomain by 8 do
+        if this.buffDomain.contains(i+7) {
+          buffHexString += try ("%02xu"*8).format(
+            this.buff[i+0], this.buff[i+1], this.buff[i+2], this.buff[i+3],
+            this.buff[i+4], this.buff[i+5], this.buff[i+6], this.buff[i+7]);
+          next = i+8;
+        }
+      for i in this.buffDomain[next..] do
         buffHexString += try "%02xu".format(this.buff[i]);
-      }
+
       return buffHexString;
     }
   }
But that's gross. I'd like to do something like
buffHexString = try ("%02xu" * this.buff.size).format(this.buff .... something);
but string.format() only accepts its varargs via args ...?k, which requires the number of arguments to be a compile-time param.
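To illustrate the restriction, here is a minimal sketch of a variadic wrapper. The name hexOf is hypothetical and not part of the Crypto package. It works only because k is a param fixed at each call site, which is why an array whose size is known only at run time cannot be expanded into format() arguments the same way.

```chapel
use IO;

// Hypothetical helper: hex-format a compile-time-known number of bytes.
// `k` is a param, so each distinct argument count gets its own instantiation.
proc hexOf(args: uint(8) ...?k): string throws {
  // (...args) expands the homogeneous tuple back into k separate arguments.
  return ("%02xu" * k).format((...args));
}

// Three arguments, so k == 3 is known when this call is compiled.
const s = try! hexOf(0xde: uint(8), 0xad: uint(8), 0xbe: uint(8));
writeln(s);
```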
The question I'm asking is about getting string.format to work like this, but I'd also be happy with another way to generate a string like this all at once without any intermediate temporary strings. (I don't see a way to do it in Chapel code via string.createBorrowingBuffer() without dropping into c_ptrs.)
I think this post points out something missing from the string and bytes types. What is missing is the ability to append a codepoint (as an int(32)) or a byte (as a uint(8)). I will look at adding these to Chapel's standard library.
But I will answer your question more directly. It turns out that string.format actually operates through the IO system. That is why its format strings match those of writef: it is using the same implementation. string.format does this with openMemFile, and you can do the same thing yourself, e.g.:
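use IO;

// A is any array of uint(8), such as the digest buffer.
proc toHex(A: [] uint(8)): string throws {
  var f = openMemFile();
  {
    // Scope the writer so it is closed before the file is read back.
    var w = f.writer(locking=false);
    for byte in A {
      w.writef("%02xu", byte);
    }
  }
  var r = f.reader(locking=false);
  return r.readAll(string);
}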
In some quick performance experiments in this area, I observed the following on my system (running toHex on a 16-element array of uint(8)):
about 40k toHex calls/second with the original implementation
about 460k toHex calls/second with this openMemFile approach
about 550k toHex calls/second with
("%02xu"*16).format(
A[0], A[1], A[2], A[3],
A[4], A[5], A[6], A[7],
A[8], A[9], A[10], A[11],
A[12], A[13], A[14], A[15]);
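For anyone wanting to reproduce this kind of comparison, a rough timing sketch might look like the following. It is not the harness used for the numbers above. The names toHexAppend, toHexMemFile and trials are hypothetical, and it assumes a Chapel release that provides the stopwatch type in the Time module (older releases called it Timer).

```chapel
use IO, Time;

config const trials = 100000;  // hypothetical trial count; override with --trials

// Baseline: one format call and one string append per byte.
proc toHexAppend(A: [] uint(8)): string throws {
  var s: string;
  for b in A do
    s += "%02xu".format(b);
  return s;
}

// Variant: stream every byte through a single in-memory file.
proc toHexMemFile(A: [] uint(8)): string throws {
  var f = openMemFile();
  {
    var w = f.writer(locking=false);
    for b in A do
      w.writef("%02xu", b);
  } // the writer is closed here, before the read below
  var r = f.reader(locking=false);
  return r.readAll(string);
}

// Arbitrary 16-byte input (values 0, 17, 34, ..., 255).
var A: [0..15] uint(8);
for i in A.domain do
  A[i] = (i * 17): uint(8);

var sw: stopwatch;
var total = 0;  // accumulate lengths so the calls are not optimized away

sw.start();
for i in 1..trials do
  total += (try! toHexAppend(A)).size;
sw.stop();
writeln("append:  ", trials / sw.elapsed(), " calls/second");

sw.clear();
sw.start();
for i in 1..trials do
  total += (try! toHexMemFile(A)).size;
sw.stop();
writeln("memfile: ", trials / sw.elapsed(), " calls/second");
writeln("total characters produced: ", total);
```

Absolute numbers depend heavily on the machine and on compiling with --fast, so only the ratio between the two lines is meaningful.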
But even better performance is available with the ability to append a numeric byte value or codepoint value: