java android mp3 pcm jlayer

AudioTrack - short array to byte array distortion using jLayer (Java MP3 decoder)


I'm using jLayer to decode MP3 data, with this call:

SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);

That call returns the decoded data, and output.getBuffer() gives it to me as a short[] array.

When I pass that array to AudioTrack's write(), it plays fine as I loop through the file:

at.write(output.getBuffer(), 0, output.getBuffer().length);

However, when I convert the short[] array to byte[] array using any of the methods in this answer: https://stackoverflow.com/a/12347176/1176436 the sound gets distorted and jittery:

at.write(output.getBuffer(), 0, output.getBuffer().length);

becomes:

byte[] array = ShortToByte_Twiddle_Method(output.getBuffer());
at.write(array,  0,  array.length);

Am I doing anything wrong, and what can I do to fix it? Unfortunately I need the PCM data to be in a byte array for another third-party library I'm using. The file is 22 kHz, if that matters, and this is how at is being instantiated:

at = new AudioTrack(AudioManager.STREAM_MUSIC, 22050, AudioFormat.CHANNEL_OUT_STEREO,
                AudioFormat.ENCODING_PCM_16BIT, 10000 /* 10 second buffer */,
                AudioTrack.MODE_STREAM);   

Thank you so much in advance.

Edit: This is how I'm instantiating the AudioTrack variable now. So for 44kHz files, the value that is getting sent is 44100, while for 22kHz files, the value is 22050.

at = new AudioTrack(AudioManager.STREAM_MUSIC, decoder.getOutputFrequency(), 
                                  decoder.getOutputChannels() > 1 ? AudioFormat.CHANNEL_OUT_STEREO : AudioFormat.CHANNEL_OUT_MONO,
                                  AudioFormat.ENCODING_PCM_16BIT, 10000 /* 10 second buffer */,
                                  AudioTrack.MODE_STREAM);

This is the decode method:

public byte[] decode(InputStream inputStream, int startMs, int maxMs) throws IOException {
        ByteArrayOutputStream outStream = new ByteArrayOutputStream(1024);

        float totalMs = 0;
        boolean seeking = true;

        try {
            Bitstream bitstream = new Bitstream(inputStream);
            Decoder decoder = new Decoder();

            boolean done = false;
            while (!done) {
                Header frameHeader = bitstream.readFrame();
                if (frameHeader == null) {
                    done = true;
                } else {
                    totalMs += frameHeader.ms_per_frame();

                    if (totalMs >= startMs) {
                        seeking = false;
                    }

                    if (!seeking) {
                        // logger.debug("Handling header: " + frameHeader.layer_string());
                        SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);                            

                        short[] pcm = output.getBuffer();
                        for (short s : pcm) {
                            outStream.write(s & 0xff);
                            outStream.write((s >> 8) & 0xff);
                        }
                    }

                    if (totalMs >= (startMs + maxMs)) {
                        done = true;
                    }
                }
                bitstream.closeFrame();
            }

            return outStream.toByteArray();
        } catch (BitstreamException e) {
            throw new IOException("Bitstream error: " + e);
        } catch (DecoderException e) {
            throw new IOException("Decoder error: " + e);
        }
    }

This is how it sounds (wait a few seconds): https://vimeo.com/60951237 (and this is the actual file: http://www.tonycuffe.com/mp3/tail%20toddle.mp3)

Edit: I would have loved to have split the bounty, but instead I have given the bounty to Bill and the accepted answer to Neil. Both were a tremendous help. For those wondering, I ended up rewriting the Sonic native code which helped me move along the process.


Solution

  • As @Bill Pringlemeir says, the problem is that your conversion method doesn't actually convert. A short is a 16-bit number; a byte is an 8-bit number. The method you have chosen doesn't convert the contents of the shorts (i.e. go from 16 bits to 8 bits per sample); it only changes the way the same collection of bits is stored. As you say, you need something like this:

    SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
    byte[] array = MyShortToByte(output.getBuffer());
    at.write(array,  0,  array.length);
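
To make the distinction concrete, here is a minimal sketch that contrasts the two operations. The class and method names are hypothetical and are not part of jLayer or Android. Repacking stores each 16-bit sample as two little-endian bytes, so every bit is kept and the array doubles in length. Narrowing divides each sample by 256, so the result is one signed 8-bit value per sample.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class PcmPackingSketch {
    /** Repack: the same 16-bit samples, stored as 2 little-endian bytes each. */
    static byte[] repack(short[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (short s : samples) {
            buf.putShort(s);
        }
        return buf.array();
    }

    /** Narrow: divide by 256, giving one signed 8-bit sample per input sample. */
    static byte[] narrow(short[] samples) {
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = (byte) (samples[i] / 256);
        }
        return out;
    }

    public static void main(String[] args) {
        short[] pcm = {0x1234, -2};
        System.out.println(Arrays.toString(repack(pcm))); // [52, 18, -2, -1]
        System.out.println(Arrays.toString(narrow(pcm))); // [18, 0]
    }
}
```

Which of the two a consumer expects depends on the sample format it is configured for: the repacked array is still 16-bit data, while the narrowed array is 8-bit data.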
    

    @Bill Pringlemeir's approach is equivalent to dividing all the shorts by 256 to ensure they fit in the byte range:

    // requires: import java.nio.ByteBuffer;
    byte[] MyShortToByte(short[] buffer) {
        int N = buffer.length;
        ByteBuffer byteBuf = ByteBuffer.allocate(N);  // one byte per input sample
        for (int i = 0; i < N; i++) {
            byte b = (byte)(buffer[i]/256);  /* scale the 16-bit range down to the byte range */
            byteBuf.put(b);
        }
        return byteBuf.array();
    }
    

    This will work, but will probably give you very quiet, edgy tones. If you can afford the processing time, a two pass approach will probably give better results:

    byte[] MyShortToByte(short[] buffer) {
        int N = buffer.length;
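        // First pass: find the extremes. int avoids overflow when negating -32768.
        int min = 0;
        int max = 0;
        for (int i=0; i<N; i++) {
            if (buffer[i] > max) max = buffer[i];
            if (buffer[i] < min) min = buffer[i];
        }
        // Scale by the peak magnitude so asymmetric signals cannot overflow a byte.
        int peak = Math.max(max, -min);
        int scaling = 1+peak/128; // 1+ ensures we stay within range and guarantees no divide by zero if the sequence is pure silence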
    
        ByteBuffer byteBuf = ByteBuffer.allocate(N);
        for (int i=0; i<N; i++) {
            byte b = (byte)(buffer[i]/scaling);  /*convert to byte. */
            byteBuf.put(b);
        }
        return byteBuf.array();
    }
    

    Again, beware of signed/unsigned issues. The above works signed to signed and unsigned to unsigned, but not between the two. It may be that you are reading signed shorts (-32768 to 32767) but need to output unsigned bytes (0 to 255), ...
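
If the output does need to be unsigned 8-bit, one way to bridge the two is to add an offset of 128 after narrowing. As far as I know, the Android documentation describes AudioFormat.ENCODING_PCM_8BIT as unsigned with 128 as the zero level, but check the format your consumer expects. The sketch below is an illustration under that assumption; the class and method names are hypothetical.

```java
final class SignedToUnsignedSketch {
    /** Map one signed 16-bit sample to an unsigned 8-bit level (0..255), stored in a Java byte. */
    static byte toUnsigned8(short sample) {
        int level = (sample >> 8) + 128;   // -128..127 becomes 0..255
        return (byte) level;               // read back with (b & 0xff)
    }

    static byte[] convert(short[] samples) {
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = toUnsigned8(samples[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] out = convert(new short[] {-32768, 0, 32767});
        for (byte b : out) {
            System.out.println(b & 0xff);  // prints 0, then 128, then 255
        }
    }
}
```

Because a Java byte is signed, the stored values look negative for levels above 127. Masking with 0xff recovers the unsigned level.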

    If you can afford the processing time, a more precise (smoother) approach would be to go via floats (this also makes the signed/unsigned issue easier to handle, since an offset can be added before the final cast):

    byte[] MyShortToByte(short[] buffer) {
        int N = buffer.length;
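        float[] f = new float[N];
        float min = 0.0f;
        float max = 0.0f;
        // First pass: copy to floats and find the extremes.
        for (int i=0; i<N; i++) {
            f[i] = (float)(buffer[i]);
            if (f[i] > max) max = f[i];
            if (f[i] < min) min = f[i];
        }
        // Scale by the peak magnitude so asymmetric signals cannot overflow a byte.
        float peak = Math.max(max, -min);
        float scaling = 1.0f+peak/128.0f; // +1 ensures we stay within range and guarantees no divide by zero if the sequence is pure silence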
    
        ByteBuffer byteBuf = ByteBuffer.allocate(N);
        for (int i=0; i<N; i++) {
            byte b = (byte)(f[i]/scaling);  /*convert to byte. */
            byteBuf.put(b);
        }
        return byteBuf.array();
    }
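
To see why the fixed divide-by-256 tends to sound quiet compared with two-pass scaling, the following self-contained sketch runs both on a low-level signal. The names are hypothetical, and the peak-based scale factor is one possible choice, not necessarily the only sensible one.

```java
import java.util.Arrays;

final class ScalingComparisonSketch {
    /** Fixed scaling: always divide by 256, regardless of how loud the input is. */
    static byte[] fixedScale(short[] samples) {
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = (byte) (samples[i] / 256);
        }
        return out;
    }

    /** Two-pass scaling: find the peak magnitude first, then divide by a factor derived from it. */
    static byte[] peakScale(short[] samples) {
        int peak = 0;
        for (short s : samples) {
            peak = Math.max(peak, Math.abs((int) s));
        }
        int scaling = 1 + peak / 128;  // never zero, keeps results within -127..127
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = (byte) (samples[i] / scaling);
        }
        return out;
    }

    public static void main(String[] args) {
        short[] quiet = {1000, -1000, 500};
        System.out.println(Arrays.toString(fixedScale(quiet))); // [3, -3, 1]
        System.out.println(Arrays.toString(peakScale(quiet)));  // [125, -125, 62]
    }
}
```

The fixed version leaves this signal with only a handful of distinct levels, while the two-pass version uses most of the byte range. The trade-off is that the gain then varies from buffer to buffer when each buffer is scaled independently.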