Android -- A Brief Analysis of the AudioTrack Internals in the Audio System (Part 2)


In the previous post we analyzed the startup of the Native services that the Audio system depends on; "startup" there really meant walking through their initialization flow. AudioTrack is one of the Audio APIs Android provides to applications; it manages and implements the interface for playing PCM audio. AudioTrack manages its data in a "push" fashion: we must actively hand data to the AudioTrack object by calling its write() method.


AudioTrack can operate in one of two modes:

  • STREAM mode: the application passes a continuous stream of audio data to the AudioTrack object through write() calls. The call is blocking: it returns only after the audio data has been passed from the Java layer into the Native layer and added to the playback queue. STREAM mode is generally used for larger amounts of audio data.
  • STATIC mode: used when the audio data is small enough to be loaded into memory in one shot for playback, and the latency requirements are strict. (A minimal sketch of this mode follows the list.)
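
For contrast with the STREAM-mode demo analyzed below, here is a minimal STATIC-mode sketch (the source of clipData is hypothetical; the point is that in this mode the whole clip is written before play()):

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

public class StaticModeDemo {
    // Play a short, fully in-memory 16-bit PCM clip using MODE_STATIC.
    public void playClip(byte[] clipData) {
        AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
                8000,                             // sample rate in Hz
                AudioFormat.CHANNEL_OUT_STEREO,
                AudioFormat.ENCODING_PCM_16BIT,
                clipData.length,                  // in MODE_STATIC, the buffer must hold the whole clip
                AudioTrack.MODE_STATIC);
        track.write(clipData, 0, clipData.length); // STATIC mode: write once, before play()
        track.play();
        // ... later: track.stop(); track.release();
    }
}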

Since MediaPlayer is so full-featured, we rarely get to use AudioTrack directly. But it is an effective vehicle for analyzing the Audio system's call chain across layers and its implementation flow. Hopefully, after walking through AudioTrack's internals, we will all have a basic understanding of the Android Audio module.


Next, using a STREAM-mode code demo, we will analyze its internal implementation flow layer by layer and see how AudioTrack interacts with Native services such as AudioFlinger and AudioPolicyService.


Sample code:

public void initAudioTrack() {

        // 1. Query the minimum buffer size for the configured audio parameters
        //    (CHANNEL_OUT_STEREO replaces the deprecated CHANNEL_CONFIGURATION_STEREO)
        int bufsize = AudioTrack.getMinBufferSize(8000, AudioFormat.CHANNEL_OUT_STEREO,
                AudioFormat.ENCODING_PCM_16BIT);

        // 2. Specify the stream type, sample rate, channel configuration, format,
        //    mode of operation, and other parameters
        AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 8000, AudioFormat.CHANNEL_OUT_STEREO,
                AudioFormat.ENCODING_PCM_16BIT, bufsize, AudioTrack.MODE_STREAM);

        // 3. Start playback
        track.play();

        byte[] pkg = new byte[1024]; // dummy data, used only for this analysis;
                                     // in practice write() usually sits inside a loop

        // 4. Write the audio data
        track.write(pkg, 0, pkg.length);

        // 5. Stop playback
        track.stop();

        // 6. Release the associated resources
        track.release();

    }

The sample code lists the important steps for using AudioTrack. By breaking these steps apart and separating the call layers, we can analyze from the Java calls down into the JNI calls, the Native calls, and so on, until we reach our analysis goal.


1. Get the Minimum Required Buffer Size


Audio data has many properties, such as sample rate, audio format, and channel count, and these properties are tied to the Audio Interface in use. Different audio property configurations change the minimum buffer size the lower layers need to play the data. So, based on the configured audio parameters, we first obtain the corresponding minimum buffer size, and only then proceed with the remaining steps.


AudioTrack::getMinBufferSize() is defined as follows:

/**
     * Returns the estimated minimum buffer size required for an AudioTrack
     * object to be created in the {@link #MODE_STREAM} mode.
     * The size is an estimate because it does not consider either the route or the sink,
     * since neither is known yet.  Note that this size doesn't
     * guarantee a smooth playback under load, and higher values should be chosen according to
     * the expected frequency at which the buffer will be refilled with additional data to play.
     * For example, if you intend to dynamically set the source sample rate of an AudioTrack
     * to a higher value than the initial source sample rate, be sure to configure the buffer size
     * based on the highest planned sample rate.
     * @param sampleRateInHz the source sample rate expressed in Hz.
     *   {@link AudioFormat#SAMPLE_RATE_UNSPECIFIED} is not permitted.
     * @param channelConfig describes the configuration of the audio channels.
     *   See {@link AudioFormat#CHANNEL_OUT_MONO} and
     *   {@link AudioFormat#CHANNEL_OUT_STEREO}
     * @param audioFormat the format in which the audio data is represented.
     *   See {@link AudioFormat#ENCODING_PCM_16BIT} and
     *   {@link AudioFormat#ENCODING_PCM_8BIT},
     *   and {@link AudioFormat#ENCODING_PCM_FLOAT}.
     * @return {@link #ERROR_BAD_VALUE} if an invalid parameter was passed,
     *   or {@link #ERROR} if unable to query for output properties,
     *   or the minimum buffer size expressed in bytes.
     */
    static public int getMinBufferSize(int sampleRateInHz, int channelConfig, int audioFormat) {
        int channelCount = 0;
        switch(channelConfig) { // determine the channel count from the channel configuration
        case AudioFormat.CHANNEL_OUT_MONO:
        case AudioFormat.CHANNEL_CONFIGURATION_MONO:
            channelCount = 1;
            break;
        case AudioFormat.CHANNEL_OUT_STEREO:
        case AudioFormat.CHANNEL_CONFIGURATION_STEREO:
            channelCount = 2;
            break;
        default:
            if (!isMultichannelConfigSupported(channelConfig)) {
                loge("getMinBufferSize(): Invalid channel configuration.");
                return ERROR_BAD_VALUE;
            } else {
                channelCount = AudioFormat.channelCountFromOutChannelMask(channelConfig);
            }
        }

        if (!AudioFormat.isPublicEncoding(audioFormat)) { // validate the audio format
            loge("getMinBufferSize(): Invalid audio format.");
            return ERROR_BAD_VALUE;
        }

        // sample rate, note these values are subject to change
        // Note: AudioFormat.SAMPLE_RATE_UNSPECIFIED is not allowed
        if ( (sampleRateInHz < AudioFormat.SAMPLE_RATE_HZ_MIN) ||
                (sampleRateInHz > AudioFormat.SAMPLE_RATE_HZ_MAX) ) { // check that the sample rate is legal, i.e. within the supported range
            loge("getMinBufferSize(): " + sampleRateInHz + " Hz is not a supported sample rate.");
            return ERROR_BAD_VALUE;
        }

        int size = native_get_min_buff_size(sampleRateInHz, channelCount, audioFormat); // sample rate, channel count, etc. depend on what the hardware supports, so query it through JNI
        if (size <= 0) {
            loge("getMinBufferSize(): error querying hardware");
            return ERROR;
        }
        else {
            return size;
        }
    }

As the inline comments show, the method first validates its parameters, then hands the work into the Native layer through a JNI call.


The key call in this step is native_get_min_buff_size().
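
As a quick sanity check of what the returned value means, here is a small sketch relating the minimum buffer size to the PCM frame size (the 2-bytes-per-sample figure follows from ENCODING_PCM_16BIT; the class and method names are ours):

import android.media.AudioFormat;
import android.media.AudioTrack;

public class MinBufferCheck {
    // Convert the queried minimum buffer size (bytes) into frames.
    public static int minBufferInFrames() {
        int bufsize = AudioTrack.getMinBufferSize(8000,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
        if (bufsize == AudioTrack.ERROR_BAD_VALUE || bufsize == AudioTrack.ERROR) {
            return -1; // invalid parameters, or the hardware query failed
        }
        // For 16-bit stereo PCM, one frame = 2 channels * 2 bytes = 4 bytes;
        // at 8 kHz, the buffer therefore covers (bufsize / 4) / 8000 seconds.
        int bytesPerFrame = 2 /* channels */ * 2 /* bytes per sample */;
        return bufsize / bytesPerFrame;
    }
}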


2. Create the AudioTrack Object


Once we have the minimum buffer size, we can create the AudioTrack instance. The AudioTrack constructor is defined as:

//--------------------------------------------------------------------------
    // Constructor, Finalize
    //--------------------
    /**
     * Class constructor.
     * @param streamType the type of the audio stream. See
     *   {@link AudioManager#STREAM_VOICE_CALL}, {@link AudioManager#STREAM_SYSTEM},
     *   {@link AudioManager#STREAM_RING}, {@link AudioManager#STREAM_MUSIC},
     *   {@link AudioManager#STREAM_ALARM}, and {@link AudioManager#STREAM_NOTIFICATION}.
     * @param sampleRateInHz the initial source sample rate expressed in Hz.
     *   {@link AudioFormat#SAMPLE_RATE_UNSPECIFIED} means to use a route-dependent value
     *   which is usually the sample rate of the sink.
     *   {@link #getSampleRate()} can be used to retrieve the actual sample rate chosen.
     * @param channelConfig describes the configuration of the audio channels.
     *   See {@link AudioFormat#CHANNEL_OUT_MONO} and
     *   {@link AudioFormat#CHANNEL_OUT_STEREO}
     * @param audioFormat the format in which the audio data is represented.
     *   See {@link AudioFormat#ENCODING_PCM_16BIT},
     *   {@link AudioFormat#ENCODING_PCM_8BIT},
     *   and {@link AudioFormat#ENCODING_PCM_FLOAT}.
     * @param bufferSizeInBytes the total size (in bytes) of the internal buffer where audio data is
     *   read from for playback. This should be a nonzero multiple of the frame size in bytes.
     *   <p> If the track's creation mode is {@link #MODE_STATIC},
     *   this is the maximum length sample, or audio clip, that can be played by this instance.
     *   <p> If the track's creation mode is {@link #MODE_STREAM},
     *   this should be the desired buffer size
     *   for the <code>AudioTrack</code> to satisfy the application's
     *   latency requirements.
     *   If <code>bufferSizeInBytes</code> is less than the
     *   minimum buffer size for the output sink, it is increased to the minimum
     *   buffer size.
     *   The method {@link #getBufferSizeInFrames()} returns the
     *   actual size in frames of the buffer created, which
     *   determines the minimum frequency to write
     *   to the streaming <code>AudioTrack</code> to avoid underrun.
     *   See {@link #getMinBufferSize(int, int, int)} to determine the estimated minimum buffer size
     *   for an AudioTrack instance in streaming mode.
     * @param mode streaming or static buffer. See {@link #MODE_STATIC} and {@link #MODE_STREAM}
     * @throws java.lang.IllegalArgumentException
     */
    public AudioTrack(int streamType, int sampleRateInHz, int channelConfig, int audioFormat,
            int bufferSizeInBytes, int mode)
    throws IllegalArgumentException {
        this(streamType, sampleRateInHz, channelConfig, audioFormat,
                bufferSizeInBytes, mode, AudioManager.AUDIO_SESSION_ID_GENERATE);
    }

The call chain finally lands in:

/**
     * Class constructor with {@link AudioAttributes} and {@link AudioFormat}.
     * @param attributes a non-null {@link AudioAttributes} instance.
     * @param format a non-null {@link AudioFormat} instance describing the format of the data
     *     that will be played through this AudioTrack. See {@link AudioFormat.Builder} for
     *     configuring the audio format parameters such as encoding, channel mask and sample rate.
     * @param bufferSizeInBytes the total size (in bytes) of the internal buffer where audio data is
     *   read from for playback. This should be a nonzero multiple of the frame size in bytes.
     *   <p> If the track's creation mode is {@link #MODE_STATIC},
     *   this is the maximum length sample, or audio clip, that can be played by this instance.
     *   <p> If the track's creation mode is {@link #MODE_STREAM},
     *   this should be the desired buffer size
     *   for the <code>AudioTrack</code> to satisfy the application's
     *   latency requirements.
     *   If <code>bufferSizeInBytes</code> is less than the
     *   minimum buffer size for the output sink, it is increased to the minimum
     *   buffer size.
     *   The method {@link #getBufferSizeInFrames()} returns the
     *   actual size in frames of the buffer created, which
     *   determines the minimum frequency to write
     *   to the streaming <code>AudioTrack</code> to avoid underrun.
     *   See {@link #getMinBufferSize(int, int, int)} to determine the estimated minimum buffer size
     *   for an AudioTrack instance in streaming mode.
     * @param mode streaming or static buffer. See {@link #MODE_STATIC} and {@link #MODE_STREAM}.
     * @param sessionId ID of audio session the AudioTrack must be attached to, or
     *   {@link AudioManager#AUDIO_SESSION_ID_GENERATE} if the session isn't known at construction
     *   time. See also {@link AudioManager#generateAudioSessionId()} to obtain a session ID before
     *   construction.
     * @throws IllegalArgumentException
     */
    public AudioTrack(AudioAttributes attributes, AudioFormat format, int bufferSizeInBytes,
            int mode, int sessionId)
                    throws IllegalArgumentException {
        super(attributes);
        // mState already == STATE_UNINITIALIZED

        if (format == null) {
            throw new IllegalArgumentException("Illegal null AudioFormat");
        }

        // remember which looper is associated with the AudioTrack instantiation
        Looper looper;
        if ((looper = Looper.myLooper()) == null) {
            looper = Looper.getMainLooper();
        }

        int rate = format.getSampleRate();
        if (rate == AudioFormat.SAMPLE_RATE_UNSPECIFIED) {
            rate = 0;
        }

        int channelIndexMask = 0;
        if ((format.getPropertySetMask()
                & AudioFormat.AUDIO_FORMAT_HAS_PROPERTY_CHANNEL_INDEX_MASK) != 0) {
            channelIndexMask = format.getChannelIndexMask();
        }
        int channelMask = 0;
        if ((format.getPropertySetMask()
                & AudioFormat.AUDIO_FORMAT_HAS_PROPERTY_CHANNEL_MASK) != 0) {
            channelMask = format.getChannelMask();
        } else if (channelIndexMask == 0) { // if no masks at all, use stereo
            channelMask = AudioFormat.CHANNEL_OUT_FRONT_LEFT
                    | AudioFormat.CHANNEL_OUT_FRONT_RIGHT;
        }
        int encoding = AudioFormat.ENCODING_DEFAULT;
        if ((format.getPropertySetMask() & AudioFormat.AUDIO_FORMAT_HAS_PROPERTY_ENCODING) != 0) {
            encoding = format.getEncoding();
        }
        audioParamCheck(rate, channelMask, channelIndexMask, encoding, mode);
        mStreamType = AudioSystem.STREAM_DEFAULT;

        audioBuffSizeCheck(bufferSizeInBytes);

        mInitializationLooper = looper;

        if (sessionId < 0) {
            throw new IllegalArgumentException("Invalid audio session ID: "+sessionId);
        }

        int[] sampleRate = new int[] {mSampleRate};
        int[] session = new int[1];
        session[0] = sessionId;
        // native initialization
        int initResult = native_setup(new WeakReference<AudioTrack>(this), mAttributes,
                sampleRate, mChannelMask, mChannelIndexMask, mAudioFormat,
                mNativeBufferSizeInBytes, mDataLoadMode, session, 0 /*nativeTrackInJavaObj*/); // pass down the constructor parameters; importantly, this includes a weak reference to this AudioTrack object
        if (initResult != SUCCESS) {
            loge("Error code "+initResult+" when initializing AudioTrack.");
            return; // with mState == STATE_UNINITIALIZED
        }

        mSampleRate = sampleRate[0];
        mSessionId = session[0];

        if (mDataLoadMode == MODE_STATIC) {
            mState = STATE_NO_STATIC_DATA;
        } else {
            mState = STATE_INITIALIZED;
        }
    }

AudioAttributes implements the Parcelable interface; it is a data-bundle class that stores audio configuration information, and for now we only need to note its role as a data container. The constructor's handling is also quite direct: apart from some initialization work, the important call is native_setup(), at which point processing again moves into JNI.
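
The stream-type constructor used in our demo ultimately wraps the legacy stream type in an AudioAttributes instance and funnels into the constructor above. For reference, here is a sketch of building the same track directly through the AudioAttributes/AudioFormat path (parameter values mirror the demo; USAGE_MEDIA and CONTENT_TYPE_MUSIC roughly correspond to STREAM_MUSIC):

import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

public class AttributesDemo {
    public AudioTrack buildTrack(int bufsize) {
        AudioAttributes attributes = new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                .build();
        AudioFormat format = new AudioFormat.Builder()
                .setSampleRate(8000)
                .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .build();
        // This lands in the constructor analyzed above, which calls native_setup().
        return new AudioTrack(attributes, format, bufsize,
                AudioTrack.MODE_STREAM, AudioManager.AUDIO_SESSION_ID_GENERATE);
    }
}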



3. Start Playback


After the AudioTrack object has been created, we start playback by calling the play() method:

//---------------------------------------------------------
    // Transport control methods
    //--------------------
    /**
     * Starts playing an AudioTrack.
     * <p>
     * If track's creation mode is {@link #MODE_STATIC}, you must have called one of
     * the write methods ({@link #write(byte[], int, int)}, {@link #write(byte[], int, int, int)},
     * {@link #write(short[], int, int)}, {@link #write(short[], int, int, int)},
     * {@link #write(float[], int, int, int)}, or {@link #write(ByteBuffer, int, int)}) prior to
     * play().
     * <p>
     * If the mode is {@link #MODE_STREAM}, you can optionally prime the data path prior to
     * calling play(), by writing up to <code>bufferSizeInBytes</code> (from constructor).
     * If you don't call write() first, or if you call write() but with an insufficient amount of
     * data, then the track will be in underrun state at play().  In this case,
     * playback will not actually start playing until the data path is filled to a
     * device-specific minimum level.  This requirement for the path to be filled
     * to a minimum level is also true when resuming audio playback after calling stop().
     * Similarly the buffer will need to be filled up again after
     * the track underruns due to failure to call write() in a timely manner with sufficient data.
     * For portability, an application should prime the data path to the maximum allowed
     * by writing data until the write() method returns a short transfer count.
     * This allows play() to start immediately, and reduces the chance of underrun.
     *
     * @throws IllegalStateException if the track isn't properly initialized
     */
    public void play()
    throws IllegalStateException {
        if (mState != STATE_INITIALIZED) {
            throw new IllegalStateException("play() called on uninitialized AudioTrack.");
        }
        baseStart();
        synchronized(mPlayStateLock) {
            native_start();
            mPlayState = PLAYSTATE_PLAYING;
        }
    }

The handling is simple; the actual start of playback is switched into JNI via:

native_start().
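
The Javadoc above recommends priming the data path in MODE_STREAM before play(). Here is a sketch of that pattern (the all-zero priming data is hypothetical; per the Javadoc, write() returns a short transfer count once the path is full):

import android.media.AudioTrack;

public class PrimeDemo {
    // Fill the track's data path before starting, so play() can begin
    // immediately and the risk of underrun is reduced.
    public void primeAndPlay(AudioTrack track) {
        byte[] silence = new byte[1024]; // hypothetical priming data (all zeros)
        int written;
        do {
            written = track.write(silence, 0, silence.length);
        } while (written == silence.length); // short count => path is full
        track.play();
    }
}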



4. Write Data


AudioTrack defines several overloaded write() methods; they mainly differ in the array type that carries the audio data, and that difference determines which overload gets called. The JNI layer distinguishes these overloads as well:

private native final int native_write_byte(byte[] audioData,
                                               int offsetInBytes, int sizeInBytes, int format,
                                               boolean isBlocking);

    private native final int native_write_short(short[] audioData,
                                                int offsetInShorts, int sizeInShorts, int format,
                                                boolean isBlocking);

    private native final int native_write_float(float[] audioData,
                                                int offsetInFloats, int sizeInFloats, int format,
                                                boolean isBlocking);

    private native final int native_write_native_bytes(Object audioData,
            int positionInBytes, int sizeInBytes, int format, boolean blocking);

Although different JNI functions are declared for each overload, their implementations in the JNI layer are ultimately almost identical. We will not walk through the write() definitions in AudioTrack here; they all end up calling these native functions and entering JNI.


The key calls in this step: native_write_byte()/native_write_short()...
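
In practice, the write() call from step 4 of the demo usually sits in a loop that feeds the track chunk by chunk. A sketch (the InputStream source and the class/method names are hypothetical):

import android.media.AudioTrack;

import java.io.IOException;
import java.io.InputStream;

public class WriteLoopDemo {
    // Feed PCM data to a playing MODE_STREAM track chunk by chunk.
    // write() blocks until the chunk has been queued and returns the number
    // of bytes written, or a negative error code such as ERROR_INVALID_OPERATION.
    public void pump(AudioTrack track, InputStream pcmSource) throws IOException {
        byte[] chunk = new byte[1024];
        int read;
        while ((read = pcmSource.read(chunk)) > 0) {
            int written = track.write(chunk, 0, read);
            if (written < 0) {
                break; // the track was not initialized properly, or was released
            }
        }
    }
}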


5. Stop Playback


When we need to stop playback, we call the AudioTrack::stop() method:

/**
     * Stops playing the audio data.
     * When used on an instance created in {@link #MODE_STREAM} mode, audio will stop playing
     * after the last buffer that was written has been played. For an immediate stop, use
     * {@link #pause()}, followed by {@link #flush()} to discard audio data that hasn't been played
     * back yet.
     * @throws IllegalStateException
     */
    public void stop()
    throws IllegalStateException {
        if (mState != STATE_INITIALIZED) {
            throw new IllegalStateException("stop() called on uninitialized AudioTrack.");
        }

        // stop playing
        synchronized(mPlayStateLock) {
            native_stop();
            mPlayState = PLAYSTATE_STOPPED;
            mAvSyncHeader = null;
            mAvSyncBytesRemaining = 0;
        }
    }

Its handling is straightforward and moves into JNI; the key call here is:

native_stop().
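
As the Javadoc notes, stop() lets data that has already been written finish playing. For an immediate stop, the documented alternative is pause() followed by flush() (a sketch; the class name is ours):

import android.media.AudioTrack;

public class StopNowDemo {
    // Immediate stop: halt playback right away, then discard any audio data
    // that has been written but not yet played.
    public void stopNow(AudioTrack track) {
        track.pause();
        track.flush();
    }
}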



6. Release Resources


After stopping playback, we still need to release the various system resources associated with the AudioTrack, which is done by calling AudioTrack::release():

/**
     * Releases the native AudioTrack resources.
     */
    public void release() {
        // even though native_release() stops the native AudioTrack, we need to stop
        // AudioTrack subclasses too.
        try {
            stop();
        } catch(IllegalStateException ise) {
            // don't raise an exception, we're releasing the resources.
        }
        baseRelease();
        native_release();
        mState = STATE_UNINITIALIZED;
    }

As with stop(), processing enters JNI through the native_release() call.
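
A common defensive pattern is to wrap the whole lifecycle so that release() runs even if an intermediate call throws; a sketch assembled from the demo's steps:

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

public class LifecycleDemo {
    // Guarantee that the native resources are freed even if play()/write() throw.
    public void playSafely(byte[] pcm) {
        int bufsize = AudioTrack.getMinBufferSize(8000,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 8000,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT,
                bufsize, AudioTrack.MODE_STREAM);
        try {
            track.play();
            track.write(pcm, 0, pcm.length);
            track.stop();
        } finally {
            track.release(); // release() itself calls stop() internally, as shown above
        }
    }
}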



Having analyzed the Java layer of AudioTrack, we can see it is really very simple: all of the complex processing is handed over to JNI, and from there into the Native layer for the final handling.


Let's review the six key calls identified in the analysis above:

  • native_get_min_buff_size(): query the minimum buffer size
  • native_setup(): complete the creation of the AudioTrack
  • native_start(): start AudioTrack playback
  • native_write_byte()/native_write_short()...: write audio data to the AudioTrack
  • native_stop(): stop AudioTrack playback
  • native_release(): release the system resources bound to the AudioTrack

It looks like the Java AudioTrack shifts all the heavy lifting through JNI onto the Native layer. In the next part we will analyze these calls in the JNI layer and see how they handle the Java AudioTrack's requests.