ios 视频编码与解码苹果手机视频编码器

转载

IT剑客行 2023-07-13 12:15:34

文章标签 ios 视频编码与解码音视频视频编解码实时音视频 webrtc 文章分类 iOS 移动开发

VideoToolBox

2014年的WWDC大会,苹果发布iOS 8.0,开放了硬编解码的API，也就是VideoToolBox
VideoToolbox 一个底层视频硬编解码框架，纯C语言API，可直接访问硬件的编解码器。提供性能极高的硬编硬解，同时也提供存储在 CoreVideo 像素缓冲区的图像进行格式的转换。
这篇文章记录基于VideoToolBox封装一个H264硬解码视频的工具

硬解码优点

速度快，降低解码时间
功耗大幅降低，降低使用App时的电量消耗

VideoToolBox解码的输入输出分别是什么？

输入NALU数据，输出CVPixelBufferRef

硬解码步骤

思路

1 解析数据(NALU Unit) 判断I/P/B帧
2 初始化解码器会话，配置参数,(和编码器不同，解码器的初始化，依赖sps、pps，所以一定是拿到数据之后才能初始化解码器)
3 将解析后的H264 NALU Unit 输入到解码器
4 在解码完成的回调函数里，回调解码后的数据(可以使用OpenGL ES显示)

核心函数

1 创建解码会话， VTDecompressionSessionCreate
2 解码一个frame，VTDecompressionSessionDecodeFrame
3 销毁解码会话 VTDecompressionSessionInvalidate

相关知识点

I帧关键帧，保留了一张完整的视频帧，解码的关键！
P帧向前参考帧，差异数据，解码需要依赖I帧
B帧双向参考帧，解码需要同时依赖I帧和P帧
如果H264码流中I帧错误/丢失，就会导致错误传递，P和B无法单独完成解码。会有花屏现象产生
使用VideoToolBox 硬编码时，第一帧并不是I，而是手动加入的SPS/PPS
解码时，需要使用SPS/PPS来对解码器进行初始化。

一接收并分析数据

获取数据类型，sps/pps赋值给成员变量，用来初始化解码器
IBP帧给解码器参与解码

/// 解析NALU数据

- (void)decodeNaluData:(uint8_t *)naluData withSize:(uint32_t)frameSize {
    // 数据类型:frame，前四个字节为NALU开始码，00 00 00 01*
    // 第五位标识数据类型，转化十进制，7表示sps，8表示pps，5表示I帧*
    int type = (naluData[4] & 0x1F);
    // 将NALU的开始码转为4字节大端NALU的长度信息
    uint32_t naluSize = frameSize - 4;
    uint8_t *pNaluSize = (uint8_t *)(&naluSize);
    naluData[0] = *(pNaluSize + 3);
    naluData[1] = *(pNaluSize + 2);
    naluData[2] = *(pNaluSize + 1);
    naluData[3] = *(pNaluSize);
    CVPixelBufferRef pixelBuffer = NULL;

    /**
     第一次解析时: 初始化解码器initDecoder
     判断数据类型，帧数据调用decode:(uint8_t *)frame
     sps/pps数据，则给成员变量赋值保存
     */
    switch (type) {
        case 0x05:
            // 关键帧
            if ([self initDecoderSession]) {
                pixelBuffer = [self decode:naluData withSize:frameSize];
            }
            break;
        case 0x06:
            // 增强型
            break;
        case 0x07:
            // sps
            _spsSize = naluSize;
            _sps = malloc(_spsSize);
            // 从下标4(也就是第五个元素)开始复制数据
            memcpy(_sps, &naluData[4], _spsSize);
            break;
        case 0x08:
            // pps
            _ppsSize = naluSize;
            _pps = malloc(_ppsSize);
            // 从下标4(也就是第五个元素)开始复制数据
            memcpy(_pps, &naluData[4], _ppsSize);
            break;
        default:
            // 其他帧（1-5）
            if ([self initDecoderSession]) {
                pixelBuffer = [self decode:naluData withSize:frameSize];
            }
            break;
    }
}

本文福利，免费领取C++音视频学习资料包、技术视频，内容包括（音视频开发，面试题，FFmpeg ，webRTC ，rtmp ，hls ，rtsp ，ffplay ，编解码，推拉流，srs）↓↓↓↓↓↓见下面↓↓文章底部点击免费领取↓↓

二初始化解码器

拿到SPS\PPS才能拿到CMVideoFormatDescriptionRef，CMVideoFormatDescriptionRef拿到才能初始化解码会话

/// 初始化解码会话

- (BOOL)initDecoderSession {
    if (self.decodeSession) return YES;
    const uint8_t * const parameterSetPointers[2] = {_sps, _pps};
    const size_t parameterSetSizes[2] = {_spsSize, _ppsSize};
    int naluHeaderLen = 4;
    
    /**
     根据sps pps设置解码参数
     param kCFAllocatorDefault 分配器
     param 2 参数个数
     param parameterSetPointers 参数集指针
     param parameterSetSizes 参数集大小
     param naluHeaderLen nalu nalu start code 的长度 4
     param _decodeDesc 解码器描述
     return 状态
     */
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, parameterSetPointers, parameterSetSizes, naluHeaderLen, &_videoDesc);
    if (status != noErr) {
        NSLog(@"Video hard DecodeSession create H264ParameterSets(sps, pps) failed status= %d", (int)status);
        return NO;
    }

    /**
     解码参数:

    * kCVPixelBufferPixelFormatTypeKey:摄像头的输出数据格式

     kCVPixelBufferPixelFormatTypeKey，已测可用值为

        kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange，即420v

        kCVPixelFormatType_420YpCbCr8BiPlanarFullRange，即420f

        kCVPixelFormatType_32BGRA，iOS在内部进行YUV至BGRA格式转换

     YUV420一般用于标清视频，YUV422用于高清视频，这里的限制让人感到意外。但是，在相同条件下，YUV420计算耗时和传输压力比YUV422都小。
    * kCVPixelBufferWidthKey/kCVPixelBufferHeightKey: 视频源的分辨率 width*height

     * kCVPixelBufferOpenGLCompatibilityKey : 它允许在 OpenGL 的上下文中直接绘制解码后的图像，而不是从总线和 CPU 之间复制数据。这有时候被称为零拷贝通道，因为在绘制过程中没有解码的图像被拷贝.
     */

    NSDictionary *destinationPixBufferAttrs =

    @{

      (id)kCVPixelBufferPixelFormatTypeKey: [NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange], *//iOS上 nv12(uvuv排布) 而不是nv21（vuvu排布）*

      (id)kCVPixelBufferWidthKey: [NSNumber numberWithInteger:_config.width],

      (id)kCVPixelBufferHeightKey: [NSNumber numberWithInteger:_config.height],

      (id)kCVPixelBufferOpenGLCompatibilityKey: [NSNumber numberWithBool:true]

      };

    // 解码回调设置

    /**
     VTDecompressionOutputCallbackRecord 是一个简单的结构体，它带有一个指针 (decompressionOutputCallback)，指向帧解压完成后的回调方法。你需要提供可以找到这个回调方法的实例 (decompressionOutputRefCon)。VTDecompressionOutputCallback 回调方法包括七个参数：

            参数1: 回调的引用

            参数2: 帧的引用

            参数3: 一个状态标识 (包含未定义的代码)

            参数4: 指示同步/异步解码，或者解码器是否打算丢帧的标识

            参数5: 实际图像的缓冲

            参数6: 出现的时间戳

            参数7: 出现的持续时间

     */

    VTDecompressionOutputCallbackRecord callbackRecord;
    callbackRecord.decompressionOutputCallback = videoDecoderCallBack;
    callbackRecord.decompressionOutputRefCon = (__bridge void * _Nullable)(self);

    // 创建session

    /**
     @function    VTDecompressionSessionCreate
     @abstract    创建用于解压缩视频帧的会话。
     @discussion  解压后的帧将通过调用OutputCallback发出
     @param    allocator  内存的会话。通过使用默认的kCFAllocatorDefault的分配器。
     @param    videoFormatDescription 描述源视频帧
     @param    videoDecoderSpecification 指定必须使用的特定视频解码器.NULL
     @param    destinationImageBufferAttributes 描述源像素缓冲区的要求 NULL
     @param    outputCallback 使用已解压缩的帧调用的回调
     @param    decompressionSessionOut 指向一个变量以接收新的解压会话
     */

    status = VTDecompressionSessionCreate(kCFAllocatorDefault, _videoDesc, NULL, (__bridge CFDictionaryRef _Nullable)(destinationPixBufferAttrs), &callbackRecord, &_decodeSession);
    if (status != noErr) {
        NSLog(@"Video hard DecodeSession create failed status= %d", (int)status);
        return NO;
    }

    // 设置解码会话属性
    // 实时解码
    status = VTSessionSetProperty(self.decodeSession, kVTDecompressionPropertyKey_RealTime,kCFBooleanTrue);
    NSLog(@"Vidoe hard decodeSession set property RealTime status = %d", (int)status);
    return YES;

}

三将IBP帧输入到解码器

CVPixelBufferRef 解码后/编码前的数据
CMBlockBufferRef 编码后的数据
解码函数接受的数据类型是CMSampleBufferRef，需要将frame 进行两次包装
frame->CMBlockBufferRef->CMSampleBufferRef

/// 接受帧数据解码

- (CVPixelBufferRef)decode:(uint8_t *)frame withSize:(uint32_t)frameSize {

    CVPixelBufferRef outputPixelBuffer = NULL;

    CMBlockBufferRef blockBuffer = NULL;

    CMBlockBufferFlags flag0 = 0;

    

    *// 创建blockBuffer*

    /*!

     参数1: structureAllocator kCFAllocatorDefault 默认内存分配

     参数2: memoryBlock  frame

     参数3: frame size

     参数4: blockAllocator: Pass NULL

     参数5: customBlockSource Pass NULL

     参数6: offsetToData  数据偏移

     参数7: dataLength 数据长度

     参数8: flags 功能和控制标志

     参数9: newBBufOut blockBuffer地址,不能为空

     */

    OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, frame, frameSize, kCFAllocatorNull, NULL, 0, frameSize, flag0, &blockBuffer);

    

    if (status != kCMBlockBufferNoErr) {

        NSLog(@"Video hard decode create blockBuffer error code=%d", (int)status);

        return outputPixelBuffer;

    }

    

    CMSampleBufferRef sampleBuffer = NULL;

    const size_t sampleSizeArray[] = {frameSize};

    

    *// 创建sampleBuffer*

    */**

*参数1: allocator 分配器,使用默认内存分配, kCFAllocatorDefault*

*参数2: blockBuffer.需要编码的数据blockBuffer.不能为NULL*

*参数3: formatDescription,视频输出格式*

*参数4: numSamples.CMSampleBuffer 个数.*

*参数5: numSampleTimingEntries 必须为0,1,numSamples*

*参数6: sampleTimingArray.  数组.为空*

*参数7: numSampleSizeEntries 默认为1*

*参数8: sampleSizeArray*

*参数9: sampleBuffer对象*

**/*

    status = CMSampleBufferCreateReady(kCFAllocatorDefault, blockBuffer, _videoDesc, 1, 0, NULL, 1, sampleSizeArray, &sampleBuffer);

    

    if (status != noErr || !sampleBuffer) {

        NSLog(@"Video hard decode create sampleBuffer failed status=%d", (int)status);

        CFRelease(blockBuffer);

        return outputPixelBuffer;

    }

    

    *// 解码*

    *// 向视频解码器提示使用低功耗模式是可以的*

    VTDecodeFrameFlags flag1 = kVTDecodeFrame_1xRealTimePlayback;

    *// 异步解码*

    VTDecodeInfoFlags  infoFlag = kVTDecodeInfo_Asynchronous;

    *// 解码数据*

    */**

*参数1: 解码session*

*参数2: 源数据 包含一个或多个视频帧的CMsampleBuffer*

*参数3: 解码标志*

*参数4: 解码后数据outputPixelBuffer*

*参数5: 同步/异步解码标识*

**/*

    status = VTDecompressionSessionDecodeFrame(_decodeSession, sampleBuffer, flag1, &outputPixelBuffer, &infoFlag);

    

    if (status == kVTInvalidSessionErr) {

        NSLog(@"Video hard decode  InvalidSessionErr status =%d", (int)status);

    } else if (status == kVTVideoDecoderBadDataErr) {

        NSLog(@"Video hard decode  BadData status =%d", (int)status);

    } else if (status != noErr) {

        NSLog(@"Video hard decode failed status =%d", (int)status);

    }

    CFRelease(sampleBuffer);

    CFRelease(blockBuffer);

    return outputPixelBuffer;

}

四在VideoToolBox的回调里，将解码后的数据回调

void videoDecoderCallBack(void * CM_NULLABLE decompressionOutputRefCon, void * CM_NULLABLE sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CM_NULLABLE CVImageBufferRef imageBuffer, CMTime presentationTimeStamp, CMTime presentationDuration ) {
    if (status != noErr) {
        NSLog(@"Video hard decode callback error status=%d", (int)status);
        return;
    }

    // 拿到解码后的数据sourceFrameRefCon -> CVPixelBufferRef
    CVPixelBufferRef *outputPixelBuffer = (CVPixelBufferRef *)sourceFrameRefCon;

    *outputPixelBuffer = CVPixelBufferRetain(imageBuffer);

    // 获取self
    CQVideoDecoder *decoder = (__bridge CQVideoDecoder *)(decompressionOutputRefCon);

    // 回调
    dispatch_async(decoder.callBackQueue, ^{
        if (decoder.delegate && [decoder.delegate respondsToSelector:@selector(videoDecoder:didDecodeSuccessWithPixelBuffer:)]) {
            [decoder.delegate videoDecoder:decoder didDecodeSuccessWithPixelBuffer:imageBuffer];
        }
        CVPixelBufferRelease(imageBuffer);

    });

}

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。