This article covers only how, on iOS, to record audio while simultaneously compressing/encoding it and uploading it to the server. Playback, which simply downloads, decodes, and plays, is not covered.
Recording
Recording and playback use the AudioQueue API from AudioToolbox.framework. It revolves around three pieces:
Three buffers: each buffer is a temporary store for audio data.
A buffer queue: an ordered queue that holds the audio buffers.
A callback: a custom callback function registered with the queue.
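The code below relies on a configured _format (an AudioStreamBasicDescription) and a _bufferSize. As a minimal sketch, assuming 16 kHz mono 16-bit linear PCM (these particular values are my assumption, not taken from the original project):

AudioStreamBasicDescription _format = {0};
_format.mSampleRate       = 16000;                 // assumed: 16 kHz
_format.mFormatID         = kAudioFormatLinearPCM; // uncompressed PCM
_format.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
_format.mChannelsPerFrame = 1;                     // mono
_format.mBitsPerChannel   = 16;
_format.mBytesPerFrame    = (_format.mBitsPerChannel / 8) * _format.mChannelsPerFrame;
_format.mFramesPerPacket  = 1;
_format.mBytesPerPacket   = _format.mBytesPerFrame * _format.mFramesPerPacket;
// Roughly 0.2 s of audio per AudioQueue buffer.
UInt32 _bufferSize = (UInt32)(0.2 * _format.mSampleRate) * _format.mBytesPerFrame;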
- (void)_createAudioInputQueue
{
    APMM_DEBUG(@"WAudioInputQueue _createAudioInputQueue");
    // Create an input (recording) queue; the callback fires on the main run loop.
    if (![self _checkAudioQueueSuccess:AudioQueueNewInput(&_format, MCAudioQueueInputCallback, (__bridge void *)(self), CFRunLoopGetMain(), NULL, 0, &_audioQueue)])
    {
        return;
    }
    // Observe the queue's running state and enable level metering.
    AudioQueueAddPropertyListener(_audioQueue, kAudioQueueProperty_IsRunning, WAudioInputQueuePropertyCallback, (__bridge void *)(self));
    _meterState = (AudioQueueLevelMeterState *)calloc(_format.mChannelsPerFrame, sizeof(AudioQueueLevelMeterState));
    UInt32 trueValue = true;
    AudioQueueSetProperty(_audioQueue, kAudioQueueProperty_EnableLevelMetering, &trueValue, sizeof(UInt32));
    // Allocate the reusable buffers and hand them to the queue.
    for (int i = 0; i < MCAudioQueueBufferCount; ++i)
    {
        AudioQueueBufferRef buffer;
        if (![self _checkAudioQueueSuccess:AudioQueueAllocateBuffer(_audioQueue, _bufferSize, &buffer)])
        {
            break;
        }
        if (![self _checkAudioQueueSuccess:AudioQueueEnqueueBuffer(_audioQueue, buffer, 0, NULL)])
        {
            break;
        }
    }
}
static void MCAudioQueueInputCallback(void *inClientData,
                                      AudioQueueRef inAQ,
                                      AudioQueueBufferRef inBuffer,
                                      const AudioTimeStamp *inStartTime,
                                      UInt32 inNumberPacketDescriptions,
                                      const AudioStreamPacketDescription *inPacketDescs)
{
    // Bridge back to the Objective-C instance and forward the filled buffer.
    WAudioInputQueue *audioInputQueue = (__bridge WAudioInputQueue *)inClientData;
    [audioInputQueue handleAudioQueueOutputCallBack:inAQ
                                             buffer:inBuffer
                                        inStartTime:inStartTime
                         inNumberPacketDescriptions:inNumberPacketDescriptions
                                      inPacketDescs:inPacketDescs];
}
- (void)handleAudioQueueOutputCallBack:(AudioQueueRef)audioQueue
                                buffer:(AudioQueueBufferRef)buffer
                           inStartTime:(const AudioTimeStamp *)inStartTime
            inNumberPacketDescriptions:(UInt32)inNumberPacketDescriptions
                         inPacketDescs:(const AudioStreamPacketDescription *)inPacketDescs
{
    if (_started)
    {
        // Accumulate PCM data; hand a full chunk to the delegate once _bufferSize is reached.
        [_buffer appendBytes:buffer->mAudioData length:buffer->mAudioDataByteSize];
        if ([_buffer length] >= _bufferSize)
        {
            NSRange range = NSMakeRange(0, _bufferSize);
            NSData *subData = [_buffer subdataWithRange:range];
            [_delegate inputQueue:self inputData:subData numberOfPackets:inNumberPacketDescriptions finish:NO];
            [_buffer replaceBytesInRange:range withBytes:NULL length:0];
        }
        // Re-enqueue the buffer so the queue can keep recording into it.
        [self _checkAudioQueueSuccess:AudioQueueEnqueueBuffer(_audioQueue, buffer, 0, NULL)];
    }
    else
    {
        // Recording has stopped: flush the remaining bytes without re-enqueueing the buffer.
        [_buffer appendBytes:buffer->mAudioData length:buffer->mAudioDataByteSize];
        NSRange range = NSMakeRange(0, buffer->mAudioDataByteSize);
        NSData *subData = [_buffer subdataWithRange:range];
        [_delegate inputQueue:self inputData:subData numberOfPackets:inNumberPacketDescriptions finish:NO];
        [_buffer replaceBytesInRange:range withBytes:NULL length:0];
    }
    APMM_DEBUG(@"handleAudioQueueOutputCallBack, data length:%u", (unsigned int)buffer->mAudioDataByteSize);
}
Through the callback registered with the AudioQueue, we obtain the audio buffers (raw, uncompressed PCM data).
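Each chunk is handed to a delegate. Reconstructed from the calls above (the protocol name is my guess, not necessarily the original), it looks roughly like this:

@protocol WAudioInputQueueDelegate <NSObject>
// Called with each PCM chunk; finish is presumably YES for the final chunk.
- (void)inputQueue:(WAudioInputQueue *)inputQueue
         inputData:(NSData *)data
   numberOfPackets:(UInt32)numberOfPackets
            finish:(BOOL)finish;
@end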
Compression
There are many codecs for compressing speech, such as AMR and SILK. At the same sample rate and bit rate, SILK outperforms AMR in audio quality and noise suppression, which is also why WeChat adopted it.
SILK codec parameter configuration:
/* Define decode codec specific settings should be moved to h file */
#define DECODE_MAX_BYTES_PER_FRAME 1024
#define DECODE_MAX_INPUT_FRAMES 5
#define DECODE_MAX_FRAME_LENGTH 480
#define DECODE_FRAME_LENGTH_MS 20
#define DECODE_MAX_API_FS_KHZ 48
#define DECODE_MAX_LBRR_DELAY 2
/* Define encode codec specific settings */
#define ENCODE_MAX_BYTES_PER_FRAME 250 // Equals peak bitrate of 100 kbps
#define ENCODE_MAX_INPUT_FRAMES 5
#define ENCODE_FRAME_LENGTH_MS 20
#define ENCODE_MAX_API_FS_KHZ 48
The PCM buffers obtained during recording are fed through the SILK encoder with the settings above, yielding encoded SILK data buffers. At this point we can obtain the encoded recording data in real time.
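As a rough sketch of what that encoding step looks like against the SILK SDK (the sample rate, bit rate, and complexity below are assumed values, not the article's):

#include "SKP_Silk_SDK_API.h"
#include <stdlib.h>

static void *encState;
static SKP_SILK_SDK_EncControlStruct encControl;

void silk_encoder_setup(void)
{
    SKP_int32 encSizeBytes;
    SKP_Silk_SDK_Get_Encoder_Size(&encSizeBytes);
    encState = malloc(encSizeBytes);
    SKP_Silk_SDK_InitEncoder(encState, &encControl);

    encControl.API_sampleRate        = 16000;  // assumed: 16 kHz PCM in
    encControl.maxInternalSampleRate = 16000;
    encControl.packetSize            = (ENCODE_FRAME_LENGTH_MS * 16000) / 1000; // samples per 20 ms frame
    encControl.bitRate               = 16000;  // assumed: 16 kbps
    encControl.complexity            = 2;
    encControl.useInBandFEC          = 0;
    encControl.useDTX                = 0;
    encControl.packetLossPercentage  = 0;
}

// Encode nSamples of 16-bit PCM into outBuf (>= ENCODE_MAX_BYTES_PER_FRAME bytes);
// returns the number of encoded bytes written.
SKP_int16 silk_encode_frame(const SKP_int16 *pcm, SKP_int nSamples, SKP_uint8 *outBuf)
{
    SKP_int16 nBytesOut = ENCODE_MAX_BYTES_PER_FRAME; // in: capacity, out: bytes written
    SKP_Silk_SDK_Encode(encState, &encControl, pcm, nSamples, outBuf, &nBytesOut);
    return nBytesOut;
}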
Upload
If voice were uploaded like any other file, we would have to wait for the finished recording file before uploading. For a long recording, say 60 seconds, that means the connection to the server is only established after those 60 seconds, and on cellular or weak networks the resulting delay is significant. Instead, we can open the connection the moment the user taps to start recording, write each encoded chunk as the AudioQueue delivers it, and by the time the user stops recording the upload is essentially complete as well.
For this we can use an HTTP/1.1 POST with multipart/form-data.
First, ${bound} is a placeholder for the separator we define. It can be anything, but to avoid colliding with the actual payload it should be reasonably complex.
The Content-Type header then declares that the body is encoded as multipart/form-data and names the boundary used in this request. The message body is divided into one similarly structured part per field. Each part begins with --boundary, followed by headers describing the content, then a blank line, then the field's actual content (text or binary). If a file is being transferred, the part also carries the file name and content type. The message body ends with --boundary--.
POST http://www.example.com HTTP/1.1
Content-Type: multipart/form-data; boundary=${bound}

--${bound}
Content-Disposition: form-data; name="text"

title
--${bound}
Content-Disposition: form-data; name="file"; filename="chrome.png"
Content-Type: image/png

PNG ... content of chrome.png ...
--${bound}--
On iOS there are plenty of excellent open-source networking libraries; here I use AFNetworking, which already supports the multipart/form-data protocol. With a small modification we can hold the upload stream open and upload through the multipart form, recording, compressing, and uploading the voice data all at once, until recording ends and no audio data remains.
- (NSMutableURLRequest *)multipartFormStreamRequestWithMethod:(NSString *)method
                                                    URLString:(NSString *)URLString
                                                   parameters:(NSDictionary *)parameters
                                    constructingBodyWithBlock:(void (^)(id <AFMultipartFormData> formData))block
                                                        error:(NSError *__autoreleasing *)error
{
    NSParameterAssert(method);
    NSParameterAssert(![method isEqualToString:@"GET"] && ![method isEqualToString:@"HEAD"]);
    NSMutableURLRequest *mutableRequest = [self requestWithMethod:method URLString:URLString parameters:nil error:error];
    __block AFStreamingMultipartFormData *formData = [[AFStreamingMultipartFormData alloc] initWithURLRequest:mutableRequest stringEncoding:NSUTF8StringEncoding];
    // Encode the ordinary parameters as regular form-data parts.
    if (parameters) {
        for (AFQueryStringPair *pair in AFQueryStringPairsFromDictionary(parameters)) {
            NSData *data = nil;
            if ([pair.value isKindOfClass:[NSData class]]) {
                data = pair.value;
            } else if ([pair.value isEqual:[NSNull null]]) {
                data = [NSData data];
            } else {
                data = [[pair.value description] dataUsingEncoding:self.stringEncoding];
            }
            if (data) {
                [formData appendPartWithFormData:data name:[pair.field description]];
            }
        }
    }
    if (block) {
        block(formData);
    }
    // The key difference from AFNetworking's stock implementation: finalize the
    // request without a Content-Length, since the stream's total length is
    // unknown while recording is still in progress.
    return [formData requestByFinalizingMultipartFormDataWithOutLength];
}
- (NSMutableURLRequest *)requestByFinalizingMultipartFormDataWithOutLength {
    if ([self.bodyStream isEmpty]) {
        return self.request;
    }
    // Reset the initial and final boundaries to ensure a well-formed body.
    [self.bodyStream setInitialAndFinalBoundaries];
    [self.request setHTTPBodyStream:self.bodyStream];
    [self.request setValue:[NSString stringWithFormat:@"multipart/form-data; boundary=%@", self.boundary] forHTTPHeaderField:@"Content-Type"];
    // No Content-Length is set; Expect: 100-Continue lets the server acknowledge
    // the request before we start streaming the body.
    [self.request setValue:@"100-Continue" forHTTPHeaderField:@"Expect"];
    return self.request;
}
- (NSInteger)read:(uint8_t *)buffer
        maxLength:(NSUInteger)length
{
    if ([self streamStatus] == NSStreamStatusClosed) {
        return 0;
    }
    NSInteger totalNumberOfBytesRead = 0;
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wgnu"
    while ((NSUInteger)totalNumberOfBytesRead < MIN(length, self.numberOfBytesInPacket)) {
        if (!self.currentHTTPBodyPart || ![self.currentHTTPBodyPart hasBytesAvailable]) {
            // Advance to the next body part; stop once all parts are drained.
            if (!(self.currentHTTPBodyPart = [self.HTTPBodyPartEnumerator nextObject])) {
                break;
            }
        } else {
            NSUInteger maxLength = length - (NSUInteger)totalNumberOfBytesRead;
            NSInteger numberOfBytesRead = [self.currentHTTPBodyPart read:&buffer[totalNumberOfBytesRead] maxLength:maxLength];
            if (numberOfBytesRead == -1) {
                self.streamError = self.currentHTTPBodyPart.inputStream.streamError;
                break;
            } else if (numberOfBytesRead == 0) {
                // No encoded audio is available yet: hold the stream open and
                // poll again rather than ending the upload.
                [NSThread sleepForTimeInterval:1];
            } else {
                totalNumberOfBytesRead += numberOfBytesRead;
                if (self.delay > 0.0f) {
                    [NSThread sleepForTimeInterval:self.delay];
                }
                // Return each chunk as soon as it is read so the data goes out
                // on the wire while recording continues.
                return totalNumberOfBytesRead;
            }
        }
    }
#pragma clang diagnostic pop
    return totalNumberOfBytesRead;
}
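Tying it together, a usage sketch under some assumptions: serializer is the AFHTTPRequestSerializer (or a category on it) carrying the method above, kUploadURL is a placeholder endpoint, and self.silkInputStream is an NSInputStream fed by the SILK encoder (for example one half of a CFStreamCreateBoundPair):

NSError *error = nil;
NSMutableURLRequest *request =
    [serializer multipartFormStreamRequestWithMethod:@"POST"
                                           URLString:kUploadURL
                                          parameters:@{@"uid" : @"12345"}
                           constructingBodyWithBlock:^(id<AFMultipartFormData> formData) {
        // The total length is unknown while recording is in progress; the
        // finalized request omits Content-Length, so a best-effort value is fine.
        [formData appendPartWithInputStream:self.silkInputStream
                                       name:@"voice"
                                   fileName:@"voice.silk"
                                     length:0
                                   mimeType:@"application/octet-stream"];
    } error:&error];

AFHTTPRequestOperation *operation = [[AFHTTPRequestOperation alloc] initWithRequest:request];
[operation setCompletionBlockWithSuccess:^(AFHTTPRequestOperation *op, id responseObject) {
    NSLog(@"voice upload finished");
} failure:^(AFHTTPRequestOperation *op, NSError *err) {
    NSLog(@"voice upload failed: %@", err);
}];
[operation start];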