Open-Source Speech Recognition Algorithms and Speech Recognition Development

Reposted

笑傲江湖求败 2024-03-30 16:15:15

Tags: open-source speech recognition algorithms, Android, speech recognition, speech synthesis, iFlytek. Categories: NLP, Artificial Intelligence

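It has been a while since I updated this blog, and another month is almost over. The company project has stalled again, so I spend my days at the client's office with nothing to do but read other people's code. The worst part is that the code has no comments at all, and I now have serious contempt for code monkeys who don't write them. The project doesn't even run. I have never worked on a financial project before, and this one is full of message formats with no requirements document to go with them. I'm speechless. On top of that, the company has more or less stopped managing us. A company that doesn't value R&D is a trap...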

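So today I'm just playing with iFlytek's speech features. The code is very simple, so I integrated it into an express-tracking app and didn't bother changing much else.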

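First, a word about iFlytek. Speech technology enables voice interaction between people and machines, making communication with a machine as simple as talking to another person. It rests on two key technologies: speech synthesis and speech recognition. Speech synthesis lets a machine talk; speech recognition lets a machine understand what a person says. Speech technology also covers speech coding, voice conversion, spoken-language assessment, and speech denoising and enhancement, so there is plenty of room for applications.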

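You have to apply for the iFlytek speech SDK at http://www.voicecloud.cn/. Register an iFlytek developer account, then apply for an appid; the application asks for developer information and information about your app. Keep the appid, because development can't proceed without it: the whole tedious process exists to get the appid and download the SDK. The appid is tied to the services you enable. The basic services are usually enough; some others are paid value-added services. Each project gets its own appid.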

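iFlytek SDK versions differ, and so do the methods you call; the contents of Msc.jar in particular keep changing. When you start, import the bundled sample first to get a feel for how powerful it is. Then copy the libs folder into your own project and start writing code. I won't go into more detail than that.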

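This post follows the SDK and also borrows from other people's code. I did it purely for fun: it turns speech into text and text back into speech, which covers the basic needs. The comments in the code are detailed, so with a few tweaks you can use it directly in your own project.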

[Figure: iFlytek SDK features used in this post]

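The figure shows the features used here. The methods differ somewhat depending on the Msc.jar version, though, and my copy doesn't seem to have the SpeechUtility class at all. This post mainly uses RecognizerDialog, SpeechRecognizer, RecognizerDialogListener, SpeechListener, SynthesizerListener, and SpeechSynthesizer. For now it can only convert speech to text and then synthesize that text back into speech locally; there is still a lot to improve.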

A quick look at the result:

[Screenshot: speech recognition]

[Screenshot: speech synthesis]

First, the layout:

activity_main.xml

<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <Button
        android:id="@+id/button1"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignParentBottom="true"
        android:layout_alignParentRight="true"
        android:background="@drawable/icon_microphone" />

    <EditText
        android:id="@+id/content"
        android:layout_width="match_parent"
        android:layout_height="30dp"
        android:layout_alignParentBottom="true"
        android:layout_alignParentLeft="true"
        android:layout_marginRight="50dp" />
</RelativeLayout>

MainActivity.java

package com.example.voicetoword;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.view.Menu;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.EditText;

import com.example.voicetoword.R;
// VoiceToWord is declared in package zy.voice in the listing further down.
import zy.voice.VoiceToWord;

public class MainActivity extends Activity implements OnClickListener{
    Button but = null;
    private EditText content;
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        but = (Button) findViewById(R.id.button1);
        content=(EditText) findViewById(R.id.content);
        but.setOnClickListener(this);
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }


    @Override
    public void onClick(View v) {
        switch (v.getId()) {
            // Dictation button
            case R.id.button1:
                VoiceToWord voice = new VoiceToWord(MainActivity.this, "54ae8c54"); // the appid you applied for
                voice.GetWordFromVoice();
                break;
        }
    }
    @Override
    protected void onResume() {
        super.onResume();
        //content.setText(((MyApplicaton)getApplication()).getText());
    }
}

Next is the class that parses the results:

package zy.voice;

import org.json.JSONArray;
import org.json.JSONObject;
import org.json.JSONTokener;

import android.text.TextUtils;

//import com.iflytek.speech.ErrorCode;
//import com.iflytek.speech.SpeechError;
/**
 * Parses the JSON results returned by the cloud service.
 * @author iFlytek
 * @since 20131211
 */
public class JsonParser {

    /**
     * Parses the JSON returned for a dictation (IAT) request.
     *
     * @param json raw JSON result string
     * @return the recognized text, or an empty string if nothing could be parsed
     */
    public static String parseIatResult(String json) {
        if (TextUtils.isEmpty(json)) {
            return "";
        }

        StringBuilder ret = new StringBuilder();
        try {
            JSONTokener tokener = new JSONTokener(json);
            JSONObject joResult = new JSONObject(tokener);

            JSONArray words = joResult.getJSONArray("ws");
            for (int i = 0; i < words.length(); i++) {
                // Dictation result word; the first candidate is used by default
                JSONArray items = words.getJSONObject(i).getJSONArray("cw");
                JSONObject obj = items.getJSONObject(0);
                ret.append(obj.getString("w")); // each entry is a single recognized character
                // If you need multiple candidates, parse the other entries of the array:
                // for (int j = 0; j < items.length(); j++) {
                //     JSONObject obj = items.getJSONObject(j);
                //     ret.append(obj.getString("w"));
                // }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return ret.toString();
    }

    /**
     * Parses the JSON returned for a grammar recognition request.
     *
     * @param json raw JSON result string
     * @return one line per candidate with its confidence, or a "no match" message
     */
    public static String parseGrammarResult(String json) {
        StringBuilder ret = new StringBuilder();
        try {
            JSONTokener tokener = new JSONTokener(json);
            JSONObject joResult = new JSONObject(tokener);

            JSONArray words = joResult.getJSONArray("ws"); // recognized units are words
            for (int i = 0; i < words.length(); i++) {
                JSONArray items = words.getJSONObject(i).getJSONArray("cw"); // Chinese word segmentation
                for (int j = 0; j < items.length(); j++) {
                    JSONObject obj = items.getJSONObject(j);
                    if (obj.getString("w").contains("nomatch")) { // nothing matched
                        ret.append("No match.");
                        return ret.toString();
                    }
                    ret.append("[Result] " + obj.getString("w"));
                    ret.append("[Confidence] " + obj.getInt("sc")); // score
                    ret.append("\n");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            ret.append("No match.");
        }
        return ret.toString();
    }

    /**
     * Parses the JSON returned for a semantic understanding request.
     *
     * @param json raw JSON result string
     * @return a readable summary of the response, or a "no match" message
     */
    public static String parseUnderstandResult(String json) {
        StringBuilder ret = new StringBuilder();
        try {
            JSONTokener tokener = new JSONTokener(json);
            JSONObject joResult = new JSONObject(tokener);

            ret.append("[Response code] " + joResult.getString("rc") + "\n");
            ret.append("[Transcription] " + joResult.getString("text") + "\n");
            ret.append("[Service name] " + joResult.getString("service") + "\n");
            ret.append("[Operation name] " + joResult.getString("operation") + "\n");
            ret.append("[Full result] " + json);
        } catch (Exception e) {
            e.printStackTrace();
            ret.append("No match.");
        }
        return ret.toString();
    }
}
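
To make the shape of the dictation result easier to see, here is a minimal sketch of what parseIatResult does: take the first "w" value from each "cw" array under "ws" and concatenate them. It is not the SDK's parser. It assumes plain Java without org.json, so it uses a regular expression instead of a real JSON parser, and the class name IatResultSketch and the sample JSON in the usage note are made up for illustration. It does not handle escaped quotes inside a word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the iFlytek SDK.
final class IatResultSketch {
    // Matches the "w" value of the first object inside each "cw" array.
    private static final Pattern FIRST_CANDIDATE = Pattern.compile(
            "\"cw\"\\s*:\\s*\\[\\s*\\{[^}]*?\"w\"\\s*:\\s*\"([^\"]*)\"");

    // Concatenates the first candidate of every word, like parseIatResult.
    static String firstCandidates(String json) {
        if (json == null || json.isEmpty()) {
            return "";
        }
        StringBuilder text = new StringBuilder();
        Matcher matcher = FIRST_CANDIDATE.matcher(json);
        while (matcher.find()) {
            text.append(matcher.group(1));
        }
        return text.toString();
    }
}
```

For an assumed input such as {"ws":[{"cw":[{"sc":0,"w":"track"},{"sc":0,"w":"trek"}]},{"cw":[{"sc":0,"w":" parcel"}]}]}, the second candidate "trek" is skipped and the words are joined in order. In a real app, stick with org.json as in the class above.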

This is the listener for the pop-up speech dialog:

package zy.voice;

import android.content.Context;
import android.text.TextUtils;
import android.widget.Toast;

import com.iflytek.cloud.speech.RecognizerResult;
import com.iflytek.cloud.speech.SpeechConstant;
import com.iflytek.cloud.speech.SpeechError;
import com.iflytek.cloud.speech.SpeechSynthesizer;
import com.iflytek.cloud.speech.SynthesizerListener;
import com.iflytek.cloud.ui.RecognizerDialogListener;

/**
 * Recognition callback listener.
 */
public class MyRecognizerDialogLister implements RecognizerDialogListener, SynthesizerListener {
    private Context context;

    // Local synthesis object
    private SpeechSynthesizer speechSynthesizer;

    public MyRecognizerDialogLister(Context context) {
        this.context = context;
        setParam();
    }

    // Result callbacks: onResult runs on success, onError runs on failure.
    @Override
    public void onResult(RecognizerResult results, boolean isLast) {
        String text = JsonParser.parseIatResult(results.getResultString());
        Toast.makeText(context, text, Toast.LENGTH_LONG).show();
        //app.setText(text);

        if (!TextUtils.isEmpty(text)) { // the returned text is not empty
            // Start synthesizing the text into speech
            speechSynthesizer.startSpeaking(text, this);
        }

    }

    /**
     * Recognition error callback.
     */
    @Override
    public void onError(SpeechError error) { // for the error codes see http://open.voicecloud.cn/index.php/default/doccenter/doccenterInner?itemTitle=ZmFx&anchor=Y29udGl0bGU2Ng==
        int errorCoder = error.getErrorCode();
        switch (errorCoder) {
            case 10118: // the user did not speak, so there is no data
                System.out.println("user didn't say anything");
                break;
            case 10200: // general network error
                System.out.println("can't connect to internet");
                break;
            default:
                break;
        }
    }

    public void setParam() {
        // Create the SpeechSynthesizer object
        speechSynthesizer = SpeechSynthesizer.createSynthesizer(context);
        // Default voice: Xiaoyan (young female). Xiaomei is a newer-engine voice that sounds a bit better.
        speechSynthesizer.setParameter(SpeechConstant.VOICE_NAME, "xiaoyan");
        speechSynthesizer.setParameter(SpeechConstant.SPEED, "50");  // speaking rate
        speechSynthesizer.setParameter(SpeechConstant.VOLUME, "50"); // volume, range 0-100
        speechSynthesizer.setParameter(SpeechConstant.PITCH, "50");  // pitch
    }
    /**
     * percent   buffering progress, 0-100
     * beginPos  start position of the buffered audio in the text
     * endPos    end position of the buffered audio in the text
     * info      additional information
     */
    @Override
    public void onBufferProgress(int percent, int beginPos, int endPos, String info) {

    }

    // Called when the session ends; err is null when there was no error
    @Override
    public void onCompleted(SpeechError err) {

    }

    // Playback started
    @Override
    public void onSpeakBegin() {

    }

    // Playback paused
    @Override
    public void onSpeakPaused() {

    }

    /**
     * percent   playback progress, 0-100
     * beginPos  start position of the playing audio in the text
     * endPos    end position of the playing audio in the text
     */
    @Override
    public void onSpeakProgress(int percent, int beginPos, int endPos) {

    }

    // Playback resumed
    @Override
    public void onSpeakResumed() {

    }
}
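
The onResult callback above takes an isLast flag, which suggests a sentence may arrive in several pieces, yet the listener toasts and speaks every piece as it comes in. One possible refinement is to buffer the pieces and act only on the last one. The sketch below shows that idea in plain Java; DictationAccumulator and onPartial are hypothetical names, not part of the SDK, and whether your SDK version really delivers partial results is an assumption to check against its documentation.

```java
// Hypothetical helper, not part of the iFlytek SDK.
final class DictationAccumulator {
    private final StringBuilder buffer = new StringBuilder();

    /**
     * Adds one parsed piece of text.
     * Returns the whole sentence when isLast is true, otherwise null.
     */
    String onPartial(String partialText, boolean isLast) {
        if (partialText != null) {
            buffer.append(partialText);
        }
        if (!isLast) {
            return null; // keep waiting for the remaining pieces
        }
        String sentence = buffer.toString();
        buffer.setLength(0); // reset for the next utterance
        return sentence;
    }
}
```

Inside onResult you would pass JsonParser.parseIatResult(results.getResultString()) and isLast to onPartial, then call Toast and startSpeaking only when the return value is not null.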

Finally, the class that turns your speech into text:

package zy.voice;

import android.app.Activity;
import android.content.Context;
import android.content.SharedPreferences;
import android.os.Bundle;
import android.os.Environment;
import android.text.TextUtils;
import android.widget.Toast;

import com.iflytek.cloud.speech.SpeechConstant;
import com.iflytek.cloud.speech.SpeechError;
import com.iflytek.cloud.speech.SpeechListener;
import com.iflytek.cloud.speech.SpeechRecognizer;
import com.iflytek.cloud.speech.SpeechUser;
import com.iflytek.cloud.ui.RecognizerDialog;
import com.iflytek.cloud.ui.RecognizerDialogListener;
import com.iflytek.sunflower.FlowerCollector;

public class VoiceToWord extends Activity {
    private Context context;
    private Toast mToast;
    // Recognition dialog
    private RecognizerDialog iatDialog;
    // Recognizer object
    private SpeechRecognizer iatRecognizer;
    // Cache that keeps the current engine parameters for the next app launch.
    private SharedPreferences mSharedPreferences;
    private RecognizerDialogListener recognizerDialogListener = null;
    public VoiceToWord(Context context, String APP_ID) {
        this.context = context;
        // Initialize the cache object.
        mSharedPreferences = context.getSharedPreferences(
                context.getPackageName(), MODE_PRIVATE);
        // User login
        SpeechUser.getUser().login(context, null, null, "appid=" + APP_ID,
                listener);
        mToast = Toast.makeText(context, "", Toast.LENGTH_LONG);
        // Initialize the dictation dialog. If you only use dictation with the
        // built-in UI, there is no need to create a SpeechRecognizer.
        iatDialog = new RecognizerDialog(context);
        iatDialog.setCanceledOnTouchOutside(false);
    }

    public VoiceToWord(Context context, String APP_ID,
                       RecognizerDialogListener recognizerDialogListener) {
        this.context = context;
        // User login
        SpeechUser.getUser().login(context, null, null, "appid=" + APP_ID,
                listener);
        mToast = Toast.makeText(context, "", Toast.LENGTH_LONG);
        // Initialize the dictation dialog. If you only use dictation with the
        // built-in UI, there is no need to create a SpeechRecognizer.
        iatDialog = new RecognizerDialog(context);
        // The dialog cannot be dismissed by touching outside it
        iatDialog.setCanceledOnTouchOutside(false);
        // Initialize the cache object.
        mSharedPreferences = context.getSharedPreferences(
                context.getPackageName(), MODE_PRIVATE);
        this.recognizerDialogListener = recognizerDialogListener;
    }
    public void GetWordFromVoice() {
        boolean isShowDialog = mSharedPreferences.getBoolean("iat_show", true);
        if (isShowDialog) {
            // Show the speech dictation dialog.
            showIatDialog();
        } else {
            if (null == iatRecognizer) {
                // Use the Context passed to the constructor: this object is created
                // with "new", so "this" is not an Activity started by the system.
                iatRecognizer = SpeechRecognizer.createRecognizer(context);
                // Set the result format
//    iatRecognizer.setParameter(SpeechConstant.RESULT_TYPE, "json");
//
//    String lag = mSharedPreferences.getString(
//      "iat_language_preference", "mandarin");
//    if (lag.equals("en_us")) {
//     // Set the language
//     iatRecognizer
//       .setParameter(SpeechConstant.LANGUAGE, "en_us");
//    } else {
//     // Set the language
//     iatRecognizer
//       .setParameter(SpeechConstant.LANGUAGE, "zh_cn");
//     // Set the accent (language region)
//     iatRecognizer.setParameter(SpeechConstant.ACCENT, lag);
//    }
//    // Set the leading-silence endpoint (VAD begin of speech)
//    iatRecognizer.setParameter(SpeechConstant.VAD_BOS,
//      mSharedPreferences.getString("iat_vadbos_preference",
//        "4000"));
//    // Set the trailing-silence endpoint (VAD end of speech)
//    iatRecognizer.setParameter(SpeechConstant.VAD_EOS,
//      mSharedPreferences.getString("iat_vadeos_preference",
//        "1000"));
//    // Set punctuation
//    iatRecognizer.setParameter(SpeechConstant.ASR_PTT,
//      mSharedPreferences
//        .getString("iat_punc_preference", "1"));
//    // Set the path where the audio is saved
//    iatRecognizer.setParameter(SpeechConstant.ASR_AUDIO_PATH,
//      Environment.getExternalStorageDirectory()
//        + "/iflytek/wavaudio.pcm"); // requires the SD card permission in the manifest
            }
            if (iatRecognizer.isListening()) {
                iatRecognizer.stopListening();
                // ((Button)
                // findViewById(android.R.id.button1)).setEnabled(false);
            } else {
                // Left empty in the original: nothing starts listening on this path.
            }
        }
    }

    /**
     * Shows the dictation dialog.
     */
    public void showIatDialog() {
        if (null == iatDialog) {
            // Initialize the dictation dialog with the Context passed to the constructor
            iatDialog = new RecognizerDialog(context);
        }
        // Read the engine parameter
        String engine = mSharedPreferences.getString("iat_engine", "iat");
        // Clear Grammar_ID so that a previous grammar recognition does not interfere with dictation
        iatDialog.setParameter(SpeechConstant.CLOUD_GRAMMAR, null);
        // Set the engine used by the dictation dialog
        iatDialog.setParameter(SpeechConstant.DOMAIN, engine);
        // Set the sample rate; 8K and 16K are supported
        String rate = mSharedPreferences.getString("sf", "sf");
        if (rate.equals("rate8k")) {
            iatDialog.setParameter(SpeechConstant.SAMPLE_RATE, "8000");
        } else {
            iatDialog.setParameter(SpeechConstant.SAMPLE_RATE, "16000");
        }
        if (recognizerDialogListener == null) {
            getRecognizerDialogListener();
        }
        // Show the dictation dialog
        iatDialog.setListener(recognizerDialogListener);
        iatDialog.show();
    }

    private void getRecognizerDialogListener() {
        // Recognition callback listener
        recognizerDialogListener = new MyRecognizerDialogLister(context);
    }
    /**
     * User login callback listener.
     */
    private SpeechListener listener = new SpeechListener() {
        @Override
        public void onData(byte[] arg0) {
        }

        @Override
        public void onCompleted(SpeechError error) {
            // error is null when the login finished without problems
            if (error == null) {
                System.out.println("user login success");
            }
        }

        @Override
        public void onEvent(int arg0, Bundle arg1) {
        }
    };
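    @Override
    protected void onDestroy() {
        // Release the connection on exit; the recognizer only exists on the no-dialog path
        if (iatRecognizer != null) {
            iatRecognizer.cancel();
            iatRecognizer.destroy();
        }
        super.onDestroy();
    }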
    @Override
    protected void onResume() {
        // Mobile usage statistics
        FlowerCollector.onResume(this);
        FlowerCollector.onPageStart("VoiceToWord");
        super.onResume();
    }

    @Override
    protected void onPause() {
        // Mobile usage statistics
        FlowerCollector.onPageEnd("VoiceToWord");
        FlowerCollector.onPause(this);
        super.onPause();
    }
}
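
The commented-out block in GetWordFromVoice configures the recognizer for the no-dialog path from saved preferences. As a rough sketch of the same defaults gathered in one place, the plain-Java helper below builds a parameter map that could then be applied with setParameter in a loop. RecognizerParamsSketch and its string keys are hypothetical stand-ins for the SpeechConstant fields named in the comments (RESULT_TYPE, LANGUAGE, ACCENT, VAD_BOS, VAD_EOS, ASR_PTT); the actual constant values depend on your SDK version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the iFlytek SDK.
final class RecognizerParamsSketch {
    // languagePref mirrors the "iat_language_preference" value, e.g. "mandarin" or "en_us".
    static Map<String, String> build(String languagePref) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("result_type", "json");       // stand-in for SpeechConstant.RESULT_TYPE
        if ("en_us".equals(languagePref)) {
            params.put("language", "en_us");     // stand-in for SpeechConstant.LANGUAGE
        } else {
            params.put("language", "zh_cn");
            params.put("accent", languagePref);  // stand-in for SpeechConstant.ACCENT
        }
        params.put("vad_bos", "4000");           // leading-silence default from the listing
        params.put("vad_eos", "1000");           // trailing-silence default from the listing
        params.put("asr_ptt", "1");              // punctuation default from the listing
        return params;
    }
}
```

Keeping the defaults in one function makes the English and Chinese branches easy to compare, and it can be checked without a device.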

Finally, don't forget to add the permissions.

[Image: required permissions]
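
The original screenshot of the permissions is not available, so the fragment below is only a guess at a reasonable minimum for AndroidManifest.xml, not the author's actual list. It assumes the app needs network access for the cloud service, microphone access for recording, and external storage only if you enable the audio-save path mentioned in the code comments. Check your SDK version's documentation for the complete list.

```xml
<!-- Assumed minimum; the SDK documentation may require more. -->
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<!-- Only needed if recognized audio is saved to the SD card. -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
```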