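Recently I joined the graduate electronic design contest with several senior graduate students from my school, and took the chance to get back to a voice-controlled car I had never finished. As it happened, I was responsible for the voice control part, and after a few days of tinkering it is mostly working. Here is a summary of what I did, as some practical notes for anyone tinkering along.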
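I. Configuring the Raspberry Pi
This time we used the latest Raspberry Pi 3, with the Raspbian image downloaded straight from the official site. To me it is also the system that feels most like a Raspberry Pi. Most of the setup is done through sudo raspi-config.
It is best to set items 1 and 2; item 3 is a matter of preference. Under item 4 I reset everything: time zone, language and fonts. For fonts it is best to install one (sudo apt-get install ttf-wqy-zenhei). Then, under item 8, it is best to enable the serial port and pin-related functions.
Configuring WiFi: the Raspberry Pi 3 has built-in WiFi, so if it boots normally you can connect directly; if no network shows up, run iwlist scan to scan for one.
Updating packages: sudo apt-get update, then sudo apt-get upgrade.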
II. Voice chat and voice-controlled pins
Here I use wiringPi (a C library).
Installing wiringPi:
sudo apt-get install git-core
git clone git://git.drogon.net/wiringPi
cd wiringPi
./build
Write a script to drive a pin high or low:
First create a directory:
cd ~
mkdir scripts
cd scripts
Then create a script named light with the following content:
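#!/bin/bash
# Usage: ./light on|off
# Note: inside [ ], ">" is a redirection, so the argument count is tested with -ge.
if [ $# -ge 1 ]
then
    # Set wiringPi pin 4 to output mode
    /usr/local/bin/gpio mode 4 out
    if [[ "$1" = "on" ]]
    then
        # Drive the pin high
        /usr/local/bin/gpio write 4 1
    fi

    if [[ "$1" = "off" ]]
    then
        # Drive the pin low
        /usr/local/bin/gpio write 4 0
    fi
fi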
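The 4 here is wiringPi pin 4, which corresponds to physical pin 16 (GPIO_GEN4) on the Raspberry Pi header; see the pinout documentation for details.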
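The numbering is easy to mix up because wiringPi uses its own scheme. As a rough reference, here is a small Python 3 lookup for wiringPi pins 0 to 7 with their BCM GPIO numbers and physical header positions, as I understand them for the 40-pin header. The names WIRINGPI_PINS and describe_pin are made up for this illustration, and running gpio readall on the board itself is the more reliable check.

```python
# wiringPi pin -> (BCM GPIO number, physical header pin).
# Illustrative table for wiringPi pins 0-7 only; verify with `gpio readall`.
WIRINGPI_PINS = {
    0: (17, 11),
    1: (18, 12),
    2: (27, 13),
    3: (22, 15),
    4: (23, 16),  # the pin used by the light script
    5: (24, 18),
    6: (25, 22),
    7: (4, 7),
}


def describe_pin(wpi_pin):
    """Return a human-readable description of a wiringPi pin."""
    bcm, physical = WIRINGPI_PINS[wpi_pin]
    return "wiringPi %d = BCM GPIO %d = physical pin %d" % (wpi_pin, bcm, physical)


if __name__ == "__main__":
    print(describe_pin(4))
```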
Make the script executable:
chmod u+x light
Now you can check whether the pin can be controlled from the command line:
./light on
./light off
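If you would rather skip the shell script and drive the pin from Python directly, the same two gpio calls can be issued with subprocess. This is only a sketch in Python 3, assuming the gpio tool is installed at /usr/local/bin/gpio as in the light script; gpio_commands and set_light are names I made up here, and the runner parameter exists only so the logic can be tried without real hardware.

```python
import subprocess

GPIO_BIN = "/usr/local/bin/gpio"  # same path the light script uses


def gpio_commands(state, pin=4):
    """Build the gpio command lines for switching the pin on or off."""
    levels = {"on": "1", "off": "0"}
    if state not in levels:
        raise ValueError("state must be 'on' or 'off', got %r" % (state,))
    return [
        [GPIO_BIN, "mode", str(pin), "out"],            # configure as output
        [GPIO_BIN, "write", str(pin), levels[state]],   # drive high or low
    ]


def set_light(state, pin=4, runner=subprocess.check_call):
    """Run the gpio commands; `runner` can be replaced when testing."""
    for argv in gpio_commands(state, pin):
        runner(argv)
```

On the Pi, set_light("on") and set_light("off") would then do what ./light on and ./light off do.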
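Next comes voice control. I tried nearly every speech recognition SDK I could use: iFlytek, Baidu Voice, Google... Google turned out to be the best, but it is a real headache for those of us behind the firewall.
In the end I went with Baidu's API, because I like Python and happened to come across an open-source Python program for Baidu.
Here is the code: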
# -*- coding: utf-8 -*-
# Python 2 script: record speech, recognize it with Baidu, get a chat reply
# from Tuling, then speak the reply with Baidu text-to-speech.

import numpy as np
from datetime import datetime
import wave
import time
import urllib, urllib2, pycurl
import base64
import json
import os
import sys
reload(sys)
sys.setdefaultencoding("utf-8")

SAMPLING_RATE = 8000  # Hz; matches the -r 8000 passed to arecord below
save_count = 0
save_buffer = []
t = 0
sum = 0
time_flag = 0
flag_num = 0
# Full path, so recording and recognition always use the same file
filename = '/home/pi/Desktop/2016-6-25/asr.wav'
duihua = '1'

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def get_token():
    # Exchange the API key and secret key for an access token
    apiKey = "Ll0c53MSac6GBOtpg22ZSGAU"
    secretKey = "44c8af396038a24e34936227d4a19dc2"
    auth_url = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=" + apiKey + "&client_secret=" + secretKey
    res = urllib2.urlopen(auth_url)
    json_data = res.read()
    return json.loads(json_data)['access_token']

def dump_res(buf):
    # pycurl write callback: parse the reply and keep the first result
    global duihua
    print "Raw response (string):"
    print (buf)
    a = json.loads(buf)  # parse as JSON instead of eval()-ing server data
    print type(a)
    if a['err_msg'] == 'success.':
        # a['result'][0] is the recognized sentence
        duihua = a['result'][0].encode("utf-8")
        print duihua

def use_cloud(token):
    # Upload the recorded audio to Baidu speech recognition
    fp = wave.open(filename, 'rb')
    nf = fp.getnframes()
    f_len = nf * 2  # 16-bit mono: 2 bytes per frame
    audio_data = fp.readframes(nf)
    fp.close()
    cuid = "7519663"  # product id
    srv_url = 'http://vop.baidu.com/server_api' + '?cuid=' + cuid + '&token=' + token
    http_header = [
        'Content-Type: audio/pcm; rate=8000',
        'Content-Length: %d' % f_len
    ]

    c = pycurl.Curl()
    c.setopt(pycurl.URL, str(srv_url))  # curl doesn't support unicode
    #c.setopt(c.RETURNTRANSFER, 1)
    c.setopt(c.HTTPHEADER, http_header)  # must be list, not dict
    c.setopt(c.POST, 1)
    c.setopt(c.CONNECTTIMEOUT, 30)
    c.setopt(c.TIMEOUT, 30)
    c.setopt(c.WRITEFUNCTION, dump_res)
    c.setopt(c.POSTFIELDS, audio_data)
    c.setopt(c.POSTFIELDSIZE, f_len)
    c.perform()  # pycurl.perform() has no return val

# Save the data in `data` to a WAV file named `filename`
def save_wave_file(filename, data):
    wf = wave.open(filename, 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(SAMPLING_RATE)
    wf.writeframes("".join(data))
    wf.close()

token = get_token()
key = '05ba411481c8cfa61b91124ef7389767'
api = 'http://www.tuling123.com/openapi/api?key=' + key + '&info='

while True:
    # Record 5 seconds of 16-bit, 8 kHz audio from the USB sound card
    os.system('arecord -D "plughw:1,0" -f S16_LE -d 5 -r 8000 ' + filename)
    use_cloud(token)
    print duihua
    info = duihua
    duihua = ""
    request = api + urllib.quote(info)  # URL-encode the recognized text
    response = getHtml(request)
    dic_json = json.loads(response)

    a = dic_json['text']
    print type(a)
    unicodestring = a

    # Convert the Unicode string to a plain Python byte string with encode()
    utf8string = unicodestring.encode("utf-8")

    print type(utf8string)
    print str(a)
    url = "http://tsn.baidu.com/text2audio?tex=" + urllib.quote(utf8string) + "&lan=zh&per=0&pit=1&spd=7&cuid=8297904&ctp=1&tok=24.9ae6d50d24d1c222c4019be4c70613e7.2592000.1469358913.282335-8297904"
    os.system('mplayer "%s"' % (url))
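Two spots in the chat script above are easy to get wrong: reading the recognition reply, and putting Chinese text into a URL. Below is a rough Python 3 sketch of both (the scripts in this post are Python 2, so this is for illustration rather than a drop-in replacement). It assumes the reply has the err_msg and result fields that dump_res reads, and it reuses the text2audio parameters from the script. The function names parse_asr_response and build_tts_url, and the "TOKEN" placeholder, are my own inventions.

```python
import json
from urllib.parse import urlencode


def parse_asr_response(raw):
    """Return the first recognized sentence, or None if recognition failed."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    try:
        reply = json.loads(raw)
    except ValueError:
        return None  # the reply was not valid JSON
    if not isinstance(reply, dict) or reply.get("err_msg") != "success.":
        return None
    results = reply.get("result") or []
    return results[0] if results else None


def build_tts_url(text, token, cuid="8297904"):
    """Build a text2audio URL with every parameter percent-encoded."""
    params = [
        ("tex", text),
        ("lan", "zh"),
        ("per", "0"),
        ("pit", "1"),
        ("spd", "7"),
        ("cuid", cuid),
        ("ctp", "1"),
        ("tok", token),
    ]
    return "http://tsn.baidu.com/text2audio?" + urlencode(params)


if __name__ == "__main__":
    sample = '{"err_msg": "success.", "result": ["开灯,"]}'
    print(parse_asr_response(sample))
    print(build_tts_url("开灯", "TOKEN"))
```

Using json.loads rather than eval means a malformed or unexpected reply cannot run as code, and urlencode keeps spaces and non-ASCII characters from breaking the URL handed to mplayer.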
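The chat script above was written by someone online for their own Raspberry Pi. My modified version below adds voice control of the pin, switching a light on and off.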
# -*- coding: utf-8 -*-
# Python 2 script: record speech, recognize it with Baidu, and switch the
# light on or off depending on the recognized text.

import numpy as np
from datetime import datetime
import wave
import time
import urllib, urllib2, pycurl
import base64
import json
import os
import sys
reload(sys)
sys.setdefaultencoding("utf-8")

SAMPLING_RATE = 8000  # Hz; matches the -r 8000 passed to arecord below
save_count = 0
save_buffer = []
t = 0
sum = 0
time_flag = 0
flag_num = 0
# Full path, so recording and recognition always use the same file
filename = '/home/pi/Desktop/2016-6-25/asr.wav'
commun = '1'
answer = '1'

def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def get_token():
    # Exchange the API key and secret key for an access token
    apiKey = "Ll0c53MSac6GBOtpg22ZSGAU"
    secretKey = "44c8af396038a24e34936227d4a19dc2"
    auth_url = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=" + apiKey + "&client_secret=" + secretKey
    res = urllib2.urlopen(auth_url)
    json_data = res.read()
    return json.loads(json_data)['access_token']

def dump_res(buf):
    # pycurl write callback: parse the reply and keep the first result
    global commun  # the original declared duihua here, so commun was never updated
    print "Raw response (string):"
    print (buf)
    a = json.loads(buf)  # parse as JSON instead of eval()-ing server data
    print type(a)
    if a['err_msg'] == 'success.':
        commun = a['result'][0].encode("utf-8")
        print commun

def use_cloud(token):
    # Upload the recorded audio to Baidu speech recognition
    fp = wave.open(filename, 'rb')
    nf = fp.getnframes()
    f_len = nf * 2  # 16-bit mono: 2 bytes per frame
    audio_data = fp.readframes(nf)
    fp.close()
    cuid = "7519663"  # product id
    srv_url = 'http://vop.baidu.com/server_api' + '?cuid=' + cuid + '&token=' + token
    http_header = [
        'Content-Type: audio/pcm; rate=8000',
        'Content-Length: %d' % f_len
    ]

    c = pycurl.Curl()
    c.setopt(pycurl.URL, str(srv_url))  # curl doesn't support unicode
    #c.setopt(c.RETURNTRANSFER, 1)
    c.setopt(c.HTTPHEADER, http_header)  # must be list, not dict
    c.setopt(c.POST, 1)
    c.setopt(c.CONNECTTIMEOUT, 30)
    c.setopt(c.TIMEOUT, 30)
    c.setopt(c.WRITEFUNCTION, dump_res)
    c.setopt(c.POSTFIELDS, audio_data)
    c.setopt(c.POSTFIELDSIZE, f_len)
    c.perform()  # pycurl.perform() has no return val

# Save the data in `data` to a WAV file named `filename`
def save_wave_file(filename, data):
    wf = wave.open(filename, 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(SAMPLING_RATE)
    wf.writeframes("".join(data))
    wf.close()

# The loop below needs a token, as in the chat script
token = get_token()
tts_prefix = "http://tsn.baidu.com/text2audio?tex="
tts_suffix = "&lan=zh&per=0&pit=1&spd=7&cuid=8297904&ctp=1&tok=24.9ae6d50d24d1c222c4019be4c70613e7.2592000.1469358913.282335-8297904"

while True:
    # Record 3 seconds of 16-bit, 8 kHz audio from the USB sound card
    os.system('arecord -D "plughw:1,0" -f S16_LE -d 3 -r 8000 ' + filename)
    use_cloud(token)
    print commun
    site = commun
    commun = ''  # reset so a failed recognition does not repeat the last command
    if "开" in site:  # look for "开" (on) in the recognized text
        answer = '好的,正在为您开灯,请稍后'
        url = tts_prefix + urllib.quote(answer) + tts_suffix
        os.system('mplayer "%s"' % (url))
        # Adjust the path to wherever the light script lives
        os.system('cd /home/pi/Desktop/scripts&&./light on')
    if "关" in site:  # look for "关" (off) in the recognized text
        answer = '好的,正在为您关灯,请稍后'
        url = tts_prefix + urllib.quote(answer) + tts_suffix
        os.system('mplayer "%s"' % (url))
        os.system('cd /home/pi/Desktop/scripts&&./light off')
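Once there are more than two commands, the chain of if statements gets long. One possible way to organize it, sketched in Python 3, is a small keyword table. KEYWORD_ACTIONS and pick_action are names made up for this example, and unlike the script above it returns at most one action per sentence: when both characters appear, whichever keyword is listed first wins, which is an arbitrary choice.

```python
# Keyword -> argument for the light script, checked in this order.
KEYWORD_ACTIONS = [
    ("开", "on"),   # "turn on"
    ("关", "off"),  # "turn off"
]


def pick_action(text):
    """Return 'on', 'off', or None depending on the recognized text."""
    for keyword, action in KEYWORD_ACTIONS:
        if keyword in text:
            return action
    return None


if __name__ == "__main__":
    for sentence in ["请开灯", "把灯关掉", "你好"]:
        print(sentence, "->", pick_action(sentence))
```

The returned value can be passed straight to the light script as its argument, and adding a new command is then a single extra line in the table.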