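When OpenAI released its giant GPT-3 model a while back, it hit the NLP field like an atomic bomb. Sadly I was not lucky enough to get into the private beta, so I settled for a hands-on project with the previous generation, GPT-2.
Development task: given an opening passage and a target length, the program writes the rest of the lyrics automatically.
Approach: the web side uses HTML + Python + Django to handle input and output; the NLP side uses a pretrained model from Hugging Face to run inference on the text sequence. Hugging Face is a New York-based startup that began as a chatbot service, focuses on NLP technology, and maintains a large open-source community.
Official site: https://huggingface.co/
Source code: https://github.com/huggingface/transformers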
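Environment setup:
1. Install Anaconda.
2. conda create -n nlp python=3.7 -y, then conda activate nlp so that the next step installs into this environment.
3. pip install transformers django torch (GPT2LMHeadModel is the PyTorch model class, and pip install transformers alone does not pull in PyTorch)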
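Before moving on, it can help to confirm that the environment actually sees the packages. The following is a minimal sketch, not part of the original project; the REQUIRED list (including torch) is an assumption about what the rest of the walkthrough needs.

```python
import importlib.util

# Assumed package list for this walkthrough; torch is included because
# GPT2LMHeadModel is the PyTorch variant of the model class.
REQUIRED = ["transformers", "django", "torch"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it with the nlp environment activated; anything it lists as missing can be installed with pip.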
Implementation:
1. django-admin startproject lyric
2. python manage.py startapp lyric_app (run this inside the newly created lyric directory)
3. Configure settings.py:
import os  # required by os.path.join below; add it if your generated settings.py only imports pathlib

ALLOWED_HOSTS = ['*']  # accept any Host header; convenient for a demo, permissive for anything public

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'lyric_app',  # the app created in step 2
]

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [os.path.join(BASE_DIR, 'templates')],  # project-level templates folder
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
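Because the local port is later mapped to the public internet, a wildcard ALLOWED_HOSTS may be broader than you want. One possible alternative, sketched here under assumptions, reads the host list from an environment variable. The variable name LYRIC_ALLOWED_HOSTS and the fallback hosts are hypothetical and do not come from the original project.

```python
import os


def parse_allowed_hosts(raw):
    """Split a comma-separated host list, dropping blanks and surrounding whitespace."""
    return [host.strip() for host in raw.split(",") if host.strip()]


# LYRIC_ALLOWED_HOSTS is a hypothetical variable name chosen for this sketch.
# When it is unset, only local access is allowed.
ALLOWED_HOSTS = parse_allowed_hosts(
    os.environ.get("LYRIC_ALLOWED_HOSTS", "127.0.0.1,localhost")
)
```

With this in settings.py, the public domain handed out by the port-mapping tool would be added through the environment variable rather than hard-coded.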
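4. Create a templates folder under the Django project root. The views in step 6 render 'page/lyric.html', so the page template goes at templates/page/lyric.html.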
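The article does not show the template itself. The sketch below writes one possible version to templates/page/lyric.html. Everything in the HTML is an assumption except the field names ly and length and the context variables ly and txt, which are the ones the views read and pass back. The csrf_token tag is included because the CsrfViewMiddleware that startproject enables by default rejects POST forms without it.

```python
from pathlib import Path

# Hypothetical page markup: only the names "ly", "length" and "txt"
# are taken from the views; layout and wording are made up for this sketch.
LYRIC_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Lyric generator</title></head>
<body>
  <form action="/lyric/lyric_query" method="post">
    {% csrf_token %}
    <input type="text" name="ly" value="{{ ly }}" placeholder="Opening line">
    <input type="number" name="length" placeholder="Length">
    <button type="submit">Generate</button>
  </form>
  <p>{{ txt }}</p>
</body>
</html>
"""


def write_template(project_root):
    """Create templates/page/lyric.html under project_root and return its path."""
    target = Path(project_root) / "templates" / "page" / "lyric.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(LYRIC_TEMPLATE, encoding="utf-8")
    return target
```

Calling write_template(".") from the project root creates the folder structure and the file in one go.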
5. Routing, in two files. Project-level urls.py (in the lyric package):
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    # Hand everything under lyric/ to the app's own URLconf
    path('lyric/', include('lyric_app.urls')),
]
App-level urls.py (in the lyric_app module):
from django.urls import path
from . import views

urlpatterns = [
    path('lyric', views.lyric),              # input page
    path('lyric_query', views.lyric_query),  # form submission and result
]
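Django joins the prefix given to include() with each route in the included URLconf, so the two files above produce the final paths /lyric/lyric and /lyric/lyric_query. The helper below is only a simplified model of that concatenation for illustration; compose_routes is a made-up name, not a Django function.

```python
def compose_routes(prefix, app_routes):
    """Model how an include() prefix is joined with each app-level route."""
    return ["/" + prefix + route for route in app_routes]


routes = compose_routes("lyric/", ["lyric", "lyric_query"])
print(routes)  # ['/lyric/lyric', '/lyric/lyric_query']
```

So with the development server on port 8000, the input page would be reached at http://127.0.0.1:8000/lyric/lyric.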
6. View file views.py for the lyric_app module:
from django.shortcuts import render
from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

def lyric(request):
    # Show the empty input page
    return render(request, 'page/lyric.html')

# Load the language model once, when the module is imported
tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-lyric")
model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-lyric")
text_generator = TextGenerationPipeline(model, tokenizer)

def lyric_query(request):
    ly = request.POST.get('ly')          # opening text typed by the user
    length = request.POST.get('length')  # requested length, arrives as a string
    print('ly=', ly, length)
    # Generate text up to the requested length; max_length counts tokens
    # for the whole sequence, prompt included
    txt = text_generator(ly, max_length=int(length), do_sample=True)
    # The decoded text has spaces between characters; strip them
    txt = txt[0]['generated_text'].replace(' ', '')
    print(txt)
    return render(request, 'page/lyric.html', {'ly': ly, 'txt': txt})
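The view above calls int(length) directly, which raises an exception if the field is empty or not a number. The sketch below shows one way to guard that input, along with a standalone version of the generation call that can be tried outside Django. The helper names, the default of 100 and the cap of 512 are assumptions made for illustration, not values from the original project.

```python
def parse_length(raw, default=100, upper=512):
    """Convert the submitted 'length' field into a bounded integer.

    default and upper are hypothetical values picked for this sketch.
    """
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing field (None) or non-numeric text falls back to the default
        return default
    return max(1, min(value, upper))


def clean_generated(text):
    """Remove the spaces that appear between characters in the decoded text."""
    return text.replace(" ", "")


def generate_lyric(prompt, raw_length):
    """Generate lyrics for prompt; needs transformers and torch installed."""
    # Imported here so the helpers above work without the heavy dependencies
    from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

    # For brevity the model is loaded on every call; a web view should
    # load it once at module level, as the views.py above does.
    tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-lyric")
    model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-lyric")
    generator = TextGenerationPipeline(model, tokenizer)
    result = generator(prompt, max_length=parse_length(raw_length), do_sample=True)
    return clean_generated(result[0]["generated_text"])
```

Inside lyric_query, int(length) could then be swapped for parse_length(length). Because sampling is enabled, each call to generate_lyric returns different text.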
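7. Run python manage.py migrate to apply Django's built-in migrations and create the database tables (admin, auth, sessions and so on).
8. Run python manage.py runserver 127.0.0.1:8000 to start the development server.
9. Install Oray Peanut Shell (花生壳) and map the local port to the public internet.
Simple UI: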