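Many readers have asked me, through various channels, how to learn natural language processing. I wrote a few short posts on the topic long ago, "How to Learn Natural Language Processing" (《如何学习自然语言处理》) and "A Few Introductory Books on Natural Language Processing" (《几本自然语言处理入门书》), but I recommend this Zhihu Q&A more highly: "What Is the Fastest Way to Get Started with Natural Language Processing" (自然语言处理怎么最快入门). It includes a systematic answer from Zhou Ming of Microsoft Research Asia and a generous contribution from Liu Zhiyuan of Tsinghua University, "How Beginners Can Look Up Academic Literature in Natural Language Processing (NLP)" (初学者如何查阅自然语言处理(NLP)领域学术资料), along with answers that other readers have kindly shared.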
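That said, if you want to get started in NLP, I recommend reading this book first: Speech and Language Processing. The Chinese translation of the first edition is titled 《自然语言处理综论》. Its authors are both leading figures in NLP: Professor Dan Jurafsky of Stanford University and Professor James H. Martin of the University of Colorado. It was also my own introductory text: I read the Chinese edition (translated from the first English edition) and the second English edition. The third edition is still being written; the authors have already finished many chapters, and every finished chapter can be downloaded: Speech and Language Processing (3rd ed. draft). Judging from the chapter list, the third edition adds a number of chapters on deep learning for NLP, and its content and length have been substantially updated compared with earlier editions: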
| No. | Chapter | Slides | Relation to 2nd ed. |
| 1 | Introduction | | [Ch. 1 in 2nd ed.] |
| 2 | | | [Ch. 2 and parts of Ch. 3 in 2nd ed.] |
| 3 | Finite State Transducers | | |
| 4 | | | [Ch. 4 in 2nd ed.] |
| 5 | | | [expanded from pieces in Ch. 5 in 2nd ed.] |
| 6 | | | [new in this edition] |
| 7 | | | |
| 8 | Neural Nets and Neural Language Models | | |
| 9 | | | [Ch. 6 in 2nd ed.] |
| 10 | | | [Ch. 5 in 2nd ed.] |
| 11 | | | [Ch. 12 in 2nd ed.] |
| 12 | | | [Ch. 13 in 2nd ed.] |
| 13 | Statistical Parsing | | |
| 14 | | | [new in this edition] |
| 15 | | | [expanded from parts of Ch. 19 and 20 in 2nd ed.] |
| 16 | | | [new in this edition] |
| 17 | | | [expanded from parts of Ch. 19 and 20 in 2nd ed.] |
| 18 | | | [new in this edition] |
| 19 | The Representation of Sentence Meaning | | |
| 20 | Computational Semantics | | |
| 21 | | | [Ch. 22 in 2nd ed.] |
| 22 | | | [expanded from parts of Ch. 19 and 20 in 2nd ed.] |
| 23 | Neural Models of Sentence Meaning (RNN, LSTM, CNN, etc.) | | |
| 24 | Coreference Resolution and Entity Linking | | |
| 25 | Discourse Coherence | | |
| 26 | Seq2seq Models and Summarization | | |
| 27 | Machine Translation | | |
| 28 | | | |
| 29 | Conversational Agents | | |
| 30 | Speech Recognition | | |
| 31 | Speech Synthesis | | |
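In addition, one of the book's authors, Professor Dan Jurafsky of Stanford University, once taught a natural language processing course on Coursera: Natural Language Processing. The course no longer seems to be listed on Coursera's new platform, but we have kept a backup on Baidu Netdisk (百度网盘) that contains the course videos and the second English edition of the book. The two work best when studied together: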
Update (March 2018): link: https://pan.baidu.com/s/1Wp35AyHY1PrmisA4deoC6Q password: sps4
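If you have been looking for a way into natural language processing, working through this book and this course first is a prerequisite: everything else builds on that foundation.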