Paper Reading 2017 02 10

First round of paper-reading summaries and recommendations for 2017, the Year of the Rooster

The papers recommended this time are:

  1. Structured prediction models for RNN based sequence labeling in clinical text. EMNLP 2016.
  2. Sentence rewriting for semantic parsing. ACL 2016.
  3. Adversarial training methods for semi-supervised text classification. Under review as a conference paper at ICLR 2017.

Structured prediction models for RNN based sequence labeling in clinical text

1
Authors

Abhyuday N Jagannatha, Hong Yu

Affiliations

University of Massachusetts, MA, USA; Bedford VAMC and CHOIR, MA, USA

Keywords

sequence labeling, skip chain CRF, Bi-LSTM

Source

EMNLP 2016

Problem

Tackles the slot filling problem in the medical domain, cast as sequence labeling over clinical text.

Model
(1)Bi-LSTM CRF

A CRF criterion is placed on top of the Bi-LSTM output layer. For reference, see: [Zhiheng Huang, Wei Xu, and Kai Yu. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991.]
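As a rough illustration of what a CRF criterion on top of the Bi-LSTM output means, the sketch below scores a label sequence as the sum of per-token emission scores and label-to-label transition scores, then decodes the best sequence with Viterbi. The emission table stands in for Bi-LSTM outputs, and the names `sequence_score`, `viterbi`, `emissions` and `transitions` are assumptions made for this example, not code from the paper.

```python
def sequence_score(emissions, transitions, labels):
    """Score of one label sequence under a linear-chain CRF.

    emissions[t][y]: score of label y at step t (assumed to come from a Bi-LSTM).
    transitions[a][b]: score of moving from label a to label b.
    """
    score = sum(emissions[t][y] for t, y in enumerate(labels))
    score += sum(transitions[a][b] for a, b in zip(labels, labels[1:]))
    return score


def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence by dynamic programming."""
    n_labels = len(emissions[0])
    best = list(emissions[0])  # best[y]: best score of any path ending in y
    back = []                  # back[t-1][y]: best previous label for y at step t
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for y in range(n_labels):
            prev = max(range(n_labels), key=lambda p: best[p] + transitions[p][y])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][y] + emissions[t][y])
        best = new_best
        back.append(pointers)
    y = max(range(n_labels), key=lambda k: best[k])
    path = [y]
    for pointers in reversed(back):
        y = pointers[y]
        path.append(y)
    return path[::-1]
```

Training would additionally need the log-partition function (the forward algorithm) to turn these scores into a likelihood; that part is left out here.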

(2)Bi-LSTM CRF with pairwise modeling

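Many label transitions occur only rarely in the training data, so they cannot be learned well. The authors therefore predict the CRF label transition score $$A_{y_t,y_{t+1}}$$ with a simple neural network whose inputs are the hidden states at time steps t and t+1.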
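A minimal sketch of the idea, assuming a single tanh layer over the concatenated hidden states; the actual network in the paper may differ. The function name `pairwise_transition_scores` and the parameters `W` and `b` are hypothetical.

```python
import math


def pairwise_transition_scores(h_t, h_next, W, b, n_labels):
    """Predict an n_labels x n_labels transition table from two hidden states.

    h_t, h_next: hidden-state vectors at steps t and t+1 (plain lists).
    W: n_labels * n_labels rows, each of length len(h_t) + len(h_next).
    b: n_labels * n_labels biases.
    Assumed form: one tanh layer; this is an illustration, not the paper's exact model.
    """
    x = h_t + h_next  # list concatenation of the two hidden states
    flat = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + bias)
        for row, bias in zip(W, b)
    ]
    # Reshape the flat output into a table indexed as A[y_t][y_next].
    return [flat[i * n_labels:(i + 1) * n_labels] for i in range(n_labels)]
```

The resulting table replaces the single shared transition matrix of a standard CRF, so each position gets transition scores that depend on its context.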

(3)Bi-LSTM with approximate Skip-chain CRF

Comments

This work is an in-depth study of the Bi-LSTM CRF model.

Sentence rewriting for semantic parsing

2
Authors

Bo Chen, Le Sun, Xianpei Han, Bo An

Affiliations

State Key Laboratory of Computer Sciences, Institute of Software, Chinese Academy of Sciences, China.

Keywords

Sentence rewriting, semantic parsing

Source

ACL 2016

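Problem

The target logical forms in semantic parsing depend on the ontology's lexicon, so sentences that carry the same meaning but use rare expressions are hard to parse correctly. To address this, the paper proposes sentence rewriting: a sentence with a rare expression is converted into a common sentence with the same meaning.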

Model
(1)Dictionary-based Rewriting

Re-explains nouns using a dictionary.

Example: “For instance, the word “daughter” is explained as “female child” in Wiktionary”

Resource used: Wiktionary
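The dictionary step can be pictured with a tiny sketch. The one-entry dictionary below is hypothetical and only mirrors the “daughter” example; the paper draws its explanations from Wiktionary, and its real procedure is more involved than word-by-word substitution.

```python
# Hypothetical mini-dictionary mirroring the "daughter" example above.
EXPLANATIONS = {"daughter": "female child"}


def dictionary_rewrite(sentence, explanations=EXPLANATIONS):
    """Replace each word that has a dictionary explanation with that explanation."""
    return " ".join(explanations.get(word.lower(), word) for word in sentence.split())


print(dictionary_rewrite("who is the daughter of obama"))
```

The rewritten sentence uses words ("female", "child") that are more likely to match entries in the ontology's lexicon than the original noun.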

(2)Template-based Rewriting

Replaces sentence patterns using templates.

Resource used: WikiAnswers
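A minimal sketch of template-based rewriting, assuming templates come in pairs with a shared slot. The single template pair below is a made-up illustration; in the paper the template pairs are mined from WikiAnswers paraphrases.

```python
import re

# Hypothetical template pair: (pattern with a named slot, target template).
TEMPLATE_PAIRS = [
    (r"how many people live in (?P<x>.+)", "what is the population of {x}"),
]


def template_rewrite(sentence, pairs=TEMPLATE_PAIRS):
    """Rewrite a sentence if it matches a source template; otherwise return it unchanged."""
    text = sentence.strip().rstrip("?")
    for pattern, target in pairs:
        match = re.fullmatch(pattern, text, flags=re.IGNORECASE)
        if match:
            return target.format(**match.groupdict())
    return sentence


print(template_rewrite("How many people live in Berlin?"))
```

Unlike dictionary-based rewriting, this changes the structure of the whole sentence while keeping the slot content untouched.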

Comments

Sentence rewriting approaches the data sparsity problem from the opposite direction.

Adversarial training methods for semi-supervised text classification

3
Authors

Takeru Miyato, Andrew M Dai, Ian Goodfellow

Affiliations

Kyoto University, Google Brain and OpenAI

Keywords

Adversarial training, virtual adversarial training, semi-supervised learning, text classification

Source

ICLR 2017 (under review)

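Problem

Bring adversarial training to NLP. The perturbations are applied to word embeddings; this is the adversarial-example technique rather than a GAN.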

Model

(1)Adversarial training

Supervised learning.
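A minimal sketch of the adversarial perturbation, assuming a toy logistic-regression classifier in place of the paper's LSTM over word embeddings. The perturbation is the gradient of the loss with respect to the input, rescaled to L2 norm `epsilon`. All function names here are hypothetical, and the gradient is written in closed form only because the toy model allows it.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nll(w, x, y):
    """Negative log-likelihood of label y (0 or 1) under a logistic model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return -math.log(p if y == 1 else 1.0 - p)


def adversarial_perturbation(w, x, y, epsilon):
    """r_adv = epsilon * g / ||g||_2, where g is the gradient of the loss w.r.t. x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    g = [(p - y) * wi for wi in w]  # closed-form gradient for this toy model
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return [0.0] * len(g)
    return [epsilon * gi / norm for gi in g]


w, x, y = [3.0, 4.0], [0.1, -0.2], 1
r_adv = adversarial_perturbation(w, x, y, epsilon=0.5)
x_adv = [xi + ri for xi, ri in zip(x, r_adv)]
print(nll(w, x, y), nll(w, x_adv, y))  # the loss on x_adv is larger
```

Training then adds the loss on `x_adv` to the ordinary loss. Because the label `y` is needed to compute the gradient, this variant only applies to labeled examples.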

(2)Virtual adversarial training

Unsupervised learning: the loss needs no labels, so unlabeled text can be used as well.
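A minimal sketch of the virtual adversarial loss, again assuming a toy logistic model rather than the paper's LSTM. The model's own prediction on the clean input replaces the label: the perturbation is chosen to maximize the KL divergence between the clean and perturbed predictions, and that divergence is the loss. The finite-difference gradient, the optional fixed probe direction `d`, and all names are assumptions made for this example; a real implementation would use automatic differentiation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def kl_bernoulli(p, q):
    """KL divergence between two Bernoulli distributions with parameters p and q."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def virtual_adversarial_loss(w, x, epsilon, d=None, probe_norm=1e-3, h=1e-6):
    """KL between predictions on x and on x + r_vadv; no label is used."""
    if d is None:
        d = [random.gauss(0.0, 1.0) for _ in x]  # random probe direction
    d_norm = math.sqrt(sum(di * di for di in d))
    d = [probe_norm * di / d_norm for di in d]
    p = predict(w, x)  # clean prediction, treated as a fixed target

    def kl_at(r):
        return kl_bernoulli(p, predict(w, [a + b for a, b in zip(x, r)]))

    # Central finite differences approximate the gradient of the KL at the probe point.
    g = []
    for i in range(len(d)):
        plus, minus = list(d), list(d)
        plus[i] += h
        minus[i] -= h
        g.append((kl_at(plus) - kl_at(minus)) / (2.0 * h))
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    if g_norm == 0.0:
        return 0.0
    r_vadv = [epsilon * gi / g_norm for gi in g]
    return kl_at(r_vadv)
```

Since no label enters the computation, this loss can be evaluated on unlabeled examples, which is what makes the method usable for semi-supervised learning.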

Comments

Provides a regularization scheme.