
Reproducing models for the text_classification task

The following models are reproduced here with fastNLP:

char_cnn (paper: Character-level Convolutional Networks for Text Classification)

dpcnn (paper: Deep Pyramid Convolutional Neural Networks for Text Categorization)

HAN (paper: Hierarchical Attention Networks for Document Classification)

LSTM+self_attention (paper: A Structured Self-attentive Sentence Embedding)

AWD-LSTM (paper: Regularizing and Optimizing LSTM Language Models)

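Datasets and summary of reproduction results

Results reproduced with fastNLP vs. results reported in the papers (the value before the / is from the fastNLP implementation and the value after it is the figure reported in the paper; - means the paper does not list a result on that dataset)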

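| model name | yelp_p | yelp_f | sst-2 | IMDB |
| --- | --- | --- | --- | --- |
| char_cnn | 93.80/95.12 | - | - | - |
| dpcnn | 95.50/97.36 | - | - | - |
| HAN | - | - | - | - |
| LSTM | 95.74/- | 64.16/- | - | 88.52/- |
| AWD-LSTM | 95.96/- | 64.74/- | - | 88.91/- |
| LSTM+self_attention | 96.34/- | 65.78/- | - | 89.53/- |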
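
To compare the two numbers in a cell, the `fastNLP/paper` notation can be split mechanically. The snippet below is only a minimal sketch in plain Python; it is not part of fastNLP, and the helper names `parse_cell` and `gap` are hypothetical, introduced here for illustration. The yelp_p values are copied from the results above.

```python
from typing import Optional, Tuple


def parse_cell(cell: str) -> Tuple[Optional[float], Optional[float]]:
    """Split a 'fastNLP/paper' cell into two optional floats.

    A bare '-' (or a '-' on either side of the slash) means that
    no result is available for that side.
    """
    ours, _, paper = cell.strip().partition("/")

    def to_float(text: str) -> Optional[float]:
        text = text.strip()
        return None if text in ("", "-") else float(text)

    return to_float(ours), to_float(paper)


def gap(cell: str) -> Optional[float]:
    """Return fastNLP score minus paper score, or None if either is missing."""
    ours, paper = parse_cell(cell)
    if ours is None or paper is None:
        return None
    return round(ours - paper, 2)


# yelp_p column, copied from the results summary (hypothetical variable name).
yelp_p = {
    "char_cnn": "93.80/95.12",
    "dpcnn": "95.50/97.36",
    "HAN": "-",
    "LSTM": "95.74/-",
}

for name, cell in yelp_p.items():
    print(name, gap(cell))
```

On the yelp_p column this gives a gap of -1.32 points for char_cnn and -1.86 points for dpcnn, and `None` for HAN and LSTM, where one or both sides are missing.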