
2019 October 31
DL in NLP
The "We read papers for you" digest. July – September 2019
habr.com/ru/company/ods/blog/472672
source
2019 November 01
DL in NLP
BPE-Dropout: Simple and Effective Subword Regularization
Provilkov et al. [Yandex]
arxiv.org/abs/1910.13267

TL;DR
Hypothesis: the standard deterministic BPE segmentation does not let the model learn the full variety of morphology in the language and is not robust to segmentation errors.
Proposed solution: let’s use multiple segmentations of the same word during training (see the picture).
Results: up to +3 BLEU; makes rare tokens less rare and improves their embeddings; more robust to misspellings.

It is a very simple regularization algorithm and can also be seen as token-level augmentation. Let's use it! The algorithm is part of github.com/rsennrich/subword-nmt (just use the argument --dropout 0.1).
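
For intuition, here is a toy sketch of the idea (not the subword-nmt implementation; the function name and the merge_ranks table are illustrative, and word-end markers are omitted): run the usual greedy BPE merges over a word, but at every step drop each applicable merge with probability p, so the same word gets segmented differently across epochs.

```python
import random

def bpe_dropout_segment(word, merge_ranks, p=0.1):
    """Segment `word` with greedy BPE merges, dropping each candidate merge
    with probability p, so repeated calls yield different segmentations.
    merge_ranks maps a symbol pair, e.g. ("u", "n"), to its merge priority
    (lower = learned earlier by standard BPE)."""
    symbols = list(word)
    while True:
        # all adjacent pairs that correspond to a learned merge
        candidates = [(merge_ranks[pair], i)
                      for i, pair in enumerate(zip(symbols, symbols[1:]))
                      if pair in merge_ranks]
        # BPE-dropout: randomly discard each candidate with probability p
        candidates = [c for c in candidates if random.random() >= p]
        if not candidates:
            break
        _, i = min(candidates)  # apply the best-ranked surviving merge
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols
```

With p=0 this reduces to standard deterministic BPE; with p>0 each call can return a different, usually finer-grained, segmentation of the same word.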

via twitter.com/lena_voita/status/1189546512491134977
source
2019 November 02
DL in NLP
Google noticed that people are fleeing from TensorBoard to wandb.ai and built its own counterpart. If you are still using TensorBoard, you can now put it in the cloud.

twitter.com/alxkh/status/1189952614139670528
source
DL in NLP
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Lewis et al. [FAIR]
arxiv.org/abs/1910.13461

Ok, people. We all know that seq2seq is kinda the most general task in NLP. So every pre-training task we have can be built into it. Let's add some new ones and see what happens.
List of tasks:
1. Masked Language Modelling
2. Random Token Deletion
3. Text Infilling - replace a span of tokens with a single MASK token and restore all of them, without knowing how many were deleted (see the sketch after this list)
4. Sentence Shuffling - permute the sentences, restore the original order
5. Document Rotation - pick a random token, start the document from it and, once the end is reached, continue with the original beginning; the model should restore the document starting from the original first token
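
To make Text Infilling concrete, here is a toy sketch of the noising step (the function name, the <mask> string, and the mask_ratio / lam defaults are illustrative choices, not the paper's code): sample span lengths from a Poisson distribution and collapse each whole span into a single mask, so the model has to recover both the missing tokens and how many there were.

```python
import numpy as np

def text_infilling(tokens, mask_token="<mask>", mask_ratio=0.3, lam=3.0, seed=0):
    """Toy BART-style text infilling: replace spans (lengths ~ Poisson(lam))
    with a single mask each until roughly mask_ratio of the tokens are covered.
    A length-0 span simply inserts an extra mask token."""
    rng = np.random.default_rng(seed)
    out = list(tokens)
    budget = int(mask_ratio * len(out))
    while budget > 0 and out:
        length = min(int(rng.poisson(lam)), budget, len(out))
        start = int(rng.integers(0, len(out) - length + 1))
        out[start:start + length] = [mask_token]   # the whole span becomes one mask
        budget -= max(length, 1)
    return out

print(text_infilling("the quick brown fox jumps over the lazy dog".split()))
```

Spans may occasionally swallow a previously inserted mask; that is fine for building intuition, but not what a careful implementation would do.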

After some experiments the authors chose two tasks (Text Infilling and Sentence Shuffling) and trained a seq2seq model on them in the RoBERTa setup.

Results: comparable to RoBERTa on NLU (GLUE, SQuAD, …) and SOTA on NLG (ELI5, XSum, CNN/DailyMail, PersonaChat), but not-so-good numbers on (Romanian-English) translation.
source
DL in NLP
Paper with my highlights
source
2019 November 03
DL in NLP
News from the ongoing EMNLP is rolling in
source
DL in NLP
Forwarded from Yaroslav Emelianov
https://arxiv.org/abs/1908.07898
While some people build yet another B*RT to squeeze out SOTA on public datasets, others took the trouble to analyse annotator bias in NLI data.
In particular, adding the annotator id as a model feature gives a boost to the metrics, and the model generalizes worse when the data is split by annotator than on the standard test set.
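
As a rough illustration of "annotator id as a model feature" (my own sketch, not the paper's actual architecture; names and sizes are made up): concatenate an annotator embedding to the premise-hypothesis representation before the classification head.

```python
import torch
import torch.nn as nn

class NLIWithAnnotatorID(nn.Module):
    """Toy NLI classifier that also sees which annotator labelled the example."""

    def __init__(self, encoder, enc_dim, n_annotators, ann_dim=16, n_labels=3):
        super().__init__()
        self.encoder = encoder                      # any premise-hypothesis encoder
        self.ann_emb = nn.Embedding(n_annotators, ann_dim)
        self.head = nn.Linear(enc_dim + ann_dim, n_labels)

    def forward(self, premise, hypothesis, annotator_id):
        pair = self.encoder(premise, hypothesis)    # (batch, enc_dim)
        ann = self.ann_emb(annotator_id)            # (batch, ann_dim)
        return self.head(torch.cat([pair, ann], dim=-1))
```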
source
2019 November 04
DL in NLP
Interesting Reddit discussion about cross-entropy loss vs L2 over embeddings

reddit.com/r/MachineLearning/comments/dqoh2u/d_any_principled_reason_for_cross_entropy_instead
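
For reference, here are the two options under discussion in a minimal sketch (PyTorch is my assumption; shapes are made up): softmax cross-entropy over the vocabulary versus an L2 loss that regresses the hidden state onto the target token's embedding.

```python
import torch
import torch.nn.functional as F

h = torch.randn(8, 512)             # hidden states (batch, d)
E = torch.randn(32000, 512)         # output embedding matrix (vocab, d)
y = torch.randint(0, 32000, (8,))   # gold token ids (batch,)

# Option 1: softmax cross-entropy over the vocabulary (the standard choice)
ce_loss = F.cross_entropy(h @ E.t(), y)

# Option 2: L2 regression directly onto the target token's embedding
l2_loss = F.mse_loss(h, E[y])
```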
source
DL in NLP
I often see questions like "what should I read to get into NLP?"
source
2019 November 05
DL in NLP
Somewhat obvious observations, but everything is laid out on the poster and easy to see.
source
2019 November 06
DL in NLP
Forwarded from viktor
source
DL in NLP
One more paper for the text style transfer collection

Style Transfer for Texts: Retrain, Report Errors, Compare with Rewrites
Tikhonov et al.

This paper shows that standard assessment methodology for style transfer has several significant problems. First, the standard metrics for style accuracy and semantics preservation vary significantly on different re-runs. Therefore one has to report error margins for the obtained results. Second, starting with certain values of bilingual evaluation understudy (BLEU) between input and output and accuracy of the sentiment transfer the optimization of these two standard metrics diverge from the intuitive goal of the style transfer task. Finally, due to the nature of the task itself, there is a specific dependence between these two metrics that could be easily manipulated. Under these circumstances, we suggest taking BLEU between input and human-written reformulations into consideration for benchmarks. We also propose three new architectures that outperform state of the art in terms of this metric.

https://www.aclweb.org/anthology/D19-1406.pdf
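
As a tooling note (the library choice and the toy data are my assumptions, nothing the paper prescribes): corpus BLEU against human-written reformulations can be computed with sacrebleu along these lines.

```python
import sacrebleu

# Hypothetical data: the sentences being scored and human-written
# reformulations of the same inputs (one reference set shown; more are allowed)
candidates = ["the movie was great", "the food was awful"]
human_rewrites = [["the film was great", "the food was terrible"]]

bleu = sacrebleu.corpus_bleu(candidates, human_rewrites)
print(bleu.score)
```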
source
DL in NLP
Revealing the Dark Secrets of BERT
Kovaleva et al. [UMass Lowell]
arxiv.org/abs/1908.08593

A paper from our lab on interpreting BERT, at EMNLP
source