Telegram chat of the dlinnlp group, page 27


DL in NLP

2929 members


2019 October 15

exbert.net - A Visual Analysis Tool to Explore Learned Representations in Transformers Models

A handy tool for visualizing BERT's internal representations. The astrologers have declared a doubling of Transformer-analysis papers at the next ACL.

twitter.com/Ben_Hoov/status/1183823783754371076

Pleased to announce exBERT, an interactive tool to explore the embeddings and attention of Transformer models at different layers, with different heads. Demo: https://t.co/FxFLDutYmK Paper: https://t.co/1x18EbgklP #NLProc With @sebgehr @hen_str @MITIBMLab https://t.co/k4NrAgbCb7

19:23 · 1238 views · #1

Also, people are saying that RAdam doesn't actually do anything useful and that you should keep using Adam with warmup. They even suggest how much warmup you need (spoiler: 2/(1−β2) iterations).

twitter.com/denisyarats/status/1183794108856459264

Denis Yarats

Excited to share our new work lead by @rjerryma, where we attempt to demystify RAdam (Liu et al. 2019) and its automatic learning rate warmup schedule. Paper: https://t.co/hDsfzzfmiv [1/4]
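
The 2/(1−β2) rule of thumb can be turned into a learning-rate schedule. The snippet below is a minimal sketch that assumes a linear ramp of the learning-rate multiplier, capped at 1; the function names and the base learning rate are hypothetical and are not taken from the paper's code.

```python
def warmup_steps(beta2: float) -> float:
    """Warmup length from the rule of thumb: 2 / (1 - beta2) iterations."""
    return 2.0 / (1.0 - beta2)


def warmup_factor(step: int, beta2: float = 0.999) -> float:
    """Learning-rate multiplier at `step` (assumed linear ramp, capped at 1)."""
    return min(1.0, step / warmup_steps(beta2))


base_lr = 1e-3  # hypothetical base learning rate
for step in (1, 500, 1000, 2000, 5000):
    # The effective learning rate grows linearly, then stays at base_lr.
    print(step, base_lr * warmup_factor(step))
```

With the common Adam default β2 = 0.999, this works out to roughly 2000 warmup steps. In PyTorch, a function like `warmup_factor` could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`.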

19:25 · 1259 views · #2

And Python 3.8 is out

Main features:
1. The walrus operator (:=), which assigns a value and returns it in the same expression
2. Positional-only arguments
3. Lots of new things in the typing module (check out protocols!)
4. New f-string syntax f"{my_variable=}", which is equivalent to f"my_variable={my_variable!r}"
5. A new importlib.metadata module that lets you query dependencies and installed packages from code
6. New functions in math and statistics: math.prod, math.isqrt, statistics.geometric_mean, statistics.multimode, statistics.NormalDist
7. Warnings about dangerous syntax. This one will be useful to me personally: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
8. Assorted speedups (namedtuple field lookups are more than 2x faster)
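
Several of the features listed above fit into one short script. This is a sketch that requires Python 3.8 or newer; the variable names, the `scale` and `installed_version` helpers, and the sample data are made up for illustration.

```python
import math
import statistics
from importlib import metadata

# Walrus operator: bind a name and use the value in the same expression.
data = [3, 1, 4, 1, 5, 9, 2, 6]
if (n := len(data)) > 5:
    summary = f"{n} items"

# Positional-only parameters: everything before "/" cannot be passed by keyword.
def scale(value, factor, /):
    return value * factor

# Self-documenting f-string: expression text, "=", then repr() of the value.
name = "BERT"
debug_line = f"{name=}"

# New helpers in math and statistics.
product = math.prod([1, 2, 3, 4])
root = math.isqrt(17)  # integer square root, rounded down
gmean = statistics.geometric_mean([1, 4, 16])
modes = statistics.multimode("aabbc")  # all most-common values
half = statistics.NormalDist(mu=0, sigma=1).cdf(0)

# importlib.metadata: look up an installed package's version from code.
def installed_version(package):
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print(summary, scale(2, 3), debug_line, product, root, gmean, modes, half)
```

Calling `scale(value=2, factor=3)` raises `TypeError`, which is the point of the `/` marker.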

realpython.com/python38-new-features

Cool New Features in Python 3.8 – Real Python

What does Python 3.8 bring to the table? Learn about some of the biggest changes and see how you can best make use of them.

19:39 · 2544 views · #3

2019 October 16

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
Arivazhagan et al. [Google Brain]
arxiv.org/abs/1907.05019

We also studied other properties of very deep networks, including the depth-width trade-off, trainability challenges and design choices for scaling Transformers to over 1500 layers with 84 billion parameters.

Now a bit more seriously. How did they get to this point?
Idea: multilingual machine translation systems have existed for a long time, but they are usually trained on a handful of high-resource languages (gigabytes of text). So why not train one on 103 languages (25 billion sentence pairs) and look at BLEU for en→any and any→en?
With Transformer-Big, the standard model for machine translation, quality on high-resource languages (German, French, …) drops relative to the bilingual baseline (by up to 5 BLEU). However, BLEU on low-resource languages (Yoruba, Sindhi, Hawaiian, …) rises by up to +10 BLEU (+5 on average). That is a very noticeable improvement.
But how do you keep quality on the high-resource languages? Perhaps the model simply doesn't have enough capacity to learn them alongside the low-resource ones (try learning a hundred languages yourself). How do you test this hypothesis? Scale the model up to insane sizes!

There are two ways to scale a model up: in width and in depth.
1. Wide transformer (24 layers, 32 heads, 2048 attention hidden, 16384 fc-hidden, 1.3B total)
2. Deep transformer (48 layers, 16 heads, 1024 attention hidden, 4096 fc-hidden, 1.3B total)
Result: an even larger gain on low-resource languages, a gain appears on mid-resource languages, and there is almost no loss on high-resource languages.

blog post: ai.googleblog.com/2019/10/exploring-massively-multilingual.html
thanks to @twlvth for the link

P.S. The 84B-parameter model announced in the blog post never actually shows up in the paper =(
Google, please don't do this.
P.P.S. I also highly recommend reading the paper. It has a lot of detail on training, e.g. how to sample languages properly, and more interesting results. Too much to fit into one post.
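
On sampling languages: as far as I recall, the paper balances languages with temperature-based sampling, where each language is drawn with probability proportional to its share of the data raised to the power 1/T. The snippet below is a minimal sketch of that idea; the corpus sizes and the function name are hypothetical, not taken from the paper.

```python
def sampling_probs(sizes, temperature):
    """Temperature-scaled sampling distribution over languages.

    temperature = 1 samples proportionally to corpus size;
    larger values move the distribution toward uniform.
    """
    total = sum(sizes.values())
    weights = {lang: (n / total) ** (1.0 / temperature) for lang, n in sizes.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Hypothetical sentence-pair counts for one high- and one low-resource language.
sizes = {"fr": 1_000_000, "yo": 1_000}
for t in (1, 5, 100):
    print(t, sampling_probs(sizes, t))
```

Raising the temperature lets the low-resource language be seen far more often than its raw share of the data would allow, at the cost of undersampling the high-resource one.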

19:36 · 1343 views · #4


2019 October 17

A fun application of GPT-2: encrypted (steganographic) messages. With a fixed seed, you can find a prefix text that makes the model generate your secret message.

twitter.com/altsoph/status/1184426687683055617

Aleksey Tikhonov

#HarvardNLP proposed to use a GPT-2 for steganography. To encrypt a secret message, one specifies its text, as well as seed (sets both a key and an output style). To decode it, one needs to know the existence of the message, a seed-text, and the same model https://t.co/khJ6af6Zei

18:52 · 1301 views · #6

Deep learning is taking jobs where nobody expected it to. DeepMind applied neural networks to restoring ancient Greek texts. Their system's character error rate is 30%. On the one hand, that is a lot; on the other hand, the error rate of a professional epigraphist (yes, there is a whole profession devoted to this kind of task) is 57%.

Restoring ancient text using deep learning: a case study on Greek epigraphy
Assael et al. DeepMind
arxiv.org/abs/1910.06262

blog: deepmind.com/research/publications/Restoring-ancient-text-using-deep-learning-a-case-study-on-Greek-epigraphy
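
For reference, here is a minimal sketch of how a character error rate can be computed, assuming the usual definition (Levenshtein distance between prediction and reference, divided by the reference length). This is not the paper's evaluation code, and the example strings are made up.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


print(cer("abcd", "abxd"))  # one wrong character out of four
```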

19:11 · 1237 views · #7


Also, Facebook is showing off its latest results in low-resource machine translation.

We’ve developed a novel approach that combines several methods, including iterative back-translation and self-training with noisy channel decoding, to build the best-performing English-Burmese MT system (+8 BLEU points over the second-best team).

We’ve also developed a state-of-the-art approach for better filtering of noisy parallel data from public websites with our LASER toolkit. … first place for the shared task on corpus filtering for low-resource languages of Sinhala and Nepali at WMT 2019.

So, surprisingly, LASER has not been forgotten and has found an interesting application. By the way, the blog post also has some nice animations illustrating their approach. Worth a look.

ai.facebook.com/blog/recent-advances-in-low-resource-machine-translation

Recent advances in low-resource machine translation

Recently, Facebook AI has advanced state-of-the-art results in key language understanding tasks and also launched a new benchmark to push AI systems further

19:16 · 1470 views · #9

2019 October 18

A fast t-SNE is something we sorely missed when we were playing with embedding visualizations. It beats the sklearn implementation by orders of magnitude.

https://twitter.com/altsoph/status/1184771151916126208

Aleksey Tikhonov

Canny Lab (Berkeley) adapted the popular dimensionality reduction algorithm t-SNE for parallel calculation on the GPU. It's faster than the sklearn implementation by 2-3 orders of magnitude. https://t.co/radIxpUfIl. Picture: my t-SNE map of ELMO embedded series annotations.

17:45 · 1491 views · #10

And a multilingual QA dataset has appeared. 7 languages, Russian not included (en, de, es, ar, zh, vi, hi). Like XNLI, it has only dev and test sets, but that is better than nothing. Facebook is betting on multilingual models.

dataset: github.com/facebookresearch/MLQA
paper: arxiv.org/abs/1910.07475

facebookresearch/MLQA

New dataset. Contribute to facebookresearch/MLQA development by creating an account on GitHub.

17:52 · 1195 views · #11


And an interesting Quanta Magazine article surveying everything that is going on in NLP right now: GLUE, BERT, and broken datasets. Highly recommended.

excerpt:
“As BERT-based neural networks have taken benchmarks like GLUE by storm, new evaluation methods have emerged that seem to paint these powerful NLP systems as computational versions of Clever Hans, the early 20th-century horse who seemed smart enough to do arithmetic, but who was actually just following unconscious cues from his trainer.”

www.quantamagazine.org/machines-beat-humans-on-a-reading-test-but-do-they-understand-20191017

Quanta Magazine

Machines Beat Humans on a Reading Test. But Do They Understand?

A tool known as BERT can now beat humans on advanced reading-comprehension tests. But it's also revealed how far AI has to go.

18:02 · 1504 views · #13

When TF came out, there was a lot of talk about how it executes graphs cleverly and optimizes them at compile time. In practice that turned out to be not quite true, but now Stanford is presenting a new method for optimizing the computation graph that noticeably outperforms standard rule-based approaches.

It supports TensorFlow and ONNX (which is essentially the standard for exporting PyTorch graphs to static ones). They promise a speedup of 10% to 300% relative to the TensorRT optimizer.

twitter.com/matei_zaharia/status/1185104766583619584

github: github.com/jiazhihao/taso

Matei Zaharia

TASO is an awesome Stanford research project led by @JiaZhihao that we'll be presenting at SOSP 2019 this month. It generates graph optimizations for DNNs automatically using theorem proving, and outperforms the manually written graph optimizers in current DNN frameworks. https://t.co/HwD2kblpKN

21:01 · 1534 views · #14


2019 October 21

The Illustrated GPT-2 (Visualizing Transformer Language Models)
jalammar.github.io/illustrated-gpt2

A new post on the blog of Jay Alammar (author of The Illustrated Transformer) about language models and GPT. As always, lots of great pictures. A must-read for everyone.


This year, we saw a dazzling application of machine learning. The OpenAI GPT-2 exhibited impressive ability of writing coherent and passionate essays that exceed what we anticipated current language models are able to produce. The GPT-2 wasn’t a particularly novel architecture – its architecture is very similar to the decoder-only transformer. The GPT2 was, however, a very large, transformer-based language model trained on a massive dataset. In this post, we’ll look at the architecture that enabled the model to produce its results. We will go into the depths of its self-attention layer. And then we’ll look at applications for the decoder-only transformer beyond language modeling.

My goal here is to also supplement my earlier post, The Illustrated Transformer, with more visuals explaining the inner-workings of transformers, and how they’ve evolved since the original paper. My…

23:02 · 1267 views · #16


2019 October 22

A new batch of news from Ruder

newsletter.ruder.io/issues/deep-learning-indaba-eurnlp-ml-echo-chamber-pretrained-lms-reproducibility-papers-199557

newsletter.ruder.io

Deep Learning Indaba, EurNLP, ML echo chamber, Pretrained LMs, Reproducibility papers

Hi all, This month features updates about recent events (ICLR 2020 submissions, Deep Learning Indaba, EurNLP 2019), reflections on the ML echo chamber, a ton of resources and tools (many of them about Transformers and pretrained language models), many superb posts—from entertaining comics to advice for PhDs and writing papers and musings on incentives to use poor-quality datasets—and compelling papers on reproducibility. Contributions 💪 If you have written or have come across something that would b

16:21 · 1177 views · #18

And a good post from Ruder's roundup:

The main problems of transfer learning in NLP

mohammadkhalifa.github.io/2019/09/06/Issues-With-Transfer-Learning-in-NLP/

Current Issues with Transfer Learning in NLP - Khalifa's Blog

Natural Language Processing (NLP) has recently witnessed dramatic progress with state-of-the-art results being published every few days. Leaderboard madness is driving the most common NLP bench...

16:37 · 1189 views · #19

How the Transformers broke NLP leaderboards

One of the problems: the leaderboards are broken, the top models do not make a significant contribution, and the whole thing smells of overfitting.

hackingsemantics.xyz/2019/leaderboards

Hacking semantics

How the Transformers broke NLP leaderboards

With the huge Transformer-based models such as BERT, GPT-2, and XLNet, are we losing track of how the state-of-the-art performance is achieved?

16:41 · 1265 views · #20
