Ksenia Se (@Kseniase)
TL;DR: The Story of Attention's Development by @karpathy
Origin: First proposed in 2014 by Dzmitry Bahdanau, Kyunghyun Cho (@KyunghyunCho), and Yoshua Bengio in https://huggingface.co/papers/1409.0473. Inspired by cognitive processes of attention, the mechanism was originally called "RNNSearch" and only later renamed "attention."
Key Idea: A data-dependent weighted average used for pooling and for communication between positions, giving neural networks flexible and powerful connections.
Breakthrough: Bahdanau's "soft search" mechanism (softmax scoring followed by weighted averaging) removed the fixed-length-vector bottleneck of encoder-decoder models in machine translation; a minimal sketch appears after this list.
Transformer Revolution: https://huggingface.co/papers/1706.03762 (2017) by @ashishvaswanigoogle et al. simplified the architecture by stacking attention layers and dropping recurrence, introducing multi-head attention and positional encodings.
Legacy: Attention replaced RNNs and now drives modern AI systems like ChatGPT. It emerged independently, alongside contemporaneous related work such as Alex Graves's Neural Turing Machines (https://huggingface.co/papers/1410.5401) and Jason Weston's Memory Networks (https://huggingface.co/papers/1410.3916).
Attention to history: Jürgen Schmidhuber claims his 1992 Fast Weight Programmers anticipated modern attention mechanisms. While conceptually similar, the term “attention” was absent, and there’s no evidence it influenced Bahdanau, Cho, and Bengio’s 2014 work. Paying attention (!) to history might have brought us to genAI earlier – but credit for the breakthrough still goes to Montreal.
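To make the "softmax + weighted averaging" idea concrete, here is a minimal NumPy sketch of the soft search step. It is illustrative only: it scores with a plain dot product for brevity, whereas Bahdanau et al.'s original model used a small MLP over the decoder and encoder states, and all variable names here are made up for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def soft_attention(query, keys, values):
    """Data-dependent weighted average ("soft search"):
    score each key against the query, normalize with softmax,
    then pool the values with those weights.

    query:  (d,)    decoder state at the current step
    keys:   (T, d)  encoder states to softly search over
    values: (T, dv) content to pool (often the same encoder states)
    """
    scores = keys @ query        # (T,) similarity of the query to each position
    weights = softmax(scores)    # (T,) attention distribution, sums to 1
    context = weights @ values   # (dv,) weighted average of the values
    return context, weights

# Toy usage: 5 encoder states of size 8, one decoder query
rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 8))
q = rng.normal(size=(8,))
ctx, w = soft_attention(q, enc, enc)
print(w.round(3), ctx.shape)  # attention weights and the pooled context vector
```

The context vector is then fed to the decoder at each step, so the model can look back at different source positions instead of squeezing the whole sentence into one fixed vector.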
Referenced Papers:
Attention origin (Bahdanau, Cho, Bengio, 2014): https://huggingface.co/papers/1409.0473
Transformers (Vaswani et al., 2017): https://huggingface.co/papers/1706.03762
Alex Graves's work: https://huggingface.co/papers/1410.5401, https://huggingface.co/papers/1308.0850
Jason Weston's (@spermwhale) Memory Networks: https://huggingface.co/papers/1410.3916
Sequence to Sequence Learning by Ilya Sutskever (@ilyasut), Oriol Vinyals, and Quoc V. Le: https://huggingface.co/papers/1409.3215
Who else deserves recognition in this groundbreaking narrative of innovation? Let’s ensure every contributor gets the credit they deserve. Leave a comment below 👇🏻🤗