yicui's Collections
Theory
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 108
The Platonic Representation Hypothesis
Paper • 2405.07987 • Published • 2
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Paper • 2304.05366 • Published • 1
Explaining NonLinear Classification Decisions with Deep Taylor Decomposition
Paper • 1512.02479 • Published • 1
Large Language Models as Markov Chains
Paper • 2410.02724 • Published • 30
Neural Machine Translation by Jointly Learning to Align and Translate
Paper • 1409.0473 • Published • 5
Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models
Paper • 2310.17086 • Published • 1
Cross-Entropy Loss Functions: Theoretical Analysis and Applications
Paper • 2304.07288 • Published • 1
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Paper • 2410.19750 • Published • 2
Scaling Laws for Precision
Paper • 2411.04330 • Published • 7