During the spring, I took a discrete probability course (6.7720). This was one of the most interesting mathematics courses I've ever taken, second only to an amazing reading course on dynamical systems under Adam Kanigowski (read this if limit theorems for dynamical systems interest you!).

While I have no aspirations of becoming a theoretician (a life in academia has always been a non-starter), the course finally gave me the background to read a paper that had been sitting in my "papers to read" cache since 2021: the Universal Law of Robustness paper by Bubeck and Sellke. To summarize: classically, models that interpolate (perfectly fit) their training data were expected to generalize poorly. In practice, however, they often perform well, a phenomenon that led to Belkin et al.'s discovery of the "double descent" curve. In the paper, Bubeck and Sellke propose a necessary condition for robustly (smoothly) fitting the data that helps explain the emergence of this curve.
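
To give a flavor of the result (this is my loose paraphrase, not the paper's exact statement): with n training points in dimension d and a model class with p parameters, any function in the class that fits the noisy labels below the noise level must, with high probability, have a large Lipschitz constant unless the model is heavily overparametrized. Roughly, the scaling looks like this:

```latex
% Loose paraphrase of the Bubeck-Sellke scaling (constants and the exact
% noise-dependent factor omitted); n = samples, d = input dimension, p = parameters.
% Any f in the class that interpolates below the noise level satisfies, w.h.p.,
\mathrm{Lip}(f) \;\gtrsim\; \sqrt{\frac{n d}{p}}
% So achieving an O(1) Lipschitz constant (a "robust" fit) essentially requires
% p \gtrsim n d, i.e., overparametrization by a factor of the dimension d.
```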

Back in 2021, I found the paper to be a tough read. However, after taking the course, I found it rather pleasant, with well-written and concise proofs.

As a final project, I wrote an exposition on this paper. If you're interested in the robustness result but find the original paper tough going and would prefer a more hand-wavy, intuition-heavy perspective, you can read the exposition here.