Alessio Russo

Postdoctoral Researcher
Boston University
arusso2 (at) bu (dot) edu


Reading List

A curated selection of papers and books I’m reading or plan to read.
Browse by topic using the buttons below.

General Deep Learning

Conformal Prediction

Differential Geometry in Deep Learning

Dimensionality Reduction

Thompson Sampling

Deep Reinforcement Learning

Year Paper Venue
2025 Discovering state-of-the-art reinforcement learning algorithms Nature
2022 Exploring through Random Curiosity with General Value Functions
2022 CICERO
2021 Mastering Atari with Discrete World Models
2021 GMAC: A Distributional Perspective on Actor-Critic Framework
2021 Enforcing Robust Control Guarantees with Neural Network Policies
2021 Adversarial Intrinsic Motivation for Reinforcement Learning
2020 Sample-Based Distributional Policy Gradient
2020 Planning to Explore via Self-Supervised World Models
2020 Hypermodels for Exploration
2020 Dream to Control
2020 A Theoretical Analysis of Deep Q-Learning
2020 A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
2019 Towards Characterizing Divergence in Deep Q-Learning
2019 Statistics and Samples in Distributional Reinforcement Learning
2019 Learning Latent Dynamics for Planning from Pixels
2018 Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
2018 Equivalence Between Policy Gradients and Soft Q-Learning
2018 DORA The Explorer: Directed Outreaching Reinforcement Action-Selection
2018 Distributed Distributional Deterministic Policy Gradients
2018 Control-Theoretic Analysis of Smoothness for Stability-Certified Reinforcement Learning
2017 Reinforcement Learning with Deep Energy-Based Policies
2017 #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
2016 Unifying Count-Based Exploration and Intrinsic Motivation
List of algorithms

Reinforcement Learning

Year Paper Venue
2024 Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
2024 Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
2023 Empirical Design in Reinforcement Learning
2023 Does Zero-Shot Reinforcement Learning Exist?
2023 Backstepping Temporal Difference Learning
2023 An Analysis of Quantile Temporal-Difference Learning
2022 Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
2022 Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
2022 TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning
2022 Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk
2022 Safety-constrained Reinforcement Learning with a Distributional Safety Critic
2022 SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics
2022 Constrained Variational Policy Optimization for Safe Reinforcement Learning
2022 Conformal Off-Policy Prediction in Contextual Bandits
2022 A Review of Off-Policy Evaluation in Reinforcement Learning
2021 Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
2021 Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint
2021 Learning One Representation to Optimize All Rewards
2021 Hoeffding’s Inequality for General Markov Chains and Its Applications to Statistical Learning
2021 Adaptive Sampling for Best Policy Identification in MDPs
2020 Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
2020 On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
2020 Fast active learning for pure exploration in reinforcement learning
2020 CoinDICE: Off-Policy Confidence Interval Estimation
2019 Revisiting the Softmax Bellman Operator: New Benefits and New Perspective
2019 Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
2019 Provably Efficient Reinforcement Learning with Linear Function Approximation
2019 Benchmarking Safe Exploration in Deep Reinforcement Learning
2018 Is Q-learning Provably Efficient?
2018 Deep Reinforcement Learning that Matters
2018 Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
2018 Adaptive Sampling for Policy Identification
2017 Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
2017 Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
2017 Constrained Policy Optimization
2016 Learning the Variance of the Reward-To-Go
2015 Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
2015 Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
2015 High Confidence Policy Improvement
2015 High Confidence Off-Policy Evaluation
2015 A Comprehensive Survey on Safe Reinforcement Learning
2012 Policy Gradients with Variance Related Risk Criteria
2009 An Analysis of Reinforcement Learning with Function Approximation
2008 An Analysis of Model-Based Interval Estimation for Markov Decision Processes
2006 PAC Model-Free Reinforcement Learning
2004 Bias and Variance in Value Function Estimation
2001 TD Algorithm for the Variance of Return and Mean-Variance Reinforcement Learning
2001 Convergence of Optimistic and Incremental Q-Learning
2000 Eligibility Traces for Off-Policy Policy Evaluation
2000 Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
1993 Convergence of Stochastic Iterative Dynamic Programming Algorithms
1992 Reinforcement Learning Applied to Linear Quadratic Regulation
1982 The Variance of Discounted Markov Decision Processes

Bandit Algorithms

In-Context Learning

Optimization

Statistics

Probability Modeling & Inference

Uncertainty Estimation

Statistical Learning

Lecture Notes, Books and Courses

Blogs

Schools

Bayesian Learning

