Publications | Theresa Eimer

2025

Growing with Experience: Growing Neural Networks in Deep Reinforcement Learning

Lukas Fehring, Marius Lindauer, and Theresa Eimer

In Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), Jun 2025

@inproceedings{fehring25,
  author = {Fehring, Lukas and Lindauer, Marius and Eimer, Theresa},
  title = {Growing with Experience: Growing Neural Networks in Deep Reinforcement Learning},
  booktitle = {Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)},
  year = {2025},
  month = jun,
}

Task Scheduling & Forgetting in Multi-Task Reinforcement Learning

Marc Speckmann and Theresa Eimer

In Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), Jun 2025

@inproceedings{speckmann25,
  author = {Speckmann, Marc and Eimer, Theresa},
  title = {Task Scheduling \& Forgetting in Multi-Task Reinforcement Learning},
  booktitle = {Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)},
  year = {2025},
  month = jun,
}

Performance Prediction In Reinforcement Learning: The Bad And The Ugly

Julian Dierkes, Theresa Eimer, Marius Lindauer, and Holger Hoos

In 18th European Workshop on Reinforcement Learning (EWRL), Sep 2025

@inproceedings{dierkes25,
  author = {Dierkes, Julian and Eimer, Theresa and Lindauer, Marius and Hoos, Holger},
  title = {Performance Prediction In Reinforcement Learning: The Bad And The Ugly},
  booktitle = {18th European Workshop on Reinforcement Learning (EWRL)},
  year = {2025},
  month = sep,
}

Mighty: A Comprehensive Tool for studying Generalization, Meta-RL and AutoRL

Aditya Mohan, Theresa Eimer, Carolin Benjamins, André Biedenkapp, and Marius Lindauer

In 18th European Workshop on Reinforcement Learning (EWRL), Sep 2025

@inproceedings{moheimer25,
  author = {Mohan, Aditya and Eimer, Theresa and Benjamins, Carolin and Biedenkapp, André and Lindauer, Marius},
  title = {Mighty: A Comprehensive Tool for studying Generalization, Meta-RL and AutoRL},
  booktitle = {18th European Workshop on Reinforcement Learning (EWRL)},
  year = {2025},
  month = sep,
}

Revisiting Learning Rate Control

Micha Henheik, Theresa Eimer, and Marius Lindauer

In Proceedings of the Fourth International Conference on Automated Machine Learning (AutoML’25), Sep 2025

@inproceedings{henheik-automl25,
  author = {Henheik, Micha and Eimer, Theresa and Lindauer, Marius},
  title = {Revisiting Learning Rate Control},
  booktitle = {Proceedings of the Fourth International Conference on
                 Automated Machine Learning ({AutoML}'25)},
  year = {2025},
  month = sep,
}

2024

AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks

Alexander Tornede, Difan Deng, Theresa Eimer, Joseph Giovanelli, Aditya Mohan, Tim Ruhkopf, Sarah Segel, Daphne Theodorakopoulos, Tanja Tornede, Henning Wachsmuth, and Marius Lindauer

Transactions on Machine Learning Research, Jan 2024

The fields of both Natural Language Processing (NLP) and Automated Machine Learning (AutoML) have achieved remarkable results over the past years. In NLP, especially Large Language Models (LLMs) have experienced a rapid series of breakthroughs very recently. We envision that the two fields can radically push the boundaries of each other through tight integration. To showcase this vision, we explore the potential of a symbiotic relationship between AutoML and LLMs, shedding light on how they can benefit each other. In particular, we investigate both the opportunities to enhance AutoML approaches with LLMs from different perspectives and the challenges of leveraging AutoML to further improve LLMs. To this end, we survey existing work, and we critically assess risks. We strongly believe that the integration of the two fields has the potential to disrupt both fields, NLP and AutoML. By highlighting conceivable synergies, but also risks, we aim to foster further exploration at the intersection of AutoML and LLMs.
@article{tornede-tmlr24,
  author = {Tornede, Alexander and Deng, Difan and Eimer, Theresa and Giovanelli, Joseph and Mohan, Aditya and Ruhkopf, Tim and Segel, Sarah and Theodorakopoulos, Daphne and Tornede, Tanja and Wachsmuth, Henning and Lindauer, Marius},
  title = {AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks},
  journal = {Transactions on Machine Learning Research},
  year = {2024},
  month = jan,
}

ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning

Jannis Becktepe, Julian Dierkes, Carolin Benjamins, Aditya Mohan, David Salinas, Raghu Rajan, Frank Hutter, Holger H. Hoos, Marius Lindauer, and Theresa Eimer

In 17th European Workshop on Reinforcement Learning (EWRL), Sep 2024

Hyperparameters are a critical factor in reliably training well-performing reinforcement learning (RL) agents. Unfortunately, developing and evaluating automated approaches for tuning such hyperparameters is both costly and time-consuming. As a result, such approaches are often only evaluated on a single domain or algorithm, making comparisons difficult and limiting insights into their generalizability. We propose ARLBench, a benchmark for hyperparameter optimization (HPO) in RL that allows comparisons of diverse HPO approaches while being highly efficient in evaluation. To enable research into HPO in RL, even in settings with low compute resources, we select a representative subset of HPO tasks spanning a variety of algorithm and environment combinations. This selection allows for generating a performance profile of an automated RL (AutoRL) method using only a fraction of the compute previously necessary, enabling a broader range of researchers to work on HPO in RL. With the extensive and large-scale dataset on hyperparameter landscapes that our selection is based on, ARLBench is an efficient, flexible, and future-oriented foundation for research on AutoRL. Both the benchmark and the dataset are available at https://github.com/automl/arlbench.
@inproceedings{beckdierkes-ewrl24,
  author = {Becktepe, Jannis and Dierkes, Julian and Benjamins, Carolin and Mohan, Aditya and Salinas, David and Rajan, Raghu and Hutter, Frank and Hoos, Holger H. and Lindauer, Marius and Eimer, Theresa},
  title = {ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning},
  booktitle = {17th European Workshop on Reinforcement Learning (EWRL)},
  year = {2024},
  month = sep,
}

2023

Hyperparameters in Reinforcement Learning and How To Tune Them

Theresa Eimer, Marius Lindauer, and Roberta Raileanu

In Proceedings of the Fortieth International Conference on Machine Learning, Jul 2023

In order to improve reproducibility, deep reinforcement learning (RL) has been adopting better scientific practices such as standardized evaluation metrics and reporting. However, the process of hyperparameter optimization still varies widely across papers, which makes it challenging to compare RL algorithms fairly. In this paper, we show that hyperparameter choices in RL can significantly affect the agent’s final performance and sample efficiency, and that the hyperparameter landscape can strongly depend on the tuning seed which may lead to overfitting. We therefore propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds, as well as principled hyperparameter optimization (HPO) across a broad search space. We support this by comparing multiple state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts, demonstrating that HPO approaches often have higher performance and lower compute overhead. As a result of our findings, we recommend a set of best practices for the RL community, which should result in stronger empirical results with fewer computational costs, better reproducibility, and thus faster progress. In order to encourage the adoption of these practices, we provide plug-and-play implementations of the tuning algorithms used in this paper.
@inproceedings{eimer-icml23,
  author = {Eimer, Theresa and Lindauer, Marius and Raileanu, Roberta},
  title = {Hyperparameters in Reinforcement Learning and How To Tune Them},
  booktitle = {Proceedings of the Fortieth International Conference on Machine Learning},
  year = {2023},
  month = jul,
}

Contextualize Me - The Case for Context in Reinforcement Learning

Carolin Benjamins, Theresa Eimer, Frederik Schubert, Sebastian Döhler, Aditya Mohan, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, and Marius Lindauer

Transactions on Machine Learning Research, Jul 2023

While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes. Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner, thereby enabling flexible, precise and interpretable task specification and generation. Our goal is to show how the framework of cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks. We confirm the insight that optimal behavior in cRL requires context information, as in other related areas of partial observability. To empirically validate this in the cRL framework, we provide various context-extended versions of common RL environments. They are part of the first benchmark library, CARL, designed for generalization based on cRL extensions of popular benchmarks, which we propose as a testbed to further study general agents. We show that in the contextual setting, even simple RL environments become challenging - and that naive solutions are not enough to generalize across complex context spaces.
@article{benjamins-tmlr23,
  author = {Benjamins, Carolin and Eimer, Theresa and Schubert, Frederik and Döhler, Sebastian and Mohan, Aditya and Biedenkapp, André and Rosenhahn, Bodo and Hutter, Frank and Lindauer, Marius},
  title = {Contextualize Me - The Case for Context in Reinforcement Learning},
  journal = {Transactions on Machine Learning Research},
  year = {2023},
}

2022

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems

Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, and Marius Lindauer

Journal of Artificial Intelligence Research (JAIR), Jul 2022

The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents. However, the success of RL agents is often highly sensitive to design choices in the training process, which may require tedious and error-prone manual tuning. This makes it challenging to use RL for new problems, while also limiting its full potential. In many other areas of machine learning, AutoML has shown it is possible to automate such design choices and has also yielded promising initial results when applied to RL. However, Automated Reinforcement Learning (AutoRL) involves not only standard applications of AutoML but also includes additional challenges unique to RL, that naturally produce a different set of methods. As such, AutoRL has been emerging as an important area of research in RL, providing promise in a variety of applications from RNA design to playing games such as Go. Given the diversity of methods and environments considered in RL, much of the research has been conducted in distinct subfields, ranging from meta-learning to evolution. In this survey, we seek to unify the field of AutoRL: we provide a common taxonomy, discuss each area in detail, and pose open problems which would be of interest to researchers going forward.
@article{parker-holder-jair22,
  author = {Parker{-}Holder, Jack and Rajan, Raghu and Song, Xingyou and Biedenkapp, André and Miao, Yingjie and Eimer, Theresa and Zhang, Baohe and Nguyen, Vu and Calandra, Roberto and Faust, Aleksandra and Hutter, Frank and Lindauer, Marius},
  title = {Automated Reinforcement Learning (AutoRL): {A} Survey and Open Problems},
  journal = {Journal of Artificial Intelligence Research (JAIR)},
  volume = {74},
  pages = {517--568},
  year = {2022},
}

Automated Dynamic Algorithm Configuration

Steven Adriaensen, André Biedenkapp, Gresa Shala, Noor Awad, Theresa Eimer, Marius Lindauer, and Frank Hutter

Journal of Artificial Intelligence Research, Jul 2022

The performance of an algorithm often critically depends on its parameter configuration. While a variety of automated algorithm configuration methods have been proposed to relieve users from the tedious and error-prone task of manually tuning parameters, there is still a lot of untapped potential as the learned configuration is static, i.e., parameter settings remain fixed throughout the run. However, it has been shown that some algorithm parameters are best adjusted dynamically during execution, e.g., to adapt to the current part of the optimization landscape. Thus far, this is most commonly achieved through hand-crafted heuristics. A promising recent alternative is to automatically learn such dynamic parameter adaptation policies from data. In this article, we give the first comprehensive account of this new field of automated dynamic algorithm configuration (DAC), present a series of recent advances, and provide a solid foundation for future research in this field. Specifically, we (i) situate DAC in the broader historical context of AI research; (ii) formalize DAC as a computational problem; (iii) identify the methods used in prior-art to tackle this problem; (iv) conduct empirical case studies for using DAC in evolutionary optimization, AI planning, and machine learning.
@article{adriaensen-jair22,
  author = {Adriaensen, Steven and Biedenkapp, André and Shala, Gresa and Awad, Noor and Eimer, Theresa and Lindauer, Marius and Hutter, Frank},
  title = {Automated Dynamic Algorithm Configuration},
  journal = {Journal of Artificial Intelligence Research},
  volume = {75},
  pages = {1633--1699},
  year = {2022},
}

2021

DACBench: A Benchmark Library for Dynamic Algorithm Configuration

Theresa Eimer, André Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, and Marius Lindauer

In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI’21), Aug 2021

Dynamic Algorithm Configuration (DAC) aims to dynamically control a target algorithm’s hyperparameters in order to improve its performance. Several theoretical and empirical results have demonstrated the benefits of dynamically controlling hyperparameters in domains like evolutionary computation, AI Planning or deep learning. Replicating these results, as well as studying new methods for DAC, however, is difficult since existing benchmarks are often specialized and incompatible with the same interfaces. To facilitate benchmarking and thus research on DAC, we propose DACBench, a benchmark library that seeks to collect and standardize existing DAC benchmarks from different AI domains, as well as provide a template for new ones. For the design of DACBench, we focused on important desiderata, such as (i) flexibility, (ii) reproducibility, (iii) extensibility and (iv) automatic documentation and visualization. To show the potential, broad applicability and challenges of DAC, we explore how a set of six initial benchmarks compare in several dimensions of difficulty.
@inproceedings{eimer-ijcai21,
  author = {Eimer, Theresa and Biedenkapp, André and Reimer, Maximilian and Adriaensen, Steven and Hutter, Frank and Lindauer, Marius},
  title = {DACBench: A Benchmark Library for Dynamic Algorithm Configuration},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ({IJCAI}'21)},
  year = {2021},
  month = aug,
  publisher = {ijcai.org},
  talk = {https://www.youtube.com/watch?v=-G-hLmBI4WM},
}

Self-Paced Context Evaluation for Contextual Reinforcement Learning

Theresa Eimer, André Biedenkapp, Frank Hutter, and Marius Lindauer

In Proceedings of the Thirty-eighth International Conference on Machine Learning, Jul 2021

Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, SPaCE automatically generates task curricula online with little computational overhead. To this end, SPaCE leverages information contained in state values during training to accelerate and improve training performance as well as generalization capabilities to new instances from the same problem domain. Nevertheless, SPaCE is independent of the problem domain at hand and can be applied on top of any RL agent with state-value function approximation. We demonstrate SPaCE’s ability to speed up learning of different value-based RL agents on two environments, showing better generalization capabilities and up to 10x faster learning compared to naive approaches such as round robin or SPDRL, as the closest state-of-the-art approach.
@inproceedings{eimer-icml21,
  author = {Eimer, Theresa and Biedenkapp, André and Hutter, Frank and Lindauer, Marius},
  title = {Self-Paced Context Evaluation for Contextual Reinforcement Learning},
  booktitle = {Proceedings of the Thirty-eighth International Conference on Machine Learning},
  year = {2021},
  month = jul,
}

Automatic Risk Adaptation in Distributional Reinforcement Learning

Frederik Schubert, Theresa Eimer, Bodo Rosenhahn, and Marius Lindauer

In Workshop on Reinforcement Learning for Real Life (RL4RealLife@ICML’21), Jul 2021

The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes, depending on the familiarity of the agent with its environment. This is especially important in safety-critical environments, where errors can lead to high costs or damage. In distributional RL, the risk-sensitivity can be controlled via different distortion measures of the estimated return distribution. However, these distortion functions require an estimate of the risk level, which is difficult to obtain and depends on the current state. In this work, we demonstrate the suboptimality of a static risk level estimation and propose a method to dynamically select risk levels at each environment step. Our method ARA (Automatic Risk Adaptation) estimates the appropriate risk level in both known and unknown environments using a Random Network Distillation error. We show reduced failure rates by up to a factor of 7 and improved generalization performance by up to 14 percent compared to both risk-aware and risk-agnostic agents in several locomotion environments.
@inproceedings{schubert-rl4rlicml21,
  author = {Schubert, Frederik and Eimer, Theresa and Rosenhahn, Bodo and Lindauer, Marius},
  title = {Automatic Risk Adaptation in Distributional Reinforcement Learning},
  booktitle = {Workshop on Reinforcement Learning for Real Life ({RL4RealLife@ICML}'21)},
  year = {2021},
  month = jul,
}

Hyperparameters in Contextual RL are Highly Situational

Theresa Eimer, Carolin Benjamins, and Marius Lindauer

In Ecological Theory of RL Workshop NeurIPS, Dec 2021

Although Reinforcement Learning (RL) has shown impressive results in games and simulation, real-world application of RL suffers from its instability under changing environment conditions and hyperparameters. We give a first impression of the extent of this instability by showing that the hyperparameters found by automatic hyperparameter optimization (HPO) methods are not only dependent on the problem at hand, but even on how well the state describes the environment dynamics. Specifically, we show that agents in contextual RL require different hyperparameters if they are shown how environmental factors change. In addition, finding adequate hyperparameter configurations is not equally easy for both settings, further highlighting the need for research into how hyperparameters influence learning and generalization in RL.
@inproceedings{eimer-ecorl21,
  author = {Eimer, Theresa and Benjamins, Carolin and Lindauer, Marius},
  title = {Hyperparameters in Contextual RL are Highly Situational},
  booktitle = {Ecological Theory of RL Workshop NeurIPS},
  year = {2021},
  month = dec,
}

2020

Dynamic Algorithm Configuration: Foundation of a New Meta-Algorithmic Framework

André Biedenkapp, H. Furkan Bozkurt, Theresa Eimer, Frank Hutter, and Marius Lindauer

In Proceedings of the European Conference on Artificial Intelligence (ECAI), Jun 2020

The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on parameter tuning. Automated methods have been proposed to alleviate users from the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential through adjusting an algorithm’s parameters online since different parameter values can be optimal at different stages of the algorithm. Prior work showed that reinforcement learning is an effective approach to learn policies for online adjustments of algorithm parameters in a data-driven way. We extend that approach by formulating the resulting dynamic algorithm configuration as a contextual MDP, such that RL not only learns a policy for a single instance, but across a set of instances. To lay the foundation for studying dynamic algorithm configuration with RL in a controlled setting, we propose white-box benchmarks covering major aspects that make dynamic algorithm configuration a hard problem in practice and study the performance of various types of configuration strategies for them. On these white-box benchmarks, we show that (i) RL is a robust candidate for learning configuration policies, outperforming standard parameter optimization approaches, such as classical algorithm configuration; (ii) based on function approximation, RL agents can learn to generalize to new types of instances; and (iii) self-paced learning can substantially improve the performance by selecting a useful sequence of training instances automatically.
@inproceedings{biedenkapp-ecai20,
  author = {Biedenkapp, André and Bozkurt, H. Furkan and Eimer, Theresa and Hutter, Frank and Lindauer, Marius},
  title = {Dynamic {A}lgorithm {C}onfiguration: {F}oundation of a {N}ew {M}eta-{A}lgorithmic {F}ramework},
  booktitle = {Proceedings of the European Conference on Artificial Intelligence (ECAI)},
  year = {2020},
  month = jun,
  talk = {https://www.youtube.com/watch?v=wxPYtSGT05s&feature=youtu.be},
}