Research Program

In our research, we develop novel data-driven techniques to solve real materials design problems across scales in an actionable way. We use machine learning models as navigation systems for chemical space. We do this in close collaboration with experimental partners and by working on several themes that are key for progress in the field.

We strive to make all our work practically useful by sharing code and data under permissive licenses in a reusable form. To us, computational work without open code is mere advertisement.

Leveraging (tacit) knowledge

AI generated image of robot sifting through literature
Image: Kevin Maik Jablonka

Plenty of chemical data is being produced and published, but most of it is not used. The experiments performed in most labs are not (optimally) informed by the scientific record. Often they are not even informed by the experiments performed by prior group members.

Machine learning techniques, in particular large language models, can help leverage this information and make it more accessible.
They can also help us capture subtle (tacit) aspects that conventional machine learning approaches (operating on representations of "idealized" structures) cannot capture.

Better inductive biases and representations

We know many things about the world. We take basic physical and chemical (empirical) laws for granted. However, most of our models do not know about them.

Because knowledge of these laws can make models more robust and predictive, we develop novel ways to incorporate relevant inductive biases into our models.

Much of this work happens at the level of the inputs to the models. That is, we attempt to craft representations of molecules and materials that faithfully carry more of the relevant information. This also includes the development of representations that allow us to bridge length scales.

Learning beyond conventional objectives

AI generated image of human-computer interaction
Illustration: Kevin Maik Jablonka

Most models are trained by finding weights that minimize the mismatch between the predictions and the ground truth. While this might make models good at predicting values that correlate with the "ground truth" in a specific case, it does not guarantee that the models are right for the right reasons.
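A toy sketch can make this point concrete. Everything in it is a hypothetical illustration rather than one of our models: the two-feature data, the weight vectors `causal` and `spurious`, and the helper `mse` are assumptions chosen so that a spurious feature happens to copy the relevant one during training.

```python
def mse(weights, rows, targets):
    """Mean squared error of a linear model (no intercept) on a dataset."""
    total = 0.0
    for features, target in zip(rows, targets):
        prediction = sum(w * f for w, f in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(rows)


# Hypothetical training data: the target is 2 * (first feature).
# The second feature is a spurious copy of the first.
train_rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
train_targets = [2.0, 4.0, 6.0]

# Hypothetical shifted data: the second feature no longer tracks the first.
shifted_rows = [(1.0, 3.0), (2.0, 1.0), (3.0, 2.0)]
shifted_targets = [2.0, 4.0, 6.0]

causal = (2.0, 0.0)    # relies on the feature that determines the target
spurious = (0.0, 2.0)  # relies on the feature that only correlates with it

# Both weight vectors fit the training data equally well ...
train_error_causal = mse(causal, train_rows, train_targets)
train_error_spurious = mse(spurious, train_rows, train_targets)

# ... but only one of them survives the shift.
shifted_error_causal = mse(causal, shifted_rows, shifted_targets)
shifted_error_spurious = mse(spurious, shifted_rows, shifted_targets)

print(train_error_causal, train_error_spurious)
print(shifted_error_causal, shifted_error_spurious)
```

In this sketch the training error alone cannot tell the two weight vectors apart, so an objective based only on that error gives no reason to prefer the one that is right for the right reasons.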

To counteract this, we will incorporate more than just feedback on a quantitative error into the training of our models. To do so, we will leverage principles from human-computer interaction.