How to design sustainable molecules?

Speaker: Jana Weber

Time: 15:00-15:45

Location: HG 2.62, Civil Engineering

Zoom: link

Abstract: Designing sustainable molecules requires navigating extremely large design spaces under severe data constraints. In this talk, I will introduce core building blocks of molecular machine learning and show how they enable tailored, model-guided exploration of (bio)chemical space. Emphasis will be placed on graph-based representations of molecules, graph neural networks, and network-level optimisation methods that naturally connect molecular design to problems in graph learning. These ideas will be illustrated through several case studies. I will first discuss the design of photocatalytic polymers, highlighting challenges in representing polymer structures as graphs, building predictive models from limited data, and performing inverse design. I will then present a guided design strategy for proteins, followed by an example of evaluating molecular synthesis pathways using reaction networks. Together, these examples demonstrate how graph models can be used for data-driven molecular design.
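As a rough illustration of the graph-based view of molecules mentioned in the abstract, the sketch below encodes a toy molecule as atoms (nodes) and bonds (edges) and applies one sum-aggregation message-passing step, the basic operation behind many graph neural networks. The molecule, the one-hot atom features, and all function names are assumptions chosen for illustration; nothing here is taken from the speaker's models.

```python
# Hypothetical toy molecule: ethanol heavy atoms (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]  # undirected bonds as pairs of atom indices


def build_adjacency(n_atoms, bonds):
    """Return a dict mapping each atom index to its bonded neighbours."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def one_hot(symbol, vocabulary=("C", "O")):
    """Encode an element symbol as a one-hot vector (assumed feature choice)."""
    return [1.0 if symbol == s else 0.0 for s in vocabulary]


def message_passing_step(features, adjacency):
    """New feature of each atom = its own feature + sum of neighbour features."""
    updated = []
    for i, own in enumerate(features):
        aggregated = list(own)
        for j in adjacency[i]:
            aggregated = [x + y for x, y in zip(aggregated, features[j])]
        updated.append(aggregated)
    return updated


adjacency = build_adjacency(len(atoms), bonds)
features = [one_hot(symbol) for symbol in atoms]
updated = message_passing_step(features, adjacency)
print(updated)
```

After one step, each atom's vector counts the carbon and oxygen atoms in its immediate neighbourhood (including itself). A practical graph neural network would add learned weights and nonlinearities, which this sketch leaves out.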

Grounding Large Language Models in Reaction Knowledge Graphs for Synthesis Retrieval

Speaker: Lorenzo Di Fruscia

Time: 15:45-16:05

Location: HG 2.62, Civil Engineering

Zoom: link

Abstract: Large Language Models (LLMs) can aid synthesis planning in chemistry, but standard prompting methods often yield hallucinated or outdated suggestions. We study LLM interactions with a reaction knowledge graph by casting reaction path retrieval as a Text2Cypher (natural language to graph query) generation problem, and define single- and multi-step retrieval tasks. We compare zero-shot prompting to one-shot variants using static, random, and embedding-based exemplar selection, and assess a checklist-driven validator/corrector loop. To evaluate our framework, we consider query validity and retrieval accuracy. We find that one-shot prompting with aligned exemplars consistently performs best. Our checklist-style self-correction loop mainly improves executability in zero-shot settings and offers limited additional retrieval gains once a good exemplar is present. We provide a reproducible Text2Cypher evaluation setup to facilitate further work on KG-grounded LLMs for synthesis planning.
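To make the idea of one-shot prompting with similarity-based exemplar selection more concrete, here is a minimal sketch. It stands in a simple bag-of-words cosine similarity for a learned embedding model, and the exemplar questions, the graph schema (`Reaction`, `Molecule`, `PRODUCES`), and the prompt wording are all hypothetical; they are not the setup evaluated in the talk.

```python
import math
import re
from collections import Counter

# Hypothetical exemplar pool: question / Cypher pairs over an assumed schema.
EXEMPLARS = [
    {
        "question": "Which reactions produce aspirin?",
        "cypher": "MATCH (r:Reaction)-[:PRODUCES]->(m:Molecule {name: 'aspirin'}) RETURN r",
    },
    {
        "question": "Find a path from benzene to phenol.",
        "cypher": "MATCH p = (a:Molecule {name: 'benzene'})-[*..4]->(b:Molecule {name: 'phenol'}) RETURN p",
    },
]


def embed(text):
    """Bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[token] * v[token] for token in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def select_exemplar(question, exemplars=EXEMPLARS):
    """Pick the exemplar whose question is most similar to the new question."""
    query = embed(question)
    return max(exemplars, key=lambda ex: cosine(query, embed(ex["question"])))


def build_prompt(question):
    """Assemble a one-shot Text2Cypher prompt around the selected exemplar."""
    ex = select_exemplar(question)
    return (
        "Translate the question into a Cypher query.\n\n"
        f"Question: {ex['question']}\nCypher: {ex['cypher']}\n\n"
        f"Question: {question}\nCypher:"
    )


prompt = build_prompt("Which reactions produce ibuprofen?")
print(prompt)
```

The resulting prompt would then be sent to an LLM, and the generated query could be checked for executability against the knowledge graph before retrieval; both steps are omitted here because they depend on the specific model and database.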

GenComp: Generative Models for Spatial Compositionality Problems

Speaker: Amelia Villegas Morcillo

Time: 16:05-16:25

Location: HG 2.62, Civil Engineering

Abstract: Compositionality, the ability to construct novel solutions by recombining known components, is central to spatial reasoning tasks such as puzzles, LEGO assemblies, and molecular structures. While recent diffusion-based generative models have explored aspects of compositional generation, they typically do not explicitly model how spatial parts assemble into coherent intermediate sub-solutions. In this talk, I first introduce spatial compositionality metrics for diffusion models that use connectivity graphs extracted during sampling to evaluate whether meaningful part-part and part-assembly relationships emerge and persist across intermediate timesteps. Building on these insights, I then present ongoing work on GenComp, our proposed joint generative framework with two parallel, interacting diffusion processes over spatial and topological variables, in which graph structure guides how components connect and move together during generation. Preliminary results on cross-cut puzzles in a denoising diffusion setting show how enforcing compositional constraints in the forward process affects both topological and spatial prediction and influences generative performance, laying the groundwork for diffusion models that explicitly reason over reusable spatial substructures.
