Patrick Knab
PhD Candidate
I am a third-year PhD candidate at Clausthal University of Technology working on Computer Vision and Interpretable AI. My research investigates how foundation models can be used to derive domain-specific visual concepts that make neural networks more transparent and easier to interpret. I am particularly interested in how such concept representations can improve the performance of vision foundation models in image and video generation, understanding, and downstream decision-making.

Education
  • Clausthal University of Technology
    PhD Candidate
    Feb. 2025 - present
  • University of Mannheim
    PhD Candidate
    Sept. 2023 - Jan. 2025
  • University of Mannheim
    M.Sc. in Information Science
    Jan. 2021 - Aug. 2023
  • Lappeenranta University of Technology, Finland
    Semester Abroad
    Jan. 2020 - Apr. 2020
  • University of Mannheim
    B.Sc. in Information Science
    Sep. 2017 - Dec. 2020
Experience
  • Ramblr GmbH, Munich
    PhD Internship
    Sept. 2025 - Nov. 2025
  • Robert Bosch GmbH, Bühl
    Master's Thesis Student
    Oct. 2022 - Apr. 2023
  • Robert Bosch GmbH, Bühl
    Working Student
    Jul. 2022 - Aug. 2023
  • Institute for Enterprise Systems, Mannheim
    Scientific Assistant
    Jan. 2022 - Aug. 2023
  • Grosse-Hornke, Münster
    Working Student
    Nov. 2021 - Jul. 2022
  • Robert Bosch GmbH, Bühl
    Working Student
    Jan. 2021 - Dec. 2021
  • Robert Bosch GmbH, Bühl
    Bachelor's Thesis Student
    Sep. 2020 - Dec. 2020
  • Robert Bosch GmbH, Bühl
    Intern
    May 2020 - Aug. 2020
  • Porsche AG, Weissach
    Working Student
    Oct. 2019 - Dec. 2019
  • Robert Bosch GmbH, Bühl
    Intern
    Jul. 2019 - Aug. 2019
Honors & Awards
  • Ideenwettbewerb Regionalpreis Mainz
    2025
News
2025
We have a new preprint! We propose MoTIF, a transformer-based CBM for video classification.
Oct 01
We are happy to announce that our workshop paper DSEG has been accepted to the main track of ECAI 2025!
Jul 16
Selected Publications
Concepts in Motion: Temporal Bottlenecks for Interpretable Video Classification

Patrick Knab, Sascha Marton, Philipp J Schubert, Drago Guggiana, Christian Bartelt

Preprint 2025

Conceptual models such as Concept Bottleneck Models (CBMs) have driven substantial progress in improving interpretability for image classification by leveraging human-interpretable concepts. However, extending these models from static images to sequences of images, such as video data, introduces a significant challenge due to the temporal dependencies inherent in videos, which are essential for capturing actions and events. In this work, we introduce MoTIF (Moving Temporal Interpretable Framework), an architectural design inspired by a transformer that adapts the concept bottleneck framework for video classification and handles sequences of arbitrary length. Within the video domain, concepts refer to semantic entities such as objects, attributes, or higher-level components (e.g., 'bow', 'mount', 'shoot') that reoccur across time - forming motifs collectively describing and explaining actions. Our design explicitly enables three complementary perspectives: global concept importance across the entire video, local concept relevance within specific windows, and temporal dependencies of a concept over time. Our results demonstrate that the concept-based modeling paradigm can be effectively transferred to video data, enabling a better understanding of concept contributions in temporal contexts while maintaining competitive performance. Code available at github.com/patrick-knab/MoTIF.

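For readers who want a feel for the design, the sketch below is a minimal, hypothetical PyTorch rendering of the idea in the abstract: per-frame features are projected into a concept space, a transformer encoder models how those concept activations interact over time, and a linear head predicts the class. All module names, dimensions, and hyperparameters are illustrative assumptions, not the released MoTIF implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class TemporalConceptBottleneck(nn.Module):
    """Toy concept bottleneck for video: per-frame concept scores are passed
    through a transformer encoder that models their temporal interactions,
    and a linear head maps the pooled concept sequence to class logits."""

    def __init__(self, feat_dim=512, n_concepts=64, n_classes=10,
                 n_heads=4, n_layers=2):
        super().__init__()
        self.to_concepts = nn.Linear(feat_dim, n_concepts)      # per-frame concept scores
        layer = nn.TransformerEncoderLayer(d_model=n_concepts, nhead=n_heads,
                                           batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(n_concepts, n_classes)      # interpretable linear head

    def forward(self, frame_feats):                 # frame_feats: (B, T, feat_dim), any T
        concepts = self.to_concepts(frame_feats)    # (B, T, n_concepts): local concept relevance
        temporal = self.temporal(concepts)          # concept dependencies across time
        pooled = temporal.mean(dim=1)               # global concept importance per video
        return self.classifier(pooled), concepts

# toy usage: 2 videos, 30 frames each, with 512-d frame features
logits, per_frame_concepts = TemporalConceptBottleneck()(torch.randn(2, 30, 512))
```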

Beyond Pixels: Enhancing LIME with Hierarchical Features and Segmentation Foundation Models

Patrick Knab, Sascha Marton, Christian Bartelt

28th European Conference on Artificial Intelligence (ECAI) 2025

LIME (Local Interpretable Model-agnostic Explanations) is a popular XAI framework for unraveling decision-making processes in vision machine-learning models. The technique utilizes image segmentation methods to identify fixed regions for calculating feature importance scores as explanations. Therefore, poor segmentation can weaken the explanation and reduce the importance of segments, ultimately affecting the overall clarity of interpretation. To address these challenges, we introduce the DSEG-LIME (Data-Driven Segmentation LIME) framework, featuring: i) a data-driven segmentation for human-recognized feature generation by foundation model integration, and ii) a user-steered granularity in the hierarchical segmentation procedure through composition. Our findings demonstrate that DSEG outperforms on several XAI metrics on pre-trained ImageNet models and improves the alignment of explanations with human-recognized concepts. The code is available under: https://github.com/patrick-knab/DSEG-LIME

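As a rough illustration of the segment-driven LIME idea (not the DSEG-LIME code itself), the sketch below switches arbitrary segments on and off and fits a local linear surrogate to obtain a score per segment. The toy segment map and dummy scoring function stand in for a segmentation foundation model and a real vision classifier; all names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_over_segments(image, segments, predict_fn, n_samples=200, seed=0):
    """LIME-style importance scores for an arbitrary segment map.
    `segments` is an (H, W) integer mask; in the DSEG setting it would come
    from a segmentation foundation model rather than classic superpixels."""
    rng = np.random.default_rng(seed)
    seg_ids = np.unique(segments)
    masks = rng.integers(0, 2, size=(n_samples, len(seg_ids)))   # random on/off per segment
    baseline = image.mean(axis=(0, 1))                           # grey-out replacement colour
    preds = []
    for m in masks:
        perturbed = image.copy()
        for off_id in seg_ids[m == 0]:
            perturbed[segments == off_id] = baseline             # hide switched-off segments
        preds.append(predict_fn(perturbed))                      # model score for the target class
    surrogate = Ridge(alpha=1.0).fit(masks, np.array(preds))     # local linear surrogate
    return dict(zip(seg_ids.tolist(), surrogate.coef_))          # importance score per segment

# toy usage: four horizontal bands stand in for foundation-model segments
img = np.random.rand(64, 64, 3)
segs = np.arange(64 * 64).reshape(64, 64) // 1024
print(lime_over_segments(img, segs, predict_fn=lambda x: float(x.mean())))
```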

DCBM: Data-Efficient Visual Concept Bottleneck Models

Katharina Prasse*, Patrick Knab*, Sascha Marton, Christian Bartelt, Margret Keuper (* equal contribution)

International Conference on Machine Learning (ICML) 2025

Concept Bottleneck Models (CBMs) enhance the interpretability of neural networks by basing predictions on human-understandable concepts. However, current CBMs typically rely on concept sets extracted from large language models or extensive image corpora, limiting their effectiveness in data-sparse scenarios. We propose Data-efficient CBMs (DCBMs), which reduce the need for large sample sizes during concept generation while preserving interpretability. DCBMs define concepts as image regions detected by segmentation or detection foundation models, allowing each image to generate multiple concepts across different granularities. This removes reliance on textual descriptions and large-scale pre-training, making DCBMs applicable for fine-grained classification and out-of-distribution tasks. Attribution analysis using Grad-CAM demonstrates that DCBMs deliver visual concepts that can be localized in test images. By leveraging dataset-specific concepts instead of predefined ones, DCBMs enhance adaptability to new domains.

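The sketch below is a simplified, hypothetical version of the recipe described above: region features (assumed to come from a segmentation or detection foundation model combined with an image encoder) are clustered into visual concepts, images are represented by their similarity to those concepts, and a linear classifier is trained on the concept activations. All array shapes and function names are illustrative, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def fit_dcbm_style_model(region_feats, image_feats, labels, n_concepts=32, seed=0):
    """Cluster region features into 'concepts', express each image by its
    similarity to every concept, and fit a linear classifier on top so the
    prediction remains a weighted sum of concept activations."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    concepts = KMeans(n_clusters=n_concepts, random_state=seed,
                      n_init=10).fit(region_feats).cluster_centers_    # (n_concepts, d)
    activations = normalize(image_feats) @ normalize(concepts).T       # cosine similarities
    head = LogisticRegression(max_iter=1000).fit(activations, labels)  # interpretable head
    return concepts, head

# toy data standing in for foundation-model region crops and image embeddings
rng = np.random.default_rng(0)
region_feats = rng.normal(size=(500, 128))   # e.g. embeddings of region crops
image_feats = rng.normal(size=(100, 128))
labels = rng.integers(0, 5, size=100)
concepts, head = fit_dcbm_style_model(region_feats, image_feats, labels)
```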

DCBM: Data-Efficient Visual Concept Bottleneck Models

Katharina Prasse*, Patrick Knab*, Sascha Marton, Christian Bartelt, Margret Keuper (* equal contribution)

Conference on Computer Vision and Pattern Recognition (CVPR) @ XAI4CV Workshop 2025

Concept Bottleneck Models (CBMs) enhance the interpretability of neural networks by basing predictions on human-understandable concepts. However, current CBMs typically rely on concept sets extracted from large language models or extensive image corpora, limiting their effectiveness in data-sparse scenarios. We propose Data-efficient CBMs (DCBMs), which reduce the need for large sample sizes during concept generation while preserving interpretability. DCBMs define concepts as image regions detected by segmentation or detection foundation models, allowing each image to generate multiple concepts across different granularities. This removes reliance on textual descriptions and large-scale pre-training, making DCBMs applicable for fine-grained classification and out-of-distribution tasks. Attribution analysis using Grad-CAM demonstrates that DCBMs deliver visual concepts that can be localized in test images. By leveraging dataset-specific concepts instead of predefined ones, DCBMs enhance adaptability to new domains.

