Institutsseminar

The IPD Institutsseminar is an ongoing course that aims to provide information about current research work at the Institute. In particular, students at the Institute are given the opportunity to report on their Bachelor's and Master's theses in front of a larger audience. The focus is on the problem definition, the solution approaches and the results achieved. The seminar is nevertheless open to all students and employees of KIT as well as other interested parties.

Place Building 50.34, Room 348 or online (see description)
Time On Fridays, 10:30-12:00 / 14:00-15:30

The presentations must adhere to the following time frame:

  • Proposal: 12 minutes speaking time + 8 minutes discussion
  • Bachelor's thesis: 20 minutes speaking time + 10 minutes discussion
  • Master's thesis: 30 minutes speaking time + 15 minutes discussion

If you have any questions or comments, please send an e-mail to the Institutsseminar team.

February 5, 2026 at 14:00
Title
Testing by betting: A tale of P's and E's
Presenter Jose Cribeiro
Meeting Type PhD Colloquium
Supervisor -
Room Room 348 (Building 50.34)
Online No link.
Abstract TBD.
February 6, 2026 at 10:00
Title
Improving the Quality of Temporal n-Gram Corpora using Entity Matching
Presenter Lukas Wilhelm
Meeting Type Bachelor's thesis -- Final presentation
Supervisor Fabian
Room Room 348 (Building 50.34)
Online No link.
Abstract The data quality within the Google Books Ngram Corpus is not sufficient for many purposes. This thesis aims to address common data quality problems such as misspellings, errors due to optical character recognition, and spelling reforms. The main goal is to implement a framework for rule-based entity matching based on active learning from small example sets. Different spellings and variants of the same word can thus be unified into a single canonical time series that covers the word's meaning rather than its individual spelling variants. While a general evaluation of the data quality before and after applying our method is difficult to achieve, both the coverage and the correctness of the generated rules can serve as a sensible proxy.
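
To illustrate the idea of unifying spelling variants into one canonical time series, here is a minimal sketch. It is not the thesis implementation: the rule table `RULES`, the helper `merge_series`, and all counts are hypothetical toy values, and the rules are written by hand here rather than learned through active learning.

```python
from collections import defaultdict

# Hypothetical matching rules: variant spelling -> canonical spelling.
# Hand-written for illustration; the thesis aims to learn such rules.
RULES = {"publick": "public", "pubhc": "public"}


def merge_series(counts, rules):
    """Sum the yearly counts of all variants under their canonical spelling.

    counts maps word -> {year: occurrences}.
    Words without a matching rule are kept under their own spelling.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for word, series in counts.items():
        target = rules.get(word, word)
        for year, occurrences in series.items():
            merged[target][year] += occurrences
    return {word: dict(series) for word, series in merged.items()}


# Invented toy data: an older spelling and an OCR error of "public".
counts = {
    "public": {1800: 5, 1900: 9},
    "publick": {1800: 3},
    "pubhc": {1900: 1},
}
merged = merge_series(counts, RULES)
print(merged)
```

In this toy setting, coverage could be read as the share of variant spellings that some rule maps to a canonical form, and correctness as the share of those mappings that are actually right.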