Data Linking Infrastructure
Foundations and Architecture
2019–2025
RFF01
The Data Linking research field has developed an innovative infrastructure to support humanities scholars across all research areas of the Cluster of Excellence Understanding Written Artefacts (UWA). The core research question was how diverse types of research data, from textual, visual, and spatial data to materials-science and spectral analyses, can be systematically connected to enable new forms of interdisciplinary research. The datasets considered originate from UWA and affiliated projects on written artefacts and include digitised images and videos, textual transcriptions (including OCR-derived texts), analytical measurements (e.g., FTIR), and descriptive temporal and spatial research data. The overarching goal has been to build a robust, interoperable data linking infrastructure that supports the FAIR (Findable, Accessible, Interoperable, Reusable) principles and fosters new research perspectives by linking heterogeneous datasets. To achieve this, the Data Linking field has implemented methods for data harmonisation, automated ingestion, and federated access across project-specific information systems. The resulting infrastructure enables integrated searches and analyses assisted by Large Language Models (LLMs) through the Humanities-Aligned Chatbot (ChatHA), which provides natural-language interfaces for querying research data and related publications. These developments support interoperability and reuse of data in accordance with the FAIR principles and promote cross-domain collaboration between the humanities, the social sciences, and the computational sciences. By combining robust data management, semantic linking, and LLM-based tools, the research field contributes significantly to the Cluster’s goal of establishing a sustainable, interoperable digital research environment for the study of written artefacts.
People
Project lead: Ralf Möller
Research Associates: Thomas Asselborn, Marcel Gehrke, Florian Marwitz, Sylvia Melzer, Simon Schiff
More Information
Long-term Data Preservation and Accessibility
We integrated UWA research data into the Research Data Repository (RDR), enhancing it with new preview classes and visualisation tools so that researchers can assess the relevance of stored data for future investigations.
Persistent and Citable Research Data
We implemented facilities for data citation at the element level: researchers can now cite individual data objects (e.g., segments of a dataset) derived from visualisations or linked datasets, which fosters scientific transparency and exchange.
Linked Data and FAIR via Federated Search
We realised a federated search infrastructure, also accessible through chatbots based on Large Language Models (LLMs), that permits exploration of linked datasets across different formats. By transferring heterogeneous research data into browser-based information systems, we ensured broad interoperability and usability.
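The federated pattern described above can be illustrated with a minimal Python sketch. It is not the infrastructure's actual implementation: the two in-memory sources, their record layouts, and the names `Hit`, `search_transcriptions`, `search_images`, and `federated_search` are all hypothetical and serve only to show how one query can be sent to differently structured sources and the results merged into a common shape.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hit:
    """Common result shape shared by all sources (hypothetical)."""
    source: str
    identifier: str
    label: str


# Hypothetical sample data: each source keeps its own record layout.
TRANSCRIPTIONS = [
    {"id": "t1", "text": "Bronze inscription with donor formula"},
    {"id": "t2", "text": "Palm-leaf manuscript colophon"},
]
IMAGES = [
    ("img-7", "Photograph of a bronze inscription"),
    ("img-9", "Multispectral scan of a papyrus"),
]


def search_transcriptions(query: str) -> List[Hit]:
    """Adapter for the dictionary-based source."""
    needle = query.lower()
    return [
        Hit("transcriptions", record["id"], record["text"])
        for record in TRANSCRIPTIONS
        if needle in record["text"].lower()
    ]


def search_images(query: str) -> List[Hit]:
    """Adapter for the tuple-based source."""
    needle = query.lower()
    return [
        Hit("images", identifier, caption)
        for identifier, caption in IMAGES
        if needle in caption.lower()
    ]


def federated_search(
    query: str, adapters: List[Callable[[str], List[Hit]]]
) -> List[Hit]:
    """Send one query to every adapter and merge the normalised hits."""
    hits: List[Hit] = []
    for adapter in adapters:
        hits.extend(adapter(query))
    return hits


results = federated_search("bronze", [search_transcriptions, search_images])
for hit in results:
    print(f"{hit.source}: {hit.identifier} - {hit.label}")
```

Each adapter hides the format of its source, so in this sketch a further source would only need another function with the same signature.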
Integration into Interdisciplinary Research
The technologies developed in Research Field F have been adopted in multiple projects: humanities scholars working with artefact profiling and materials-science analyses now routinely use the infrastructure to combine image or video data, textual transcriptions, spatial or temporal research data, and spectroscopy results.
ChatHA
Searching a large collection of texts, for example all publications in the Forschungsinformationssystem (FIS) and other sources such as the Research Data Repository (RDR), is time-consuming. In addition, articles may be overlooked even though they are relevant to the research question at hand. To help researchers find information efficiently, ChatHA (Humanities-Aligned Chatbot) was developed: it takes a selection of texts and automatically builds a chatbot interface around them, which can then be queried for the relevant information in those texts.
See also https://www.philosophie.uni-hamburg.de/chai/ueber-das-institut/highlightsuwa.html
Demos
- EDAK: https://csmc-view.chai.uni-hamburg.de/view/https%3A//staging-rdm.fdr.uni-hamburg.de/records/0aevs-xp230/files/edak_beta_2025.2.csmc
- MprinT: https://csmc-view.chai.uni-hamburg.de/view/https://staging-rdm.fdr.uni-hamburg.de/records/8nqv8-3jv49/files/mprint.csmc/index.html
- Netamil 2: https://csmc-view.chai.uni-hamburg.de/view/https://staging-rdm.fdr.uni-hamburg.de/records/2ca24-h9933/files/tamilex.csmc/index.html
- FedBook: https://csmc-view.chai.uni-hamburg.de/view/https%3A//staging-rdm.fdr.uni-hamburg.de/records/8sp6d-hjn82/files/federated.csmc
- TAMAR: https://tools.fdm.uni-hamburg.de/tamar/
- Gilgit: https://heurist.fdm.uni-hamburg.de/html/heurist/?db=CSMC_UWA_BuddhIndo&website&id=22
Cooperations
- Close cooperation with all research fields (RFA, RFB, RFC, RFD, RFE, FNT)
- Ethics group at CSMC (R. Möller, S. Thiemann, S. Melzer)
- Dr Ian Johnson (HEURIST designer) and Dr Michael Falk (Community Technical Adviser), The University of Sydney, Faculty of Arts and Social Sciences, Australia
- University of Glasgow, Dr Luca Guariento (Research Systems Developer)
- Gesellschaft für Systems Engineering (GfSE)
- University of Freiburg & Shandong University, Dr Haiyan Hu-von Hinüber (Gilgit Bronze Inscriptions)
- NFDI
- Universität Regensburg, Prof. Meike Klettke (Mentoring)
- Universität Koblenz, Dr. Jens Dörpinghaus (OCR)
- MprinT, Prof. Scott Reese
- Comité Argentino de Estudios Bizantinos
Public Outreach and/or Knowledge Transfer
Activities
- Prompting in the Humanities with ChatGPT & Co
Melzer, S. (Organiser) & Asselborn, T. (Organiser)
17.11.2025
- Large Language Models for Research Data Management?!
Melzer, S. (Organiser), Möller, R. (Organiser), Thiemann, S. (Organiser) & Bender, M. (Organiser)
18.09.2025
- 5th Workshop on Humanities-Centred Artificial Intelligence: 48th German Conference on Artificial Intelligence
Melzer, S. (Organiser), Thiemann, S. (Organiser), Peukert, H. (Organiser) & Bender, M. (Organiser)
16.09.2025
- Language Models in Humanities
Asselborn, T. (Speaker), Bender, M. (Co-author), Marwitz, F. (Speaker), Melzer, S. (Speaker) & Möller, R. (Co-author)
16.07.2025
- Managing Datasets in the Digital Age
Asselborn, T. (Speaker), Bender, M. (Co-author), Marwitz, F. (Speaker), Melzer, S. (Speaker) & Möller, R. (Co-author)
16.07.2025
- Data Linking Services, Management and Architecture
Melzer, S. (Speaker), Asselborn, T. (Speaker), Marwitz, F. (Speaker), Thiemann, S. (Co-author) & Möller, R. (Co-author)
11.07.2025
- Enhancing OCR using Large Language Models
Asselborn, T. (Speaker), Bender, M. (Co-author), Melzer, S. (Speaker), Dörpinghaus, J. (Co-author) & Möller, R. (Speaker)
11.07.2025
- Generative AI for interactive learning in education
Melzer, S. (Speaker), Asselborn, T. (Speaker) & Möller, R. (Co-author)
11.07.2025
- Linking Research Data and Visualisation
Melzer, S. (Speaker), Asselborn, T. (Speaker), Marwitz, F. (Speaker), Möller, R. (Co-author), Bender, M. (Co-author), Reese, S. (Co-author) & Bang, A. K. (Co-author)
11.07.2025
- Natural Language Processing for Federated Information Retrieval in the Humanities
Asselborn, T. (Speaker), Melzer, S. (Speaker), Möller, R. (Co-author), Dal Sasso, E. (Co-author), Li, C. (Co-author) & Peukert, H. (Co-author)
11.07.2025
- stellenwerk Jobmesse [stellenwerk job fair]
Özcep, Ö. L. (Organiser), Asselborn, T. (Co-organiser), Melzer, S. (Co-organiser) & Marwitz, F. (Co-organiser)
17.06.2025
- Unterstützung eines Schulpraktikums am CHAI-Institut [Supporting a school internship at the CHAI institute]
Melzer, S. (Other), Asselborn, T. (Other), Marwitz, F. (Other) & Möller, R.
22.04.2025 → 14.05.2025
- Data Curation as a Stepwise Service to Data Sustainability: The Grey Area between Small-Scale Applications and Large-Scale Data Repositories
Asselborn, T. (Speaker), Peukert, H. (Co-author), Melzer, S. (Co-author) & Möller, R. (Speaker)
09.04.2025
- Girls' & Boys' Day 2025
Asselborn, T. (Organiser), Droese, J. (Organiser) & Melzer, S. (Organiser)
03.04.2025
- ChatHA: Dein smarter Chatbot für jedes Thema [ChatHA: your smart chatbot for every topic]
Marwitz, F. (Organiser), Asselborn, T. (Co-organiser), Bender, M. (Co-organiser) & Melzer, S. (Co-organiser)
25.02.2025
- Berufsvorstellung: Informatiker:in und Musikwissenschaftler:in [Career presentation: computer scientist and musicologist]
Asselborn, T. (Speaker), Droese, J. (Speaker) & Melzer, S. (Speaker)
29.01.2025
- The 28th European Conference for South Asian Studies (did not take place due to few contributions)
Melzer, S. (Organiser) & Hu-von Hinüber, H. (Organiser)
2025
- Research Data Management 4.0 in the Humanities
Melzer, S. (Speaker) & Möller, R. (Co-author)
19.11.2024
- Data Linking Workshop 2024: Dataset Provision and Citation in the Digital Age
Möller, R. (Organiser), Asselborn, T. (Organiser), Bender, M. (Organiser), Marwitz, F. (Organiser) & Melzer, S. (Organiser)
18.10.2024
- 4th Workshop on Humanities-Centred Artificial Intelligence: 47th German Conference on Artificial Intelligence
Melzer, S. (Organiser), Thiemann, S. (Organiser), Peukert, H. (Organiser) & Radisch, E. (Organiser)
23.09.2024
- KI ZUM ANFASSEN - (WIE) GEHT DAS? [Hands-on AI: (how) does that work?]
Nantke, J. (Organiser), Zinsmeister, H. (Organiser), Möller, R. (Organiser), Asselborn, T. (Co-organiser), Benz, N. (Co-organiser), Flüh, M. (Co-organiser), Jablotschkin, S. E. (Co-organiser), Jurkiewicz-Rohrbacher, E. (Co-organiser), Kaczmarczyk, M. (Co-organiser), Klomfaß, V. (Co-organiser), Melzer, S. (Co-organiser), Sökefeld, C. (Co-organiser) & Stulen, A. (Co-organiser)
03.07.2024
- Girls' & Boys' Day 2024
Asselborn, T. (Organiser), Heiles, M. (Organiser) & Melzer, S. (Organiser)
25.04.2024
- The 18th Annual IEEE International Systems Conference 2024
Melzer, S. (Session chair)
15.04.2024 → 18.04.2024
- 3rd Workshop on Humanities-Centred Artificial Intelligence: 46th German Conference on Artificial Intelligence
Melzer, S. (Organiser), Thiemann, S. (Organiser) & Peukert, H. (Organiser)
26.09.2023
- Building Information Systems on Demand with ChatGPT?
Melzer, S. (Speaker)
28.06.2023
- Data Linking Workshop 2023: Computer Vision and Natural Language Processing – Challenges in the Humanities
Melzer, S. (Organiser) & Hu-von Hinüber, H. (Organiser)
27.06.2023 → 28.06.2023
- What the Buddhological Epigraphy can expect from the AI: The Information System "Buddhist Bronzes Inscriptions"
von Hinüber, O. (Speaker), Hu-von Hinüber, H. (Speaker) & Melzer, S. (Speaker)
27.06.2023
- Federated search in epigraphic databases using EpiDoc
Harter-Uibopuu, K. (Co-author), Möller, R. (Co-author), Melzer, S. (Guest speaker), Weise, F. (Guest speaker) & Klettke, M. (Co-author)
25.04.2023
- Doctoral Symposium at the 25th International Symposium on Formal Methods (FM 2023)
Ahrendt, W. (Organiser), Möller, R. (Organiser) & Melzer, S. (Chair)
06.03.2023
- Seminar on Research Data Management
Melzer, S. (Speaker), Thiemann, S. (Co-author), Möller, R. (Co-author) & Helmholz, K. (Co-author)
27.01.2023
- Databasing on Demand and Federated Search in Manuscript Databases
Melzer, S. (Speaker), Schiff, R. S. (Co-author) & Möller, R. (Co-author)
03.11.2022
- Databasing on Demand and Federated Search in Manuscript Databases
Melzer, S. (Guest speaker)
03.11.2022
- The University of Sydney
Melzer, S. (Visiting scholar)
27.09.2022 → 08.11.2022
- Humanities-Centred Artificial Intelligence (CHAI): 45th German Conference on Artificial Intelligence
Melzer, S. (Organiser), Thiemann, S. (Organiser) & Peukert, H. (Organiser)
19.09.2022
- Tag der offenen Tür am CSMC [Open day at the CSMC]
Melzer, S. (Speaker)
10.06.2022 → 11.06.2022
- Storytelling with Heurist: Databasing on Demand
Melzer, S. (Speaker)
31.05.2022 → 03.06.2022
- Federated Search in Manuscript Databases
Melzer, S. (Speaker)
04.03.2022
- KerML, DOL und SysML: Chancen und Grenzen von formalen Sprachen bei der Modellierung von Systemen unter Berücksichtigung des Status Quo von SysML V1.x [KerML, DOL and SysML: opportunities and limits of formal languages in system modelling, taking into account the status quo of SysML V1.x]
Priglinger, S. (Speaker) & Melzer, S. (Speaker)
12.11.2021
- Presentation of the working group FAS
Melzer, S. (Speaker) & Weilkiens, T. (Co-author)
10.11.2021
- Mentoring programme for female postdoc researchers 2021/2022
Melzer, S. (Participant)
01.10.2021 → 28.04.2022
- Humanities-Centred Artificial Intelligence (CHAI): 44th German Conference on Artificial Intelligence
Melzer, S. (Organiser), Thiemann, S. (Organiser) & Gippert, J. (Organiser)
28.09.2021
- Early Career Research Consortium 2021 at KI2021
Melzer, S. (Organiser)
21.09.2021
- DOL, SysML & FAS: Chancen und Grenzen von (semi-)formalen Sprachen bei der Entwicklung von funktionalen Architekturen von Systemen [DOL, SysML & FAS: opportunities and limits of (semi-)formal languages in developing functional architectures of systems]
Melzer, S. (Speaker) & Priglinger, S. (Speaker)
16.09.2021
- Databasing on Demand: HEURIST for Humanities Researchers
Melzer, S. (Speaker) & Möller, R. (Co-author)
12.07.2021
- Epidoc Archiving and Viewing
Melzer, S. (Speaker) & Möller, R. (Co-author)
12.07.2021
- Data Linking Study Day 2021
Möller, R. (Organiser) & Melzer, S. (Organiser)
21.06.2021
- Episode 8: What do formal modeling languages have to do with Winnie-the-Pooh? (in German)
Weilkiens, T. (Speaker), Muggeo, C. (Speaker) & Melzer, S. (Guest speaker)
08.03.2021
- Simulation of Database Interactions 4 early Validation of Digitized Enterprise Processes
Melzer, S. (Speaker)
24.02.2021
- HEURIST – Create Your Own Databases
Melzer, S. (Speaker)
15.06.2020
CSMC news
- Humanities-Centred Artificial Intelligence
https://www.csmc.uni-hamburg.de/news/2025-10-08-chai.html
- Data Linking Team Wins Best Young Research Paper Award
https://www.csmc.uni-hamburg.de/news/2025-09-24-fedcsis.html
- Casting a New Light on Ancient Inscriptions
https://www.csmc.uni-hamburg.de/news/2025-09-23-edak.html
- MprinT Information System Released
https://www.csmc.uni-hamburg.de/news/2025-07-07-mprint-information-system.html
- New Web Tool: Transcriptions and Annotations for Manuscript Research
https://www.csmc.uni-hamburg.de/news/2025-06-10-tamar.html
- Humanities-Centred Artificial Intelligence
https://www.csmc.uni-hamburg.de/news/2024-11-04-chai-2024.html
- Looking Back: Girls’ Day and Boys’ Day 2024
https://www.csmc.uni-hamburg.de/about/blog/2024-05-03-girls-boys-day.html
- Humanities-Centred Artificial Intelligence
https://www.csmc.uni-hamburg.de/news/2023-12-08-chai-2023.html
- Humanities-Centred Artificial Intelligence
https://www.csmc.uni-hamburg.de/news/2022-12-20-chai.html
- Week of AI in Lübeck
https://www.csmc.uni-hamburg.de/news/2022-11-09-week-of-ai.html
- Welcome on Board, Pepper!
https://www.csmc.uni-hamburg.de/news/2022-05-06-pepper.html
- Humanities-Centred Artificial Intelligence
https://www.csmc.uni-hamburg.de/news/2022-02-28-chai.html
- 5 Questions to... Sylvia Melzer
https://www.csmc.uni-hamburg.de/about/blog/2022-09-27-sylvia-melzer.html
- Data Linking Workshop on 16 November 2020
https://www.csmc.uni-hamburg.de/news/2020-10-27-data-linking-workshop.html
Publications
All publications (124 in total with 3 FTEs at a time) can be found at https://www.csmc.uni-hamburg.de/research/cluster-projects/field-f/publications.html
Future publications (in preparation, submitted, accepted)
- Thomas Asselborn, Magnus Bender, Jens Dörpinghaus, Ralf Möller, Sylvia Melzer
Enhancing Text Recognition of Damaged Documents through Synergistic OCR and Large Language Models (submitted, Springer)
- Sylvia Melzer, Thomas Asselborn, Meike Klettke, Franziska Weise, Kaja Harter-Uibopuu, Ralf Möller
Federated Information Retrieval with TEI and EpiDoc Data Matching (submitted, Springer)
- Marwitz et al.
Lifted Forward Planning in Relational Factored Markov Decision Processes with Concurrent Actions (submitted, The 25th International Conference on Autonomous Agents and Multi-Agent Systems)
- Melzer et al.
Family Tree Identification with Large Language Models (submitted, Manuscript Cultures)
- Asselborn et al.
ChatHA – From an Information Retrieval Agent to a System of Expert Agents (submitted, Manuscript Cultures)
- Melzer et al.
Natural Language Processing for Federated Information Retrieval in the Humanities (submitted, Manuscript Cultures)
- Asselborn et al.
Detecting the Aura of an Article: Was the Original Manuscript Seen? (submitted, Manuscript Cultures)
- Melzer et al.
Designing Information Systems for Research Data Repositories by Converting Diverse Data Formats to Enable Interactive Visualization (submitted, Manuscript Cultures)
- Marwitz et al.
Managing Datasets and Citation in the Digital Age (submitted, Manuscript Cultures)
- Marwitz et al.
What do We Research? Unsupervised Topic Analysis in the Humanities (submitted, Manuscript Cultures)
Dissertations
- S. Schiff (2024) Enabling Research Data Management for Non IT-Professionals
https://www.fis.uni-hamburg.de/publikationen/detail.html?id=bdc13f65-6073-4d98-9f2b-5d80521dd02f
- M. Gehrke (2021) Taming Exact Inference in Temporal Probabilistic Relational Models
https://www.fis.uni-hamburg.de/publikationen/detail.html?id=8e5f6aa8-5b7b-4ea4-8845-8b11aec0f128
- Th. Asselborn (2026) Forthcoming
- F. Marwitz (2026) Forthcoming
