Beyond Text and Pixels
Spatial Reasoning for Manuscript Exploration
Individual Research Project 10

The aim of this project is to build a system that can answer textual questions based on the content and visual organisation of pages from digitised multigraphic historical manuscripts. An answer can be a direct textual response or a set of retrieved pages relevant to the question. Digitised multigraphic historical manuscripts present a special challenge because they vary widely in layout, script, language and style, and contain many different types of illustrations and decorations. Additionally, the digitised images vary in quality, and the underlying documents can be damaged and degraded. These documents, and the challenges they pose, therefore differ strongly from the material usually studied in document retrieval and visual question answering, such as contemporary business documents. Because the visual organisation of historical multigraphic manuscripts is highly diverse across time and traditions, the project will place a special focus on the spatial relationships between objects on the page, such as illustrations, text regions, decorations and other non-textual objects. This focus will also extend to objects occurring within illustrations, so that the system can answer questions that reference spatial relationships both on the page and inside illustrations. The resulting system will have so-called visual spatial reasoning capabilities.
The scientific novelty of this project lies in the extension of question answering systems to historical multigraphic documents, with a special focus on the spatial reasoning capabilities of the system. The project will explore how visual elements on a page interact to form its semantic meaning and make these insights available for question answering. The resulting system will answer questions based on the content of entire pages, capturing the semantic meaning of the visual elements present on the page. This system will be incorporated into software that helps scholars navigate and work with large manuscripts and collections of manuscripts. In this way, the project will integrate into the work of the Project Group on Navigating Multigraphic Written Artefacts. Being situated in UWA, this doctoral project will take advantage of feedback from experts in a wide variety of disciplines, so that the system can be built to maximise its usefulness for its intended users.