Navigating Multigraphic Written Artefacts

Project Group

This project group examines the complex distribution of different sign systems and contents in multigraphic written artefacts (WAs) and develops digital methods that help researchers navigate this complexity. Building on UWA’s sustained engagement with the digital turn in the Humanities, the project complements established advances in text recognition and linguistic semantics by addressing the non-textual dimensions of manuscripts, inscriptions, and related materials. It takes UWA’s concept of visual organisation as its methodological anchor, focusing on the spatial distribution of visual elements such as script, pictures, and other signs, and on the relationships between them. In doing so, the group positions multimodal learning as a central contribution to digital Manuscript Studies, with the explicit aim of scaling expert practices of recognition, comparison, and categorisation across large corpora.

The project brings together researchers from computer science and the humanities to address the challenge of navigating large-scale digitised image collections of multigraphic written artefacts. The collaboration centres on two shared research questions: what constitutes a relevant visual concept in multigraphic written artefacts, and how can computational models represent such concepts in ways that facilitate human interpretation? Humanities scholars identify and define the interpretative categories that guide what specialists look for and compare in these materials, including motifs, scenes, layout cues, script regions, and marginal structures. They also document the reasoning scholars use when they search, browse, compare exemplars, and formulate descriptions. These insights underpin an annotation framework with clear categories and guidance, and with records of uncertainty, sources, and agreement between the project annotators who label the digitised materials. The project also sets shared rules for managing data, covering consistent metadata as well as rights and access, and it considers the implications of computational power consumption for responsible research practice.
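As an illustration of how agreement between annotators might be recorded, the following minimal sketch computes Cohen's kappa for two annotators who label the same regions. The choice of Cohen's kappa, the function name, and the category labels are assumptions made for this example and are not taken from the project's annotation framework.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Both lists must label the same items in the same order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled independently
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four page regions.
annotator_a = ["motif", "script", "margin", "script"]
annotator_b = ["motif", "script", "script", "script"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

A value near 1 indicates strong agreement beyond chance, while a value near 0 suggests that the category definitions or guidance may need revision. Which measure suits categories with graded uncertainty is a separate design question that this sketch does not address.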

Computer scientists translate these scholarly requirements into research on computer vision and vision-language methods. Baseline tasks such as region detection, layout segmentation, element linking, and clustering of recurring patterns provide controlled points of comparison, while multimodal approaches link scholarly descriptions to page-level and region-level evidence. Text-based image retrieval becomes a testbed for relating disciplinary vocabularies to visual concepts at scale, enabling searches for similar elements across thousands of digitised images, and potentially far more. An interactive navigation environment integrates these methods into a unified workflow with an emphasis on usability, sustainability, and long-term maintainability, so that evaluation extends beyond accuracy to include qualitative scholarly utility.
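The retrieval step can be pictured as ranking image regions by their similarity to a text query in a shared embedding space. The sketch below assumes that a vision-language model has already mapped the query and each region to vectors; the hand-written vectors, the region identifiers, and the function names are hypothetical and stand in for whatever model and identifier scheme the project adopts.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_regions(query_vec, region_vecs, top_k=3):
    """Return up to top_k (region_id, score) pairs, most similar first."""
    scored = [(cosine(query_vec, vec), region_id)
              for region_id, vec in region_vecs.items()]
    # Sort by descending score; break ties by region id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(region_id, score) for score, region_id in scored[:top_k]]


# Toy embeddings (assumed, not produced by a real model).
# Each key keeps the region tied to its page context.
region_vecs = {
    "ms1/f3r#roundel2": [0.9, 0.1, 0.0],
    "ms1/f3r#margin": [0.0, 0.2, 0.9],
    "ms2/f10v#roundel1": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded scholarly description
results = rank_regions(query_vec, region_vecs, top_k=2)
```

Keeping the page reference inside each region identifier means that every ranked result can be traced back to its page, which matters more for scholarly use than the exact similarity measure chosen here.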

Case studies drawn from Europe, Africa, and Asia illustrate the range of materials and questions addressed in the project. Examples include French medieval Bible moralisée manuscripts, whose dense and formally repetitive image programmes and tightly structured pages support large-scale comparison of visual organisation and pictorial formulae, and Islamic manuscripts from West Africa held in dispersed digital archives, where automated analysis of layout and marginal space informs subject-oriented classification across heterogeneous collections. A further case study helps to uncover how a large collection of judicial case records from nineteenth-century China was compiled, by automatically categorising the records by paper supplier on the basis of their different pre-printed frames and gridlines.

Evaluation proceeds as both machine- and human-based inquiry. Quantitative measures, for example precision and recall, are paired with qualitative assessments of usefulness: whether systems assist hypothesis formation, support accountable comparison, and surface meaningful clusters. Systems are also expected to make uncertainty explicit (for example, through confidence levels and recorded disagreements) and to keep each result linked to its page context and documented provenance, including collection identifiers, source metadata, and annotation history. The project serves as a proof of concept for widening access to large-scale digitised cultural collections, and it integrates reflection on how such methods shape scholarly and public interaction with these artefacts.
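For the quantitative side, precision and recall can be computed for a single query by comparing the set of retrieved regions with the set that experts judged relevant. The following is a minimal sketch under that assumption; the region identifiers and the function name are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    precision = relevant results retrieved / all results retrieved
    recall    = relevant results retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers: four regions returned, three judged relevant.
retrieved = ["r1", "r2", "r3", "r4"]
relevant = ["r1", "r3", "r5"]
precision, recall = precision_recall(retrieved, relevant)
# Two of the four retrieved regions are relevant, and two of the
# three relevant regions were found.
```

Such figures describe only how well results match prior expert judgements; they do not capture whether a result set helped a scholar form or test a hypothesis, which is why the qualitative assessment remains necessary.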