Chemometrics
Stephan Seifert, Lucas Voges
The different analytical methods generate complex data that can be regarded as analytical fingerprints of written artefacts. To exploit these data for artefact profiling, the relevant information has to be extracted, which is why various chemometric approaches are applied. Exploratory methods such as Principal Component Analysis (PCA) are used without prior knowledge to analyse the main differences between the data sets. Machine learning approaches such as Random Forests (RFs), by contrast, focus on differentiating predefined groups, e.g. written artefacts that originate from different locations. Models trained on data from samples with known group membership are subsequently used to classify samples for which these properties are not yet known. Machine learning approaches are also applied to characterise the samples: variables that are characteristic of specific groups are selected, and interactions between variables are analysed to study the sample properties in depth. Furthermore, prior knowledge about the artefacts is incorporated into the analysis to improve classification performance and to test directly for specific properties of written artefacts.
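
The workflow described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the use of NumPy and scikit-learn, the synthetic data, the group labels, and the choice of two informative variables (indices 3 and 7) are all assumptions made for illustration. The sketch shows an unsupervised PCA step, an RF classifier trained on samples with known group membership and applied to held-out samples, and a simple importance ranking as one possible way to find group-characteristic variables.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic "fingerprints": 40 samples x 20 variables from two
# hypothetical origins (illustrative data, not real measurements).
n_per_group, n_vars = 20, 20
X = rng.normal(size=(2 * n_per_group, n_vars))
y = np.array([0] * n_per_group + [1] * n_per_group)

# Assumption: only variables 3 and 7 differ between the two groups.
X[y == 1, 3] += 6.0
X[y == 1, 7] += 6.0

# Exploratory step: PCA sees no group labels and summarises the
# main directions of variation in the data.
pca = PCA(n_components=2)
scores = pca.fit_transform(X)

# Supervised step: train on samples with known group membership,
# then classify held-out samples as if their group were unknown.
train = np.r_[0:15, 20:35]
test = np.r_[15:20, 35:40]
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X[train], y[train])
predicted = rf.predict(X[test])
accuracy = float((predicted == y[test]).mean())

# Characterisation step: rank variables by impurity-based importance
# and keep the two highest-ranked ones.
top_vars = sorted(np.argsort(rf.feature_importances_)[-2:].tolist())

print("PCA scores shape:", scores.shape)
print("Held-out accuracy:", accuracy)
print("Most important variables:", top_vars)
```

Impurity-based importances are only one option for variable selection, and they do not capture interactions between variables; the selection and interaction analyses mentioned in the text may rely on other, more specialised methods.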