Training Parameters for the P2PaLA structure tool (under construction)
These are the structure types that are tagged in Transkribus at region level. Do not use whitespace in structure types, and be careful with case sensitivity: we recommend using only lowercase letters. We also recommend using dashes (-) and underscores (_) as the only special characters, although others may work too.
paragraph heading footnote page-number
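The naming recommendations above can be checked before training. The following Python sketch is only an illustration and not part of the tool: the function name `check_structure_types` is hypothetical, and accepting digits is an assumption, since the text mentions only lowercase letters, dashes and underscores.

```python
import re

# Assumption: a recommended name consists of lowercase letters, digits,
# dashes and underscores only. This mirrors the recommendation in the text;
# it is not a rule known to be enforced by the tool itself.
RECOMMENDED_NAME = re.compile(r"[a-z0-9_-]+")


def check_structure_types(line):
    """Split a whitespace-separated list of structure types.

    Returns (names, warnings), where warnings holds every name that
    does not follow the recommended naming scheme.
    """
    names = line.split()
    warnings = [name for name in names if not RECOMMENDED_NAME.fullmatch(name)]
    return names, warnings


names, warnings = check_structure_types("paragraph heading footnote page-number")
```

Because the list is split on whitespace, a structure type containing a space would silently become two entries, which is one reason to avoid whitespace in the names.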
Merged structure types
Merged structure types are used to treat certain structure types the same as others during training (e.g. ‘footnote-continued’ or ‘footer’ like ‘footnote’). The expected value is a list of entries, each consisting of a structure type followed by a colon and the structure types to merge into it.
For example, regions tagged with ‘footnote-continued’ and ‘footer’ can be regarded as ‘footnote’, while ‘header’ is regarded as ‘heading’ during training.
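To make the merge rule concrete, the following Python sketch parses a merge specification into a lookup table. It assumes that entries are separated by whitespace and that the structure types to merge are separated by commas after the colon. Only the colon is stated in the text, so the other separators, the sample string and the function names are assumptions rather than the tool's documented format.

```python
def parse_merged_types(spec):
    """Map each merged structure type to the type it is treated as.

    Assumed syntax (hypothetical): "target:source1,source2 target2:source3"
    """
    merge_map = {}
    for entry in spec.split():
        target, _, sources = entry.partition(":")
        for source in sources.split(","):
            if source:  # skip empty items such as a trailing comma
                merge_map[source] = target
    return merge_map


def effective_type(structure_type, merge_map):
    """Return the type a region is treated as during training."""
    return merge_map.get(structure_type, structure_type)


# Hypothetical specification matching the example in the text.
spec = "footnote:footnote-continued,footer heading:header"
merge_map = parse_merged_types(spec)
```

With this table, `effective_type("footer", merge_map)` yields `footnote`, while a type that is not merged, such as `paragraph`, is returned unchanged.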
You can specify whether the model should be able to detect (text-)regions, baselines or both.