To start a layout analysis process via the API, send a POST request to:
https://transkribus.eu/TrpServer/rest/LA
The following query parameters are available on this endpoint:
collId: the ID of the collection containing the documents you want to process
doBlockSeg
  true -> existing layout will be deleted
  false (default) -> keep existing text block regions
doLineSeg
  true (default) -> detect lines in text blocks
  false -> keep existing lines
doPolygonToBaseline
  true -> inspect line polygons and add baselines
  false (default) -> keep existing baselines
doBaselineToPolygon
  true -> extrapolate new line polygons from baselines
  false (default) -> skip
jobImpl: the tool to use; the default (omit this parameter) is “TranskribusLAJob”, which is recommended for most documents
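The query parameters listed above can be combined into a request URL. The following is a minimal sketch in Python using only the standard library; the function name `build_la_url` and the collection ID `2` are hypothetical, and the keyword defaults simply mirror the defaults documented above.

```python
from urllib.parse import urlencode

LA_ENDPOINT = "https://transkribus.eu/TrpServer/rest/LA"


def build_la_url(coll_id, do_block_seg=False, do_line_seg=True,
                 do_polygon_to_baseline=False, do_baseline_to_polygon=False,
                 job_impl=None):
    """Build the layout analysis URL (hypothetical helper, not part of the API)."""
    params = {
        "collId": coll_id,
        # Booleans are sent as the lowercase strings "true" / "false".
        "doBlockSeg": str(do_block_seg).lower(),
        "doLineSeg": str(do_line_seg).lower(),
        "doPolygonToBaseline": str(do_polygon_to_baseline).lower(),
        "doBaselineToPolygon": str(do_baseline_to_polygon).lower(),
    }
    # Omitting jobImpl leaves the choice of tool to the server default.
    if job_impl is not None:
        params["jobImpl"] = job_impl
    return f"{LA_ENDPOINT}?{urlencode(params)}"


# Collection ID 2 is a made-up value for illustration.
print(build_la_url(2, do_block_seg=True))
# → https://transkribus.eu/TrpServer/rest/LA?collId=2&doBlockSeg=true&doLineSeg=true&doPolygonToBaseline=false&doBaselineToPolygon=false
```

Sending every flag explicitly, rather than relying on server defaults, keeps the request self-documenting; dropping the flags that match the defaults should be equivalent.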
The request body specifies the pages to be processed, in terms of document IDs and page IDs. Optionally, a transcript ID (tsId) can select a specific transcription version, and PAGE XML region element IDs (regionIds) can be passed to process only specific sections of a page. The endpoint accepts JSON or XML:
{
  "docList" : {
    "docs" : [ {
      "docId" : 1543,
      "pageList" : {
        "pages" : [ {
          "pageId" : 1234,
          "regionIds" : [ "the_xml_id_of_a_text_region" ]
        }, {
          "pageId" : 12345,
          "tsId" : 1234567
        } ]
      }
    } ]
  }
}
Equivalent XML representation:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobParameters>
  <docList>
    <docs>
      <docId>1543</docId>
      <pageList>
        <pages>
          <pageId>1234</pageId>
          <regionIds>the_xml_id_of_a_text_region</regionIds>
        </pages>
        <pages>
          <pageId>12345</pageId>
          <tsId>1234567</tsId>
        </pages>
      </pageList>
    </docs>
  </docList>
</jobParameters>
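The JSON form of the request body can also be assembled programmatically. The sketch below is one possible way to do so in Python; the helper name `build_la_body` and its input format (a list of dicts per page) are assumptions made for illustration, while the key names `docList`, `docs`, `docId`, `pageList`, `pages`, `pageId`, `tsId` and `regionIds` are taken from the example above.

```python
import json


def build_la_body(doc_id, pages):
    """Build the request body for one document (hypothetical helper).

    `pages` is a list of dicts, each with a required "pageId" and
    optional "tsId" and "regionIds" entries.
    """
    page_entries = []
    for page in pages:
        entry = {"pageId": page["pageId"]}
        # Optional: pin a specific transcript version of the page.
        if "tsId" in page:
            entry["tsId"] = page["tsId"]
        # Optional: restrict processing to specific PAGE XML regions.
        if "regionIds" in page:
            entry["regionIds"] = list(page["regionIds"])
        page_entries.append(entry)
    return {
        "docList": {
            "docs": [
                {"docId": doc_id, "pageList": {"pages": page_entries}}
            ]
        }
    }


# Same IDs as in the JSON example above.
body = build_la_body(1543, [
    {"pageId": 1234, "regionIds": ["the_xml_id_of_a_text_region"]},
    {"pageId": 12345, "tsId": 1234567},
])
print(json.dumps(body, indent=2))
```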
If successful (HTTP status code 200), the response will contain a job status object with a jobId that can be used to monitor the progress (see Job API).
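Putting the pieces together, the request could be sent and the jobId read roughly as follows. This is a sketch under several assumptions that this section does not confirm: that the job status object is returned as JSON when an `Accept: application/json` header is sent, that `jobId` is a top-level field of that object, and that authentication (not covered here) is supplied by the caller through `extra_headers`. All function names are hypothetical.

```python
import json
import urllib.request


def make_la_request(url, body, extra_headers=None):
    """Prepare the POST request without sending it."""
    headers = {
        "Content-Type": "application/json",
        # Assumption: the server honours this and answers with JSON.
        "Accept": "application/json",
    }
    # Assumption: authentication headers are provided by the caller.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_job_id(response_text):
    """Read the jobId from a JSON job status object (assumed layout)."""
    status = json.loads(response_text)
    return status["jobId"]


def start_layout_analysis(url, body, extra_headers=None):
    """Send the request and return the jobId of the started job."""
    request = make_la_request(url, body, extra_headers)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # so reaching the next line means the request succeeded.
    with urllib.request.urlopen(request) as response:
        return extract_job_id(response.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the URL, headers and body before anything reaches the server.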