Text2Image Parameters

thresh

    Threshold for text alignment. If the confidence of a text-to-image
    alignment is above this threshold, the alignment is made (default = 0.0).
    A good value is between 0.01 and 0.05. Note that the confidence is stored
    in the pageXML anyway, so text alignments with low confidence can also
    be deleted later.
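Since the confidence is kept alongside each alignment, the threshold can also be applied after the fact. The following is only a minimal sketch of that idea: it assumes the alignments have already been read into (text, confidence) pairs, and the function name and data layout are hypothetical, not part of the tool.

```python
from typing import List, Tuple

Alignment = Tuple[str, float]  # (aligned text, confidence); hypothetical layout


def filter_alignments(alignments: List[Alignment], thresh: float = 0.0) -> List[Alignment]:
    """Keep only alignments whose confidence is strictly above thresh."""
    return [(text, conf) for text, conf in alignments if conf > thresh]


# Made-up values, for illustration only.
lines = [("first line", 0.40), ("noisy line", 0.005), ("third line", 0.03)]
kept = filter_alignments(lines, thresh=0.02)
```

With a threshold of 0.02, the entry with confidence 0.005 is dropped and the other two are kept.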

hyphen

Can be null, a non-negative double value, or a JSON string (default:
null). If no value is set or the value is "Infinity", no hyphenation is done.
If the value is a positive double, it is the additional cost of
recognizing a hyphenation. The default hyphenation signs at the end of
a line are '¬', '-', ':' and '='. By default, there are no hyphenation
signs at the beginning of a line. Hyphenation is possible between all
letter pairs. To use the hyphenation rules of a specific
language, configure the key 'hyphen_lang'.
The hyphenation sign in the ground truth will be '¬'.
For further configuration, a JSON string has to be written.
Keys:
prefixes: list of hyphenation signs that can occur at the
beginning of a line (default: empty)

suffixes: list of hyphenation signs that can occur at the end of
a line (default: empty)

skipSuffix: boolean, whether the suffix is optional (true) or forced (false)
(default: false)

skipPrefix: boolean, whether the prefix is optional (true) or forced (false)
(default: false)

hypCosts: non-negative value that is added as additional cost for
recognizing a hyphenation (default: 0.0)

pattern: language pattern (e.g. EN_GB, EN_US, DE, ES, FR, …)
(default: empty)

Example:
{
  "skipSuffix": false,
  "skipPrefix": true,
  "suffixes": ["¬", "-", ":", "\u003d"],
  "prefixes": [":", "\u003d"],
  "hypCosts": 6.0,
  "pattern": "EN_GB"
}

One of the four suffixes has to be recognized, and one of the two
prefixes can be recognized (\u003d is the JSON escape for '=').
A hyphenation cost of 6.0 is added.
Hyphenation is only possible as defined for the language EN_GB.
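Writing the JSON string by hand is error-prone, so it can help to generate it. The sketch below builds the same configuration as the example with Python's standard json module. How the resulting string is handed to the tool is not covered here, and the variable names are illustrative only.

```python
import json

# Same keys and values as in the example above.
hyphen_config = {
    "skipSuffix": False,                # a suffix is forced
    "skipPrefix": True,                 # a prefix is optional
    "suffixes": ["¬", "-", ":", "="],
    "prefixes": [":", "="],
    "hypCosts": 6.0,
    "pattern": "EN_GB",
}

# ensure_ascii=False keeps '¬' readable instead of escaping it.
hyphen_value = json.dumps(hyphen_config, ensure_ascii=False)

# Round trip to confirm the string is valid JSON.
parsed = json.loads(hyphen_value)
```

Python's False and True are serialized as the JSON literals false and true, and a literal '=' is equivalent to the escaped form \u003d once parsed.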

hyphen_lang

    If hyphen is given, hyphenation rules for a specific language can be
    applied. If the value is null or empty, a line break is possible between
    all letters (Unicode characters of category L). Otherwise, the rules of
    the given language are applied (see
    https://github.com/mfietz/JHyphenator.git for details). The language is
    given as a code, e.g. "DE" for German or "EN" for English. Default = null.

skip_word

    Makes it possible to skip a word, for example if a baseline is too short
    (default: null). The value has to be a positive double value. It
    represents the default deletion cost for each character. A good value is
    4.0. The higher the value, the fewer words are skipped. If value = 0, a
    word can be deleted at no cost (which breaks the algorithm); if
    value = Infinity, no characters can be deleted.

skip_bl

    Makes it possible to skip a baseline (default: null). Sometimes the
    layout analysis (LA) finds a baseline in noise (a false positive). Such
    baselines can be deleted instead of "pressing" a sequence into the line.
    The value has to be a positive double value. The lower the value, the
    more easily a line is ignored. A good value is 0.2.

jump_bl

    Makes it possible to handle a wrong reading order from the LA (default:
    null). The value makes it possible to jump from the end of a line to any
    other line instead of only to the following one. The lower the value, the
    less the reading order matters: if value = 0, the reading order has no
    effect at all. Setting value = Infinity is the same as value = null. If
    you cannot trust the reading order, set value = 0.

best_pathes

    If the number of confmats and references gets too large, one can keep
    only a specific number of paths at each reference. By default, all paths
    are calculated (like setting value = Infinity). A good value is 200.0.
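The idea of keeping only a fixed number of paths can be illustrated with a generic pruning step. This is a sketch and not the tool's actual implementation: it assumes each path carries a cost where lower is better, and the function name and data layout are hypothetical.

```python
import heapq
from typing import List, Tuple

Path = Tuple[float, str]  # (accumulated cost, label); hypothetical layout


def prune_paths(paths: List[Path], best_paths: float = float("inf")) -> List[Path]:
    """Return at most best_paths paths with the lowest cost, cheapest first."""
    if best_paths == float("inf") or len(paths) <= best_paths:
        # No limit in effect: keep every path.
        return sorted(paths)
    return heapq.nsmallest(int(best_paths), paths)


# Made-up costs, for illustration only.
candidates = [(3.2, "a"), (0.5, "b"), (1.7, "c"), (2.4, "d")]
top_two = prune_paths(candidates, best_paths=2)
```

A smaller limit reduces the work done at each reference, at the risk of discarding a path that would have turned out best later on.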
