Character Error Rate (CER)

The Character Error Rate (CER) relates, for a given page, the minimum number of character insertions (i), substitutions (s) and deletions (d) required to obtain the Ground Truth result to the total number of characters (n), including spaces. It is expressed as a percentage and calculated as follows: CER = [ (i + s + d) / n ] * 100

Every transcription error, however small, counts as a full error in the statistic. A missing comma, a “u” instead of a “v”, an additional space, or an uppercase letter instead of a lowercase letter each count toward the CER.
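
As an illustration, the calculation can be sketched in a few lines of Python. This is a minimal sketch and not the code of any particular tool: the function names edit_distance and cer are hypothetical, the minimum number of edits is obtained with the standard Levenshtein dynamic-programming method, and n is assumed to be the number of characters in the Ground Truth, including spaces.

```python
def edit_distance(hyp: str, ref: str) -> int:
    """Minimum number of character insertions, substitutions and
    deletions needed to turn the transcription `hyp` into `ref`."""
    # prev[j] holds the distance between the processed prefix of hyp
    # and the first j characters of ref.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # delete h from the transcription
                curr[j - 1] + 1,     # insert r into the transcription
                prev[j - 1] + cost,  # substitute h by r (or keep a match)
            ))
        prev = curr
    return prev[-1]


def cer(hyp: str, ref: str) -> float:
    """CER in percent: 100 * (i + s + d) / n.

    Assumption: n is the character count of the Ground Truth `ref`,
    spaces included.
    """
    n = len(ref)
    if n == 0:
        raise ValueError("Ground Truth must contain at least one character")
    return 100 * edit_distance(hyp, ref) / n


# Ground Truth with 10 characters, including the comma and the space.
ground_truth = "ave, maria"

# Each transcription below differs from the Ground Truth by one edit.
examples = {
    "u instead of v": "aue, maria",
    "missing comma": "ave maria",
    "additional space": "ave,  maria",
    "uppercase letter": "Ave, maria",
}

for label, transcription in examples.items():
    print(f"{label}: {cer(transcription, ground_truth):.1f} %")
```

Under these assumptions, each of the four transcriptions contains exactly one error against a 10-character Ground Truth, so each yields a CER of 10 %. The comparison is deliberately case-sensitive and whitespace-sensitive, which matches the rule that every such deviation counts as an error.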

Figure 1 CER on Train Set and on Validation Set

