Walk the blobs in the word together with the text string and reject map.
NOTE: All evaluation is done on the baseline normalised word, so that the integer BOX class can be used. The steps are:
A) Try to re-estimate x-ht and caps ht from confirmed pts in word.

   FOR each non reject blob
      IF char is baseline posn ambiguous
         Remove ambiguity by comparing its posn with respect to baseline.
      IF char is a confirmed x-ht char
         Add x-ht posn to confirmed_x_ht pts for word
      IF char is a confirmed caps-ht char
         Add blob_ht to caps ht pts for word

   IF Std Dev of caps hts < 2 (AND # samples > 0)
      Use mean as caps ht estimate (Don't use median as we can expect a
      fair variation between the heights of the NON_AMBIG_CAPS_HT_CHS)
   IF Std Dev of caps hts >= 2 (AND # samples > 0)
      Suspect small caps font.
      Look for 2 clusters, each with Std Dev < 2.
      IF 2 clusters found
         Pick the smaller median as the caps ht estimate of the smallcaps.

   IF failed to estimate a caps ht
      Use the median caps ht if there is one,
      ELSE use the caps ht estimate of the previous word. NO!!!

   IF there are confirmed x-height chars
      Estimate confirmed x-height as the median value
   ELSE IF there is a confirmed caps ht
      Estimate confirmed x-height as a fraction of confirmed caps ht value
   ELSE
      Use the value for the previous word or the row value if this is the
      first word in the block. NO!!!

B) Add in the case-ambiguous blobs based on the confirmed x-ht/caps ht, changing case as necessary. Re-estimate caps ht and x-ht as in A, using the extended clusters.
C) If the word contains rejects, and the x-ht estimate differs significantly from the original estimate, return TRUE so that the word can be rematched.
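
The estimates in steps A to C can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the routine itself: the function names, the population standard deviation, the largest-gap rule for splitting the caps heights into two clusters, the nearest-height rule for case-ambiguous blobs, the x-ht/caps-ht fraction (0.7) and the "significantly differs" tolerance (10%) are all hypothetical choices that the description above does not specify. Only the Std Dev threshold of 2 and the mean/median choices come from the description.

```python
from statistics import mean, median, pstdev

MAX_STD_DEV = 2.0        # threshold given in step A
X_HT_FRACTION = 0.7      # hypothetical x-ht / caps-ht ratio
REMATCH_TOLERANCE = 0.1  # hypothetical reading of "significantly differs"


def estimate_caps_ht(caps_hts):
    """Step A: caps-height estimate for one word, or None without samples."""
    if not caps_hts:
        return None
    if pstdev(caps_hts) < MAX_STD_DEV:
        # Tight cluster: use the mean rather than the median.
        return mean(caps_hts)
    # Wide spread: suspect a small-caps font and look for two tight clusters.
    hts = sorted(caps_hts)
    # Assumed split rule: cut at the largest gap between neighbouring heights.
    cut = max(range(1, len(hts)), key=lambda i: hts[i] - hts[i - 1])
    low, high = hts[:cut], hts[cut:]
    if pstdev(low) < MAX_STD_DEV and pstdev(high) < MAX_STD_DEV:
        return median(low)  # the smaller median is the small-caps height
    return median(hts)      # no two clusters found: median of all samples


def estimate_x_ht(x_hts, caps_ht):
    """Step A: confirmed x-height, or None if nothing in the word confirms it."""
    if x_hts:
        return median(x_hts)
    if caps_ht is not None:
        return X_HT_FRACTION * caps_ht
    return None  # the previous-word fallback is marked "NO!!!" above


def resolve_case(blob_ht, x_ht, caps_ht):
    """Step B, assumed rule: an ambiguous blob takes the nearer height class."""
    return "lower" if abs(blob_ht - x_ht) <= abs(blob_ht - caps_ht) else "upper"


def needs_rematch(has_rejects, new_x_ht, original_x_ht):
    """Step C: True if the word has rejects and the x-ht estimate moved a lot."""
    if not has_rejects or new_x_ht is None:
        return False
    return abs(new_x_ht - original_x_ht) > REMATCH_TOLERANCE * original_x_ht
```

For example, caps heights of 30, 31, 40 and 41 have a Std Dev above 2, split into the clusters {30, 31} and {40, 41}, and give a small-caps height estimate of 30.5.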