Todo List

Page Hacking Tesseract V1.03
Fix broken references:
78: Warning: unable to resolve reference to `tess_api' for command
139: Warning: unable to resolve reference to `stacktraces' for command

Member word_blob_quality
In word_blob_quality(), is an outword a word on the edge of a block, whereas an inword is any word within a block?

Member compute_height_modes
Assuming that the baseline represents the median height, ascenders would be one "mode" and descenders would be a second "mode". Since different ascenders/descenders "ascend"/"descend" only so high/low, there may be half a dozen "modes"... but why is the limit (MAX_HEIGHT_MODES) so high (12)?
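
To make the notion of a height "mode" concrete, here is a minimal sketch that picks the most frequent peaks from a histogram of blob heights, capped at a maximum count. It is NOT Tesseract's compute_height_modes(); the function name TopHeightModes, the peak rule and the tie-breaking are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch, not Tesseract code.
// counts[h] is the number of blobs whose height is h pixels.
// Returns up to max_modes heights, most frequent peak first.
std::vector<int> TopHeightModes(const std::vector<int>& counts,
                                std::size_t max_modes) {
  std::vector<std::pair<int, int> > peaks;  // (count, height)
  for (std::size_t h = 0; h < counts.size(); ++h) {
    const int left = h > 0 ? counts[h - 1] : 0;
    const int right = h + 1 < counts.size() ? counts[h + 1] : 0;
    // A bin is a peak if it is non-empty, strictly above its left
    // neighbour and at least as high as its right neighbour.
    if (counts[h] > 0 && counts[h] > left && counts[h] >= right) {
      peaks.push_back(std::make_pair(counts[h], static_cast<int>(h)));
    }
  }
  // Sort by descending count; ties go to the smaller height.
  std::sort(peaks.begin(), peaks.end(),
            [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
              return a.first != b.first ? a.first > b.first
                                        : a.second < b.second;
            });
  if (peaks.size() > max_modes) peaks.resize(max_modes);

  std::vector<int> modes;
  for (std::size_t i = 0; i < peaks.size(); ++i) {
    modes.push_back(peaks[i].second);
  }
  return modes;
}
```

With a cap of 12, a histogram that has only three peaks simply yields three modes, so in this sketch a generous limit costs nothing. Whether that is the reason for the value of MAX_HEIGHT_MODES is still an open question.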

Member vigorous_noise_removal
Is it ONLY done in preparation for restore_underlined_blobs()?

Member AddToNormProtosList
Not sure if it adds ALL of them

Member ParseArguments
The usage printed on error is out of sync with above

Member ParseArguments
The usage printed on error is out of sync with above

Member WriteNormProtos
Ask Ray for definition of significant and insignificant features and why this is needed to begin with (guess: extract "decisive" features from "fluff" caused by the many different fonts used in training)

Member EvidenceOf
This needs a higher-level explanation of what it does and how

fmg: What effect does decreasing/increasing this have?

Page Glossary of OCR terms (as used in Tesseract) V0.04
Add reference to entry to read after Edge Detection

while callpicofeat() is called, it's from a somewhat unexpected place

above really needs a BS-check, courtesy Ray Smith.

Page Glossary of OCR terms (as used in Tesseract) V0.04
Some of these questions (which were for myself) need answers and/or BS-check by someone more knowledgeable.

Page Glossary of OCR terms (as used in Tesseract) V0.04
Need to verify the exact relationship between color transition & sign

Page Glossary of OCR terms (as used in Tesseract) V0.04
there have been reports to the contrary on the forums on this exact issue - need to check why

Page Glossary of OCR terms (as used in Tesseract) V0.04
While InitMicroFXVars() is called, microfeatures are never extracted...

Page Glossary of OCR terms (as used in Tesseract) V0.04
next part needs work, hard-hat & barf-bag area

Page Glossary of OCR terms (as used in Tesseract) V0.04
How the heck does MySqrt2() work? What's the multiplication by '41943' for?

Page Glossary of OCR terms (as used in Tesseract) V0.04
The list should describe whether each object: is temporary (i.e., bucket); can be de/serialized (?block?); exists at the same time as another object but for a different purpose (outlines & blobs), and what those different purposes are; is derived from the user's image or from developer/pre-training; serves a pivotal/conceptually important derivation; etc.

Page Glossary of OCR terms (as used in Tesseract) V0.04
Q: What's the difference between seam and edge (A: Edges are pre-outline while seams are post-row)?

Page Glossary of OCR terms (as used in Tesseract) V0.04
Put "i" dot ref and note here.

Page Glossary of OCR terms (as used in Tesseract) V0.04
Fix broken references:

Page DAWG = Directed Acyclic Word Graphs
Need to add info here on:

Page A note on transitions
None of the functions in blkocc.cpp seem to be called during standard processing of an image - are they only for development?

Generated on Wed Feb 28 19:49:29 2007 for Tesseract by doxygen 1.5.1