Sunday, April 10, 2011

What next?

Step one was to get the maatraa clipping code into Tesseract, and that is now done. The following issues still need to be resolved before we can expect excellent recognition rates:

We need to split each of the following kinds of glyph into a separate consonant and vowel sign:

1) Consonant + descending vowel sign

Example:





2) Consonant + ascending vowel sign

Example:



In summary, we need to be able to do the following transformation before sending the image to Tesseract:


FROM
TO
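
To make the idea concrete, here is a rough sketch of one way the split could be attempted for the descending case. It is not the code that went into Tesseract, and everything in it is an assumption made for illustration: the glyph is a small binary bitmap (1 = ink), the vowel sign hangs below the consonant, and the two are joined by a thin "neck", so the cut goes at the row with the least ink in the lower part of the glyph. The function name `split_descending_sign` and the `search_from` parameter are hypothetical.

```python
def row_ink(bitmap):
    """Count ink pixels (value 1) in each row of a binary glyph bitmap."""
    return [sum(row) for row in bitmap]


def split_descending_sign(bitmap, search_from=0.5):
    """Split a glyph into (consonant_rows, vowel_sign_rows).

    Assumption: the vowel sign hangs below the consonant and is joined to
    it by a thin neck, so we cut at the row with the least ink in the
    lower part of the glyph. Real glyphs may not behave this way.
    """
    ink = row_ink(bitmap)
    start = int(len(bitmap) * search_from)
    # Candidate cut rows: the lower part of the glyph, excluding the last
    # row so that the vowel-sign part is never empty.
    candidates = range(start, len(bitmap) - 1)
    if not candidates:
        return bitmap, []
    cut = min(candidates, key=lambda r: ink[r])
    return bitmap[:cut], bitmap[cut:]


# A made-up 6x5 glyph: a consonant body on top, a one-pixel neck,
# and a small sign hanging below it.
glyph = [
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],  # neck: the row with the least ink
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
]

consonant, sign = split_descending_sign(glyph)
```

The ascending case could presumably be handled the same way by searching the upper part of the glyph instead. A plain row cut like this will fail whenever the consonant and the sign overlap vertically, so it should be read as a starting point only.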
