Monday, December 7, 2009

Preliminary results for tilt method


I wrote this Python script, which reads a Tesseract box file and rotates each boxed character in the corresponding image by 90 degrees:


#!/usr/bin/python
# -*- coding: utf8 -*-

import sys

from PIL import Image

# Usage: pass the box file as the only argument; the .tif with the same
# base name is expected alongside it.
box_path = sys.argv[1]
with open(box_path, 'r') as box_file:
    lines = box_file.readlines()
image_name = box_path.split('.')[0] + '.tif'

input_image = Image.open(image_name)

wt = input_image.size[0]
ht = input_image.size[1]
# Canvas twice the input width, leaving room for characters shifted right below.
new_image = Image.new("L", (wt * 2, ht), 255)

offset = 0
prevtlx = 0
for line in lines:
    # Box file line: <char> <left> <bottom> <right> <top>, origin at bottom-left.
    fields = line.split(' ')
    delta_y = int(fields[4].strip()) - int(fields[2])
    top_left_x = int(fields[1])
    # Flip y, because PIL puts the origin at the top-left corner.
    top_left_y = ht - int(fields[2]) - delta_y
    bot_right_x = int(fields[3])
    bot_right_y = ht - int(fields[4].strip()) + delta_y
    box = (top_left_x, top_left_y, bot_right_x, bot_right_y)
    char = input_image.crop(box)
    # expand=True so the rotated image swaps width and height instead of
    # being clipped to the original size.
    char = char.rotate(90, expand=True)
    # x moved back to the left, so presumably a new text line has started.
    if top_left_x < prevtlx:
        offset = 0

    newwt = char.size[0]
    newht = char.size[1]

    newbox = (top_left_x + offset, top_left_y,
              top_left_x + offset + newwt, top_left_y + newht)
    print(newbox)
    # Push later characters right by the extra width of the rotated glyph.
    offset = offset + (newwt - newht + 2)
    prevtlx = top_left_x

    new_image.paste(char, newbox)
new_image.save('mod.tif', "TIFF")


The box file itself comes from running the following command on the input image:
tesseract bengali2.tif bengali2 -l ban batch.nochop makebox
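
As a small illustration of what the script does with each line of that box file, here is a sketch of the coordinate conversion on its own. It assumes each line has the form "char left bottom right top" with y measured up from the bottom edge, which is what the script above relies on; the names Box and parse_box_line are made up for this example, and any trailing fields are simply ignored.

```python
from collections import namedtuple

# Hypothetical container for one box-file entry, already converted to
# PIL-style coordinates (origin at the top-left corner).
Box = namedtuple("Box", "char left upper right lower")


def parse_box_line(line, image_height):
    """Turn one 'char left bottom right top' line into a PIL crop box."""
    fields = line.split()
    char = fields[0]
    left, bottom, right, top = (int(f) for f in fields[1:5])
    # Box files measure y upward from the bottom edge; PIL measures it
    # downward from the top edge, so both y values are flipped.
    return Box(char, left, image_height - top, right, image_height - bottom)


if __name__ == "__main__":
    print(parse_box_line("a 10 20 30 50", image_height=100))
```

The last four values of the result can be passed straight to Image.crop.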


Running the script on the resulting box file produces the images below:




Transformed Image

The experiment has been somewhat disappointing. The quality of the character images degrades after rotation, and because the boxing is not perfect, the wrong groups of characters get rotated. This does not rule the technique out, though. I need to make the same modifications in the Tesseract C++ code. The idea is to rotate each character image and compare the classifier's confidence for the original and the rotated version; whichever scores higher will be chosen.
Also, I need a version of the Pango renderer that can render the vowel signs without the dotted circles. As Sayamindu said, I probably need to change a few lines and rebuild Pango.
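
The comparison idea can be sketched in a few lines of Python before touching the C++ code. This is only a rough outline under stated assumptions: classify is a hypothetical callable that takes an image and returns a (label, confidence) pair, standing in for whatever the Tesseract classifier would report, and best_orientation is a name invented here. The image only needs a PIL-style rotate(angle, expand=...) method.

```python
def best_orientation(char_image, classify):
    """Classify a character as-is and rotated by 90 degrees; keep the better one.

    `classify` is a hypothetical stand-in for the real classifier: it takes
    an image and returns a (label, confidence) pair.
    Returns a (variant, label, confidence) tuple.
    """
    # expand=True keeps the whole glyph when width and height differ.
    rotated = char_image.rotate(90, expand=True)
    candidates = [
        ("original",) + tuple(classify(char_image)),
        ("rotated",) + tuple(classify(rotated)),
    ]
    # Pick the candidate with the higher confidence; on a tie the
    # original orientation wins because it comes first in the list.
    return max(candidates, key=lambda c: c[2])
```

With a real classifier plugged in, calling this on each cropped character would give the chosen variant, its label, and its confidence.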
So here I dive into the code base again.

1 comment:

  1. Hi Debayan,

    I talked with you about OCR after your talk at FOSS.in. Dropping this comment to let you know that I have managed to locate you on world wide web and will be following your work.

    Abhaya
