I wrote this Python code, which reads a box file and rotates each boxed character in the corresponding image:
#!/usr/bin/python
#-*- coding:utf8 -*-
import Image,ImageDraw
import sys
box = open(sys.argv[1],'r')
print type(sys.argv[1])
lines = box.readlines()
# The image is expected next to the box file, with a .tif extension
image_name = sys.argv[1].split('.')[0]+'.tif'
input_image = Image.open(image_name)
wt = input_image.size[0]
ht = input_image.size[1]
#print wt," ",ht
# Blank white canvas, twice as wide as the input image
new_image=Image.new("L",(wt*2,ht),255)
pen=ImageDraw.Draw(new_image)
offset = 0
prevtlx = 0
for line in lines:
    # Box line fields: character, left, bottom, right, top
    fields = line.split(' ')
    delta_y = int(fields[4].strip()) - int(fields[2])
    delta_x = int(fields[3]) - int(fields[1])
    # Convert to PIL coordinates, where y grows downwards from the top
    top_left_x = int(fields[1])
    top_left_y = ht - int(fields[2]) - delta_y
    bot_right_x = int(fields[3])
    bot_right_y = ht - int(fields[4].strip()) + delta_y
    box = (top_left_x,top_left_y,bot_right_x,bot_right_y)
    char = input_image.crop(box)
    char = char.rotate(90)
    # A smaller x than the previous box means a new text line has started
    if top_left_x<prevtlx:
        offset = 0
    newwt = char.size[0]
    newht = char.size[1]
    newbox = (top_left_x+offset , top_left_y , top_left_x+offset+newwt ,top_left_y+newht)
    print newbox
    offset = offset+ (newwt - newht + 2)
    prevtlx = top_left_x
    new_image.paste(char, newbox)
    #raw_input('>')
new_image.save('mod.tif',"TIFF")
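The coordinate arithmetic inside the loop is easier to check in isolation. The sketch below is only an illustration in Python 3 syntax, not part of the script above. It assumes the field order the script implies (character, left, bottom, right, top), with box coordinates measured from the bottom-left corner and PIL crop boxes measured from the top-left. The function name box_line_to_crop is made up for this example.

```python
def box_line_to_crop(line, image_height):
    """Convert one box-file line into a PIL-style crop box.

    Assumes fields are: character, left, bottom, right, top, with the
    origin at the bottom-left of the image. The returned tuple is
    (left, upper, right, lower) with the origin at the top-left.
    """
    fields = line.split()
    left, bottom, right, top = (int(value) for value in fields[1:5])
    # Flipping the y axis: the box's top edge becomes the crop's upper edge.
    upper = image_height - top
    lower = image_height - bottom
    return (left, upper, right, lower)


# Example with made-up numbers: a 20x30 box in a 100-pixel-high image.
print(box_line_to_crop("a 10 20 30 50", 100))
```

This gives the same values as top_left_y and bot_right_y in the script, because ht - bottom - delta_y reduces to ht - top and ht - top + delta_y reduces to ht - bottom.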
Then I take an image and run the following command on it to generate the box file:
tesseract bengali2.tif bengali2 -l ban batch.nochop makebox
On running the script, one finds the images below:
Transformed Image
The experiment has been somewhat disappointing. The quality of the character images degrades after rotation. Also, since the boxing is not perfect, the wrong groups of characters have been rotated. That does not mean the technique cannot be used, but I need to make the same modifications in the Tesseract C++ code. The idea is to rotate each character image and compare the classifier confidence for the original and the rotated image. Whichever scores higher will be chosen.
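To make the selection step concrete, here is a rough sketch in Python 3 rather than in Tesseract's C++. Everything in it is an assumption for illustration: pick_best, rotate_ccw and toy_classify are made-up names, the glyph is a plain list of pixel rows, and the classifier is a toy stand-in, not Tesseract's classifier or its API.

```python
def rotate_ccw(grid):
    """Rotate a 2D list of pixels 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def pick_best(glyph, classify):
    """Return (variant_name, label, confidence) for the better variant.

    `classify` is a hypothetical callable: glyph -> (label, confidence).
    """
    variants = [("original", glyph), ("rotated", rotate_ccw(glyph))]
    best = None
    for name, image in variants:
        label, confidence = classify(image)
        # Keep the variant the classifier is most confident about.
        if best is None or confidence > best[2]:
            best = (name, label, confidence)
    return best


def toy_classify(image):
    """Toy stand-in classifier: more confident about tall glyphs."""
    height, width = len(image), len(image[0])
    label = "tall" if height > width else "wide"
    return (label, height / (height + width))


wide_glyph = [[0, 1, 1],
              [0, 1, 0]]
print(pick_best(wide_glyph, toy_classify))
```

On a tie the original variant is kept, because only a strictly higher confidence replaces the current best.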
Also, I need a version of the Pango renderer that can render the vowel signs without the dotted circles. I probably need to change a few lines and rebuild Pango, as Sayamindu said.
So here I dive into the code base again.
Hi Debayan,
I talked with you about OCR after your talk at FOSS.in. Dropping this comment to let you know that I have managed to locate you on world wide web and will be following your work.
Abhaya