Sunday, 12 May 2013

Letter Recognition Example

Hello everybody! In this post, we are going to deal with letters. To keep matters simple, we will only work with letters that fit nicely in a 10x15 matrix. Again, we will be using Python 2.

Our code should be able to recognize letters after a "supervised learning" session. This means that someone draws a letter, tells the program which one it is, and repeats this a few times until the program has learned the letter.

For those who want to start playing:

  1. Run data_creator.py, draw a letter, type it in the entry box, and press submit. Repeat a few times for each letter.
  2. Run show.py, draw a letter, press submit. Have fun.

First, I want to explain the structure of our example source code, so that you can focus on the important parts later. The whole program consists of 4 files. main.py is where we define the Letter class and do the real work. In data_holder.py we define a class that helps us save the learned data. Finally, data_creator.py gives us a GUI for teaching letters, and show.py gives us one for recognition. I won't be showing the source code in the article; you can download it here. (In case of a broken link, please let me know.)

As you can see in main.py, lines 3-9, a letter has a code (a single character showing which letter it is) and a map. The map is where the magic happens. It is a 10x15 matrix that holds how many times each pixel was painted during the teaching session, in other words how often a pixel is used in that letter. (The matrix could theoretically be any size, but a smaller one makes the letters hard to draw well, and a bigger one needs a longer teaching session.) You can think of each teaching entry as another 10x15 matrix in which an element is 1 if the corresponding pixel is painted and 0 otherwise. When the program is given a teaching entry, that matrix is added to the matrix of the letter. After a few repetitions, the letter matrix shows how often each pixel is used when drawing the letter. Here are a few visualised examples of what I have taught:

(Lighter pixels are used more often; the images were created by modifying show.py.) The learning part is in main.py: Letter.learn(), where a letter learns from a matrix, and Letter.merge(), where the data of two instances of the same letter are summed up.
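
As a rough illustration of the learning step, here is a minimal sketch of what such a class could look like. It is not the code from main.py: only the names Letter, learn(), merge(), code and map come from the description above, while the WIDTH and HEIGHT constants, the row-by-row list layout (15 rows of 10 columns) and the check in merge() are my assumptions.

```python
# Illustrative sketch only; the real main.py may be organised differently.
WIDTH, HEIGHT = 10, 15  # assumed orientation: 10 columns, 15 rows


class Letter(object):
    def __init__(self, code):
        self.code = code  # single character naming the letter
        # map[y][x] counts how many teaching drawings painted pixel (x, y)
        self.map = [[0] * WIDTH for _ in range(HEIGHT)]

    def learn(self, drawing):
        """Add one teaching drawing (HEIGHT rows of WIDTH 0/1 values) to the map."""
        for y in range(HEIGHT):
            for x in range(WIDTH):
                self.map[y][x] += drawing[y][x]

    def merge(self, other):
        """Add the counts of another Letter that stands for the same character."""
        if other.code != self.code:
            raise ValueError("can only merge data for the same letter")
        for y in range(HEIGHT):
            for x in range(WIDTH):
                self.map[y][x] += other.map[y][x]
```

Calling learn() twice with the same drawing leaves a 2 in every painted pixel and a 0 everywhere else, which is exactly the "how often is this pixel used" count that the images show.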

When comparing an input with the saved data, the program calculates a single value for each learned letter: it sums the values of the learned matrix at every pixel that is 1 in the input. This value is then divided by the sum of all values in the learned matrix, so that bigger letters (or the ones learned in longer sessions) do not produce bigger values. Otherwise a learned letter with a fully painted matrix would match every input better than anything else. Lastly, the letter with the biggest match value is selected. Easy, isn't it?
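
The comparison can be sketched in a few lines. Again, this is only my illustration of the idea and not the code from the archive: the function names match_score() and recognize(), the dictionary of learned matrices and the tiny 2x2 matrices in the example are all made up for this sketch.

```python
def match_score(learned, drawing):
    """Learned counts under the painted input pixels, divided by all learned counts."""
    total = sum(sum(row) for row in learned)
    if total == 0:
        return 0.0  # nothing has been learned for this letter yet
    hit = 0
    for learned_row, drawn_row in zip(learned, drawing):
        for count, painted in zip(learned_row, drawn_row):
            if painted == 1:
                hit += count
    return float(hit) / total  # float() avoids integer division in Python 2


def recognize(letters, drawing):
    """letters maps a letter code to its learned matrix; returns the best code."""
    return max(letters, key=lambda code: match_score(letters[code], drawing))


# Tiny 2x2 example instead of 10x15, just to keep the numbers readable.
learned = {
    "I": [[0, 3],
          [0, 3]],
    "L": [[4, 0],
          [4, 4]],
}
drawing = [[0, 1],
           [0, 1]]
best = recognize(learned, drawing)
```

In this example "I" scores 6/6 and "L" scores 4/12, so best is "I". Without the division, a letter taught in a longer session would collect bigger sums simply because its counts are bigger.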

Feel free to edit and publish the source code and the article. I would be glad if you could link to this page and send me a link to your work, but you don't have to. Here are a few things to do:

  • Make the program say "no letter" when the biggest match is under a certain value
  • Teach more letters (actually, you have to start by teaching, the save.let file isn't in the archive)
  • Make it italic aware.
  • Make it resize letters that do not fit nicely in the matrix. In real-life scenarios, images do not arrive so neatly fitted
  • A web applet, someone please!
  • Have fun!

May the source be with you!