OCR TUTORIAL: Optical Character Recognition

Joe

VIP Member
21/1/13
3,075
1,342
113
Part I

During the 1980s and 1990s the IT society was hit by a sudden tsunami triggered by Apple and Co.: WYSIWYG. What is it? It stands for What You See Is What You Get or, in today's parlance, the ICONS. Icons were brand-new at a time when people had to struggle with cryptic texts and weird menus. They were the "ground-breaking" revolution of the text-oriented IT world of that time, a world of boffins and eggheads. Today a computer without graphical icons, regardless of what kind of computer, is practically unsellable. Could you imagine your iPhone or Android smartphone being driven only by "texts" and "menus"? Even an ancient Chinese man by the name of Confucius had to admit that a picture is easier to memorize than a narrative.

I am not a revolutionary, nor a dictator who shuffles the IT society like Steve Jobs did with his WYSIWYG. What I want to show is just a little wave: ITOC-ICAW (If The Others Can, I Can As Well). What is OCR really? Instead of trying to explain OCR myself, I took the liberty of citing a text from Wikipedia:
Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image.
More HERE. In short: getting an editable text out of an image.

And as Master Confucius said: if he did it, he would understand it. So I show you how to implement an OCR in JAVA and let you try to implement it yourself, so that you remember and understand OCR. I have already shown you how to process Images and Pixels in JAVA and mentioned OCR in connection with Fonts. Let's refresh our memory:
Java:
  public static BufferedImage createFontImage(String string, String fontName, int fontAtt, int size) {
    // create a dummy BufferedImage (width = 1, height = 1) just to get a Graphics2D
    BufferedImage image = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
    Graphics2D g = image.createGraphics();
    // create a font with the given fontName (e.g. TimesRoman), attribute (e.g. Font.BOLD) and size (e.g. 15)
    Font font = new Font(fontName, fontAtt, size);
    // get the FontMetrics of this font to measure the string
    FontMetrics metrics = g.getFontMetrics(font);
    int height = metrics.getHeight();
    int width  = metrics.stringWidth(string);
    // create an image with this width and height
    image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    // draw or write the given letter (or String) in BLACK with the background WHITE
    g = image.createGraphics();
    g.setFont(font);
    g.setColor(Color.WHITE);
    g.fillRect(0, 0, width, height);
    g.setColor(Color.BLACK);
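    // put the baseline above the descent so that descenders (g, p, y) are not clipped
    g.drawString(string, 0, height - metrics.getDescent());
    g.dispose();
    //
    // eliminate the "void" (the empty rows above the letters)
    //
    int y = 0;
    int white = Color.WHITE.getRGB();
    LOOP: for (; y < height; ++y)
      for (int x = 0; x < width; ++x)
        if (image.getRGB(x, y) != white) {
          if (y == 0) return image; // no need to rectify
          break LOOP;
        }
    if (y == height) return image; // nothing was drawn (e.g. a blank), nothing to cut
    int H = height - y;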
    int[] pixels  = image.getRGB(0, y, width, H, null, 0, width);
    BufferedImage img = new BufferedImage(width, H, BufferedImage.TYPE_INT_ARGB);
    img.setRGB(0, 0, width, H, pixels, 0, width);
    return img;
  }
and we get this image: 1641303963009.png

The image consists of black pixels on a white background. Unlike in Facial Recognition, it is hard to make out any "special features", for example like these:

1641303839510.png
(source: click HERE)

As you see, there is an algorithm that leads you in the right direction. OCR is not as complicated as Optical Face Recognition; it is simpler if you know how and where to start. I have already shown you how the pixels of an image can be processed and altered, for example to change the color of a sentence in an image from black to cyan: 1641303925036.png
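As a reminder of that kind of pixel processing, here is a minimal sketch of the recoloring. The class RecolorSketch and the method recolor( ) are hypothetical names of mine and not part of ImageTools; the sketch assumes that the letters have exactly one RGB value.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of the tutorial's ImageTools.
class RecolorSketch {
  // Replaces every pixel whose RGB part equals 'from' by 'to'; the alpha byte is kept.
  static BufferedImage recolor(BufferedImage src, int from, int to) {
    int width = src.getWidth(), height = src.getHeight();
    BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < height; ++y)
      for (int x = 0; x < width; ++x) {
        int p = src.getRGB(x, y);
        if ((p & 0x00FFFFFF) == (from & 0x00FFFFFF))
          p = (p & 0xFF000000) | (to & 0x00FFFFFF); // same alpha, new color
        out.setRGB(x, y, p);
      }
    return out;
  }

  public static void main(String[] args) {
    BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_ARGB);
    img.setRGB(0, 0, 0xFF000000); // a black letter pixel
    img.setRGB(1, 0, 0xFFFFFFFF); // a white background pixel
    BufferedImage cyan = recolor(img, 0x000000, 0x00FFFF);
    System.out.printf("%08X %08X%n", cyan.getRGB(0, 0), cyan.getRGB(1, 0));
  }
}
```

Only the black pixel changes; the white background pixel is copied as it is.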

The problem of OCR on an image is the "ghosting" (or, to be more precise, the fogging) between the letters and the background. Modern cars (e.g. AUDI, BMW, Mercedes, etc.) allow their users to write a destination address on a little screen, and they can "recognize" traffic signs. All of that is based on a very distinct separation between the letters and the background (i.e. by color). The little screen acts as a transparent background. Example: a traffic sign.

1641304045166.png

OCR is a lot easier to work with here. To achieve a result similar to Optical Face Recognition we need to create a similarly distinctive environment for the letters of an image, like the (preconditioned) environments of the OCR in modern cars:
  1. Focusing on the sentence to reduce unnecessary scanning work, and removing the interfering colors by "translucentifying" the background
  2. Isolating letter by letter
  3. "DOTifying" the letter
  4. Fixing the distinctive features (like facial features)
For example:

Step 1
Focused on the Image
1641304432725.png

and extracted 1641304338749.png

Step 2: pick letter by letter
1641304887590.png

Step 3: dotifying the letter
1641639371184.png

Step 4: set the distinctive features:
T_LucidaConsole_0_20_DOT.png
 
Part II

The main problems of OCR are, besides the myriad of free styles, the unequal sizes, the different fonts, and colors that may be tarnished or blurred by the surroundings or by the quality of the image. For example: all characters of Lucida Console have the same width, while the widths of the Dialog characters differ. Examples:

Lucida Console:
1641662129659.png

Dialog:
1641662201376.png

Courier:
1641662470197.png

In this brief and concise tutorial I show you two different implementations:
  1. Dynamic OCR: the recognition is based on the distinctive features,
  2. Static OCR: the recognition is based on a given Font Table
With the first one it is relatively difficult to cover all the different fonts, while the second one is easier but restricted to one font type. The distinctive features of the letter A in the fonts Dialog and Lucida Console are similar, but the A of Courier deviates slightly at the top and at the feet. A dynamic A-Recognition implementation that works with Lucida Console and Dialog could fail with Courier. A static A-Recognition, on the other hand, requires 3 different Character Font Tables.

We start with the dynamic OCR.

Dynamic OCR

As mentioned in Part I, the image containing the string to be OCRed should first be "cleaned up" by "translucentifying" the unnecessary surroundings.
Java:
    rgb &= 0xFFFFFF; // only the RGB of the selected pixel (the color of the string)
    BufferedImage img = ImageIO.read(new File(imageFile));
    int width  = img.getWidth();
    int height = img.getHeight();
    // black: 0x000000, white:0xffffff
    BufferedImage bImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x) {
      int p = img.getRGB(x,y);
      // the parentheses are needed: == binds tighter than &
      if ((p & 0x00FFFFFF) == rgb) {
        bImg.setRGB(x, y, p | 0xFF000000); // keep this pixel and make it 100% opaque
      } else { // translucentify this pixel
        bImg.setRGB(x, y,  p & 0x00FFFFFF); // set Transparent
      }
    }
    return bImg;
and then focus on the string only (by the color of the string):
Java:
    ... // img is the translucentified image
    int width = img.getWidth();
    int height = img.getHeight();
    // the upper-left corner: the smallest x and y of all opaque pixels
    int xa = width, ya = height; // start with the maximum so that any pixel is smaller
    for (int y = 0; y < height; ++y)
      for (int x = 0; x < width; ++x)
      if ((img.getRGB(x, y) & ALPHA) != 0) {
        if (x < xa) xa = x;
        if (y < ya) ya = y;
      }
    if (xa == width || ya == height) return null; // no opaque pixel at all
    // the lower-right corner: the largest x and y of all opaque pixels
    int xe = 0, ye = 0;
    for (int x = width-1; x >= 0; --x)
      for (int y = height-1; y >= 0; --y)
      if ((img.getRGB(x, y) & ALPHA) != 0) {
        if (x > xe) xe = x;
        if (y > ye) ye = y;
      }
    if (xe <= xa || ye <= ya) return null;
    // width and height of the focused area (+1 because both corners are included)
    int wi = 1 + xe - xa, he = 1 + ye - ya;
    // focus only on the area with the given RGB and copy the pixels into a new BufferedImage
    BufferedImage bImg = new BufferedImage(wi, he, BufferedImage.TYPE_INT_ARGB);
    for (int b = 0; b < he; ++b, ++ya)
      for (int a = 0, i = xa; a < wi; ++a, ++i)
        bImg.setRGB(a, b, img.getRGB(i, ya));
    return bImg;
Once each letter (or character) is "DOTified", implementing a Letter Recognition is just a chore. Example: letterH( )
Java:
    /*
    Specification of  xy[10]:
      [0]: x coordinate X
      [1]: y coordinate Y
      [2]: xW (the Letter Width)
      [3]: yH (the Letter Height)
      [4]: gap between 2 letters
      [5]: return found Letter
      [6]: Starting Y value
      [7]: ending Y value
      [8]: starting X
      [9]: ending X
    */
  public boolean letterH(int[] xy ) {
    int a = xy[0]+((xy[2]-xy[0])>>1);
    int t  = leftT(xy[0], xy[1]);
    int up = upperY(xy[0]+t);
    int ud = lowerY(xy[0]+t);
    if (onVertical(xy[3], xy[0], xy[1]) && onVertical(xy[3], xy[2]-1, xy[1]) && onHorizontal(xy[2], xy[0], up) &&
        noHorizontal(xy[2]-4, xy[0]+4, xy[1]+3) && noVertical(up, a, xy[1]) && noVertical(xy[3], a, ud+1) && xy[5] > xy[1]) {
      xy[5] = (int)'H';
      return true;
    }
    //  lower case h ?
    int Y = xy[1] > xy[6]? xy[6] : xy[1];
    if (onVertical(xy[3], xy[0], Y) && noVertical(up, xy[2]-1, xy[1]) && onHorizontal(xy[2], xy[0], up) &&
        noHorizontal(xy[2], xy[0]+t, xy[1]+((up-xy[1])>>1)) && noVertical(xy[3], a, ud+1)) {
      xy[5] = (int)'h';
      return true;
    }
    return false;
  }
The OCR of the letter H is based on the following computed features:
H.png
and the result is:
JoeMimosas.png
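The helpers that letterH( ) calls (onVertical, noVertical, onHorizontal, noHorizontal) are not shown here. The following is only my guess at how they could look, a sketch under the assumption that the arguments mean (end, fixed coordinate, start) and that the end is exclusive. The class name StrokeSketch and the 5 x 5 test letter are made up for the illustration, and the sketch has no +/-1 tolerance.

```java
import java.awt.image.BufferedImage;

// Hypothetical helpers: letterH( ) only shows their calls, so these bodies are assumptions.
class StrokeSketch {
  static final int ALPHA = 0xFF000000;
  final BufferedImage img;

  StrokeSketch(BufferedImage img) { this.img = img; }

  // true if column x is opaque on every row from yStart (inclusive) to yEnd (exclusive)
  boolean onVertical(int yEnd, int x, int yStart) {
    if (yStart >= yEnd) return false;
    for (int y = yStart; y < yEnd; ++y)
      if ((img.getRGB(x, y) & ALPHA) == 0) return false;
    return true;
  }
  // true if column x has no opaque pixel from yStart (inclusive) to yEnd (exclusive)
  boolean noVertical(int yEnd, int x, int yStart) {
    for (int y = yStart; y < yEnd; ++y)
      if ((img.getRGB(x, y) & ALPHA) != 0) return false;
    return true;
  }
  // true if row y is opaque on every column from xStart (inclusive) to xEnd (exclusive)
  boolean onHorizontal(int xEnd, int xStart, int y) {
    if (xStart >= xEnd) return false;
    for (int x = xStart; x < xEnd; ++x)
      if ((img.getRGB(x, y) & ALPHA) == 0) return false;
    return true;
  }
  // true if row y has no opaque pixel from xStart (inclusive) to xEnd (exclusive)
  boolean noHorizontal(int xEnd, int xStart, int y) {
    for (int x = xStart; x < xEnd; ++x)
      if ((img.getRGB(x, y) & ALPHA) != 0) return false;
    return true;
  }

  // a 5 x 5 test letter H: two vertical strokes joined by a bar on row 2
  static BufferedImage tinyH() {
    BufferedImage h = new BufferedImage(5, 5, BufferedImage.TYPE_INT_ARGB);
    for (int i = 0; i < 5; ++i) {
      h.setRGB(0, i, 0xFF000000);
      h.setRGB(4, i, 0xFF000000);
      h.setRGB(i, 2, 0xFF000000);
    }
    return h;
  }

  public static void main(String[] args) {
    StrokeSketch s = new StrokeSketch(tinyH());
    boolean isH = s.onVertical(5, 0, 0) && s.onVertical(5, 4, 0) && s.onHorizontal(5, 0, 2)
               && s.noVertical(2, 2, 0) && s.noVertical(5, 2, 3);
    System.out.println("H recognized: " + isH);
  }
}
```

The main( ) repeats the idea of the uppercase test: two full vertical strokes, one bar between them, and nothing above or below the bar in the middle column.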
 
The so-called DOTifying of a letter (or character) is, in other words, the visualization of its pixels so that we can single out the distinctive features of each individual letter. To do that we have to create a tool that does the job, based on our knowledge of Image and Pixel Processing (see HERE). Example:
Java:
    JButton dot = new JButton("DOT");
    dot.addActionListener(e -> {
      try {
        BufferedImage img = ImageTools.createFontImage(letter, font, t, si ); // see Tutorial: Image and Pixel Processing
        img = ImageTools.createDOTImage(ImageTools.filterColor(0, 0, img), font, t, si); // call the ImageTools
        if (img != null) {
          String sav = JOptionPane.showInputDialog(jf,"Save Pixel File?", "yes");
          if ("yes".equalsIgnoreCase(sav)) {
            // try-with-resources closes the file even if ImageIO.write fails
            try (FileOutputStream fou = new FileOutputStream("./images/"+LET+"_"+font+"_"+type+"_"+size+"_DOT.png", false)) {
              ImageIO.write(img, "png", fou);
            }
          }
          dis.setIcon(new ImageIcon(img));
          dis.setText(letter);
        } else dis.setText("Invalid Image for:"+letter);
        jf.pack();
      } catch (Exception ex) {
        ex.printStackTrace();
      }
    });
The ImageTools API:
Java:
  /**
   @param image     BufferedImage of the letter to be DOTified
   @param fontName  String, e.g. TimesRoman, Courier, etc. (Case Sensitive)
   @param fontAtt   int, font attribute, e.g. Font.BOLD, etc.
   @param size      int, font size in points (e.g. 18)
   @return BufferedImage that shows the pixels as dots, written with the given font
   @exception Exception thrown by JAVA
  */
  public static BufferedImage createDOTImage(BufferedImage image, String fontName, int fontAtt, int size) throws Exception {
    int dy, dot = 0;
    int width = image.getWidth();
    int height = image.getHeight();
    ByteArrayOutputStream bao = new ByteArrayOutputStream(width*height);
    Graphics2D g = image.createGraphics();
    int pixel = g.getColor().getRGB();
    // the X axis (uppermost)
    bao.write("     ".getBytes());
    for (int i = 0; i < width; ++i) bao.write(String.format(" %02X", i).getBytes());
    bao.write("\n".getBytes());
    // the DOTifying of Pixels and Y scale on the leftmost side
    for (int y = 0; y < height; ++y) {
      bao.write(String.format("%04X ", y).getBytes());
      for (int x = 0; x < width; ++x) { // Black color
        dot =  image.getRGB(x, y); // get the Pixel
        if ((dot & 0xFFFFFF) == 0 || dot == pixel)
             bao.write(" . ".getBytes());
        else bao.write("   ".getBytes());
      }
      bao.write("\n".getBytes());
    }
    bao.write("\n".getBytes());
    String[] lines = (new String(bao.toByteArray())).split("\n");
    // create the Font based on the given specification
    Font font = new Font(fontName, fontAtt, size);
    FontMetrics metrics = g.getFontMetrics(font);
    dy = metrics.getHeight();
    for (String line:lines) {
      int l = metrics.stringWidth(line);
      if (l > width) width = l;
    }
    width += 10;
    size = lines.length;
    height = dy * size;
    image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    g = image.createGraphics();
    // generate an image
    g.setFont(font);
    // Background: c0c0c0 Silver
    g.setColor(new Color(0xC0C0C0));
    g.fillRect(0, 0, width, height);
    g.setColor(Color.BLACK);
    for (int i = 0, y = dy; i < size; ++i) {
      g.drawString(lines[i], 10, y);
      y += dy;
    }
    return image;
  }
and that is what we get (the infamous WYSIWYG :))):
GetFont.png

The hardest work is isolating each individual letter from the image and from a string. Of course, it can only be done under some preconditions. For example: handwritten letters are NOT supported here, and mixing different fonts is not supported either. With some predefined rules the isolation of the letters becomes more meaningful and more precise. However, if you want to cover everything, you have to turn this work into a real job. And if I did that here I would blast this forum into pieces. Example:

Burning_Bitcoin.jpg

The letters within a string are separated and put into a pipeline so that they can be recognized (OCR) and then converted individually (from left to right) to their corresponding characters: A....Z a....z 0....9 and all the special characters ($ % & € etc.). That means we have to implement one OCR method for each letter. Some letters are very similar to each other (e.g. uppercase I and lowercase L, V and Y, or 5 and S) so that they can be grouped together and share the same implementation.
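The separation can be pictured as a scan over the columns of the translucentified image: a column without any opaque pixel ends a letter. This is only a simplified sketch with made-up names (SegmentSketch, columnUsed, letters); it assumes a single line of text whose letters neither touch nor overlap horizontally.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cut one line of text into letters at the empty columns.
class SegmentSketch {
  static final int ALPHA = 0xFF000000;

  // true if column x holds at least one opaque (letter) pixel
  static boolean columnUsed(BufferedImage img, int x) {
    for (int y = 0; y < img.getHeight(); ++y)
      if ((img.getRGB(x, y) & ALPHA) != 0) return true;
    return false;
  }

  // returns {startX, endX} (endX exclusive) for each run of used columns, from left to right
  static List<int[]> letters(BufferedImage img) {
    List<int[]> runs = new ArrayList<>();
    int start = -1, width = img.getWidth();
    for (int x = 0; x < width; ++x) {
      boolean used = columnUsed(img, x);
      if (used && start < 0) start = x;            // a letter begins
      else if (!used && start >= 0) {              // a letter ends
        runs.add(new int[] {start, x});
        start = -1;
      }
    }
    if (start >= 0) runs.add(new int[] {start, width}); // the letter touches the right border
    return runs;
  }

  public static void main(String[] args) {
    BufferedImage img = new BufferedImage(7, 2, BufferedImage.TYPE_INT_ARGB);
    img.setRGB(0, 0, 0xFF000000); // first "letter": columns 0..1
    img.setRGB(1, 1, 0xFF000000);
    img.setRGB(4, 0, 0xFF000000); // second "letter": column 4
    for (int[] r : letters(img)) System.out.println(r[0] + ".." + (r[1] - 1));
  }
}
```

Each run can then be cut out, e.g. with img.getSubimage(startX, 0, endX - startX, img.getHeight()), and passed on to the recognition.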

Because WHITE spaces (blanks, Tab, NewLine, etc.) have no color they are invisible or, to be more precise, translucent. Hence the OCR of a white space is an arbitrary interpretation. Here I emit 1 space between 2 words, regardless of how many spaces there really are, and insert a new line at the end of each string. The result is as follows:

fata.png
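One possible way to express this interpretation in code: take the smallest gap between two neighbouring letters as the normal letter spacing and treat everything wider than twice that gap as one blank. The names SpacingSketch and join( ) and the column numbers in main( ) are invented for this sketch.

```java
// Hypothetical sketch: decide where a blank belongs from the gaps between the letters.
class SpacingSketch {
  // starts[i] / ends[i]: first column and end column (exclusive) of the i-th letter
  static String join(char[] letters, int[] starts, int[] ends) {
    int minGap = Integer.MAX_VALUE;
    for (int i = 1; i < letters.length; ++i)
      minGap = Math.min(minGap, starts[i] - ends[i - 1]);
    int limit = minGap << 1; // double the smallest gap (only used if there are 2+ letters)
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < letters.length; ++i) {
      // exactly 1 blank, regardless of how wide the gap really is
      if (i > 0 && starts[i] - ends[i - 1] > limit) sb.append(' ');
      sb.append(letters[i]);
    }
    return sb.append('\n').toString(); // a new line at the end of each string
  }

  public static void main(String[] args) {
    char[] letters = {'H', 'i', 'J', 'o', 'e'};
    int[] starts   = {0, 6, 14, 19, 24};
    int[] ends     = {5, 8, 18, 23, 28};
    System.out.print(join(letters, starts, ends)); // gaps: 1, 6, 1, 1
  }
}
```

With the gaps 1, 6, 1, 1 only the gap of 6 columns exceeds the limit of 2, so the blank goes between "Hi" and "Joe".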

To materialize our OCR approach we need to design some Lines of Feature (LOF), just like the lines of Facial Recognition. The LOF are based on the reduced X-Y coordinate system of the image. For example, the reduced X-Y coordinate system of the following image:

1642329219876.png

Reduced X-Y Coordinate System:
1642329484335.png

From the reduced image we only need to separate and pipeline each individual letter of the text and then visualize it pixel by pixel (DOTifying), so that we can finally set the LOF for each letter.
TheLineOfFeatures.png

An implementation of the upperSegment:
Java:
  // Upper Segment left and right meet at yU: upward
  private boolean upperSegment(BufferedImage img, int yU, int y, int xL, int xR) {
    for (int a = xL, b = xR; y >= yU; --y, ++a, --b) {
      if ((img.getRGB(a, y) & ALPHA) != 0 || (img.getRGB(b, y) & ALPHA) != 0) continue;
      // tolerant +/-1
      if ((a+1) < xR && (img.getRGB(a+1, y) & ALPHA) != 0 || (b-1) > xL && (img.getRGB(b-1, y) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
If we were able to complete the necessary LOF (with some tolerance or Safety Gap), we would get a result similar to this:
OCR.png
 
PART II (continued): Static OCR

With the dynamic technique it is easier to OCR a character, and it works faster than the static technique. The Static OCR is based on an alphanumeric table (A..Z a..z 0..9 and special characters #*+@....). An incoming letter has to be compared with all the characters in the table (character by character), and the result is the best-matching one. The probability that a letter is correctly recognized (identified) depends on the quality of the letters retrieved from the image, which in turn depends on the compression technique of the image. PNG is lossless and works best; JPG (lossy) or GIF (limited to 256 colors) can produce blurred or altered pixels, which can easily falsify the comparison result.
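To make the idea of the "best-matched comparison" tangible, here is a small sketch that works on boolean masks (true = opaque pixel) instead of real images. Everything in it (BestMatchSketch, mask, score, best, the 3 x 3 letters) is invented for the illustration and assumes that all masks already have the same size. It keeps the highest score above a minimum, which is a slightly different rule from accepting the first table entry that passes the threshold.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a table lookup by best match, on boolean masks of equal size.
class BestMatchSketch {
  // builds a mask from text rows: '#' is an opaque pixel, anything else is translucent
  static boolean[][] mask(String... rows) {
    boolean[][] m = new boolean[rows.length][];
    for (int y = 0; y < rows.length; ++y) {
      m[y] = new boolean[rows[y].length()];
      for (int x = 0; x < m[y].length; ++x) m[y][x] = rows[y].charAt(x) == '#';
    }
    return m;
  }

  // fraction of the positions where both masks agree (0.0 .. 1.0)
  static double score(boolean[][] a, boolean[][] b) {
    int same = 0, total = 0;
    for (int y = 0; y < a.length; ++y)
      for (int x = 0; x < a[y].length; ++x, ++total)
        if (a[y][x] == b[y][x]) ++same;
    return total == 0 ? 0 : (double) same / total;
  }

  // returns the table character with the highest score above min, or ' ' if there is none
  static char best(boolean[][] letter, Map<Character, boolean[][]> table, double min) {
    char found = ' ';
    double top = min;
    for (Map.Entry<Character, boolean[][]> e : table.entrySet()) {
      double s = score(letter, e.getValue());
      if (s > top) { top = s; found = e.getKey(); }
    }
    return found;
  }

  public static void main(String[] args) {
    Map<Character, boolean[][]> table = new LinkedHashMap<>();
    table.put('L', mask("#..", "#..", "###"));
    table.put('T', mask("###", ".#.", ".#."));
    boolean[][] noisyL = mask("#..", "#..", "##."); // one pixel lost, e.g. by blurring
    System.out.println("found: " + best(noisyL, table, 0.75));
  }
}
```

The noisy L agrees with the table L in 8 of 9 positions and with the T in only 4 of 9, so the L wins even though one pixel is missing.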

Image comparison is very tricky work. The reason is the huge number of possible pixel values (Alpha, Red, Green and Blue): 4,294,967,296 of them (0x00000000 to 0xFFFFFFFF). Pixel-by-pixel comparison is the best way to get lost in the maze of colors. In OCR, however, it is usually about letters of one color (black or white). As said in the previous section, a string of letters can be isolated and pipelined for a letter-by-letter comparison. The isolation is based on the specified string color (black or white) and the "translucentification" of everything that is not needed.


1643737216618.png
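For reference, a tiny sketch of how one int pixel splits into its four channels, and why only the alpha byte matters once the image is translucentified. PixelSketch and its method names are hypothetical; the example value 0xFF20CD10 is arbitrary.

```java
// Hypothetical sketch: the four 8-bit channels of one ARGB pixel.
class PixelSketch {
  static int alpha(int argb) { return (argb >>> 24) & 0xFF; }
  static int red(int argb)   { return (argb >> 16) & 0xFF; }
  static int green(int argb) { return (argb >> 8) & 0xFF; }
  static int blue(int argb)  { return argb & 0xFF; }

  // after the translucentification a letter pixel is any pixel that is not fully transparent
  static boolean isLetterPixel(int argb) { return (argb & 0xFF000000) != 0; }

  public static void main(String[] args) {
    int p = 0xFF20CD10;
    System.out.printf("A=%d R=%d G=%d B=%d%n", alpha(p), red(p), green(p), blue(p));
    System.out.println("possible values: " + (1L << 32)); // 4 channels x 8 bits = 32 bits
    System.out.println("0x00FFFFFF is a letter pixel: " + isLetterPixel(0x00FFFFFF));
  }
}
```

So the comparison of two letters does not have to look at the 32 bits of a pixel at all, only at "opaque or not".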

Similar to the dynamic OCR, the static OCR relies on some clues that establish the similarity between the two images:
  • Resize the images so that they have the same size (width and height)
  • Compare the dispersion of opaque and translucentified pixels (along the X and Y coordinates)
Examples:
Java:
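public class StaticOCR implements OCR {
  // fontName is passed in by the caller, as in JIOCR.scan(): new StaticOCR(img, xy, font)
  public StaticOCR(BufferedImage img, int[] xy, String fontName) {
    this.xy = xy;
    this.img = img;
    this.fontName = fontName.toLowerCase();
    list = java.util.Arrays.asList(letters);
    idx = list.indexOf("a"); // index of the first "short" lowercase letter in the table
  }
  private static final int ALPHA = 0xFF000000; // mask of the alpha channel
  private int[] xy;
  private BufferedImage img;
  private String fontName;
  private java.util.List<String> list;
  private int idx;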
  public int ocrLetter( ) {
    try {
      // UpperCase or tall letter
      boolean up = (xy[5] > xy[1]);
      BufferedImage dImg, sImg, aImg, bImg;
      sImg = ImageTools.extractImage(img, xy[0], xy[1], xy[10], xy[11]);
      int width = sImg.getWidth(), height = sImg.getHeight(), alp = getAlphas(sImg);
      for (int ix = up? 0:idx, mx = up? idx:list.size(); ix < mx; ++ix) {
        String letter = letters[ix];
        dImg = ImageTools.createFontImage(letter, fontName, Font.BOLD, 30);
        if (dImg == null || alp > 0 && getAlphas(dImg) == 0 || alp == 0 && getAlphas(dImg) > 0) continue;
        int dW = dImg.getWidth(), dH = dImg.getHeight();
        if (dW > width || dH > height) {
          aImg = sImg;
          bImg = resize(dImg, width, height);
        } else if (dW < width || dH < height) {
          bImg = dImg;
          aImg = resize(sImg, dW, dH);
        } else {
          aImg = sImg;
          bImg = dImg;
        }
        if (compare(aImg, bImg)) {
          return (int)letter.charAt(0);
        }
      }
    } catch (Exception ex) {
      ex.printStackTrace();
      return 0;
    }
    return (int)' ';
  }
  // source Image: sImg, letter from Alphanumeric table: dImg
  private boolean compare(BufferedImage sImg, BufferedImage dImg) {
    int width = sImg.getWidth(), height = sImg.getHeight();
    int dWidth = dImg.getWidth(), dHeight = dImg.getHeight();
    if (width != dWidth || height != dHeight || width < 2 || height < 2) return false;
    float matched = 0f, total = 0f;
    //
    for (int y = 0; y < height; ++y) for (int x = 0; x < width; ++x) {
      if ((sImg.getRGB(x, y) & ALPHA) == (dImg.getRGB(x, y) & ALPHA)) ++matched;
      ++total;
    }
    int esW = width-1, edW = dWidth-1, esH = height-1, edH = dHeight-1;
    int smW = width >> 1, dmW = dWidth >> 1, smH = height >> 1, dmH = dHeight >> 1;
    boolean xOK = xCount(sImg, smH) == xCount(dImg, dmH) && xCount(sImg, 0) == xCount(dImg, 0) &&
                  xCount(sImg, esH) == xCount(dImg, edH);
    boolean yOK = yCount(sImg, smW) == yCount(dImg, dmW) || yCount(sImg, 0) == yCount(dImg, 0) &&
                  yCount(sImg, esW) == yCount(dImg, edW);
    // must match 75% pixels between 2 images and the dispersions of opaque & translucentified pixels
    return (matched/total) > 0.75f && xOK && yOK;
  }
  // check all translucentified pixels
  private int getAlphas(BufferedImage img) {
    int alp = 0;
    int width = img.getWidth(), height = img.getHeight();
    for (int y = 0; y < height; ++y) for (int x = 0; x < width; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) ++alp;
    return alp;
  }
  // Resize the image to the given width and height
  private BufferedImage resize(BufferedImage image, int width, int height) {
    try {
      if (image.getWidth() == width && image.getHeight() == height) return image;
      BufferedImage letterImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
      Graphics2D graphics2D = letterImg.createGraphics();
      graphics2D.setBackground(Color.WHITE);
      graphics2D.setPaint(Color.WHITE);
      graphics2D.fillRect(0, 0, width, height);
      graphics2D.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      graphics2D.drawImage(image, 0, 0, width, height, null);
      return ImageTools.filterColor(0, 0, letterImg);
    } catch (Exception ex) { }
    return image;
  }
  // the X dispersion
  private int xCount(BufferedImage img, int y) {
    int cnt = 0, width = img.getWidth();
    LOOP: for (int x = 0; x < width; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) {
      for (++cnt, ++x; x < width; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) break;
      if (x < width) continue LOOP;
      return cnt;
    }
    return cnt;
  }
  // the Y dispersion
  private int yCount(BufferedImage img, int x) {
    int cnt = 0, height = img.getHeight();
    LOOP: for (int y = 0; y < height; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) {
      for (++cnt, ++y; y < height; ++y) if ((img.getRGB(x, y) & ALPHA) == 0) break;
      if (y < height) continue LOOP;
      return cnt;
    }
    return cnt;
  }
  // the Alphanumeric Table
  private String[] letters = {
                             "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P",
                             "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z",
                             "b", "d", "f", "h", "k", "l", "t",
                             "1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
                             "!", "§", "$", "%", "&", "/", "(", ")", "'",
                             "\\", "@", "€", "{", "}", "[", "]", "|",

                             "a", "c", "e",
                             "g", "i", "j", "m", "n", "o", "p", "q", "r", "s", "u",
                             "v", "w", "x", "y", "z",

                             "\"", "=", "?", "*", "+", "#", ";",
                             ",", ":", "_", "-", "~",  "°", "^", "<", ">"
                            };
}
And here is the result:
staticHello.png

The letter "r" fails the comparison because of the blurring: one image has to be "resized" to the size of the other image (either the source or a letter from the table) before a comparison can be made, and the resizing blurs the pixels...
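One thing you could try against this blurring is a nearest-neighbour resize, which only copies existing pixels and never mixes colors. Note that this is a different technique from the bilinear resize( ) above and only a sketch (the class NearestResizeSketch is hypothetical). I have not checked whether it rescues the "r", and it tends to produce jagged edges instead.

```java
import java.awt.image.BufferedImage;

// Hypothetical alternative to the bilinear resize: nearest-neighbour, pixel by pixel.
class NearestResizeSketch {
  // every target pixel is a copy of the source pixel it falls onto; no colors are mixed
  static BufferedImage resize(BufferedImage src, int width, int height) {
    int sw = src.getWidth(), sh = src.getHeight();
    BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < height; ++y) {
      int sy = y * sh / height;                 // source row for this target row
      for (int x = 0; x < width; ++x)
        out.setRGB(x, y, src.getRGB(x * sw / width, sy));
    }
    return out;
  }

  public static void main(String[] args) {
    BufferedImage src = new BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB);
    src.setRGB(0, 0, 0xFF000000); // one opaque letter pixel, the rest stays transparent
    BufferedImage big = resize(src, 4, 4);
    for (int y = 0; y < 4; ++y) {
      StringBuilder row = new StringBuilder();
      for (int x = 0; x < 4; ++x) row.append((big.getRGB(x, y) >>> 24) != 0 ? " . " : "   ");
      System.out.println(row);
    }
  }
}
```

Because the pixels stay either fully opaque or fully transparent, the alpha test of compare( ) sees no half-transparent "fog" after the resize.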
 
PART III

The last part of OCR (Optical Character Recognition) in JAVA is a package of 5 APIs and one test application. The package consists of:
  1. The Interface OCR.java
  2. The DynamicOCR.java
  3. The StaticOCR.java
  4. The frame JIOCR.java
  5. The ImageTools.java
As usual I don't give you the full sources, only some pieces of code, so that you can assemble and build the package yourself. As Master Confucius said: "If you let me do it, I will understand it". Trying is the only way to become a superb developer ;)

ImageTools provides the static methods that allow you to focus on the sentence (or character string), to create a Font Image for a character (e.g. A or a), and to filter the sentence out of the image.
Java:
  /**
  @param RGB  int, 6 hex RGB color (e.g. 0xff20cd) to be kept; all other colors become transparent
  @param tolerant  int, 2 hex (e.g. 0x01), the allowed +/- deviation of each color channel
  @param img BufferedImage to be filtered
  @return BufferedImage
  @exception Exception thrown by JAVA
  */
  public static BufferedImage filterColor(int RGB, int tolerant, BufferedImage img) throws Exception {
    int width  = (int) img.getWidth();
    int height = (int) img.getHeight();
    //
    int red   = (RGB >> 16) & 0xFF;
    int green = (RGB >> 8) & 0xFF;
    int blue  =  RGB & 0xFF;
    //
    tolerant  &= 0xFF;
    int hRED   = red + tolerant, lRED = red - tolerant;
    int hGREEN = green + tolerant, lGREEN = green - tolerant;
    int hBLUE  = blue + tolerant, lBLUE = blue - tolerant;
    // black: 0x000000, white:0xffffff
    BufferedImage bImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x) {
      int p = img.getRGB(x,y);
      int alpha =  p & ALPHA;
      red   = (p >> 16) & 0xFF;
      green = (p >> 8) & 0xFF;
      blue  =  p & 0xFF;
      if (hRED >= red && lRED <= red &&
          hGREEN >= green && lGREEN <= green &&
          hBLUE >= blue && lBLUE <= blue) {
        bImg.setRGB(x, y, p);
      } else { // translucent this pixel
        bImg.setRGB(x, y, ARGB); // set Transparent
      }
    }
    return bImg;
  }
  /**
  @param RGB  int, 6 hex RGB color (e.g. 0xff20cd) to be focused
  @param tolerant  int, 2 hex (e.g. 0x01)
  @param imgFile String, input Image File
  @return BufferedImage
  @exception Exception thrown by JAVA
  */
  public static BufferedImage focus(int RGB, int tolerant, String imgFile) throws Exception {
    return narrowing(filterColor(RGB, tolerant, ImageIO.read(new File(imgFile))));
  }
  /**
   @param string    String for the font (or a letter)
   @param fontName  String, e.g. TimesRoman, Courier, etc. (Case Sensitive)
   @param fontAtt   int, font attribute, e.g. Font.BOLD, etc.
   @param size      int, font size in points (e.g. 18)
   @return BufferedImage of the string with the given font
   @exception Exception thrown by JAVA
  */
  public static BufferedImage createFontImage(String string, String fontName, int fontAtt, int size) throws Exception {
    //                                       w        h
    BufferedImage image = new BufferedImage(size, size << 1, BufferedImage.TYPE_INT_ARGB);
    //Font font = new Font(fontName, fontAtt, size);
    Graphics2D g = image.createGraphics();
    // Create a negative background
    //RenderingHints rh = new RenderingHints(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
    //rh.put(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY);
    //g.setRenderingHints(rh);
    g.setFont(new Font(fontName, fontAtt, size));
    g.setColor(Color.white);
    g.fillRect(0, 0, size,  size << 1);
    g.setColor(Color.black);
    g.drawString(string, 0, size);
    return narrowing(filterColor(0, 0, image));
  }
//
  private static BufferedImage narrowing(BufferedImage img) throws Exception {
    int width = img.getWidth();
    int height = img.getHeight();
    int xa = width, ya = height;
    for (int y = 0; y < height; ++y) for (int x = 0; x < width; ++x)
      if ((img.getRGB(x, y) & ALPHA) != 0) {
        if (x < xa) xa = x;
        if (y < ya) ya = y;
    }
    if (xa == width || ya == height) return img;
    //
    int xe = 0, ye = 0; // the lower corner
    for (int x = width-1; x >= 0; --x) for (int y = height-1; y >= 0; --y)
    if ((img.getRGB(x, y) & ALPHA) != 0) {
      if (x > xe) xe = x;
      if (y > ye) ye = y;
    }
    if (xe <= xa || ye <= ya) return img;
    // set some tolerant
    int wi = 1 + xe - xa, he = 1 + ye - ya;
    // focus only on the area with the given RGB
    BufferedImage bImg = new BufferedImage(wi, he, BufferedImage.TYPE_INT_ARGB);
    for (int b = 0; b < he; ++b, ++ya) for (int a = 0, i = xa; a < wi; ++a, ++i)
      bImg.setRGB(a, b, img.getRGB(i, ya));
    return bImg;
  }
  //
  private static int ARGB  = 0x00FFFFFF;
  private static int ALPHA = 0xFF000000;
The Interface OCR.java
Java:
package joeapp.color;
// Joe Nartca (C)
// public, so that JIOCR (outside of the package joeapp.color) can use it
public interface OCR {
  public int ocrLetter( );
}
The Frame JIOCR.java

Java:
import joeapp.color.*;
// Joe Nartca (C)
public class JIOCR {
  /**
  @param img BufferedImage containing the string to be scanned
  @param RGB int, the (Pixel) color (e.g. black = 0, white = 0xFFFFFF) of the letters
  @param dyn boolean, true: dynamic OCR, false: static
  @param font String, the font name for the static OCR (if null, the dynamic OCR is used)
  @return String that is found within the image (can be null if no letter pixel is found)
  @exception Exception thrown by JAVA
  */
  public static String scan(BufferedImage img, int RGB, boolean dyn, String font) throws Exception {
    /*
    Specification of  xy[20]:
      [0]: x
      [1]: y
      [2]: xW (the Letter Width)
      [3]: yH (the Letter Height)
      [4]: gap between 2 letters
      [5]: return found Letter
      [6]: Starting Y value
      [7]: ending Y value
      [8]: starting X
      [9]: ending X
      [10]: offset width
      [11]: offset heigth
      [12]: left Mid x
      [13]: upper Mid Y
      [14]: right Mid x
      [15]: under Mid Y
      [16]: Height between 2 lines
      [17]: last X or xy[2]-1
      [18]: last Y or xy[3]-1
      [19]: actual line height
    */
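    // note: this call needs a focus(int, int, BufferedImage) overload in ImageTools,
    // i.e. narrowing(filterColor(RGB, tolerant, img)); the version shown above takes a file name
    img = ImageTools.focus(RGB, 0, img);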
    int he = img.getHeight();
    int wi = img.getWidth();
    int ALPHA = 0xFF000000;
    //
    int xy[] = new int[20];
    int A, x, y = 0, X, Y;
    StringBuilder sb = new StringBuilder();
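    // set the Starting coordinate x, y (the smallest x and y of all letter pixels)
    xy[8] = wi; xy[6] = he; // start with the maximum, otherwise x < xy[8] is never true
    LA:for (y = 0; y < he; ++y)
      for (x = 0; x < wi; ++x)
      if ((img.getRGB(x, y) & ALPHA) != 0) {
        if (x < xy[8]) xy[8] = x;
        if (y < xy[6]) xy[6] = y;
      }
    if (xy[8] == wi || xy[6] == he) return null;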
    // the lower corner
    LE:for (x = wi-1; x >= 0; --x)
      for (y = he-1; y >= 0; --y)
      if ((img.getRGB(x, y) & ALPHA) != 0) {
        if (x > xy[9]) xy[9] = x;
        if (y > xy[7]) xy[7] = y;
      }
    //
    xy[4] = wi; // get the gap between two letters
    GAP:for (X = xy[8]; X < xy[9]; ++X) for (Y = xy[6]; Y < xy[7]; ++Y)
    if ((img.getRGB(X, Y) & ALPHA) == 0) {
      if (Y > xy[6]) continue GAP;
      for (++Y; Y < xy[7]; ++Y) if ((img.getRGB(X, Y) & ALPHA) != 0) continue GAP;
      IN:for (A = X+1; A < xy[9]; ++A) for (Y = xy[6]; Y < xy[7]; ++Y) if ((img.getRGB(A, Y) & ALPHA) != 0) break IN;
      if (xy[4] > (1+A-X)) xy[4] = 1+A-X; // +1 due to start at 0
      if (A == xy[9]) break GAP; // take the GAP between 2 letters 
    }
    int GAP = xy[4] << 1; // double the gap
    if (xy[8] == wi || xy[6] == he) return sb.toString();
    // 
    OCR ocr; // create the OCR instance
    if (dyn || font == null) ocr = new DynamicOCR(img, xy);  // Dynamic OCR
    else ocr = new StaticOCR(img, xy, font); // Static OCR
    //
    ++xy[9];
    x = xy[8];
    // LOOP a column 
    while (x < xy[9]) {
      y = xy[6]; // starting y
      HEI: for (xy[3] = xy[7], Y = y; Y < xy[7]; ++Y) {
        for (X = xy[8]; X < xy[9]; ++X) if ((img.getRGB(X, Y) & ALPHA) != 0) continue HEI;
        xy[3] = Y; // general height for this line
        break;
      }
      xy[19] = xy[3] - xy[6]; // actual height
      if (x == xy[8]) { // adjust the starting x and y
        COL: for (; x < xy[9]; ++x) for (y = xy[6]; y < xy[3]; ++y)
        if ((img.getRGB(x,y) & ALPHA) != 0) break COL;
        ROW: for (y = xy[6] ; y < xy[3]; ++y) for (X = x; X < xy[9]; ++X)
        if ((img.getRGB(X, y) & ALPHA) != 0) break ROW;
      }
      //
      WID: // the letter Width
      for (X = x; X < xy[9]; ++X) for (Y = xy[6]; Y < xy[3]; ++Y) if ((img.getRGB(X, Y) & ALPHA) == 0) {
        for (Y = xy[6]; Y < xy[3]; ++Y) if ((img.getRGB(X, Y) & ALPHA) != 0) continue WID;
        A = X+1; if (A < xy[9]) for (Y = xy[6]; Y < xy[3]; ++Y) if ((img.getRGB(A, Y) & ALPHA) != 0) continue WID;
        xy[2] = X;
        break WID;
      }
      xy[0] = x; xy[1] = y;
      if (X == xy[9]) xy[2] = xy[9];
      for (X = x; X < xy[2]; ++X) if ((img.getRGB(X, y) & ALPHA) != 0) break;
      if (X == xy[2]) { // adjust the Starting line
        LO: for (Y = y; Y < xy[3]; ++Y) for (X = x; X < xy[2]; ++X)
        if ((img.getRGB(X,Y) & ALPHA) != 0) {
          xy[1] = Y;
          break LO;
        }
      }
      he = xy[3]; // saved the new height
      // cut the empty rows if NOT g/j/p/q/y
      UP: for (Y = xy[3]; Y >= xy[1]; --Y) for (X = xy[0]; X < xy[2]; ++X)
      if ((img.getRGB(X, Y) & ALPHA) != 0) {
        if (Y > xy[1]) xy[3] = Y+1;
        break UP;
      }
      A = xy[2]; // save this position
      //
      xy[10] = xy[2] - xy[0]; // X size
      xy[11] = xy[3] - xy[1]; // Y size
      xy[12] = xy[0]+(xy[10] >> 1); // left mid X
      xy[13] = xy[1]+(xy[11] >> 1); // upper mid Y
      xy[14] = xy[12] + (xy[10]%2); // right mid X
      xy[15] = xy[13] + (xy[11]%2); // under mid Y
      xy[16] = xy[3] - xy[6]; // height between 2 lines
      xy[17] = xy[2] - 1; // last X column
      xy[18] = xy[3] - 1; // last Y row
      //xy[5]  = xy[6]+(xy[11] >> 2); // Upper/Lower
      xy[5]  = xy[6]+3; // Upper/Lower
      sb.append((char) ocr.ocrLetter( ));
      //
      xy[3] = he; // restore the actual height
      NXT: for (X = xy[2]; X < xy[9]; ++X) for (Y = xy[6]; Y < xy[3]; ++Y)
      if ((img.getRGB(X, Y) & ALPHA) != 0) break NXT;
      //
      x = (X >= xy[9])? xy[8]: X; // new x position
      if (x == xy[8]) { // no more letters on this text line: start a new line
        sb.append("\n");
        BEG:for (y = xy[6] = xy[3]; y < xy[7]; ++y) // look for the new starting y
        for (X = x; X < xy[9]; ++X) if ((img.getRGB(X,y) & ALPHA) != 0) {
          xy[6] = y;
          break BEG;
        }
      } else if ((x-A) > GAP) sb.append(" "); // wider than the doubled letter gap: a new word
      if (xy[6] >= xy[7] || (x+1) >= xy[9]) break;
    }
    return sb.toString();
  }
}
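The word spacing in the scan loop above rests on one idea: take the smallest gap between two letters, double it, and treat every wider gap as a space. Here is a minimal sketch of that idea on a single row of "ink" columns instead of a BufferedImage. The class GapSketch and its methods are hypothetical names for illustration only, and the sketch leaves out the +1 adjustment that the original loop applies to the gap.

```java
import java.util.ArrayList;
import java.util.List;

final class GapSketch {
  // Widths of the empty stretches that lie between two inked columns.
  static List<Integer> innerGaps(boolean[] ink) {
    List<Integer> gaps = new ArrayList<>();
    int last = -1; // index of the most recent inked column
    for (int x = 0; x < ink.length; ++x) if (ink[x]) {
      if (last >= 0 && x - last > 1) gaps.add(x - last - 1);
      last = x;
    }
    return gaps;
  }
  // One word, plus one more for every gap wider than twice the smallest gap.
  static int countWords(boolean[] ink) {
    boolean any = false;
    for (boolean b : ink) any |= b;
    if (!any) return 0; // an empty row holds no word
    List<Integer> gaps = innerGaps(ink);
    int min = Integer.MAX_VALUE;
    for (int g : gaps) min = Math.min(min, g);
    int words = 1;
    for (int g : gaps) if (g > 2 * min) ++words;
    return words;
  }
  // '#' marks a column that contains character pixels.
  static boolean[] parse(String row) {
    boolean[] ink = new boolean[row.length()];
    for (int x = 0; x < ink.length; ++x) ink[x] = row.charAt(x) == '#';
    return ink;
  }
  public static void main(String[] args) {
    // gaps of 1, 3 and 1 columns: only the middle one exceeds 2 * 1
    System.out.println(countWords(parse("##.##...##.##"))); // prints 2
  }
}
```

Note the weak spot of this approach: if a line contains only one word, or all gaps are equally wide, no gap can exceed the doubled minimum, so no space is ever emitted.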
 
The DynamicOCR.java
Java:
import java.awt.image.BufferedImage;
import joeapp.color.ImageTools;
// Joe Nartca (C)
public class DynamicOCR implements OCR {
  public DynamicOCR(BufferedImage img, int xy[]) {
    this.img = img;
    this.xy = xy;
  }
  //
  public int ocrLetter( ) {
    // Optical Character Recognition
    // ----------------------------------------------------------------------------
    if (letterA( )) return xy[5];
    if (letterB( )) return xy[5];
    if (letterC( )) return xy[5]; // shared with (
    ....
  }
  private boolean letterA( ) {
    int Yd = lowerY(xy[12]), Xl = leftX(xy[1]), Xr = rightX(xy[1]);
    int Xld = leftX(Yd), Xrd = rightX(Yd), Yu = upperY(xy[0]);
    if (Yd < xy[18] && leftSlant(xy[0], Xl, xy[1]) && rightSlant(Xr, xy[17], xy[1]) && onHorizontal(Xrd, Xld, Yd) &&
        noVertical(Yu, Xld, xy[1]) && noHorizontal(Xl, xy[0], xy[1]) && yCount(xy[12], xy[1]) == 2 &&
        xCount(xy[0], xy[18]) == 2 && Xl > xy[0] && Xr < xy[17]) {
      xy[5] = (int)'A';
      return true;
    }
    int t = upperT(leftX(xy[1]), xy[1]);
    Yu = t > 1? xy[1]+1 : xy[1]; int Xlu = leftX(Yu), Xru = rightX(Yu);
    int Yc = nextUpperY(Xlu, Yu); Yd = lowerY(Xlu); Xld = Xlu; Xrd = rightX(Yd);
    int Xlc = leftX(Yc), Xrc = rightX(Yc), Ye = Yc-1;
    if (xy[5] < xy[1] && onHorizontal(Xru, Xlu, Yu) && onHorizontal(Xrc, Xlc, Yc) && onHorizontal(Xrd, Xld, Yd) &&
        onVertical(Yd, Xlc, Yc) && onVertical(xy[3], Xru, Yu) && noHorizontal(leftX(Ye), xy[0], Ye) &&
        yCount(xy[12], xy[1]) == 3) {
      xy[5] = (int)'a';
      return true;
    }
    return false;
  }
  private boolean letterB( ) {
    ...
  }
  ...
  private int leftMostX( ) {  
    for (int x = xy[0]; x < xy[2]; ++x) for (int y = xy[1]; y < xy[3]; ++y)
    if ((img.getRGB(x, y) & ALPHA) != 0) return x;
    return xy[0];
  }
  private int rightMostX( ) {  
    for (int x = xy[17]; x >= xy[0]; --x) for (int y = xy[1]; y < xy[3]; ++y)
    if ((img.getRGB(x, y) & ALPHA) != 0) return x;
    return xy[17];
  }
  private int upperMostY( ) {  
    for (int y = xy[1]; y < xy[3]; ++y) for (int x = xy[0]; x < xy[2]; ++x)
    if ((img.getRGB(x, y) & ALPHA) != 0) return y;
    return xy[1];
  }
  private int underMostY( ) {  
    for (int y = xy[18]; y >= xy[1]; --y)for (int x = xy[0]; x < xy[2]; ++x)
    if ((img.getRGB(x, y) & ALPHA) != 0) return y;
    return xy[18];
  }
  private int leftX(int y) {
    for (int x = xy[0]; x < xy[2]; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) return x;
    return xy[0];
  }
  private int lastLeftX(int x, int y) {
    for (int X = x; X < xy[2]; ++X) if ((img.getRGB(X, y) & ALPHA) == 0) return (X-1);
    return xy[17];
  }
  private int nextLeftX(int x, int y) {
    for (int X = x; X < xy[2]; ++X) if ((img.getRGB(X, y) & ALPHA) == 0) {
      for (++X; X < xy[2]; ++X) if ((img.getRGB(X, y) & ALPHA) != 0) return X;
    }
    return x;
  }
  private int rightX(int y) {
    for (int x = xy[17]; x >= xy[0]; --x) if ((img.getRGB(x, y) & ALPHA) != 0) return x;
    return xy[17];
  }
  private int lastRightX(int x, int y) {
    for (int X = x; X >= xy[0]; --X) if ((img.getRGB(X, y) & ALPHA) == 0) return (X+1); // last set pixel when scanning leftwards
    return xy[0];
  }
  private int nextRightX(int x, int y) {
    for (int X = x; X >= xy[0]; --X) if ((img.getRGB(X, y) & ALPHA) == 0) {
      for (--X; X >= xy[0]; --X) if ((img.getRGB(X, y) & ALPHA) != 0) return X;
    }
    return x;
  }
  private int upperY(int x) {
    for (int y = xy[1]; y < xy[3]; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) return y;
    return xy[1];
  }
  private int nextUpperY(int x, int y) {
    for (int Y = y; Y < xy[3]; ++Y) if ((img.getRGB(x, Y) & ALPHA) == 0) {
      for (++Y; Y < xy[3]; ++Y) if ((img.getRGB(x, Y) & ALPHA) != 0) return Y;
    }
    return y;
  }
  private int lastUpperY(int x, int y) {
    for (int Y = y+1; Y < xy[3]; ++Y) if ((img.getRGB(x, Y) & ALPHA) == 0) return (Y-1);
    return xy[18];
  }
  private int lowerY(int x) {
    for (int y = xy[18]; y >= xy[1]; --y) if ((img.getRGB(x, y) & ALPHA) != 0) return y;
    return xy[18];
  }
  private int nextLowerY(int x, int y) {
    for (int Y = y; Y >= xy[1]; --Y) if ((img.getRGB(x, Y) & ALPHA) == 0) {
      for (--Y; Y >= xy[1]; --Y) if ((img.getRGB(x, Y) & ALPHA) != 0) return Y;
    }
    return y;
  }
  private int lastLowerY(int x, int y) {
    for (int Y = y-1; Y >= xy[1]; --Y) if ((img.getRGB(x, Y) & ALPHA) == 0) return (Y+1);
    return xy[1];
  }
  private int leftT(int x,  int y) {
    int t = 0;
    for (; x < xy[2]; ++x, ++t) if ((img.getRGB(x, y) & ALPHA) == 0) return t;
    return t;
  }
  private int rightT(int x, int y) {
    int t = 0;
    for (; x >= xy[0]; --x, ++t) if ((img.getRGB(x, y) & ALPHA) == 0) return t;
    return t;
  }
  private int upperT(int x, int y) {
    int t = 0;
    for (; y < xy[3]; ++y, ++t) if ((img.getRGB(x, y) & ALPHA) == 0) return t;
    return t;
  }
  private int lowerT(int x, int y) {
    int t = 0;
    for (; y >= xy[1]; --y, ++t) if ((img.getRGB(x, y) & ALPHA) == 0) return t;
    return t;
  }
  private int countX(int x, int y) {
    int t = 0;
    for (; x < xy[2]; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) ++t;
    return t;
  }
  private int countY(int x, int y) {
    int t = 0;
    for (; y < xy[3]; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) ++t;
    return t;
  }
  private int xCount(int x, int y) {
    int cnt = 0;
    LOOP: for (; x < xy[2]; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) {
      for (++cnt, ++x; x < xy[2]; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) break;
      if (x < xy[2]) continue LOOP;
      return cnt;
    }
    return cnt;
  }
  //
  private int yCount(int x, int y) {
    int cnt = 0;
    LOOP: for (; y < xy[3]; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) {
      for (++cnt, ++y; y < xy[3]; ++y) if ((img.getRGB(x, y) & ALPHA) == 0) break;
      if (y < xy[3]) continue LOOP;
      return cnt;
    }
    return cnt;
  }
  // Upper Segment left and right meet at yU: upward
  private boolean upperSegment(int yU, int y, int xL, int xR) {
    for (int a = xL, b = xR; y >= yU; --y, ++a, --b) {
      if ((img.getRGB(a, y) & ALPHA) != 0 || (img.getRGB(b, y) & ALPHA) != 0) continue; // tolerant +/-1
      if ((a+1) < xR && (img.getRGB(a+1, y) & ALPHA) != 0 || (b-1) > xL && (img.getRGB(b-1, y) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  //under Segment left and right meet at yD: downward
  private boolean underSegment(int yD, int y, int xL, int xR) {
    for (int a = xL, b = xR; y < yD; ++y, ++a, --b) {
      if ((img.getRGB(a, y) & ALPHA) != 0 || (img.getRGB(b, y) & ALPHA) != 0) continue; // tolerant +/-1
      if ((a+1) < xR && (img.getRGB(a+1, y) & ALPHA) != 0 || (b-1) > xL && (img.getRGB(b-1, y) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  // left Segment like the left side of O: forward
  private boolean leftSegment(int xL, int x, int y, int yU, int yD) {
    for (int a = y, b = y; x < xL; ++x, --a, ++b) {
      if ((img.getRGB(x,a) & ALPHA) != 0 || (img.getRGB(x,b) & ALPHA) != 0) continue; // tolerant +/-1
      if ((a-1) > yU && (img.getRGB(x,a-1) & ALPHA) != 0 || (b+1) < yD && (img.getRGB(x,b+1) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  // right Segment like the right side of O: backward
  private boolean rightSegment(int xR, int x, int y, int yU, int yD) {
    for (int a = y, b = y; x > xR; --x, --a, ++b) {
      if ((img.getRGB(x,a) & ALPHA) != 0 || (img.getRGB(x,b) & ALPHA) != 0) continue; // tolerant +/-1
      if ((a-1) > yU && (img.getRGB(x,a-1) & ALPHA) != 0 || (b+1) < yD && (img.getRGB(x,b+1) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  // downwards starting with Xr/Yr
  private boolean leftSlant(int Xl, int Xr, int Yr) {
    while (Xr > Xl && Yr < xy[3]) // with tolerant +1
    if ((img.getRGB(Xr, Yr) & ALPHA) != 0) {
      if (--Xr <= Xl) return true;
      if ((img.getRGB(Xr, Yr) & ALPHA) == 0) {
        Yr = lastLowerY(Xr, lastUpperY(Xr+1, Yr));
        if ((img.getRGB(Xr, Yr) & ALPHA) == 0) ++Yr;
      }
    } else return false;
    return true;
  }
  // upwards starting with Xl/Yl
  private boolean rightSlant(int Xl, int Xr, int Yl) {
    while (Xl < Xr && Yl < xy[3])  // with tolerant +1
    if ((img.getRGB(Xl, Yl) & ALPHA) != 0) {
      if (++Xl >= Xr) return true;
      if ((img.getRGB(Xl, Yl) & ALPHA) == 0) {
        Yl = lastLowerY(Xl, lastUpperY(Xl-1, Yl));
        if ((img.getRGB(Xl, Yl) & ALPHA) == 0) ++Yl;
      }
    } else return false;
    return true;
  }
  // tolerant +/- 1
  private boolean onHorizontal(int xW, int x, int y) {
    int Yu = y > 0? y-1:y+1;
    int Yl = y < xy[18]? y+1:y-1;
    for (; x < xW; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) {
      if ((img.getRGB(x, Yu) & ALPHA) != 0 || (img.getRGB(x, Yl) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  // tolerant +/- 1
  private boolean noHorizontal(int xW, int x, int y) {
    int Yu = y > 0? y-1:y+1;
    int Yl = y < xy[18]? y+1:y-1;
    for (; x < xW; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) {
      if ((img.getRGB(x, Yu) & ALPHA) == 0 || (img.getRGB(x, Yl) & ALPHA) == 0) continue;
      return false;
    }
    return true;
  }
  // tolerant +/- 1
  private boolean onVertical(int yH, int x, int y) {
    int Xl = x > 0? x-1:x;
    int Xr = x < xy[17]? x+1:x-1;
    for (; y < yH; ++y) if ((img.getRGB(x, y) & ALPHA) == 0) {
      if ((img.getRGB(Xl, y) & ALPHA) != 0 || (img.getRGB(Xr, y) & ALPHA) != 0) continue;
      return false;
    }
    return true;
  }
  // tolerant +/- 1
  private boolean noVertical(int yH, int x, int y) {
    int Xl = x > 0? x-1:x;
    int Xr = x < xy[17]? x+1:x-1;
    for (; y < yH; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) {
      if ((img.getRGB(Xl, y) & ALPHA) == 0 || (img.getRGB(Xr, y) & ALPHA) == 0) continue;
      return false;
    }
    return true;
  }
  //
  private int xy[];
  private BufferedImage img;
  private final int ALPHA = 0xFF000000;
}
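The helpers xCount and yCount above count how many separate runs of character pixels a row or a column crosses. letterA, for example, expects two runs on the last row (the two legs of the 'A') and two runs down the middle column (the apex and the crossbar). Here is a minimal standalone sketch of that counting, assuming a boolean grid in place of the alpha test on the BufferedImage. RunCountSketch and its methods are hypothetical names, and unlike the original helpers the sketch always scans the whole row or column.

```java
final class RunCountSketch {
  // Counts the separate runs of set pixels in row y, scanning left to right.
  static int runsInRow(boolean[][] grid, int y) {
    int runs = 0;
    boolean inside = false; // true while we are inside a run
    for (int x = 0; x < grid[y].length; ++x) {
      if (grid[y][x] && !inside) ++runs; // a new run starts here
      inside = grid[y][x];
    }
    return runs;
  }
  // Counts the separate runs of set pixels in column x, scanning top to bottom.
  static int runsInColumn(boolean[][] grid, int x) {
    int runs = 0;
    boolean inside = false;
    for (int y = 0; y < grid.length; ++y) {
      if (grid[y][x] && !inside) ++runs;
      inside = grid[y][x];
    }
    return runs;
  }
  // Turns text rows into a grid: '#' is a character pixel.
  static boolean[][] parse(String... rows) {
    boolean[][] grid = new boolean[rows.length][];
    for (int y = 0; y < rows.length; ++y) {
      grid[y] = new boolean[rows[y].length()];
      for (int x = 0; x < rows[y].length(); ++x) grid[y][x] = rows[y].charAt(x) == '#';
    }
    return grid;
  }
  public static void main(String[] args) {
    boolean[][] a = parse(
      "..#..",
      ".#.#.",
      "#####",
      "#...#",
      "#...#");
    System.out.println(runsInRow(a, 4));    // last row, the two legs: prints 2
    System.out.println(runsInColumn(a, 2)); // middle column, apex and crossbar: prints 2
  }
}
```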
The StaticOCR.java
Java:
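import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import joeapp.color.ImageTools;
// Joe Nartca (C)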
public class StaticOCR implements OCR {
  public StaticOCR(BufferedImage Img, int[] xy, String font) {
    this.xy = xy;
    this.Img = Img;
    this.font = font;
  }
  public int ocrLetter( ) {
    try {
      float matched, total, com, xRatio, yRatio, MAX = 0f, MIN = 0.75f, LOW = 0.55f;
      BufferedImage dImg, aImg, sImg = ImageTools.extractImage(Img, xy[0], xy[1], xy[10], xy[11]);
      int width = sImg.getWidth(), height = sImg.getHeight(), LETTER = (int)' ', dWidth, dHeight, IX, MX;
      if (xy[5] > xy[1]) { // UpperCase/tall characters
        IX = 0;
        MX = indexOf("a");
      } else {
        IX = indexOf("a");
        MX = letters.length;
      }
      boolean slim = (width << 1) < height;
      for (int ix = IX; ix < MX; ++ix) {
        String letter = letters[ix]; // check letter by letter
        dImg = ImageTools.createFontImage(letter, font, Font.BOLD, 30);
        dWidth = dImg.getWidth(); dHeight = dImg.getHeight();
        if (slim ^ ((dWidth << 1) < dHeight)) continue;
        if (dWidth > width || dHeight > height) {
          aImg = sImg;
          dImg = resize(dImg, width, height);
          dWidth = dImg.getWidth(); dHeight = dImg.getHeight();
        } else if (dWidth < width || dHeight < height) {
          aImg = resize(sImg, dWidth, dHeight);
        } else {
          aImg = sImg;
        }
        int a = noVertical(dImg, 0)? 1:0, max = noVertical(dImg, dWidth-1)? dWidth-1:dWidth;
        if (a == 1 || max == (dWidth-1)) { // adjust the BufferedImages?
          aImg = ImageTools.extractImage(aImg, a, 0, max-a, dHeight);
          dImg = ImageTools.extractImage(dImg, a, 0, max-a, dHeight);
          dWidth = aImg.getWidth();
        }
        matched = 0; total = 0; com = 0;
        for (int y = 0; y < dHeight; ++y) for (int x = 0; x < dWidth; ++x) {
          if ((aImg.getRGB(x, y) & ALPHA) == (dImg.getRGB(x, y) & ALPHA)) ++matched;
          if (((aImg.getRGB(x, y) & ALPHA) & (dImg.getRGB(x, y) & ALPHA)) != 0) ++com;
          ++total;
        }
        xRatio = xCount(aImg, dImg); yRatio = yCount(aImg, dImg);
        if ((matched/total) > MIN && (xRatio > LOW && yRatio > LOW || xRatio == 1f || yRatio == 1f)) {
          com /= getPixels(dImg);
          if (com > MAX) {
            LETTER = (int)letter.charAt(0);
            MAX = com;
          }
        }
      }
      return LETTER;
    } catch (Exception ex) {
      ex.printStackTrace();
    }
    return (int)' ';
  }
  // ---------------------------------------------------------------------------
  private BufferedImage resize(BufferedImage image, int width, int height) {
    try {
      BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
      Graphics2D graphics2D = img.createGraphics();
      graphics2D.setBackground(Color.WHITE);
      graphics2D.setPaint(Color.WHITE);
      graphics2D.fillRect(0, 0, width, height);
      graphics2D.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      graphics2D.drawImage(image, 0, 0, width, height, null);
      graphics2D.dispose(); // release the graphics context once the drawing is done
      return ImageTools.filterColor(0, 0, img);
    } catch (Exception ex) { }
    return image;
  }
  private int indexOf(String x) {
    for (int i = 0; i < letters.length; ++i)
      if (letters[i].equals(x)) return i;
    return -1;
  }
  // ---------------------------------------------------------------------------
  private int getPixels(BufferedImage img) {
    int pix = 0;
    for (int x, y = 0, mx = img.getWidth(), my = img.getHeight(); y < my; ++y)
      for (x = 0; x < mx; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) ++pix;
    return pix;
  }
  // ---------------------------------------------------------------------------
  private boolean onHorizontal(BufferedImage img, int y) {
    for (int x = 0, mx = img.getWidth(); x < mx; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) return false;
    return true;
  }
  private boolean onVertical(BufferedImage img, int x) {
    for (int y = 0, my = img.getHeight(); y < my; ++y) if ((img.getRGB(x, y) & ALPHA) == 0) return false;
    return true;
  }
  private boolean noVertical(BufferedImage img, int x) {
    for (int y = 0, my = img.getHeight(); y < my; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) return false;
    return true;
  }
  // ---------------------------------------------------------------------------
  private float xCount(BufferedImage sImg, BufferedImage dImg) {
    float ratio = 0f;
    int width = sImg.getWidth(), height = sImg.getHeight();
    for (int y = 0; y < height; ++y) {
      int x, sCnt = 0, sx = -1, sxe = 0;
      for (x = 0; x < width; ++x) if ((sImg.getRGB(x, y) & ALPHA) != 0) {
        if (sx < 0) sx = x;
        for (++sCnt, ++x; x < width; ++x) if ((sImg.getRGB(x, y) & ALPHA) == 0) break;
        if (sxe == 0) sxe = x;
      }
      int dCnt = 0, dx = -1, dxe = 0;
      for (x = 0; x < width; ++x) if ((dImg.getRGB(x, y) & ALPHA) != 0) {
        if (dx < 0) dx = x;
        for (++dCnt, ++x; x < width; ++x) if ((dImg.getRGB(x, y) & ALPHA) == 0) break;
        if (dxe == 0) dxe = x;
      }
      if (sx >= 0 && dx >= 0 && sCnt == dCnt) ++ratio;
    }
    return ratio/height;
  }
  private float yCount(BufferedImage sImg, BufferedImage dImg) {
    float ratio = 0f;
    int width = sImg.getWidth(), height = sImg.getHeight();
    for (int x = 0; x < width; ++x) {
      int y, sCnt = 0, sy = -1, sye = 0;
      for (y = 0; y < height; ++y) if ((sImg.getRGB(x, y) & ALPHA) != 0) {
        if (sy < 0) sy = y;
        for (++sCnt, ++y; y < height; ++y) if ((sImg.getRGB(x, y) & ALPHA) == 0) break;
        if (sye == 0) sye = y;
      }
      int dCnt = 0, dy = -1, dye = 0;
      for (y = 0; y < height; ++y) if ((dImg.getRGB(x, y) & ALPHA) != 0) {
        if (dy < 0) dy = y;
        for (++dCnt, ++y; y < height; ++y) if ((dImg.getRGB(x, y) & ALPHA) == 0) break;
        if (dye == 0) dye = y;
      }
      if (sy >= 0 && dy >= 0 && sCnt == dCnt) ++ratio;
    }
    return ratio/width;
  }
 
  private int xy[];
  private String font;
  private BufferedImage Img;
  private final int ALPHA = 0xFF000000;
 
  private String[] letters = {
                             "A", "B", "b", "C", "D", "d", "E", "F", "f", "G", "H", "h", "I", "i", "J", "j", "K", "k",
                             "L", "l", "M", "N", "O", "P","Q", "R", "S", "T", "t", "U", "V", "W", "X", "Y", "Z",
                                                         
                             "1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
                            
                             "@", "€", "§", "$", "%", "&", "?", "/", "(", ")", "!", "\\", "{", "}", "[", "]", "|", "#",

                             "a", "c", "e", "g", "m", "n", "o", "p", "q", "r", "s", "u", "v", "w", "x", "y", "z",

                             "=", "*", "+", ";", "'", "\"", ",", "_", "-", "~",  "°", "^", "<", ">"
                            };

}
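The core of the static OCR above is a pixel-by-pixel comparison between the scanned letter and a letter rendered from the chosen font. Two numbers matter: the share of positions where both images agree (matched/total, which must exceed MIN), and the share of the template's pixels that the scanned letter also has (com divided by getPixels(dImg), where the highest value wins). Here is a minimal sketch of these two scores on boolean grids of equal size. TemplateScoreSketch and its methods are hypothetical names; the resizing, the edge trimming and the xCount/yCount checks of the real class are left out.

```java
final class TemplateScoreSketch {
  // Share of positions where both grids agree (both set or both empty).
  static float agreement(boolean[][] candidate, boolean[][] template) {
    int matched = 0, total = 0;
    for (int y = 0; y < template.length; ++y)
      for (int x = 0; x < template[y].length; ++x, ++total)
        if (candidate[y][x] == template[y][x]) ++matched;
    return total == 0 ? 0f : (float) matched / total;
  }
  // Share of the template's set pixels that are also set in the candidate.
  static float coverage(boolean[][] candidate, boolean[][] template) {
    int common = 0, pixels = 0;
    for (int y = 0; y < template.length; ++y)
      for (int x = 0; x < template[y].length; ++x) if (template[y][x]) {
        ++pixels;
        if (candidate[y][x]) ++common;
      }
    return pixels == 0 ? 0f : (float) common / pixels;
  }
  // Turns text rows into a grid: '#' is a character pixel.
  static boolean[][] parse(String... rows) {
    boolean[][] grid = new boolean[rows.length][];
    for (int y = 0; y < rows.length; ++y) {
      grid[y] = new boolean[rows[y].length()];
      for (int x = 0; x < rows[y].length(); ++x) grid[y][x] = rows[y].charAt(x) == '#';
    }
    return grid;
  }
  public static void main(String[] args) {
    boolean[][] scanned  = parse("##", "#.");
    boolean[][] template = parse("##", "##");
    System.out.println(agreement(scanned, template)); // 3 of 4 positions agree: prints 0.75
    System.out.println(coverage(scanned, template));  // 3 of 4 template pixels hit: prints 0.75
  }
}
```

With the thresholds of the real class, this pair would fail narrowly, because matched/total has to be strictly greater than MIN (0.75).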
 
And finally the TestOCR.java

Java:
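import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import javax.swing.*;
import joeapp.color.JIOCR;
import joeapp.color.ImageTools;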
// Joe Nartca
public class TestOCR extends JFrame {
  public TestOCR(String... a) throws Exception {
    if (a.length > 0) imgFile = a[0];
    setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
    JLabel lab = new JLabel("Image File ");
    JTextField txt = new JTextField(imgFile);
    txt.addActionListener(e -> {
      imgFile = txt.getText();
      if (imgFile == null || imgFile.length() == 0) return;
      load( );
    });
    JLabel lChar = new JLabel("CharColor");
    cBox = new JComboBox<>("Black!White".split("!"));
    cBox.addActionListener(e -> {
      txt.grabFocus();
    });
    JLabel lFont = new JLabel("Font");
    fBox = new JComboBox<>(GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames());
    //fBox = new JComboBox<>("Arial!Courier New!Dialog!Lucida Console!Monospaced!SansSerif!Tahoma!Times New Roman!Verdana".
    //                       split("!"));
    fBox.addActionListener(e -> {
      txt.grabFocus();
    });
    JRadioButton sRad = new JRadioButton("static");
    JRadioButton dRad = new JRadioButton("dynamic");
    sRad.setSelected(true);
    sRad.addActionListener(e -> {
      txt.grabFocus();
      dynamic = false;
      fBox.setEnabled(true);
      dRad.setSelected(false);
    });
    dRad.addActionListener(e -> {
      txt.grabFocus();
      dynamic = true;
      fBox.setEnabled(false);
      sRad.setSelected(false);
    });
    result = new JButton("EMPTY");
    result.addActionListener(e -> {
      System.exit(0);
    });
    txta = new JTextArea();
 
    JPanel center = new JPanel();
    JPanel north = new JPanel();
    JPanel south = new JPanel();
    north.add(lab); north.add(txt); north.add(lChar); north.add(cBox); 
    center.add(result); center.add(txta);
    south.add(lFont); south.add(fBox); south.add(sRad); south.add(dRad);
  
    lab.setHorizontalAlignment(JLabel.RIGHT);
    result.setHorizontalAlignment(JLabel.LEFT);
 
    add("North", north);
    add("Center", center);
    add("South", south);
 
    pack();
    setVisible(true);
  }
  private void load( ) {
    try {
      BufferedImage img = ImageIO.read(new File(imgFile));
      if (img == null) {
        result.setText("Invalid file: "+imgFile);
        return;
      }
      txta.selectAll();
      txta.replaceSelection("");
      int color = "Black".equals((String)cBox.getSelectedItem())? 0:0xFFFFFF;
      long beg = System.currentTimeMillis();
      String found = JIOCR.scan(img, color, dynamic, (String)fBox.getSelectedItem());
      double time = (double)(System.currentTimeMillis() - beg)/1000;
      txta.append("Found Text:\n\n"+found+"\n\nTime:"+time+" Sec.");
      result.setText("");
      result.setIcon(new ImageIcon(img));
      pack();
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
  //
  public static void main(String... a) throws Exception {
    UIManager.setLookAndFeel("javax.swing.plaf.nimbus.NimbusLookAndFeel");
    if (a.length > 0 && a[0].indexOf("/") < 0) a[0] = "./images/"+a[0];
    new TestOCR(a);
  }
  private JButton result;
  private JTextArea txta;
  private boolean dynamic = false;
  private JComboBox<String> fBox, cBox;
  private String imgFile = "./images/Mimosas.png";
}
...and here it is:
TestOCR_1.png

TestOCR_2.png

TestOCR_3.png

TestOCR_4.png

The difference in elapsed time between the dynamic and the static run shows you how fast the dynamic OCR is. Instead of calling the DYNAMIC methods letterA, letterB, ... one after another in a chain, you can collect them in an array and invoke them in a loop. Example:

Java:
Class<?> ocr = this.getClass();
Method methods[] = ocr.getDeclaredMethods();
for (Method m : methods) {
  if (m.getName().startsWith("letter")) {
    m.setAccessible(true);
    Boolean b = (Boolean)m.invoke(this); // invoke on this instance, not on the Class object
    if (b) return xy[5];
  }
}
It looks smart, but it comes at the expense of performance, and it is hard to find out which method has caused an exception.
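If you like the loop but not the reflection, one possible alternative is an array of method references. This is only a sketch under my own assumptions, not part of the OCR classes above: LetterDispatchSketch is a hypothetical class, its field code stands in for xy[5], and the String glyph stands in for the image region that the real letterX methods inspect. A stack trace from such a call names the failing letterX method directly, because no reflective wrapper is involved.

```java
import java.util.function.BooleanSupplier;

final class LetterDispatchSketch {
  private int code;           // stands in for xy[5]: the recognized character
  private final String glyph; // hypothetical stand-in for the image region

  LetterDispatchSketch(String glyph) { this.glyph = glyph; }

  private boolean letterA() { return match('A'); }
  private boolean letterB() { return match('B'); }
  private boolean letterC() { return match('C'); }

  // Placeholder for the real shape tests: stores the character on success.
  private boolean match(char c) {
    if (glyph.length() == 1 && glyph.charAt(0) == c) {
      code = c;
      return true;
    }
    return false;
  }

  int ocrLetter() {
    // The array fixes the order of the tests; Class.getDeclaredMethods() does not.
    BooleanSupplier[] tests = { this::letterA, this::letterB, this::letterC };
    for (BooleanSupplier test : tests) if (test.getAsBoolean()) return code;
    return ' '; // nothing matched
  }

  public static void main(String[] args) {
    System.out.println((char) new LetterDispatchSketch("B").ocrLetter()); // prints B
  }
}
```

The fixed order matters here because some shape tests are shared between characters (letterC, for instance, also covers the opening parenthesis), so the first match wins.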
That's all, folks! I hope you've enjoyed this OCR tutorial.
 