TesseractDotnet Example
Download source (TesseractDotnetExample.zip)



Using TesseractDotnet is quite simple, although documentation is scarce (as it is for Tesseract itself). For example, to OCR an image that should be treated as a single line of text, the code below works well:

    using System;
    using System.Drawing;
    using System.Windows.Forms; // for Application.Exit()
    using tesseract;

    // ...

    TesseractProcessor processor = new TesseractProcessor();

    // Note the trailing backslash on the data path
    bool succeed = processor.Init(@"..\tessdata\", "eng", 3); // TesseractEngineMode: DEFAULT
    if (!succeed)
    {
        // Deal with error
        Application.Exit();
    }

    processor.SetVariable("tessedit_pageseg_mode", "3"); // TesseractPageSegMode: PSM_SINGLE_LINE

    Image image = Image.FromFile("...");

    // Reset state left over from any previous recognition
    processor.Clear();
    processor.ClearAdaptiveClassifier();

    string result = processor.Apply(image);

    // ...

One point requires particular attention. The first parameter of TesseractProcessor.Init(), dataPath, must end with a slash "/" or a backslash "\"; otherwise initialization fails and Init() returns false. If you ignore the return value, the program carries on until it reaches the call to Apply(), which then throws an exception:

AccessViolationException:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
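
One way to guard against this pitfall is to normalize the path before initializing and to turn a failed initialization into an ordinary managed exception. The sketch below is only an illustration: TessInitGuard, EnsureTrailingSeparator and InitOrThrow are hypothetical names, not part of TesseractDotnet, and the initializer is passed in as a delegate on the assumption that Init() takes (string, string, int) and returns bool, as in the example above.

```csharp
using System;
using System.IO;

// Hypothetical helper; not part of TesseractDotnet.
public static class TessInitGuard
{
    // Returns dataPath unchanged if it already ends with '/' or '\';
    // otherwise appends the platform's directory separator.
    public static string EnsureTrailingSeparator(string dataPath)
    {
        if (string.IsNullOrEmpty(dataPath))
            throw new ArgumentException("dataPath must not be empty.", "dataPath");

        char last = dataPath[dataPath.Length - 1];
        if (last == '/' || last == '\\')
            return dataPath;

        return dataPath + Path.DirectorySeparatorChar;
    }

    // Calls the supplied init delegate with the normalized path and throws
    // if it reports failure, so a failed initialization cannot go unnoticed.
    public static void InitOrThrow(Func<string, string, int, bool> init,
                                   string dataPath, string language, int engineMode)
    {
        if (init == null)
            throw new ArgumentNullException("init");

        string normalized = EnsureTrailingSeparator(dataPath);
        if (!init(normalized, language, engineMode))
            throw new InvalidOperationException(
                "Tesseract initialization failed for data path: " + normalized);
    }
}
```

With the processor from the example above, the call would presumably look like TessInitGuard.InitOrThrow(processor.Init, @"..\tessdata", "eng", 3); a failure then surfaces as an InvalidOperationException at the point of initialization rather than as an AccessViolationException later in Apply().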


1 Comment
Jay
Wed, 28 Dec 2011 23:48 +0800
Ran into the same problem.