Publications:Ethiopic Document Image Database for Testing Character Recognition Systems
| Title | Ethiopic Document Image Database for Testing Character Recognition Systems |
|---|---|
| Author | |
| Year | 2006 |
| PublicationType | Report |
| Journal | |
| HostPublication | |
| Conference | |
| DOI | |
| Diva url | http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:408389 |
| Abstract | In this paper we describe the acquisition and content of a large database of Ethiopic documents for testing and evaluating character recognition systems. The Ethiopic Document Image Database (EDIDB) contains documents written in the Amharic and Geez languages. The database was built from a variety of sources, including printouts, books, newspapers, and magazines, and covers various font types, sizes, and styles. Degraded and poor-quality documents were also included to represent real-life conditions. A total of 1,204 pages were scanned at a resolution of 300 dpi and saved as grayscale images in JPEG format. We also describe an evaluation protocol for standardizing the comparison of recognition systems and their results. The database is made available to the research community through http://www.hh.se/staff/josef/. |