Digital Library Center
Smathers Libraries
University of Florida
P.O. Box 117003
Gainesville, FL 32611 USA
P: 352.273.2900
F: 352.846.3702
DLC@uflib.ufl.edu
Workflow in the DLC, Featuring Prime Recognition™ Software: About and Running
Features
- Six OCR engines
- WordScan
- TextBridge
- Recore
- TypeReader
- OmniPage
- FineReader
- Voting / Weighting
- 11 Western Languages
- Fault Tolerant Architecture / Automatic Engine Recovery
- Image Pre-Processing
- 1 to 6 CPUs
- 16 output formats, including .PRO: per-character metadata on location, confidence, etc.
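To illustrate the voting / weighting idea listed above, here is a minimal sketch of per-character weighted voting across several engines. It is not Prime Recognition's actual algorithm, which this page does not describe. The engine names, the weights, and the assumption that all engines return strings of equal length (already aligned character by character) are hypothetical.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Return the character with the highest total weight.

    candidates: list of (character, weight) pairs, one pair per engine.
    Ties go to the character that was seen first.
    """
    totals = defaultdict(float)
    for char, weight in candidates:
        totals[char] += weight
    # max() keeps the first maximal key in insertion order
    return max(totals, key=totals.get)


def vote_text(engine_outputs, weights):
    """Combine equal-length engine outputs position by position.

    engine_outputs: dict of engine name -> recognized text (hypothetical names)
    weights: dict of engine name -> vote weight (hypothetical values)
    """
    lengths = {len(text) for text in engine_outputs.values()}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes pre-aligned, equal-length outputs")
    length = lengths.pop()
    return "".join(
        weighted_vote([(text[i], weights[name]) for name, text in engine_outputs.items()])
        for i in range(length)
    )


if __name__ == "__main__":
    outputs = {"engine_a": "Florida", "engine_b": "F1orida", "engine_c": "Florida"}
    equal_weights = {"engine_a": 1.0, "engine_b": 1.0, "engine_c": 1.0}
    print(vote_text(outputs, equal_weights))
```

With equal weights, two engines outvote the one that misread "l" as "1". A real system would also need to align outputs of different lengths before voting.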
For Developers
- Application Programming Interface (API): 40 documented calls
- Dynamic Link Libraries (DLLs)
- Configurable Initialization files (INIs)
Bundled Software
- PrimeView™: Image Zoning
- Job Server:
- Batch OCR
- Prioritized jobs
- Network aware
- Job file:
Prime Recognition Job File
Version=3.90
1
E:\Prdev\Images\UF70000002n001.tif
E:\Prdev\Templates\UF70000002n001.ptm
- Document Template file:
Prime Recognition Document Template
Version=3.90
0,-1
1,0,1,0,10,0,64,1,0,0
4
0,0,1,999999,0,0,0,0,
1,0,1,999999,0,0,0,0,
2,0,1,999999,0,0,0,0,
3,0,1,999999,0,0,0,0,
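As a rough sketch, a template file like the one above could be read into rows of integers for inspection. This page does not document what the individual fields mean, so the code below only checks the header line, extracts the version, and splits each remaining line on commas. The function name `read_template_rows` is hypothetical.

```python
TEMPLATE_HEADER = "Prime Recognition Document Template"


def read_template_rows(text):
    """Return (version, rows) for a document template given as a string.

    Each row is a list of ints. Field meanings are NOT interpreted here,
    because the source page does not document them.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2 or lines[0] != TEMPLATE_HEADER:
        raise ValueError("not a Prime Recognition document template")
    # "Version=3.90" -> "3.90"
    version = lines[1].partition("=")[2]
    rows = []
    for line in lines[2:]:
        # Some rows end with a trailing comma, which yields an empty field
        rows.append([int(field) for field in line.split(",") if field != ""])
    return version, rows


if __name__ == "__main__":
    sample = """Prime Recognition Document Template
Version=3.90
0,-1
1,0,1,0,10,0,64,1,0,0
4
0,0,1,999999,0,0,0,0,
1,0,1,999999,0,0,0,0,
2,0,1,999999,0,0,0,0,
3,0,1,999999,0,0,0,0,
"""
    version, rows = read_template_rows(sample)
    print(version, len(rows))
```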
Hardware
Dedicated server with 2 CPUs, 67 GB hard disk, 2 GB RAM
Throughput and Accuracy
- Slower (six engines instead of one), but more accurate (over 99%)
- Faster per MB with larger files (fixed per-file processing overhead): 0.7 to 3.4 seconds per MB
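The per-megabyte figures above can be turned into a rough batch estimate. The sketch below simply multiplies a total size by the two ends of the reported range. The function name and the 100 MB example are hypothetical, and actual times would depend on file sizes and hardware.

```python
def estimate_seconds(total_mb, fast_rate=0.7, slow_rate=3.4):
    """Return (best_case, worst_case) processing time in seconds.

    fast_rate and slow_rate are the seconds-per-MB bounds reported above.
    """
    if total_mb < 0:
        raise ValueError("total_mb must be non-negative")
    return total_mb * fast_rate, total_mb * slow_rate


if __name__ == "__main__":
    best, worst = estimate_seconds(100)
    print(f"100 MB: about {best:.0f} to {worst:.0f} seconds")
```

For 100 MB of images this gives roughly 70 to 340 seconds, that is, between about one and six minutes.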
Running PrimeOCR™
- Write a job file to the Job folder:
Prime Recognition Job File
Version=3.90
1
\\Smathersnt19\ScanTech\ScanQC\Complete\UF00016972\00116.tif
\\Smathersnt19\Prdev\Templates\UF70000002n001.ptm
- The Job Server, a Prime Recognition™ program that is always running on the remote server, then:
- reads the locations of the TIFF image and the document template from the job file
- processes the TIFF image according to the template
- outputs the selected file types: PDF+Image and TXT
- leaves the original, archival TIFF unchanged in its folder.
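The first step above, writing a job file to the Job folder, might be scripted along these lines. This is a sketch based only on the example job file on this page: the line layout is inferred from that example, the literal "1" on the third line is copied verbatim because its meaning is not documented here, and the Windows-style line endings, the function names, and the job file name are assumptions.

```python
from pathlib import Path

JOB_HEADER = "Prime Recognition Job File"


def build_job_text(image_path, template_path, version="3.90"):
    """Return job file contents modelled on the example above."""
    lines = [
        JOB_HEADER,
        f"Version={version}",
        "1",  # copied from the example; meaning not documented on this page
        image_path,
        template_path,
    ]
    # CRLF line endings are an assumption (the server runs on Windows paths)
    return "\r\n".join(lines) + "\r\n"


def write_job_file(job_folder, job_name, image_path, template_path):
    """Write a job file into job_folder and return its path.

    job_name, including any extension, is supplied by the caller (hypothetical).
    """
    job_path = Path(job_folder) / job_name
    # newline="" stops Python from translating the line endings a second time
    with open(job_path, "w", newline="") as handle:
        handle.write(build_job_text(image_path, template_path))
    return job_path


if __name__ == "__main__":
    print(build_job_text(
        r"\\Smathersnt19\ScanTech\ScanQC\Complete\UF00016972\00116.tif",
        r"\\Smathersnt19\Prdev\Templates\UF70000002n001.ptm",
    ))
```

Because the image and template locations are UNC paths, raw strings keep the backslashes intact.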
