SharePoint OCR - Convert Scanned Documents to Searchable PDFs


How do you convert scanned or captured documents into searchable files and migrate them to Microsoft SharePoint? PSI:Capture offers several Optical Character Recognition (OCR) engine options, so your organization can perform full-text conversion or use Zone OCR to focus on specific areas of the page.

Output format options include searchable PDF, text formats, Microsoft Word, HTML and others.

Below is the full OCR feature set for PSI:Capture Enterprise and PSI:Capture for MFPs:

 

Full-text Optical Character Recognition (OCR) Features:


  • Create searchable PDFs within a Microsoft SharePoint Document Library
  • OCR output formats include: PDF, Word, WordPerfect, HTML, Text, and many others
  • Scan paper files for conversion, or import digital images from folders
  • Convert PDF, TIFF, BMP, JPG, PNG and GIF to searchable files
  • Broad range of language support
  • Performance tuning options to choose between recognition speed and accuracy
  • Three OCR engine options for speed and accuracy: Standard, GlyphReader and Tesseract
  • The open-source Tesseract OCR engine can be customized
  • PDF creation engine allows for PDF with hidden text, text PDF and image PDF
  • PDF fields can be populated with index field information


Zone Optical Character Recognition (OCR) Features:


  • Zone OCR Separation - ability to split files based on key terms
  • Create Zone OCR processing templates based on document types to gather data from captured images
  • Automatically create SharePoint Document Libraries from captured text
  • Specify zone types and filters
  • Perform image processing prior to OCR to ensure the best accuracy
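
To make the Zone OCR ideas above concrete, here is a minimal Python sketch of the general technique. It is not PSI:Capture's API or implementation. It assumes an OCR engine has already returned word boxes for each page, and the names Word, Zone, zone_text and split_on_key_term are hypothetical, chosen only for illustration. It shows reading the text inside one zone and splitting a batch of pages wherever a key term appears in that zone.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical types for illustration only; not part of PSI:Capture.
@dataclass
class Word:
    text: str
    x: float  # left edge of the word box, in page units
    y: float  # top edge of the word box
    w: float  # box width
    h: float  # box height


Zone = Tuple[float, float, float, float]  # (left, top, width, height)


def zone_text(words: List[Word], zone: Zone) -> str:
    """Join the words whose box centre lies inside the zone.

    Words are ordered top to bottom, then left to right.
    """
    zx, zy, zw, zh = zone
    inside = [
        wd for wd in words
        if zx <= wd.x + wd.w / 2 <= zx + zw
        and zy <= wd.y + wd.h / 2 <= zy + zh
    ]
    inside.sort(key=lambda wd: (wd.y, wd.x))
    return " ".join(wd.text for wd in inside)


def split_on_key_term(
    pages: List[List[Word]], zone: Zone, key_term: str
) -> List[List[int]]:
    """Group page indexes into documents.

    A new document starts at every page whose zone text contains
    key_term (case-insensitive). The first page always opens a document.
    """
    documents: List[List[int]] = []
    for index, words in enumerate(pages):
        starts_new = key_term.lower() in zone_text(words, zone).lower()
        if starts_new or not documents:
            documents.append([])
        documents[-1].append(index)
    return documents
```

For example, with a header zone across the top of each page and the key term "invoice", every page whose header contains that word would begin a new file, and the pages that follow would stay attached to it until the next match.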

To learn more about SharePoint and OCR, contact our advanced capture Sales Team to find a reseller today.