Use Acrobat to Create Searchable PDFs of All Your Documents




Performing OCR on every file as you go, even if you take advantage of the ScanSnap S1500’s first-page OCR feature, makes scanning really slow. Even on my super-fast desktop computer, it triples or quadruples my average scanning time. Fortunately, you can use Acrobat’s batch-OCR feature to do the time-consuming text recognition while you aren’t at your computer.

In Acrobat 9 Standard, which is what I use with my ScanSnap S1500, this feature is not obvious. In Acrobat 10, I’m told it will be much more powerful and easier to use. For now, here’s how to get Acrobat to OCR a whole directory:

Back everything up, first, just in case something goes wrong. Then, go to Document > OCR Text Recognition > Recognize Text in Multiple Files Using OCR….
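If you would rather script the backup than copy folders by hand, here is a minimal Python sketch. It is an illustration only: the function name `back_up_folder` and the timestamped folder naming are my own assumptions, not anything Acrobat or ScanSnap provides.

```python
import shutil
import time
from pathlib import Path


def back_up_folder(source, backup_root):
    """Copy a whole folder tree into a new timestamped folder and return its path."""
    source = Path(source)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme: <folder name>-backup-<timestamp>
    destination = Path(backup_root) / f"{source.name}-backup-{stamp}"
    # copytree raises an error if the destination already exists,
    # so an earlier backup is never overwritten.
    shutil.copytree(source, destination)
    return destination
```

Keep the backup location outside the folder you are about to OCR, preferably on a different drive, so the batch job cannot touch the copies.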

In the next window, select Add Folders from the drop-down menu.

You can do this one client folder at a time, or just do your entire law firm directory or file server. It may take a moment to add all the files. You may want to skim the list, because text files and other documents may get pulled in if Acrobat assumes you want them converted to PDF. I wish there were a way to select only PDF files, but I haven’t found one yet.
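One way to see in advance what will be pulled in is to list the folder’s contents by type before adding it. The Python sketch below does only that; it does not change what Acrobat selects, and the function name `split_by_pdf` is a made-up helper for illustration.

```python
from pathlib import Path


def split_by_pdf(folder):
    """Walk a folder tree and return (pdf_files, other_files) as sorted lists of paths."""
    pdfs, others = [], []
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue  # skip sub-folders themselves; their files are still visited
        if path.suffix.lower() == ".pdf":
            pdfs.append(path)
        else:
            others.append(path)
    return sorted(pdfs), sorted(others)
```

If the second list is long, it may be easier to add sub-folders one at a time than to remove the extra files from Acrobat’s list by hand.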

In the next dialog, pick your options.

When you hit OK, Acrobat will start performing OCR on all the files in the folder(s) you selected. This is going to take a long time, so set it to start working before you leave for the night, if not the weekend. How long depends on the size of the job, but if you start with a small folder, you will get the idea right away. When you get back, though, you’ll have a whole directory full of searchable PDFs!

This isn’t a perfect solution. I wish Acrobat could be set to watch a directory and automatically OCR every new PDF file, preferably on a schedule. However, it works well enough, and will get you started. Acrobat will skip files with already-recognized text, so you can re-run this job periodically without worrying that Acrobat will re-OCR everything.
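For the curious, the “watch a directory” idea can be sketched in a few lines of Python. This is only an illustration of polling a folder for new PDFs. It does not drive Acrobat or perform OCR, and the names `find_new_pdfs`, `watch`, and `handle_new_pdf` are hypothetical.

```python
import time
from pathlib import Path


def find_new_pdfs(folder, seen):
    """Return PDFs under folder that are not yet in `seen`, and record them in `seen`."""
    new_files = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".pdf" and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def watch(folder, handle_new_pdf, interval_seconds=300, max_checks=None):
    """Check the folder on a fixed interval and call handle_new_pdf for each new PDF."""
    seen = set()
    checks = 0
    while max_checks is None or checks < max_checks:
        for path in find_new_pdfs(folder, seen):
            handle_new_pdf(path)  # placeholder: whatever should happen to a new file
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval_seconds)
```

Note that the first check treats every PDF already in the folder as new, so `handle_new_pdf` should be something safe to run on existing files, such as adding them to a to-do list for the next batch job.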



  • Wonderful tip, I had been looking for a feature like this in Acrobat for a while! Thanks!

  • Sam,
    Thanks for the write-up. You can schedule batches by using plug-ins like Evermap’s AutoBatch:

    Dave Stromfeld, Acrobat Product Manager
    Adobe Systems

  • Za

    If you are using ScanSnap and Adobe Acrobat Pro, you have a few options to automate the process.

    With ScanSnap, when you scan a file, you can set the system to:
    1. Prompt you for a new file name; and
    2. Automatically recognize text (OCR) in the document.

    For #1, go to the ScanSnap Manager, click on the “Save” tab, and check the box for “Rename file after scanning.”

    For #2, go to the ScanSnap Manager, click on the “File option” tab, select a file format of .pdf, then check the box for “Convert to Searchable PDF.” When the document is scanned, the ABBYY software will immediately convert the document to a searchable .pdf.

    Using #2 above has a limitation. If you want to scan multiple documents, and you scan a large document, the text recognition conversion can be slow. ScanSnap will not allow you to scan more documents until it has finished the text recognition, so after a large document you will be stalled until it is complete.

    Another option is what you describe with Adobe. In addition to the batch process you describe for recognizing text in everything in a folder, Adobe Acrobat Pro offers an “Action Wizard.” In Adobe Acrobat X (10) and XI (11), you can set up an Action within the Action Wizard to automate a process.

    Click on Tools, then Action Wizard, then Create New Action. With Adobe Acrobat, you can set up an action to recognize text for every file in a folder.

    Alternatively, you can set up an action that will recognize text in a document, prompt you for a new file name, and then prompt you for a new location to save the file. The Action Wizard is a great feature if you scan files to a network folder. As you rename and save a file to a new location, you will be emptying the folder, and therefore you will be assured that every document is searchable, in the event that it is misplaced.

    Lastly, if you want to search for a file that is text searchable in Windows 7, click the Windows 7 icon in the bottom left of the screen and in the search box just above the icon type your search term. A search from this box will search all of your files, and is an efficient way to find files on your computer and searchable network drives.

    ABBYY has a Corporate version of their software that will monitor a folder, and if a new file is added, it will perform a text recognition on any newly added files. ABBYY makes the software that is bundled with ScanSnap and that provides the ScanSnap text recognition capability.