Performing OCR on every file as you go, even if you take advantage of the ScanSnap S1500’s first-page OCR feature, makes scanning go really slowly. Even on my super-fast desktop computer, it triples or quadruples my average scanning time. Fortunately, Acrobat’s batch-OCR feature lets you run the time-consuming OCR step while you aren’t at your computer.
In Acrobat 9 Standard, which is what I use with my ScanSnap S1500, this feature is not obvious. I’m told it will be much more powerful and easier to use in Acrobat 10. For now, here’s how to get Acrobat to OCR a whole directory:
Back everything up first, just in case something goes wrong. Then go to Document > OCR Text Recognition > Recognize Text in Multiple Files Using OCR….
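If you don’t already have a backup routine, here is a minimal sketch of one way to copy a folder before running the batch job. It uses only Python’s standard library; the function name `backup_folder` and the example paths are my own placeholders, not anything Acrobat provides.

```python
import shutil
import time
from pathlib import Path


def backup_folder(source: str, backup_root: str) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`.

    Hypothetical helper for illustration; adjust paths to your own setup.
    """
    src = Path(source).resolve()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree creates `dest` (and any missing parents) and raises an
    # error if `dest` already exists, so an old backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

For example, `backup_folder("C:/Clients", "D:/Backups")` (both paths are made up) would leave a dated copy of the client folder on another drive before Acrobat touches anything.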
In the next window, select Add Folders from the drop-down menu.
You can do this one client folder at a time, or just do your entire law firm directory or file server. It may take a moment to add all the files. You may want to skim the list, because text files and other documents may get pulled in: Acrobat thinks you want them converted to PDF. I wish there were a way to select only PDF files, but I haven’t been able to find one yet.
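Since Acrobat offers no PDF-only filter here, it can help to know ahead of time which non-PDF files are sitting in a folder. The sketch below is one way to check, assuming Python is available; `split_by_pdf` is a hypothetical helper name, and it only inspects file extensions, not file contents.

```python
from pathlib import Path


def split_by_pdf(folder: str):
    """Return (pdfs, others): every file under `folder`, split by extension."""
    pdfs, others = [], []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue  # skip subdirectories themselves
        # Compare case-insensitively so "SCAN.PDF" counts as a PDF too.
        if path.suffix.lower() == ".pdf":
            pdfs.append(path)
        else:
            others.append(path)
    return pdfs, others
```

Printing the `others` list before you add a folder gives you a short list of files to look for and remove in Acrobat’s dialog.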
In the next dialog, pick your options.
When you hit OK, Acrobat will start performing OCR on all the files in the folder(s) you selected. This is going to take a long time, so set it to start working before you leave for the night, if not the weekend. How long depends on the size of the job; if you start with a small folder, you will get the idea right away. When you get back, though, you’ll have a whole directory full of searchable PDFs!
This isn’t a perfect solution. I wish Acrobat could be set to watch a directory and automatically OCR every new PDF file, preferably on a schedule. However, it works well enough, and will get you started. Acrobat will skip files with already-recognized text, so you can re-run this job periodically without worrying that Acrobat will re-OCR everything.
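Acrobat 9 won’t watch a folder for you, but a small script can at least tell you which PDFs have shown up since your last batch run, so you know when a re-run is worthwhile. This is only a sketch under my own assumptions: `pdfs_newer_than` is a made-up helper, it relies on file modification times, and it does not start Acrobat or check whether a file already contains recognized text.

```python
import time
from pathlib import Path


def pdfs_newer_than(folder: str, since_epoch: float):
    """List PDFs under `folder` modified after `since_epoch` (seconds since 1970)."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file()
        and path.suffix.lower() == ".pdf"
        and path.stat().st_mtime > since_epoch
    )


def one_week_ago() -> float:
    """Convenience cutoff: the timestamp seven days before now."""
    return time.time() - 7 * 24 * 60 * 60
```

Calling `pdfs_newer_than("C:/Clients", one_week_ago())` (the path is a placeholder) would list the past week’s new scans, which you could then feed to the batch job described above.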