3 August 2020

Workaround for OCR on PDF with Renderable Text (with Bookmarks)

One of the frustrating things about working with Adobe Acrobat is trying to OCR a PDF that contains renderable text. If you run OCR on such a PDF, Acrobat stops at each page that contains renderable text and goes no further. This is often a problem for me because the documents I deal with tend to be scanned documents, with page numbers inserted in the PDF as renderable text.

The usual workaround is to export all the pages of the PDF as TIFF images, re-create the PDF from those images, and then run OCR on the result.

However, today I encountered this issue with the added headache of trying to keep the bookmarks I had already added to the original PDF. The workaround I found on the Adobe forums (here) works, and I am recording it here so I do not have to scavenge for the same instructions every time!

  1. Export the PDF to TIFFs, and merge them into a new PDF.
  2. Save this new PDF as a separate document.
  3. Run OCR on this new PDF and save the new PDF again.
  4. Go back to the original PDF.
  5. Use “Replace Pages…” and select the new, OCR’d PDF.
  6. Specify the full range of page numbers (1 to the end).
  7. Replace the pages. The PDF should now have all the bookmarks and also have been OCR’d.
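
For anyone wondering why the detour through the original PDF is needed, here is a toy Python model of the idea. It is only a sketch of my mental picture, not how Acrobat is implemented: the `Page` and `Pdf` classes and the `replace_pages` function are hypothetical names made up for illustration. The point it shows is that rebuilding the PDF from images loses the bookmarks, while replacing the pages of the original swaps in the OCR'd pages and leaves the bookmark list untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    content: str              # stands in for the scanned page image
    searchable: bool = False  # True once the page has an OCR text layer


@dataclass
class Pdf:
    pages: list
    # Bookmark title -> zero-based index of the page it points at.
    bookmarks: dict = field(default_factory=dict)


def replace_pages(target: Pdf, source: Pdf) -> None:
    """Swap the source's pages into the target over the full page range.

    Only the pages change; the target's bookmarks are left alone.
    """
    # Guard added for this sketch only: with a different page count the
    # bookmarks would end up pointing at the wrong pages.
    if len(source.pages) != len(target.pages):
        raise ValueError("page counts differ")
    target.pages = list(source.pages)


original = Pdf(
    pages=[Page("scan 1"), Page("scan 2"), Page("scan 3")],
    bookmarks={"Introduction": 0, "Appendix": 2},
)

# Steps 1-3: a copy rebuilt from images has no bookmarks, but can be OCR'd.
ocr_copy = Pdf(pages=[Page(p.content, searchable=True) for p in original.pages])

# Steps 4-7: replace every page of the original with the OCR'd pages.
replace_pages(original, ocr_copy)
```

After the last line, `original` holds the searchable pages and still has both of its bookmarks, which is the outcome step 7 describes.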

