Modern CAT tools like Trados Studio and memoQ deal very well with PDFs. Oh, but hang on a minute: that’s “real PDFs”, not scans!
Real PDFs are the result of an electronic conversion from another file format to the Adobe PDF format. That type of file is usually very easy to process with the latest CAT tools. Just load it, ask the software to prepare it, and the Editor part of your CAT tool displays a perfectly workable version of the file.
However, when the PDF is a scan, things are much harder. If the text is handwritten, forget about it: it’s not going to work. You can pay to have it typed for you, or you can type it yourself if you have the time. Otherwise, you have to translate from paper into an editor.
But if the PDF is a scan of a typed text, you can still process it. I use software that has good OCR (Optical Character Recognition) and I find it very helpful. Before I run the scan through it, I tell the software which language the document is in. The software processes the scan and gives me a Word file. That file is not perfect, so I do a little bit of editing and save it. I then process the resulting Word file with my CAT tool.
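For anyone who would like to automate part of that tidying up, here is a minimal sketch in Python. It assumes the OCR output has been saved as plain text rather than Word, and the function name `tidy_ocr_text` is purely illustrative: it is not part of any OCR package or CAT tool. It only handles two common OCR leftovers, words hyphenated across a line break and hard line breaks in the middle of a sentence, both of which tend to upset segmentation in a CAT tool.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Remove common OCR layout noise so sentences are not cut mid-line.

    Illustrative helper only; real OCR output may need more checks.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Rejoin words split by a hyphen at the end of a line:
    # "transla-\ntion" becomes "translation".
    # Caveat: a genuine compound broken at the hyphen ("well-\nknown")
    # is also merged, so the result still needs a human read-through.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for paragraph in paragraphs:
        # Collapse single line breaks and repeated spaces into one space.
        paragraph = re.sub(r"\s+", " ", paragraph).strip()
        if paragraph:
            cleaned.append(paragraph)

    # One blank line between paragraphs, no trailing whitespace.
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "The transla-\ntion memory stores\nevery segment.\n\n\nSecond  paragraph\nhere.\n"
    print(tidy_ocr_text(sample))
```

The cleaned text still needs checking against the scan, since a script like this cannot spot misread characters; it only saves some of the repetitive reflowing.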
I really think it’s worth the effort: my translation memories are there to capitalise on everything I translate. They are like mini treasures that I can tap into when my mind goes blank. And believe me, after nearly 35 years doing this job, it does go blank!
Thank goodness for all my translation memories!