Translating documents in PDF format is a bigger challenge than translating from any other format. Why is that? Undoubtedly because of the file format itself. Of course, some PDF files are editable and can be converted into a text document with one click. Things get more complicated when we have to deal with non-editable PDF files. These files quite often contain graphic elements that also need to be translated. Translating PDF scans also involves a multistage process of formatting the document.
Which documents belong to this category? Scans of graphic files, such as infographics or advertising materials with text embedded in the background that needs to be translated, are examples of non-editable documents. If you need to translate such documents, this article is for you. It explains the process of translating PDF files and ways to lower its cost. How can you avoid the time-consuming document formatting that is responsible for the high price?
What does translating PDF files usually look like?
First, OCR (optical character recognition) needs to be performed by a DTP specialist. In this process, the text is detected automatically. With the appropriate software, the text is separated out, which makes it possible to edit it later and replace it with the translated version.
After the OCR process, the accuracy of the text is checked. Even the most advanced OCR software won’t replace a specialist’s sharp eye. That’s why the text needs to be verified: words with similar spelling could be confused with one another and slip through unnoticed.
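As a rough illustration of how such a check could be supported by software, here is a minimal sketch that flags OCR words missing from a project glossary and suggests similar-looking alternatives. The function name, the glossary and the sample sentence are all hypothetical, and the similarity cutoff is an assumed value; a tool like this only points the specialist at suspicious words and does not replace the manual review.

```python
import difflib


def flag_suspect_words(ocr_text, vocabulary, cutoff=0.7):
    """Return (word, suggestions) pairs for OCR words not found in the vocabulary.

    `vocabulary` is a hypothetical project glossary; `cutoff` is an assumed
    similarity threshold between 0 and 1.
    """
    known = {w.lower() for w in vocabulary}
    candidates = sorted(known)  # sorted for a deterministic result
    suspects = []
    for raw in ocr_text.split():
        # Strip surrounding punctuation and normalize case before the lookup.
        word = raw.strip(".,;:!?()\"'").lower()
        if not word or word in known:
            continue
        # Look for glossary words with a similar spelling.
        suggestions = difflib.get_close_matches(word, candidates, n=3, cutoff=cutoff)
        suspects.append((word, suggestions))
    return suspects


# Hypothetical example: the OCR engine read "modern" as "modem".
glossary = ["the", "modern", "printer", "prints", "clearly"]
print(flag_suspect_words("The modem printer prints clearly.", glossary))
```

Every flagged word still has to be compared with the source image by a person, because a misread word can itself be a valid word.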
Next, a new translation project is created in a CAT tool (computer-assisted translation, not machine translation) and handed over to the translator. Several workflows are possible. In the extended version, translation is followed by a correction by a native speaker, proofreading and, in the final stage, quality assurance as the last verification. A shorter workflow consists of just translation and quality assurance.
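The two workflows can be written down as ordered lists of stages, which makes it easy to see what the shorter variant leaves out. This is only an illustrative sketch; the names `Stage`, `WORKFLOWS` and `skipped_stages` are made up for this example and do not come from any CAT tool.

```python
from enum import Enum


class Stage(Enum):
    TRANSLATION = "translation"
    NATIVE_CORRECTION = "correction by a native speaker"
    PROOFREADING = "proofreading"
    QUALITY_ASSURANCE = "quality assurance"


# Stage order follows the description in the text above.
WORKFLOWS = {
    "extended": [
        Stage.TRANSLATION,
        Stage.NATIVE_CORRECTION,
        Stage.PROOFREADING,
        Stage.QUALITY_ASSURANCE,
    ],
    "short": [Stage.TRANSLATION, Stage.QUALITY_ASSURANCE],
}


def skipped_stages(workflow_name):
    """Return the stages of the extended workflow that the given workflow omits."""
    chosen = WORKFLOWS[workflow_name]
    return [stage for stage in WORKFLOWS["extended"] if stage not in chosen]


for stage in skipped_stages("short"):
    print("Not included in the short workflow:", stage.value)
```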
The last step in the translation process is handing the translation over to a DTP specialist, who works on the graphic files so that the target document looks the same as the source document, only translated. So-called back engineering is used here. In this process, the original graphic layout is preserved in the translated document at its original high quality. Processes compliant with quality certificate ISO 9001:2009 deliver the highest translation quality.
It’s obvious that translating PDF files involves a whole crew of professionals. Thanks to this, we receive a target file of the best quality, but the process takes time and is labor-intensive.
What can you do to make the translation process faster?
If you can, find your editable files. Behind most PDF files there’s an editable document. PDF files are almost always generated from files in other formats, which is possible at the very least with a ‘print to PDF’ option. You can create PDF files from Microsoft Word (DOCX), Microsoft PowerPoint (PPTX) or a web browser (HTML). The PDF format represents the final version of a file, not intended for further editing. It freezes the text and graphics in place, which is why it’s so hard to edit.
When a document is translated directly from a PDF file, it first has to be formatted, which makes the whole process longer and more expensive. To avoid that, just send the translator an editable file. The provider will translate based on the editable document, allowing CAT tools to reproduce the original formatting directly.
Of course, if you need a PDF scan translated, you won’t find an editable version. How complicated the file editing process will be depends on the graphic layout and text formatting.
With editable documents you don’t have to pay for recreating the formatting. What’s more, thanks to CAT tools, the translation can be fitted into the original layout. If you forget to send editable text to your translator and they remind you about it, don’t delete the message; try to figure out where the file might be. You can use it to your advantage!