OCR Thai PDF — Extract Thai Text from Scanned Documents

Scanned Thai documents — official forms, contracts, certificates written in Thai script — can be converted to searchable PDFs with free OCR. Here is how to recognize Thai characters including vowels and tone marks from any scanned PDF.

Step 1 — Open the OCR Tool

Go to pdfeditor.onl/ocr-pdf. No sign-up needed. The Tesseract Thai language model runs in your browser and processes files locally.

Step 2 — Upload the Thai PDF

Upload your scanned Thai document. Thai is written without spaces between words and stacks vowel and tone marks above and below the consonants, so the small marks are easily lost at low resolution. Scan at 300 DPI or higher for reliable recognition.
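To check whether a scan reaches the 300 DPI guideline, the effective resolution can be estimated from the image's pixel width and the physical paper width. The sketch below assumes an A4 page (210 mm wide). The function names and defaults are illustrative and are not part of the OCR tool.

```python
MM_PER_INCH = 25.4


def effective_dpi(pixel_width: int, paper_width_mm: float = 210.0) -> float:
    """Estimate scan resolution from pixel width and physical paper width.

    The 210 mm default assumes an A4 page; pass another width otherwise.
    """
    if pixel_width <= 0 or paper_width_mm <= 0:
        raise ValueError("pixel_width and paper_width_mm must be positive")
    return pixel_width / (paper_width_mm / MM_PER_INCH)


def is_sharp_enough(pixel_width: int, paper_width_mm: float = 210.0,
                    min_dpi: float = 300.0) -> bool:
    """Return True when the estimated resolution reaches min_dpi.

    The estimate is rounded to a whole DPI so that a standard 2480 px wide
    A4 scan (about 299.96 DPI) still counts as 300.
    """
    return round(effective_dpi(pixel_width, paper_width_mm)) >= min_dpi
```

For example, an A4 page scanned 2480 pixels wide passes the check, while the same page at 1240 pixels wide (about 150 DPI) does not and is worth rescanning before OCR.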

Tip: Thai text printed in clear, high-contrast modern typefaces (such as those used in government documents) is recognized much more accurately than decorative or handwriting-style fonts.

Step 3 — Select Thai Language

Choose Thai (ภาษาไทย) from the language selector. This activates the Thai script recognition model, which covers all 44 consonants, the vowel forms, and the 4 tone marks.
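For reference, the character inventory can be expressed with Unicode code points. Unicode places the Thai consonants at U+0E01 to U+0E2E and encodes four tone marks at U+0E48 to U+0E4B. The sketch below is only an illustration of that inventory; `thai_profile` is a hypothetical helper, not something the OCR tool exposes.

```python
# U+0E01..U+0E2E holds 46 letters. Two of them, U+0E24 (ฤ) and U+0E26 (ฦ),
# are traditionally not counted among the 44 consonants, so they are excluded.
CONSONANTS = {chr(cp) for cp in range(0x0E01, 0x0E2F)} - {"\u0E24", "\u0E26"}

# Tone marks: mai ek, mai tho, mai tri, mai chattawa.
TONE_MARKS = {chr(cp) for cp in range(0x0E48, 0x0E4C)}


def thai_profile(text: str) -> dict:
    """Count consonants, tone marks, and other Thai-block characters in text."""
    counts = {"consonants": 0, "tone_marks": 0, "other_thai": 0}
    for ch in text:
        if ch in CONSONANTS:
            counts["consonants"] += 1
        elif ch in TONE_MARKS:
            counts["tone_marks"] += 1
        elif "\u0E00" <= ch <= "\u0E7F":
            # Vowel signs, digits, punctuation and other Thai-block characters.
            counts["other_thai"] += 1
    return counts
```

Running `thai_profile` on a page of recognized text gives a quick sanity check: a page that should be Thai but shows almost no tone marks or vowel signs was probably processed with the wrong language selected.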

Step 4 — Review the Recognized Thai Text

Check the text blocks carefully. Accuracy is generally good for cleanly printed government and business documents. Stacked vowel and tone mark combinations and rare characters are the most likely to be misread and may need manual correction.
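One way to speed up the review is to flag combining marks that do not sit on a consonant, since a vowel or tone mark that follows a space, a Latin letter, or a non-stacking vowel usually points to a misread. The following is a minimal heuristic sketch, assuming the recognized text has been copied out as a plain string; the function names are hypothetical and the rule will not catch every error.

```python
import unicodedata
from typing import List


def is_thai_mark(ch: str) -> bool:
    """True for Thai combining marks (above/below vowels, tone marks, etc.)."""
    return "\u0E00" <= ch <= "\u0E7F" and unicodedata.category(ch) == "Mn"


def is_thai_base(ch: str) -> bool:
    """True for Thai base letters U+0E01..U+0E2E, which can carry a mark."""
    return "\u0E01" <= ch <= "\u0E2E"


def orphaned_marks(text: str) -> List[int]:
    """Return indexes of combining marks that do not follow a base letter.

    A mark directly after another mark is accepted, because a vowel mark and
    a tone mark can legitimately stack on the same consonant.
    """
    flagged = []
    for i, ch in enumerate(text):
        if not is_thai_mark(ch):
            continue
        prev = text[i - 1] if i > 0 else ""
        if not (prev and (is_thai_base(prev) or is_thai_mark(prev))):
            flagged.append(i)
    return flagged
```

Each returned index marks a position to inspect by eye against the scanned page. An empty list does not guarantee the text is correct, only that no mark is obviously detached.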

Step 5 — Download the Searchable Thai PDF

Click Download PDF. The recognized Thai text is embedded in the file as a text layer, so you can search the document in Thai using Ctrl+F in Chrome or Adobe Reader.
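Because Thai has no spaces between words, searching the text layer is a plain substring match rather than a word match. The sketch below illustrates this on text copied out of the PDF. It assumes the text is available as a Python string, and it strips zero-width spaces, which Thai text sometimes contains as invisible word-break hints. The helper names are hypothetical.

```python
import unicodedata
from typing import List

ZERO_WIDTH_SPACE = "\u200b"  # invisible word-break hint sometimes found in Thai text


def normalize(text: str) -> str:
    """Apply NFC and drop zero-width spaces so equal-looking strings compare equal."""
    return unicodedata.normalize("NFC", text).replace(ZERO_WIDTH_SPACE, "")


def find_all(text: str, query: str) -> List[int]:
    """Return start offsets (in the normalized text) of every match of query."""
    haystack, needle = normalize(text), normalize(query)
    if not needle:
        return []
    hits = []
    start = haystack.find(needle)
    while start != -1:
        hits.append(start)
        start = haystack.find(needle, start + 1)
    return hits
```

If a term that is clearly visible on the page returns no matches, the most likely cause is a recognition error in that word rather than a problem with the search itself.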

Frequently Asked Questions

Does it support Thai mixed with English (code-switching)?

Yes. Tesseract handles mixed Thai-English documents. Select Thai as the primary language — the model also recognizes Latin characters embedded in Thai text.
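Readers who run Tesseract on their own machine instead of in the browser can load the Thai and English models together by joining the language codes with a plus sign. The sketch below only builds the argument list for the Tesseract command-line tool. It assumes Tesseract 4 or later is installed with the `tha` and `eng` models and that each page has already been exported as an image, since the command-line tool does not read PDF input. The function name and file paths are hypothetical.

```python
from typing import List, Sequence


def tesseract_pdf_command(image_path: str, output_base: str,
                          languages: Sequence[str] = ("tha", "eng"),
                          dpi: int = 300) -> List[str]:
    """Build a Tesseract CLI argument list that writes a searchable PDF.

    Language codes are joined with "+", Tesseract's syntax for loading
    several models at once. Thai is listed first to mirror the advice of
    treating it as the primary language.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    return [
        "tesseract", image_path, output_base,
        "-l", "+".join(languages),   # e.g. "tha+eng"
        "--dpi", str(dpi),           # resolution hint for the input image
        "pdf",                       # output config: PDF with a text layer
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. With the hypothetical inputs `page1.png` and `page1`, Tesseract would write `page1.pdf`.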

Can I extract Thai text from a government ID or form?

Yes, for extracting the text and making the document searchable. For official identity verification, always use the original document, because OCR output is not an official record.
