OCRmyPDF documentation

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.

PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR to existing PDFs.
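
As a rough illustration of applying image processing and OCR to an existing PDF, the sketch below assembles an `ocrmypdf` command line from Python. The helper name `build_ocrmypdf_command` and the file names `scan.pdf` and `scan_ocr.pdf` are hypothetical and exist only for this example. It assumes the `ocrmypdf` executable is installed and on the `PATH`, and it uses only the `-l` (language) and `--deskew` options. Treat it as a minimal sketch; the Cookbook pages listed below remain the reference for real usage.

```python
import shlex
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    language: Optional[str] = None,
    deskew: bool = False,
) -> List[str]:
    """Assemble an argument list for the ocrmypdf command-line tool.

    Hypothetical helper for illustration; it is not part of OCRmyPDF.
    """
    cmd = ["ocrmypdf"]
    if language:
        # -l selects the OCR language, e.g. "eng"
        cmd += ["-l", language]
    if deskew:
        # --deskew straightens crooked pages before OCR
        cmd.append("--deskew")
    # Input and output paths come last
    cmd += [input_pdf, output_pdf]
    return cmd


command = build_ocrmypdf_command(
    "scan.pdf", "scan_ocr.pdf", language="eng", deskew=True
)
print(shlex.join(command))

# To actually run the command (requires ocrmypdf to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Building the argument list as a Python list, rather than a single shell string, avoids quoting problems when file names contain spaces.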

  • Introduction
  • Release notes
  • PDF optimization
  • Installing additional language packs
  • Installing the JBIG2 encoder

Usage

  • Cookbook
    • Basic examples
    • Image processing
    • Don’t actually OCR my PDF
    • Redo existing OCR
    • Improving OCR quality
    • PDF optimization
  • OCRmyPDF Docker image
    • Installing the Docker image
    • Using the Docker image on the command line
    • Adding languages to the Docker image
    • Executing the test suite
    • Accessing the shell
    • Using the OCRmyPDF web service wrapper
  • Advanced features
    • Control of unpaper
    • Control of OCR options
    • Changing the PDF renderer
    • Return code policy
    • Debugging the intermediate files
  • Batch processing
    • Batch jobs
    • Directory trees
    • Hot (watched) folders
    • macOS Automator
  • PDF security issues
    • PDFs may contain malware
    • How OCRmyPDF processes PDFs
    • Using OCRmyPDF online or as a service
    • Password protection, digital signatures and certification
  • Common error messages
    • Page already has text
    • Input file ‘filename’ is not a valid PDF

Developers

  • Using the OCRmyPDF API
    • Example
  • Contributing guidelines
    • Big changes
    • Code style
    • Tests
    • New Python dependencies
    • New non-Python dependencies
    • Style guide: Is it OCRmyPDF or ocrmypdf?
    • Known ports/packagers

Indices and tables

  • Index
  • Module Index
  • Search Page

© Copyright 2020, James R. Barlow. Licensed under Creative Commons Attribution-ShareAlike 4.0.
