PDF Processing — Orellius

/pdf · Official

Read, create, edit, merge, split, rotate, watermark, encrypt, OCR, and manipulate PDF files using Python libraries and command-line tools.

PDF · Documents · Python · Processing · 2 min read

Quick import: Download the .md file and save it to .claude/commands/ (Claude Code), .cursorrules (Cursor), or paste as a system prompt in ChatGPT, Gemini, or any LLM API.

#What it does

The PDF skill handles all PDF operations -- reading, creating, editing, merging, splitting, rotating, watermarking, encrypting/decrypting, extracting images, OCR on scanned documents, and filling forms. It covers both Python libraries and command-line tools.

#How to use

Activate the skill whenever a .pdf file is mentioned or needs to be produced, including for any PDF manipulation task. Example prompts:

Extract all tables from this PDF into an Excel file

Merge these three PDFs and add a watermark

#Skill instructions

#Tool Selection

| Task | Best Tool |
|------|-----------|
| Merge PDFs | pypdf |
| Split PDFs | pypdf |
| Extract text | pdfplumber |
| Extract tables | pdfplumber |
| Create PDFs | reportlab |
| Command-line merge | qpdf |
| OCR scanned PDFs | pytesseract |
| Fill PDF forms | pdf-lib or pypdf |
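To illustrate the "Merge PDFs" row, here is a minimal sketch of a merge with pypdf. The function name `merge_pdfs` and its parameters are hypothetical, chosen for this example only; the import sits inside the function so the snippet still loads where pypdf is not installed.

```python
def merge_pdfs(input_paths, output_path):
    """Concatenate the pages of several PDFs into one file.

    Returns the number of pages written. Hypothetical helper, not part of pypdf.
    """
    # Lazy import: pypdf is only needed when the function is actually called.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    page_count = 0
    for path in input_paths:
        # Copy every page of each input, in the order the paths were given.
        for page in PdfReader(path).pages:
            writer.add_page(page)
            page_count += 1

    with open(output_path, "wb") as fh:
        writer.write(fh)
    return page_count
```

A call such as `merge_pdfs(["a.pdf", "b.pdf", "c.pdf"], "merged.pdf")` would cover the merge half of the second example prompt; the file names are placeholders.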

#Key Libraries

pypdf: Basic operations (merge, split, rotate, encrypt, watermark, metadata)
pdfplumber: Text and table extraction with layout preservation
reportlab: Creating new PDFs with Canvas or Platypus (multi-page)
pytesseract + pdf2image: OCR for scanned documents
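As a rough sketch of the pdfplumber entry, the snippet below collects every table in a document and turns one table into header-keyed records. `extract_tables` and `table_to_records` are hypothetical helper names, and treating the first row as the header is an assumption that will not hold for every PDF.

```python
def extract_tables(pdf_path):
    """Return all tables found in the PDF, each as a list of rows."""
    # Lazy import so the snippet loads even where pdfplumber is absent.
    import pdfplumber

    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_tables() yields one list of rows per detected table;
            # each row is a list of cell strings (or None for empty cells).
            tables.extend(page.extract_tables())
    return tables


def table_to_records(table):
    """Map each data row to a dict keyed by the first (header) row.

    Assumes the first row really is a header, which is not guaranteed.
    """
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]
```

The resulting records can then be handed to whichever spreadsheet writer is in use to produce the Excel file mentioned in the example prompt.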

#Command-Line Tools

pdftotext: Text extraction with layout preservation
qpdf: Merge, split, rotate, decrypt
pdftk: Alternative for merge, split, rotate operations
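The command-line tools can also be driven from Python. The sketch below builds argument lists for a qpdf merge and a layout-preserving pdftotext extraction, then runs them with `subprocess`. The helper names are hypothetical, and the code assumes both binaries are installed and on `PATH`.

```python
import subprocess


def qpdf_merge_command(inputs, output):
    """Build the qpdf argument list that merges `inputs` into `output`."""
    # --empty starts from an empty document; --pages ... -- lists the sources.
    return ["qpdf", "--empty", "--pages", *inputs, "--", output]


def pdftotext_command(pdf_path, txt_path):
    """Build the pdftotext argument list with layout preservation enabled."""
    return ["pdftotext", "-layout", pdf_path, txt_path]


def run(command):
    """Run a command and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)
```

Passing an argument list rather than a shell string avoids quoting problems with file names that contain spaces.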

#Important Notes

Never use Unicode subscript/superscript characters (such as ₂ or ²) in ReportLab PDFs: the built-in fonts lack these glyphs, so they render as black boxes
Use ReportLab's <sub> and <super> XML tags in Paragraph objects instead
For form filling, consult the dedicated FORMS.md reference
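The subscript/superscript note above can be sketched as follows. `to_reportlab_markup` is a hypothetical helper that rewrites Unicode subscript and superscript digits as `<sub>` and `<super>` tags before the text reaches a `Paragraph`. It only handles digits and assumes the input contains no other markup characters such as `<` or `&`.

```python
import re

# Translation tables from Unicode sub/superscript digits to plain digits.
_SUB = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
_SUPER = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")


def to_reportlab_markup(text):
    """Replace runs of sub/superscript digits with ReportLab XML tags."""
    text = re.sub(
        "[₀-₉]+",
        lambda m: "<sub>" + m.group().translate(_SUB) + "</sub>",
        text,
    )
    text = re.sub(
        "[⁰¹²³⁴-⁹]+",
        lambda m: "<super>" + m.group().translate(_SUPER) + "</super>",
        text,
    )
    return text


def build_pdf(path, lines):
    """Write each line as a Paragraph, converting sub/superscripts first."""
    # Lazy imports so the helper above works without reportlab installed.
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    style = getSampleStyleSheet()["Normal"]
    doc = SimpleDocTemplate(path, pagesize=letter)
    doc.build([Paragraph(to_reportlab_markup(line), style) for line in lines])
```

For example, `to_reportlab_markup("H₂O")` gives `H<sub>2</sub>O`, which ReportLab renders as a true subscript instead of a missing-glyph box.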

This skill is from the Anthropic Skills Repository.

Anthropic·16 Mar, 2026
