Welcome to PyMuPDF
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
PyMuPDF is hosted on GitHub and registered on PyPI.
This documentation covers all versions up to 1.25.1.
- Opening Files
- Text
- How to Extract all Document Text
- How to Extract Text as Markdown
- How to Extract Key-Value Pairs from a Page
- How to Extract Text from within a Rectangle
- How to Extract Text in Natural Reading Order
- How to Extract Table Content from Documents
- How to Mark Extracted Text
- How to Mark Searched Text
- How to Mark Non-horizontal Text
- How to Analyze Font Characteristics
- How to Insert Text
- How to Extract Text with Color
- Images
- How to Make Images from Document Pages
- How to Increase Image Resolution
- How to Create Partial Pixmaps (Clips)
- How to Zoom a Clip to a GUI Window
- How to Create or Suppress Annotation Images
- How to Extract Images: Non-PDF Documents
- How to Extract Images: PDF Documents
- How to Handle Image Masks
- How to Make one PDF of all your Pictures (or Files)
- How to Create Vector Images
- How to Convert Images
- How to Use Pixmaps: Gluing Images
- How to Use Pixmaps: Making a Fractal
- How to Interface with NumPy
- How to Add Images to a PDF Page
- How to Use Pixmaps: Checking Text Visibility
- Annotations
- Drawing and Graphics
- Stories
- How to Add a Line of Text with Some Formatting
- How to Use Images
- How to Read External HTML and CSS for a Story
- How to Output Database Content with Story Templates
- How to Integrate with Existing PDFs
- How to Make Multi-Columned Layouts and Access Fonts from Package pymupdf-fonts
- How to Make a Layout which Wraps Around a Predefined “no go area” Layout
- How to Output an HTML Table
- How to Create a Simple Grid Layout
- How to Generate a Table of Contents
- How to Display a List from JSON Data
- Using the Alternative Story.write*() functions
- Journalling
- Multiprocessing
- OCR - Optical Character Recognition
- Optional Content Support
- Low-Level Interfaces
- Common Issues and their Solutions