Welcome to PyMuPDF

PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
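As a rough illustration of the data extraction mentioned above, the following minimal sketch reads the plain text of every page in a document. It assumes PyMuPDF is installed and importable as `fitz` (the import name used in the 1.23.x series); the helper names `collect_text` and `extract_text` are hypothetical and not part of the library.

```python
def collect_text(doc):
    """Join the plain text of every page in an opened document."""
    # Iterating a PyMuPDF document yields its pages in order;
    # page.get_text() returns the page's plain text.
    return "\n".join(page.get_text() for page in doc)


def extract_text(pdf_path):
    """Open the file at pdf_path and return its text, page by page."""
    # Imported here so that this sketch loads even where PyMuPDF is missing.
    import fitz  # PyMuPDF

    doc = fitz.open(pdf_path)
    try:
        return collect_text(doc)
    finally:
        doc.close()
```

Calling `extract_text("sample.pdf")`, where `sample.pdf` is a placeholder path, would return the document's text as a single string with one newline between pages.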

PyMuPDF is hosted on GitHub and registered on PyPI.

This documentation covers all versions up to 1.23.25.

How to Guide

Find out about PyMuPDF Utilities

The GitHub repository PyMuPDF-Utilities contains a full range of examples, demonstrations and use cases.

Do you need PDF to DOCX conversion?

We recommend the pdf2docx library, which uses PyMuPDF and the python-docx library to provide simple document conversion from PDF to DOCX format.
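A conversion with pdf2docx might look like the sketch below. It assumes pdf2docx is installed and uses its `Converter` class; the helper names `docx_path_for` and `pdf_to_docx` are hypothetical, chosen here only for illustration.

```python
from pathlib import Path


def docx_path_for(pdf_path):
    """Return a path with the same name as pdf_path but a .docx suffix."""
    return Path(pdf_path).with_suffix(".docx")


def pdf_to_docx(pdf_path, docx_path=None):
    """Convert a PDF to DOCX and return the path of the output file."""
    # Imported here so that this sketch loads even where pdf2docx is missing.
    from pdf2docx import Converter

    target = Path(docx_path) if docx_path else docx_path_for(pdf_path)
    cv = Converter(str(pdf_path))
    try:
        cv.convert(str(target))  # converts all pages by default
    finally:
        cv.close()
    return target
```

For example, `pdf_to_docx("report.pdf")`, with `report.pdf` as a placeholder file name, would write `report.docx` next to the input file.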

