Feature Comparison

Feature Matrix

The following table shows how PyMuPDF compares with other typical solutions.

Feature	PyMuPDF	pikepdf	PyPDF2	pdfrw	pdfplumber / pdfminer
Supports Multiple Document Formats	PDF XPS EPUB MOBI FB2 CBZ SVG TXT Image	PDF	PDF	PDF	PDF
Implementation	Python and C	Python and C++	Python	Python	Python
Render Document Pages	All document types	No rendering	No rendering	No rendering	No rendering
Write Text to PDF Page	See: Page.insert_htmlbox or: Page.insert_textbox or: TextWriter
Supports CJK characters
Extract Text	All document types		PDF only		PDF only
Extract Text as Markdown (.md)	All document types
Extract Tables	All document types				PDF only
Extract Vector Graphics	All document types				Limited
Draw Vector Graphics (PDF)
Based on Existing, Mature Library	MuPDF	QPDF
Automatic Repair of Damaged PDFs
Encrypted PDFs			Limited		Limited
Linearized PDFs
Incremental Updates
Integrates with Jupyter and IPython Notebooks
Joining / Merging PDF with other Document Types	All document types	PDF only	PDF only	PDF only	PDF only
OCR API for Seamless Integration with Tesseract	All document types
Integrated Checkpoint / Restart Feature (PDF)
PDF Optional Content
PDF Embedded Files			Limited		Limited
PDF Redactions
PDF Annotations	Full		Limited
PDF Form Fields	Create, read, update		Limited, no creation
PDF Page Labels
Support Font Sub-Setting

Performance

PyMuPDF's performance is benchmarked on a range of tasks against a fixed test suite of 8 PDF files (7,031 pages in total) containing text and images.

The current results, grouped by task, are as follows:

Copying

This refers to opening a document and then saving it to a new file. This test measures the speed of reading a PDF and re-writing as a new PDF. This process is also at the core of functions like merging / joining multiple documents. The numbers below therefore apply to PDF joining and merging.

The results for all 7,031 pages are:

Tool	Time (seconds)
PyMuPDF	3.05
PDFrw	10.54
PikePDF	33.57
PyPDF2	494.04

Text Extraction

This refers to extracting simple, plain text from every page of the document and storing it in a text file.

The results for all 7,031 pages are:

Tool	Time (seconds)
PyMuPDF	8.01
XPDF	27.42
PyPDF2	101.64
PDFMiner	227.27

Rendering

This refers to making an image (like PNG) from every page of a document at a given DPI resolution. This feature is the basis for displaying a document in a GUI window.

The results for all 7,031 pages are:

Tool	Time (seconds)
PyMuPDF	367.04
XPDF	646
PDF2JPG	851.52

Note

For details on the methodology behind these performance timings, see Performance Comparison Methodology.
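
For readers who want to run comparable measurements on their own files, a timing harness could be sketched as below. This is not the methodology referenced in the note: the function names time_task and rank, the repeats parameter, and the best-of-N approach are all assumptions of this example.

```python
import time


def time_task(task, paths, repeats=1):
    """Run task(path) for every path; return the best total time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            task(path)
        best = min(best, time.perf_counter() - start)
    return best


def rank(timings):
    """Sort a {tool: seconds} mapping fastest first.

    Each entry is (tool, seconds, slowdown factor relative to the fastest).
    """
    fastest = min(timings.values())
    ordered = sorted(timings.items(), key=lambda item: item[1])
    return [(tool, secs, secs / fastest) for tool, secs in ordered]
```

Passing a per-file function (for example one that copies, extracts text from, or renders a document) as task keeps the harness independent of any particular library.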

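License and Copyright

PyMuPDF and MuPDF are now available under both the open-source AGPL and commercial license agreements. To make sure your use complies with the license guidelines, please read the full text of the AGPL license agreement, available in the distribution material (the COPYING file) and here. If you determine that you cannot meet the requirements of the AGPL, please contact Artifex for more information about a commercial license.

Artifex is the exclusive commercial licensing agent for MuPDF.

Artifex, the Artifex logo, MuPDF, and the MuPDF logo are registered trademarks of Artifex Software Inc.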

This documentation covers PyMuPDF v1.24.3 features as of 2024-05-09 00:00:01.

The major and minor versions of PyMuPDF and MuPDF will always be the same. Only the third qualifier (patch level) may deviate from that of MuPDF.

This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at artifex.com or contact Artifex Software Inc., 39 Mesa Street, Suite 108A, San Francisco CA 94129, United States for further information.

This documentation covers all versions up to 1.24.3.