Constants and Enumerations¶

Constants and enumerations of MuPDF as implemented by PyMuPDF. Each of the following values is accessible as pymupdf.value.

Constants¶

Base14_Fonts¶

Predefined Python list of the PDF Base 14 fonts.

Type:: list

csRGB¶

Predefined RGB colorspace pymupdf.Colorspace(pymupdf.CS_RGB).

Type:: Colorspace

csGRAY¶

Predefined GRAY colorspace pymupdf.Colorspace(pymupdf.CS_GRAY).

Type:: Colorspace

csCMYK¶

Predefined CMYK colorspace pymupdf.Colorspace(pymupdf.CS_CMYK).

Type:: Colorspace

CS_RGB¶

1 – Type of Colorspace is RGB

Type:: int

CS_GRAY¶

2 – Type of Colorspace is GRAY

Type:: int

CS_CMYK¶

3 – Type of Colorspace is CMYK

Type:: int

mupdf_version¶

'x.xx.x' -- MuPDF version that is being used by PyMuPDF.

Type:: string

mupdf_version_tuple¶

MuPDF version as a tuple of integers, (major, minor, patch).

Type:: tuple

pymupdf_version¶

'x.xx.x' -- PyMuPDF version.

Type:: string

pymupdf_version_tuple¶

PyMuPDF version as a tuple of integers, (major, minor, patch).

Type:: tuple
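The tuple forms are convenient for minimum-version checks, because tuples compare element by element while version strings compare character by character. The following is a minimal sketch under stated assumptions: parse_version and meets_minimum are hypothetical helper names, and the sample version values are made up for illustration. With PyMuPDF installed, pymupdf.pymupdf_version_tuple could be passed to meets_minimum directly.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.26.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(version_tuple, minimum):
    """Return True if version_tuple is at least `minimum` (tuple comparison)."""
    return tuple(version_tuple) >= tuple(minimum)


# Sample value for illustration only (hypothetical); real code would use
# pymupdf.pymupdf_version_tuple instead of parsing a string by hand.
sample = parse_version("1.26.1")
print(meets_minimum(sample, (1, 24, 0)))

# Comparing the raw strings would rank "1.9.0" above "1.26.1";
# comparing the integer tuples gives the intended ordering.
print(parse_version("1.9.0") < parse_version("1.26.1"))
```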

pymupdf_date¶: Disabled (set to None) in 1.26.1.

version¶

(pymupdf_version, mupdf_version, timestamp) -- combined version information where timestamp is the generation point in time formatted as "YYYYMMDDhhmmss".

Type:: tuple

VersionBind¶: Legacy equivalent to pymupdf_version.

VersionFitz¶: Legacy equivalent to mupdf_version.

VersionDate¶: Disabled (set to None) in 1.26.1.

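Document Permissions¶

Code	Permitted action
PDF_PERM_PRINT	Print the document
PDF_PERM_MODIFY	Modify the document's contents
PDF_PERM_COPY	Copy or otherwise extract text and graphics
PDF_PERM_ANNOTATE	Add or modify text annotations and interactive form fields
PDF_PERM_FORM	Fill in forms and sign the document
PDF_PERM_ACCESSIBILITY	Obsolete, always permitted
PDF_PERM_ASSEMBLE	Insert, rotate, or delete pages; manipulate bookmarks and thumbnail images
PDF_PERM_PRINT_HQ	High-quality printing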
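These codes are bit flags, so a permissions integer (such as the value of Document.permissions) can be tested against them with the & operator. The sketch below is an illustration only. The numeric values are an assumption, derived from the 1-based bit positions 3, 4, 5, 6, 9, 10, 11 and 12 in the PDF specification's user access permissions table, and are redefined locally so the snippet runs without PyMuPDF. The helper name allowed_actions is hypothetical. In real code, prefer the pymupdf.PDF_PERM_* constants themselves.

```python
# Local mirrors of the permission bits (assumed values, see the note above).
PERMISSION_BITS = {
    "PDF_PERM_PRINT": 1 << 2,
    "PDF_PERM_MODIFY": 1 << 3,
    "PDF_PERM_COPY": 1 << 4,
    "PDF_PERM_ANNOTATE": 1 << 5,
    "PDF_PERM_FORM": 1 << 8,
    "PDF_PERM_ACCESSIBILITY": 1 << 9,
    "PDF_PERM_ASSEMBLE": 1 << 10,
    "PDF_PERM_PRINT_HQ": 1 << 11,
}


def allowed_actions(permissions):
    """Return the names of all permission flags set in `permissions`."""
    return [name for name, bit in PERMISSION_BITS.items() if permissions & bit]


# A made-up permissions value that allows printing and copying only.
perm = PERMISSION_BITS["PDF_PERM_PRINT"] | PERMISSION_BITS["PDF_PERM_COPY"]
print(allowed_actions(perm))
```

Python integers behave like two's complement numbers of unlimited width under &, so the same test also works when the permissions value is negative.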

PDF Optional Content Codes¶

Code	Meaning
PDF_OC_ON	Set an OCG to ON temporarily
PDF_OC_TOGGLE	Toggle OCG status temporarily
PDF_OC_OFF	Set an OCG to OFF temporarily

PDF Encryption Method Codes¶

Code	Meaning
PDF_ENCRYPT_KEEP	Do not change
PDF_ENCRYPT_NONE	Remove any encryption
PDF_ENCRYPT_RC4_40	RC4 40 bit
PDF_ENCRYPT_RC4_128	RC4 128 bit
PDF_ENCRYPT_AES_128	Advanced Encryption Standard 128 bit
PDF_ENCRYPT_AES_256	Advanced Encryption Standard 256 bit
PDF_ENCRYPT_UNKNOWN	Unknown

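Font File Extensions¶

This table shows the file extensions you should use when saving font file buffers extracted from a PDF. The extension string is returned by Document.get_page_fonts(), Page.get_fonts() and Document.extract_font().

Ext	Description
ttf	TrueType font
pfa	PostScript for ASCII font (various subtypes)
cff	Type1C font (compressed font equivalent to Type1)
cid	Character identifier font (PostScript format)
otf	OpenType font
n/a	Not extractable, e.g. PDF Base 14 fonts, Type 3 fonts and others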
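When saving extracted fonts, the "n/a" entry needs special treatment because there is no buffer worth writing. The following sketch shows one way to handle it. The helper font_filename is hypothetical, and the font names in the example calls are made up. The extension argument is assumed to be one of the strings from the table above.

```python
def font_filename(basename, ext):
    """Build an output file name, or return None if the font is not extractable."""
    if ext == "n/a":
        # Base 14 fonts, Type 3 fonts and similar: nothing to save.
        return None
    return f"{basename}.{ext}"


print(font_filename("ExampleSans", "ttf"))
print(font_filename("Helvetica", "n/a"))
```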

Text Alignment¶

TEXT_ALIGN_LEFT¶: 0 – align left.

TEXT_ALIGN_CENTER¶: 1 – align center.

TEXT_ALIGN_RIGHT¶: 2 – align right.

TEXT_ALIGN_JUSTIFY¶: 3 – align justify.

Font Properties¶

Please note that the following bits are derived from what a font reports about its own properties. This information may not be (and quite often is not) correct.

TEXT_FONT_SUPERSCRIPT¶: 1 -- the character or span is a superscript. This property is computed by MuPDF and not part of any font information.

TEXT_FONT_ITALIC¶: 2 -- the font is italic.

TEXT_FONT_SERIFED¶: 4 -- the font is serifed.

TEXT_FONT_MONOSPACED¶: 8 -- the font is mono-spaced.

TEXT_FONT_BOLD¶: 16 -- the font is bold.
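Because each property occupies its own bit, a combined flags integer (for example the "flags" value of a text span) can be decoded with the & operator. The following is a minimal sketch: the bit values are the ones listed above, redefined locally so the snippet runs without PyMuPDF, and describe_font_flags is a hypothetical helper name.

```python
# Bit values as listed above; real code would use pymupdf.TEXT_FONT_* instead.
FONT_FLAG_NAMES = {
    1: "superscript",
    2: "italic",
    4: "serifed",
    8: "monospaced",
    16: "bold",
}


def describe_font_flags(flags):
    """Return the names of all font property bits set in `flags`."""
    return [name for bit, name in FONT_FLAG_NAMES.items() if flags & bit]


# 20 = 16 (bold) + 4 (serifed)
print(describe_font_flags(20))
```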

Text Extraction Flags¶

Option bits controlling the amount of data that is parsed into a TextPage.

For the PyMuPDF programmer, a combination of these values (built with Python's | operator, or simply with +) is passed as the flags integer, a parameter of all text search and text extraction methods. Each method uses its own default combination. Choose a value that suits your situation. In particular, switch off image extraction unless you really need the images: the impact on performance and memory is significant.
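The bit manipulation described above can be sketched as follows. The constants are redefined locally with the values documented in this section so that the snippet runs without PyMuPDF. In real code the pymupdf.TEXT_* names would be used, and the result would be passed as the flags argument of an extraction method.

```python
# Values as documented in this section (local mirrors for illustration).
TEXT_PRESERVE_LIGATURES = 1
TEXT_PRESERVE_WHITESPACE = 2
TEXT_PRESERVE_IMAGES = 4
TEXT_DEHYPHENATE = 16

# Combine several options with the | operator.
flags = TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_PRESERVE_IMAGES

# Switch off image extraction: clear the bit with "& ~".
flags_no_images = flags & ~TEXT_PRESERVE_IMAGES

# Switch on dehyphenation in addition.
flags_dehyphenate = flags_no_images | TEXT_DEHYPHENATE

print(flags, flags_no_images, flags_dehyphenate)

# "|" is safe to repeat, "+" is not: adding a bit that is already set
# carries into a different bit.
print(flags_no_images | TEXT_PRESERVE_WHITESPACE)
print(flags_no_images + TEXT_PRESERVE_WHITESPACE)
```

The last two lines are the reason to prefer | over +: addition only gives the right result when none of the added bits is already set.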

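TEXT_PRESERVE_LIGATURES¶: 1 – If set, ligatures are passed through to the application in their original form. Otherwise, ligatures are expanded into their constituent parts, e.g. the ligature "ffi" is expanded into the three separate characters f, f and i. Default is "on" in PyMuPDF. MuPDF supports the following 7 ligatures: "ff", "fi", "fl", "ffi", "ffl", "ft", "st".

TEXT_PRESERVE_WHITESPACE¶: 2 – If set, whitespace is passed through unchanged. Otherwise, any type of horizontal whitespace (including horizontal tabs) is replaced with space characters of variable width. Default is "on" in PyMuPDF.

TEXT_PRESERVE_IMAGES¶: 4 – If set, images are stored in the TextPage. This causes (usually large) binary image content to appear in the text extraction output. It applies only to the extraction types "blocks", "dict", "json", "rawdict", "rawjson", "html" and "xhtml", where it is the default. With "blocks", however, only image metadata is returned, not the image itself.

TEXT_INHIBIT_SPACES¶: 8 – If set, MuPDF will not try to add missing space characters where there are large gaps between characters. In PDF, the creator often does not insert spaces to move to the position of the next character, but provides the location address directly. The default in PyMuPDF is "off", so spaces will be generated.

TEXT_DEHYPHENATE¶: 16 – Ignore hyphens at line ends and join with the next line. Used internally by the text search functions, but generally available. If on, text extraction returns joined text lines (or spans) with the trailing hyphen of the first line removed. Two separate spans "first meth-" and "od leads to wrong results" on different lines are joined into one span "first method leads to wrong results" with a correspondingly updated bbox: the characters of the resulting span no longer share the same y-coordinates.

TEXT_PRESERVE_SPANS¶: 32 – Generate a new line for every span. Not used ("off") in PyMuPDF, but available. Every line in "dict", "json", "rawdict" and "rawjson" output will then contain exactly one span.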

TEXT_CLIP, TEXT_MEDIABOX_CLIP: 64 -- Characters entirely outside a page's mediabox or contained in other "clipped" areas will be ignored. This is default in PyMuPDF. (TEXT_MEDIABOX_CLIP is an old alias.)

TEXT_USE_CID_FOR_UNKNOWN_UNICODE¶: 128 -- Use raw character codes instead of U+FFFD. This is the default for text extraction in PyMuPDF. If you want to detect when encoding information is missing or uncertain, switch this flag off and scan for the presence of U+FFFD (= chr(0xfffd)) code points in the resulting text.
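The scan for U+FFFD mentioned above needs nothing beyond plain string handling. The sketch below operates on a made-up sample string. In practice the input would be text extracted with the flag cleared. The helper name replacement_positions is hypothetical.

```python
REPLACEMENT_CHAR = chr(0xFFFD)


def replacement_positions(text):
    """Return the indices of all U+FFFD code points in `text`."""
    return [i for i, ch in enumerate(text) if ch == REPLACEMENT_CHAR]


# Made-up sample: the fourth character could not be mapped to Unicode.
sample = "caf\ufffd au lait"
print(replacement_positions(sample))
print(REPLACEMENT_CHAR in "plain text")
```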

TEXT_COLLECT_STRUCTURE¶: 256 -- Extract or generate the Document structure. Detailed documentation pending.

TEXT_ACCURATE_BBOXES¶

512 -- Ignore metric values of all fonts when computing character boundary boxes, most prominently the ascender and descender values. Instead, follow the drawing commands of each character's glyph and compute their rectangle hull as the bbox. This is the smallest rectangle wrapping all points used for drawing the visual appearance; see the Shape class for background. As a result, character heights will differ from one character to the next. For instance, a (white) space will have a bbox of zero height (because nothing is drawn), in contrast to the non-zero boundary box generated when using font metrics. This option may be useful to cope with failures of getting meaningful boundary boxes, even for fonts containing errors. Its use will slow down text extraction somewhat because of the incurred computational effort.

Note that this has no effect by default - one must also disable the global quad corrections setting with pymupdf.TOOLS.unset_quad_corrections(True).

TEXT_COLLECT_VECTORS¶: 1024 -- Collect vector drawings into the TextPage. These are stored as blocks alongside text and image blocks, depending on other extraction flags. See TextPage.extractBLOCKS() and TextPage.extractDICT() for details. Beyond these two methods, vector graphics extraction is also available for TextPage.extractJSON(), TextPage.extractRAWDICT(), TextPage.extractRAWJSON() and TextPage.extractXML().

TEXT_IGNORE_ACTUALTEXT¶: 2048 -- Ignore built-in differences between text appearing in e.g. PDF viewers versus text stored in the PDF. See the Adobe PDF Reference, page 615 for background. If set, the stored ("replacement") text is ignored in favor of the displayed text.

TEXT_SEGMENT¶: 4096 -- Attempt to segment the page into different regions. Detailed documentation pending.

TEXT_COLLECT_STYLES¶: 32768 -- Request collecting text decoration properties. This includes text underlining and strikeout. Contrary to common belief, these are not font properties but are drawn separately as vector graphics or annotations on top of the text. In addition, this flag bit causes MuPDF to detect "fake bold" text. In many cases, document creators simulate bold text by printing the same text multiple times with slight offsets. If this flag is set, such text is marked as bold in the resulting text spans.

TEXT_LAZY_VECTORS¶: 1048576 -- Delay vector blocks in the extraction slightly to avoid breaking what would otherwise be continuous lines of text.

TEXT_FUZZY_VECTORS¶: 2097152 -- If this option is set, we 'fuzzily' collect rectangular vectors of the same colour together. This enables us to spot where 'pixels' or 'slices' of vectors are used to create the appearance of characters on the page without exploding the storage and processing time requirements.

The following constants represent the default combinations of the above for text extraction and searching:

TEXTFLAGS_TEXT¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_WORDS¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_BLOCKS¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_DICT¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_RAWDICT¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_HTML¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_XHTML¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_PRESERVE_IMAGES | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_XML¶: TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_USE_CID_FOR_UNKNOWN_UNICODE

TEXTFLAGS_SEARCH¶: TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_DEHYPHENATE
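As a worked example, the integer behind some of these combinations can be computed from the bit values documented earlier in this section. This is only a sketch based on the listing above: the constants are redefined locally, has_flag is a hypothetical helper, and the values reported by the installed PyMuPDF version remain authoritative.

```python
# Bit values as documented in this section (local mirrors for illustration).
TEXT_PRESERVE_LIGATURES = 1
TEXT_PRESERVE_WHITESPACE = 2
TEXT_PRESERVE_IMAGES = 4
TEXT_DEHYPHENATE = 16
TEXT_MEDIABOX_CLIP = 64
TEXT_USE_CID_FOR_UNKNOWN_UNICODE = 128

TEXTFLAGS_TEXT = (
    TEXT_PRESERVE_LIGATURES
    | TEXT_PRESERVE_WHITESPACE
    | TEXT_MEDIABOX_CLIP
    | TEXT_USE_CID_FOR_UNKNOWN_UNICODE
)
TEXTFLAGS_DICT = TEXTFLAGS_TEXT | TEXT_PRESERVE_IMAGES
TEXTFLAGS_SEARCH = TEXT_PRESERVE_WHITESPACE | TEXT_MEDIABOX_CLIP | TEXT_DEHYPHENATE


def has_flag(combination, bit):
    """Return True if `bit` is part of the flag combination."""
    return bool(combination & bit)


print(TEXTFLAGS_TEXT, TEXTFLAGS_DICT, TEXTFLAGS_SEARCH)
print(has_flag(TEXTFLAGS_DICT, TEXT_PRESERVE_IMAGES))
print(has_flag(TEXTFLAGS_SEARCH, TEXT_PRESERVE_LIGATURES))
```

A typical adjustment starts from one of the defaults and changes a single bit, for example TEXTFLAGS_DICT & ~TEXT_PRESERVE_IMAGES to keep the "dict" defaults without image content.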

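Link Destination Kinds¶

Possible values of linkDest.kind (link destination kind).

LINK_NONE¶

0 – No destination. Indicates a dummy link.

Type:: int

LINK_GOTO¶

1 – Points to a place in this document.

Type:: int

LINK_URI¶

2 – Points to a URI, typically a resource specified with internet syntax.

PyMuPDF treats any external link that contains a colon and does not start with file: as LINK_URI.

Type:: int

LINK_LAUNCH¶

3 – Launch (open) another file (of any "executable" type).

PyMuPDF treats any external link that starts with file: or does not contain a colon as LINK_LAUNCH.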

Type:: int
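The two rules for external links (LINK_URI versus LINK_LAUNCH) can be written down as a small function. This is a sketch that mirrors the rules as stated in the text, not PyMuPDF's actual implementation. The constants are redefined locally with the documented values, classify_external_link is a hypothetical helper name, and the sample targets are made up.

```python
LINK_URI = 2      # value as documented above
LINK_LAUNCH = 3   # value as documented above


def classify_external_link(target):
    """Apply the documented rule to an external link target string."""
    if ":" in target and not target.startswith("file:"):
        return LINK_URI
    # Starts with "file:" or contains no colon at all.
    return LINK_LAUNCH


for target in ("https://example.com", "mailto:someone@example.com",
               "file:///tmp/report.pdf", "report.docx"):
    print(target, classify_external_link(target))
```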

LINK_NAMED¶

4 – Points to a named location.

Type:: int

LINK_GOTOR¶

5 – Points to a place in another PDF document.

Type:: int

Link Destination Flags¶

Note

The rightmost byte of this integer is a bit field, so test the truth of these bits with the & operator.

LINK_FLAG_L_VALID¶

1 (bit 0) Top left x value is valid

Type:: bool

LINK_FLAG_T_VALID¶

2 (bit 1) Top left y value is valid

Type:: bool

LINK_FLAG_R_VALID¶

4 (bit 2) Bottom right x value is valid

Type:: bool

LINK_FLAG_B_VALID¶

8 (bit 3) Bottom right y value is valid

Type:: bool

LINK_FLAG_FIT_H¶

16 (bit 4) Horizontal fit

Type:: bool

LINK_FLAG_FIT_V¶

32 (bit 5) Vertical fit

Type:: bool

LINK_FLAG_R_IS_ZOOM¶

64 (bit 6) Bottom right x is a zoom value

Type:: bool
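Testing the bits with & can be sketched as follows. The flag values are the ones listed above, redefined locally so the snippet runs without PyMuPDF. The helper name active_link_flags and the sample flags value are made up for illustration.

```python
# Values as listed above (local mirrors of the pymupdf.LINK_FLAG_* constants).
LINK_FLAGS = {
    "LINK_FLAG_L_VALID": 1,
    "LINK_FLAG_T_VALID": 2,
    "LINK_FLAG_R_VALID": 4,
    "LINK_FLAG_B_VALID": 8,
    "LINK_FLAG_FIT_H": 16,
    "LINK_FLAG_FIT_V": 32,
    "LINK_FLAG_R_IS_ZOOM": 64,
}


def active_link_flags(flags):
    """Return the names of all destination flag bits set in `flags`."""
    return [name for name, bit in LINK_FLAGS.items() if flags & bit]


# Made-up value: top left point is valid and the right x value is a zoom factor.
sample_flags = 1 | 2 | 64
print(active_link_flags(sample_flags))
```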

Widget Constants¶

Widget flags (field_flags)¶

Common to all field types:

PDF_FIELD_IS_READ_ONLY 1
PDF_FIELD_IS_REQUIRED 1 << 1
PDF_FIELD_IS_NO_EXPORT 1 << 2

Text widgets:

PDF_TX_FIELD_IS_MULTILINE  1 << 12
PDF_TX_FIELD_IS_PASSWORD  1 << 13
PDF_TX_FIELD_IS_FILE_SELECT  1 << 20
PDF_TX_FIELD_IS_DO_NOT_SPELL_CHECK  1 << 22
PDF_TX_FIELD_IS_DO_NOT_SCROLL  1 << 23
PDF_TX_FIELD_IS_COMB  1 << 24
PDF_TX_FIELD_IS_RICH_TEXT  1 << 25

Button widgets:

PDF_BTN_FIELD_IS_NO_TOGGLE_TO_OFF  1 << 14
PDF_BTN_FIELD_IS_RADIO  1 << 15
PDF_BTN_FIELD_IS_PUSHBUTTON  1 << 16
PDF_BTN_FIELD_IS_RADIOS_IN_UNISON  1 << 25

Choice widgets:

PDF_CH_FIELD_IS_COMBO  1 << 17
PDF_CH_FIELD_IS_EDIT  1 << 18
PDF_CH_FIELD_IS_SORT  1 << 19
PDF_CH_FIELD_IS_MULTI_SELECT  1 << 21
PDF_CH_FIELD_IS_DO_NOT_SPELL_CHECK  1 << 22
PDF_CH_FIELD_IS_COMMIT_ON_SEL_CHANGE  1 << 26
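The field_flags integer is a bit field, so individual flags are tested, set and cleared with bitwise operators. The sketch below redefines a few of the constants locally with the values listed above so that it runs without PyMuPDF. The helper names has_flag, set_flag and clear_flag are hypothetical.

```python
# A few of the flags listed above (local mirrors for illustration).
PDF_FIELD_IS_READ_ONLY = 1
PDF_FIELD_IS_REQUIRED = 1 << 1
PDF_TX_FIELD_IS_MULTILINE = 1 << 12
PDF_TX_FIELD_IS_PASSWORD = 1 << 13


def has_flag(field_flags, flag):
    """Return True if `flag` is set in the field_flags integer."""
    return bool(field_flags & flag)


def set_flag(field_flags, flag):
    """Return field_flags with `flag` switched on."""
    return field_flags | flag


def clear_flag(field_flags, flag):
    """Return field_flags with `flag` switched off."""
    return field_flags & ~flag


# Start with a required multiline text field, then make it read-only
# and single-line.
flags = PDF_FIELD_IS_REQUIRED | PDF_TX_FIELD_IS_MULTILINE
flags = set_flag(flags, PDF_FIELD_IS_READ_ONLY)
flags = clear_flag(flags, PDF_TX_FIELD_IS_MULTILINE)
print(flags, has_flag(flags, PDF_FIELD_IS_READ_ONLY),
      has_flag(flags, PDF_TX_FIELD_IS_MULTILINE))
```

Note that some bit positions are shared between field types in the listing above (for example, 1 << 25 is both PDF_TX_FIELD_IS_RICH_TEXT and PDF_BTN_FIELD_IS_RADIOS_IN_UNISON), so a flag value has to be interpreted in the context of the widget's field type.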

PDF Standard Blend Modes¶

For an explanation, see the Adobe PDF Reference, page 324:

PDF_BM_Color "Color"
PDF_BM_ColorBurn "ColorBurn"
PDF_BM_ColorDodge "ColorDodge"
PDF_BM_Darken "Darken"
PDF_BM_Difference "Difference"
PDF_BM_Exclusion "Exclusion"
PDF_BM_HardLight "HardLight"
PDF_BM_Hue "Hue"
PDF_BM_Lighten "Lighten"
PDF_BM_Luminosity "Luminosity"
PDF_BM_Multiply "Multiply"
PDF_BM_Normal "Normal"
PDF_BM_Overlay "Overlay"
PDF_BM_Saturation "Saturation"
PDF_BM_Screen "Screen"
PDF_BM_SoftLight "Softlight"

Stamp Annotation Icons¶

MuPDF defines the following icons for rubber stamp annotations:

STAMP_Approved 0
STAMP_AsIs 1
STAMP_Confidential 2
STAMP_Departmental 3
STAMP_Experimental 4
STAMP_Expired 5
STAMP_Final 6
STAMP_ForComment 7
STAMP_ForPublicRelease 8
STAMP_NotApproved 9
STAMP_NotForPublicRelease 10
STAMP_Sold 11
STAMP_TopSecret 12
STAMP_Draft 13

