Xml#
New in v1.21.0
This represents an HTML or an XML node. It is a helper class intended to access the DOM (Document Object Model) content of a Story object.
There is no need to ever directly construct an Xml object: after creating a Story, simply take Story.body
– which is an Xml node – and use it to navigate your way through the story’s DOM.
Method / Attribute | Description |
---|---|
Xml.add_bullet_list() | Add a ul tag - bulleted list, context manager. |
Xml.add_codeblock() | Add a pre tag, context manager. |
Xml.add_description_list() | Add a dl tag, context manager. |
Xml.add_division() | Add a div tag (renamed from “section”), context manager. |
Xml.add_header() | Add a header tag (one of h1 to h6), context manager. |
Xml.add_horizontal_line() | Add an hr tag. |
Xml.add_image() | Add an img tag. |
Xml.add_link() | Add an a tag. |
Xml.add_number_list() | Add an ol tag, context manager. |
Xml.add_paragraph() | Add a p tag, context manager. |
Xml.add_span() | Add a span tag, context manager. |
Xml.add_subscript() | Add subscript text (sub tag) - inline element, treated like text. |
Xml.add_superscript() | Add superscript text (sup tag) - inline element, treated like text. |
Xml.add_code() | Add code text (code tag) - inline element, treated like text. |
Xml.add_var() | Add variable text (var tag) - inline element, treated like text. |
Xml.add_samp() | Add sample output text (samp tag) - inline element, treated like text. |
Xml.add_kbd() | Add keyboard input text (kbd tag) - inline element, treated like text. |
Xml.add_text() | Add a text string. Line breaks \n are honored as br tags. |
Xml.append_child() | Append a child node. |
Xml.clone() | Make a copy of this node. |
Xml.create_element() | Make a new node with a given tag name. |
Xml.create_text_node() | Create direct text for the current node. |
Xml.find() | Find a sub-node with given properties. |
Xml.find_next() | Repeat previous “find” with the same criteria. |
Xml.insert_after() | Insert an element after current node. |
Xml.insert_before() | Insert an element before current node. |
Xml.remove() | Remove this node. |
Xml.set_align() | Set the alignment using a CSS style spec. Only works for block-level tags. |
Xml.set_attribute() | Set an arbitrary key to some value (which may be empty). |
Xml.set_bgcolor() | Set the background color. Only works for block-level tags. |
Xml.set_bold() | Set bold on or off or to some string value. |
Xml.set_color() | Set text color. |
Xml.set_columns() | Set the number of columns. Argument may be any valid number or string. |
Xml.set_font() | Set the font-family, e.g. “sans-serif”. |
Xml.set_fontsize() | Set the font size. Either a float or a valid HTML/CSS string. |
Xml.set_id() | Set an id. A check for uniqueness is performed. |
Xml.set_italic() | Set italic on or off or to some string value. |
Xml.set_leading() | Set inter-block text distance (-mupdf-leading), only works on block-level nodes. |
Xml.set_lineheight() | Set height of a line. Float like 1.5, which sets to 1.5 * fontsize. |
Xml.set_margins() | Set the margin(s), float or string with up to 4 values. |
Xml.set_pagebreak_after() | Insert a page break after this node. |
Xml.set_pagebreak_before() | Insert a page break before this node. |
Xml.set_properties() | Set any or all desired properties in one call. |
Xml.add_style() | Set (add) a “style” that is not supported by its own set_ method. |
Xml.add_class() | Set (add) a “class” attribute. |
Xml.set_text_indent() | Set indentation for first textblock line. Only works for block-level nodes. |
Xml.tagname | Either the HTML tag name like p or None if a text node. |
Xml.text | Either the node’s text or None if a tag node. |
Xml.is_text | Check if the node is a text node. |
Xml.first_child | Contains the first node one level below this one (or None). |
Xml.last_child | Contains the last node one level below this one (or None). |
Xml.next | The next node at the same level (or None). |
Xml.previous | The previous node at the same level. |
Xml.root | The top node of the DOM, which hence has the tagname html. |
Class API
- class Xml#
-
- add_header(value)#
Add a header tag (one of h1 to h6), context manager. See headings.
- Parameters:
value (int) – a value 1 - 6.
- add_image(name, width=None, height=None)#
Add an img tag. This causes the inclusion of the named image in the DOM.
- Parameters:
name (str) – the filename of the image. This must be the member name of some entry of the Archive parameter of the Story constructor.
width – if provided, either an absolute (int) value, or a percentage string like “30%”. A percentage value refers to the width of the specified where rectangle in Story.place(). If this value is provided and height is omitted, the image will be included keeping its aspect ratio.
height – if provided, either an absolute (int) value, or a percentage string like “30%”. A percentage value refers to the height of the specified where rectangle in Story.place(). If this value is provided and width is omitted, the image’s aspect ratio will be honored.
- add_link(href, text=None)#
Add an a tag - inline element, treated like text.
- Parameters:
href (str) – the URL target.
text (str) – the text to display. If omitted, the href text is shown instead.
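A minimal sketch of how add_header(), add_text() and add_link() might be combined. The helper name `add_see_also`, its parameters and the literal strings are made up for illustration; `body` is expected to be an Xml node such as Story.body.

```python
def add_see_also(body, href, label=None):
    """Append an h2 header and a paragraph holding one link (hypothetical helper)."""
    # add_header() hands back the new header node, so text can be added to it directly.
    body.add_header(2).add_text("See also")
    with body.add_paragraph() as para:
        para.add_text("Further reading: ")
        # With label left as None, the href itself is displayed as the link text.
        para.add_link(href, label)
    return para
```

Because the link is an inline element, it continues the line started by the preceding add_text() call instead of opening a new block.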
- add_number_list()#
Add an ol tag, context manager.
- add_paragraph()#
Add a p tag, context manager.
- add_subscript(text)#
Add “subscript” text(sub tag) - inline element, treated like text.
- add_superscript(text)#
Add “superscript” text (sup tag) - inline element, treated like text.
- add_code(text)#
Add “code” text (code tag) - inline element, treated like text.
- add_var(text)#
Add “variable” text (var tag) - inline element, treated like text.
- add_samp(text)#
Add “sample output” text (samp tag) - inline element, treated like text.
- add_kbd(text)#
Add “keyboard input” text (kbd tag) - inline element, treated like text.
- add_text(text)#
Add a text string. Line breaks \n are honored as br tags.
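The line-break handling can be used to put several short lines into a single paragraph. The sketch below assumes `body` is an Xml node; the helper name `add_address` is hypothetical.

```python
def add_address(body, lines):
    """Append one paragraph that shows each entry of `lines` on its own line."""
    with body.add_paragraph() as para:
        # Every "\n" in the joined string ends up as a br tag in the DOM.
        para.add_text("\n".join(lines))
    return para
```

A call like `add_address(story.body, ["Jane Doe", "1 Main Street", "Springfield"])` would then yield one p tag containing three text lines.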
- set_align(value)#
Set the text alignment. Only works for block-level tags.
- Parameters:
value – either one of the Text Alignment constants or a valid CSS text-align value.
- set_attribute(key, value=None)#
Set an arbitrary key to some value (which may be empty).
- Parameters:
key (str) – the name of the attribute.
value (str) – the (optional) value of the attribute.
- get_attributes()#
Retrieve all attributes of the current node as a dictionary.
- Returns:
a dictionary with the attributes and their values of the node.
- get_attribute_value(key)#
Get the attribute value of key.
- Parameters:
key (str) – the name of the attribute.
- Returns:
a string with the value of key.
- remove_attribute(key)#
Remove the attribute key from the node.
- Parameters:
key (str) – the name of the attribute.
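The four attribute methods can be combined into a small set / query / remove cycle. This is only a sketch: the attribute name `data-status` and the helper names are invented here, and `node` stands for any Xml node.

```python
STATUS_KEY = "data-status"  # hypothetical attribute name, chosen for this sketch


def mark_draft(node):
    """Flag `node` as a draft and return its current attributes as a dict."""
    node.set_attribute(STATUS_KEY, "draft")
    return node.get_attributes()


def is_draft(node):
    """Report whether `node` carries the draft flag."""
    # Membership is checked first, because the text above does not say
    # what get_attribute_value() does for a key that is not present.
    if STATUS_KEY not in node.get_attributes():
        return False
    return node.get_attribute_value(STATUS_KEY) == "draft"


def clear_draft(node):
    """Remove the draft flag again if it is present."""
    if STATUS_KEY in node.get_attributes():
        node.remove_attribute(STATUS_KEY)
```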
- set_bgcolor(value)#
Set the background color. Only works for block-level tags.
- Parameters:
value – either an RGB value like (255, 0, 0) (for “red”) or a valid background-color value.
- set_bold(value)#
Set bold on or off or to some string value.
- Parameters:
value – True, False or a valid font-weight value.
- set_color(value)#
Set the color of the text following.
- Parameters:
value – either an RGB value like (255, 0, 0) (for “red”) or a valid color value.
- set_columns(value)#
Set the number of columns.
- Parameters:
value – a valid columns value.
Note
Currently ignored - supported in a future MuPDF version.
- set_font(value)#
Set the font-family.
- Parameters:
value (str) – e.g. “sans-serif”.
- set_fontsize(value)#
Set the font size for text following.
- Parameters:
value – a float or a valid font-size value.
- set_id(unqid)#
Set an id. This serves as a unique identification of the node within the DOM. Use it to easily locate the node to inspect or modify it. A check for uniqueness is performed.
- Parameters:
unqid (str) – id string of the node.
- set_italic(value)#
Set italic on or off or to some string value for the text following it.
- Parameters:
value – True, False or some valid font-style value.
- set_leading(value)#
Set inter-block text distance (-mupdf-leading), only works on block-level nodes.
- Parameters:
value (float) – the distance in points to the previous block.
- set_lineheight(value)#
Set height of a line.
- Parameters:
value – a float like 1.5 (which sets to 1.5 * fontsize), or some valid line-height value.
- set_margins(value)#
Set the margin(s).
- Parameters:
value – float or string with up to 4 values. See CSS documentation.
- set_pagebreak_after()#
Insert a page break after this node.
- set_pagebreak_before()#
Insert a page break before this node.
- set_properties(align=None, bgcolor=None, bold=None, color=None, columns=None, font=None, fontsize=None, indent=None, italic=None, leading=None, lineheight=None, margins=None, pagebreak_after=False, pagebreak_before=False, unqid=None, cls=None)#
Set any or all desired properties in one call. Argument values have the same meaning as in the corresponding set_ methods.
Note
The properties set by this method are directly attached to the node, whereas every set_ method generates a new span below the current node that has the respective property. So to e.g. “globally” set some property for the body, this method must be used.
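As an illustration of the note above, document-wide defaults could be attached to the body like this. The helper name `apply_defaults` and the chosen values are assumptions for the sake of the example; only keyword names from the signature above are used.

```python
def apply_defaults(body):
    """Attach document-wide text defaults directly to `body` (hypothetical helper)."""
    # Unlike set_font() and friends, set_properties() does not open a new span:
    # the values are attached to the node itself and so apply to everything below it.
    body.set_properties(font="sans-serif", fontsize=11, lineheight=1.4)
    return body
```

Calling `body.set_font("sans-serif")` instead would only affect text added to the span that this call creates.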
- add_style(value)#
Set (add) some style attribute not supported by its own set_ method.
- Parameters:
value (str) – any valid CSS style value.
- add_class(value)#
Set (add) some “class” attribute.
- Parameters:
value (str) – the name of the class. Must have been defined in either the HTML or the CSS source of the DOM.
- set_text_indent(value)#
Set indentation for the first textblock line. Only works for block-level nodes.
- Parameters:
value – a valid text-indent value. Please note that negative values do not work.
- append_child(node)#
Append a child node. This is a low-level method used by other methods like Xml.add_paragraph().
- Parameters:
node – the Xml node to append.
- create_text_node(text)#
Create direct text for the current node.
- Parameters:
text (str) – the text to append.
- Return type:
Xml
- Returns:
the created element.
- create_element(tag)#
Create a new node with a given tag. This is a low-level method used by other methods like Xml.add_paragraph().
- Parameters:
tag (str) – the element tag.
- Return type:
Xml
- Returns:
the created element. To actually bind it to the DOM, use Xml.append_child().
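The low-level calls can be combined by hand for tags that have no dedicated add_ method. A minimal sketch, assuming `parent` is an Xml node that already belongs to the DOM; the helper name `append_element` is hypothetical.

```python
def append_element(parent, tag, text=None):
    """Create a `tag` node, optionally give it text, and bind it below `parent`."""
    elem = parent.create_element(tag)  # exists, but is not yet part of the DOM
    if text is not None:
        # A text node is created the same way and must be appended as well.
        elem.append_child(parent.create_text_node(text))
    parent.append_child(elem)  # this step binds the new node to the DOM
    return elem
```

For example, `append_element(para, "em", "emphasized")` would add an em tag with one text child under `para`, provided the renderer supports that tag.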
- insert_before(elem)#
Insert the given element elem before this node.
- Parameters:
elem – some Xml element.
- insert_after(elem)#
Insert the given element elem after this node.
- Parameters:
elem – some Xml element.
- clone()#
Make a copy of this node, which then may be appended (using Xml.append_child()) or inserted (using one of Xml.insert_before(), Xml.insert_after()) in this DOM.
- Returns:
the clone (Xml) of the current node.
- remove()#
Remove this node from the DOM.
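One common use of clone() and remove() is to treat a node from the HTML source as a template. The following sketch assumes that the template contains sub-nodes with id attributes and locates them with Xml.find(); the function name, the `records` structure and the id values are hypothetical.

```python
def fill_template(template, parent, records):
    """Append one filled copy of `template` to `parent` per record, then drop the template."""
    for record in records:
        item = template.clone()
        for field_id, value in record.items():
            # Look inside the copy for the node whose id attribute equals field_id.
            target = item.find(None, "id", field_id)
            if target is not None:
                target.add_text(value)
        parent.append_child(item)
    # The unfilled original should not show up in the output.
    template.remove()
```

Here `records` is assumed to be a list of dictionaries that map an id value to the text to insert, e.g. `[{"title": "First"}, {"title": "Second"}]`.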
- debug()#
For debugging purposes, print this node’s structure in a simplified form.
- find(tag, att, match)#
Under the current node, find the first node with the given tag, attribute att and value match.
- Parameters:
tag (str) – restrict search to this tag. May be None for unrestricted searches.
att (str) – check this attribute. May be None.
match (str) – the desired attribute value to match. May be None.
- Return type:
Xml.
- Returns:
None if nothing found, otherwise the first matching node.
- find_next(tag, att, match)#
Continue a previous Xml.find() (or find_next()) with the same values.
- Return type:
Xml.
- Returns:
None if no further match is found, otherwise the next matching node.
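Together, find() and find_next() allow iterating over all matches. The generator below is a sketch that assumes find_next() is called on the node returned by the previous search; the name `find_all` is hypothetical and not part of the class.

```python
def find_all(node, tag=None, att=None, match=None):
    """Yield every node under `node` that matches the given criteria."""
    hit = node.find(tag, att, match)
    while hit is not None:
        yield hit
        # Continue from the last match, using the same criteria.
        hit = hit.find_next(tag, att, match)
```

A loop such as `for para in find_all(story.body, "p"): para.set_properties(indent=15)` would then touch every matching paragraph in turn.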
- tagname#
Either the HTML tag name like p or None if a text node.
- text#
Either the node’s text or None if a tag node.
- is_text#
Check if the node is a text node.
- first_child#
Contains the first node one level below this one (or None).
- last_child#
Contains the last node one level below this one (or None).
- next#
The next node at the same level (or None).
- previous#
The previous node at the same level.
- root#
The top node of the DOM, which hence has the tagname html.
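The navigation attributes are sufficient to walk a whole DOM. The generator below is a minimal sketch of a depth-first traversal; the name `walk` is hypothetical, and `node` is assumed to be any Xml node, for example Story.body.

```python
def walk(node, depth=0):
    """Yield (depth, label) for `node` and every node below it, depth-first."""
    # Text nodes have no tag name, so their text is shown instead.
    label = repr(node.text) if node.is_text else f"<{node.tagname}>"
    yield depth, label
    child = node.first_child
    while child is not None:
        yield from walk(child, depth + 1)
        child = child.next  # next sibling, or None after the last one
```

Printing `"  " * depth + label` for each yielded pair gives a simple indented outline, similar in spirit to what debug() is described to do.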
Setting Text properties#
In HTML, tags can be nested such that the innermost text inherits the properties of every tag that encloses it. For example <p><b>some bold text<i>this is bold and italic</i></b>regular text</p>.
To achieve the same effect, methods like Xml.set_bold() and Xml.set_italic() each open a temporary span with the desired property underneath the current node.
In addition, these methods return their parent node, so they can be concatenated with each other.
Context Manager support#
The standard way to add nodes to a DOM is this:
body = story.body
para = body.add_paragraph() # add a paragraph
para.set_bold() # text that follows will be bold
para.add_text("some bold text")
para.set_italic() # text that follows will additionally be italic
para.add_text("this is bold and italic")
para.set_italic(False).set_bold(False) # all following text will be regular
para.add_text("regular text")
Methods that are flagged as “context managers” can conveniently be used in this way:
body = story.body
with body.add_paragraph() as para:
    para.set_bold().add_text("some bold text")
    para.set_italic().add_text("this is bold and italic")
    para.set_italic(False).set_bold(False).add_text("regular text")
    para.add_text("more regular text")