graphtage.xml¶
A graphtage.Filetype for parsing, diffing, and rendering XML files.
This class is also currently used for parsing HTML.
The parser is implemented atop xml.etree.ElementTree. Any XML or HTML accepted by that module will also be accepted by this module.
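Because the parser sits on top of xml.etree.ElementTree, a quick way to predict whether a document will be accepted is to hand it to that module directly. The sketch below uses only the standard library; the helper name `describe` is hypothetical and not part of graphtage.

```python
import xml.etree.ElementTree as ET


def describe(source: str) -> str:
    """Report whether ElementTree accepts the given markup (hypothetical helper)."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as e:
        # ElementTree only accepts well-formed XML, so HTML with
        # unclosed tags such as <br> is rejected here.
        return f"rejected: {e}"
    return f"accepted: <{root.tag}> with {len(root)} child element(s)"


well_formed = describe("<html><body><p>Hi</p></body></html>")
not_well_formed = describe("<html><body><p>Hi<br></body></html>")
print(well_formed)
print(not_well_formed)
```

In practice this suggests that HTML input needs to be well-formed (XHTML-like) for the underlying parser to accept it.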
xml classes¶
HTML¶
-
class
graphtage.xml.
HTML
¶ Bases:
graphtage.xml.XML
The HTML file type.
-
__init__
()¶ Initializes the HTML file type.
By default, HTML associates itself with the “html”, “text/html”, and “application/xhtml+xml” MIME types.
-
build_tree
(path: str, options: Optional[graphtage.BuildOptions] = None) → graphtage.TreeNode¶ Builds an intermediate representation tree from a file of this
Filetype.
- Parameters
path – Path to the file to parse
options – An optional set of options for building the tree
- Returns
The root tree node of the provided file
- Return type
-
build_tree_handling_errors
(path: str, options: Optional[graphtage.BuildOptions] = None) → Union[str, graphtage.TreeNode]¶ Same as
Filetype.build_tree()
, but it should return a human-readable error string on failure. This function should never throw an exception.
-
get_default_formatter
() → graphtage.xml.XMLFormatter¶ Returns the default formatter for printing files of this type.
-
XML¶
-
class
graphtage.xml.
XML
¶ Bases:
graphtage.Filetype
The XML file type.
-
__init__
()¶ Initializes the XML file type.
By default, XML associates itself with the “xml”, “application/xml”, and “text/xml” MIME types.
-
build_tree
(path: str, options: Optional[graphtage.BuildOptions] = None) → graphtage.TreeNode¶ Builds an intermediate representation tree from a file of this
Filetype.
- Parameters
path – Path to the file to parse
options – An optional set of options for building the tree
- Returns
The root tree node of the provided file
- Return type
-
build_tree_handling_errors
(path: str, options: Optional[graphtage.BuildOptions] = None) → Union[str, graphtage.TreeNode]¶ Same as
Filetype.build_tree()
, but it should return a human-readable error string on failure. This function should never throw an exception.
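The "return an error string instead of raising" contract can be sketched with the standard library alone. This is not graphtage's implementation: `parse_handling_errors` is a hypothetical name, and it catches only the two exception types shown, whereas a real implementation would need to cover every failure mode of its parser.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from typing import Union


def parse_handling_errors(path: str) -> Union[str, ET.Element]:
    """Return the parsed root element, or a human-readable error string."""
    try:
        return ET.parse(path).getroot()
    except ET.ParseError as e:
        # Malformed markup: report the file name and the parser's message.
        return f"Error parsing {os.path.basename(path)}: {e}"
    except OSError as e:
        # Missing or unreadable file.
        return f"Error reading {path}: {e}"


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.xml")
    bad = os.path.join(tmp, "bad.xml")
    with open(good, "w") as f:
        f.write("<root><child/></root>")
    with open(bad, "w") as f:
        f.write("<root><child></root>")
    good_result = parse_handling_errors(good)
    bad_result = parse_handling_errors(bad)
    missing_result = parse_handling_errors(os.path.join(tmp, "missing.xml"))
```

Callers can then branch on `isinstance(result, str)` to tell an error message apart from a parsed tree.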
-
get_default_formatter
() → graphtage.xml.XMLFormatter¶ Returns the default formatter for printing files of this type.
-
XMLChildFormatter¶
-
class
graphtage.xml.
XMLChildFormatter
(*args, **kwargs)¶ Bases:
graphtage.formatter.Formatter[Union[TreeNode, Edit]]
-
DEFAULT_INSTANCE
: Formatter[T] = <graphtage.xml.XMLChildFormatter object>¶
-
__init__
()¶ Initializes a sequence formatter.
- Parameters
start_symbol – The symbol to print at the start of the sequence.
end_symbol – The symbol to print at the end of the sequence.
delimiter – A delimiter to print between items.
delimiter_callback –
A callback for when a delimiter is to be printed. If omitted, this defaults to:
lambda p: p.write(delimiter)
-
static
__new__
(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]¶ Instantiates a new formatter.
This automatically instantiates and populates
Formatter.sub_formatters
and sets their parent to this new formatter.
-
delimiter_callback
: Callable[[Printer], Any]¶
-
edit_print
(printer: graphtage.printer.Printer, edit: graphtage.Edit)¶ Called when the edit for an item is to be printed.
If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).
This implementation simply delegates the print to the Formatting Protocol:
self.print(printer, edit)
-
get_formatter
(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]¶ Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
-
item_newline
(printer: graphtage.printer.Printer, is_first: bool = False, is_last: bool = False)¶ Called before each node is printed.
This is also called one extra time after the last node, if there is at least one node printed.
The default implementation is simply:
printer.newline()
-
items_indent
(printer: graphtage.printer.Printer) → graphtage.printer.Printer¶ Returns a Printer context with an indentation.
This is called as:
with self.items_indent(printer) as p:
immediately after the self.start_symbol is printed, but before any of the items have been printed.
This default implementation is equivalent to:
return printer.indent()
-
parent
: Optional[Formatter[T]] = None¶
-
print
(printer: graphtage.printer.Printer, node_or_edit: Union[graphtage.TreeNode, graphtage.Edit], with_edits: bool = True)¶ Prints the given node or edit.
- Parameters
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
-
print_ListNode
(*args, **kwargs)¶
-
print_SequenceNode
(printer: graphtage.printer.Printer, node: graphtage.sequences.SequenceNode)¶ Formats a sequence node.
The protocol for this function is as follows:
- Print self.start_symbol
- With the printer returned by self.items_indent:
    - For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
        - Call self.item_newline(printer, is_first=index == 0)
        - Call self.edit_print(printer, edit)
- If at least one edit was printed, then call self.item_newline(printer, is_last=True)
- Print self.end_symbol
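The sequence-printing protocol above can be illustrated with a toy printer. Everything in this sketch is a hypothetical stand-in (`ToyPrinter`, `print_sequence`, the bracket symbols); it assumes the final step prints the end symbol, and it leaves out delimiters and edit handling entirely.

```python
from contextlib import contextmanager


class ToyPrinter:
    """Hypothetical stand-in for graphtage.printer.Printer."""

    def __init__(self):
        self.parts = []
        self.depth = 0

    def write(self, text):
        self.parts.append(text)

    def newline(self):
        # Start a new line at the current indentation depth.
        self.parts.append("\n" + "  " * self.depth)

    @contextmanager
    def indent(self):
        self.depth += 1
        try:
            yield self
        finally:
            self.depth -= 1


def print_sequence(printer, items, start_symbol="[", end_symbol="]"):
    printer.write(start_symbol)
    with printer.indent() as p:      # plays the role of items_indent
        for item in items:
            p.newline()              # plays the role of item_newline
            p.write(str(item))       # plays the role of edit_print
    if items:
        printer.newline()            # item_newline(..., is_last=True)
    printer.write(end_symbol)


printer = ToyPrinter()
print_sequence(printer, ["a", "b"])
output = "".join(printer.parts)
print(output)
```

The extra newline after the last item is what places the closing symbol on its own line at the outer indentation level; an empty sequence prints the two symbols back to back.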
-
property
root
¶ Returns the root formatter.
-
sub_format_types
: Sequence[Type[Formatter[T]]] = ()¶
-
sub_formatters
: List[Formatter[T]] = []¶
-
XMLElement¶
-
class
graphtage.xml.
XMLElement
(tag: graphtage.StringNode, attrib: Optional[Dict[graphtage.StringNode, graphtage.StringNode]] = None, text: Optional[graphtage.StringNode] = None, children: Sequence[graphtage.xml.XMLElement] = (), allow_key_edits: bool = True)¶ Bases:
graphtage.TreeNode, collections.abc.Iterable, Sized, abc.ABC
A node representing an XML element.
-
__init__
(tag: graphtage.StringNode, attrib: Optional[Dict[graphtage.StringNode, graphtage.StringNode]] = None, text: Optional[graphtage.StringNode] = None, children: Sequence[graphtage.xml.XMLElement] = (), allow_key_edits: bool = True)¶ Initializes an XML element.
- Parameters
tag – The tag of the element.
attrib – The attributes of the element.
text – The text of the element.
children – The children of the element.
allow_key_edits – Whether or not to allow keys to be edited when matching element attributes.
-
all_children_are_leaves
() → bool¶ Tests whether all of the children of this container are leaves.
Equivalent to:
all(c.is_leaf for c in self)
- Returns
True if all children are leaves.
- Return type
-
attrib
: graphtage.DictNode¶ The attributes of this element.
-
calculate_total_size
() → int¶ Calculates the size of this node. This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.
- Returns
An arbitrary integer representing the size of this node.
- Return type
-
children
() → Collection[graphtage.TreeNode]¶ The children of this node.
Equivalent to:
list(self)
-
dfs
() → Iterator[graphtage.TreeNode]¶ Performs a depth-first traversal over all of this node’s descendants.
self is always included and yielded first. This implementation is equivalent to:
    stack = [self]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children()))
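To see the traversal order this produces, the same loop can be run on a small stand-in tree. The `Node` class below is hypothetical and exists only to give `children()` something to return.

```python
class Node:
    """Hypothetical minimal node with a children() method."""

    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

    def children(self):
        return self._children

    def dfs(self):
        stack = [self]
        while stack:
            node = stack.pop()
            yield node
            # Reversing keeps the leftmost child on top of the stack,
            # so children are visited in their original order.
            stack.extend(reversed(node.children()))


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
order = [n.name for n in tree.dfs()]
print(order)  # ['root', 'a', 'a1', 'a2', 'b']
```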
-
diff
(node: graphtage.TreeNode) → Union[graphtage.EditedTreeNode, T]¶ Performs a diff against the provided node.
- Parameters
node – The node against which to perform the diff.
- Returns
An edited version of this node with all edits being
completed
.- Return type
Union[EditedTreeNode, T]
-
edit_modifiers
: Optional[List[Callable[[graphtage.TreeNode, graphtage.TreeNode], Optional[graphtage.Edit]]]] = None¶
-
editable_dict
() → Dict[str, Any]¶ Copies
self.__dict__, calling TreeNode.editable_dict() on any TreeNode objects therein. This is equivalent to:
    ret = dict(self.__dict__)
    if not self.is_leaf:
        for key, value in ret.items():
            if isinstance(value, TreeNode):
                ret[key] = value.make_edited()
    return ret
This is used by
TreeNode.make_edited()
.
-
property
edited
¶ Returns whether this node has been edited.
The default implementation returns False, whereas EditedTreeNode.edited() returns True.
-
classmethod
edited_type
() → Type[Union[graphtage.EditedTreeNode, T]]¶ Dynamically constructs a new class that is both a
TreeNode and an EditedTreeNode. The edited type’s member variables are populated by the result of TreeNode.editable_dict() of the TreeNode it wraps:
    new_node.__dict__ = dict(wrapped_tree_node.editable_dict())
- Returns
A class that is both a TreeNode and an EditedTreeNode. Its constructor accepts a TreeNode that it will wrap.
- Return type
Type[Union[EditedTreeNode, T]]
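The idea of dynamically building a class that is both the original node type and an "edited" type can be sketched with the built-in `type()`. The classes below (`Node`, `EditedNode`) are hypothetical simplifications: they copy `__dict__` directly instead of going through an `editable_dict()` step, and they are not graphtage's actual implementation.

```python
class EditedNode:
    """Hypothetical mixin that marks a node as edited."""
    edited = True


class Node:
    """Hypothetical simplified tree node."""
    edited = False

    def __init__(self, value):
        self.value = value

    @classmethod
    def edited_type(cls):
        def init(self, wrapped):
            # Populate the new instance from the wrapped node's attributes.
            self.__dict__ = dict(wrapped.__dict__)

        # Build a subclass of both the mixin and the concrete node class.
        return type(f"Edited{cls.__name__}", (EditedNode, cls), {"__init__": init})

    def make_edited(self):
        return self.edited_type()(self)


original = Node(3)
copy = original.make_edited()
print(type(copy).__name__, copy.value, copy.edited, original.edited)
```

Listing the mixin first in the bases means its `edited = True` takes precedence in the method resolution order, while `isinstance(copy, Node)` still holds.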
-
edits
(node) → graphtage.Edit¶ Calculates the best edit to transform this node into the provided node.
- Parameters
node – The node to which to transform.
- Returns
The best possible edit.
- Return type
-
get_all_edits
(node: graphtage.TreeNode) → Iterator[graphtage.Edit]¶ Returns an iterator over all edits that will transform this node into the provided node.
- Parameters
node – The node to which to transform this one.
- Returns
An iterator over edits. Note that this iterator will automatically explode any CompoundEdit in the sequence.
- Return type
Iterator[Edit]
-
property
is_leaf
¶ Container nodes are never leaves, even if they have no children.
- Returns
False
- Return type
-
make_edited
() → Union[graphtage.EditedTreeNode, T]¶ Returns a new, copied instance of this node that is also an instance of
EditedTreeNode. This is equivalent to:
    return self.edited_type()(self)
- Returns
A copied version of this node that is also an instance of EditedTreeNode and thereby mutable.
- Return type
Union[EditedTreeNode, T]
-
print
(printer: graphtage.printer.Printer)¶ Prints this node.
-
tag
: graphtage.StringNode¶ The tag of this element.
-
text
: Optional[graphtage.StringNode]¶ The text of this element.
-
to_obj
()¶ Returns a pure Python representation of this node.
For example, a node representing a list, like graphtage.ListNode, should return a Python list. A node representing a mapping, like graphtage.MappingNode, should return a Python dict. Container nodes should recursively call TreeNode.to_obj() on all of their children.
This is used solely to provide objects for command line expression evaluation, for options like --match-if and --match-unless.
-
property
total_size
¶ The size of this node.
This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.
The first time this property is accessed, its value will be set and memoized by calling TreeNode.calculate_total_size().
- Returns
An arbitrary integer representing the size of this node.
- Return type
-
XMLElementAttribFormatter¶
-
class
graphtage.xml.
XMLElementAttribFormatter
(*args, **kwargs)¶ Bases:
graphtage.formatter.Formatter[Union[TreeNode, Edit]]
-
DEFAULT_INSTANCE
: Formatter[T] = <graphtage.xml.XMLElementAttribFormatter object>¶
-
__init__
()¶ Initializes a sequence formatter.
- Parameters
start_symbol – The symbol to print at the start of the sequence.
end_symbol – The symbol to print at the end of the sequence.
delimiter – A delimiter to print between items.
delimiter_callback –
A callback for when a delimiter is to be printed. If omitted, this defaults to:
lambda p: p.write(delimiter)
-
static
__new__
(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]¶ Instantiates a new formatter.
This automatically instantiates and populates
Formatter.sub_formatters
and sets their parent to this new formatter.
-
delimiter_callback
: Callable[[Printer], Any]¶
-
edit_print
(printer: graphtage.printer.Printer, edit: graphtage.Edit)¶ Called when the edit for an item is to be printed.
If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).
This implementation simply delegates the print to the Formatting Protocol:
self.print(printer, edit)
-
get_formatter
(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]¶ Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
-
item_newline
(printer: graphtage.printer.Printer, is_first: bool = False, is_last: bool = False)¶ Called before each node is printed.
This is also called one extra time after the last node, if there is at least one node printed.
The default implementation is simply:
printer.newline()
-
items_indent
(printer: graphtage.printer.Printer) → graphtage.printer.Printer¶ Returns a Printer context with an indentation.
This is called as:
with self.items_indent(printer) as p:
immediately after the self.start_symbol is printed, but before any of the items have been printed.
This default implementation is equivalent to:
return printer.indent()
-
parent
: Optional[Formatter[T]] = None¶
-
print
(printer: graphtage.printer.Printer, node_or_edit: Union[graphtage.TreeNode, graphtage.Edit], with_edits: bool = True)¶ Prints the given node or edit.
- Parameters
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
-
print_KeyValuePairNode
(printer: graphtage.printer.Printer, node: graphtage.KeyValuePairNode)¶
-
print_MappingNode
(*args, **kwargs)¶
-
print_MultiSetNode
(*args, **kwargs)¶
-
print_SequenceNode
(printer: graphtage.printer.Printer, node: graphtage.sequences.SequenceNode)¶ Formats a sequence node.
The protocol for this function is as follows:
- Print self.start_symbol
- With the printer returned by self.items_indent:
    - For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
        - Call self.item_newline(printer, is_first=index == 0)
        - Call self.edit_print(printer, edit)
- If at least one edit was printed, then call self.item_newline(printer, is_last=True)
- Print self.end_symbol
-
property
root
¶ Returns the root formatter.
-
sub_format_types
: Sequence[Type[Formatter[T]]] = ()¶
-
sub_formatters
: List[Formatter[T]] = []¶
-
XMLElementEdit¶
-
class
graphtage.xml.
XMLElementEdit
(from_node: graphtage.xml.XMLElement, to_node: graphtage.xml.XMLElement)¶ Bases:
graphtage.AbstractCompoundEdit
An edit on an XML element.
-
__init__
(from_node: graphtage.xml.XMLElement, to_node: graphtage.xml.XMLElement)¶ Initializes an XML element edit.
- Parameters
from_node – The node being edited.
to_node – The node to which from_node will be transformed.
-
__iter__
() → Iterator[graphtage.Edit]¶ Returns an iterator over this edit’s sub-edits.
- Returns
The result of
AbstractCompoundEdit.edits()
- Return type
Iterator[Edit]
-
__lt__
(other)¶ Tests whether the bounds of this edit are less than the bounds of other.
-
attrib_edit
: graphtage.Edit¶ The edit to transform this element’s attributes.
-
bounds
() → graphtage.bounds.Range¶ Returns the bounds of this edit.
This defaults to the bounds provided when this AbstractEdit was constructed. If an upper bound was not provided to the constructor, the upper bound defaults to:
    self.from_node.total_size + self.to_node.total_size + 1
- Returns
A range bounding the cost of this edit.
- Return type
-
child_edit
: graphtage.Edit¶ The edit to transform this node’s children.
-
edits
() → Iterator[graphtage.Edit]¶ Returns an iterator over this edit’s sub-edits
-
from_node
: graphtage.TreeNode¶
-
has_non_zero_cost
() → bool¶ Returns whether this edit has a non-zero cost.
This will tighten the edit’s bounds until either its lower bound is greater than zero or its bounds are definitive.
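The "tighten until the lower bound is positive or the bounds are definitive" loop can be sketched with toy classes. `Range` and `ToyEdit` here are hypothetical stand-ins, and the way `ToyEdit` narrows its bounds one unit at a time is invented purely for illustration; real edits tighten their bounds by doing actual diffing work.

```python
from dataclasses import dataclass


@dataclass
class Range:
    """Hypothetical stand-in for graphtage.bounds.Range."""
    lower: int
    upper: int

    def definitive(self) -> bool:
        return self.lower == self.upper


class ToyEdit:
    """Hypothetical edit whose true cost is discovered one step at a time."""

    def __init__(self, true_cost: int, upper: int):
        # Assumes upper >= true_cost >= 0.
        self._true_cost = true_cost
        self._bounds = Range(0, upper)

    def bounds(self) -> Range:
        return self._bounds

    def tighten_bounds(self) -> bool:
        b = self._bounds
        if b.definitive():
            return False  # nothing left to tighten
        if b.lower < self._true_cost:
            b.lower += 1
        else:
            b.upper -= 1
        return True

    def has_non_zero_cost(self) -> bool:
        # Stop as soon as the answer is known: either the lower bound is
        # already positive, or the bounds have converged.
        while (self.bounds().lower <= 0
               and not self.bounds().definitive()
               and self.tighten_bounds()):
            pass
        return self.bounds().lower > 0


costly = ToyEdit(true_cost=2, upper=5)
free = ToyEdit(true_cost=0, upper=3)
print(costly.has_non_zero_cost(), costly.bounds())
print(free.has_non_zero_cost(), free.bounds())
```

Note that the costly edit stops after a single tightening step, without ever computing its exact cost, which is the point of checking the lower bound first.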
-
initial_bounds
: graphtage.bounds.Range¶
-
is_complete
() → bool¶ An edit is complete when no further calls to
Edit.tighten_bounds() will change the nature of the edit. This implementation considers an edit complete if it is no longer valid or its bounds are definitive:
    return not self.valid or self.bounds().definitive()
If an edit is able to discern that it has a unique solution even if its final bounds are unknown, it should reimplement this method to define that check.
For example, in the case of a CompoundEdit, this method should only return True if no future calls to Edit.tighten_bounds() will affect the result of CompoundEdit.edits().
- Returns
True if subsequent calls to Edit.tighten_bounds() will only serve to tighten the bounds of this edit and will not affect the semantics of the edit.
- Return type
-
on_diff
(from_node: graphtage.EditedTreeNode)¶ A callback for when an edit is assigned to an
EditedTreeNode in TreeNode.diff(). This default implementation adds the edit to the node, and recursively calls Edit.on_diff() on all of the sub-edits:
    from_node.edit = self
    from_node.edit_list.append(self)
    for edit in self.edits():
        edit.on_diff(edit.from_node)
- Parameters
from_node – The edited node that was added to the diff
-
print
(formatter: graphtage.GraphtageFormatter, printer: graphtage.printer.Printer)¶ Edits can optionally implement a printing method
This function is called automatically from the formatter in the Printing Protocol and should never be called directly unless you really know what you’re doing! Raising NotImplementedError will cause the formatter to fall back on its own printing implementations. This implementation is equivalent to:
    for edit in self.edits():
        edit.print(formatter, printer)
-
tag_edit
: graphtage.Edit¶ The edit to transform this element’s tag.
-
text_edit
: Optional[graphtage.Edit]¶ The edit to transform this element’s text.
-
tighten_bounds
() → bool¶ Tightens the
Edit.bounds() on the cost of this edit, if possible.
- Returns
True if the bounds have been tightened.
- Return type
Note
Implementations of this function should return False if and only if self.bounds().definitive().
-
property
valid
¶ Returns whether this edit is valid
-
XMLElementObj¶
-
class
graphtage.xml.
XMLElementObj
(tag: str, attrib: Dict[str, str], text: Optional[str] = None, children: Optional[Sequence[graphtage.xml.XMLElementObj]] = ())¶ Bases:
object
An object for interacting with XMLElement from command line expressions.
-
__init__
(tag: str, attrib: Dict[str, str], text: Optional[str] = None, children: Optional[Sequence[graphtage.xml.XMLElementObj]] = ())¶ Initializes an XML Element Object.
- Parameters
tag – The tag of the element.
attrib – The attributes of the element.
text – The text of the element.
children – The children of the element.
-
children
: Optional[Sequence[XMLElementObj]]¶ The children of this element.
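A plain object with these four fields is easy to approximate with a dataclass. The sketch below is an assumption-laden stand-in, not the graphtage class: `ElementObj` and `to_obj` are hypothetical names, and the conversion starts from a standard-library ElementTree element rather than from an XMLElement node.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ElementObj:
    """Hypothetical stand-in mirroring the tag/attrib/text/children fields."""
    tag: str
    attrib: Dict[str, str]
    text: Optional[str] = None
    children: List["ElementObj"] = field(default_factory=list)


def to_obj(element: ET.Element) -> ElementObj:
    # Recursively convert an ElementTree element into plain Python objects.
    return ElementObj(
        tag=element.tag,
        attrib=dict(element.attrib),
        text=element.text,
        children=[to_obj(child) for child in element],
    )


obj = to_obj(ET.fromstring('<a x="1">hi<b/></a>'))
# An expression of the kind a user might write against such an object:
matches = obj.tag == "a" and obj.attrib.get("x") == "1"
print(obj.text, [c.tag for c in obj.children], matches)
```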
-
XMLFormatter¶
-
class
graphtage.xml.
XMLFormatter
(*args, **kwargs)¶ Bases:
graphtage.formatter.Formatter[Union[TreeNode, Edit]]
-
DEFAULT_INSTANCE
: Formatter[T] = <graphtage.xml.XMLFormatter object>¶
-
__init__
()¶ Initialize self. See help(type(self)) for accurate signature.
-
static
__new__
(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]¶ Instantiates a new formatter.
This automatically instantiates and populates
Formatter.sub_formatters
and sets their parent to this new formatter.
-
get_formatter
(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]¶ Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
-
parent
: Optional[Formatter[T]] = None¶
-
print
(printer: graphtage.printer.Printer, node_or_edit: Union[graphtage.TreeNode, graphtage.Edit], with_edits: bool = True)¶ Prints the given node or edit.
- Parameters
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
-
print_LeafNode
(printer: graphtage.printer.Printer, node: graphtage.LeafNode)¶
-
print_XMLElement
(printer: graphtage.printer.Printer, node: graphtage.xml.XMLElement)¶
-
property
root
¶ Returns the root formatter.
-
sub_format_types
: Sequence[Type[Formatter[T]]] = [<class 'graphtage.xml.XMLStringFormatter'>, <class 'graphtage.xml.XMLChildFormatter'>, <class 'graphtage.xml.XMLElementAttribFormatter'>]¶
-
sub_formatters
: List[Formatter[T]] = []¶
-
XMLStringFormatter¶
-
class
graphtage.xml.
XMLStringFormatter
(*args, **kwargs)¶ Bases:
graphtage.formatter.Formatter[Union[TreeNode, Edit]]
-
DEFAULT_INSTANCE
: Formatter[T] = <graphtage.xml.XMLStringFormatter object>¶
-
__init__
()¶ Initialize self. See help(type(self)) for accurate signature.
-
static
__new__
(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]¶ Instantiates a new formatter.
This automatically instantiates and populates
Formatter.sub_formatters
and sets their parent to this new formatter.
-
context
(printer: graphtage.printer.Printer)¶
-
escape
(c: str) → str¶ String escape.
This function is called once for each character in the string.
- Returns
The escaped version of c, or c itself if no escaping is required.
- Return type
This is equivalent to:
html.escape(c)
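Since escaping is applied one character at a time, the effect on a whole string is the same as joining the per-character results. A small standard-library illustration (the sample string is arbitrary):

```python
import html

source = 'a<b & "c"'
# Escape each character independently, as the per-character hook would.
escaped = "".join(html.escape(c) for c in source)
print(escaped)  # a&lt;b &amp; &quot;c&quot;
```

With its default arguments, html.escape also converts both double and single quotes, so quoted attribute values are covered as well.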
-
get_formatter
(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]¶ Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
-
parent
: Optional[Formatter[T]] = None¶
-
print
(printer: graphtage.printer.Printer, node_or_edit: Union[graphtage.TreeNode, graphtage.Edit], with_edits: bool = True)¶ Prints the given node or edit.
- Parameters
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
-
print_StringEdit
(printer: graphtage.printer.Printer, edit: graphtage.StringEdit)¶
-
print_StringNode
(printer: graphtage.printer.Printer, node: graphtage.StringNode)¶
-
property
root
¶ Returns the root formatter.
-
sub_format_types
: Sequence[Type[Formatter[T]]] = ()¶
-
sub_formatters
: List[Formatter[T]] = []¶
-
write_char
(printer: graphtage.printer.Printer, c: str, index: int, num_edits: int, removed=False, inserted=False)¶ Writes a character to the printer.
Note
This function calls graphtage.StringFormatter.escape(); classes extending graphtage.StringFormatter should also call graphtage.StringFormatter.escape() when reimplementing this function.
Note
There is no need to specially format characters that have been removed or inserted; the printer will have already automatically been configured to format them prior to the call to StringFormatter.write_char().
- Parameters
printer – The printer to which to write the character.
c – The character to write.
index – The index of the character in the string.
num_edits – The total number of characters that will be printed.
removed – Whether this character was removed from the source string.
inserted – Whether this character is inserted into the source string.
-
write_end_quote
(printer: graphtage.printer.Printer, edit: graphtage.StringEdit)¶ Prints an ending quote for the string, if necessary
-
write_start_quote
(printer: graphtage.printer.Printer, edit: graphtage.StringEdit)¶ Prints a starting quote for the string, if necessary
-
xml functions¶
build_tree¶
-
graphtage.xml.
build_tree
(path_or_element_tree: Union[str, xml.etree.ElementTree.Element, xml.etree.ElementTree.ElementTree], options: Optional[graphtage.BuildOptions] = None) → graphtage.xml.XMLElement¶ Constructs an XML element node from an XML file.