graphtage.csv¶
A graphtage.Filetype
for parsing, diffing, and rendering CSV files.
csv classes¶
CSV¶
- class graphtage.csv.CSV¶
Bases:
Filetype
The CSV filetype.
- __init__()¶
Initializes the CSV filetype.
CSV identifies itself with the MIME types csv and text/csv.
- build_tree(path: str, options: BuildOptions | None = None) TreeNode ¶
Equivalent to
build_tree()
- build_tree_handling_errors(path: str, options: BuildOptions | None = None) TreeNode ¶
Same as Filetype.build_tree(), but it should return a human-readable error string on failure.
This function should never throw an exception.
- get_default_formatter() CSVFormatter ¶
Returns the default formatter for printing files of this type.
CSVFormatter¶
- class graphtage.csv.CSVFormatter(*args, **kwargs)¶
Bases:
GraphtageFormatter
Top-level formatter for CSV files.
- DEFAULT_INSTANCE: Formatter[T] = <graphtage.csv.CSVFormatter object>¶
A default instance of this formatter, automatically instantiated by the
FormatterChecker
metaclass.
- __init__()¶
- static __new__(cls, *args, **kwargs) Formatter[T] ¶
Instantiates a new formatter.
This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
- get_formatter(item: T) Callable[[Printer, T], Any] | None ¶
Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
- parent: Formatter[T] | None = None¶
The parent formatter for this formatter instance.
This is automatically populated by
Formatter.__new__()
and should never be manually modified.
- print(printer: Printer, node_or_edit: TreeNode | Edit, with_edits: bool = True)¶
Prints the given node or edit.
- Parameters:
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
- print_LeafNode(printer: Printer, node: LeafNode)¶
Prints a leaf node, which should always be a column in a CSV row.
The node is escaped by first writing it to csv.writer():
csv.writer(...).writerow([node.object])
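The escaping step above can be illustrated with the standard library alone. This is a minimal sketch, not graphtage's implementation: `escape_csv_cell` is a hypothetical helper name, and writing to an in-memory `io.StringIO` buffer with an empty line terminator is an assumption made so the result can be inspected as a string.

```python
import csv
import io


def escape_csv_cell(value: str) -> str:
    """Return `value` as it would appear as a single cell in a CSV row.

    Hypothetical helper for illustration; not part of graphtage.
    """
    buffer = io.StringIO()
    # lineterminator="" keeps the writer from appending "\r\n" after the row.
    csv.writer(buffer, lineterminator="").writerow([value])
    return buffer.getvalue()


print(escape_csv_cell("plain"))
print(escape_csv_cell("a,b"))
print(escape_csv_cell('say "hi"'))
```

Cells containing the delimiter or a quote character come back quoted, with embedded quotes doubled, while plain cells are left untouched.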
- sub_format_types: Sequence[Type[Formatter[T]]] = [<class 'graphtage.csv.CSVRows'>, <class 'graphtage.json.JSONFormatter'>]¶
A list of formatter types that should be used as sub-formatters in the Formatting Protocol.
CSVNode¶
- class graphtage.csv.CSVNode(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)¶
-
A node representing zero or more CSV rows.
- __init__(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)¶
Initializes a List node.
- Parameters:
nodes – The set of nodes in this list.
allow_list_edits – Whether to consider removal and insertion when editing this list.
allow_list_edits_when_same_length – Whether to consider removal and insertion when comparing this list to another list of the same length.
- __iter__() Iterator[TreeNode] ¶
Iterates over this sequence’s child nodes.
This is equivalent to:
return iter(self._children)
- __len__() int ¶
The number of children of this sequence.
This is equivalent to:
return len(self._children)
- all_children_are_leaves() bool ¶
Tests whether all of the children of this container are leaves.
Equivalent to:
all(c.is_leaf for c in self)
- Returns:
True if all children are leaves.
- Return type:
bool
- calculate_total_size()¶
Calculates the total size of this sequence.
This is equivalent to:
return sum(c.total_size for c in self)
- children() T ¶
The children of this node.
Equivalent to:
list(self)
- property container_type: Type[Tuple[T, ...]]¶
The container type required by
graphtage.sequences.SequenceNode
- Returns:
- Return type:
Type[Tuple[T, …]]
- dfs() Iterator[TreeNode] ¶
Performs a depth-first traversal over all of this node’s descendants.
self is always included and yielded first.
This implementation is equivalent to:
    stack = [self]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children()))
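The traversal order this produces is easiest to see on a small tree. The sketch below uses a hypothetical `Node` class as a stand-in for a tree node (it is not graphtage's `TreeNode`). Only the body of `dfs` follows the snippet above.

```python
from typing import Iterator, List, Optional


class Node:
    """Hypothetical stand-in for a tree node; not graphtage's TreeNode."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self._children = list(children or [])

    def children(self) -> List["Node"]:
        return list(self._children)

    def dfs(self) -> Iterator["Node"]:
        stack = [self]
        while stack:
            node = stack.pop()
            yield node
            # Reversing keeps the leftmost child on top of the stack,
            # so siblings are visited in their original order.
            stack.extend(reversed(node.children()))


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
order = [n.name for n in tree.dfs()]
print(order)
```

Because children are pushed in reverse, the result is a pre-order walk: each node is yielded before its descendants, and siblings appear left to right.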
- diff(node: TreeNode) EditedTreeNode | T ¶
Performs a diff against the provided node.
- Parameters:
node – The node against which to perform the diff.
- Returns:
An edited version of this node with all edits being completed.
- Return type:
Union[EditedTreeNode, T]
- editable_dict() Dict[str, Any] ¶
Copies self.__dict__, calling TreeNode.editable_dict() on all children.
This is equivalent to:
    ret = dict(self.__dict__)
    ret['_children'] = self.container_type(n.make_edited() for n in self)
    return ret
This is used by SequenceNode.make_edited().
- property edited: bool¶
Returns whether this node has been edited.
The default implementation returns False, whereas EditedTreeNode.edited() returns True.
- edits(node: TreeNode) Edit ¶
Calculates the best edit to transform this node into the provided node.
- Parameters:
node – The node to which to transform.
- Returns:
The best possible edit.
- Return type:
Edit
- get_all_edit_contexts(node: TreeNode) Iterator[Tuple[Tuple[TreeNode, ...], Edit]] ¶
Returns an iterator over all edit contexts that will transform this node into the provided node.
- Parameters:
node – The node to which to transform this one.
- Returns:
An iterator over pairs of paths from node to the edited node, as well as its edit. Note that this iterator will automatically explode any CompoundEdit in the sequence.
- Return type:
Iterator[Tuple[Tuple[TreeNode, …], Edit]]
- get_all_edits(node: TreeNode) Iterator[Edit] ¶
Returns an iterator over all edits that will transform this node into the provided node.
- Parameters:
node – The node to which to transform this one.
- Returns:
An iterator over edits. Note that this iterator will automatically explode any CompoundEdit in the sequence.
- Return type:
Iterator[Edit]
- property is_leaf: bool¶
Container nodes are never leaves, even if they have no children.
- Returns:
False
- Return type:
bool
- make_edited() EditedTreeNode | T ¶
Returns a new, copied instance of this node that is also an instance of EditedTreeNode.
This is equivalent to:
return self.__class__.edited_type()(self)
- Returns:
A copied version of this node that is also an instance of EditedTreeNode and thereby mutable.
- Return type:
Union[EditedTreeNode, T]
- property parent: TreeNode | None¶
The parent node of this node, or None if it has no parent.
The setter for this property should only be called by the parent node setting itself as the parent of its child. ContainerNode subclasses automatically set this property for all of their children. However, if you define a subclass of TreeNode that does not extend ContainerNode and for which len(self.children()) > 0, then each child’s parent must be set.
- print(printer: Printer)¶
Prints a sequence node.
By default, sequence nodes are printed like lists:
SequenceFormatter('[', ']', ',').print(printer, self)
- print_parent_context(printer: Printer, for_child: TreeNode)¶
Prints the context for the given child node.
For example, if this node represents a list and the child is the element at index 3, then “[3]” might be printed.
The child is expected to be one of this node’s children, but this is not validated.
The default implementation prints nothing.
- to_obj()¶
Returns a pure Python representation of this node.
For example, a node representing a list, like graphtage.ListNode, should return a Python list. A node representing a mapping, like graphtage.MappingNode, should return a Python dict. Container nodes should recursively call TreeNode.to_obj() on all of their children.
This is used solely to provide objects to operate on when evaluating command-line expressions, for options like --match-if and --match-unless.
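The recursive contract described above can be sketched with two toy classes. `Leaf` and `ListLike` are hypothetical names used only for illustration, not graphtage classes. The point is that a container delegates to each child's `to_obj()` rather than inspecting the children itself.

```python
from typing import Any, Iterable, List


class Leaf:
    """Hypothetical leaf: wraps a single plain Python value."""

    def __init__(self, value: Any):
        self.value = value

    def to_obj(self) -> Any:
        return self.value


class ListLike:
    """Hypothetical list-style container of nodes."""

    def __init__(self, children: Iterable[Any]):
        self.children = list(children)

    def to_obj(self) -> List[Any]:
        # Recurse into every child so nested containers are converted too.
        return [child.to_obj() for child in self.children]


table = ListLike([
    ListLike([Leaf("name"), Leaf("qty")]),
    ListLike([Leaf("nut"), Leaf("3")]),
])
print(table.to_obj())
```

For a CSV-shaped tree this gives a list of rows, each a list of cell values, which is a convenient form for expression evaluation.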
- property total_size: int¶
The size of this node.
This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.
The first time this property is called, its value will be set and memoized by calling TreeNode.calculate_total_size().
- Returns:
An arbitrary integer representing the size of this node.
- Return type:
int
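The memoization described for total_size, combined with the summing behaviour of calculate_total_size(), can be sketched as follows. `SizedNode`, its `leaf_size` argument, and the `calculations` counter are all hypothetical and exist only to make the caching visible. How graphtage sizes its actual leaf nodes is not covered here.

```python
from typing import List, Optional


class SizedNode:
    """Hypothetical node showing lazy, memoized size calculation."""

    def __init__(self, children: Optional[List["SizedNode"]] = None, leaf_size: int = 1):
        self._children = list(children or [])
        self._leaf_size = leaf_size  # assumed size of a childless node
        self._total_size: Optional[int] = None
        self.calculations = 0  # counts how often the size is computed

    def calculate_total_size(self) -> int:
        self.calculations += 1
        if not self._children:
            return self._leaf_size
        return sum(c.total_size for c in self._children)

    @property
    def total_size(self) -> int:
        # Compute on first access only; later reads reuse the stored value.
        if self._total_size is None:
            self._total_size = self.calculate_total_size()
        return self._total_size


row = SizedNode([SizedNode(leaf_size=3), SizedNode(leaf_size=5)])
first = row.total_size
second = row.total_size
print(first, second, row.calculations)
```

Treating the value as immutable is what makes this caching safe: once computed, the size is never invalidated.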
CSVRow¶
- class graphtage.csv.CSVRow(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)¶
-
A node representing a row of a CSV file.
- __init__(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)¶
Initializes a List node.
- Parameters:
nodes – The set of nodes in this list.
allow_list_edits – Whether to consider removal and insertion when editing this list.
allow_list_edits_when_same_length – Whether to consider removal and insertion when comparing this list to another list of the same length.
- __iter__() Iterator[TreeNode] ¶
Iterates over this sequence’s child nodes.
This is equivalent to:
return iter(self._children)
- __len__() int ¶
The number of children of this sequence.
This is equivalent to:
return len(self._children)
- all_children_are_leaves() bool ¶
Tests whether all of the children of this container are leaves.
Equivalent to:
all(c.is_leaf for c in self)
- Returns:
True if all children are leaves.
- Return type:
bool
- calculate_total_size()¶
Calculates the total size of this sequence.
This is equivalent to:
return sum(c.total_size for c in self)
- children() T ¶
The children of this node.
Equivalent to:
list(self)
- property container_type: Type[Tuple[T, ...]]¶
The container type required by
graphtage.sequences.SequenceNode
- Returns:
- Return type:
Type[Tuple[T, …]]
- dfs() Iterator[TreeNode] ¶
Performs a depth-first traversal over all of this node’s descendants.
self is always included and yielded first.
This implementation is equivalent to:
    stack = [self]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children()))
- diff(node: TreeNode) EditedTreeNode | T ¶
Performs a diff against the provided node.
- Parameters:
node – The node against which to perform the diff.
- Returns:
An edited version of this node with all edits being completed.
- Return type:
Union[EditedTreeNode, T]
- editable_dict() Dict[str, Any] ¶
Copies self.__dict__, calling TreeNode.editable_dict() on all children.
This is equivalent to:
    ret = dict(self.__dict__)
    ret['_children'] = self.container_type(n.make_edited() for n in self)
    return ret
This is used by SequenceNode.make_edited().
- property edited: bool¶
Returns whether this node has been edited.
The default implementation returns False, whereas EditedTreeNode.edited() returns True.
- edits(node: TreeNode) Edit ¶
Calculates the best edit to transform this node into the provided node.
- Parameters:
node – The node to which to transform.
- Returns:
The best possible edit.
- Return type:
Edit
- get_all_edit_contexts(node: TreeNode) Iterator[Tuple[Tuple[TreeNode, ...], Edit]] ¶
Returns an iterator over all edit contexts that will transform this node into the provided node.
- Parameters:
node – The node to which to transform this one.
- Returns:
An iterator over pairs of paths from node to the edited node, as well as its edit. Note that this iterator will automatically explode any CompoundEdit in the sequence.
- Return type:
Iterator[Tuple[Tuple[TreeNode, …], Edit]]
- get_all_edits(node: TreeNode) Iterator[Edit] ¶
Returns an iterator over all edits that will transform this node into the provided node.
- Parameters:
node – The node to which to transform this one.
- Returns:
An iterator over edits. Note that this iterator will automatically explode any CompoundEdit in the sequence.
- Return type:
Iterator[Edit]
- property is_leaf: bool¶
Container nodes are never leaves, even if they have no children.
- Returns:
False
- Return type:
bool
- make_edited() EditedTreeNode | T ¶
Returns a new, copied instance of this node that is also an instance of EditedTreeNode.
This is equivalent to:
return self.__class__.edited_type()(self)
- Returns:
A copied version of this node that is also an instance of EditedTreeNode and thereby mutable.
- Return type:
Union[EditedTreeNode, T]
- property parent: TreeNode | None¶
The parent node of this node, or None if it has no parent.
The setter for this property should only be called by the parent node setting itself as the parent of its child. ContainerNode subclasses automatically set this property for all of their children. However, if you define a subclass of TreeNode that does not extend ContainerNode and for which len(self.children()) > 0, then each child’s parent must be set.
- print(printer: Printer)¶
Prints a sequence node.
By default, sequence nodes are printed like lists:
SequenceFormatter('[', ']', ',').print(printer, self)
- print_parent_context(printer: Printer, for_child: TreeNode)¶
Prints the context for the given child node.
For example, if this node represents a list and the child is the element at index 3, then “[3]” might be printed.
The child is expected to be one of this node’s children, but this is not validated.
The default implementation prints nothing.
- to_obj()¶
Returns a pure Python representation of this node.
For example, a node representing a list, like graphtage.ListNode, should return a Python list. A node representing a mapping, like graphtage.MappingNode, should return a Python dict. Container nodes should recursively call TreeNode.to_obj() on all of their children.
This is used solely to provide objects to operate on when evaluating command-line expressions, for options like --match-if and --match-unless.
- property total_size: int¶
The size of this node.
This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.
The first time this property is called, its value will be set and memoized by calling TreeNode.calculate_total_size().
- Returns:
An arbitrary integer representing the size of this node.
- Return type:
int
CSVRowFormatter¶
- class graphtage.csv.CSVRowFormatter(*args, **kwargs)¶
Bases:
SequenceFormatter
A formatter for CSV rows.
- DEFAULT_INSTANCE: Formatter[T] = <graphtage.csv.CSVRowFormatter object>¶
A default instance of this formatter, automatically instantiated by the
FormatterChecker
metaclass.
- __init__()¶
Initializes the formatter.
Equivalent to:
super().__init__('', '', ',')
- static __new__(cls, *args, **kwargs) Formatter[T] ¶
Instantiates a new formatter.
This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
- edit_print(printer: Printer, edit: Edit)¶
Called when the edit for an item is to be printed.
If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).
This implementation simply delegates the print to the Formatting Protocol:
self.print(printer, edit)
- get_formatter(item: T) Callable[[Printer, T], Any] | None ¶
Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
- is_partial: bool = True¶
This is a partial formatter; it will not be automatically used in the Formatting Protocol.
- item_newline(printer: Printer, is_first: bool = False, is_last: bool = False)¶
An empty implementation, since each row should be printed as a single line.
- items_indent(printer: Printer) Printer ¶
Returns a Printer context with an indentation.
This is called as:
with self.items_indent(printer) as p:
immediately after self.start_symbol is printed, but before any of the items have been printed.
This default implementation is equivalent to:
return printer.indent()
- parent: Formatter[T] | None = None¶
The parent formatter for this formatter instance.
This is automatically populated by
Formatter.__new__()
and should never be manually modified.
- print(printer: Printer, node_or_edit: TreeNode | Edit, with_edits: bool = True)¶
Prints the given node or edit.
- Parameters:
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
- print_CSVRow(*args, **kwargs)¶
Prints a CSV row.
Equivalent to:
super().print_SequenceNode(*args, **kwargs)
- print_SequenceNode(printer: Printer, node: SequenceNode)¶
Formats a sequence node.
The protocol for this function is as follows:
- Print self.start_symbol
- With the printer returned by self.items_indent:
  - For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
    - Call self.item_newline(printer, is_first=index == 0)
    - Call self.edit_print(printer, edit)
- If at least one edit was printed, then call self.item_newline(printer, is_last=True)
- Print self.end_symbol
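The call sequence of this protocol can be traced with a simplified stand-in. `MiniSequenceFormatter` and `MiniRowFormatter` are hypothetical classes that write to a list of strings rather than a graphtage Printer. The sketch omits the indentation context, and it assumes that the delimiter is emitted between consecutive items and that the final step prints the closing symbol.

```python
from typing import List, Sequence


class MiniSequenceFormatter:
    """Hypothetical, simplified formatter; not graphtage's SequenceFormatter."""

    def __init__(self, start_symbol: str, end_symbol: str, delimiter: str):
        self.start_symbol = start_symbol
        self.end_symbol = end_symbol
        self.delimiter = delimiter

    def item_newline(self, out: List[str], is_first: bool = False, is_last: bool = False) -> None:
        out.append("\n")

    def edit_print(self, out: List[str], item: str) -> None:
        out.append(item)

    def print_sequence(self, items: Sequence[str]) -> str:
        out: List[str] = [self.start_symbol]
        for index, item in enumerate(items):
            if index > 0:
                # Assumption: the delimiter goes between consecutive items.
                out.append(self.delimiter)
            self.item_newline(out, is_first=index == 0)
            self.edit_print(out, item)
        if items:
            self.item_newline(out, is_last=True)
        out.append(self.end_symbol)
        return "".join(out)


class MiniRowFormatter(MiniSequenceFormatter):
    """Mirrors the row settings: no brackets, comma delimiter, no newlines."""

    def __init__(self) -> None:
        super().__init__("", "", ",")

    def item_newline(self, out: List[str], is_first: bool = False, is_last: bool = False) -> None:
        pass  # a row stays on a single line


listing = MiniSequenceFormatter("[", "]", ",").print_sequence(["1", "2"])
row = MiniRowFormatter().print_sequence(["a", "b", "c"])
print(repr(listing), repr(row))
```

Overriding only `item_newline` and the three symbols is enough to turn a bracketed, multi-line list layout into a single comma-separated line, which is the shape a CSV row needs.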
- sub_format_types: Sequence[Type[Formatter[T]]] = ()¶
A list of formatter types that should be used as sub-formatters in the Formatting Protocol.
CSVRows¶
- class graphtage.csv.CSVRows(*args, **kwargs)¶
Bases:
SequenceFormatter
A sub formatter for printing the sequence of rows in a CSV file.
- DEFAULT_INSTANCE: Formatter[T] = <graphtage.csv.CSVRows object>¶
A default instance of this formatter, automatically instantiated by the
FormatterChecker
metaclass.
- __init__()¶
Initializes the formatter.
Equivalent to:
super().__init__('', '', '')
- static __new__(cls, *args, **kwargs) Formatter[T] ¶
Instantiates a new formatter.
This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
- edit_print(printer: Printer, edit: Edit)¶
Called when the edit for an item is to be printed.
If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).
This implementation simply delegates the print to the Formatting Protocol:
self.print(printer, edit)
- get_formatter(item: T) Callable[[Printer, T], Any] | None ¶
Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
- is_partial: bool = True¶
This is a partial formatter; it will not be automatically used in the Formatting Protocol.
- item_newline(printer: Printer, is_first: bool = False, is_last: bool = False)¶
Prints a newline on all but the first and last items.
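One way to read this rule is sketched below: the newline is skipped on the call made before the first row and on the closing call made after the last row, so rows end up separated by exactly one line break. Both function names are hypothetical, and the interpretation of `is_last` as the closing call is an assumption based on the print_SequenceNode call sequence.

```python
from typing import List


def item_newline(out: List[str], is_first: bool = False, is_last: bool = False) -> None:
    """Append a line break unless this is the first or the closing call."""
    if not (is_first or is_last):
        out.append("\n")


def join_rows(rows: List[str]) -> str:
    """Hypothetical helper: lays out already-formatted rows one per line."""
    out: List[str] = []
    for index, row in enumerate(rows):
        item_newline(out, is_first=index == 0)
        out.append(row)
    if rows:
        # Closing call after the final row; it adds nothing.
        item_newline(out, is_last=True)
    return "".join(out)


print(join_rows(["name,qty", "nut,3", "bolt,7"]))
```

Under this reading the output has no leading or trailing blank line.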
- parent: Formatter[T] | None = None¶
The parent formatter for this formatter instance.
This is automatically populated by
Formatter.__new__()
and should never be manually modified.
- print(printer: Printer, node_or_edit: TreeNode | Edit, with_edits: bool = True)¶
Prints the given node or edit.
- Parameters:
printer – The printer to which to write.
node_or_edit – The node or edit to print.
with_edits – If True, print any edits associated with the node.
Note
The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.
- print_CSVNode(*args, **kwargs)¶
Prints a CSV node.
Equivalent to:
super().print_SequenceNode(*args, **kwargs)
- print_SequenceNode(printer: Printer, node: SequenceNode)¶
Formats a sequence node.
The protocol for this function is as follows:
- Print self.start_symbol
- With the printer returned by self.items_indent:
  - For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
    - Call self.item_newline(printer, is_first=index == 0)
    - Call self.edit_print(printer, edit)
- If at least one edit was printed, then call self.item_newline(printer, is_last=True)
- Print self.end_symbol
- sub_format_types: Sequence[Type[Formatter[T]]] = [<class 'graphtage.csv.CSVRowFormatter'>]¶
A list of formatter types that should be used as sub-formatters in the Formatting Protocol.
csv functions¶
build_tree¶
- graphtage.csv.build_tree(path: str, options: BuildOptions | None = None, *args, **kwargs) CSVNode ¶
Constructs a CSVNode from a CSV file.
The file is parsed using Python’s csv.reader(). The elements in each row are constructed by delegating to graphtage.json.build_tree():
CSVRow([json.build_tree(i, options=options) for i in row])
- Parameters:
path – The path to the file to be parsed.
options – Optional build options to pass on to graphtage.json.build_tree().
*args – Any extra positional arguments are passed on to csv.reader().
**kwargs – Any extra keyword arguments are passed on to csv.reader().
- Returns:
The resulting CSV node object.
- Return type:
CSVNode
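The parsing half of this function can be approximated with the standard library. The sketch below is not graphtage code: `read_rows` is a hypothetical helper that reads from a string instead of a file path and returns plain lists where build_tree would construct CSVRow and CSVNode objects. It shows how extra arguments forwarded to csv.reader() change the parse.

```python
import csv
import io
from typing import List


def read_rows(text: str, *args, **kwargs) -> List[List[str]]:
    """Parse CSV text into a list of rows, each a list of cell strings.

    Hypothetical helper; extra arguments are forwarded to csv.reader().
    """
    # newline="" is the recommended mode for streams given to the csv module.
    with io.StringIO(text, newline="") as stream:
        return [list(row) for row in csv.reader(stream, *args, **kwargs)]


print(read_rows("name,qty\nnut,3\n"))
print(read_rows("name;qty\nnut;3\n", delimiter=";"))
print(read_rows('"x,y",z\n'))
```

Every cell arrives as a string, so any further typing of values happens after this step.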