graphtage.csv

A graphtage.Filetype for parsing, diffing, and rendering CSV files.

csv classes

CSV

class graphtage.csv.CSV

Bases: graphtage.Filetype

The CSV filetype.

__init__()

Initializes the CSV filetype.

CSV identifies itself with the MIME types csv and text/csv.

build_tree(path: str, options: Optional[graphtage.BuildOptions] = None) → graphtage.TreeNode

Equivalent to the module-level graphtage.csv.build_tree() function.

build_tree_handling_errors(path: str, options: Optional[graphtage.BuildOptions] = None) → graphtage.TreeNode

Same as Filetype.build_tree(), but it should return a human-readable error string on failure.

This function should never throw an exception.

Parameters
  • path – Path to the file to parse

  • options – An optional set of options for building the tree

Returns

The root tree node on success, or a string containing the error message on failure.

Return type

Union[str, TreeNode]
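
The "return an error string instead of raising" contract described above can be sketched in plain Python. This is only an illustration under stated assumptions: build_handling_errors and the error message format are hypothetical and are not graphtage's implementation.

```python
from typing import Callable, TypeVar, Union

T = TypeVar("T")


def build_handling_errors(build: Callable[[str], T], path: str) -> Union[str, T]:
    """Hypothetical wrapper: call `build`, but never let an exception escape."""
    try:
        return build(path)
    except Exception as e:  # deliberately broad: the contract is "never throw"
        # The message format here is an assumption made for this sketch.
        return f"Error parsing {path}: {e}"


def failing_builder(path: str):
    """Stand-in for a parser that rejects its input."""
    raise ValueError("boom")


result = build_handling_errors(failing_builder, "x.csv")
if isinstance(result, str):
    print(result)  # the caller checks for a string to detect failure
```

A caller distinguishes the two outcomes with isinstance(result, str), which is why the documented return type is a Union.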

get_default_formatter() → graphtage.csv.CSVFormatter

Returns the default formatter for printing files of this type.

CSVFormatter

class graphtage.csv.CSVFormatter(*args, **kwargs)

Bases: graphtage.GraphtageFormatter

Top-level formatter for CSV files.

DEFAULT_INSTANCE: graphtage.formatter.Formatter[T] = <graphtage.csv.CSVFormatter object>
__init__()

Initialize self. See help(type(self)) for accurate signature.

static __new__(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]

Instantiates a new formatter.

This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.

get_formatter(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]

Looks up a formatter for the given item using this formatter as a base.

Equivalent to:

get_formatter(item.__class__, base_formatter=self)
is_partial: bool = False
parent: Optional[graphtage.formatter.Formatter[T]] = None
print(printer: graphtage.printer.Printer, node_or_edit: Union[TreeNode, Edit], with_edits: bool = True)

Prints the given node or edit.

Parameters
  • printer – The printer to which to write.

  • node_or_edit – The node or edit to print.

  • with_edits – If True, print any edits associated with the node.

Note

The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.

print_LeafNode(printer: graphtage.printer.Printer, node: graphtage.LeafNode)

Prints a leaf node, which should always be a column in a CSV row.

The node is escaped by first writing it to csv.writer():

csv.writer(...).writerow([node.object])
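
As a rough illustration of that escaping step, the sketch below runs a single value through Python's csv.writer and captures the result. The helper name escape_csv_field is hypothetical, and stripping the trailing line terminator is an assumption about how a single field would be extracted.

```python
import csv
import io


def escape_csv_field(value) -> str:
    """Hypothetical helper: escape one value the way csv.writer would."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([value])
    # csv.writer appends a line terminator after each row; drop it here.
    return buffer.getvalue().rstrip("\r\n")


plain = escape_csv_field("abc")
with_comma = escape_csv_field("a,b")
with_quotes = escape_csv_field('say "hi"')
```

Values containing the delimiter or quote characters come back quoted, while plain values are returned unchanged.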
property root

Returns the root formatter.

sub_format_types: Sequence[Type[graphtage.formatter.Formatter[T]]] = [<class 'graphtage.csv.CSVRows'>, <class 'graphtage.json.JSONFormatter'>]
sub_formatters: List[graphtage.formatter.Formatter[T]] = []

CSVNode

class graphtage.csv.CSVNode(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)

Bases: graphtage.ListNode

A node representing zero or more CSV rows.

__init__(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)

Initializes a List node.

Parameters
  • nodes – The set of nodes in this list.

  • allow_list_edits – Whether to consider removal and insertion when editing this list.

  • allow_list_edits_when_same_length – Whether to consider removal and insertion when comparing this list to another list of the same length.

__iter__() → Iterator[graphtage.TreeNode]

Iterates over this sequence’s child nodes.

This is equivalent to:

return iter(self._children)
__len__() → int

The number of children of this sequence.

This is equivalent to:

return len(self._children)
all_children_are_leaves() → bool

Tests whether all of the children of this container are leaves.

Equivalent to:

all(c.is_leaf for c in self)
Returns

True if all children are leaves.

Return type

bool

calculate_total_size()

Calculates the total size of this sequence.

This is equivalent to:

return sum(c.total_size for c in self)
children() → T

The children of this node.

Equivalent to:

list(self)
property container_type

The container type required by graphtage.sequences.SequenceNode

Returns

tuple

Return type

Type[Tuple[T, ...]]

dfs() → Iterator[graphtage.TreeNode]

Performs a depth-first traversal over all of this node’s descendants.

self is always included and yielded first.

This implementation is equivalent to:

stack = [self]
while stack:
    node = stack.pop()
    yield node
    stack.extend(reversed(node.children()))
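
To see the traversal order this produces, here is a self-contained sketch using a hypothetical Node class (not graphtage's TreeNode) with the same stack-based loop. Reversing the children before pushing them means the first child is popped first, giving a pre-order traversal.

```python
from typing import Iterator, List, Sequence


class Node:
    """Hypothetical stand-in for a tree node; not part of graphtage."""

    def __init__(self, name: str, children: Sequence["Node"] = ()):
        self.name = name
        self._children = list(children)

    def children(self) -> List["Node"]:
        return list(self._children)

    def dfs(self) -> Iterator["Node"]:
        stack = [self]
        while stack:
            node = stack.pop()
            yield node
            # Push children in reverse so the leftmost child is visited next.
            stack.extend(reversed(node.children()))


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
order = [n.name for n in tree.dfs()]
print(order)  # ['root', 'a', 'a1', 'a2', 'b']
```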
diff(node: graphtage.TreeNode) → Union[graphtage.EditedTreeNode, T]

Performs a diff against the provided node.

Parameters

node – The node against which to perform the diff.

Returns

An edited version of this node with all edits being completed.

Return type

Union[EditedTreeNode, T]

edit_modifiers: Optional[List[Callable[[graphtage.TreeNode, graphtage.TreeNode], Optional[graphtage.Edit]]]] = None
editable_dict() → Dict[str, Any]

Copies self.__dict__, calling TreeNode.make_edited() on all children.

This is equivalent to:

ret = dict(self.__dict__)
ret['_children'] = self.container_type(n.make_edited() for n in self)
return ret

This is used by SequenceNode.make_edited().

property edited

Returns whether this node has been edited.

The default implementation returns False, whereas EditedTreeNode.edited() returns True.

classmethod edited_type() → Type[Union[graphtage.EditedTreeNode, T]]

Dynamically constructs a new class that is both a TreeNode and an EditedTreeNode.

The edited type’s member variables are populated by the result of TreeNode.editable_dict() of the TreeNode it wraps:

new_node.__dict__ = dict(wrapped_tree_node.editable_dict())
Returns

A class that is both a TreeNode and an EditedTreeNode. Its constructor accepts a TreeNode that it will wrap.

Return type

Type[Union[EditedTreeNode, T]]

edits(node: graphtage.TreeNode) → graphtage.Edit

Calculates the best edit to transform this node into the provided node.

Parameters

node – The node to which to transform.

Returns

The best possible edit.

Return type

Edit

get_all_edits(node: graphtage.TreeNode) → Iterator[graphtage.Edit]

Returns an iterator over all edits that will transform this node into the provided node.

Parameters

node – The node to which to transform this one.

Returns

An iterator over edits. Note that this iterator will automatically explode any CompoundEdit in the sequence.

Return type

Iterator[Edit]

property is_leaf

Container nodes are never leaves, even if they have no children.

Returns

False

Return type

bool

make_edited() → Union[graphtage.EditedTreeNode, T]

Returns a new, copied instance of this node that is also an instance of EditedTreeNode.

This is equivalent to:

return self.edited_type()(self)
Returns

A copied version of this node that is also an instance of EditedTreeNode and thereby mutable.

Return type

Union[EditedTreeNode, T]
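
The interplay of edited_type(), editable_dict(), and make_edited() can be sketched with Python's built-in type() function. Everything below is a simplified illustration: TreeNodeSketch and EditedSketch are hypothetical classes, and graphtage's real implementation may differ in its details.

```python
class EditedSketch:
    """Hypothetical marker base, playing the role of EditedTreeNode."""

    @property
    def edited(self) -> bool:
        return True


class TreeNodeSketch:
    """Hypothetical node, playing the role of TreeNode."""

    def __init__(self, value):
        self.value = value

    @property
    def edited(self) -> bool:
        return False

    def editable_dict(self) -> dict:
        return dict(self.__dict__)

    @classmethod
    def edited_type(cls):
        def init(self, wrapped):
            # Populate the new instance from the wrapped node's state.
            self.__dict__ = dict(wrapped.editable_dict())

        # Dynamically build a class that inherits from both bases.
        return type(f"Edited{cls.__name__}", (EditedSketch, cls), {"__init__": init})

    def make_edited(self):
        return self.edited_type()(self)


node = TreeNodeSketch("cell")
edited_node = node.make_edited()
```

Listing the edited base first in the tuple of bases is what makes its edited property take precedence in this sketch.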

print(printer: graphtage.printer.Printer)

Prints a sequence node.

By default, sequence nodes are printed like lists:

SequenceFormatter('[', ']', ',').print(printer, self)
to_obj()

Returns a pure Python representation of this node.

For example, a node representing a list, like graphtage.ListNode, should return a Python list. A node representing a mapping, like graphtage.MappingNode, should return a Python dict. Container nodes should recursively call TreeNode.to_obj() on all of their children.

This is used solely to provide objects for evaluating command-line expressions, for options like --match-if and --match-unless.
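
A minimal sketch of the recursive conversion described above, using hypothetical Leaf and ListLike classes rather than graphtage's node types: leaves return their wrapped value, and containers call to_obj() on each child.

```python
class Leaf:
    """Hypothetical leaf node wrapping a single Python value."""

    def __init__(self, obj):
        self.object = obj

    def to_obj(self):
        return self.object


class ListLike:
    """Hypothetical container node that behaves like a list node."""

    def __init__(self, children):
        self._children = tuple(children)

    def to_obj(self):
        # Containers recurse into their children.
        return [child.to_obj() for child in self._children]


table = ListLike([
    ListLike([Leaf("name"), Leaf("qty")]),
    ListLike([Leaf("widget"), Leaf("3")]),
])
as_python = table.to_obj()
```

For a CSV-like structure, the result is a list of rows, each of which is a list of cell values.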

property total_size

The size of this node.

This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.

The first time this property is called, its value will be set and memoized by calling TreeNode.calculate_total_size().

Returns

An arbitrary integer representing the size of this node.

Return type

int

CSVRow

class graphtage.csv.CSVRow(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)

Bases: graphtage.ListNode

A node representing a row of a CSV file.

__init__(nodes: Iterable[T], allow_list_edits: bool = True, allow_list_edits_when_same_length: bool = True)

Initializes a List node.

Parameters
  • nodes – The set of nodes in this list.

  • allow_list_edits – Whether to consider removal and insertion when editing this list.

  • allow_list_edits_when_same_length – Whether to consider removal and insertion when comparing this list to another list of the same length.

__iter__() → Iterator[graphtage.TreeNode]

Iterates over this sequence’s child nodes.

This is equivalent to:

return iter(self._children)
__len__() → int

The number of children of this sequence.

This is equivalent to:

return len(self._children)
all_children_are_leaves() → bool

Tests whether all of the children of this container are leaves.

Equivalent to:

all(c.is_leaf for c in self)
Returns

True if all children are leaves.

Return type

bool

calculate_total_size()

Calculates the total size of this sequence.

This is equivalent to:

return sum(c.total_size for c in self)
children() → T

The children of this node.

Equivalent to:

list(self)
property container_type

The container type required by graphtage.sequences.SequenceNode

Returns

tuple

Return type

Type[Tuple[T, ...]]

dfs() → Iterator[graphtage.TreeNode]

Performs a depth-first traversal over all of this node’s descendants.

self is always included and yielded first.

This implementation is equivalent to:

stack = [self]
while stack:
    node = stack.pop()
    yield node
    stack.extend(reversed(node.children()))
diff(node: graphtage.TreeNode) → Union[graphtage.EditedTreeNode, T]

Performs a diff against the provided node.

Parameters

node – The node against which to perform the diff.

Returns

An edited version of this node with all edits being completed.

Return type

Union[EditedTreeNode, T]

edit_modifiers: Optional[List[Callable[[graphtage.TreeNode, graphtage.TreeNode], Optional[graphtage.Edit]]]] = None
editable_dict() → Dict[str, Any]

Copies self.__dict__, calling TreeNode.make_edited() on all children.

This is equivalent to:

ret = dict(self.__dict__)
ret['_children'] = self.container_type(n.make_edited() for n in self)
return ret

This is used by SequenceNode.make_edited().

property edited

Returns whether this node has been edited.

The default implementation returns False, whereas EditedTreeNode.edited() returns True.

classmethod edited_type() → Type[Union[graphtage.EditedTreeNode, T]]

Dynamically constructs a new class that is both a TreeNode and an EditedTreeNode.

The edited type’s member variables are populated by the result of TreeNode.editable_dict() of the TreeNode it wraps:

new_node.__dict__ = dict(wrapped_tree_node.editable_dict())
Returns

A class that is both a TreeNode and an EditedTreeNode. Its constructor accepts a TreeNode that it will wrap.

Return type

Type[Union[EditedTreeNode, T]]

edits(node: graphtage.TreeNode) → graphtage.Edit

Calculates the best edit to transform this node into the provided node.

Parameters

node – The node to which to transform.

Returns

The best possible edit.

Return type

Edit

get_all_edits(node: graphtage.TreeNode) → Iterator[graphtage.Edit]

Returns an iterator over all edits that will transform this node into the provided node.

Parameters

node – The node to which to transform this one.

Returns

An iterator over edits. Note that this iterator will automatically explode any CompoundEdit in the sequence.

Return type

Iterator[Edit]
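
One way to picture "exploding" a compound edit is as a recursive flattening of nested edits. The sketch below uses hypothetical EditSketch and CompoundEditSketch classes, and it assumes that only the sub-edits are yielded, not the compound edit itself; graphtage's actual behavior may differ.

```python
from typing import Iterable, Iterator


class EditSketch:
    """Hypothetical atomic edit; not graphtage.Edit."""

    def __init__(self, name: str):
        self.name = name


class CompoundEditSketch(EditSketch):
    """Hypothetical edit that groups several sub-edits."""

    def __init__(self, name: str, sub_edits: Iterable[EditSketch]):
        super().__init__(name)
        self.sub_edits = list(sub_edits)


def explode(edits: Iterable[EditSketch]) -> Iterator[EditSketch]:
    """Yield atomic edits, recursively flattening compound ones."""
    for edit in edits:
        if isinstance(edit, CompoundEditSketch):
            yield from explode(edit.sub_edits)
        else:
            yield edit


edits = [
    EditSketch("replace cell"),
    CompoundEditSketch("edit row", [EditSketch("remove cell"), EditSketch("insert cell")]),
]
names = [e.name for e in explode(edits)]
```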

property is_leaf

Container nodes are never leaves, even if they have no children.

Returns

False

Return type

bool

make_edited() → Union[graphtage.EditedTreeNode, T]

Returns a new, copied instance of this node that is also an instance of EditedTreeNode.

This is equivalent to:

return self.edited_type()(self)
Returns

A copied version of this node that is also an instance of EditedTreeNode and thereby mutable.

Return type

Union[EditedTreeNode, T]

print(printer: graphtage.printer.Printer)

Prints a sequence node.

By default, sequence nodes are printed like lists:

SequenceFormatter('[', ']', ',').print(printer, self)
to_obj()

Returns a pure Python representation of this node.

For example, a node representing a list, like graphtage.ListNode, should return a Python list. A node representing a mapping, like graphtage.MappingNode, should return a Python dict. Container nodes should recursively call TreeNode.to_obj() on all of their children.

This is used solely to provide objects for evaluating command-line expressions, for options like --match-if and --match-unless.

property total_size

The size of this node.

This is an arbitrary, immutable value that is used to calculate the bounded costs of edits on this node.

The first time this property is called, its value will be set and memoized by calling TreeNode.calculate_total_size().

Returns

An arbitrary integer representing the size of this node.

Return type

int
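
The memoization described for total_size can be sketched as a property that computes its value once and caches it. SizedNode is a hypothetical class, and treating a childless node as having size 1 is an assumption made only so the example produces a number.

```python
class SizedNode:
    """Hypothetical node; not graphtage's TreeNode."""

    def __init__(self, children=()):
        self._children = tuple(children)
        self._total_size = None
        self.calls = 0  # counts how often the size is actually computed

    def calculate_total_size(self) -> int:
        self.calls += 1
        if not self._children:
            return 1  # assumption: a leaf has size 1 in this sketch
        return sum(c.total_size for c in self._children)

    @property
    def total_size(self) -> int:
        # Compute on first access, then reuse the cached value.
        if self._total_size is None:
            self._total_size = self.calculate_total_size()
        return self._total_size


row = SizedNode([SizedNode(), SizedNode(), SizedNode()])
first = row.total_size
second = row.total_size
```

Because the value is cached, it must not change after the first access, which matches the "immutable" wording above.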

CSVRowFormatter

class graphtage.csv.CSVRowFormatter(*args, **kwargs)

Bases: graphtage.sequences.SequenceFormatter

A formatter for CSV rows.

DEFAULT_INSTANCE: graphtage.formatter.Formatter[T] = <graphtage.csv.CSVRowFormatter object>
__init__()

Initializes the formatter.

Equivalent to:

super().__init__('', '', ',')
static __new__(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]

Instantiates a new formatter.

This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.

edit_print(printer: graphtage.printer.Printer, edit: graphtage.Edit)

Called when the edit for an item is to be printed.

If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).

This implementation simply delegates the print to the Formatting Protocol:

self.print(printer, edit)
get_formatter(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]

Looks up a formatter for the given item using this formatter as a base.

Equivalent to:

get_formatter(item.__class__, base_formatter=self)
is_partial: bool = True
item_newline(printer: graphtage.printer.Printer, is_first: bool = False, is_last: bool = False)

An empty implementation, since each row should be printed as a single line.

items_indent(printer: graphtage.printer.Printer) → graphtage.printer.Printer

Returns a Printer context with an indentation.

This is called as:

with self.items_indent(printer) as p:

immediately after the self.start_symbol is printed, but before any of the items have been printed.

This default implementation is equivalent to:

return printer.indent()
parent: Optional[graphtage.formatter.Formatter[T]] = None
print(printer: graphtage.printer.Printer, node_or_edit: Union[TreeNode, Edit], with_edits: bool = True)

Prints the given node or edit.

Parameters
  • printer – The printer to which to write.

  • node_or_edit – The node or edit to print.

  • with_edits – If True, print any edits associated with the node.

Note

The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.

print_CSVRow(*args, **kwargs)

Prints a CSV row.

Equivalent to:

super().print_SequenceNode(*args, **kwargs)
print_SequenceNode(printer: graphtage.printer.Printer, node: graphtage.sequences.SequenceNode)

Formats a sequence node.

The protocol for this function is as follows:

  • Print self.start_symbol

  • With the printer returned by self.items_indent:
    • For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
      • Call self.item_newline(printer, is_first=index == 0)

      • Call self.edit_print(printer, edit)

  • If at least one edit was printed, then call self.item_newline(printer, is_last=True)

  • Print self.end_symbol

property root

Returns the root formatter.

sub_format_types: Sequence[Type[graphtage.formatter.Formatter[T]]] = ()
sub_formatters: List[graphtage.formatter.Formatter[T]] = []

CSVRows

class graphtage.csv.CSVRows(*args, **kwargs)

Bases: graphtage.sequences.SequenceFormatter

A sub formatter for printing the sequence of rows in a CSV file.

DEFAULT_INSTANCE: graphtage.formatter.Formatter[T] = <graphtage.csv.CSVRows object>
__init__()

Initializes the formatter.

Equivalent to:

super().__init__('', '', '')
static __new__(cls, *args, **kwargs) → graphtage.formatter.Formatter[T]

Instantiates a new formatter.

This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.

edit_print(printer: graphtage.printer.Printer, edit: graphtage.Edit)

Called when the edit for an item is to be printed.

If the SequenceNode being printed either is not edited or has no edits, then the edit passed to this function will be a Match(child, child, 0).

This implementation simply delegates the print to the Formatting Protocol:

self.print(printer, edit)
get_formatter(item: T) → Optional[Callable[[graphtage.printer.Printer, T], Any]]

Looks up a formatter for the given item using this formatter as a base.

Equivalent to:

get_formatter(item.__class__, base_formatter=self)
is_partial: bool = True
item_newline(printer: graphtage.printer.Printer, is_first: bool = False, is_last: bool = False)

Prints a newline on all but the first and last items.

items_indent(printer: graphtage.printer.Printer)

Returns printer because CSV rows do not need to be indented.

parent: Optional[graphtage.formatter.Formatter[T]] = None
print(printer: graphtage.printer.Printer, node_or_edit: Union[TreeNode, Edit], with_edits: bool = True)

Prints the given node or edit.

Parameters
  • printer – The printer to which to write.

  • node_or_edit – The node or edit to print.

  • with_edits – If True, print any edits associated with the node.

Note

The protocol for determining how a node or edit should be printed is very complex due to its extensibility. See the Printing Protocol for a detailed description.

print_CSVNode(*args, **kwargs)

Prints a CSV node.

Equivalent to:

super().print_SequenceNode(*args, **kwargs)
print_SequenceNode(printer: graphtage.printer.Printer, node: graphtage.sequences.SequenceNode)

Formats a sequence node.

The protocol for this function is as follows:

  • Print self.start_symbol

  • With the printer returned by self.items_indent:
    • For each edit in the sequence (or just a sequence of graphtage.Match for each child, if the node is not edited):
      • Call self.item_newline(printer, is_first=index == 0)

      • Call self.edit_print(printer, edit)

  • If at least one edit was printed, then call self.item_newline(printer, is_last=True)

  • Print self.end_symbol
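
The protocol above can be approximated with a small standalone function. This is a simplified sketch, not graphtage's SequenceFormatter: format_sequence and both newline callbacks are hypothetical, indentation is not modeled, and printing the delimiter between items is an assumption, since the protocol does not say where the delimiter is emitted.

```python
from typing import Callable, List, Sequence


def format_sequence(
    items: Sequence[str],
    start: str,
    end: str,
    delimiter: str,
    item_newline: Callable[..., None],
) -> str:
    out: List[str] = [start]
    for index, item in enumerate(items):
        if index > 0:
            out.append(delimiter)  # assumption: delimiter goes between items
        item_newline(out, is_first=index == 0, is_last=False)
        out.append(item)  # stands in for edit_print()
    if items:
        item_newline(out, is_first=False, is_last=True)
    out.append(end)
    return "".join(out)


def row_newline(out, is_first=False, is_last=False):
    """Like CSVRowFormatter: no newlines inside a row."""


def rows_newline(out, is_first=False, is_last=False):
    """Like CSVRows: a newline before every item except the first and last calls."""
    if not is_first and not is_last:
        out.append("\n")


row_text = format_sequence(["a", "b", "c"], "", "", ",", row_newline)
file_text = format_sequence(["a,b", "c,d"], "", "", "", rows_newline)
```

With empty start and end symbols, a comma delimiter for rows, and newlines only between rows, the output resembles ordinary CSV text.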

property root

Returns the root formatter.

sub_format_types: Sequence[Type[graphtage.formatter.Formatter[T]]] = [<class 'graphtage.csv.CSVRowFormatter'>]
sub_formatters: List[graphtage.formatter.Formatter[T]] = []

csv functions

build_tree

graphtage.csv.build_tree(path: str, options: Optional[graphtage.BuildOptions] = None, *args, **kwargs) → graphtage.csv.CSVNode

Constructs a CSVNode from a CSV file.

The file is parsed using Python’s csv.reader(). The elements in each row are constructed by delegating to graphtage.json.build_tree():

CSVRow([json.build_tree(i, options=options) for i in row])
Parameters
  • path – The path to the file to be parsed.

  • options – Optional build options to pass on to graphtage.json.build_tree().

  • *args – Any extra positional arguments are passed on to csv.reader().

  • **kwargs – Any extra keyword arguments are passed on to csv.reader().

Returns

The resulting CSV node object.

Return type

CSVNode
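
To illustrate how extra arguments flow through to csv.reader(), the sketch below parses a semicolon-delimited file using only the standard library. The helper read_rows is hypothetical, and plain nested lists stand in for the CSVNode and CSVRow objects that the real function returns.

```python
import csv
import os
import tempfile
from typing import List


def read_rows(path: str, *args, **kwargs) -> List[List[str]]:
    """Hypothetical helper: forward extra arguments to csv.reader()."""
    with open(path, newline="") as f:
        # Each row becomes a list of cell strings (a stand-in for CSVRow).
        return [list(row) for row in csv.reader(f, *args, **kwargs)]


# Write a small semicolon-delimited file to parse.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("name;qty\nwidget;3\n")

rows = read_rows(tmp.name, delimiter=";")
os.remove(tmp.name)
```

Passing delimiter=";" as a keyword argument is the kind of option that the **kwargs parameter described above is meant to carry.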