graphtage.formatter

A module for extensible and reusable textual formatting.

Why does the formatter module exist?

This module is completely generic, with no ties to Graphtage. However, it is easiest to see why it is necessary with an example from Graphtage: filetypes, edits, and nodes. Graphtage is designed to be usable as a library to define new filetypes, node types, and edits. Graphtage also allows any input type to be output as if it were any other type. For example, two JSON files could be diffed and the output printed in YAML. Or a JSON file could be diffed against another JSON file and then output in some other format. This is enabled through use of an intermediate representation based on graphtage.TreeNode.

Say a developer uses Graphtage as a library to develop support for some new file format as both input and output. The intermediate representation means that we would immediately be able to compare files in that new format to both JSON and YAML files. However, what if the user requests that a JSON file be output as the new format? Or what if an input file in the new format is to be output as YAML? How does the preexisting YAML formatter know how to deal with the new node and edit types defined for the new format?

What the formatter module can do

This is where the formatter module comes into play. It uses a metaclass that registers formatter subclasses, plus a bit of dynamic type inference based on an object's method resolution order, to figure out the best formatter for a specific object. The following examples should make this clearer.

Examples

>>> from graphtage.printer import Printer
>>> from graphtage.formatter import BasicFormatter, get_formatter
>>> class StringFormatter(BasicFormatter[str]):
...     def print_str(self, printer: Printer, item: str):
...         printer.write(f"StringFormatter: {item}")
...         printer.newline()
...
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo

The first thing to note here is that simply subclassing Formatter registers the new class with this module, so it will be considered when resolving the best formatter for an object in calls to get_formatter(). This registration can be disabled by setting the class variable Formatter.is_partial to True.

>>> class IntFormatter(BasicFormatter[int]):
...     def print_int(self, printer: Printer, item: int):
...         printer.write(f"IntFormatter: {item}")
...         printer.newline()
...
>>> get_formatter(int)(Printer(), 1337)
IntFormatter: 1337

It works for any object type. It is not necessary to specify a generic type when subclassing Formatter or BasicFormatter; this is just available for convenience, readability, and automated type checking.

The next thing we will demonstrate is how formatter lookup works with inheritance:

>>> class Foo:
...     pass
...
>>> class FooFormatter(BasicFormatter[Foo]):
...     def print_Foo(self, printer: Printer, item: Foo):
...         printer.write("FooFormatter")
...         printer.newline()
...
>>> get_formatter(Foo)(Printer(), Foo())
FooFormatter
>>> class Bar(Foo):
...     def __init__(self, bar):
...         self.bar = bar
...
>>> get_formatter(Bar)(Printer(), Bar(None))
FooFormatter

Straightforward enough. But what if we define a separate formatter that handles objects of type Bar?

>>> class BarFormatter(BasicFormatter[Bar]):
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...
>>> get_formatter(Bar)(Printer(), Bar(None))
BarFormatter: None
>>> get_formatter(Bar)(Printer(), Bar(Foo()))
BarFormatter: FooFormatter
>>> get_formatter(Bar)(Printer(), Bar(Bar("foo")))
BarFormatter: BarFormatter: StringFormatter: foo
>>> get_formatter(Bar)(Printer(), Bar(Bar(1337)))
BarFormatter: BarFormatter: IntFormatter: 1337

Cool, huh? But what if there are collisions? Let’s extend BarFormatter to also handle strings:

>>> from typing import Any
>>> class BarFormatter(BasicFormatter[Any]):
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...
...     def print_str(self, printer: Printer, item: str):
...         printer.write(''.join(reversed(item.upper())))
...         printer.newline()
...
>>> get_formatter(Bar)(Printer(), Bar("foo"))
BarFormatter: OOF
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo

As you can see, self.print(printer, item.bar) gives preference to locally defined implementations before doing a global lookup for a formatter with a print function.

We just got “lucky” with that last printout, though, because the print_str in BarFormatter has the same precedence in the global get_formatter() lookup as the implementation in StringFormatter. So:

>>> BarFormatter.DEFAULT_INSTANCE.print(Printer(), Bar("foo"))
BarFormatter: OOF
>>> StringFormatter.DEFAULT_INSTANCE.print(Printer(), "foo")
StringFormatter: foo

That will always be true. However, the following might happen:

>>> get_formatter(str)(Printer(), "foo")
BarFormatter: OOF

That behavior might not be desirable. To prevent it (i.e., to compartmentalize the BarFormatter implementation of print_str and only use it when expanding a string inside of a Bar), Formatter classes can be organized hierarchically:

>>> class BarStringFormatter(BasicFormatter[str]):
...     is_partial = True # This prevents this class from being registered as a global formatter
...     def print_str(self, printer: Printer, item: str):
...         printer.write(''.join(reversed(item.upper())))
...         printer.newline()
...
>>> class BarFormatter(BasicFormatter[Bar]):
...     sub_format_types = [BarStringFormatter]
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...

Now,

>>> get_formatter(Bar)(Printer(), Bar("foo"))
BarFormatter: OOF
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo

this will always be the case, and the final command will never invoke the BarFormatter implementation.

The sequence of function resolution that happens in the self.print call in print_Bar follows the “Formatting Protocol”, which is described in the next section.

Formatting Protocol

The following describes how this module resolves the proper formatter and function to print a given item.

The protocol takes two inputs: an optional formatter that is actively being used (e.g., when print_Bar calls self.print in the BarFormatter example above) and the item that is to be formatted. Resolution then proceeds as follows:

  • If formatter is given:
    • For each type in item.__class__.__mro__:
      • If a print function specifically associated with type.__name__ exists in formatter, then use that function.

      • Else, repeat this process recursively for any formatters in Formatter.sub_format_types.

      • If none of the subformatters is specialized in type, see if this formatter is the subformatter of another parent formatter. If so, repeat this process for the parent.

  • If no formatter has been found by this point, iterate over all other global registered formatters that have not yet been tested, and repeat this process given each one.
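The steps above can be sketched in plain Python. This is a simplified illustration and not Graphtage's implementation: SketchFormatter, _search, and resolve are hypothetical names, the print functions return strings instead of writing to a Printer, and the bookkeeping of already-tested formatters is reduced to a per-search visited set.

```python
from typing import Any, Callable, List, Optional, Sequence, Set, Type


class SketchFormatter:
    """Hypothetical stand-in for a formatter; not graphtage's Formatter."""

    sub_format_types: Sequence[Type["SketchFormatter"]] = ()

    def __init__(self, parent: Optional["SketchFormatter"] = None):
        self.parent = parent
        # Instantiate sub-formatters and point them back at this formatter.
        self.sub_formatters: List["SketchFormatter"] = [
            sub_type(parent=self) for sub_type in self.sub_format_types
        ]


def _search(
    formatter: SketchFormatter, type_name: str, visited: Set[int]
) -> Optional[Callable[..., Any]]:
    """Checks the formatter itself, then its sub-formatters, then its parent."""
    if id(formatter) in visited:
        return None
    visited.add(id(formatter))
    func = getattr(formatter, f"print_{type_name}", None)
    if func is not None:
        return func
    for sub in formatter.sub_formatters:
        func = _search(sub, type_name, visited)
        if func is not None:
            return func
    if formatter.parent is not None:
        return _search(formatter.parent, type_name, visited)
    return None


def resolve(item_type: type, base=None, registry=()):
    """Tries the active formatter first, then every other registered formatter."""
    starts = [] if base is None else [base]
    starts.extend(f for f in registry if f is not base)
    for start in starts:
        # Walk the MRO so that the most specific type is tried first.
        for t in item_type.__mro__:
            func = _search(start, t.__name__, set())
            if func is not None:
                return func
    return None


class Foo:
    pass


class Bar(Foo):
    pass


class GlobalStr(SketchFormatter):
    def print_str(self, item: str) -> str:
        return f"global:{item}"


class LocalStr(SketchFormatter):
    def print_str(self, item: str) -> str:
        return f"local:{item}"


class FooLike(SketchFormatter):
    sub_format_types = [LocalStr]

    def print_Foo(self, item: Foo) -> str:
        return "foo"


# LocalStr is deliberately left out of the registry, mimicking is_partial = True.
registry = [GlobalStr(), FooLike()]
foo_formatter = registry[1]

local_result = resolve(str, base=foo_formatter, registry=registry)("x")
global_result = resolve(str, registry=registry)("x")
inherited_result = resolve(Bar, registry=registry)(Bar())
missing = resolve(int, base=foo_formatter, registry=registry)
```

With an active formatter, the string is handled by its sub-formatter; without one, the first registered formatter that can handle the type is used (GlobalStr here, because it is first in this sketch's registry). Bar has no dedicated print function, so the lookup falls back to print_Foo through the MRO.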

formatter classes

BasicFormatter

class graphtage.formatter.BasicFormatter(*args, **kwargs)

Bases: Generic[T], Formatter[T]

A basic formatter that falls back on an item’s natural string representation if no formatter is found.

DEFAULT_INSTANCE: Formatter[T] = <graphtage.formatter.BasicFormatter object>

A default instance of this formatter, automatically instantiated by the FormatterChecker metaclass.

__init__()

static __new__(cls, *args, **kwargs) -> Formatter[T]

Instantiates a new formatter.

This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
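As a rough illustration of what this population step could look like, the following sketch overrides __new__ on a hypothetical SketchFormatter class. It is an assumption about the general shape of the mechanism, not Graphtage's code; in particular, the root property shown here simply follows parent links until it reaches a formatter without a parent.

```python
from typing import List, Optional, Sequence, Type


class SketchFormatter:
    """Hypothetical stand-in; not graphtage's Formatter."""

    sub_format_types: Sequence[Type["SketchFormatter"]] = ()
    sub_formatters: List["SketchFormatter"] = []
    parent: Optional["SketchFormatter"] = None

    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        # Give every instance its own list instead of sharing the class default.
        instance.sub_formatters = []
        for sub_type in cls.sub_format_types:
            sub = sub_type()
            sub.parent = instance
            instance.sub_formatters.append(sub)
        return instance

    @property
    def root(self) -> "SketchFormatter":
        # Assumed behavior: climb the parent chain to the top-level formatter.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


class Leaf(SketchFormatter):
    pass


class Middle(SketchFormatter):
    sub_format_types = [Leaf]


class Top(SketchFormatter):
    sub_format_types = [Middle]


top = Top()
middle = top.sub_formatters[0]
leaf = middle.sub_formatters[0]
```

Here leaf.parent is middle, middle.parent is top, and leaf.root is top, which is the kind of hierarchy the Formatting Protocol walks.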

get_formatter(item: T) -> Callable[[Printer, T], Any] | None

Looks up a formatter for the given item using this formatter as a base.

Equivalent to:

get_formatter(item.__class__, base_formatter=self)

is_partial: bool = False

parent: Formatter[T] | None = None

The parent formatter for this formatter instance.

This is automatically populated by Formatter.__new__() and should never be manually modified.

print(printer: Printer, item: T)

Prints the item to the printer.

This is equivalent to:

formatter = self.get_formatter(item)
if formatter is None:
    printer.write(str(item))
else:
    formatter(printer, item)

property root: Formatter[T]

Returns the root formatter.

sub_format_types: Sequence[Type[Formatter[T]]] = ()

A list of formatter types that should be used as sub-formatters in the Formatting Protocol.

sub_formatters: List[Formatter[T]] = []

The list of instantiated formatters corresponding to Formatter.sub_format_types.

This list is automatically populated by Formatter.__new__() and should never be manually modified.

Formatter

class graphtage.formatter.Formatter(*args, **kwargs)

Bases: Generic[T]

DEFAULT_INSTANCE: Formatter[T] = None

A default instance of this formatter, automatically instantiated by the FormatterChecker metaclass.

__init__()

static __new__(cls, *args, **kwargs) -> Formatter[T]

Instantiates a new formatter.

This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.

get_formatter(item: T) -> Callable[[Printer, T], Any] | None

Looks up a formatter for the given item using this formatter as a base.

Equivalent to:

get_formatter(item.__class__, base_formatter=self)

is_partial: bool = False

parent: Formatter[T] | None = None

The parent formatter for this formatter instance.

This is automatically populated by Formatter.__new__() and should never be manually modified.

abstract print(printer: Printer, item: T)

Prints an item to the printer using the proper formatter.

This method is abstract because subclasses should decide how to handle the case when a formatter was not found (i.e., when self.get_formatter returns None).

property root: Formatter[T]

Returns the root formatter.

sub_format_types: Sequence[Type[Formatter[T]]] = ()

A list of formatter types that should be used as sub-formatters in the Formatting Protocol.

sub_formatters: List[Formatter[T]] = []

The list of instantiated formatters corresponding to Formatter.sub_format_types.

This list is automatically populated by Formatter.__new__() and should never be manually modified.

FormatterChecker

class graphtage.formatter.FormatterChecker(name, bases, namespace, **kwargs)

Bases: ABCMeta

The metaclass for Formatter.

For every class that subclasses Formatter, if Formatter.is_partial is False (the default) and if the class is not abstract, then an instance of that class is automatically constructed and added to the global list of formatters. This same automatically constructed instance will be assigned to the Formatter.DEFAULT_INSTANCE attribute.
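The registration behavior described above can be approximated with a small metaclass. This is a minimal sketch under stated assumptions: SketchChecker, SketchFormatter, and REGISTRY are hypothetical names, and the real FormatterChecker may differ in detail.

```python
from abc import ABCMeta, abstractmethod
from typing import List

# Hypothetical global list of automatically constructed formatter instances.
REGISTRY: List[object] = []


class SketchChecker(ABCMeta):
    """Hypothetical metaclass that registers concrete, non-partial subclasses."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.DEFAULT_INSTANCE = None
        # Abstract classes cannot be instantiated; partial classes opt out.
        if not cls.__abstractmethods__ and not cls.is_partial:
            cls.DEFAULT_INSTANCE = cls()
            REGISTRY.append(cls.DEFAULT_INSTANCE)


class SketchFormatter(metaclass=SketchChecker):
    is_partial = False

    @abstractmethod
    def print(self, item):
        """Subclasses decide how to render an item."""


class Registered(SketchFormatter):
    def print(self, item):
        return str(item)


class Partial(SketchFormatter):
    is_partial = True  # Opts out of the global registry.

    def print(self, item):
        return repr(item)
```

After these definitions, REGISTRY holds exactly one instance (Registered.DEFAULT_INSTANCE), while the abstract base class and the partial class are left unregistered.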

All methods of the subclass that begin with “print_” will be verified insofar as it is possible.

__init__(name, bases, clsdict)

Initializes the formatter checker.

Raises:

TypeError – If cls defines a method starting with “print_” that has a parameter named printer with a type hint that is not a subclass of graphtage.printer.Printer.
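A check of this kind can be written with the standard inspect module. The sketch below is illustrative only: SketchPrinter and check_print_methods are hypothetical names standing in for graphtage.printer.Printer and the metaclass's internal verification, whose exact logic is not shown here.

```python
import inspect


class SketchPrinter:
    """Hypothetical stand-in for graphtage.printer.Printer."""


def check_print_methods(namespace) -> None:
    """Raises TypeError if a print_* method has a bad 'printer' type hint."""
    for name, value in namespace.items():
        if not name.startswith("print_") or not callable(value):
            continue
        params = inspect.signature(value).parameters
        if "printer" not in params:
            continue
        hint = params["printer"].annotation
        if hint is inspect.Parameter.empty:
            continue  # No type hint, so there is nothing to verify.
        if inspect.isclass(hint) and not issubclass(hint, SketchPrinter):
            raise TypeError(
                f"{name}: the type hint for 'printer' must be a subclass "
                f"of SketchPrinter, not {hint!r}"
            )


class Good:
    def print_str(self, printer: SketchPrinter, item: str):
        pass


class Bad:
    def print_str(self, printer: int, item: str):
        pass


check_print_methods(vars(Good))  # Passes silently.

try:
    check_print_methods(vars(Bad))
    error = None
except TypeError as exc:
    error = str(exc)
```

Running the check at class-creation time means a mistyped printer parameter is reported as soon as the formatter class is defined, rather than when it is first used.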

__instancecheck__(instance)

Override for isinstance(instance, cls).

__subclasscheck__(subclass)

Override for issubclass(subclass, cls).

_abc_caches_clear()

Clear the caches (for debugging or testing).

_abc_registry_clear()

Clear the registry (for debugging or testing).

_dump_registry(file=None)

Debug helper to print the ABC registry.

mro()

Return a type’s method resolution order.

register(subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

formatter functions

get_formatter

graphtage.formatter.get_formatter(node_type: Type[T], base_formatter: Formatter | None = None) -> Callable[[Printer, T], Any] | None

Uses the Formatting Protocol to determine the correct formatter for a given type.

See the Examples section above for a number of examples.

Parameters:
  • node_type – The type of the object to be formatted.

  • base_formatter – An existing formatter from which the request is being made. This will affect the formatter resolution according to the Formatting Protocol.

Returns: The formatter for object type node_type, or None if none was found.