graphtage.formatter
A module for extensible and reusable textual formatting.
Why does the formatter module exist?
This module is completely generic, with no ties to Graphtage. However, it is easiest to see why it is necessary with an
example from Graphtage: filetypes, edits, and nodes. The problem is that Graphtage is designed to be capable of use as a
library to define new filetypes, node types, and edits. Graphtage also allows any input type to be output as if it
were any other type. For example, two JSON files could be diffed and the output printed in YAML. Or a JSON file could be
diffed against another JSON file and then output in some other format. This is enabled through use of an intermediate
representation based on graphtage.TreeNode. Say a developer uses Graphtage as a library to develop support for
some new file format as both input and output. The intermediate representation means that we would immediately be able
to compare files in that new format to both JSON and YAML files. However, what if the user requests that a JSON file
be output as the new format? Or what if an input file in the new format is to be output as YAML? How does the
preexisting YAML formatter know how to deal with the new node and edit types defined for the new format?
What the formatter module can do
This is where the formatter module comes into play. It inspects each object's type at runtime, including its base classes, to figure out the best formatter for a specific object. The following examples should make this clearer.
Examples
>>> from graphtage.printer import Printer
>>> from graphtage.formatter import BasicFormatter, get_formatter
>>> class StringFormatter(BasicFormatter[str]):
...     def print_str(self, printer: Printer, item: str):
...         printer.write(f"StringFormatter: {item}")
...         printer.newline()
...
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo
The first thing to note here is that simply subclassing Formatter will register it with this module so it will be considered when resolving the best formatter for an object in calls to get_formatter(). This registration can be disabled by setting the class variable Formatter.is_partial to True.
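The registration mechanism described above can be pictured with a small stand-alone sketch. It does not import Graphtage: the names AutoRegister, Base, and REGISTRY are hypothetical, and the real FormatterChecker metaclass may differ in its details (for example, in how it treats abstract classes).

```python
from typing import List

REGISTRY: List["Base"] = []  # hypothetical global list of default instances


class AutoRegister(type):
    """Hypothetical metaclass that registers every non-partial subclass."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Skip the root class (it has no bases) and any class whose own body
        # sets is_partial = True.
        if bases and not namespace.get("is_partial", False):
            cls.DEFAULT_INSTANCE = cls()
            REGISTRY.append(cls.DEFAULT_INSTANCE)


class Base(metaclass=AutoRegister):
    is_partial = False


class Registered(Base):
    pass


class Hidden(Base):
    is_partial = True  # opt out of global registration


registered_names = [type(instance).__name__ for instance in REGISTRY]
print(registered_names)
```

Only Registered ends up in the registry; Hidden is still an ordinary class that can be instantiated and used explicitly.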
>>> class IntFormatter(BasicFormatter[int]):
...     def print_int(self, printer: Printer, item: int):
...         printer.write(f"IntFormatter: {item}")
...         printer.newline()
...
>>> get_formatter(int)(Printer(), 1337)
IntFormatter: 1337
It works for any object type. It is not necessary to specify a generic type when subclassing Formatter or BasicFormatter; this is just available for convenience, readability, and automated type checking.
The next thing we will demonstrate is how formatter lookup works with inheritance:
>>> class Foo:
...     pass
...
>>> class FooFormatter(BasicFormatter[Foo]):
...     def print_Foo(self, printer: Printer, item: Foo):
...         printer.write("FooFormatter")
...         printer.newline()
...
>>> get_formatter(Foo)(Printer(), Foo())
FooFormatter
>>> class Bar(Foo):
...     def __init__(self, bar):
...         self.bar = bar
...
>>> get_formatter(Bar)(Printer(), Bar(None))
FooFormatter
Straightforward enough. But what if we define a separate formatter that handles objects of type Bar?
>>> class BarFormatter(BasicFormatter[Bar]):
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...
>>> get_formatter(Bar)(Printer(), Bar(None))
BarFormatter: None
>>> get_formatter(Bar)(Printer(), Bar(Foo()))
BarFormatter: FooFormatter
>>> get_formatter(Bar)(Printer(), Bar(Bar("foo")))
BarFormatter: BarFormatter: StringFormatter: foo
>>> get_formatter(Bar)(Printer(), Bar(Bar(1337)))
BarFormatter: BarFormatter: IntFormatter: 1337
Cool, huh? But what if there are collisions? Let’s extend BarFormatter to also handle strings:
>>> from typing import Any
>>> class BarFormatter(BasicFormatter[Any]):
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...
...     def print_str(self, printer: Printer, item: str):
...         printer.write(''.join(reversed(item.upper())))
...         printer.newline()
...
>>> get_formatter(Bar)(Printer(), Bar("foo"))
BarFormatter: OOF
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo
As you can see, self.print(printer, item.bar) gives preference to locally defined implementations before doing a global lookup for a formatter with a print function.
We just got “lucky” with that last printout, though, because the print_str in BarFormatter has the same precedence in the global get_formatter() lookup as the implementation in StringFormatter. So:
>>> BarFormatter.DEFAULT_INSTANCE.print(Printer(), Bar("foo"))
BarFormatter: OOF
>>> StringFormatter.DEFAULT_INSTANCE.print(Printer(), "foo")
StringFormatter: foo
That will always be true. However, the following might happen:
>>> get_formatter(str)(Printer(), "foo")
BarFormatter: OOF
That behavior might not be desirable. To prevent that (i.e., to compartmentalize the BarFormatter implementation of print_str and only use it when expanding a string inside of a Bar), Formatter classes can be organized hierarchically:
>>> class BarStringFormatter(BasicFormatter[str]):
...     is_partial = True  # This prevents this class from being registered as a global formatter
...     def print_str(self, printer: Printer, item: str):
...         printer.write(''.join(reversed(item.upper())))
...         printer.newline()
...
>>> class BarFormatter(BasicFormatter[Bar]):
...     sub_format_types = [BarStringFormatter]
...     def print_Bar(self, printer: Printer, item: Bar):
...         printer.write("BarFormatter: ")
...         self.print(printer, item.bar)
...
Now,
>>> get_formatter(Bar)(Printer(), Bar("foo"))
BarFormatter: OOF
>>> get_formatter(str)(Printer(), "foo")
StringFormatter: foo
this will always be the case, and the final command will never invoke the BarFormatter implementation.
The sequence of function resolution that happens in the self.print call in print_Bar follows the “Formatting Protocol”, which is described in the next section.
Formatting Protocol
The following describes how this module resolves the proper formatter and function to print a given item.
The inputs are an optional formatter that is actively being used (e.g., when print_Bar calls self.print in the BarFormatter example, above) and the item that is to be formatted.
- If formatter is given:
  - For each type in item.__class__.__mro__:
    - If a print function specifically associated with type.__name__ exists in formatter, then use that function.
    - Else, repeat this process recursively for any formatters in Formatter.sub_format_types.
    - If none of the sub-formatters is specialized in type, see if this formatter is the sub-formatter of another parent formatter. If so, repeat this process for the parent.
- If no formatter has been found by this point, iterate over all other global registered formatters that have not yet been tested, and repeat this process given each one.
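The steps above can be sketched as a small stand-alone resolver. This is one reading of the protocol, not Graphtage's implementation: MiniFormatter, GLOBAL_FORMATTERS, _find, and resolve are hypothetical names, the print functions return strings instead of writing to a Printer, and the bookkeeping of "already tested" formatters is simplified.

```python
from typing import Any, Callable, List, Optional, Set

GLOBAL_FORMATTERS: List["MiniFormatter"] = []  # hypothetical stand-in registry


class MiniFormatter:
    """Hypothetical stand-in for a formatter; not Graphtage's Formatter."""

    sub_format_types: tuple = ()

    def __init__(self, parent: Optional["MiniFormatter"] = None):
        self.parent = parent
        # Instantiate sub-formatters and point each one back at this instance.
        self.sub_formatters = [sub(parent=self) for sub in self.sub_format_types]


def _find(formatter: MiniFormatter, type_name: str, seen: Set[int]) -> Optional[Callable]:
    """Check one formatter, then its sub-formatters, then its parent."""
    if id(formatter) in seen:
        return None  # already examined for this type; avoids parent/child cycles
    seen.add(id(formatter))
    func = getattr(formatter, f"print_{type_name}", None)
    if func is not None:
        return func
    for sub in formatter.sub_formatters:
        found = _find(sub, type_name, seen)
        if found is not None:
            return found
    if formatter.parent is not None:
        return _find(formatter.parent, type_name, seen)
    return None


def resolve(item: Any, formatter: Optional[MiniFormatter] = None) -> Optional[Callable]:
    """Return a bound print function for item, or None if nothing matches."""
    # The active formatter (if any) is tried first, then the other global ones.
    candidates = ([formatter] if formatter is not None else []) + [
        g for g in GLOBAL_FORMATTERS if g is not formatter
    ]
    for candidate in candidates:
        for item_type in type(item).__mro__:  # most specific class first
            found = _find(candidate, item_type.__name__, set())
            if found is not None:
                return found
    return None


class Bar:
    def __init__(self, bar):
        self.bar = bar


class ShoutingStr(MiniFormatter):
    def print_str(self, item: str) -> str:
        return item.upper()


class BarFmt(MiniFormatter):
    sub_format_types = (ShoutingStr,)  # only consulted from within BarFmt

    def print_Bar(self, item: Bar) -> str:
        inner = resolve(item.bar, self)
        # Fall back on str() when no print function is found.
        return "Bar: " + (inner(item.bar) if inner else str(item.bar))


class PlainStr(MiniFormatter):
    def print_str(self, item: str) -> str:
        return f"str: {item}"


GLOBAL_FORMATTERS.extend([PlainStr(), BarFmt()])

print(resolve("foo")("foo"))            # str: foo
print(resolve(Bar("foo"))(Bar("foo")))  # Bar: FOO
```

Because ShoutingStr is reachable only as a sub-formatter of BarFmt, a top-level string is handled by PlainStr, while a string nested inside a Bar is handled by ShoutingStr.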
formatter classes
BasicFormatter
- class graphtage.formatter.BasicFormatter(*args, **kwargs)
Bases: Generic[T], Formatter[T]
A basic formatter that falls back on an item’s natural string representation if no formatter is found.
- DEFAULT_INSTANCE: Formatter[T] = <graphtage.formatter.BasicFormatter object>
A default instance of this formatter, automatically instantiated by the FormatterChecker metaclass.
- __init__()
- static __new__(cls, *args, **kwargs) -> Formatter[T]
Instantiates a new formatter.
This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
- get_formatter(item: T) -> Callable[[Printer, T], Any] | None
Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
- parent: Formatter[T] | None = None
The parent formatter for this formatter instance.
This is automatically populated by Formatter.__new__() and should never be manually modified.
- print(printer: Printer, item: T)
Prints the item to the printer.
This is equivalent to:
formatter = self.get_formatter(item)
if formatter is None:
    printer.write(str(item))
else:
    formatter(printer, item)
- sub_format_types: Sequence[Type[Formatter[T]]] = ()
A list of formatter types that should be used as sub-formatters in the Formatting Protocol.
- sub_formatters: List[Formatter[T]] = []
The list of instantiated formatters corresponding to Formatter.sub_format_types.
This list is automatically populated by Formatter.__new__() and should never be manually modified.
Formatter
- class graphtage.formatter.Formatter(*args, **kwargs)
Bases: Generic[T]
- DEFAULT_INSTANCE: Formatter[T] = None
A default instance of this formatter, automatically instantiated by the FormatterChecker metaclass.
- __init__()
- static __new__(cls, *args, **kwargs) -> Formatter[T]
Instantiates a new formatter.
This automatically instantiates and populates Formatter.sub_formatters and sets their parent to this new formatter.
- get_formatter(item: T) -> Callable[[Printer, T], Any] | None
Looks up a formatter for the given item using this formatter as a base.
Equivalent to:
get_formatter(item.__class__, base_formatter=self)
- parent: Formatter[T] | None = None
The parent formatter for this formatter instance.
This is automatically populated by Formatter.__new__() and should never be manually modified.
- abstract print(printer: Printer, item: T)
Prints an item to the printer using the proper formatter.
This method is abstract because subclasses should decide how to handle the case when a formatter was not found (i.e., when self.get_formatter returns None).
- sub_format_types: Sequence[Type[Formatter[T]]] = ()
A list of formatter types that should be used as sub-formatters in the Formatting Protocol.
- sub_formatters: List[Formatter[T]] = []
The list of instantiated formatters corresponding to Formatter.sub_format_types.
This list is automatically populated by Formatter.__new__() and should never be manually modified.
FormatterChecker
- class graphtage.formatter.FormatterChecker(name, bases, namespace, **kwargs)
Bases: ABCMeta
The metaclass for Formatter.
For every class that subclasses Formatter, if Formatter.is_partial is False (the default) and if the class is not abstract, then an instance of that class is automatically constructed and added to the global list of formatters. This same automatically constructed instance will be assigned to the Formatter.DEFAULT_INSTANCE attribute.
All methods of the subclass that begin with “print_” will be verified insofar as it is possible.
- __init__(name, bases, clsdict)
Initializes the formatter checker.
- Raises:
TypeError – If cls defines a method starting with “print” that has a keyword argument printer with a type hint that is not a subclass of graphtage.printer.Printer.
- __instancecheck__(instance)
Override for isinstance(instance, cls).
- __subclasscheck__(subclass)
Override for issubclass(subclass, cls).
- _abc_caches_clear()
Clear the caches (for debugging or testing).
- _abc_registry_clear()
Clear the registry (for debugging or testing).
- _dump_registry(file=None)
Debug helper to print the ABC registry.
- mro()
Return a type’s method resolution order.
- register(subclass)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
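The kind of verification that FormatterChecker.__init__ is documented to perform on print_ methods can be approximated as follows. This is a minimal sketch under stated assumptions: Printer here is a hypothetical stand-in for graphtage.printer.Printer, and check_print_methods and CheckedMeta are illustrative names rather than part of Graphtage.

```python
import inspect


class Printer:
    """Hypothetical stand-in for graphtage.printer.Printer."""


def check_print_methods(namespace: dict) -> None:
    """Raise TypeError if a print_* method hints `printer` as a non-Printer class."""
    for name, member in namespace.items():
        if not name.startswith("print_") or not callable(member):
            continue
        param = inspect.signature(member).parameters.get("printer")
        if param is None or param.annotation is inspect.Parameter.empty:
            continue  # no printer argument or no type hint: nothing to verify
        hint = param.annotation
        # Only class annotations can be checked; string hints are skipped.
        if inspect.isclass(hint) and not issubclass(hint, Printer):
            raise TypeError(
                f"{name}: printer is annotated as {hint.__name__}, not a Printer"
            )


class CheckedMeta(type):
    """Hypothetical metaclass that validates print_* methods at class creation."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        check_print_methods(namespace)


class Good(metaclass=CheckedMeta):
    def print_int(self, printer: Printer, item: int):
        ...


rejected = None
try:
    class Bad(metaclass=CheckedMeta):
        def print_int(self, printer: str, item: int):
            ...
except TypeError as error:
    rejected = str(error)

print(rejected)
```

Running the check in the metaclass means a mis-annotated formatter fails when its class is defined, rather than later when it is first used.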
formatter functions
get_formatter
- graphtage.formatter.get_formatter(node_type: Type[T], base_formatter: Formatter | None = None) -> Callable[[Printer, T], Any] | None
Uses the Formatting Protocol to determine the correct formatter for a given type.
See the Examples section for a number of examples.
- Parameters:
node_type – The type of the object to be formatted.
base_formatter – An existing formatter from which the request is being made. This will affect the formatter resolution according to the Formatting Protocol.
Returns: The formatter for object type node_type, or None if none was found.