graphtage.matching

A module for solving variants of the assignment problem.

Much of the code in this module is a nearly complete (but partial) implementation of Karp’s solution to the minimum weight bipartite matching problem [Karp78]. This is the problem of finding an edge between all pairs of nodes such that the sum of their weights is minimized. It is only a partial implementation because, part way through developing it, it was discovered that scipy’s implementation—while asymptotically inferior—is, in practice, almost always superior because it is compiled and not implemented in pure Python.

The two components of this module in which you will most likely be interested are min_weight_bipartite_matching() and a class that wraps it, WeightedBipartiteMatcher.

Example

>>> from graphtage.matching import min_weight_bipartite_matching
>>> from_nodes = ['a', 'b', 'c', 'd', 'e']
>>> to_nodes = range(10)
>>> edge_weights = lambda c, n: ord(c) + n
>>> min_weight_bipartite_matching(from_nodes, to_nodes, edge_weights)
{0: (0, 97), 1: (1, 99), 2: (2, 101), 3: (3, 103), 4: (4, 105)}
# The output format here is: "from_nodes_index: (matched_to_nodes_index, matched_edge_weight)"
>>> min_weight_bipartite_matching(range(5), range(10, 20), lambda a, b: a + b)
{0: (0, 10), 1: (1, 12), 2: (2, 14), 3: (3, 16), 4: (4, 18)}

matching classes

Edge

class graphtage.matching.Edge(from_node: MatchingFromNode[T], to_node: MatchingToNode[T], weight: Bounded)

Bases: Bounded, Generic[T]

Edge data structure used in the implementation of [Karp78].

__init__(from_node: MatchingFromNode[T], to_node: MatchingToNode[T], weight: Bounded)
bounds() Range

Returns the bounds of this object.

property cost_bar: int
property cost_star: int
tighten_bounds() bool

Attempts to shrink the bounds of this object.

Returns:

True if the bounds were tightened.

Return type:

bool

Matching

class graphtage.matching.Matching

Bases: Generic[T], Set, Bounded, Set[Edge[T]]

An abstract base class used by the partial implementation of [Karp78].

__init__()
classmethod _from_iterable(it)

Construct an instance of the class from any iterable input.

Must override this method if the class constructor signature does not accept an iterable for an input.

_hash()

Compute the hash value of a set.

Note that we don’t define __hash__: not all sets are hashable. But if you define a hashable set type, its __hash__ should call this function.

This must be compatible __eq__.

All sets ought to compare equal if they contain the same elements, regardless of how they are implemented, and regardless of the order of the elements; so there’s not much freedom for __eq__ or __hash__. We match the algorithm used by the built-in frozenset type.

add(edge: Edge[T])

Add an element to a set.

This has no effect if the element is already present.

bounds() Range

Returns the bounds of this object.

clear()

Remove all elements from this set.

copy()

Return a shallow copy of a set.

difference()

Return the difference of two or more sets as a new set.

(i.e. all elements that are in this set but not the others.)

difference_update()

Remove all elements of another set from this set.

discard()

Remove an element from a set if it is a member.

If the element is not a member, do nothing.

intersection()

Return the intersection of two sets as a new set.

(i.e. all elements that are in both sets.)

intersection_update()

Update a set with the intersection of itself and another.

isdisjoint(other)

Return True if two sets have a null intersection.

issubset()

Report whether another set contains this set.

issuperset()

Report whether this set contains another set.

pop()

Remove and return an arbitrary set element. Raises KeyError if the set is empty.

remove()

Remove an element from a set; it must be a member.

If the element is not a member, raise a KeyError.

symmetric_difference(matching: Set[Edge[T]]) Matching[T]

Return the symmetric difference of two sets as a new set.

(i.e. all elements that are in exactly one of the sets.)

symmetric_difference_update()

Update a set with the symmetric difference of itself and another.

tighten_bounds() bool

Attempts to shrink the bounds of this object.

Returns:

True if the bounds were tightened.

Return type:

bool

union()

Return the union of sets as a new set.

(i.e. all elements that are in either set.)

update()

Update a set with the union of itself and others.

MatchingFromNode

class graphtage.matching.MatchingFromNode(*args, **kwargs)

Bases: Generic[T], MatchingNode[T]

__init__(*args, **kwargs)
construct_edges() Dict[MatchingNode[T], Edge[T]]
edges() Iterable[Edge[T]]
property sorted_neighbors: SortedEdges[T]

MatchingNode

class graphtage.matching.MatchingNode(matcher: WeightedBipartiteMatcher[T], node: T)

Bases: Generic[T]

Node data structure used in the implementation of [Karp78].

__init__(matcher: WeightedBipartiteMatcher[T], node: T)
abstract construct_edges() Dict[MatchingNode[T], Edge[T]]
edges() Iterable[Edge[T]]

MatchingToNode

class graphtage.matching.MatchingToNode(matcher: WeightedBipartiteMatcher[T], node: T)

Bases: Generic[T], MatchingNode[T]

A node type used in the implementation of [Karp78].

__init__(matcher: WeightedBipartiteMatcher[T], node: T)
construct_edges() Dict[MatchingNode[T], Edge[T]]
edges() Iterable[Edge[T]]

PathSet

class graphtage.matching.PathSet

Bases: Matching[T], Generic[T]

A version of a Matching with edge directions overridden, used for [Karp78]

__init__()
classmethod _from_iterable(it)

Construct an instance of the class from any iterable input.

Must override this method if the class constructor signature does not accept an iterable for an input.

_hash()

Compute the hash value of a set.

Note that we don’t define __hash__: not all sets are hashable. But if you define a hashable set type, its __hash__ should call this function.

This must be compatible __eq__.

All sets ought to compare equal if they contain the same elements, regardless of how they are implemented, and regardless of the order of the elements; so there’s not much freedom for __eq__ or __hash__. We match the algorithm used by the built-in frozenset type.

add(edge: Edge[T], flip_direction: bool)

Add an element to a set.

This has no effect if the element is already present.

bounds() Range

Returns the bounds of this object.

clear()

Remove all elements from this set.

copy()

Return a shallow copy of a set.

difference()

Return the difference of two or more sets as a new set.

(i.e. all elements that are in this set but not the others.)

difference_update()

Remove all elements of another set from this set.

discard()

Remove an element from a set if it is a member.

If the element is not a member, do nothing.

intersection()

Return the intersection of two sets as a new set.

(i.e. all elements that are in both sets.)

intersection_update()

Update a set with the intersection of itself and another.

isdisjoint(other)

Return True if two sets have a null intersection.

issubset()

Report whether another set contains this set.

issuperset()

Report whether this set contains another set.

path_to(from_any_of: Set[MatchingFromNode[T]], node: MatchingToNode[T]) Set[Edge[T]]
pop()

Remove and return an arbitrary set element. Raises KeyError if the set is empty.

remove()

Remove an element from a set; it must be a member.

If the element is not a member, raise a KeyError.

symmetric_difference(matching: Set[Edge[T]]) Matching[T]

Return the symmetric difference of two sets as a new set.

(i.e. all elements that are in exactly one of the sets.)

symmetric_difference_update()

Update a set with the symmetric difference of itself and another.

tighten_bounds() bool

Attempts to shrink the bounds of this object.

Returns:

True if the bounds were tightened.

Return type:

bool

union()

Return the union of sets as a new set.

(i.e. all elements that are in either set.)

update()

Update a set with the union of itself and others.

QueueElement

class graphtage.matching.QueueElement(edge: Edge[T], cost: int, is_special: bool)

Bases: object

A helper datastructure used by WeightedBipartiteMatcherPARTIAL_IMPLEMENTATION

__init__(edge: Edge[T], cost: int, is_special: bool)

SortedEdges

class graphtage.matching.SortedEdges(edges: Iterable[Edge[T]])

Bases: Generic[T]

A sorted collection of edges.

__init__(edges: Iterable[Edge[T]])
head() Edge[T]
tail() Edge[T]

WeightedBipartiteMatcher

class graphtage.matching.WeightedBipartiteMatcher(from_nodes: Iterable[T], to_nodes: Iterable[T], get_edge: Callable[[T, T], Bounded | None])

Bases: Bounded, Generic[T]

A graphtage.TreeNode matcher built atop min_weight_bipartite_matching().

It works by iteratively and selectively tightening the bipartite graph’s edges’ bounds until they are all “distinct” (i.e., their bounded ranges are all either graphtage.bounds.Range.definitive() or non-overlapping). It does this using graphtage.bounds.make_distinct(). Then min_weight_bipartite_matching() is used to solve the minimum weight matching problem.

This is used by graphtage.multiset.MultiSetEdit to match graphtage.MultiSetNode.

__init__(from_nodes: Iterable[T], to_nodes: Iterable[T], get_edge: Callable[[T, T], Bounded | None])

Initializes the weighted bipartite matcher.

Parameters:
  • from_nodes – The nodes from which to match.

  • to_nodes – The nodes to which to match.

  • get_edge – A function returning a bounded edge between two nodes (or None if there is no edge). For example, this could be a graphtage.Edge.

bounds() Range

Returns the bounds of this object.

property edges: List[List[Bounded | None]]

Returns a dense matrix of the edges in the graph.

This property lazily constructs the edge matrix and memoizes the result.

is_complete() bool

Whether the matching has been completed, regardless of whether the bounds have been fully tightened.

property matching: Mapping[T, Tuple[T, Bounded]]

Returns the minimum weight matching.

Returns:

A mapping from self.from_nodes

to a tuples with the matched node in self.to_nodes and the edge between them.

Return type:

Mapping[T, Tuple[T, Bounded]]

Note

This function will perform any necessary computation to determine the mapping in the event that it has not already been completed through prior calls to WeightedBipartiteMatcher.tighten_bounds().

tighten_bounds() bool

Tightens the bounds on the minimum weight matching.

Returns: True if the bounds were able to be tightened.

WeightedBipartiteMatcherPARTIAL_IMPLEMENTATION

class graphtage.matching.WeightedBipartiteMatcherPARTIAL_IMPLEMENTATION(from_nodes: Iterable[T], to_nodes: Iterable[T], get_edge: Callable[[T, T], Bounded | None])

Bases: Bounded, Generic[T]

Partial implementation of An Algorithm to Solve the \(m \times n\) Assignment Problem in Expected Time \(O(mn \log n)\) [Karp78].

The implementation is partial because I realized partway through that, even though this implementation has better asymptotic bounds, scipy.optimize.linear_sum_assignment() will almost always be much faster since it is implemented in C++ and not pure Python.

It is retained here in the event that it is ever needed in the future.

__init__(from_nodes: Iterable[T], to_nodes: Iterable[T], get_edge: Callable[[T, T], Bounded | None])
bounds() Range

Returns the bounds of this object.

free_destinations() Iterator[MatchingToNode[T]]
free_sources() Iterator[MatchingFromNode[T]]
tighten_bounds() bool

Attempts to shrink the bounds of this object.

Returns:

True if the bounds were tightened.

Return type:

bool

matching functions

get_dtype

graphtage.matching.get_dtype(min_value: int, max_value: int) dtype

Returns the smallest numpy dtype capable of storing integers in the range [min_value, max_value]

min_weight_bipartite_matching

graphtage.matching.min_weight_bipartite_matching(from_nodes: Sequence[T], to_nodes: Sequence[T], get_edges: Callable[[T, T], W | None]) Mapping[int, Tuple[int, bool | int | float]]

Calculates the minimum weight bipartite matching between two sequences.

Parameters:
  • from_nodes – The nodes in the first bipartite set.

  • to_nodes – The nodes in the second bipartite set.

  • get_edges – A function mapping pairs of edges to the weight of the edge between them, or None if there is no edge between them.

Warning

get_edges is expected to return all edge weights using the same type, either int, float, or bool.

Warning

get_edges can only return None for an edge weight if the rest of the edge weights are of type either int or float. Graphs with bool edge weights must be complete.

Warning

This function uses native types for computational efficiency. Edge weights larger than 2**64 or less than -2**63 will result in undefined behavior.

Returns:

A mapping from from_node indices to pairs (matched_to_node_index, edge_weight).

Raises:
  • ValueError – If not all of the edges are the same type.

  • ValueError – If a boolean edge weighted graph is not complete.