graphtage.utils¶
Generic utility functions and classes.
utils classes¶
HashableCounter¶
- class graphtage.utils.HashableCounter(*args, **kwargs)¶
Bases:
Generic
[T
],Counter
[T
],Counter
A
Counter
that supports being hashed even though it is mutable.- __add__(other)¶
Add counts from two counters.
>>> Counter('abbb') + Counter('bcc') Counter({'b': 4, 'c': 2, 'a': 1})
- __and__(other)¶
Intersection is the minimum of corresponding counts.
>>> Counter('abbb') & Counter('bcc') Counter({'b': 1})
- __delitem__(elem)¶
Like dict.__delitem__() but does not raise KeyError for missing values.
- __iadd__(other)¶
Inplace add from another counter, keeping only positive counts.
>>> c = Counter('abbb') >>> c += Counter('bcc') >>> c Counter({'b': 4, 'c': 2, 'a': 1})
- __iand__(other)¶
Inplace intersection is the minimum of corresponding counts.
>>> c = Counter('abbb') >>> c &= Counter('bcc') >>> c Counter({'b': 1})
- __init__(*args, **kwargs)¶
- __ior__(other)¶
Inplace union is the maximum of value from either counter.
>>> c = Counter('abbb') >>> c |= Counter('bcc') >>> c Counter({'b': 3, 'c': 2, 'a': 1})
- __isub__(other)¶
Inplace subtract counter, but keep only results with positive counts.
>>> c = Counter('abbbc') >>> c -= Counter('bccd') >>> c Counter({'b': 2, 'a': 1})
- __missing__(key)¶
The count of elements not in the Counter is zero.
- __neg__()¶
Subtracts from an empty counter. Strips positive and zero counts, and flips the sign on negative counts.
- __or__(other)¶
Union is the maximum of value in either of the input counters.
>>> Counter('abbb') | Counter('bcc') Counter({'b': 3, 'c': 2, 'a': 1})
- __pos__()¶
Adds an empty counter, effectively stripping negative and zero counts
- __sub__(other)¶
Subtract count, but keep only results with positive counts.
>>> Counter('abbbc') - Counter('bccd') Counter({'b': 2, 'a': 1})
- _keep_positive()¶
Internal method to strip elements with a negative or zero count
- clear() None. Remove all items from D. ¶
- copy()¶
Return a shallow copy.
- elements() Iterator ¶
Iterator over elements repeating each as many times as its count.
Examples:
>>> c = Counter('ABCABC') >>> sorted(c.elements()) ['A', 'A', 'B', 'B', 'C', 'C'] # Knuth's example for prime factors of 1836: 2**2 * 3**3 * 17**1 >>> prime_factors = Counter({2: 2, 3: 3, 17: 1}) >>> product = 1 >>> for factor in prime_factors.elements(): # loop over factors ... product *= factor # and multiply them >>> product 1836
Note
If an element’s count has been set to zero or is a negative number, elements() will ignore it.
- classmethod fromkeys(iterable, v=None)¶
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items() a set-like object providing a view on D's items ¶
- keys() a set-like object providing a view on D's keys ¶
- most_common(n=None)¶
List the n most common elements and their counts from the most common to the least. If n is None, then list all element counts.
>>> Counter('abracadabra').most_common(3) [('a', 5), ('b', 2), ('r', 2)]
- pop(k[, d]) v, remove specified key and return the corresponding value. ¶
If key is not found, d is returned if given, otherwise KeyError is raised
- popitem()¶
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- subtract(iterable=None, /, **kwds)¶
Like dict.update() but subtracts counts instead of replacing them. Counts can be reduced below zero. Both the inputs and outputs are allowed to contain zero and negative counts.
Source can be an iterable, a dictionary, or another Counter instance.
>>> c = Counter('which') >>> c.subtract('witch') # subtract elements from another iterable >>> c.subtract(Counter('watch')) # subtract elements from another counter >>> c['h'] # 2 in which, minus 1 in witch, minus 1 in watch 0 >>> c['w'] # 1 in which, minus 1 in witch, minus 1 in watch -1
- update(iterable=None, /, **kwds)¶
Like dict.update() but add counts instead of replacing them.
Source can be an iterable, a dictionary, or another Counter instance.
>>> c = Counter('which') >>> c.update('witch') # add elements from another iterable >>> d = Counter('watch') >>> c.update(d) # add elements from another counter >>> c['h'] # four 'h' in which, witch, and watch 4
- values() an object providing a view on D's values ¶
OrderedCounter¶
- class graphtage.utils.OrderedCounter(iterable=None, /, **kwds)¶
Bases:
Counter
,OrderedDict
A counter that remembers the order elements are first encountered
- __add__(other)¶
Add counts from two counters.
>>> Counter('abbb') + Counter('bcc') Counter({'b': 4, 'c': 2, 'a': 1})
- __and__(other)¶
Intersection is the minimum of corresponding counts.
>>> Counter('abbb') & Counter('bcc') Counter({'b': 1})
- __delitem__(elem)¶
Like dict.__delitem__() but does not raise KeyError for missing values.
- __iadd__(other)¶
Inplace add from another counter, keeping only positive counts.
>>> c = Counter('abbb') >>> c += Counter('bcc') >>> c Counter({'b': 4, 'c': 2, 'a': 1})
- __iand__(other)¶
Inplace intersection is the minimum of corresponding counts.
>>> c = Counter('abbb') >>> c &= Counter('bcc') >>> c Counter({'b': 1})
- __init__(iterable=None, /, **kwds)¶
Create a new, empty Counter object. And if given, count elements from an input iterable. Or, initialize the count from another mapping of elements to their counts.
>>> c = Counter() # a new, empty counter >>> c = Counter('gallahad') # a new counter from an iterable >>> c = Counter({'a': 4, 'b': 2}) # a new counter from a mapping >>> c = Counter(a=4, b=2) # a new counter from keyword args
- __ior__(other)¶
Inplace union is the maximum of value from either counter.
>>> c = Counter('abbb') >>> c |= Counter('bcc') >>> c Counter({'b': 3, 'c': 2, 'a': 1})
- __isub__(other)¶
Inplace subtract counter, but keep only results with positive counts.
>>> c = Counter('abbbc') >>> c -= Counter('bccd') >>> c Counter({'b': 2, 'a': 1})
- __missing__(key)¶
The count of elements not in the Counter is zero.
- __neg__()¶
Subtracts from an empty counter. Strips positive and zero counts, and flips the sign on negative counts.
- __or__(other)¶
Union is the maximum of value in either of the input counters.
>>> Counter('abbb') | Counter('bcc') Counter({'b': 3, 'c': 2, 'a': 1})
- __pos__()¶
Adds an empty counter, effectively stripping negative and zero counts
- __sub__(other)¶
Subtract count, but keep only results with positive counts.
>>> Counter('abbbc') - Counter('bccd') Counter({'b': 2, 'a': 1})
- _keep_positive()¶
Internal method to strip elements with a negative or zero count
- clear() None. Remove all items from od. ¶
- copy()¶
Return a shallow copy.
- elements() Iterator ¶
Iterator over elements repeating each as many times as its count.
Examples:
>>> c = Counter('ABCABC') >>> sorted(c.elements()) ['A', 'A', 'B', 'B', 'C', 'C'] # Knuth's example for prime factors of 1836: 2**2 * 3**3 * 17**1 >>> prime_factors = Counter({2: 2, 3: 3, 17: 1}) >>> product = 1 >>> for factor in prime_factors.elements(): # loop over factors ... product *= factor # and multiply them >>> product 1836
Note
If an element’s count has been set to zero or is a negative number, elements() will ignore it.
- classmethod fromkeys(iterable, v=None)¶
Create a new ordered dictionary with keys from iterable and values set to value.
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items() a set-like object providing a view on D's items ¶
- keys() a set-like object providing a view on D's keys ¶
- most_common(n=None)¶
List the n most common elements and their counts from the most common to the least. If n is None, then list all element counts.
>>> Counter('abracadabra').most_common(3) [('a', 5), ('b', 2), ('r', 2)]
- move_to_end(key, last=True)¶
Move an existing element to the end (or beginning if last is false).
Raise KeyError if the element does not exist.
- pop(k[, d]) v, remove specified key and return the corresponding ¶
value. If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(last=True)¶
Remove and return a (key, value) pair from the dictionary.
Pairs are returned in LIFO order if last is true or FIFO order if false.
- setdefault(key, default=None)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- subtract(iterable=None, /, **kwds)¶
Like dict.update() but subtracts counts instead of replacing them. Counts can be reduced below zero. Both the inputs and outputs are allowed to contain zero and negative counts.
Source can be an iterable, a dictionary, or another Counter instance.
>>> c = Counter('which') >>> c.subtract('witch') # subtract elements from another iterable >>> c.subtract(Counter('watch')) # subtract elements from another counter >>> c['h'] # 2 in which, minus 1 in witch, minus 1 in watch 0 >>> c['w'] # 1 in which, minus 1 in witch, minus 1 in watch -1
- update(iterable=None, /, **kwds)¶
Like dict.update() but add counts instead of replacing them.
Source can be an iterable, a dictionary, or another Counter instance.
>>> c = Counter('which') >>> c.update('witch') # add elements from another iterable >>> d = Counter('watch') >>> c.update(d) # add elements from another counter >>> c['h'] # four 'h' in which, witch, and watch 4
- values() an object providing a view on D's values ¶
Sized¶
SparseMatrix¶
- class graphtage.utils.SparseMatrix(num_rows: int | None = None, num_cols: int | None = None, default_value: T | None = None)¶
Bases:
Sized
,Generic
[T
],Mapping
[int
,MutableMapping
[int
,Optional
[T
]]]A sparse matrix that can store arbitrary Python objects.
For sparse matrices storing homogeneous items and/or native types, it is more efficient to use an implementation like a scipy sparse matrix.
- class SparseMatrixRow(row_num: int, num_cols: int | None = None, default_value: T | None = None)¶
Bases:
Sized
,MutableMapping
[int
,Optional
[T
]]A row of a sparse matrix.
- __getitem__(col: int) T | None ¶
Returns the value of the given column of this row.
- Parameters:
col – The index of the column to get.
- Returns:
The value of the column, or the default value if it has not yet been set.
- Return type:
Optional[T]
- Raises:
IndexError – If
self.num_cols
is notNone
andcol
is greater than or equal to it.
- __init__(row_num: int, num_cols: int | None = None, default_value: T | None = None)¶
Initializes a sparse matrix row.
- Parameters:
row_num – The index of this row.
num_cols – An optional number of columns in this row. If omitted, this row will be unbounded and allow insertion and access at any column index.
default_value – An optional default value to use if a column is accessed before it is assigned.
- clear()¶
Clears the contents of this row.
- get(k[, d]) D[k] if k in D, else d. d defaults to None. ¶
- getsizeof() int ¶
Returns an approximation of the number of bytes used by this row in memory.
This is equivalent to:
return sys.getsizeof(self) + getsizeof(self.row)
- items() a set-like object providing a view on D's items ¶
- keys() a set-like object providing a view on D's keys ¶
- pop(k[, d]) v, remove specified key and return the corresponding value. ¶
If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem() (k, v), remove and return some (key, value) pair ¶
as a 2-tuple; but raise KeyError if D is empty.
- setdefault(k[, d]) D.get(k,d), also set D[k]=d if k not in D ¶
- shape() int ¶
The number of columns in this row.
This is equivalent to:
if self.num_cols is None: if self.row: return max(self.row.keys()) + 1 else: return 0 else: return self.num_cols
- update([E, ]**F) None. Update D from mapping/iterable E and F. ¶
If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v
- values() an object providing a view on D's values ¶
- __getitem__(row: int) MutableMapping[int, T | None] ¶
Returns the value of the given row of this matrix.
- Parameters:
row – The index of the row to get.
- Returns:
The contents of the row.
- Return type:
MutableMapping[int, Optional[T]]
- Raises:
IndexError – If
self.num_rows
is notNone
androw
is greater than or equal to it.
- __init__(num_rows: int | None = None, num_cols: int | None = None, default_value: T | None = None)¶
Initializes a sparse matrix.
- Parameters:
num_rows – An optional bound on the number of rows.
num_cols – An optional bound on the number of columns.
default_value – An optional default value to return if cells are accessed before they are set.
- __len__() int ¶
Returns the number of rows in this matrix.
If
self.num_rows
isNone
, this will return the maximum row index of a cell that has been set, plus one.
- clear()¶
Clears the contents of this matrix.
- get(k[, d]) D[k] if k in D, else d. d defaults to None. ¶
- items() a set-like object providing a view on D's items ¶
- keys() a set-like object providing a view on D's keys ¶
- num_filled_elements() int ¶
Counts the number of elements in this matrix that have been explicitly set.
- rows: Dict[int, SparseMatrixRow]¶
The rows of this matrix.
- values() an object providing a view on D's values ¶
Tempfile¶
- class graphtage.utils.Tempfile(contents: bytes, prefix: str | None = None, suffix: str | None = None)¶
Bases:
object
Creates a temporary file containing a given byte sequence.
The file will automatically be cleaned up after its use. This is useful for interacting with functions and libraries that only accept a path and not a stream object.
Examples
>>> from graphtage.utils import Tempfile >>> with Tempfile(b"foo") as tmp: ... print(tmp) /var/folders/bs/hrvzrctx6tg2_j17gb6wckph0000gn/T/tmpkza5fvr_ >>> with Tempfile(b"foo") as tmp: ... with open(tmp, 'r') as f: ... print(f.read()) foo
- __exit__(type, value, traceback)¶
Automatically cleans up the tempfile.
utils functions¶
getsizeof¶
- graphtage.utils.getsizeof(obj) int ¶
A function to calculate the memory footprint of an object.
If the object implements the
Sized
protocol (i.e., if it implements agetsizeof
method), then this is used:if hasattr(obj, 'getsizeof'): return obj.getsizeof()
If the object is a list or tuple, then:
return sys.getsizeof(obj) + sum(getsizeof(i) for i in obj)
If the object is a dict, then:
return sys.getsizeof(obj) + sum(getsizeof(key) + getsizeof(value) for key, value in obj.items())
Otherwise:
return sys.getsizeof(obj)
- Parameters:
obj – The object to measure.
- Returns:
An approximation of the memory footprint of the object, in bytes.
- Return type: