mirror of
https://github.com/jbranchaud/til
synced 2026-03-03 22:48:45 +00:00
Add Easy Key-Value Aggregates With defaultdict as a Python TIL
@@ -10,7 +10,7 @@ working across different projects via [VisualMode](https://www.visualmode.dev/).

 For a steady stream of TILs, [sign up for my newsletter](https://visualmode.kit.com/newsletter).

-_1748 TILs and counting..._
+_1749 TILs and counting..._

 See some of the other learning resources I work on:

@@ -1044,6 +1044,7 @@ If you've learned something here, support my efforts writing daily TILs by
 - [Check If Package Is Installed With Pip](python/check-if-package-is-installed-with-pip.md)
 - [Create A Dummy DataFrame In Pandas](python/create-a-dummy-dataframe-in-pandas.md)
 - [Dunder Methods](python/dunder-methods.md)
+- [Easy Key-Value Aggregates With defaultdict](python/easy-key-value-aggregates-with-defaultdict.md)
 - [Install With PIP For Specific Interpreter](python/install-with-pip-for-specific-interpreter.md)
 - [Iterate First N Items From Enumerable](python/iterate-first-n-items-from-enumerable.md)
 - [Keep A Tally With collections.Counter](python/keep-a-tally-with-collections-counter.md)
53
python/easy-key-value-aggregates-with-defaultdict.md
Normal file
@@ -0,0 +1,53 @@
# Easy Key-Value Aggregates With defaultdict

The `collections` module provides `defaultdict`, a `dict` subclass that can be
used to aggregate values tied to a key. What sets it apart from a plain `dict`
is that we get the base value for free. If our aggregate value is a list, then
each new key starts out as `[]`. In the same way, we'd get `0` if it were
constructed with `int`.

Here is the counter example from [Keep A Tally With
collections.Counter](keep-a-tally-with-collections-counter.md):

```python
from collections import defaultdict

def get_pair_counts(token_ids: list[int]) -> defaultdict[tuple[int, int], int]:
    """Count how often each adjacent pair appears"""
    counts = defaultdict(int)
    for i in range(len(token_ids) - 1):
        pair = (token_ids[i], token_ids[i + 1])
        # A pair seen for the first time starts at int() == 0.
        counts[pair] += 1
    return counts
```

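As a quick sanity check, here is a small usage sketch of the pair-counting function. The `token_ids` sample is made up for illustration, and the `Counter(zip(...))` comparison is just one alternative way to get the same tallies, not something the original example relies on.

```python
from collections import Counter, defaultdict

def get_pair_counts(token_ids: list[int]) -> defaultdict[tuple[int, int], int]:
    """Count how often each adjacent pair appears"""
    counts = defaultdict(int)
    for i in range(len(token_ids) - 1):
        counts[(token_ids[i], token_ids[i + 1])] += 1
    return counts

token_ids = [1, 2, 1, 2, 3]  # made-up sample input
pair_counts = get_pair_counts(token_ids)
print(dict(pair_counts))  # {(1, 2): 2, (2, 1): 1, (2, 3): 1}

# Counting zipped neighbors with Counter produces the same tallies.
counter_counts = Counter(zip(token_ids, token_ids[1:]))
print(dict(counter_counts) == dict(pair_counts))  # True
```
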
We never have to initialize a key to `0`. If the key is not yet present, then
`defaultdict`'s `__missing__` hook calls `int()` (the zero-value constructor)
and stores the result as that key's starting value.

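One consequence worth seeing in a sketch: only indexing with `[]` goes through `__missing__`. The key names below are arbitrary, chosen just for illustration.

```python
from collections import defaultdict

counts = defaultdict(int)

# Indexing a missing key calls the default factory (int) and stores the result.
print(counts["seen"])        # 0
print("seen" in counts)      # True

# .get() and `in` bypass __missing__, so they do not create keys.
print(counts.get("unseen"))  # None
print("unseen" in counts)    # False
print(dict(counts))          # {'seen': 0}
```

So merely reading `counts[key]` is enough to add `key` to the dictionary, which is something to keep in mind when checking for membership.
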
We can do the same with `list`:

```python
>>> import collections
>>> stuff = collections.defaultdict(list)
>>> stuff['alpha'].append(1)
>>> stuff['alpha']
[1]
>>> stuff['beta']
[]
```

In the same way, a missing key triggers `__missing__`, which calls `list()` to
start off each key with an empty `[]`.

I find this so handy because in other languages I've typically had to do
something more like this:

```python
items = ["apple", "kiwi", "fig"]  # sample input for illustration

words_by_length = {}
for item in items:
    if len(item) not in words_by_length:
        words_by_length[len(item)] = []
    words_by_length[len(item)].append(item)
```

This is much clunkier.

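For comparison, here is a sketch of the same grouping written with `defaultdict(list)`. The `items` list is a hypothetical sample input; any iterable of strings would do.

```python
from collections import defaultdict

# Hypothetical sample input for illustration.
items = ["apple", "kiwi", "fig", "plum", "pear"]

words_by_length = defaultdict(list)
for item in items:
    # No membership check needed: a new length starts out as [].
    words_by_length[len(item)].append(item)

print(dict(words_by_length))
# {5: ['apple'], 4: ['kiwi', 'plum', 'pear'], 3: ['fig']}
```

The explicit `if ... not in` branch disappears, and the loop body is down to the one line that does the aggregating.
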