mirror of
https://github.com/jbranchaud/til
synced 2026-03-07 00:18:46 +00:00
Add Use Verbose Flag To Get More Diff as a Python TIL
@@ -10,7 +10,7 @@ working across different projects via [VisualMode](https://www.visualmode.dev/).

 For a steady stream of TILs, [sign up for my newsletter](https://visualmode.kit.com/newsletter).

-_1752 TILs and counting..._
+_1753 TILs and counting..._

 See some of the other learning resources I work on:

@@ -1055,6 +1055,7 @@ If you've learned something here, support my efforts writing daily TILs by
 - [Store And Access Immutable Data In A Tuple](python/store-and-access-immutable-data-in-a-tuple.md)
 - [Test A Function With Pytest](python/test-a-function-with-pytest.md)
 - [Use pipx To Install End User Apps](python/use-pipx-to-install-end-user-apps.md)
+- [Use Verbose Flag To Get More Diff](python/use-verbose-flag-to-get-more-diff.md)

 ### Rails

161
python/use-verbose-flag-to-get-more-diff.md
Normal file
@@ -0,0 +1,161 @@
# Use Verbose Flag To Get More Diff

Here is the output of running some `pytest` unit tests. A couple of the tests
pass, which produces little output. But I get a big block of details for the one
failing test. In this case the failure is an assertion between two lists that
don't match.

```bash
❯ uv run pytest
========================================== test session starts ==========================================
platform darwin -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /Users/lastword/dev/misc/build-an-llm
configfile: pyproject.toml
collected 3 items

tests/chapter_02/test_bpe_tokenizer.py .F. [100%]

=============================================== FAILURES ================================================
_____________________________________ test_merge_with_byte_sequence _____________________________________

    def test_merge_with_byte_sequence():
        token_ids = [1, 2, 3, 4, 5, 2, 3, 1, 2, 3, 4, 1]
        merged_tokens = BPETokenizer._merge(token_ids, [2, 3, 4], 256)
        # assert merged_tokens == [1, 256, 5, 2, 3, 1, 256, 1]
>       assert merged_tokens == [1, 256, 5, 4, 5, 1, 256, 1]
E       assert [1, 256, 5, 2, 3, 1, ...] == [1, 256, 5, 4, 5, 1, ...]
E
E         At index 3 diff: 2 != 4
E         Use -v to get more diff

tests/chapter_02/test_bpe_tokenizer.py:13: AssertionError
======================================== short test summary info ========================================
FAILED tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_sequence - assert [1, 256, 5, 2, 3, 1, ...] == [1, 256, 5, 4, 5, 1, ...]
====================================== 1 failed, 2 passed in 0.02s ======================================
```

The lists are too long to fully display in the failure output. `pytest` is able
to tell us two useful things though. First, it mentions that the first
discrepancy in the lists is at index `3` where `2 != 4`. Second, it says `Use -v
to get more diff`.

Let's try rerunning the tests with `-v`.

```bash
❯ uv run pytest -v
========================================== test session starts ==========================================
platform darwin -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0 -- /Users/lastword/dev/misc/build-an-llm/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /Users/lastword/dev/misc/build-an-llm
configfile: pyproject.toml
collected 3 items

tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_pair PASSED [ 33%]
tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_sequence FAILED [ 66%]
tests/chapter_02/test_bpe_tokenizer.py::test_subsequence_at_index PASSED [100%]

=============================================== FAILURES ================================================
_____________________________________ test_merge_with_byte_sequence _____________________________________

    def test_merge_with_byte_sequence():
        token_ids = [1, 2, 3, 4, 5, 2, 3, 1, 2, 3, 4, 1]
        merged_tokens = BPETokenizer._merge(token_ids, [2, 3, 4], 256)
        # assert merged_tokens == [1, 256, 5, 2, 3, 1, 256, 1]
>       assert merged_tokens == [1, 256, 5, 4, 5, 1, 256, 1]
E       AssertionError: assert [1, 256, 5, 2, 3, 1, ...] == [1, 256, 5, 4, 5, 1, ...]
E
E         At index 3 diff: 2 != 4
E
E         Full diff:
E           [
E               1,
E               256,...
E
E         ...Full output truncated (13 lines hidden), use '-vv' to show

tests/chapter_02/test_bpe_tokenizer.py:13: AssertionError
======================================== short test summary info ========================================
FAILED tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_sequence - AssertionError: assert [1, 256, 5, 2, 3, 1, ...] == [1, 256, 5, 4, 5, 1, ...]
====================================== 1 failed, 2 passed in 0.02s ======================================
```

That was sort of a tease because it starts to display a "Full diff", but that
gets quickly truncated. `pytest` then tells us that we can `use '-vv' to show`
the full diff.

```bash
❯ uv run pytest -vv
========================================== test session starts ==========================================
platform darwin -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0 -- /Users/lastword/dev/misc/build-an-llm/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /Users/lastword/dev/misc/build-an-llm
configfile: pyproject.toml
collected 3 items

tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_pair PASSED [ 33%]
tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_sequence FAILED [ 66%]
tests/chapter_02/test_bpe_tokenizer.py::test_subsequence_at_index PASSED [100%]

=============================================== FAILURES ================================================
_____________________________________ test_merge_with_byte_sequence _____________________________________

    def test_merge_with_byte_sequence():
        token_ids = [1, 2, 3, 4, 5, 2, 3, 1, 2, 3, 4, 1]
        merged_tokens = BPETokenizer._merge(token_ids, [2, 3, 4], 256)
        # assert merged_tokens == [1, 256, 5, 2, 3, 1, 256, 1]
>       assert merged_tokens == [1, 256, 5, 4, 5, 1, 256, 1]
E       assert [1, 256, 5, 2, 3, 1, 256, 1] == [1, 256, 5, 4, 5, 1, 256, 1]
E
E         At index 3 diff: 2 != 4
E
E         Full diff:
E           [
E               1,
E               256,
E               5,
E         -     4,
E         ?     ^
E         +     2,
E         ?     ^
E         -     5,
E         ?     ^
E         +     3,
E         ?     ^
E               1,
E               256,
E               1,
E           ]

tests/chapter_02/test_bpe_tokenizer.py:13: AssertionError
======================================== short test summary info ========================================
FAILED tests/chapter_02/test_bpe_tokenizer.py::test_merge_with_byte_sequence - assert [1, 256, 5, 2, 3, 1, 256, 1] == [1, 256, 5, 4, 5, 1, 256, 1]

  At index 3 diff: 2 != 4

  Full diff:
    [
        1,
        256,
        5,
  -     4,
  ?     ^
  +     2,
  ?     ^
  -     5,
  ?     ^
  +     3,
  ?     ^
        1,
        256,
        1,
    ]
====================================== 1 failed, 2 passed in 0.02s ======================================
```

This is a lot more output to look at. What we can perhaps see more clearly now
is that the lists match up until there is a mismatch between `2` and `4` at
index `3`. And then right after that, at index `4`, is another mismatch between
`3` and `5`.

This kind of output can only scale so much, so use it when it works, and when the
diff view starts to fall short, rework the assertions to get more readable and
actionable test output.