mirror of https://github.com/jbranchaud/til synced 2026-03-04 06:58:45 +00:00

Compare commits

17 Commits

Author SHA1 Message Date
jbranchaud
c1f046d196 Add Iterate Over A Dictionary as a Python TIL 2026-03-04 00:20:58 -06:00
jbranchaud
55a6691681 Add Combine All My TILs Into A Single File as a Unix TIL 2026-03-01 12:06:07 -06:00
jbranchaud
3632cfbe1b Add Easy Key-Value Aggregates With defaultdict as a Python TIL 2026-02-26 21:51:08 -06:00
jbranchaud
e82bc873b9 Add Set, Get, And Unset Env Vars With Dokku as a Devops TIL 2026-02-25 19:05:40 -06:00
jbranchaud
43ade88fab Add Access Most Recent Return Value In REPL as a Python TIL 2026-02-24 20:37:12 -06:00
jbranchaud
8f99085e4b Add Edit The Current Command Prompt as a Bash TIL 2026-02-23 20:52:05 -06:00
jbranchaud
f20428b06a Add Keep A Tally With collections.Counter as a Python TIL 2026-02-22 14:28:37 -06:00
jbranchaud
e41802653d Add Inspect EXIF Data For An Image File as a Unix TIL 2026-02-22 11:52:47 -06:00
jbranchaud
2cc52bf8bc Add Search Through Bin Paths For Tool Locations as a mise TIL 2026-02-22 10:53:56 -06:00
jbranchaud
df418b5718 Add Load A File Into The Python REPL as a Python TIL 2026-02-21 16:58:57 -06:00
jbranchaud
d084e0ffe0 Add Check If Package Is Installed With Pip as a Python TIL 2026-02-19 13:54:50 -06:00
jbranchaud
72b466a8b3 Add Override Your Project Mise File as a Mise TIL 2026-02-18 15:33:46 -06:00
jbranchaud
be18f387ed Add Install With PIP For Specific Interpreter as a Python TIL 2026-02-16 14:40:27 -06:00
jbranchaud
efb83050ab Add Iterate First N Items From Enumerable as a Python TIL 2026-02-16 13:32:35 -06:00
jbranchaud
ec12f7ea80 Add Make A Long String Of Text Readable as a Ruby TIL 2026-02-12 17:30:15 -06:00
jbranchaud
f186d5977d Add Compute Median Instead of Average as a PostgreSQL TIL 2026-02-02 16:55:50 -06:00
jbranchaud
f967520fa3 Add Specify Default For Data Definition as a Ruby TIL 2026-02-02 08:26:05 -06:00
18 changed files with 632 additions and 1 deletions

View File

@@ -10,7 +10,7 @@ working across different projects via [VisualMode](https://www.visualmode.dev/).
For a steady stream of TILs, [sign up for my newsletter](https://visualmode.kit.com/newsletter).
_1734 TILs and counting..._
_1751 TILs and counting..._
See some of the other learning resources I work on:
@@ -29,6 +29,7 @@ If you've learned something here, support my efforts writing daily TILs by
* [Ansible](#ansible)
* [Astro](#astro)
* [AWS](#aws)
* [Bash](#bash)
* [Brew](#brew)
* [Chrome](#chrome)
* [Claude Code](#claude-code)
@@ -126,6 +127,10 @@ If you've learned something here, support my efforts writing daily TILs by
- [Turn Off Output Pager For A Command](aws/turn-off-output-pager-for-a-command.md)
- [Use Specific AWS Profile With CLI](aws/use-specific-aws-profile-with-cli.md)
### Bash
- [Edit The Current Command Prompt](bash/edit-the-current-command-prompt.md)
### Brew
- [Clean Up Your Brew Installations](brew/clean-up-your-brew-installations.md)
@@ -241,6 +246,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Reload The nginx Configuration](devops/reload-the-nginx-configuration.md)
- [Resolve The Public IP Of A URL](devops/resolve-the-public-ip-of-a-url.md)
- [Running Out Of inode Space](devops/running-out-of-inode-space.md)
- [Set, Get, And Unset Env Vars With Dokku](devops/set-get-and-unset-env-vars-with-dokku.md)
- [Set Up Domain For Hatchbox Rails App](devops/set-up-domain-for-hatchbox-rails-app.md)
- [SSH Into A Docker Container](devops/ssh-into-a-docker-container.md)
- [SSL Certificates Can Cover Multiple Domains](devops/ssl-certificates-can-cover-multiple-domains.md)
@@ -755,9 +761,11 @@ If you've learned something here, support my efforts writing daily TILs by
- [Create Umbrella Task For All Test Tasks](mise/create-umbrella-task-for-all-test-tasks.md)
- [List The Files Being Loaded By Mise](mise/list-the-files-being-loaded-by-mise.md)
- [Look In Ruby Version Dotfile](mise/look-in-ruby-version-dotfile.md)
- [Override Your Project Mise File](mise/override-your-project-mise-file.md)
- [Preserve Color Output For Task Command](mise/preserve-color-output-for-task-command.md)
- [Read Existing Dot Env File Into Env Vars](mise/read-existing-dot-env-file-into-env-vars.md)
- [Run A Command With Specific Tool Version](mise/run-a-command-with-specific-tool-version.md)
- [Search Through Bin Paths For Tool Locations](mise/search-through-bin-paths-for-tool-locations.md)
### MongoDB
@@ -863,6 +871,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Clear The Screen In psql](postgres/clear-the-screen-in-psql.md)
- [Clear The Screen In psql (2)](postgres/clear-the-screen-in-psql-2.md)
- [Compute Hashes With pgcrypto](postgres/compute-hashes-with-pgcrypto.md)
- [Compute Median Instead Of Average](postgres/compute-median-instead-of-average.md)
- [Compute The Levenshtein Distance Of Two Strings](postgres/compute-the-levenshtein-distance-of-two-strings.md)
- [Compute The md5 Hash Of A String](postgres/compute-the-md5-hash-of-a-string.md)
- [Concatenate Strings With A Separator](postgres/concatenate-strings-with-a-separator.md)
@@ -1030,9 +1039,17 @@ If you've learned something here, support my efforts writing daily TILs by
### Python
- [Access Instance Variables](python/access-instance-variables.md)
- [Access Most Recent Return Value In REPL](python/access-most-recent-return-value-in-repl.md)
- [Break Debugger On First Line Of Program](python/break-debugger-on-first-line-of-program.md)
- [Check If Package Is Installed With Pip](python/check-if-package-is-installed-with-pip.md)
- [Create A Dummy DataFrame In Pandas](python/create-a-dummy-dataframe-in-pandas.md)
- [Dunder Methods](python/dunder-methods.md)
- [Easy Key-Value Aggregates With defaultdict](python/easy-key-value-aggregates-with-defaultdict.md)
- [Install With PIP For Specific Interpreter](python/install-with-pip-for-specific-interpreter.md)
- [Iterate First N Items From Enumerable](python/iterate-first-n-items-from-enumerable.md)
- [Iterate Over A Dictionary](python/iterate-over-a-dictionary.md)
- [Keep A Tally With collections.Counter](python/keep-a-tally-with-collections-counter.md)
- [Load A File Into The Python REPL](python/load-a-file-into-the-python-repl.md)
- [Override The Boolean Context Of A Class](python/override-the-boolean-context-of-a-class.md)
- [Store And Access Immutable Data In A Tuple](python/store-and-access-immutable-data-in-a-tuple.md)
- [Test A Function With Pytest](python/test-a-function-with-pytest.md)
@@ -1430,6 +1447,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Limit Split](ruby/limit-split.md)
- [List The Running Ruby Version](ruby/list-the-running-ruby-version.md)
- [Listing Local Variables](ruby/listing-local-variables.md)
- [Make A Long String Of Text Readable](ruby/make-a-long-string-of-text-readable.md)
- [Make An Executable Ruby Script](ruby/make-an-executable-ruby-script.md)
- [Make Structs Easier To Use With Keyword Initialization](ruby/make-structs-easier-to-use-with-keyword-initialization.md)
- [Map With Index Over An Array](ruby/map-with-index-over-an-array.md)
@@ -1483,6 +1501,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Single And Double Quoted String Notation](ruby/single-and-double-quoted-string-notation.md)
- [Skip Specific CVEs When Auditing Your Bundle](ruby/skip-specific-cves-when-auditing-your-bundle.md)
- [Skip The Front Of An Array With Drop](ruby/skip-the-front-of-an-array-with-drop.md)
- [Specify Default For Data Definition](ruby/specify-default-for-data-definition.md)
- [Specify Dependencies For A Rake Task](ruby/specify-dependencies-for-a-rake-task.md)
- [Specify How Random Array#sample Is](ruby/specify-how-random-array-sample-is.md)
- [Split A Float Into Its Integer And Decimal](ruby/split-a-float-into-its-integer-and-decimal.md)
@@ -1630,6 +1649,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Check The Current Working Directory](unix/check-the-current-working-directory.md)
- [Check The Installed OpenSSL Version](unix/check-the-installed-openssl-version.md)
- [Clear The Screen](unix/clear-the-screen.md)
- [Combine All My TILs Into A Single File](unix/combine-all-my-tils-into-a-single-file.md)
- [Command Line Length Limitations](unix/command-line-length-limitations.md)
- [Compare Two Variables In A Bash Script](unix/compare-two-variables-in-a-bash-script.md)
- [Configure cd To Behave Like pushd In Zsh](unix/configure-cd-to-behave-like-pushd-in-zsh.md)
@@ -1701,6 +1721,7 @@ If you've learned something here, support my efforts writing daily TILs by
- [Ignore A Directory During ripgrep Search](unix/ignore-a-directory-during-ripgrep-search.md)
- [Ignore The Alias When Running A Command](unix/ignore-the-alias-when-running-a-command.md)
- [Include Ignore Files In Ripgrep Search](unix/include-ignore-files-in-ripgrep-search.md)
- [Inspect EXIF Data For An Image File](unix/inspect-exif-data-for-an-image-file.md)
- [Interactively Browse Available Node Versions](unix/interactively-browse-availabile-node-versions.md)
- [Interactively Switch asdf Package Versions](unix/interactively-switch-asdf-package-versions.md)
- [Interpret Cron Schedule From The CLI](unix/interpret-cron-schedule-from-the-cli.md)

View File

@@ -0,0 +1,18 @@
# Edit The Current Command Prompt
A neat feature of `bash` is the ability to open whatever the current state of
the command prompt is into your default editor.
Let's say we have a really long command that we've just tried to run, but it
failed and we need to make a small change somewhere in the middle. Instead of
holding the left arrow key for 30 seconds, we can instead hit `CTRL-X CTRL-E`.
This pops us into our editor: `VISUAL` if it is set, otherwise `EDITOR`. In my case,
that is `nvim`. I now have access to all the features I'm used to in `nvim` for
quickly navigating to and editing, searching and replacing, or whatever.
Once I've got the command how I like it, I can save and exit (`:wq`) and the
updated command will be executed.
This is similar to [the `fc` builtin](../unix/fix-previous-command-with-fc.md),
which also happens to be available for `zsh`.

View File

@@ -0,0 +1,29 @@
# Set, Get, And Unset Env Vars With Dokku
The `dokku` CLI provides `config` subcommands for managing environment variables
for the target container.
An env var can be set for an active container with `config:set`:
```bash
$ dokku config:set app-name JEMALLOC_ENABLED=true MALLOC_CONF="stats_print:true"
```
Notice I'm able to set multiple env vars at once if needed.
If I ever need to check what an env var is currently set to for one of my app
containers, I can use `config:get`:
```bash
$ dokku config:get app-name JEMALLOC_ENABLED
true
```
I can always override any value with another `config:set`. However, if I need to
entirely remove the env var, I can use `config:unset`:
```bash
$ dokku config:unset app-name MALLOC_CONF
```
[source](https://dokku.com/docs/configuration/environment-variables/)

View File

@@ -0,0 +1,37 @@
# Override Your Project Mise File
A project I'm working on has a version-controlled `.mise.toml` file in it. Some
changes were made to that recently that introduce some env vars that conflict
with my setup. If I make edits to that file, then I have a modified version of
`.mise.toml` sitting in my Git working copy.
```
# .mise.toml
[env]
CONFIG_SETTING = "project"
```
Instead, I can rely on the loading precedence rules of `mise` to override those
project settings with my individual settings. I can do that with the
`.mise.local.toml` file, which is layered on top of any `mise` configuration from
files further down the precedence chain.
```
# .mise.local.toml
[env]
CONFIG_SETTING = "override"
```
Assuming I have `mise` set up with my shell environment to automatically load in
these files, I can now check what takes precedence:
```bash
$ echo $CONFIG_SETTING
override
```
Make sure `.mise.local.toml` is included in the `.gitignore` file to avoid
checking in your personal environment overrides.
To be sure about what files are loaded and in what order, give `mise cfg` a try.
I discuss that in more detail in [List The Files Being Loaded By Mise](list-the-files-being-loaded-by-mise.md).

View File

@@ -0,0 +1,29 @@
# Search Through Bin Paths For Tool Locations
The `mise bin-paths` command will list all the bin paths that are managed by
`mise`. When you tell `mise` to install a tool, it installs a specific version
at a location where its binaries can be made accessible on the system path.
While `mise ls` is useful for seeing what is installed by `mise` and at what
version, the `bin-paths` command can tell you where those tool installations
with their binaries are located.
Combine this with `grep` or `rg` to narrow down the results to tools by a
specific name:
```bash
mise bin-paths | rg 'neovim'
/Users/lastword/.local/share/mise/installs/npm-neovim/5.4.0/bin
/Users/lastword/.local/share/mise/installs/pipx-neovim-remote/2.5.1/bin
/Users/lastword/.local/share/mise/installs/neovim/0.11.6/bin
```
I can then look in one of these directories to see which binaries it
includes. For instance, here is what is in the `node` bin path:
```bash
ls /Users/lastword/.local/share/mise/installs/node/22.22.0/bin
 ./  ../  claude@  corepack@  node*  npm*  npx@
```
See `mise bin-paths --help` for more details.

View File

@@ -0,0 +1,44 @@
# Compute Median Instead Of Average
One of the first aggregate functions we might use in PostgreSQL, besides `sum`,
is `avg`.
```sql
select avg(book_count) as average_books_read
from (
  select users.id, count(books.id) as book_count
  from users
  left join books
    on books.user_id = users.id
    and books.read_in_year = 2025
  group by users.id
) as user_book_counts;
```
This computes the average of the set of values which sums them all up
and divides by the count. The average (maybe you've heard this also called the
_mean_) is not always the best way to understand data, especially when there are
outliers.
Instead, we might want to compute the _median_ value of our set of data. There
is no easily identifiable `median` aggregate function. Instead, we can use
`percentile_cont` with a value of `0.5`. This gets us the 50th percentile of our
set of data which is the definition of the _median_.
```sql
select percentile_cont(0.5) within group (
  order by book_count
) as median_books_read
from (
  select users.id, count(books.id) as book_count
  from users
  left join books on books.user_id = users.id and books.read_in_year = 2025
  group by users.id
) as user_book_counts;
```
The full syntax for `percentile_cont` is `percentile_cont(fraction) within
group (order by ...)` because this is an aggregate that has to work with an
ordered set of data.
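To see why the median holds up better against outliers, and what `percentile_cont(0.5)` works out to, here is a minimal Python sketch. The `percentile_cont` function below is my own illustrative re-implementation (not PostgreSQL's code), assuming linear interpolation between the two nearest ordered values, and the `book_counts` are made-up sample data.
```python
from statistics import mean, median


def percentile_cont(values: list[float], fraction: float) -> float:
    """Continuous percentile with linear interpolation (illustrative sketch)."""
    ordered = sorted(values)
    # Fractional position of the requested percentile in the ordered list
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Made-up sample data: one heavy reader skews the average
book_counts = [2, 3, 3, 4, 120]

print(mean(book_counts))                  # pulled up by the outlier
print(median(book_counts))                # the middle value
print(percentile_cont(book_counts, 0.5))  # matches the median
```
With an even number of values, the sketch interpolates halfway between the two middle values, which is the usual definition of the median.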
[source](https://www.postgresql.org/docs/current/functions-aggregate.html)

View File

@@ -0,0 +1,34 @@
# Access Most Recent Return Value In REPL
One of my favorite features of Ruby's `irb` and `pry` is that you can use `_`
to reference the most recent return value. Often as we use an interpreter or
REPL, we end up with _intermediate_ values. That is, we've executed some kind of
statement which returned a value and we now want to use that resulting value in
our next statement. Python also supports `_`.
Let's say I've run a statement that took a while to process, but I forgot to
assign it to a variable. Instead of re-running the whole thing, I can create a
variable that references the previous return value using `_`.
```python
>>> BytePairEncoding.train_bpe(long_text)
{'merge_rules': [...], 'vocab': {...}}
>>> result = _
>>> list(result.keys())
['merge_rules', 'vocab']
```
Even if I don't necessarily want to assign it to a variable, it can be nice to
reference the previous value as I continue with what I'm doing:
```python
>>> result['merge_rules'][0][1]
256
>>> result['vocab'][_]
b'e '
```
Notice how the value from the first statement gets used as part of a `dict`
access.
[source](https://docs.python.org/3/tutorial/introduction.html#numbers)

View File

@@ -0,0 +1,50 @@
# Check If Package Is Installed With Pip
I recently installed PyTorch, but when I tried using it, I was getting an error
about `numpy` not being installed. I was kind of surprised by that because I
thought I would have already had that.
I wanted to check, so I asked with `pip show`:
```bash
python3 -m pip show numpy
WARNING: Package(s) not found: numpy
```
I can even list everything that is installed with `pip` using `pip list` like
so:
```bash
python3 -m pip list
Package Version Build
------------------ --------- -----
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
click 8.3.1
commonmark 0.9.1
cryptography 46.0.3
docutils 0.22.4
filelock 3.24.2
fsspec 2026.2.0
idna 3.11
Jinja2 3.1.6
...
```
I then installed `numpy` (`python3 -m pip install numpy`) and now I can use `pip
show` again to confirm that.
```bash
python3 -m pip show numpy
Name: numpy
Version: 2.4.2
Summary: Fundamental package for array computing in Python
Home-page: https://numpy.org
Author: Travis E. Oliphant et al.
Author-email:
License-Expression: BSD-3-Clause AND 0BSD AND MIT AND Zlib AND CC0-1.0
Location: /Users/lastword/.local/share/mise/installs/python/3.12.12/lib/python3.12/site-packages
Requires:
Required-by:
```
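The same check can be made from inside Python. This is a small sketch using the standard library's `importlib.metadata`; the `installed_version` helper is a name I made up for illustration, not something `pip` provides.
```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if numpy is installed, otherwise None
print(installed_version("numpy"))
```
Like `python3 -m pip show`, this reports on the interpreter that is running the code, so it answers the question for that specific environment.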

View File

@@ -0,0 +1,53 @@
# Easy Key-Value Aggregates With defaultdict
The `collections` module has the `defaultdict` object that can be used to
aggregate values tied to a key. What sets this apart from simply using a `dict`
is that we get the base value for free. So if our aggregate value is a list,
then we get `[]` by default for each new key. In the same way, we'd get `0` if
it was constructed with `int`.
Here is the counter example from [Keep A Tally With
collections.Counter](keep-a-tally-with-collections-counter.md), rewritten to use `defaultdict`:
```python
from collections import defaultdict

def get_pair_counts(token_ids: list[int]) -> defaultdict[tuple[int, int], int]:
    """Count how often each adjacent pair appears"""
    counts = defaultdict(int)
    for i in range(len(token_ids) - 1):
        pair = (token_ids[i], token_ids[i + 1])
        counts[pair] += 1
    return counts
```
We never have to initially set a key to `0`. If the key is not yet present, then
`int()` (the zero-value constructor) is called as the `default_factory` to supply the missing value.
We can do the same with `list`:
```python
>>> import collections
>>> stuff = collections.defaultdict(list)
>>> stuff['alpha'].append(1)
>>> stuff['alpha']
[1]
>>> stuff['beta']
[]
```
In the same way, this uses `list()` as the `default_factory` to start off each
key with a `[]`.
I find this so handy because in other languages I've typically had to do
something more like this:
```python
words_by_length = {}
for item in items:
    if len(item) not in words_by_length:
        words_by_length[len(item)] = []
    words_by_length[len(item)].append(item)
```
This is much clunkier.
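For comparison, the same grouping with `defaultdict(list)` might look like the sketch below. The `items` list is made-up sample data.
```python
from collections import defaultdict

# Made-up sample data for illustration
items = ["a", "dog", "the", "bone", "wants", "an"]

words_by_length = defaultdict(list)
for item in items:
    # A missing key starts out as [] so we can append right away
    words_by_length[len(item)].append(item)

print(dict(words_by_length))
```
The membership check and the explicit `[]` assignment both disappear, leaving only the line that does the real work.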

View File

@@ -0,0 +1,17 @@
# Install With PIP For Specific Interpreter
The `pip` module can be invoked for any of its commands, such as install, using
a specific Python interpreter like so:
```bash
$ python3 -m pip install black
```
This avoids ambiguity between the version of Python I am using and the version of the
package manager I'm using.
Similarly if I need to upgrade `pip`, I can do the following:
```bash
$ python3 -m pip install --upgrade pip
```
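The same idea applies when a Python script needs to call `pip` itself: `sys.executable` holds the path of the interpreter that is currently running, so a command built from it targets that interpreter's `pip`. A minimal sketch; the `pip_command` helper is a hypothetical name of my own.
```python
import sys


def pip_command(*args: str) -> list[str]:
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


# Pass the result to subprocess.run(...) to actually execute it
print(pip_command("install", "black"))
```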

View File

@@ -0,0 +1,27 @@
# Iterate First N Items From Enumerable
As I'm working through the 2nd chapter of [Build a Large Language Model (from
scratch)](https://still.visualmode.dev/blogmarks/227), I came across a code
example processing a dictionary of words. This example used a for loop to print
out each dictionary entry until an index of 50 was reached and then it did a
`break`.
This struck me as an odd way to grab and process N items from a list. I did some
searching and found `itertools` which provides
[`islice`](https://docs.python.org/3/library/itertools.html#itertools.islice).
```python
from itertools import islice
# preprocess words from a file into a word list
all_words = ... # not shown here
vocab = {token: integer for integer, token in enumerate(all_words)}
for item in islice(enumerate(vocab.items()), 50):
    print(item)
```
The `islice` function is a better approach because the intention (to grab the
first 50 things) is encoded in the function call rather than buried in a loop
body. It also has equivalent memory efficiency to the original example because
it lazily processes the list of `vocab` items.
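To make the laziness concrete, here is a small sketch that applies `islice` to an endless generator. The `naturals` generator is made up for illustration; `islice` stops pulling values once it has the first N, so the loop never runs away.
```python
from itertools import islice


def naturals():
    """Yield 0, 1, 2, ... forever (hypothetical helper for illustration)."""
    n = 0
    while True:
        yield n
        n += 1


# Only the first five values are ever generated
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```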

View File

@@ -0,0 +1,34 @@
# Iterate Over A Dictionary
Let's say we have a `dict` that contains counts of occurrences for each word in
some sample text:
```python
word_frequency = {
    "the": 4,
    "a": 3,
    "dog": 1,
    "bone": 1,
    "wants": 1,
    # ...
}
```
Here is how we can iterate over the `dict`, accessing both the keys and values:
```python
for word, count in word_frequency.items():
    print(f"- {word} appears {count} time{'' if count == 1 else 's'}")
```
Using the
[`items()`](https://docs.python.org/3/library/stdtypes.html#dict.items) method,
we're able to access both _key_ and _value_ with the for loop as it iterates.
Another approach is to loop directly on the `dict` which implicitly surfaces the
_key_ for iteration. This can then be used to get the value from the `dict`:
```python
for word in word_frequency:
    print(f"- {word}: {word_frequency[word]}")
```
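Two related patterns, sketched here with a smaller made-up `word_frequency`: `values()` iterates over just the values, and `sorted()` with a `key` function lets us walk the entries from most to least frequent.
```python
# Made-up sample data for illustration
word_frequency = {"the": 4, "a": 3, "dog": 1}

# values() skips the keys entirely
total_words = sum(word_frequency.values())

# Sort the (word, count) pairs by count, highest first
by_count = sorted(word_frequency.items(), key=lambda pair: pair[1], reverse=True)

for word, count in by_count:
    print(f"- {word}: {count} of {total_words}")
```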

View File

@@ -0,0 +1,40 @@
# Keep A Tally With collections.Counter
Python's `collections` module comes with a
[`Counter`](https://docs.python.org/3/library/collections.html#collections.Counter)
object which is a specialized dict subclass focussed on tallying counts of keys.
> It is a collection where elements are stored as dictionary keys and their
> counts are stored as dictionary values. Counts are allowed to be any integer
> value including zero or negative counts.
I used it recently while doing an exploratory implementation of a Byte-Pair
Encoding (BPE):
```python
from collections import Counter
def get_pair_counts(token_ids: list[int]) -> Counter:
    """Count how often each adjacent pair appears"""
    counts = Counter()
    for i in range(len(token_ids) - 1):
        pair = (token_ids[i], token_ids[i + 1])
        counts[pair] += 1
    return counts
```
Here I'm able to count the number of occurrences of each pair of bytes from the
input text. A tuple of `int` values is hashable, so they work great as keys for
a `Counter`.
The count value of any key will default to `0`. That makes it straightforward to
increment from there as you iterate over occurrences.
```python
>>> counts = Counter()
>>> counts['hello']
0
>>> counts['hello'] += 1
>>> counts['hello']
1
```
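A `Counter` can also be built directly from an iterable, which collapses the pair-counting loop above into a single expression. A sketch with made-up token ids: `zip` pairs each id with its successor, and `most_common` returns the highest counts first.
```python
from collections import Counter

# Made-up token ids for illustration
token_ids = [1, 2, 1, 2, 3]

# zip pairs each id with the one that follows it
counts = Counter(zip(token_ids, token_ids[1:]))

print(counts[(1, 2)])         # 2
print(counts.most_common(1))  # [((1, 2), 2)]
```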

View File

@@ -0,0 +1,35 @@
# Load A File Into The Python REPL
I opened up a Python REPL to try some things out.
```
$ python3
>>> import math
>>> math.floor(5/2)
2
```
Now, I want to reference a Python file I've been working on so that I can
manually test the behavior of what I'm building. To do this, I can import a file
by its name in the same way that I would import any module. Then I can use that
namespace for class and method references. Crucially, the file should exist in
the same directory the REPL was started from.
First, here is the file:
```python
# bpe.py
class BytePairEncoding:
    def text_to_bytes(text: str) -> list[int]:
        """Convert a string to a list of byte values (0-255)"""
        return list(text.encode("utf-8"))
```
Now to use it from the REPL:
```
$ python3
>>> import bpe
>>> bpe.BytePairEncoding.text_to_bytes("Gimme some bytes!")
[71, 105, 109, 109, 101, 32, 115, 111, 109, 101, 32, 98, 121, 116, 101, 115, 33]
```

View File

@@ -0,0 +1,38 @@
# Make A Long String Of Text Readable
I have a paragraph of text that interpolates a couple user-specific values
before being included in an API request. Because it is being passed to an API,
it is a single-line string value. However, in the editor it is hard to read like
that because it overflows way past the edge of the viewport.
```ruby
description = "This is the description we need to provide for #{user.name} as part of an API request dealing with compliance and registration for a service. If you need to contact them, their email is #{user.email}."
```
I'd rather make this easier on myself and others to read from the editor while
still being able to submit a single-line string to the API. That can be
accomplished with a heredoc and some combination of `gsub`, `strip`, and
`squish`.
If we are in a strictly Ruby-only context, we can use `gsub` and `strip` to
collapse line breaks and remove surrounding white space.
```ruby
description = <<~MSG.gsub(/\s+/, ' ').strip
This is the description we need to provide for #{user.name} as part
of an API request dealing with compliance and registration for a
service. If you need to contact them, their email is #{user.email}.
MSG
#=> "This is the description we need to provide for #{user.name} as part of an API request dealing with compliance and registration for a service. If you need to contact them, their email is #{user.email}."
```
Or in a Rails context, I can instead just use `squish`:
```ruby
description = <<~MSG.squish
This is the description we need to provide for #{user.name} as part
of an API request dealing with compliance and registration for a
service. If you need to contact them, their email is #{user.email}.
MSG
#=> "This is the description we need to provide for #{user.name} as part of an API request dealing with compliance and registration for a service. If you need to contact them, their email is #{user.email}."
```

View File

@@ -0,0 +1,43 @@
# Specify Default For Data Definition
Here is what a `Data` definition for the concept of a `Permission` might look
like:
```ruby
Permission = Data.define(:id, :name, :description, :enabled)
perm1 = Permission.new(
  id: 123,
  name: :can_edit,
  description: "User is allowed to edit.",
  enabled: true
)
```
However, as we're creating various `Permission` entities, we may find that the
vast majority of them are _enabled_ by default and so we'd like to apply `true`
as a default value.
We cannot do this directly in the `Data` definition, but we can open a block to
override the `initialize` method.
```ruby
Permission = Data.define(:id, :name, :description, :enabled) do
  def initialize(id:, name:, description:, enabled: true)
    super
  end
end
perm1 = Permission.new(
  id: 123,
  name: :can_edit,
  description: "User is allowed to edit."
)
perm1.enabled #=> true
```
Now we're able to create a `Permission` without specifying the `enabled`
attribute and it takes on the default of `true`.
[source](https://dev.to/baweaver/new-in-ruby-32-datadefine-2819#comment-254o8)

View File

@@ -0,0 +1,35 @@
# Combine All My TILs Into A Single File
In [Build A Small Text-based Training
Dataset](https://www.visualmode.dev/build-a-small-text-training-dataset), I went
over my need for a sizeable and interesting corpus of text to use as a
training dataset for [my own naive Byte Pair Encoding
implementation](https://github.com/jbranchaud/build-an-llm-from-scratch/blob/main/chapter-02/bpe_tokenizer.py).
My repo of hand-written TILs is a great candidate, but I need those smashed all
into one file.
Here is a formatted version of the one-liner I ended up with:
```bash
{
  cat README.md; \
  find */ -name '*.md' -print0 \
    | sort -z \
    | xargs -0 -I{} sh -c 'echo "<|endoftext|>"; cat "$1"' _ {}; \
} > combined.md
```
This combines all 1700+ of my TILs into a single file separated by the
`<|endoftext|>` delimiter.
The two things I find most interesting about this command are:
1. The use of a null byte (`\0`) separator between the filenames in case there
   is anything weird (like spaces) in those filenames. This starts with
   `-print0`. The `-z` of `sort` maintains that null byte separator. And then
   `xargs` knows to handle it by the `-0` flag.
2. We can coerce `xargs` into running multiple commands by having it spawn a
   single shell process that runs each of those commands. To reliably pass the
   filename into that shell process, we have `xargs` substitute it where `{}`
   appears, which makes it the first positional parameter (`$1`). The `_`
   before it fills `$0`.
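For comparison, here is a rough Python sketch of the same idea using `pathlib`. It assumes every TIL lives exactly one directory below the repo root (the `find` version would also pick up deeper files), `combine_tils` is a made-up name, and the sort order may differ slightly from `sort -z`.
```python
from pathlib import Path

DELIMITER = "<|endoftext|>"


def combine_tils(root: Path, output: Path) -> int:
    """Write README.md plus every */*.md under root into one file."""
    # Assumption: TILs sit exactly one directory deep (e.g. python/foo.md).
    # Hidden directories are skipped, as the shell's */ glob would do.
    til_paths = sorted(
        path for path in root.glob("*/*.md")
        if not path.parent.name.startswith(".")
    )
    parts = [(root / "README.md").read_text(encoding="utf-8")]
    for path in til_paths:
        # Mirror the shell version: a delimiter line, then the file contents
        parts.append(DELIMITER + "\n" + path.read_text(encoding="utf-8"))
    output.write_text("".join(parts), encoding="utf-8")
    return len(til_paths)


# Example usage from the repo root:
# combine_tils(Path("."), Path("combined.md"))
```
Filenames with spaces need no special handling here because each path is passed around as a `Path` object rather than as text split by a shell.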

View File

@@ -0,0 +1,47 @@
# Inspect EXIF Data For An Image File
The `exiftool` CLI (which can be installed via `brew`) is a useful tool for
inspecting all the EXIF data attached to a media file. A media file like an
image has a bunch of additional details embedded in it like timestamps, image
metadata, and sometimes location information.
Here is all the data attached to a screenshot I found on my desktop:
```bash
exiftool ~/Desktop/CleanShot\ 2025-11-17\ at\ 11.22.18@2x.png
ExifTool Version Number : 13.50
File Name : CleanShot 2025-11-17 at 11.22.18@2x.png
Directory : /Users/lastword/Desktop
File Size : 1194 kB
File Modification Date/Time : 2025:11:17 11:22:21-06:00
File Access Date/Time : 2025:12:15 10:43:55-06:00
File Inode Change Date/Time : 2025:12:05 15:37:48-06:00
File Permissions : -rw-r--r--
File Type : PNG
File Type Extension : png
MIME Type : image/png
Image Width : 2502
Image Height : 1232
Bit Depth : 8
Color Type : RGB with Alpha
Compression : Deflate/Inflate
Filter : Adaptive
Interlace : Noninterlaced
XMP Toolkit : XMP Core 6.0.0
Y Resolution : 144
Resolution Unit : inches
X Resolution : 144
Exif Image Width : 2502
Color Space : sRGB
User Comment : Screenshot
Exif Image Height : 1232
SRGB Rendering : Perceptual
Image Size : 2502x1232
Megapixels : 3.1
```
This works with other kinds of media files. For instance, I ran this against an
MP4 screen recording file which contained even more metadata.
In addition to reading data, `exiftool` can also write it. See `man exiftool`
for more details on what else it can do.