diff --git a/README.md b/README.md
index ceae0b9..40f6870 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@ and pairing with smart people at Hashrocket.
 For a steady stream of TILs, [sign up for my
 newsletter](https://tinyletter.com/jbranchaud).
 
-_899 TILs and counting..._
+_900 TILs and counting..._
 
 ---
 
@@ -744,6 +744,7 @@ _899 TILs and counting..._
 - [Turning Things Into Hashes](ruby/turning-things-into-hashes.md)
 - [Uncaught Exceptions In Pry](ruby/uncaught-exceptions-in-pry.md)
 - [`undef_method` And The Inheritance Hierarchy](ruby/undef-method-and-the-inheritance-hierarchy.md)
+- [Unpacking Strings Into Binary](ruby/unpacking-strings-into-binary.md)
 - [Up And Down With Integers](ruby/up-and-down-with-integers.md)
 - [Use A Case Statement As A Cond Statement](ruby/use-a-case-statement-as-a-cond-statement.md)
 - [Use dotenv In A Non-Rails Project](ruby/use-dotenv-in-a-non-rails-project.md)
diff --git a/ruby/unpacking-strings-into-binary.md b/ruby/unpacking-strings-into-binary.md
new file mode 100644
index 0000000..5b4b8b1
--- /dev/null
+++ b/ruby/unpacking-strings-into-binary.md
@@ -0,0 +1,47 @@
+# Unpacking Strings Into Binary
+
+You can find the binary representation of a given string by decoding it. Ruby
+comes equipped with the [`#unpack`](https://apidock.com/ruby/String/unpack)
+method on the `String` class that can do this decoding.
+
+Though there are a variety of formats to decode a string into, here are some
+examples of decoding different characters into binary.
+
+```ruby
+> "A".unpack("B*")
+=> ["01000001"]
+```
+
+The `B*` says _unpack_ this into as many *B*inary digits as are needed. In
+UTF-8, only a single byte (8 bits) is needed to represent
+`"A"`.
+
+```ruby
+irb(main):002:0> "Æ".unpack("B*")
+=> ["1100001110000110"]
+irb(main):003:0> "Æ".unpack("B8 B8")
+=> ["11000011", "10000110"]
+```
+
+`"Æ"` is represented by two bytes. We can unpack each byte separately using
+`"B8 B8"`.
+
+```ruby
+irb(main):004:0> "木".unpack("B*")
+=> ["111001101001110010101000"]
+irb(main):005:0> "木".unpack("B8 B8 B8")
+=> ["11100110", "10011100", "10101000"]
+```
+
+Similarly, this Japanese character is represented by three bytes of data.
+
+```ruby
+irb(main):006:0> "👻".unpack("B*")
+=> ["11110000100111111001000110111011"]
+irb(main):007:0> "👻".unpack("B8 B8 B8 B8")
+=> ["11110000", "10011111", "10010001", "10111011"]
+```
+
+Lastly, emojis generally require four bytes of data.
+
+[source](https://www.honeybadger.io/blog/the-rubyist-guide-to-unicode-utf8/)
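
The per-byte `"B8 B8 ..."` pattern in the new TIL can be generalized so the number of `B8` directives follows the string's byte count. The sketch below is only an illustration and is not part of this change: the helper names `bytes_in_binary` and `string_from_binary` are hypothetical, and it assumes the strings are UTF-8 encoded, as in the examples above.

```ruby
# Hypothetical helper, not part of the original TIL: returns one
# 8-character binary string per byte of the given string.
def bytes_in_binary(str)
  # Repeat the "B8" directive once per byte; whitespace between
  # directives is optional, so "B8B8" behaves like "B8 B8".
  str.unpack("B8" * str.bytesize)
end

# Hypothetical helper: rebuilds a UTF-8 string from per-byte binary strings.
def string_from_binary(bits)
  # Array#pack returns a binary (ASCII-8BIT) string, so relabel it as UTF-8.
  [bits.join].pack("B*").force_encoding(Encoding::UTF_8)
end

bits = bytes_in_binary("木")
puts bits.inspect             # prints ["11100110", "10011100", "10101000"]
puts string_from_binary(bits) # prints 木
```

Using `bytesize` rather than `length` matters here: `length` counts characters, so it would produce a single `B8` for `"木"` and return only the first byte.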