diff --git a/README.md b/README.md
index cc68830..c952bb5 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,10 @@ variety of languages and technologies. These are things that don't really
 warrant a full blog post. These are mostly things I learn by pairing with
 smart people at [Hashrocket](http://hashrocket.com/).
 
+### clojure
+
+- [Splitting On Whitespace](clojure/splitting-on-whitespace.md)
+
 ### git
 
 - [Checkout Previous Branch](git/checkout-previous-branch.md)
diff --git a/clojure/splitting-on-whitespace.md b/clojure/splitting-on-whitespace.md
new file mode 100644
index 0000000..c83df77
--- /dev/null
+++ b/clojure/splitting-on-whitespace.md
@@ -0,0 +1,34 @@
+# Splitting On Whitespace
+
+If you have a string with spaces and you want to split the string into a
+vector of strings (delimited by the spaces), then you can do something like
+this:
+
+```clojure
+(clojure.string/split "split me up" #" ")
+; ["split" "me" "up"]
+```
+
+However, if you have extra spaces in your string, the output may not be quite
+what you want:
+
+```clojure
+(clojure.string/split "double  spacing  wtf?" #" ")
+; ["double" "" "spacing" "" "wtf?"]
+```
+
+A quick fix might look like this:
+
+```clojure
+(clojure.string/split "double  spacing  wtf?" #"[ ]+")
+; ["double" "spacing" "wtf?"]
+```
+
+That's nice, but it is going to fall over as soon as we run into input with
+tabs and new lines. Assuming we want to split on all whitespace, we should
+tell our regular expression to do just that:
+
+```clojure
+(clojure.string/split "we\thave new\nlines and tabs\n" #"[\s]+")
+; ["we" "have" "new" "lines" "and" "tabs"]
+```
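
One edge case the new `splitting-on-whitespace.md` does not cover: trailing whitespace is dropped by `clojure.string/split`, but leading whitespace produces an empty first element. A minimal sketch of one way to handle that, assuming it is acceptable to trim the input first. `split-on-whitespace` is a hypothetical helper name, not something in this diff, and `#"\s+"` matches the same input as the `#"[\s]+"` pattern used above.

```clojure
(require '[clojure.string :as str])

;; Leading whitespace yields an empty first element:
(str/split " a b" #"\s+")
; ["" "a" "b"]

;; Hypothetical helper (not part of the diff): trim first, then split on
;; runs of whitespace. Blank input returns an empty vector, because
;; splitting "" would otherwise give [""].
(defn split-on-whitespace
  [s]
  (let [trimmed (str/trim s)]
    (if (empty? trimmed)
      []
      (str/split trimmed #"\s+"))))

(split-on-whitespace "  we\thave new\nlines and tabs\n")
; ["we" "have" "new" "lines" "and" "tabs"]
```

Whether to trim or to filter out empty strings afterwards is a judgment call; trimming keeps the helper to a single regex pass.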