Add Find Duplicate Lines In A File as a Unix TIL

2026-07-05 00:58:22 +00:00 · 2023-05-17 13:57:39 -05:00
parent 629d8b7f7b
commit 8bac2d832d
2 changed files with 22 additions and 1 deletions
@@ -0,0 +1,20 @@
+# Find Duplicate Lines In A File
+
+Let's say I have a large file in a Ruby project. I want to find instances of a
+`field` declaration being duplicated throughout the file. Just searching for
+duplicate lines within the file is going to result in all kinds of false
+positives (think, lots of duplicate `end` lines).
+
+What I can do is `grep -o` for a pattern that matches only the `field`
+declarations, printing just the matched text. The results can then be piped to
+`sort`, which places any duplicates next to each other (`uniq` only compares
+adjacent lines). Lastly, I'll pipe the sorted lines to `uniq` with the `-d`
+flag, which prints one copy of each line that is repeated.
+
+Here is what the whole thing looks like:
+
+```
+$ grep -o "field :[a-zA-Z_][a-zA-Z_0-9]*" file.rb | sort | uniq -d
+```
+
+See `man uniq` for more details on the available flags.
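
To sanity-check the pipeline, it can be run against a small throwaway file. This is only a sketch: the `UserType` class and its field names are made up for illustration, and the temp file stands in for `file.rb`.

```shell
# Hypothetical sample file; the class and field names are invented.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
class UserType
  field :id, ID, null: false
  field :name, String
  field :email, String
  field :name, String, null: true
end
EOF

# Same pipeline as above, pointed at the sample file.
# grep -o keeps only the "field :<name>" part of each matching line,
# so the two :name declarations compare equal despite different options.
duplicates=$(grep -o "field :[a-zA-Z_][a-zA-Z_0-9]*" "$tmpfile" | sort | uniq -d)
echo "$duplicates"

rm -f "$tmpfile"
```

With this sample, the only line printed should be `field :name`, since it is the one declaration that appears twice.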