I have a script that runs a grep command and formats the results nicely for me, asking if I want to open any of the resulting files in an editor etc.
The core of my script is a command like this:
grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/letters | sed -re "s/:([0-9]+:)/\n\1/" -e "s/-([0-9]+-)/\n\1/" -e "s/^[.]/\\n./"
It runs the grep, outputting the file name on every line, and then runs some processing to put the file names on a different line from the results.
> grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/letters | sed -re "s/:([0-9]+:)/\n\1/" -e "s/-([0-9]+-)/\n\1/" -e "s/^[.]/\\n./"
/tmp/searchTest/letters/aaa-aaa-aaa/aaa-aaa-aaa.txt
3:seven eight nine
--
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt
2:seven eight nine
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt
3-ten eleven twelve
I have more processing that pretties up the results further and gives me a list of the matching files, asking me which I want to open up in an editor.
I am having trouble with files whose name/path includes hyphens and numbers (e.g. "/tmp/searchTest/numbers/222-222-222/222-222-222.txt"): the sed expressions match a hyphen-digits-hyphen run inside the path itself, so the command fails to separate the file name from the hyphen/colon delimited line numbers.
Here is a script that sets up a test case showing this:
#!/bin/bash
rm -rf /tmp/searchTest 2> /dev/null
mkdir -p /tmp/searchTest/numbers/111-111-111
mkdir -p /tmp/searchTest/numbers/222-222-222
mkdir -p /tmp/searchTest/letters/aaa-aaa-aaa
mkdir -p /tmp/searchTest/letters/bbb-bbb-bbb
cat << EOF > /tmp/searchTest/numbers/111-111-111/111-111-111.txt
one two three
four five six
seven eight nine
EOF
cat << EOF > /tmp/searchTest/numbers/222-222-222/222-222-222.txt
four five six
seven eight nine
ten eleven twelve
EOF
cat << EOF > /tmp/searchTest/letters/aaa-aaa-aaa/aaa-aaa-aaa.txt
one two three
four five six
seven eight nine
EOF
cat << EOF > /tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt
four five six
seven eight nine
ten eleven twelve
EOF
echo "Contents of /tmp/searchTest"
tree /tmp/searchTest
echo -e "\nFirst search, looking for \"eight\".\n---"
grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/letters
echo -e "\nExtending first search, looking for \"eight\" and extracting file names.\n---"
grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/letters | sed -re "s/:([0-9]+:)/\n\1/" -e "s/-([0-9]+-)/\n\1/" -e "s/^[.]/\\n./"
echo -e "\nSecond search, looking for \"eight\".\n---"
grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/numbers
echo -e "\nExtending second search, looking for \"eight\" and extracting file names - but fails.\n---"
grep -HinER -B 0 -A 2 --include "*.txt" "eight" /tmp/searchTest/numbers | sed -re "s/:([0-9]+:)/\n\1/" -e "s/-([0-9]+-)/\n\1/" -e "s/^[.]/\\n./"
The results for the second search show how the file names break the sed command.
First search, looking for "eight".
---
/tmp/searchTest/letters/aaa-aaa-aaa/aaa-aaa-aaa.txt:3:seven eight nine
--
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt:2:seven eight nine
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt-3-ten eleven twelve
Extending first search, looking for "eight" and extracting file names.
---
/tmp/searchTest/letters/aaa-aaa-aaa/aaa-aaa-aaa.txt
3:seven eight nine
--
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt
2:seven eight nine
/tmp/searchTest/letters/bbb-bbb-bbb/bbb-bbb-bbb.txt
3-ten eleven twelve
Second search, looking for "eight".
---
/tmp/searchTest/numbers/111-111-111/111-111-111.txt:3:seven eight nine
--
/tmp/searchTest/numbers/222-222-222/222-222-222.txt:2:seven eight nine
/tmp/searchTest/numbers/222-222-222/222-222-222.txt-3-ten eleven twelve
Extending second search, looking for "eight" and extracting file names - but fails.
---
/tmp/searchTest/numbers/111
111-111/111-111-111.txt
3:seven eight nine
--
/tmp/searchTest/numbers/222
222-222/222-222-222.txt
2:seven eight nine
/tmp/searchTest/numbers/222
222-222/222-222-222.txt-3-ten eleven twelve
Is there a better way to pick out the file names? This is a general purpose script, so there is no set pattern I can rely on for file names: spaces, digits, letters, no extension etc are all possible.
It seems like the only way to do this reliably would be to run grep twice, with the first being a grep -l just to get the file names alone, which I can then map to the results. But this is pretty extreme, especially for a big search space.
Update: Thursday 20 March 2025, 06:00:22 pm
Adding more detail on actual use in response to a comment from @Yokai.
Here is an example of how I use this script already. This works quite well for me, showing me search results and asking what files I want to open in a text editor.
> search.sh -d /Users/rob.bram/DirTechTips -y e -t "junit temporary" -A2
Search for pattern "junit temporary" in dir /Users/rob.bram/DirTechTips through file pattern "*.*"
====
./Java/cheat_Java-Junit.md
17:- [JUnit Temporary Files](#junit-temporary-files)
18- - [Listing files in temp dir during debugging](#listing-files-in-temp-dir-during-debugging)
19-- [Parallel Test Execution for JUnit 5](#parallel-test-execution-for-junit-5)
--
445:## JUnit Temporary Files
446-
447:This section: [JUnit Temporary Files](cheat_Java-Junit.md#junit-temporary-files) | [Back to top](#top)
448-
449-From: [Working and unit testing with temporary files in Java](https://blogs.oracle.com/javamagazine/working-and-unit-testing-with-temporary-files-in-java).
--
614:- Added section `JUnit Temporary Files`.
615-
616-Wednesday, 27th of October 2021, 10
46:26 AM
--
====
./Java/cheat_Java-File-System.md
43:1. Temp files in JUnit. See [JUnit Temporary Files](cheat_Java-Junit.md#junit-temporary-files).
44-2. Create temp file or directory with Java via `java.nio.file.Files` (Java 7).
45-
Do you want to view any of the matching files?
============
File 0: ./Java/cheat_Java-Junit.md
File 1: ./Java/cheat_Java-File-System.md
----
Specify files to open. [A]ll, [N]one or [x y z] space separated indexes.
Can also override editor choice. EDITOR can be one of favourite [t]ext editor (VS Code), [e]clipse, [l]ess, n[o]tepad, [v]im, co[n]sole or c[y]gstart.
This ends up running the following core grep command:
grep -HE --text -i -B 0 -A 2 -n -H "junit temporary"
Use -Z to have the filenames followed by a \0 byte rather than a : or - character, then instruct sed to look for that \0 byte:
# have sed replace the first 0 byte with \n
grep -Z -HinER ... | sed -e 's/\x00/\n/'
This should remove the ambiguity, since a \0 byte cannot appear inside a file name.
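As a rough, self-contained check of the idea, here is a minimal sketch that rebuilds one of the problem paths from the question in a scratch directory. It assumes GNU grep, and it uses --null (the long form of -Z) and tr instead of sed purely for illustration; the scratch directory and the variable names dir, file and result are hypothetical.

```shell
#!/bin/bash
# Hypothetical scratch directory that mimics the "numbers" test case.
dir=$(mktemp -d)
mkdir -p "$dir/222-222-222"
file="$dir/222-222-222/222-222-222.txt"
printf 'four five six\nseven eight nine\nten eleven twelve\n' > "$file"

# --null (same as -Z in GNU grep) ends each file name with a NUL byte
# instead of ':' or '-'. tr then turns that NUL into a newline, so the
# file name lands on its own line whatever characters it contains.
result=$(grep --null -HinER -A 2 --include "*.txt" "eight" "$dir" | tr '\0' '\n')
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$dir"
```

Each file name should now appear alone on a line, followed by a line that starts with the line number and its : or - marker. Note that tr replaces every NUL byte, so if the matched content itself may contain NUL bytes (for example when searching binary files with --text), replacing only the first NUL per line, as in the sed command above, is probably the safer choice.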