You can modify the pdfgrep output as follows to make it usable with xargs:
$ echo 'RE/2011-01-RE_60822079000168_23022016_153923(1).PDF: Fatima Alves' | grep --perl-regexp --only-matching '.*(?=: Fatima Alves$)'
RE/2011-01-RE_60822079000168_23022016_153923(1).PDF
So for any given regular expression and pdfgrep output, you can do this:
regex='Fatima Alves'
pdfgrep -H "$regex" RE/* | grep --perl-regexp --only-matching ".*(?=: $regex\$)"
Edit:
I originally thought pdfgrep printed only the matching part of the line. Since it prints the whole line, we simply have to remove everything from the colon separator onward:
pdfgrep -H "$regex" RE/* | sed 's/:.*//'
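To see how the sed step feeds into xargs, here is a minimal sketch. It assumes nothing about your files: `sample_output` is a hypothetical stand-in that prints lines shaped like `pdfgrep -H` output, and the file names are made up. The `sort -u` step is my addition, so a file with several matching lines is only passed on once.

```shell
# Hypothetical stand-in for `pdfgrep -H "$regex" RE/*`:
# one "file: matching line" record per match (file names are invented).
sample_output() {
  printf '%s\n' \
    'RE/a(1).PDF: Fatima Alves' \
    'RE/a(1).PDF: Fatima Alves' \
    'RE/b.PDF: Fatima Alves'
}

# Strip everything from the first colon onward, then drop duplicate names.
matching_files() {
  sed 's/:.*//' | sort -u
}

# Hand each file name to a command; echo is only a placeholder here.
sample_output | matching_files | xargs -I{} echo 'match in: {}'
```

In real use you would replace `sample_output` with the pdfgrep call and `echo` with whatever should run on each file. Note that the sed expression cuts at the first colon on the line, so a file name that itself contains a colon would be truncated.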