regex - "partial grep" to accelerate grep speed?
This is what I am thinking: the grep program tries to match every occurrence of the pattern in a line, like:
echo "abc abc abc" | grep abc --color
The result is that all 3 abc are colored red, so grep did a full pattern match over the whole line.
But think of this scenario: I have many big files to process, and the words I am interested in mostly occur within the first few words of a line. My job is to find the lines without those words in them. If the grep program could continue with the next line as soon as the words have been found, without having to check the rest of the line, it might be faster.
Is there a partial match option (maybe in grep) to do this?
Like:
echo abc abc abc | grep --partial abc --color
with only the first abc colored red.
See this nice introduction to grep internals:
http://lists.freebsd.org/pipermail/freebsd-current/2010-august/019310.html
In particular:
GNU grep avoids breaking the input into lines. Looking for newlines would slow grep down by a factor of several times, because to find the newlines it would have to look at every byte!
So instead of using line-oriented input, GNU grep reads raw data into a large buffer, searches the buffer using Boyer-Moore, and only when it finds a match does it go and look for the bounding newlines. (Certain command line options like -n disable this optimization.)
So the answer is: no. It is way faster for grep to look for the next occurrence of the search string than to look for a new line.
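To make the quoted strategy concrete, here is a minimal Python sketch of the idea, not GNU grep's actual code. It assumes the whole input fits in one bytes buffer and that the search string is a fixed, non-empty string without newlines. The function name matching_lines is made up for illustration, and Python's built-in bytes.find stands in for Boyer-Moore.

```python
def matching_lines(buf: bytes, needle: bytes):
    """Yield every line of buf that contains needle.

    The buffer is never split into lines up front. Newlines are only
    searched for around a position where needle was actually found.
    """
    if not needle or b"\n" in needle:
        raise ValueError("needle must be non-empty and must not contain a newline")
    pos = 0
    while True:
        # Search the raw buffer for the next occurrence of the needle.
        hit = buf.find(needle, pos)
        if hit == -1:
            return
        # Only now look for the newlines that bound the match.
        start = buf.rfind(b"\n", 0, hit) + 1  # 0 when no newline precedes the hit
        end = buf.find(b"\n", hit)
        if end == -1:
            end = len(buf)  # last line has no trailing newline
        yield buf[start:end]
        # The line has been reported once, so resume after its end.
        pos = end + 1
```

Calling list(matching_lines(b"abc abc abc\nxyz\nfoo abc\n", b"abc")) gives [b"abc abc abc", b"foo abc"]. Lines that do not contain the search string are never delimited at all, which is why stopping early inside a matching line saves so little.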
Edit: Regarding the speculation in the comments that --color=never would do the trick: I had a quick glance at the source code. The variable color_option is not used anywhere near the actual search for the regex, or near the search for the previous and upcoming newline once a match has been found.
It might be that one could save a few CPU cycles when searching for those line terminators. Possibly a real-world difference shows up with pathologically long lines and a very short search string.