Advanced Tips for Search-and-Replace in Linux

Let’s say that you’ve just decided to rename a variable from foo to fooOne. In Vim, press Esc to make sure you are in normal mode, then type this command:

:%s/foo/fooOne/g

% means that the operation should be carried out on every line of the document. The important part is s/foo/fooOne/, which means “substitute ‘fooOne’ for ‘foo’”. The final g means “global”: without it you replace only the first instance on each line, but with it you replace every occurrence.
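The effect of the g flag can be sketched outside the editor. The snippet below is only an illustration in Python, using the standard re module rather than anything Vim provides; the sample line and variable names are made up for the example.

```python
import re

# A made-up line of text containing three occurrences of "foo".
line = "foo = foo + foo"

# count=1 behaves like s/foo/fooOne/ without g: only the first match changes.
first_only = re.sub(r"foo", "fooOne", line, count=1)

# Leaving count at its default behaves like the g flag: every match changes.
every_match = re.sub(r"foo", "fooOne", line)

print(first_only)
print(every_match)
```

Vim applies the substitution line by line, so this mirrors what happens to a single line of the file.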

To use this search-and-replace pattern in Emacs, hit M-x then type replace-string RET foo RET fooOne.

However, this plain substitution replaces foo wherever it appears, so it would also turn foobar into fooOnebar, which you probably didn’t want. To get around this, use the word boundary markers \< and \>:

:%s/\<foo\>/fooOne/g

This restricts the replacement to cases where ‘foo’ stands as a word on its own (with a word boundary on each side of it). In Emacs:

M-x replace-regexp RET \<foo\> RET fooOne
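As a rough illustration of why the word boundaries matter, here is a small Python sketch. It is an analogy rather than Vim or Emacs syntax: Python's re module has no \< and \>, and uses \b for a boundary at either end of a word. The sample text is invented for the example.

```python
import re

# Made-up sample containing both the variable "foo" and the longer name "foobar".
text = "foo = foobar + foo"

# Plain replacement: every "foo" changes, including the one inside "foobar".
plain = text.replace("foo", "fooOne")

# Whole-word replacement: \b on each side plays the role of \< and \>.
whole_word = re.sub(r"\bfoo\b", "fooOne", text)

print(plain)
print(whole_word)
```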

Backreferences

Backreferences (as used in the previous tutorial) can also be very useful. For example, say you wanted to change all the date references in a file from US style (09/22/09) to UK style, with a four-digit year and dots instead of slashes (22.09.2009). This regexp would do the trick in Vim:

:%s#\<\(\d\+\)/\(\d\+\)/\(\d\{2\}\)\>#\2.\1.20\3#g

For Emacs, use:

M-x replace-regexp RET \<\([[:digit:]]+\)/\([[:digit:]]+\)/\([[:digit:]]\{2\}\)\> RET \2.\1.20\3

OK, that looks quite complicated! First of all, let’s note that in Vim, we use # rather than / as the delimiter, giving us s###g rather than s///g. This makes the command easier to read when the pattern itself contains /, and also means that you don’t need to escape any / characters.

As discussed in the previous article, each pair of escaped parentheses, \(PATT\), stores whatever PATT matched so that it can be referred to later as \1, \2 and so on. Here we have three such groups, with a word boundary before the first and after the last (the \< and \>), and a slash between each group (as in 09/22/09).

The first pattern we’re looking for is \d\+: this means at least one digit character (\d). So this will match 9, 09, 12, etc., which covers the month. In Emacs, this is written [[:digit:]]+ (there is no need to escape the + in Emacs regexp syntax, as you must do in Vim). You can also use [[:digit:]] instead of \d in Vim if you prefer.

The second backreference pattern is the same as the first one, and matches the day. The third pattern, \d\{2\}, matches exactly 2 digit characters (\{n\} matches exactly n repetitions of the preceding item), because years aren’t usually written as single digits.
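The difference between the two quantifiers can be checked with a short Python sketch. This is an illustration only: Python writes them as + and {2} without the backslashes that Vim needs, and the variable names are invented for the example.

```python
import re

# Python equivalents of Vim's \d\+ and \d\{2\}.
one_or_more = re.compile(r"\d+")
exactly_two = re.compile(r"\d{2}")

# fullmatch succeeds only if the whole string fits the pattern.
for candidate in ["9", "09", "12", "2009", ""]:
    print(
        repr(candidate),
        bool(one_or_more.fullmatch(candidate)),
        bool(exactly_two.fullmatch(candidate)),
    )
```

Here \d+ accepts any non-empty run of digits, while \d{2} accepts only two-digit strings such as 09 or 12.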

The replace string is then straightforward: reorder the three backreferences so that the day digits come first, then the month, then the year with 20 in front of it, with a period between each.
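For readers who want to try the same substitution outside an editor, here is a minimal Python sketch of it. It is an approximation under a few assumptions: \b stands in for \< and \>, the parentheses are unescaped as Python requires, and the names US_DATE and us_to_uk are hypothetical, chosen only for this example.

```python
import re

# Month, day and two-digit year, each captured as a group.
US_DATE = re.compile(r"\b(\d+)/(\d+)/(\d{2})\b")


def us_to_uk(text: str) -> str:
    """Rewrite MM/DD/YY dates as DD.MM.20YY, leaving other text alone."""
    # \g<2> is the day, \g<1> the month, \g<3> the two-digit year.
    return US_DATE.sub(r"\g<2>.\g<1>.20\g<3>", text)


print(us_to_uk("Released on 09/22/09."))
print(us_to_uk("Already long: 09/22/2009"))
```

Like the editor commands, this assumes every two-digit year belongs to the 2000s, and it does not pad single digits, so 9/2/09 becomes 2.9.2009.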
