I overlooked the basic concepts of sed for a long time

Date: 2006-11-22  Source: gladness

Today, to solve a problem for a colleague, I studied sed again. First I came across a very good article:
http://pegasus.rutgers.edu/~elflord/unix/sed.html
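Some of the usages in it were clearly new to me. So I went back and read info sed, and found that I had been overlooking a very important basic concept in the manual all along. Here is the relevant section, with my own restatement after each paragraph: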

3.1 How `sed' Works
===================

`sed' maintains two data buffers: the active _pattern_ space, and the auxiliary _hold_ space. Both are initially empty.
In short: two buffers, the active pattern space and the auxiliary hold space, and both start out empty.

   `sed' operates by performing the following cycle on each line of input: first, `sed' reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.
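In other words, for every input line sed reads the line, strips its trailing newline, and puts it in the pattern space. The commands then run in order; each may carry an address, which acts as a condition, and a command runs only when its condition holds.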
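
A small sketch of these two points, using made-up sample input (the strings and variable names are only for illustration). The first command suggests that the trailing newline is not part of the pattern space; the second uses a line number as the address.

```shell
# The trailing newline is stripped before the line enters the pattern space,
# so a regex that looks for a newline matches nothing and nothing is printed.
no_newline=$(printf 'abc\n' | sed -n '/\n/p')

# A line-number address: the s command runs only when the current line is line 2.
second_only=$(printf 'one\ntwo\nthree\n' | sed '2s/^/> /')
printf '%s\n' "$second_only"
```

Lines 1 and 3 come out unchanged because the condition attached to s was not met for them.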

   When the end of the script is reached, unless the `-n' option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.
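So when the end of the script is reached, the pattern space is written to the output stream with its newline restored, unless -n was given. Then the next cycle begins.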
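
The end-of-cycle printing is easy to see with the p command. This is a minimal sketch on a two-line sample input (the variable names are just for illustration):

```shell
# Without -n: p prints the line once, then the end-of-cycle autoprint prints it again.
with_autoprint=$(printf 'a\nb\n' | sed p)

# With -n: the autoprint is suppressed, so only the explicit p produces output.
without_autoprint=$(printf 'a\nb\n' | sed -n p)

printf '%s\n---\n%s\n' "$with_autoprint" "$without_autoprint"
```

Each line appears twice in the first result and once in the second.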

   Unless special commands (like `D') are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands `h', `H', `x', `g', `G' to move data between both buffers).
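Unless a special command such as D is used, the pattern space is cleared between two cycles. The hold space, on the other hand, keeps its contents across cycles (see the commands h, H, x, g, G).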
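
Because the hold space survives from one cycle to the next, it can accumulate data across lines. One well-known one-liner that relies on this reverses the order of the input lines. It is shown here only as a sketch on a three-line sample:

```shell
# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the pattern space into the hold space, where it survives the cycle
# $p  : on the last line, print the accumulated result (-n suppresses all other output)
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

After the third cycle the pattern space holds c, b, a separated by newlines, and that is what gets printed.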

sed -e 's/aaaa/bbbb/' filename
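This usage has no address, so nothing filters which lines the command applies to: every line goes into the pattern space in turn, and the s with everything after it is a single command that runs on each of them.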
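So the form
sed -e '/ccccc/s/aaaa/bbbb/' filename
is no longer surprising: /ccccc/ is the address of the s command. Every line is still read into the pattern space, but s runs only on the lines that match ccccc.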
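
A quick check of the addressed form, using two made-up sample lines instead of a file. The line without ccccc is still printed, unchanged, which suggests it went through the pattern space as usual and only the s command was skipped for it:

```shell
# Sample input is hypothetical: the first line contains ccccc, the second does not.
result=$(printf 'ccccc aaaa\nxxxxx aaaa\n' | sed -e '/ccccc/s/aaaa/bbbb/')
printf '%s\n' "$result"
```

Only the first line has aaaa replaced by bbbb.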

To be continued