awk Study Notes
Date: 2009-07-16  Source: corryzhu
#vi employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
# sed 's/\t/:/g' employees > employees2
# more employees2
Tom Jones:4424:5/12/66:543354
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:650000
Billy Black:1683:9/23/44:336500
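The `\t` escape in the sed command above is not understood by every sed implementation. As a minimal sketch, assuming the fields in employees really are separated by single tabs, the same conversion can be done in awk by setting the input and output field separators. The sample record is typed inline with printf so the snippet does not depend on the file.

```shell
# Build one tab-separated record inline (assumed layout: name, ID, date, salary).
record=$(printf 'Tom Jones\t4424\t5/12/66\t543354')

# FS is the input separator, OFS the output separator.
# Assigning $1 to itself forces awk to rebuild $0 using OFS.
converted=$(printf '%s\n' "$record" | awk 'BEGIN { FS = "\t"; OFS = ":" } { $1 = $1; print }')

printf '%s\n' "$converted"
```

The space inside "Tom Jones" is untouched because only tabs are treated as separators.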
# awk -F: '/Tom Jones/{print $1,$2}' employees2
Tom Jones 4424
# awk -F'[ :\t]' '{print $1,$2,$3}' employees2
Tom Jones 4424
Mary Adams 5346
Sally Chang 1654
Billy Black 1683
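How the fields are numbered depends entirely on the separator. The following is a small sketch on one record copied from employees2; the `|` marks are added here only to make the field boundaries visible.

```shell
line='Tom Jones:4424:5/12/66:543354'

# Colon only: the name stays together in $1.
colon_only=$(printf '%s\n' "$line" | awk -F: '{ print $1 "|" $2 }')

# Space, colon, or tab: the name is split, so the ID moves to $3.
mixed=$(printf '%s\n' "$line" | awk -F'[ :\t]' '{ print $1 "|" $2 "|" $3 }')

printf '%s\n%s\n' "$colon_only" "$mixed"
```

Quoting the bracket expression keeps the shell from interpreting the space and the brackets before awk sees them.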
#awk -F: '/Tom Jones/{print $1,$2,$3,$4}' employees2
Tom Jones 4424 5/12/66 543354
#awk -F: '/Tom Jones/{print $0}' employees2
Tom Jones:4424:5/12/66:543354
#awk '/Tom Jones/{print $0}' employees2
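Tom Jones:4424:5/12/66:543354

Patterns and Actions

1) Patterns
An awk pattern controls which input lines awk acts on. A pattern consists of a regular expression, an expression that evaluates to true or false, or a combination of the two. The default action is to print every line for which the pattern is true.
Note: a pattern carries an implied if statement, so a pattern that already expresses the condition is not wrapped in curly braces. When the if is written out explicitly, the expression becomes an action statement and the syntax changes.
# cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
# awk '/Tom/' employees
Tom Jones 4424 5/12/66 543354
Note: if the pattern Tom matches a record in the input file, that record is printed. When no action is given explicitly, the default action is to print the line, so the command is equivalent to
awk '$0 ~ /Tom/{print $0}' employees
whereas
awk '$0 !~ /Tom/{print $0}' employees
prints every record except Tom Jones:
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
# awk '$3<4000' employees
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
Note: a record is printed if its third field is less than 4000.

2) Actions
Actions are statements enclosed in curly braces. If a pattern precedes an action, the pattern controls when the action is performed. Multiple statements on one line are separated by semicolons; statements on separate lines are separated by newlines.
pattern { action statement; action statement; ... }
or
pattern {
    action statement
    action statement
}
Note: when an action directly follows a pattern, its opening curly brace must be on the same line as the pattern. A pattern never appears inside the curly braces.
# awk '/Tom/{print "Hello there, " $1}' employees
Hello there, Tom
Note: if no action is specified for a pattern, every line matching the pattern is printed.
# awk '/^Mary/' employees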
Mary Adams 5346 11/4/63 28765
# awk '/^[A-Z][a-z]+ /' employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
The match operator (~) matches a regular expression against a record or a field.
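As a sketch of how the match operator relates to the implied if described earlier, the following applies ~ and !~ to the second field of four inline sample records. The names match those in employees, and the shell variable names are chosen here only for illustration. The pattern form and the explicit if form should select the same records.

```shell
data='Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500'

# Pattern form: the condition sits outside the braces (implied if).
as_pattern=$(printf '%s\n' "$data" | awk '$2 ~ /^[BC]/ { print $1 }')

# Action form: the same condition written as an explicit if inside the braces.
explicit_if=$(printf '%s\n' "$data" | awk '{ if ($2 ~ /^[BC]/) print $1 }')

# !~ selects the records whose second field does NOT match.
negated=$(printf '%s\n' "$data" | awk '$2 !~ /^[BC]/ { print $1 }')

printf '%s\n' "$as_pattern" "$explicit_if" "$negated"
```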
# cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
# awk '$1 ~ /[Bb]ill/' employees
Billy Black 1683 9/23/44 336500
# awk '$1 !~ /ly$/' employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
# more employees2
Tom Jones:4424:5/12/66:543354
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:650000
Billy Black:1683:9/23/44:336500
# more info
/Tom/{print "Tom's birthday is" $3}
/Mary/{print NR,$0}
/^Sally/{print "Hi Sally." $1 "has a salary of $"$4 "."}
# cat info
/Tom/{print "Tom's birthday is" $3}
/Mary/{print NR,$0}
/^Sally/{print "Hi Sally." $1 "has a salary of $"$4 "."}
# awk -F: -f info employees2
Tom's birthday is5/12/66
2 Mary Adams:5346:11/4/63:28765
Hi Sally.Sally Changhas a salary of $650000.

Review
# awk '/west/' datafile
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
# awk '/^north/' datafile
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
# awk '/^(no|so)/' datafile
northwest NW Charles Main 3.0 .98 3 34
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Hemenway 4.0 .7 4 17
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
# awk '{print $3,$2}' datafile
Charles NW
Sharon WE
Lewis SW
Suan SO
Patricia SE
TB EA
AM NE
Margot NO
Ann CT
# awk 'print S1' datafile
syntax error The source line is 1.
The error context is
>>> print <<< S1
awk: Quitting
The source line is 1.
# awk '{print $1}' datafile
northwest
western
southwest
southern
southeast
eastern
northeast
north
central
# awk '{print $0}' datafile
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Hemenway 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
# awk '{print "Number of fields: " NF}' datafile
Number of fields: 8
Number of fields: 8
Number of fields: 0
Number of fields: 8
Number of fields: 8
Number of fields: 8
Number of fields: 8
Number of fields: 9
Number of fields: 8
Number of fields: 8
Number of fields: 0
# awk '/northeast/{print $3,$2}' datafile
AM NE
# awk '/E/' datafile
western WE Sharon Gray 5.3 .97 5 23
southeast SE Patricia Hemenway 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
# awk '/^[ns]/{print $1}' datafile
northwest
southwest
southern
southeast
northeast
north
# awk '$5 ~/\.[7-9]+/' datafile
southwest SW Lewis Dalsass 2.7 .8 2 18
central CT Ann Stephens 5.7 .94 5 13
# awk '$2 !~ /E/{print $1,$2}' datafile
northwest NW
southwest SW
southern SO
north NO
central CT
# awk '$3 ~/^Suan/{print $3 "is a nice girl."}' datafile
Suanis a nice girl.
# awk '$8 ~ /[0-9][0-9]$/{print $8}' datafile
34
23
18
15
17
20
13
# awk '$4 ~ /Chin$/{printf "The price is $" $8 ".\n"}' datafile
The price is $15.
# awk '/NO/{print $0}' datafile
north NO Margot Weber 4.5 .89 5 9
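The NF listing earlier in this review prints 0 for two records and 9 for one. A likely explanation, stated here as an assumption because the raw file is not shown, is that datafile contains two empty lines and that the "AM Main Jr." record has one extra whitespace-separated word. The following is a sketch for locating such records, using NF as a pattern on a small inline sample rather than the real file.

```shell
sample='northwest NW Charles Main 3.0 .98 3 34

northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9'

# Report every record whose field count differs from the expected 8.
odd=$(printf '%s\n' "$sample" | awk 'NF != 8 { print NR ": " NF " fields" }')

# NF alone is a pattern that is false for empty lines; $NF is the last field.
last=$(printf '%s\n' "$sample" | awk 'NF { print $1, $NF }')

printf '%s\n' "$odd" "$last"
```

Under the same assumption, the extra word would also explain why the `$8 ~ /[0-9][0-9]$/` example prints no value for the northeast record: with nine fields, its $8 would be 3.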
# cp datafile datafile.test
# vi datafile.test
"datafile.test" 11 lines, 346 characters
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Hemenway 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
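Edit the file in vi to get the required layout.
The fields are currently separated by tabs (except for the space between $3 and $4); replace the tabs with vi's :1,$s/\t/:/g
# cat datafile.test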
northwest:NW:Charles Main:3.0:.98:3:34
western:WE:Sharon Gray:5.3:.97:5:23
southwest:SW:Lewis Dalsass:2.7:.8:2:18
southern:SO:Suan Chin:5.1:.95:4:15
southeast:SE:Patricia Hemenway:4.0:.7:4:17
eastern:EA:TB Savage:4.4:.84:5:20
northeast:NE:AM Main:Jr.:5.1:.94:3:13
north:NO:Margot Weber:4.5:.89:5:9
central:CT:Ann Stephens:5.7:.94:5:13
# mv datafile.test datafile2
# awk -F'[:]' '{print $3,$1}' datafile2
Charles Main northwest
Sharon Gray western
Lewis Dalsass southwest
Suan Chin southern
Patricia Hemenway southeast
TB Savage eastern
AM Main northeast
Margot Weber north
Ann Stephens central
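With a colon as the separator, the whole name sits in $3. If the first and last names are needed separately, one option is awk's built-in split() function. This is a sketch on a single record copied from datafile2, and the array name `name` is chosen here for illustration.

```shell
rec='southeast:SE:Patricia Hemenway:4.0:.7:4:17'

# split() breaks $3 on whitespace into the array "name" and returns the piece count.
out=$(printf '%s\n' "$rec" | awk -F: '{ n = split($3, name, " "); print name[n] ", " name[1] " (" $1 ")" }')

printf '%s\n' "$out"
```

Using name[n] rather than name[2] picks the last piece even when a name has more than two words.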