
awk

Date: 2006-10-18  Source: anima

$ awk '{ print }' /etc/passwd

You should see the contents of your /etc/passwd file appear before your eyes. Now, for an explanation of what awk did. When we called awk, we specified /etc/passwd as our input file. Awk then evaluated the print command once for each line in /etc/passwd, in order. All output is sent to stdout, so the result is identical to running cat on /etc/passwd.

Now, for an explanation of the { print } code block. In awk, curly braces are used to group blocks of code together, similar to C. Inside our block of code, we have a single print command. In awk, when a print command appears by itself, the full contents of the current line are printed.

$ awk '{ print $0 }' /etc/passwd

In awk, the $0 variable represents the entire current line, so print and print $0 do exactly the same thing.
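To check this equivalence without depending on the contents of a real /etc/passwd, here is a minimal sketch that runs both forms over two made-up passwd-style lines. The variable names (sample, plain, dollar0) are illustrative only.

```shell
# Sample input standing in for /etc/passwd (made-up entries)
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

plain=$(printf '%s\n' "$sample" | awk '{ print }')
dollar0=$(printf '%s\n' "$sample" | awk '{ print $0 }')

# Both forms echo every input line unchanged
[ "$plain" = "$dollar0" ] && echo "identical output"
```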

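$ awk '{ print "" }' /etc/passwd

Passing the empty string "" to print outputs a blank line, so this command prints one blank line for every line in /etc/passwd.

$ awk '{ print "hiya" }' /etc/passwd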

Running this script will fill your screen with hiya's. :)

Multiple fields

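Awk splits each input line into fields. The -F option sets the field separator, and $1, $2, $3, and so on refer to the individual fields, so print $1 would print only the first field. Here we print the first and third fields of each line: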

$ awk -F":" '{ print $1 $3 }' /etc/passwd

halt7
operator11
root0
shutdown6
sync5
bin1
....etc.


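As you can see, print $1 $3 runs the two fields together with nothing between them. To put a space between them, add a string literal: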

$ awk -F":" '{ print $1 " " $3 }' /etc/passwd

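Called this way, print concatenates $1, " ", and $3, which gives readable output. Text labels can be added the same way: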

$ awk -F":" '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd

username: halt uid:7
username: operator uid:11
username: root uid:0
username: shutdown uid:6
username: sync uid:5
username: bin uid:1
....etc.
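The two print styles above can be compared side by side on a small sample. This is a sketch under the assumption of two made-up passwd-style lines; the variable names are illustrative, and a single tab is used in the label for brevity.

```shell
sample='root:x:0:0:root:/root:/bin/bash
halt:x:7:0:halt:/sbin:/sbin/halt'

# No separator between the fields: they run together
joined=$(printf '%s\n' "$sample" | awk -F":" '{ print $1 $3 }')

# String literals between the fields keep them apart and label them
labeled=$(printf '%s\n' "$sample" | awk -F":" '{ print "username: " $1 "\tuid:" $3 }')

printf '%s\n' "$joined" "$labeled"
```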


External scripts

BEGIN {
    FS=":"
}
{ print $1 }



The difference between this script and the earlier one-liner has to do with how we set the field separator. In this script, the field separator is specified within the code itself (by setting the FS variable), while our previous example set FS by passing the -F":" option to awk on the command line. It's generally best to set the field separator inside the script itself, simply because it means you have one less command line argument to remember to type. We'll cover the FS variable in more detail later in this article.
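As a rough illustration, the script above can be saved to a file and run with awk's -f option, then compared with the -F one-liner. The temporary file created by mktemp and the sample input are assumptions made for this sketch.

```shell
# Write the script to a temporary file (the file name is arbitrary)
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN {
    FS=":"
}
{ print $1 }
EOF

# -f tells awk to read the program from a file instead of the command line
from_file=$(printf 'root:x:0\nbin:x:1\n' | awk -f "$script")

# The one-liner equivalent sets the separator with -F
from_flag=$(printf 'root:x:0\nbin:x:1\n' | awk -F":" '{ print $1 }')

rm -f "$script"
printf '%s\n' "$from_file"
```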

The BEGIN and END blocks

Normally, awk executes each block of your script's code once for each input line. However, there are many programming situations where you may need to execute initialization code before awk begins processing the text from the input file. For such situations, awk allows you to define a BEGIN block. We used a BEGIN block in the previous example. Because the BEGIN block is evaluated before awk starts processing the input file, it's an excellent place to initialize the FS (field separator) variable, print a heading, or initialize other global variables that you'll reference later in the program.

Awk also provides another special block, called the END block. Awk executes this block after all lines in the input file have been processed. Typically, the END block is used to perform final calculations or print summaries that should appear at the end of the output stream.
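A small sketch may help show the order in which the three kinds of block run. The heading text, the count variable, and the sample input are all made up for illustration.

```shell
report=$(printf 'root:x:0\nbin:x:1\nsync:x:5\n' | awk '
BEGIN { FS = ":"; print "USERS" }   # runs once, before any input is read
      { print $1; count++ }         # runs for every input line
END   { print "total: " count }     # runs once, after the last line
')
printf '%s\n' "$report"
```

The heading comes first, then one username per input line, then the summary line.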

Regular expressions and blocks

/foo/ { print }

/[0-9]+\.[0-9]*/ { print }
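To see what these two patterns select, here is a sketch over four made-up lines. It assumes the second pattern is meant to match digits followed by a literal dot, which is why the dot is escaped; an unescaped dot would match any character, so a plain 42 would also qualify.

```shell
input='foo bar
price 3.14
no digits here
seafood 42'

# Lines containing the character sequence foo (anywhere, even inside a word)
has_foo=$(printf '%s\n' "$input" | awk '/foo/ { print }')

# Lines containing digits followed by a literal dot (note the backslash)
has_float=$(printf '%s\n' "$input" | awk '/[0-9]+\.[0-9]*/ { print }')

printf '%s\n' "$has_foo" "$has_float"
```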

Expressions and blocks

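Any boolean expression can be placed before a code block to control when the block runs. The following script prints the third field of every line whose first field equals fred; for all other lines, the print statement is skipped: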

$1 == "fred" { print $3 }

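The ~ operator tests whether a value matches a regular expression. This example prints the third field only if the fifth field on the same line contains the character sequence root: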

$5 ~ /root/ { print $3 }
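Both kinds of test can be tried on a few made-up colon-separated records. This is only a sketch: the sample data and variable names are assumptions, and -F":" is passed because the two scripts above do not set FS themselves.

```shell
input='fred:x:1001:1001:Fred Smith
root:x:0:0:root
admin:x:1002:1002:root helper'

# Exact comparison on field one
by_name=$(printf '%s\n' "$input" | awk -F":" '$1 == "fred" { print $3 }')

# Regular expression match on field five
by_match=$(printf '%s\n' "$input" | awk -F":" '$5 ~ /root/ { print $3 }')

printf '%s\n' "$by_name" "$by_match"
```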

Conditional statements

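Awk also offers C-like if statements. The previous script could be rewritten with an if statement like this:

{
    if ( $5 ~ /root/ ) {
        print $3
    }
}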



Both scripts function identically. In the first example, the boolean expression is placed outside the block, while in the second example, the block is executed for every input line, and we selectively perform the print command by using an if statement. Both methods are available, and you can choose the one that best meshes with the other parts of your script.

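Here is a more complicated if statement. Even with nested conditionals, awk's if statements look just like their C counterparts: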
{
    if ( $1 == "foo" ) {
        if ( $2 == "foo" ) {
            print "uno"
        } else {
            print "one"
        }
    } else if ( $1 == "bar" ) {
        print "two"
    } else {
        print "three"
    }
}

Using an if statement, we can also transform this script:

! /matchme/ { print $1 $3 $4 }

into this one:

{
    if ( $0 !~ /matchme/ ) {
        print $1 $3 $4
    }
}


Both scripts produce output only for lines that don't contain the character sequence matchme, printing fields one, three, and four of each such line. Again, you can choose the method that works best for your code; the two are equivalent.
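The equivalence can be checked directly. In this sketch the three input lines are made up, and the default whitespace field separator is assumed.

```shell
input='a matchme b c
one two three four
x y z w'

# Negated pattern in front of the block
pattern_form=$(printf '%s\n' "$input" | awk '! /matchme/ { print $1 $3 $4 }')

# Same test written as an if statement inside the block
if_form=$(printf '%s\n' "$input" | awk '{ if ( $0 !~ /matchme/ ) { print $1 $3 $4 } }')

printf '%s\n' "$pattern_form"
```

Because the fields are printed without separators, each selected line comes out as one run of characters.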

( $1 == "foo" ) && ( $2 == "bar" ) { print }

This example will print only those lines where field one equals foo and field two equals bar.
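As a quick illustration on made-up input, the sketch below runs that && pattern and, for contrast, the same two tests joined with || (logical or), which awk also supports.

```shell
input='foo bar 1
foo baz 2
qux bar 3'

# Both conditions must hold
both=$(printf '%s\n' "$input" | awk '( $1 == "foo" ) && ( $2 == "bar" ) { print }')

# Either condition is enough
either=$(printf '%s\n' "$input" | awk '( $1 == "foo" ) || ( $2 == "bar" ) { print }')

printf '%s\n' "$both"
```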

Numeric variables!

In the BEGIN block, we initialize our integer variable x to zero. Then, each time awk encounters a blank line, awk will execute the x=x+1 statement, incrementing x. After all the lines have been processed, the END block will execute, and awk will print out a final summary, specifying the number of blank lines it found.
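A minimal sketch consistent with this description might look like the following. The sample input and the exact wording of the summary message are assumptions; only the BEGIN, x=x+1, and END structure comes from the description.

```shell
summary=$(printf 'one\n\ntwo\n\n\nthree\n' | awk '
BEGIN { x = 0 }                               # initialize the counter
/^$/  { x = x + 1 }                           # /^$/ matches an empty line
END   { print "I found " x " blank lines." }  # final summary
')
printf '%s\n' "$summary"
```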

Stringy variables

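Awk variables are "stringy": if you assign the string "1.01" to x, add 1 to it, and print x, awk outputs 2.01, converting the string to a number automatically. Fields behave the same way in arithmetic. The following block squares the first field and adds one: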

{ print ($1^2)+1 }

If you do a little experimenting, you'll find that if a particular variable doesn't contain a valid number, awk will treat that variable as a numerical zero when it evaluates your mathematical expression.
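These conversions can be tried in one line. The sketch assumes a made-up input line whose first field is numeric and whose second is not.

```shell
# x holds a numeric string; $1 is "3"; $2 is "abc" (not a number)
result=$(echo "3 abc" | awk '{ x = "1.01"; x = x + 1; print x, ($1^2)+1, $2 + 5 }')
echo "$result"
```

The non-numeric field counts as zero, so the last value is simply 5.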

Lots of operators

Another nice thing about awk is its full complement of mathematical operators. In addition to standard addition, subtraction, multiplication, and division, awk allows us to use the previously demonstrated exponent operator "^", the modulo (remainder) operator "%", and a bunch of other handy assignment operators borrowed from C.

These include pre- and post-increment/decrement ( i++, --foo ), add/sub/mult/div assign operators ( a+=3, b*=2, c/=2.2, d-=6.2 ). But that's not all -- we also get handy modulo/exponent assign ops as well ( a^=2, b%=4 ).
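A short sketch exercising several of these operators in a BEGIN block, which needs no input file. The variable names and starting values are arbitrary.

```shell
result=$(awk 'BEGIN {
    a = 5;  a ^= 2      # exponent-assign: 25
    b = 10; b %= 4      # modulo-assign: 2
    c = 9;  c *= 2      # multiply-assign: 18
    d = 10; d -= 6.2    # subtract-assign: 3.8
    i = 1;  i++         # post-increment: 2
    j = 1;  --j         # pre-decrement: 0
    print a, b, c, d, i, j
}')
echo "$result"
```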

Field separators (FS)

Awk has its own complement of special variables. Some of them allow you to fine-tune how awk functions, while others can be read to glean valuable information about the input. We've already touched on one of these special variables, FS. As mentioned earlier, this variable allows you to set the character sequence that awk expects to find between fields. When we were using /etc/passwd as input, FS was set to ":". While this did the trick, FS allows us even more flexibility.

FS="\t+"

Above, we use the special "+" regular expression character, which means "one or more of the previous character".

FS="[[:space:]]+"

While this assignment will do the trick, it's not necessary. Why? Because by default, FS is set to a single space character, which awk interprets to mean "one or more spaces or tabs." In this particular example, the default FS setting was exactly what you wanted in the first place!

FS="foo[0-9][0-9][0-9]"
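The sketch below tries a tab-based separator and the foo-plus-three-digits separator on made-up input. It assumes the tab example is meant to use the escape sequence \t.

```shell
# One or more tabs between fields
tabs=$(printf 'a\t\tb\tc\n' | awk 'BEGIN { FS = "\t+" } { print $2 }')

# "foo" followed by exactly three digits between fields
custom=$(printf 'afoo123bfoo456c\n' | awk 'BEGIN { FS = "foo[0-9][0-9][0-9]" } { print $1, $2, $3 }')

# Default FS: runs of spaces and tabs separate fields
default_count=$(printf '  x   y\tz\n' | awk '{ print NF }')

printf '%s\n' "$tabs" "$custom" "$default_count"
```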

Number of fields (NF)

{
    if ( NF > 2 ) {
        print $1 " " $2 ":" $3
    }
}
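On made-up input with two, three, and four fields per line, the NF test above behaves as in this sketch: only lines with more than two fields produce output.

```shell
out=$(printf 'a b\nc d e\nf g h i\n' | awk '{
    if ( NF > 2 ) {
        print $1 " " $2 ":" $3
    }
}')
printf '%s\n' "$out"
```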



Record number (NR)

{
    # skip header
    if ( NR > 10 ) {
        print "ok, now for the real information!"
    }
}
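The same idea on a smaller scale: this sketch assumes a made-up file with a two-line header and uses NR both to skip it and to label the remaining lines.

```shell
out=$(printf 'header 1\nheader 2\ndata a\ndata b\n' | awk 'NR > 2 { print NR ": " $0 }')
printf '%s\n' "$out"
```

NR keeps counting from the first line of input, so the first data line is reported as record 3.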



Awk provides additional variables that can be used for a variety of purposes. We'll cover more of these variables in later articles.

Examples

# Print every line after erasing the 2nd field
awk '{$2 = ""; print}' file

# Print hi 48 times
yes | head -n 48 | awk '{ print "hi" }'

# Print hi0010 to hi0099
yes | head -n 90 | awk '{printf("hi00%2.0f \n", NR+9)}'