Using awk (question from the CU forum)
Date: 2008-11-12 Source: north423
The problem is as follows:
The file top10_mktval.csv contains the top 10 stocks by % of index market value from 1980 - 2007 in comma-delimited format. It has the following fields:
1. Year
2. Stock name
3. Rank
4. Market value (in millions)
5. % of index
The file is included in the package I uploaded.
Question 1. List the years since 2000 when Amer Intl Group (AIG) is NOT in the top 10 in market value.
Question 2. List all the companies that were NOT ranked # 1 while commanding at least 3% of the index market value. // "#" here means "number"
Here is part of the contents of top10_mktval.csv:
2007,Exxon Mobil,1,511887,3.86%
2007,Genl Electric,2,379826,2.86%
2007,Microsoft Corp,3,333054,2.51%
2007,AT&T Inc,4,252051,1.90%
2007,Procter & Gamble,5,228016,1.72%
2007,Google Inc'A',6,216323,1.63%
2007,Chevron Corp,7,197061,1.49%
2007,Johnson & Johnson,8,190879,1.44%
2007,Wal-Mart Stores,9,190349,1.44%
2007,Bank of America,10,183068,1.38%
2006,Exxon Mobil,1,446944,3.43%
2006,Genl Electric,2,383564,2.94%
2006,Microsoft Corp,3,293538,2.25%
2006,Citigroup Inc,4,273691,2.10%
2006,Bank of America,5,239758,1.84%
2006,Procter & Gamble,6,203656,1.56%
2006,Wal-Mart Stores,7,192479,1.48%
2006,Johnson & Johnson,8,191415,1.47%
2006,Pfizer Inc,9,186751,1.43%
2006,Amer Intl Group,10,186296,1.43%
2005,Genl Electric,1,370344,3.21%
2005,Exxon Mobil,2,349512,3.03%
2005,Microsoft Corp,3,278358,2.41%
2005,Citigroup Inc,4,245512,2.13%
2005,Procter & Gamble,5,197801,1.71%
2005,Wal-Mart Stores,6,194851,1.69%
2005,Bank of America,7,185342,1.61%
2005,Johnson & Johnson,8,178793,1.55%
2005,Amer Intl Group,9,177098,1.54%
2005,Pfizer Inc,10,171901,1.49%
2004,Genl Electric,1,385883,3.45%
2004,Exxon Mobil,2,330693,2.96%
2004,Microsoft Corp,3,290489,2.60%
2004,Citigroup Inc,4,250042,2.24%
2004,Wal-Mart Stores,5,223686,2.00%
2004,Pfizer Inc,6,202508,1.81%
2004,Bank of America,7,189801,1.70%
2004,Johnson & Johnson,8,188213,1.68%
2004,Amer Intl Group,9,171042,1.53%
2004,Intl Bus. Machines,10,164106,1.47%
2003,Genl Electric,1,310384,3.05%
2003,Microsoft Corp,2,295937,2.91%
2003,Exxon Mobil,3,271002,2.67%
2003,Pfizer Inc,4,269622,2.65%
2003,Citigroup Inc,5,250402,2.46%
2003,Wal-Mart Stores,6,229589,2.26%
。
。
。
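------------ my solution ------------
Although the problem asks for a grep solution, I have been learning awk recently, so here is the awk code:
1: awk -F, '$1 >= 2000 && $2 != "Amer Intl Group" {print $1}' top10_mktval.csv | uniq -c | awk '$1 == 10 {print $2}'
2: awk -F, '$3 != 1 && $5 ~ /^[3-9]\./ {print $2}' top10_mktval.csv
grep code (I used the uniq command as a shortcut :) ):
1: grep -v 'Amer Intl Group' top10_mktval.csv | grep -o '^[2-9][0-9]\{3\}' | uniq -c | grep '^ *10 ' | grep -o '[2-9][0-9]\{3\}$'
2: grep -E ',[3-9]\.[0-9]+%$' top10_mktval.csv | grep -Ev '^[^,]*,[^,]*,1,'
Both versions of solution 1 assume that every year has exactly 10 rows and that the rows for one year are adjacent in the file.
I hope someone can suggest better code. Any pointers are appreciated, thanks.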
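One possible alternative, offered only as a sketch: both questions can be answered in a single awk pass each, without counting rows through uniq. It assumes the five-field layout described above (year, name, rank, value, percentage) and that stock names contain no commas. The function names years_without_aig and big_non_leaders are made up for illustration. Comparing the rank with != instead of a regex match avoids also excluding rank 10, and adding 0 to the percentage field makes awk read "3.03%" as the number 3.03.

```shell
#!/bin/sh
# Sketch only. Assumed layout: year,name,rank,market value,percent%
# The function names below are hypothetical helpers, not from the original post.

# Question 1: print every year >= 2000 that has no "Amer Intl Group" row.
years_without_aig() {
    awk -F, '
        $1 >= 2000 { seen[$1] = 1 }                            # remember each year in range
        $1 >= 2000 && $2 == "Amer Intl Group" { aig[$1] = 1 }  # years in which AIG appears
        END { for (y in seen) if (!(y in aig)) print y }       # years seen without AIG
    ' "$1" | sort -n
}

# Question 2: print each company that was not ranked 1 but held at least 3%.
big_non_leaders() {
    # $5 + 0 converts a value such as "3.03%" to the number 3.03
    awk -F, '$3 != 1 && $5 + 0 >= 3 { print $2 }' "$1" | sort -u
}
```

Usage would be years_without_aig top10_mktval.csv and big_non_leaders top10_mktval.csv. On the excerpt posted above, the only row that qualifies for question 2 is the 2005 Exxon Mobil row (rank 2, 3.03%). Unlike the uniq -c approach, this version does not depend on each year having exactly 10 rows.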