Shell study notes: sort, uniq, cut, tr, split

Date: 2008-12-26  Source: szszszsz

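1.sort
By default sort treats runs of blanks (spaces and tabs) as the field separator; use the -t option for any other separator. With no -k option the whole line is the sort key, which is equivalent to -k1.
   -n sorts the key numerically instead of as text.
For example, create a file 1111.txt with the following contents: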
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
wdwew   77      599     gghgf
eeese   23      22      fgdf
eeese   23      22      fgdf
dfdff   78      55      fdgd
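   -k selects the field where the sort key starts, for example sort by column 2. The comparison is textual, so 311 sorts before 77: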
[root@localhost ~]# sort -k2 1111.txt
eeese   23      22      fgdf
eeese   23      22      fgdf
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
wdwew   77      599     gghgf
dfdff   78      55      fdgd
    -n compares the key by numeric value, for example sort by the numeric value of column 2:
[root@localhost ~]# sort -k2n 1111.txt
eeese   23      22      fgdf
eeese   23      22      fgdf
wdwew   77      599     gghgf
dfdff   78      55      fdgd
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
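
The -t option mentioned at the top of this section has no example yet, so here is a minimal sketch. The file scores.txt and its colon-separated contents are made up for illustration. It also writes the key as -k2,2n, which limits the key to field 2 alone; plain -k2n starts the key at field 2 and lets it run to the end of the line.

```shell
# Hypothetical sample data, colon-separated: name:score
tmp=$(mktemp -d)
printf '%s\n' 'carol:77' 'alice:311' 'bob:23' > "$tmp/scores.txt"

# -t: sets the field separator to a colon
# -k2,2n sorts numerically on field 2 only
sorted=$(sort -t: -k2,2n "$tmp/scores.txt")
printf '%s\n' "$sorted"
# bob:23
# carol:77
# alice:311

rm -rf "$tmp"
```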
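    -u keeps only one line out of each group of lines that compare equal. When a key is given, as with -k2n here, equality is judged on that key and not on the whole line. In this file only the two identical eeese lines share a key:
[root@localhost ~]# sort -k2n -u 1111.txt
eeese   23      22      fgdf         (only one line left)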
wdwew   77      599     gghgf
dfdff   78      55      fdgd
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
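
A small sketch of how -u interacts with a sort key, using a made-up file demo.txt. Two of its lines differ as whole lines but have the same value in column 2, which is the case the 1111.txt example does not cover.

```shell
tmp=$(mktemp -d)
# Hypothetical data: the first two lines differ, but both have 23 in column 2
printf '%s\n' 'aaa 23 x' 'bbb 23 y' 'ccc 7 z' > "$tmp/demo.txt"

# With a key, -u keeps one line per distinct key value,
# so one of the two "23" lines is dropped
by_key=$(sort -k2,2n -u "$tmp/demo.txt")

# Without a key, -u only drops lines that are identical as a whole,
# so all three lines survive
whole=$(sort -u "$tmp/demo.txt")

printf '%s\n' "$by_key" | wc -l   # 2 lines
printf '%s\n' "$whole" | wc -l    # 3 lines

rm -rf "$tmp"
```

If the goal is to drop only fully identical lines while still ordering by column 2, one option is to deduplicate first (sort -u) and then sort the result by the key.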

2.uniq
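With no options, uniq collapses adjacent duplicate lines into one. Duplicates that are not next to each other are not detected, so the input is usually sorted first.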
[root@localhost ~]# uniq 1111.txt
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
wdwew   77      599     gghgf
eeese   23      22      fgdf
dfdff   78      55      fdgd
-u, --unique prints only the lines that are not repeated. Below, the repeated line eeese   23      22      fgdf is dropped entirely:
[root@localhost ~]# uniq -u 1111.txt
sdsad   311     315     asd3f
wdasd   551     133     adsff
sdsad   606     44      fgfdgdf
wdwew   77      599     gghgf
dfdff   78      55      fdgd

-d, --repeated prints only the repeated lines, one copy of each:
[root@localhost ~]# uniq -d 1111.txt
eeese   23      22      fgdf

-c, --count prefixes each line with the number of times it occurred
[root@localhost ~]# uniq -c 1111.txt
      1 sdsad   311     315     asd3f
      1 wdasd   551     133     adsff
      1 sdsad   606     44      fgfdgdf
      1 wdwew   77      599     gghgf
      2 eeese   23      22      fgdf
      1 dfdff   78      55      fdgd
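
In 1111.txt the duplicate lines happen to sit next to each other. The sketch below uses a made-up file fruit.txt to show what happens when they do not, and why uniq is usually fed from sort. The awk call is only there to strip the padding that uniq -c puts in front of the counts.

```shell
tmp=$(mktemp -d)
# Hypothetical data: "apple" appears three times, but only two are adjacent
printf '%s\n' 'apple' 'pear' 'apple' 'apple' > "$tmp/fruit.txt"

# uniq alone merges only the adjacent pair; the first "apple" stays separate
adjacent=$(uniq "$tmp/fruit.txt")
printf '%s\n' "$adjacent"
# apple
# pear
# apple

# Sorting first puts all duplicates side by side, so the counts are complete
counted=$(sort "$tmp/fruit.txt" | uniq -c | awk '{print $1, $2}')
printf '%s\n' "$counted"
# 3 apple
# 1 pear

rm -rf "$tmp"
```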

3.cut
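    -c selects characters by position, for example:
          -c2,5-8 extracts character 2, then characters 5 through 8.
          -c1-50 extracts the first 50 characters.
   -f selects fields (separated by a tab by default; use -d to set another delimiter), for example:
         -f3,5 extracts field 3 and field 5.
         -f2,8-10 extracts field 2, then fields 8 through 10.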
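
The cut options above come without a run, so here is a minimal sketch. The input strings are made up for illustration: a ten-letter string for -c, and a colon-separated record in the style of /etc/passwd for -f together with -d.

```shell
# Character positions: character 2, then characters 5 through 8
line='abcdefghij'
chars=$(printf '%s\n' "$line" | cut -c2,5-8)
printf '%s\n' "$chars"
# befgh

# Fields: -d: switches the delimiter from the default tab to a colon
record='root:x:0:0:root:/root:/bin/bash'
fields=$(printf '%s\n' "$record" | cut -d: -f1,6-7)
printf '%s\n' "$fields"
# root:/root:/bin/bash
```

Note that cut prints the selected fields joined by the same delimiter, and always in input order regardless of the order given to -f.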
4.tr
Convert lowercase letters to uppercase
   tr 'a-z' 'A-Z' <1111.txt
Replace each space with a tab
   tr " " "\t" <1111.txt
Replace each run of spaces with a single tab
   tr -s " " "\t" <1111.txt
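
To make the difference between the last two commands visible, this sketch runs them on a made-up string with one run of three spaces and one run of two. With -s, tr first translates every space to a tab and then squeezes each run of tabs down to one.

```shell
# Hypothetical input: three spaces, then two spaces
text='ab   cd  ef'

upper=$(printf '%s\n' "$text" | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"
# AB   CD  EF

# Every space becomes its own tab: 3 tabs, then 2 tabs
each=$(printf '%s\n' "$text" | tr ' ' '\t')

# -s squeezes the translated runs, leaving one tab per run
squeezed=$(printf '%s\n' "$text" | tr -s ' ' '\t')

# od -c shows the tabs explicitly as \t
printf '%s\n' "$each" | od -c
printf '%s\n' "$squeezed" | od -c
```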

5.split
Take a file aaa.sql with 3532 lines, about 2675 KB in size.
[root@localhost ~]# wc -l aaa.sql
3532 aaa.sql
[root@localhost ~]# ll aaa.sql
-rw-r--r-- 1 root root 2675086 12-28 04:36 aaa.sql
With no options, split cuts the file into pieces of 1000 lines each, named xaa, xab, xac, ...
With -l, split the file into pieces of 1500 lines:
[root@localhost ~]# split -l 1500 aaa.sql
[root@localhost ~]# wc -l xa*
   1500 xaa
   1500 xab
    532 xac
   3532 total
Use -b to split by size instead, here into pieces of 1 MB (1048576 bytes):
[root@localhost ~]# split -b 1m aaa.sql
[root@localhost ~]# ll xa*
-rw-r--r-- 1 root root 1048576 12-28 04:48 xaa
-rw-r--r-- 1 root root 1048576 12-28 04:48 xab
-rw-r--r-- 1 root root 577934 12-28 04:48 xac
Split into pieces of 800 KB (819200 bytes):
[root@localhost ~]# split -b 800k aaa.sql
[root@localhost ~]# ll xa*
-rw-r--r-- 1 root root 819200 12-28 04:49 xaa
-rw-r--r-- 1 root root 819200 12-28 04:49 xab
-rw-r--r-- 1 root root 819200 12-28 04:49 xac
-rw-r--r-- 1 root root 217486 12-28 04:49 xad
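
The notes stop at splitting, so here is a sketch of the round trip: split a file, then put it back together with cat and check the result. The file big.txt and the prefix part_ are made up for illustration; the prefix is passed as split's optional last argument and replaces the default x.

```shell
tmp=$(mktemp -d)

# Hypothetical input: 3532 numbered lines, the same line count as aaa.sql
awk 'BEGIN { for (i = 1; i <= 3532; i++) print i }' > "$tmp/big.txt"

# Pieces of 1500 lines, named part_aa, part_ab, part_ac
split -l 1500 "$tmp/big.txt" "$tmp/part_"
pieces=$(ls "$tmp"/part_* | wc -l)
echo "pieces: $pieces"   # 3 pieces: 1500 + 1500 + 532 lines

# The shell expands part_* in sorted order, so cat restores the original order
cat "$tmp"/part_* > "$tmp/rejoined.txt"

# cmp -s is silent and returns 0 only if the files are byte-for-byte identical
if cmp -s "$tmp/big.txt" "$tmp/rejoined.txt"; then same=yes; else same=no; fi
echo "identical: $same"

rm -rf "$tmp"
```

The same cat step works for pieces made with -b, since the suffixes aa, ab, ac, ... sort in the order the pieces were written.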