Use Gawk to extract titles from XML files
Date: 2008-09-25 Source: suli2921
1. Copy the XML files that contain the titles into one folder, then concatenate them into a single file called all
@echo off
set /p rootlocation=
xcopy %rootlocation%AxSASetup\Target\XmlComp\4d1aff77-e1c9-40ca-b862-d7c30e1eb15d.cmp.xml D:\temp\topic\
xcopy %rootlocation%AxSASetup\Target\XmlComp\bb94701b-458b-48a6-ac47-db8027b320af.cmp.xml D:\temp\topic\
xcopy %rootlocation%AxSASetup\Target\XmlComp\75a4698c-9fba-4056-828e-e4f3940245b8.cmp.xml D:\temp\topic\
copy /b D:\temp\topic\*.cmp.xml D:\temp\topic\all
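For readers who run the gawk step from a Unix-like shell (Cygwin, for example), the same concatenation could be sketched with cat. The directory name topic and the two sample files below are placeholders made up for illustration, not paths from the original script.

```shell
# Sketch: concatenate every *.cmp.xml file in a folder into one file named "all".
# "topic", a.cmp.xml and b.cmp.xml are hypothetical placeholders.
mkdir -p topic
printf '<title>First topic</title>\n' > topic/a.cmp.xml
printf '<title>Second topic</title>\n' > topic/b.cmp.xml

# The glob expands in sorted order, so a.cmp.xml comes before b.cmp.xml.
cat topic/*.cmp.xml > topic/all
```

Because the output file has no .cmp.xml suffix, it is not picked up by the glob on a later run.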
2. gawk -F\<title\> '{print $2}' all | sed 's/<\/title>//; /^$/d' > titlelist
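Explanation:
gawk -F\<title\> '{print $2}' all : read the file all, use <title> as the field separator, and print the second field, which is the title text followed by the closing </title> tag
sed 's/<\/title>//; /^$/d' : delete the closing </title> tag and drop the blank lines produced by input lines that contain no <title>
> titlelist : write the resulting list of titles to the file titlelist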
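A small worked run may make the pipeline easier to follow. The sample XML below is invented for illustration and only stands in for the real concatenated file; the sketch also falls back to plain awk when gawk is not installed, which is an assumption about the reader's environment rather than part of the original command.

```shell
# Made-up sample input standing in for the concatenated file "all".
cat > all <<'EOF'
<?xml version="1.0"?>
<topic>
<title>Installing the client</title>
<body>Some text</body>
</topic>
<topic>
<title>Configuring the server</title>
</topic>
EOF

# Prefer gawk; fall back to whatever awk is available.
AWK=$(command -v gawk || command -v awk)

# Split each line on <title>, print field 2, then strip </title> and blank lines.
"$AWK" -F'<title>' '{print $2}' all | sed 's/<\/title>//; /^$/d' > titlelist

cat titlelist
# Installing the client
# Configuring the server
```

This approach relies on each <title> element sitting on a single line of its own. A title that spans several lines, or several titles on one line, would need a different approach.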