Scraping image URLs from a website with PHP
Date: 2006-12-21 Source: njuguo
A PHP program that extracts the image URLs from a web page
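<?php
// Fetch the page source (requires allow_url_fopen to be enabled).
$fp = fopen('http://www.163.com', 'r');
if ($fp === false) {
    exit('Could not open the URL');
}
$data = '';
while (!feof($fp)) {
    $data .= fread($fp, 8192);
}
fclose($fp);

// Match the src attribute of every <img> tag. If the value starts with a
// quote, group 2 captures everything up to the matching quote; otherwise
// group 3 captures the unquoted value up to the next whitespace or '>'.
preg_match_all("'<\s*img\s.*?src\s*=\s*
    ([\"\'])?
    (?(1) (.*?)\\1 | ([^\s\>]+))
    'isx", $data, $links);

// Collect the non-empty captures: quoted values first, then unquoted ones.
// foreach replaces each(), which was removed in PHP 8.
$match = array();
foreach ($links[2] as $val) {
    if (!empty($val)) {
        $match[] = $val;
    }
}
foreach ($links[3] as $val) {
    if (!empty($val)) {
        $match[] = $val;
    }
}

// Print one URL per line.
foreach ($match as $url) {
    echo $url.'<br>';
}
?>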
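To see how the conditional subpattern treats quoted and unquoted src values, the same pattern can be run against a small hard-coded string. This is only a sketch: the sample markup and the variable names $html, $sets and $srcs are made up for illustration, and PREG_SET_ORDER is used here (unlike in the program above) so that the results stay in page order.

```php
<?php
// Hypothetical sample markup, not taken from any real page.
$html = '<img src="a.gif"><IMG SRC=b.jpg width=1><img src=\'c.png\'>';

// Same pattern as the program; PREG_SET_ORDER yields one array per <img> match.
preg_match_all("'<\s*img\s.*?src\s*=\s*
    ([\"\'])?
    (?(1) (.*?)\\1 | ([^\s\>]+))
    'isx", $html, $sets, PREG_SET_ORDER);

$srcs = array();
foreach ($sets as $m) {
    // Group 2 holds a quoted value, group 3 an unquoted one.
    if (isset($m[2]) && $m[2] !== '') {
        $srcs[] = $m[2];
    } elseif (isset($m[3]) && $m[3] !== '') {
        $srcs[] = $m[3];
    }
}

print_r($srcs);
```

With this sample, $srcs should end up holding a.gif, b.jpg and c.png in that order, covering the double-quoted, unquoted and single-quoted cases.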
Testing the program against the NetEase home page (www.163.com) gives the following results:
http://www.163.com/images/neteaselogo.gif
http://www.163.com/images/button_1.gif
http://www.163.com/images/button_3.gif
http://www.163.com/images/button_4.gif
http://www.163.com/images/button_5.gif
http://adimg.163.com/homepage/eachnet/0721/ebay.gif
+ imgSource +
http://cimg2.163.com/sports/2006/8/1/200608011203471f45a.jpg
http://cimg2.163.com/ent/2006/8/1/200608011518238760f.jpg
http://cimg2.163.com/ent/2006/8/1/200608011524370676d.jpg
http://cimg2.163.com/stock/2006/8/1/2006080111352263ed3.jpg
http://cimg2.163.com/biz/2006/8/1/200608011147230c5ec.jpg
http://cimg2.163.com/digi/2006/7/31/nuo/6075.jpg
http://cimg2.163.com/auto/2006/8/1/20060801134705d9391.gif
http://cimg2.163.com/health/2006/8/1/20060801090700ba56b.jpg
http://cimg2.163.com/lady/2006/8/1/200608011026029a8be.jpg
http://adimg.163.com/homepage/eachnet/0801/img01.gif
http://adimg.163.com/homepage/eachnet/0801/lycra.gif
http://adimg.163.com/homepage/eachnet/0801/ym.gif
http://adimg.163.com/homepage/eachnet/0801/camera_36x36.gif
http://adimg.163.com/homepage/eachnet/0801/cloth_36x36.gif
http://adimg.163.com/homepage/eachnet/0801/toy_36x361.gif
http://cimg.163.com/stock/d.gif
http://cimg.163.com/stock/d.gif
http://images.163.com/images/163homepage/biaoshi.gif
http://images.163.com/bj110.gif
http://adgeo.163.com/ad_cookies
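Embarrassingly, two of the results are wrong (marked in red in the original): "+ imgSource +" and "http://adgeo.163.com/ad_cookies".
The source of the NetEase home page contains the following two lines, which explain them: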
<img src=" + imgSource + " width=185 height=120 alt=点击查看 />
<img src='http://adgeo.163.com/ad_cookies' width="0" height="0">
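One possible way to drop these two kinds of false positives is to filter the matches after extraction. The sketch below is not part of the original program: the function name filter_image_urls and the extension list are assumptions chosen for illustration. It keeps only absolute http(s) URLs whose path ends in a known image extension, so the JavaScript fragment and the extensionless tracking URL are both rejected.

```php
<?php
// Hypothetical helper: keep only candidates that look like absolute image URLs.
function filter_image_urls(array $candidates, array $extensions = array('gif', 'jpg', 'jpeg', 'png'))
{
    $result = array();
    foreach ($candidates as $candidate) {
        $url = trim($candidate);
        $parts = parse_url($url);

        // Reject anything without a scheme and a host,
        // such as the JavaScript fragment " + imgSource + ".
        if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
            continue;
        }
        if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
            continue;
        }

        // Reject URLs whose path has no known image extension,
        // such as http://adgeo.163.com/ad_cookies.
        $path = isset($parts['path']) ? $parts['path'] : '';
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (!in_array($ext, $extensions, true)) {
            continue;
        }

        $result[] = $url;
    }
    return $result;
}

// A few of the results listed above, including the two problem entries.
$match = array(
    'http://www.163.com/images/neteaselogo.gif',
    ' + imgSource + ',
    'http://cimg2.163.com/digi/2006/7/31/nuo/6075.jpg',
    'http://adgeo.163.com/ad_cookies',
);

$images = filter_image_urls($match);
foreach ($images as $url) {
    echo $url . "\n";
}
```

The trade-off is that this filter also discards relative URLs and images served from paths without an extension, so the extension list and the scheme check would need adjusting for pages that use them.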