一段将GB编码转换为utf8的代码
时间:2007-02-17 来源:PHP爱好者
一段将GB编码转换为utf8的代码 gb2utf8.php 文件如下:
Class GB2UTF8
{
var $gb; // 待转换的GB2312字符串
var $utf8; // 转换后的UTF8字符串
var $CodeTable; // 转换过程中使用的GB2312代码文件数组
var $ErrorMsg; // 转换过程之中的错误讯息
function GB2UTF8($InStr="")
{
$this->gb=$InStr;
$this->SetGb2312();
($this->gb=="")?0:$this->Convert();
}
function SetGb2312($InStr="gb2312.txt")
{ // 设置gb2312代码文件,默认为gb2312.txt
$this->ErrorMsg="";
$tmp=@file($InStr);
if (!$tmp) {
$this->ErrorMsg="No GB2312";
return false;
}
$this->CodeTable=array();
while(list($key,$value)=each($tmp)) {
$this->CodeTable[hexdec(substr($value,0,6))]=substr($value,7,6);
}
}
function Convert()
{ // 转换GB2312字符串到UTF8字符串,需预先设置$gb
$this->utf8="";
if(!trim($this->gb) || $this->ErrorMsg!="") {
return ($this->utf8=$this->ErrorMsg);
}
$str=$this->gb;
while($str) {
if (ord(substr($str,0,1))>127)
{
$tmp=substr($str,0,2);
$str=substr($str,2,strlen($str));
$tmp=$this->U2UTF8(hexdec($this->CodeTable[hexdec(bin2hex($tmp))-0x8080]));
for($i=0;$i<strlen ($tmp);$i+=3)
$this->utf8.=chr(substr($tmp,$i,3));
}
else
{
$tmp=substr($str,0,1);
$str=substr($str,1,strlen($str));
$this->utf8.=$tmp;
}
}
return $this->utf8;
}
function U2UTF8($InStr)
{
for($i=0;$i<count($InStr);$i++)
$str="";
if ($InStr <0x80) {
$str.=ord($InStr);
}
else if ($InStr <0x800) {
$str.=(0xC0 | $InStr>>6);
$str.=(0x80 | $InStr & 0x3F);
}
else if ($InStr <0x10000) {
$str.=(0xE0 | $InStr>>12);
$str.=(0x80 | $InStr>>6 & 0x3F);
$str.=(0x80 | $InStr & 0x3F);
}
else if ($InStr <0x200000) {
$str.=(0xF0 | $InStr>>18);
$str.=(0x80 | $InStr>>12 & 0x3F);
$str.=(0x80 | $InStr>>6 & 0x3F);
$str.=(0x80 | $InStr & 0x3F);
}
return $str;
}
}
php爱好者站 http://www.phpfans.net dreamweaver|flash|fireworks|photoshop.
Class GB2UTF8
{
var $gb; // 待转换的GB2312字符串
var $utf8; // 转换后的UTF8字符串
var $CodeTable; // 转换过程中使用的GB2312代码文件数组
var $ErrorMsg; // 转换过程之中的错误讯息
function GB2UTF8($InStr="")
{
$this->gb=$InStr;
$this->SetGb2312();
($this->gb=="")?0:$this->Convert();
}
function SetGb2312($InStr="gb2312.txt")
{ // 设置gb2312代码文件,默认为gb2312.txt
$this->ErrorMsg="";
$tmp=@file($InStr);
if (!$tmp) {
$this->ErrorMsg="No GB2312";
return false;
}
$this->CodeTable=array();
while(list($key,$value)=each($tmp)) {
$this->CodeTable[hexdec(substr($value,0,6))]=substr($value,7,6);
}
}
function Convert()
{ // 转换GB2312字符串到UTF8字符串,需预先设置$gb
$this->utf8="";
if(!trim($this->gb) || $this->ErrorMsg!="") {
return ($this->utf8=$this->ErrorMsg);
}
$str=$this->gb;
while($str) {
if (ord(substr($str,0,1))>127)
{
$tmp=substr($str,0,2);
$str=substr($str,2,strlen($str));
$tmp=$this->U2UTF8(hexdec($this->CodeTable[hexdec(bin2hex($tmp))-0x8080]));
for($i=0;$i<strlen ($tmp);$i+=3)
$this->utf8.=chr(substr($tmp,$i,3));
}
else
{
$tmp=substr($str,0,1);
$str=substr($str,1,strlen($str));
$this->utf8.=$tmp;
}
}
return $this->utf8;
}
function U2UTF8($InStr)
{
for($i=0;$i<count($InStr);$i++)
$str="";
if ($InStr <0x80) {
$str.=ord($InStr);
}
else if ($InStr <0x800) {
$str.=(0xC0 | $InStr>>6);
$str.=(0x80 | $InStr & 0x3F);
}
else if ($InStr <0x10000) {
$str.=(0xE0 | $InStr>>12);
$str.=(0x80 | $InStr>>6 & 0x3F);
$str.=(0x80 | $InStr & 0x3F);
}
else if ($InStr <0x200000) {
$str.=(0xF0 | $InStr>>18);
$str.=(0x80 | $InStr>>12 & 0x3F);
$str.=(0x80 | $InStr>>6 & 0x3F);
$str.=(0x80 | $InStr & 0x3F);
}
return $str;
}
}
php爱好者站 http://www.phpfans.net dreamweaver|flash|fireworks|photoshop.
相关阅读 更多 +