File_Archive is a PEAR package
Date: 2007-07-17 Source: liuxingyuyuni
File_Archive is a PEAR package that lets you easily manipulate tar, gz, bz2, tgz, tbz, zip, ar and deb files.
It lets you generate archives on the fly and write them to a file or to memory...
A few examples show how easy File_Archive is to use:
Extract a tar archive to a sub directory
<?php
require_once "File/Archive.php";
File_Archive::extract(
//The content of archive.tar is extracted
$src = "archive.tar/",
//And is written to folder archive
$dest = "archive"
);
?>
Send a zip archive containing the content of a tar file to the standard output
<?php
require_once "File/Archive.php";
File_Archive::extract(
//The content of archive.tar appears in the root folder (default argument)
$src = "archive.tar/",
//And is written to ...
File_Archive::toArchive( // ... a zip archive
"archive.zip", // called archive.zip
File_Archive::toOutput() // that will be sent to the standard output
)
);
?>
Output a file from an archive
<?php
require_once "File/Archive.php";
//Note that archives appear as folders
File_Archive::extract(
$src = "archive.tar/inner.tgz/file.txt",
File_Archive::toOutput()
);
?>
Readers
A reader is an object that represents a list of files and directories. Those files can be generated dynamically or exist physically. For example, there is a reader class for a directory, or for each archive format handled by File_Archive, and they all have the same interface.
To create a reader, use the File_Archive factory. The most important function is read:
function read($URL, [$symbolic=null], [$uncompression = 0], [$directoryDepth = -1])
The $URL argument identifies what you want to read.
Generation of sources
<?php
require_once "File/Archive.php";
/*
To read a directory, just give the directory name
By default, the directory and all the subdirectories
will be parsed (see $directoryDepth to change this)
*/
$source = File_Archive::read("Path/to/dir");
/*
To read from one single file,
simply provide the name of the file
*/
$source = File_Archive::read("Path/to/dir/file.txt");
/*
An archive is treated as a directory if a slash follows its name.
This example reads all the .txt files in the inner directory
of the archive Path/to/dir/archive.tar
*/
$source = File_Archive::read("Path/to/dir/archive.tar/inner/*.txt");
/*
To uncompress the archive (read all its content), add a trailing slash.
Note: if you omit the trailing /, the archive will be treated as a single file
*/
$source = File_Archive::read("Path/to/dir/archive.tar/");
?>
The $symbolic parameter determines the names under which the files read are exposed for later use.
If $URL is a directory, $URL will be replaced by $symbolic (or '' if $symbolic is null). So, in our first example, the files are named as if the current directory were 'Path/to/dir': since $symbolic is empty by default, 'Path/to/dir' is simply removed from each file name.
You may want to pass 'Path/to/dir' as $symbolic to keep the full path 'Path/to/dir'.
If $URL is a file, then only the filename is kept, and $symbolic is prepended to it. So, in our second example, the source contains a file with the symbolic name 'file.txt'. If the symbolic name 'foo' had been specified, the source would contain 'foo/file.txt'.
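The naming rule above can be sketched in plain PHP. The symbolicName function below is a hypothetical helper written for this illustration, not part of File_Archive; it only models the rule as described here and assumes '/' as the path separator.

```php
<?php
// Hypothetical helper (not part of File_Archive): models how a symbolic
// name is derived from the URL passed to read() and the real path of a file.
function symbolicName($url, $path, $isDir, $symbolic = null)
{
    // An empty or null $symbolic adds no prefix at all.
    $prefix = ($symbolic === null || $symbolic === '')
        ? ''
        : rtrim($symbolic, '/') . '/';

    if ($isDir) {
        // Directory: the $url part of the path is replaced by $symbolic.
        $relative = ltrim(substr($path, strlen(rtrim($url, '/'))), '/');
        return $prefix . $relative;
    }

    // Single file: only the file name is kept, prefixed by $symbolic.
    return $prefix . basename($path);
}

// Directory read with the default (empty) symbolic name
echo symbolicName('Path/to/dir', 'Path/to/dir/sub/a.txt', true), "\n";
// Directory read while keeping the full path
echo symbolicName('Path/to/dir', 'Path/to/dir/sub/a.txt', true, 'Path/to/dir'), "\n";
// Single file read with the symbolic name 'foo'
echo symbolicName('Path/to/dir/file.txt', 'Path/to/dir/file.txt', false, 'foo'), "\n";
```

The three calls print sub/a.txt, Path/to/dir/sub/a.txt and foo/file.txt, matching the cases discussed in the text.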
The $uncompression parameter indicates how many levels of nested archives are uncompressed while walking the tree of files.
By default the files are not uncompressed. So, if you do File_Archive::read('archive.tar/inner/dir', 'inner/dir'), and if archive.tar contains a file called archive.tar/inner/dir/file.tgz, this second archive will appear as a file and not as a directory. It won't be uncompressed because $uncompression is 0.
If $uncompression is set to 1, file.tgz will appear as a directory, but archives found inside it will not be uncompressed.
If $uncompression is set to -1, all the archives will be uncompressed, regardless of the depth.
Note: the archives that appear in $URL itself are not counted by the $uncompression parameter.
The $directoryDepth parameter limits the number of directory levels read by the reader.
Multi readers
Using a multi reader, you can make several sources appear as one.
You can create a multi reader using the File_Archive::readMulti() function
Multi reader
<?php
require_once "File/Archive.php";
//This reader contains the content of directory and archive.tar
$source = File_Archive::readMulti(array(
File_Archive::read("directory"),
File_Archive::read("archive.tar/")
));
?>
Reading the content of a data reader
Any reader provides the following interface:
- function next()
Go to the next file in the source. Returns false when the end of the archive is reached.
- function getFilename()
Returns the filename of the currently selected entry in the archive
- function getStat()
Returns the stat as the stat function does
This function may not return a complete array, it may even return array()
- function getDataFilename()
For optimisation purposes: if the source is a physical file, this function returns the name of the file
Else it returns null
- function getData($length = -1)
Reads some data from the source
This function returns a string whose size is the smallest of
- length, if length>=0
- the number of bytes left before the end of the file
- function skip($length)
Equivalent to getData($length), but does not return any data.
Depending on the data reader, this function can be far more efficient than getData
- function close()
Should be called once you are done with the data reader (it closes the file handles...)
This function returns the object to the state it was in before the first call to next
After this call, you can iterate over the data reader again.
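Writing content of a data reader
<?php
require_once "File/Archive.php";
$source = File_Archive::read("Path/to/dir");
while($source->next())
echo $source->getFilename()."\n";
$source->close(); //Move back to the beginning of the source
while($source->next())
echo $source->getFilename()."\n";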
?>
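The iteration protocol described above (next, getFilename, getData, close) can be exercised without the package installed. The sketch below uses a hypothetical in-memory ArrayReader that only mimics the interface; it is not a File_Archive class. It also assumes that getData returns null once the current entry is exhausted, which should be checked against the real readers.

```php
<?php
// Hypothetical in-memory reader, used only to make the sketch runnable.
// It mimics the interface described above; it is not a File_Archive class.
class ArrayReader
{
    private $names;
    private $contents;
    private $index = -1;  // -1 means "before the first entry"
    private $offset = 0;  // read position inside the current entry

    public function __construct(array $entries)
    {
        $this->names = array_keys($entries);
        $this->contents = array_values($entries);
    }

    public function next()
    {
        $this->index++;
        $this->offset = 0;
        return $this->index < count($this->names);
    }

    public function getFilename()
    {
        return $this->names[$this->index];
    }

    // Assumption: null signals that the current entry is exhausted.
    public function getData($length = -1)
    {
        $data = $this->contents[$this->index];
        if ($this->offset >= strlen($data)) {
            return null;
        }
        $chunk = $length < 0
            ? substr($data, $this->offset)
            : substr($data, $this->offset, $length);
        $this->offset += strlen($chunk);
        return $chunk;
    }

    public function close()
    {
        $this->index = -1;
        $this->offset = 0;
    }
}

// Works with any object exposing next(), getFilename(), getData() and close().
function readAllEntries($reader, $chunkSize = 4)
{
    $result = array();
    while ($reader->next()) {
        $content = '';
        // Read the entry in small chunks instead of all at once.
        while (($chunk = $reader->getData($chunkSize)) !== null) {
            $content .= $chunk;
        }
        $result[$reader->getFilename()] = $content;
    }
    $reader->close(); // back to the state before the first next()
    return $result;
}

$reader = new ArrayReader(array('a.txt' => 'hello world', 'b.txt' => ''));
print_r(readAllEntries($reader));
```

Because close() resets the reader, readAllEntries can be called a second time on the same object and returns the same result.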
Functions that use readers
All File_Archive functions that take a reader as an argument also accept strings and arrays. The strings will be automatically interpreted as a reader using File_Archive::read function. The arrays will be interpreted as a multi reader.
Since the readers are passed by reference, you will have to pass a variable and not the raw string or array.
It is thus possible to rewrite the previous example like this:
Multi reader
<?php
require_once "File/Archive.php";
//The source contains the content of directory and archive.tar
File_Archive::extract(
$src = array("directory", "archive.tar/"),
File_Archive::toArchive("test.zip")
);
?>
Writers
A writer is an object that consumes data. Some writers transform the data (the archive writers), some save it to disk (the file writers) or to memory (the memory writer)... They all implement the same interface.
You can transfer data from a reader to a writer using the File_Archive::extract function
All the writers can be created thanks to the File_Archive factory, and more particularly the File_Archive::to* functions.
Write archives
To create a writer that will generate an archive, use the toArchive function
function toArchive($filename, &$innerWriter, [$type=null], [$stat = array()], [$autoClose = true])
- $filename is the name of the generated archive
- $innerWriter is another writer in which the archive file will be written
- $type is one of Tar, Gzip or Zip and indicates the format of compression. If not specified, the type is determined thanks to the extension of the filename
- $stat is an optional array that gives stats about the archive (see the PHP stat function for the possible indexes)
- $autoClose indicates whether the inner writer will be closed once the data has been sent
It may be useful not to close the writer if you want to append more data afterwards
In general, you won't need to keep the writer open, so you should just keep the default value
Generation of archive writers
<?php
require_once "File/Archive.php";
//$innerWriter stands for any writer, for example one returned by File_Archive::toFiles()
/* Writer to a tar file */
File_Archive::toArchive("archive.tar", $innerWriter);
/* Writer to a tar.gz file */
File_Archive::toArchive("archive.tgz", $innerWriter);
/* Writer to a zip file */
File_Archive::toArchive("archive.zip", $innerWriter);
?>
Write to files
A writer can write the files to physical files. To create such a writer, call File_Archive::toFiles();
If a directory does not exist, it will be created automatically by the files writer.
Use files writer
<?php
require_once "File/Archive.php";
/* Copy a whole directory to another location */
File_Archive::extract(
File_Archive::read("Path/to/dir", "new/directory"),
File_Archive::toFiles()
);
/* Convert an archive to another format: tgz to zip */
File_Archive::extract(
File_Archive::read("archive.tgz/"),
File_Archive::toArchive("archive.zip",
File_Archive::toFiles()
)
);
?>
Send emails
You can send files as email attachments using a mail writer, available through the File_Archive::toMail function
function toMail($to, $headers, $message, &$mail = null)
This function relies on the PEAR Mail and Mail_Mime packages, and the parameters are the same as those of these classes:
- $to: an array or a string of comma-separated recipients
- $headers: an associative array of headers, sent to Mail_Mime and to $mail. The header name is used as key and the header value as value.
- $message: the text version of the body of the mail
You can provide an HTML version using the setHTMLBody and addHTMLImage functions of the writer. The prototypes of these functions are the same as those of Mail_Mime
- $mail: the way to send mail. This is an object created by the Mail::factory function.
If null, Mail::factory("mail") will be used (and the email will be sent using the PHP mail function)
Use mail writer
<?php
require_once "File/Archive.php";
/* Send the files in the current directory (no recursion) as attachment */
File_Archive::extract(
File_Archive::read("Path/to/dir", "", 0, 0),
File_Archive::toMail(
$to, // recipients
array(
"Subject" => "Path/to/dir directory",
"From" => "[email protected]"
),
"Find all the files attached" // body
)
);
?>
Send files to the user
To send files to the remote user (i.e. write data to the standard output), you need a special writer. You can build one by calling File_Archive::toOutput().
This writer will automatically send a header forcing the download of the file.
If you don't want that, call File_Archive::toOutput(false).
Multi writers
Using a multi writer, you can write the data to two or more different locations in parallel.
A typical use is to send the file to the user at the same time as you write it to a file.
It can also be used to generate archives in different formats.
You can create a multi writer using the File_Archive::toMulti($dest1, $dest2) function
Multi writer
<?php
require_once "File/Archive.php";
//Send a directory to the user and to a file
File_Archive::extract(
File_Archive::read("directory"),
File_Archive::toArchive("multi.zip",
File_Archive::toMulti(
File_Archive::toOutput(),
File_Archive::toFiles()
)
)
);
?>
Writing to a writer
It is also possible to write data directly to a writer, without using a reader. To do so, you can use the following interface, implemented by every writer:
- function newFile($URL, $stat)
Creates a new file in the writer
URL is the name of the file, stat is an array of statistics about the data (see the PHP stat function for more information)
The stat array does not have to contain all the information. The only index that must be present is index 7 (size of the data)
- function writeData($data)
Appends the specified data to the writer
A call to newFile must have been made beforehand
- function close()
Closes the writer, flushing the data and writing the footer if needed...
This function must be called before the end of the script, otherwise some data may not be processed by the writer
Dynamic creation of a zip file
<?php
require_once "File/Archive.php";
$dest = File_Archive::toArchive("foo.zip", File_Archive::toFiles());
$dest->newFile("even.txt");
for($i=0; $i<100; $i++)
$dest->writeData((2*$i)."\n");
$dest->newFile("odd.txt");
for($i=0; $i<100; $i++)
$dest->writeData((2*$i+1)."\n");
$dest->close();
?>
Note: If you do not specify the stat array in the newFile function, the majority of the archives will have to buffer the data until the end of the file is reached (this is because the size of the file is usually needed to be able to write the header).
This may be a memory problem if you want to generate really large files.
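One way to avoid that buffering is to announce the size up front through index 7 of the stat array. The sketch below shows the idea with a hypothetical RecordingWriter stand-in, which only records what it receives and is not a File_Archive class; the writeEntry helper is also an illustration, not library code.

```php
<?php
// Hypothetical stand-in writer that records calls, so the sketch runs
// without File_Archive installed. It is not a File_Archive class.
class RecordingWriter
{
    public $files = array();
    public $closed = false;
    private $current = null;

    public function newFile($URL, $stat = array())
    {
        $this->current = $URL;
        $this->files[$URL] = array('stat' => $stat, 'data' => '');
    }

    public function writeData($data)
    {
        $this->files[$this->current]['data'] .= $data;
    }

    public function close()
    {
        $this->closed = true;
    }
}

// Announce the size (stat index 7) before writing, so that a writer
// does not have to buffer the whole entry to learn its size.
function writeEntry($writer, $name, $data)
{
    $writer->newFile($name, array(7 => strlen($data)));
    $writer->writeData($data);
}

// Build the content first so that its size is known.
$even = '';
for ($i = 0; $i < 100; $i++) {
    $even .= (2 * $i) . "\n";
}

$writer = new RecordingWriter();
writeEntry($writer, 'even.txt', $even);
$writer->close();

// Size that was announced to the writer
echo $writer->files['even.txt']['stat'][7], "\n";
```

With the package installed, the same helper could presumably be handed the writer returned by File_Archive::toArchive("foo.zip", File_Archive::toFiles()), since it relies only on the newFile and writeData functions described above. Note that this approach requires the content (or at least its size) to be known before the first writeData call.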
Functions that use writers
All File_Archive functions that take a writer as an argument also accept strings and arrays. The strings will be automatically interpreted as a writer using the File_Archive::appender function. The arrays will be interpreted as a multi writer.
Since the writers are passed by reference, you will have to pass a variable and not the raw string or array.
It is thus possible to rewrite the previous example like this:
Multi writer
<?php
require_once "File/Archive.php";
//Send a directory to the user and to a file
File_Archive::extract(
$src = "directory",
File_Archive::toArchive("multi.zip",
$dest = array(
File_Archive::toOutput(),
File_Archive::toFiles()
)
)
);
?>
Predicates
File_Archive introduces the concept of filters to select files from a source.
A filter is a particular reader that you can create with the File_Archive::filter function.
This function requires you to give a predicate. You can build this predicate using the File_Archive::pred* functions.
The standard predicates are:
- predTrue(): always evaluates to true
- predFalse(): always evaluates to false
- predAnd($p1, $p2, ...): evaluates to $p1 && $p2 && ...
- predOr($p1, $p2, ...): evaluates to $p1 || $p2 || ...
- predNot($p): evaluates to !$p
- predMinSize($size): keep only the files which size is >= $size (in bytes)
- predMinTime($time): keep only the files that have been modified after $time (unix timestamp)
- predMaxDepth($depth): keep only the files that have a public name with less than $depth directories
- predExtension($list): keep only the files with a given extension
$list is an array or a comma separated string of allowed extensions
- predEreg($ereg): keep only the files that have a public name that matches the given regular expression
- predEregi($ereg): same as predEreg, but the test is case insensitive
Filter examples
<?php
require_once "File/Archive.php";
//Extract all the files that contain an 'a' in their path or filename from a tar archive
File_Archive::extract(
File_Archive::filter(
File_Archive::predEreg("a"),
File_Archive::read("archive.tar/", "folder")
),
File_Archive::toFiles()
);
//Compress a directory to a zip file, including only the files
//smaller than 1MB that have changed within the last hour
File_Archive::extract(
File_Archive::filter(
File_Archive::predAnd(
File_Archive::predNot(
File_Archive::predMinSize(1024*1024)
),
File_Archive::predMinTime(time()-3600)
),
File_Archive::read("directory")
),
File_Archive::toArchive("directory.zip",
File_Archive::toFiles()
)
);
?>
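How the logical predicates combine can be pictured with plain closures. This is a conceptual model only: minSize, minTime, not_ and allOf are hypothetical names invented for this sketch, and the entries are hypothetical arrays with 'name', 'size' and 'mtime' keys. None of this is the File_Archive implementation, and closures require PHP 5.3 or later.

```php
<?php
// Conceptual model only: these closures are NOT the File_Archive predicates.
// A predicate is a function that receives an entry and returns a boolean.

// True when the entry is at least $size bytes long.
function minSize($size)
{
    return function ($entry) use ($size) {
        return $entry['size'] >= $size;
    };
}

// True when the entry was modified after $time (Unix timestamp).
function minTime($time)
{
    return function ($entry) use ($time) {
        return $entry['mtime'] > $time;
    };
}

// Negation of a predicate.
function not_($pred)
{
    return function ($entry) use ($pred) {
        return !$pred($entry);
    };
}

// Conjunction of any number of predicates.
function allOf()
{
    $preds = func_get_args();
    return function ($entry) use ($preds) {
        foreach ($preds as $pred) {
            if (!$pred($entry)) {
                return false;
            }
        }
        return true;
    };
}

$now = 1000000; // fixed timestamp so the example is deterministic
$smallAndRecent = allOf(not_(minSize(1024 * 1024)), minTime($now - 3600));

$entries = array(
    array('name' => 'small-new.txt', 'size' => 10, 'mtime' => $now - 60),
    array('name' => 'big-new.bin', 'size' => 2 * 1024 * 1024, 'mtime' => $now - 60),
    array('name' => 'small-old.txt', 'size' => 10, 'mtime' => $now - 7200),
);

$kept = array_values(array_filter($entries, $smallAndRecent));
foreach ($kept as $entry) {
    echo $entry['name'], "\n";
}
```

Only small-new.txt passes both conditions, which mirrors the "smaller than 1MB and changed within the last hour" filter in the second example.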
Archive modification: remove and appenders
File_Archive version 1.3 introduces some new functions to edit existing archives. These functions will allow you to remove or append files to an existing archive.
Since for File_Archive, the file system is just another reader / writer, those modifications can be done on "real" archives (real files), or on nested archives (an archive inside another archive).
Remove files from an existing archive
To remove files from an archive, use one of the following functions from the File_Archive class:
- remove(&$pred, $URL)
Removes all the files that match a given predicate from file $URL
Note that the URL is the same as in the File_Archive::read function
You can use nested archives
- removeFromSource(&$pred, &$source, $URL = null)
Same as remove, but use $source instead of the default file system reader
If no URL is specified, $source must be an archive, and the files will be removed from here
- removeDuplicates($URL)
This function will remove all the duplicates from the archive at $URL
Only the most recent file will be kept
If the modification date is not specified for a file, it will be considered infinitely old
If two files have the same modification date, the one that comes later in the archive (usually the one that was added last) will be kept
- removeDuplicatesFromSource(&$source, $URL = null)
Same as removeDuplicates, but uses $source instead of the default file system reader
If no URL is specified, the duplicates will be removed from $source itself
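The duplicate-resolution rule described above can be modelled in a few lines. The keepMostRecent function below is a hypothetical helper written for illustration, not the library's implementation; it works on a plain list of arrays with a 'name' key and an optional 'mtime' key, given in archive order.

```php
<?php
// Hypothetical helper modelling the rule described above; it is not the
// library's implementation. $entries is a list of arrays with a 'name' key
// and an optional 'mtime' key (Unix timestamp), in archive order.
function keepMostRecent(array $entries)
{
    $best = array(); // name => index of the entry to keep

    foreach ($entries as $i => $entry) {
        $name = $entry['name'];
        // A missing modification date counts as infinitely old.
        $mtime = isset($entry['mtime']) ? $entry['mtime'] : -INF;

        if (!isset($best[$name])) {
            $best[$name] = $i;
            continue;
        }

        $previous = $entries[$best[$name]];
        $previousTime = isset($previous['mtime']) ? $previous['mtime'] : -INF;

        // On equal dates, the entry that comes later in the archive wins.
        if ($mtime >= $previousTime) {
            $best[$name] = $i;
        }
    }

    // Return the survivors in their original archive order.
    $indexes = array_values($best);
    sort($indexes);
    $kept = array();
    foreach ($indexes as $i) {
        $kept[] = $entries[$i];
    }
    return $kept;
}

$kept = keepMostRecent(array(
    array('name' => 'a.txt', 'mtime' => 100),
    array('name' => 'b.txt'),                 // no date: infinitely old
    array('name' => 'a.txt', 'mtime' => 200),
    array('name' => 'b.txt', 'mtime' => 50),
));

foreach ($kept as $entry) {
    echo $entry['name'], ' ', $entry['mtime'], "\n";
}
```

Here the a.txt dated 200 and the b.txt dated 50 survive: the older a.txt and the undated b.txt are dropped.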
Remove Jpg, gif and Bmp from a zip archive
<?php
require_once "File/Archive.php";
File_Archive::remove(
File_Archive::predExtension(
array("jpg", "jpeg", "bmp", "gif")
),
"archive.zip"
);
?>
Remove image files from a nested archive
<?php
require_once "File/Archive.php";
File_Archive::remove(
File_Archive::predMIME(
array("image/*")
),
"archive.zip/data.tgz"
);
?>
Appending files to an archive
To append files to an archive, use one of the following functions from the File_Archive class:
- appender($URL, $unique = null, $type = null, $stat = array())
Lets you append to the archive specified by the URL.
If $unique is set to true, any duplicates created by the insertion of new files will be automatically removed. If set to null, the default value is used (false unless you change it with File_Archive::setOption('appendRemoveDuplicates', true/false))
If the archive does not exist, it will be created using the type given as a parameter (or deduced from the extension in the URL if the type is not specified), and the stat array
- appenderFromSource(&$source, $URL = null, $unique = null, $type = null, $stat = array())
Same as appender, but using $source instead of the default file system reader
Add a folder to an archive
<?php
require_once "File/Archive.php";
File_Archive::extract(
File_Archive::read("folder"),
File_Archive::appender("archive.zip")
);
?>
Note that if you use a string instead of a writer in a function, the string will be converted using File_Archive::appender. Thus the previous example could be rewritten to:
Add a folder to an archive
<?php
require_once "File/Archive.php";
File_Archive::extract(
$src = "folder",
$dest= "archive.zip"
);
?>
Cache
File_Archive 1.4 introduces the possibility of using a cache to store the intermediate results of a zip compression. It uses the Cache_Lite PEAR package to do so.
A zip file is made of compressed files, one after the other. So if you generate an archive that contains files A, B and C and then another archive that contains A and C, you will compress files A and C twice. With the cache, the compressed versions of files A, B and C are saved during the first compression and reused in the second one.
Usage examples
The cache can be (and should be) used if you dynamically create zip archives that frequently contain the same files. For example, you may want to let users select some images, videos or other files from your gallery and download a compressed zip archive that contains these files. If you do so without a cache, your server will respond very slowly when many users request files. With the cache, each file is compressed only once.
On my machine (a thinkpad T42P with default factory equipment), generating a 200MB zip archive takes around 30s of CPU without the cache, 32s of CPU with an empty cache and 2s of CPU if all the files to compress are already in cache.
How to use the cache
The cache is a Cache_Lite object, so you must have installed that package.
Then all you have to do is use the File_Archive::setOption function with the 'cache' parameter.
Set up the cache
<?php
require_once "File/Archive.php";
require_once "Cache/Lite.php";
//Create the cache object
$cache = new Cache_Lite(
array(
//See the doc of Cache_Lite for its constructor parameters
)
);
//Ask File_Archive to use the cache object we just created
File_Archive::setOption("cache", $cache);
//And then create your archives as usual
//Generate a file called archive.zip in the working folder
File_Archive::extract(
$src = "folderToCompress",
$dest = "archive.zip"
);
//Send an archive to the user
File_Archive::extract(
$src = "folderToCompress",
File_Archive::toArchive(
"archive.zip",
File_Archive::toOutput()
)
);
?>
Dynamic generation of archive files for a gallery
One possible use of File_Archive is to dynamically generate archives that contain pictures or videos from a gallery.
The choice of the file format is important if you want the generation to be efficient. Let's look at the options:
- Tar
Pros: generation very efficient, constant memory usage, no need to cache
Cons: no compression (but anyway images or video can hardly be compressed), not as widely used as Zip
- Tgz, Tbz
Pros: very high compression ratio, constant memory usage
Cons: can't be cached, needs a lot of CPU at each generation
- Zip
Pros: intermediate result can be cached, compressed, you can choose the compression level, widely used
Cons: compression ratio lower than for Tgz/Tbz
We will focus on Tar and Zip generation, since Tgz and Tbz are too CPU-expensive for "on the fly" archive generation.
Tar generation
On the fly creation of a TAR archive
<?php
require_once "File/Archive.php";
// $files is an array of path to the files that must be added to the archive
File_Archive::extract(
$files,
File_Archive::toArchive("myGallery.tar", File_Archive::toOutput())
);
?>
Zip generation
The main advantages of Zip generation are that it is not very expensive (thanks to the ability to cache the result) and that the format is widely used. I think the 2 viable options are to generate uncompressed Zip archives (since compressing picture and video files does not reduce their size much) or to generate compressed Zip archives using a cache system.
On the fly creation of an uncompressed ZIP archive
<?php
require_once "File/Archive.php";
File_Archive::setOption("zipCompressionLevel", 0);
// $files is an array of path to the files that must be added to the archive
File_Archive::extract(
$files,
File_Archive::toArchive("myGallery.zip", File_Archive::toOutput())
);
?>
On the fly creation of a compressed ZIP archive with a cache
<?php
require_once "File/Archive.php";
require_once "Cache/Lite.php";
// See the documentation of cache lite for the meaning of the $options array
// fileNameProtection must be left to the default true value
// automaticSerialization is not required and should be left to false
$options = array("cacheDir" => "tmp");
File_Archive::setOption("cache", new Cache_Lite($options));
File_Archive::setOption("zipCompressionLevel", 9);
// $files is an array of path to the files that must be added to the archive
File_Archive::extract(
$files,
File_Archive::toArchive("myGallery.zip", File_Archive::toOutput())
);
?>
Putting it all together
Since generating a zip or a tar archive takes pretty much the same code, you can write a simple script that lets users choose the format they want. The following code is taken from code I actually use in my gallery
Custom archive
<?php
$allowedFormats = array("tar", "zip");
//Read the requested type defensively and escape it before echoing it back
$type = isset($_GET["type"]) ? $_GET["type"] : "";
if(!in_array($type, $allowedFormats, true))
die("Type ".htmlspecialchars($type)." is either unknown or not allowed");
require_once "File/Archive.php";
//Ask File_Archive not to compress the zip archives (compression level 0)
File_Archive::setOption("zipCompressionLevel", 0);
/**
* I skipped the generation of the $files array since it really depends on your gallery
* and what files the user requires
*/
File_Archive::extract(
$files,
File_Archive::toArchive(
"myGallery.".$_GET["type"],
File_Archive::toOutput()
)
);
?>