PHP Zip Archive Extract Entire Folder


I am using PHP's ZipArchive. It is very simple to use, but I am having trouble extracting an entire folder.

$zip=new \ZipArchive;
$zip->open('zipped.docx');
$zip->extractTo('unzipped/',['word/document.xml','word/media/*']);
$zip->close();

I am trying to just unzip a docx document and extract the document.xml as well as the media folder with the images associated with the docx.

I've tried: word/media/*, word/media/, word/media with no success.

Is this possible?


There is 1 answer

Ozan Kurt On

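Yes, it is possible to extract an entire folder using PHP's ZipArchive class. However, extractTo() does not support wildcard patterns like "word/media/*". Its second argument accepts only exact entry names, either a single string or an array of strings, so the names of the files inside the folder have to be collected first.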
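Before deciding which names to pass, it can help to see what the archive actually contains. The following is a minimal sketch, assuming the zip extension is loaded; listZipEntries() is a hypothetical helper name, not part of ZipArchive.

```php
<?php
// Hypothetical helper: returns every entry name stored in a zip archive.
function listZipEntries(string $zipFile): array
{
    $zip = new \ZipArchive;
    if ($zip->open($zipFile) !== true) {
        throw new \RuntimeException("Failed to open $zipFile");
    }

    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        // getNameIndex() returns false on failure, so skip those entries
        $name = $zip->getNameIndex($i);
        if ($name !== false) {
            $names[] = $name;
        }
    }

    $zip->close();
    return $names;
}
```

Printing the result of listZipEntries('zipped.docx') shows the full paths of the stored files. It is common for a DOCX to list files such as word/media/image1.png without any separate "word/media/" directory entry, which is one reason the folder name on its own matches nothing.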

To extract the "word/document.xml" file and the "word/media" folder with its contents, you can follow these steps:

$zip = new \ZipArchive;
$zipFile = 'zipped.docx';
$extractPath = 'unzipped/';

if ($zip->open($zipFile) === true) {
    // Always extract the main document part
    $entries = ['word/document.xml'];

    // Collect every entry stored under word/media/. Scan the whole archive:
    // there may be no explicit directory entry, and the files of one folder
    // are not guaranteed to be stored next to each other.
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $filename = $zip->getNameIndex($i);
        if ($filename !== false && strpos($filename, 'word/media/') === 0) {
            $entries[] = $filename;
        }
    }

    // extractTo() accepts an array of exact entry names
    if (!$zip->extractTo($extractPath, $entries)) {
        echo 'Failed to extract the requested entries.';
    }

    $zip->close();
} else {
    echo 'Failed to open the zip file.';
}

In this code, we first open the zip file using the open() method and start a list of entries to extract with "word/document.xml".

Next, we loop over every entry in the archive with getNameIndex() and add each name that begins with "word/media/" to the list. Scanning the whole archive matters for two reasons: a DOCX often contains no explicit "word/media/" directory entry, so locateName('word/media/') can return false, and the zip format does not guarantee that the files of one folder are stored consecutively. A single extractTo() call with the array then writes all of the files, recreating the word/media/ path under the extraction directory.

Please note that not every DOCX contains a "word/media" folder. A document without images usually has none, and in that case only document.xml is extracted.
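If more than one folder is needed, the same idea can be wrapped in a small function. This is only a sketch under the assumptions above: extractByPrefix() and its parameters are hypothetical names, and it scans every entry in the archive rather than relying on a directory entry being present.

```php
<?php
// Hypothetical helper: extracts every entry whose name starts with one of
// the given prefixes and returns the list of extracted entry names.
function extractByPrefix(string $zipFile, string $extractPath, array $prefixes): array
{
    $zip = new \ZipArchive;
    if ($zip->open($zipFile) !== true) {
        throw new \RuntimeException("Failed to open $zipFile");
    }

    $entries = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name === false) {
            continue;
        }
        foreach ($prefixes as $prefix) {
            if (strpos($name, $prefix) === 0) {
                $entries[] = $name;
                break; // one matching prefix is enough
            }
        }
    }

    // extractTo() returns false for an empty list, so skip the call then
    $ok = $entries === [] || $zip->extractTo($extractPath, $entries);
    $zip->close();

    if (!$ok) {
        throw new \RuntimeException("Failed to extract from $zipFile");
    }
    return $entries;
}
```

For the case in the question, the call would be extractByPrefix('zipped.docx', 'unzipped/', ['word/document.xml', 'word/media/']). Passing a full file name as a prefix works because a name always starts with itself.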