If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the Zip64 End of Central Directory Locator and
therefore cannot read the rest of the file.
This commit fixes this by making sure that we look far enough back into
the file from the end to find this locator, and then use the
information in it to find the Zip64 End of Central Directory Record.
Test added to catch regressions.
Fixes#509.
This method provides a short cut to finding out how many entries are in
an archive by reading this number directly from the central directory,
and not iterating through the entire set of entries.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the End of Central Directory Record and therefore
cannot read the rest of the file.
This commit fixes this by making sure that we look far enough back into
the file from the end to find the EoCDR. Test added to catch
regressions.
Fixes#508.
We were previously trying to work out where the next entry would be,
even with GP bit 3 set, but the logic was flaky and cannot really be
correct given the data available. It's not expected behaviour, so raise
the error instead.
This means that we get rid of the incorrect `Entry.data_descriptor_size`
which was doing more harm than good.
This greatly simplifies the creation of `Entry` objects when only a
couple of fields are not set to their defaults, while at the same time
allowing an `Entry` to be fully configured at creation time if
appropriate.
This fundamentally changes the `Entry` API and means that some
convenience methods in `OutputStream` and `File` have needed to be
refactored.
There was some fairly odd stuff going on in `put_next_entry` that
allowed for data within an `Entry` to be overridden and prevented an
`Entry` from being a single point of truth. Fixing this also simplifies
the code within `File` and still passes all tests.
Also, fixing the above means we can stop passing the compression level
around as a parameter and use the value stored in each `Entry` directly.
Let's keep `compression_level` out of the `Entry` public API though as
it only makes sense when writing an `Entry`: there doesn't seem to be an
obvious way to read what level of compression was used when reading an
`Entry` from a zip file.
It looks like it needs to be surfaced in `add` and `get_output_stream`.
The compression level defaults to whatever the global default is unless
it is overridden on opening the Zip::File.
Also needed to reorder some of the requires in the top-level module file
now that we are using defaults in the File class.
Allow an Entry to specify a compression level and pass this down to the
underlying OutputStream infrastructure. OutputStream has been able to
specify a compression level for a while but this has, up until now, only
ever been set to the default.
This fundamentally changes the API so will need a major version bump.
StringIO objects created within File.open_buffer were not being switched into
binmode, but those passed in were. Fix this inconsistency and add a test.
Things are now more carefully set up, and if a buffer is passed in which
represents a file that already exists then this is taken into account. All
initialization is now done in File.new, rather than being split between there
and File.open_buffer.
This has also needed a bit of a re-write of Zip::File.initialize. I've tried to
bring some logic to it as a result, and have added comments to explain what is
now happening.