Commit Graph

455 Commits

Author SHA1 Message Date
Robert Haines deca4d5aeb Fix de facto regression for input streams.
A close reading of the ZIP spec insists that if bit 3 of the GP flags is set
then the archive cannot be read via `Zip::InputStream`. But in most cases
the correct information is present to be able to do so, both safely and
reliably, and v2.4 does allow this.

This commit ensures that behaviour is present in v3.0.
2025-02-08 16:51:30 +00:00
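For context, bit 3 of the general purpose (GP) flags lives in each local file header, so a reader can check it before deciding whether an entry can be streamed. The following is a minimal sketch of that check only, not rubyzip's code: `data_descriptor?` is a hypothetical helper and the headers are hand-built, truncated samples.

```ruby
# Hypothetical helper, not part of rubyzip: reports whether GP flag bit 3
# (CRC and sizes deferred to a trailing data descriptor) is set in the
# first bytes of a local file header.
LOCAL_HEADER_SIGNATURE = 0x04034b50

def data_descriptor?(header)
  # Layout: signature (4 bytes), version needed (2 bytes), GP flags (2 bytes),
  # all little-endian.
  signature, _version, flags = header.unpack('Vvv')
  raise ArgumentError, 'not a local file header' unless signature == LOCAL_HEADER_SIGNATURE

  (flags & 0b1000) != 0
end

# Hand-built sample headers (only the first eight bytes are needed here).
streamed_header = [LOCAL_HEADER_SIGNATURE, 20, 0b1000].pack('Vvv')
plain_header = [LOCAL_HEADER_SIGNATURE, 20, 0].pack('Vvv')

puts data_descriptor?(streamed_header) # true
puts data_descriptor?(plain_header)    # false
```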
Robert Haines 98881e23d1 Add a test to ensure correct version number format.
Hopefully this will avoid a repeat of the '2.4' debacle...
2025-02-01 16:31:20 +00:00
Jean Boussier 5c6a7c9ad9 Fix `File#write_buffer` to always return the given `io`
Ref: ef89a62b70

This fixes a regression in 2.4.rc1.

Cherry-picked into 3.0 for consistency.
2024-04-09 10:10:47 +01:00
Robert Haines 0c0003cfda Add `DOSTime#absolute_time?`.
This method returns `true` if the time instance was created with accurate
timezone information. Ultimately, only those times parsed from binary
DOS format are missing accurate timezone information, but we need this
flag because Ruby `Time` objects (from which `DOSTime` is descended) always
have a timezone set (usually whatever is local at the time).
2024-04-08 18:37:08 +01:00
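To illustrate why times parsed from binary DOS format cannot carry timezone information, here is a rough sketch of decoding the two 16-bit DOS fields. `decode_dos_time` is a hypothetical helper, not rubyzip's `DOSTime` parser; the point is that the packed format only has room for calendar fields at two-second resolution.

```ruby
# Hypothetical helper: unpack the 16-bit DOS date and time fields.
# There is no bit reserved for a timezone or UTC offset anywhere.
def decode_dos_time(dos_date, dos_time)
  {
    year: ((dos_date >> 9) & 0x7f) + 1980, # 7 bits, years since 1980
    month: (dos_date >> 5) & 0x0f,         # 4 bits
    day: dos_date & 0x1f,                  # 5 bits
    hour: (dos_time >> 11) & 0x1f,         # 5 bits
    minute: (dos_time >> 5) & 0x3f,        # 6 bits
    second: (dos_time & 0x1f) * 2          # 5 bits, stored as seconds / 2
  }
end

fields = decode_dos_time(22_664, 38_052)
puts fields.inspect
```

The sample values decode to 2024-04-08 18:37:08, with nothing to say which timezone that wall-clock time belongs to.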
Robert Haines d53f046bc7 Add `Entry#absolute_time?`.
This method returns `true` if an entry has timezone information in its
timestamps, `false` otherwise.
2024-03-07 21:31:24 +00:00
Robert Haines 1c06454985 Update minimum ruby version to 3.0.
All rubies before 3.0 are EOL and this is a major version bump, so it's
the right time to do this.
2024-03-01 22:14:48 +00:00
OZAWA Sakuro a4f9ec6423 Add compatibility test for Zip::InputStream#read(0) 2023-04-14 11:25:22 +01:00
Robert Haines 84087e5774 Ensure that entries can be extracted safely without path traversal.
This commit adds a parameter to the `File#extract` and `Entry#extract` methods
so that a base destination directory can be specified for extracting archives
in bulk to somewhere in the filesystem that isn't the current working
directory. This directory is `.` by default. It is combined with the entry
path - which shouldn't but could have relative directories (e.g. `..`) in it -
and tested for safety before extracting.

Resolves #540.
2023-04-14 11:15:24 +01:00
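The safety test described above can be sketched as follows. This is only an illustration of the general technique (resolve the combined path, then check that it is still inside the destination directory); `safe_destination` is a hypothetical helper and not the code added to `File#extract` or `Entry#extract`.

```ruby
# Hypothetical helper, not rubyzip's implementation: combine a destination
# directory with an entry path and refuse results that escape the directory.
def safe_destination(entry_name, dest_dir = '.')
  base = File.expand_path(dest_dir)
  # expand_path collapses any '..' segments in the entry path.
  target = File.expand_path(entry_name, base)
  prefix = base.end_with?(File::SEPARATOR) ? base : base + File::SEPARATOR
  return target if target == base || target.start_with?(prefix)

  raise ArgumentError, "entry '#{entry_name}' would extract outside '#{base}'"
end

puts safe_destination('docs/readme.txt', '/tmp/out')

begin
  safe_destination('../../etc/passwd', '/tmp/out')
rescue ArgumentError => e
  puts e.message
end
```

Note that this sketch does not resolve symlinks; it only normalizes the path text.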
Robert Haines 58f053afb0 Only use the Zip64 CDIR end locator if needed.
Previously the central directory Zip64 data was written even if it wasn't
strictly needed. The standard allows for entries to include Zip64 data
(say, if they are streamed and their size is unknown when writing the file
data) without needing any Zip64 data in the central directory. So now we
only write central directory Zip64 data if there are over 65535 files or
the file data is huge.
2023-01-03 20:19:40 +00:00
Robert Haines f460da3afb Prevent unnecessary Zip64 data being stored.
With Zip64 write support enabled by default, it's important that we
only store the extra data when we need to. This commit ensures that
the Zip64 extra data is included for an entry if its size is over
4GB, or if we don't know how big it will be at the point of writing
the local header data.

This commit also removes the need for the Zip64Placeholder extra
data field. Now we just use the Zip64 field itself and ensure it's
filled in correctly.
2023-01-03 20:19:40 +00:00
Stan Hu d6eb73566c Enable Zip64 by default
Previously if RubyZip attempted to create an archive with more than
64K entries, the central directory would truncate the count. `unzip`
and `zipinfo` would fail with an error message such as:

```
error:  expected central file header signature not found (file #93272).
  (please check that you have transferred or created the zipfile in the
  appropriate BINARY mode and that you have compiled UnZip properly)
```

This generated a lot of confusion and a production issue since many
tools fail to decode a RubyZip-created archive if Zip64 is not enabled
for a large number of files. Since Zip64 support is now the norm,
enable this by default.
2023-01-03 20:19:40 +00:00
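The truncation mentioned above follows from the entry count being a 16-bit field in the classic (non-Zip64) end of central directory record. A small sketch of the arithmetic, using an arbitrary count chosen for illustration rather than one taken from the report:

```ruby
# A 16-bit little-endian field can only hold values up to 65535, so a
# larger entry count loses its high bits when packed.
entry_count = 70_000 # arbitrary example value
stored_count = [entry_count].pack('v').unpack1('v')

puts stored_count                          # 4464
puts stored_count == entry_count & 0xFFFF  # true
```

A reader that trusts the stored value would expect 4464 entries and then find central directory data where it expected the archive to end.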
Robert Haines 750d372380 Rename DestinationFileExistsError -> DestinationExistsError.
And define the error message within the class.
2022-08-16 11:13:30 +01:00
Robert Haines e3f0aecf93 Define the EntryNameError message within the error class. 2022-08-16 10:52:18 +01:00
Robert Haines 07eca2bae8 Define the EntrySizeError message within the error class. 2022-08-15 22:02:33 +01:00
Robert Haines 7097492dc8 Define the EntryExistsError message within the error class. 2022-08-14 22:23:51 +01:00
Robert Haines 51231673a4 Define the DecompressionError message within the error class. 2022-08-14 22:23:51 +01:00
Robert Haines 19fe79e31e Define the SplitArchiveError message within the error class. 2022-08-14 22:23:51 +01:00
Robert Haines 03a9ee6b8a Rename `GPFBit3Error` to `StreamingError`.
`GPFBit3Error` doesn't really mean anything to the general user, and
it's not descriptive of the issue at hand. This error is raised when a
zip file cannot be streamed via `InputStream`, so `StreamingError` makes
more sense.

Also standardize the error message while we're about it.
2022-08-14 22:23:51 +01:00
Robert Haines 2e4dd9e0aa Improve the message for CompressionMethodError.
Convert the compression method number into a meaningful text
representation, e.g., "BZIP2" instead of "12".
2022-08-14 22:23:51 +01:00
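The kind of conversion described above might look like the sketch below. The method numbers are the ones assigned in the ZIP specification; the constant, the helper name and the message wording are hypothetical and are not copied from `CompressionMethodError`.

```ruby
# A few compression method numbers from the ZIP specification.
# The table and helper are illustrative, not rubyzip's own.
COMPRESSION_METHOD_NAMES = {
  0 => 'Stored',
  8 => 'Deflated',
  12 => 'BZIP2',
  14 => 'LZMA'
}.freeze

def unsupported_method_message(method_number)
  # Fall back to the raw number when the method is not in the table.
  name = COMPRESSION_METHOD_NAMES.fetch(method_number) { "unknown (#{method_number})" }
  "Unsupported compression method: #{name}"
end

puts unsupported_method_message(12)
puts unsupported_method_message(99)
```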
Robert Haines 08391da4d5 Ensure that `Entry.ftype` is correct via `InputStream`.
When reading an archive with `InputStream`, `Entry.ftype` was returning
`:file` for all entries, even if they were a directory. This is due to
various side-effects in many methods in `Entry`. This commit fixes the
behaviour, but not the side-effects.

Fixes #533.
2022-08-13 22:09:55 +01:00
Brian Williams 6f1ad8b37d Fix unraised error on encrypted archives 2022-08-09 22:11:42 +01:00
Robert Haines 708b7f5393 Add a couple more checks in the tests for double `commit`s.
Just ensure that a `commit` really does stick with both new and edited
zip files.
2022-06-25 08:53:35 +01:00
Robert Haines 14ff11ba05 Re-initialize CDir after a `commit`.
Using the factored-out code preserves options set in `File`.

Fixes #529.
2022-06-25 08:51:32 +01:00
Robert Haines 466383ff1a Add other `Entry` time methods and test them all. 2022-06-20 17:18:20 +01:00
Robert Haines ae0262df2e Add `Entry#zip64?` as a better way to detect Zip64 entries. 2022-06-20 17:18:20 +01:00
Robert Haines 307fc6c6e9 Mark other mutating methods in `Entry` as dirty.
Also, remove `Entry#extra=` as it makes no sense (and wasn't even being
tested).

And remove a slightly odd test that was assuming an archive would not be
changed if its utime was changed - even if it was being changed back
immediately. This test was merely confirming that we weren't catching
timestamp changes correctly.
2022-06-20 17:18:20 +01:00
Robert Haines 33dce510a6 Remove `Entry#dirty=` as 'dirtiness' is now monitored internally.
Had to round out some of the accessors that mark an `Entry` as dirty.
2022-06-20 17:18:20 +01:00
Robert Haines e0e754ae65 Switch how the `Entry::dirty` flag is used.
Set it to true by default - because a new `Entry` is dirty by
definition, having not been written yet. Then make sure that an `Entry`
that is created by reading from a zip file is set as not dirty.
2022-06-20 17:18:19 +01:00
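The dirty-by-default pattern described in these commits can be sketched with a toy class. Everything here (`SketchEntry`, `read_from_archive`, `clean!`) is hypothetical and far simpler than rubyzip's `Entry`; it only shows the idea of new objects starting dirty, read objects starting clean, and mutators setting the flag internally.

```ruby
# Toy illustration only; not rubyzip's Entry class.
class SketchEntry
  attr_reader :name, :comment

  def initialize(name, comment: '')
    @name = name
    @comment = comment
    @dirty = true # a new entry has never been written, so it is dirty
  end

  # Entries built from data already in an archive start out clean.
  def self.read_from_archive(name, comment: '')
    new(name, comment: comment).tap(&:clean!)
  end

  # Mutating accessors mark the entry as dirty; there is no public setter
  # for the flag itself.
  def comment=(value)
    @dirty = true
    @comment = value
  end

  def dirty?
    @dirty
  end

  def clean!
    @dirty = false
  end
end

fresh = SketchEntry.new('new.txt')
loaded = SketchEntry.read_from_archive('old.txt')
puts fresh.dirty?  # true
puts loaded.dirty? # false
loaded.comment = 'edited'
puts loaded.dirty? # true
```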
Robert Haines 48d6acf9ca Ensure all streams passed to `File.new` are in `binmode`.
Previously, only those streams that were passed to `new` by `open_buffer`
were in the correct mode.
2022-06-18 16:19:52 +01:00
Robert Haines 513ce5e5f7 Remove unnecessary encoding change in tests for `File`. 2022-06-18 12:45:59 +01:00
Finn Bacall 451a04f7a2 Test for `Errno::ENOENT` 2022-06-16 20:31:35 +01:00
Finn Bacall 8b87b0e200 Implement `Zip::FileSystem::ZipFsFile#symlink?` 2022-06-16 20:31:35 +01:00
Robert Haines 05a1739069 Properly test `File#mkdir`. 2022-01-22 08:39:43 +00:00
Robert Haines e2e0e23763 Remove `File::add_buffer` from the API.
Its functionality is now replicated in `File::open_buffer` but in a more
secure way.
2022-01-22 07:34:00 +00:00
Robert Haines 044759f502 Fix `OutputStream#put_next_entry` to preserve `StreamableStream`s.
When passing an `Entry` type to `File#get_output_stream` the entry is
used to create a `StreamableStream`, which preserves all the info in the
entry, such as timestamp, etc. But then in `put_next_entry` all that is
lost due to the test for `kind_of?(Entry)` which a `StreamableStream` is
not. See #503 for details.

This change tests for `StreamableStream`s in `put_next_entry` and uses
them directly. Some set-up within `Entry` needed to be made more robust
to cope with this, but otherwise it's a low impact change, which does
fix the problem.

The reason this case was being missed before is that the tests weren't
testing `get_output_stream` with an `Entry` object, so I have also added
that test too.

Fixes #503.
2022-01-20 19:29:40 +00:00
Robert Haines bdbd573290 Remove unnecessary static method from `CentralDirectory`.
`CentralDirectory` shouldn't be in the public API for rubyzip and
there's nothing that `CentralDirectory::read_from_stream` did that
couldn't be done by just initializing an object first. Keeping it around
risked things getting out of date as we streamline and fix other things.
2022-01-17 18:10:17 +00:00
Robert Haines 75503df682 Round out the max comment size tests.
Just sanity check the comment size and the number of entries once the
file has been initialized.
2022-01-17 18:03:04 +00:00
Robert Haines 9c3f8254c7 Fix reading zip64 files with max length file comment.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the Zip64 End of Central Directory Locator and
therefore cannot read the rest of the file.

This commit fixes this by making sure that we look far enough back into
the file from the end to find this locator, and then use the
information in it to find the Zip64 End of Central Directory Record.

Test added to catch regressions.

Fixes #509.
2022-01-17 18:02:39 +00:00
Robert Haines cf258bbb71 Move to ruby 2.5 as the earliest supported version.
2.4 is nearly two years beyond EOL now.

Closes #484.
2022-01-11 22:26:09 +00:00
Robert Haines 14b63f68db Ensure `File.open_buffer` doesn't rewrite unchanged data. 2021-11-30 22:22:37 +00:00
Robert Haines f5e19db273 Add a 100,000 file zip to test `count_entries`. 2021-11-20 20:02:47 +00:00
Robert Haines 22e47641e6 Add `File::count_entries`.
This method provides a short cut to finding out how many entries are in
an archive by reading this number directly from the central directory,
and not iterating through the entire set of entries.
2021-11-20 10:53:00 +00:00
Robert Haines 3db1eff1e3 Add `CentralDirectory#count_entries`.
This method gets the number of entries from a zip archive without
loading all of the individual entries.
2021-11-20 10:50:55 +00:00
Robert Haines 765cb316f1 Fix reading unknown extra fields.
When loading extra fields from both the central directory and local headers,
unknown fields were not merged correctly. They were being appended, which
means that we end up with the two versions stuck together - in some
cases duplicating the field completely.

This broke all kinds of things (like calculating the size of a local
header) in subtle ways.

This commit fixes this by implementing a new `Unknown` extra field type,
and making sure that when reading local and central extra fields they
are stored and preserved correctly. We cannot assume the unknown fields
use the same data in the local and central headers.

Fixes #505.
2021-11-19 19:53:38 +00:00
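To make the problem concrete: an extra field block is a sequence of records, each with a 2-byte header ID, a 2-byte data size and then the data. The sketch below parses such a block and keeps local and central copies of an unknown field apart instead of appending one to the other. `parse_extra_fields` and the `0x9999` ID are hypothetical, and this is not the `Unknown` extra field class introduced by the commit.

```ruby
# Hypothetical parser: split an extra field block into { id => data }.
# If an ID repeats within one block, the later record wins in this sketch.
def parse_extra_fields(bytes)
  fields = {}
  offset = 0
  while offset + 4 <= bytes.bytesize
    id, size = bytes.byteslice(offset, 4).unpack('vv')
    fields[id] = bytes.byteslice(offset + 4, size)
    offset += 4 + size
  end
  fields
end

# The same unknown field ID carrying different data in each header.
local_block = [0x9999, 2].pack('vv') + 'ab'
central_block = [0x9999, 3].pack('vv') + 'xyz'

# Keep the two versions separate rather than concatenating them.
unknown = {
  local: parse_extra_fields(local_block),
  central: parse_extra_fields(central_block)
}

puts unknown[:local][0x9999]   # ab
puts unknown[:central][0x9999] # xyz
```

Appending the two blocks instead would describe a field of a different total length, which is the kind of size miscalculation the commit message mentions.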
Robert Haines f7cd692e15 Fix reading zip files with max length file comment.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the End of Central Directory Record and therefore
cannot read the rest of the file.

This commit fixes this by making sure that we look far enough back into
the file from the end to find the EoCDR. Test added to catch
regressions.

Fixes #508.
2021-11-19 19:35:36 +00:00
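The search window needed here follows from the record layout: the End of Central Directory Record is 22 bytes plus a comment of up to 65,535 bytes, so a reader has to look back at least 22 + 65,535 bytes from the end of the file. A minimal sketch under that assumption, with a hypothetical `find_eocd_offset` helper that is not rubyzip's implementation:

```ruby
require 'stringio'

EOCD_SIGNATURE = [0x06054b50].pack('V')
EOCD_MIN_SIZE = 22        # fixed part of the record
MAX_COMMENT_SIZE = 0xFFFF # longest allowed archive comment

# Hypothetical helper: return the offset of the last EoCDR signature, or nil.
# A real reader would also validate the record, since the comment itself
# could contain the signature bytes.
def find_eocd_offset(io)
  io.seek(0, IO::SEEK_END)
  file_size = io.pos
  window = [file_size, EOCD_MIN_SIZE + MAX_COMMENT_SIZE].min
  io.seek(file_size - window, IO::SEEK_SET)
  tail = io.read(window)
  index = tail.rindex(EOCD_SIGNATURE)
  index && (file_size - window + index)
end

# Hand-built sample: 100 bytes of padding, then an empty-archive EoCDR
# carrying a maximum-length comment.
comment = 'c' * MAX_COMMENT_SIZE
record = EOCD_SIGNATURE + [0, 0, 0, 0, 0, 0, comment.bytesize].pack('vvvvVVv') + comment
sample = StringIO.new(("\0" * 100) + record)

puts find_eocd_offset(sample) # 100
```

A window shorter than 22 + 65,535 bytes would miss the signature in this sample, which matches the failure described in the commit message.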
Robert Haines 54b7762c8f Don't silently alter zip files opened with `Zip::sort_entries`.
Fixes #329.
2021-06-30 23:18:59 +01:00
Robert Haines 19e5f4a8ce Detect and raise GPFBit3Error in `InputStream.get_next_entry`.
We were previously trying to work out where the next entry would be,
even with GP bit 3 set, but the logic was flaky and cannot really be
correct given the data available. It's not expected behaviour, so raise
the error instead.

This means that we get rid of the incorrect `Entry.data_descriptor_size`
which was doing more harm than good.
2021-06-27 21:43:03 +01:00
Robert Haines 8071290ce6 Update and tidy up encryption tests. 2021-06-27 15:56:39 +01:00
Robert Haines 50dddca0be Update encrypted fixtures to remove data descriptors. 2021-06-27 15:54:08 +01:00
Robert Haines aa646ef827 Use named params for `InputStream`. 2021-06-27 10:20:11 +01:00