588 Commits

Author SHA1 Message Date
Robert Haines 31e6688528 Remove unused private method `File#directory?`.
This was a fairly horrible method anyway, for a number of reasons. It
looked like a method that tested whether a name was a 'directory' name
or not, and it did, but it also had some side effects where it would
convert it *to* a directory name in some cases as well. Thankfully,
nothing was using it any more, and as it was private we can lose it
safely. Gone.
2022-01-22 07:38:18 +00:00
Robert Haines e2e0e23763 Remove `File::add_buffer` from the API.
Its functionality is now replicated in `File::open_buffer` but in a more
secure way.
2022-01-22 07:34:00 +00:00
Robert Haines 044759f502 Fix `OutputStream#put_next_entry` to preserve `StreamableStream`s.
When passing an `Entry` type to `File#get_output_stream` the entry is
used to create a `StreamableStream`, which preserves all the info in the
entry, such as timestamp, etc. But then in `put_next_entry` all that is
lost due to the test for `kind_of?(Entry)` which a `StreamableStream` is
not. See #503 for details.

This change tests for `StreamableStream`s in `put_next_entry` and uses
them directly. Some set-up within `Entry` needed to be made more robust
to cope with this, but otherwise it's a low impact change, which does
fix the problem.

The reason this case was being missed before is that the tests weren't
testing `get_output_stream` with an `Entry` object, so I have also added
that test too.

Fixes #503.
2022-01-20 19:29:40 +00:00
Robert Haines 4cf801c5f3 Tidy up `EntrySet` accessors.
`entry_order` is no longer a member, so remove it. `entry_set` should
not be public, but needs to be protected for use in `==`.
2022-01-18 20:09:34 +00:00
Robert Haines 8489ab07d1 `OutputStream`: use a `CentralDirectory` object internally.
Now `CentralDirectory` is a bit cleaner it actually makes sense to use
it here instead of an `EntrySet` and comment separately.
2022-01-17 22:32:56 +00:00
Robert Haines 60f8fffbc2 Reorder methods in `CentralDirectory` with private at the end. 2022-01-17 22:04:45 +00:00
Robert Haines bdbd573290 Remove unnecessary static method from `CentralDirectory`.
`CentralDirectory` shouldn't be in the public API for rubyzip and
there's nothing that `CentralDirectory::read_from_stream` did that
couldn't be done by just initializing an object first. Keeping it around
risked things getting out of date as we streamline and fix other things.
2022-01-17 18:10:17 +00:00
Robert Haines 9c3f8254c7 Fix reading zip64 files with max length file comment.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the Zip64 End of Central Directory Locator and
therefore cannot read the rest of the file.

This commit fixes this by making sure that we look far enough back into
the file from the end to find this locator, and then use the
information in it to find the Zip64 End of Central Directory Record.

Test added to catch regressions.

Fixes #509.
2022-01-17 18:02:39 +00:00
Robert Haines 1d6bfb7e69 Expose the `EntrySet` more cleanly through `CentralDirectory`.
There is now no direct access to the set of entries in a central
directory. This makes the interface cleaner because we now, for example,
add/delete things directly to/from the central directory, rather than
to/from the entry set contained within the central directory.
2022-01-16 11:53:40 +00:00
Robert Haines 34731b1885 `Zip::File` no longer subclasses `Zip::CentralDirectory`.
It has bothered me for years that the central directory is exposed in
this way. A zip file should *have* a central directory, but it should
not *be* one.

This commit starts us down the path of properly separating the two.
2022-01-15 13:10:54 +00:00
Robert Haines f8b9d07022 Round out EOCD data size constants in CDir. 2022-01-12 09:13:15 +00:00
Robert Haines cf258bbb71 Move to ruby 2.5 as the earliest supported version.
2.4 is nearly two years beyond EOL now.

Closes #484.
2022-01-11 22:26:09 +00:00
Robert Haines 14b63f68db Ensure `File.open_buffer` doesn't rewrite unchanged data. 2021-11-30 22:22:37 +00:00
Robert Haines 22e47641e6 Add `File::count_entries`.
This method provides a short cut to finding out how many entries are in
an archive by reading this number directly from the central directory,
and not iterating through the entire set of entries.
2021-11-20 10:53:00 +00:00
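As a rough illustration of the short cut described above, the entry count can be unpacked from the fixed-size part of a standard (non-Zip64) End of Central Directory record without touching any entries. The helper name `count_entries_from_eocd` is hypothetical and is not rubyzip's API; the field layout follows the zip format.

```ruby
# Hypothetical helper for illustration; not rubyzip's implementation.
# Assumes +eocd+ holds the 22-byte fixed part of a non-Zip64 EOCD record.
EOCD_SIGNATURE_VALUE = 0x06054b50

def count_entries_from_eocd(eocd)
  # Layout: signature (4 bytes), disk number (2), disk holding the central
  # directory (2), entries on this disk (2), total entries (2), central
  # directory size (4), central directory offset (4), comment length (2).
  signature, _disk, _cdir_disk, _entries_on_disk, total_entries,
    _cdir_size, _cdir_offset, _comment_length = eocd.unpack('VvvvvVVv')
  raise ArgumentError, 'not an EOCD record' unless signature == EOCD_SIGNATURE_VALUE

  total_entries
end
```

Only 22 bytes are decoded here, so the cost does not depend on how many entries the archive contains.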
Robert Haines 3db1eff1e3 Add `CentralDirectory#count_entries`.
This method gets the number of entries from a zip archive without
loading all of the individual entries.
2021-11-20 10:50:55 +00:00
Robert Haines 6a516fb0b1 Factor out reading EOCD records.
This allows for reading the EOCD records without then automatically
reading all of the entry data as well, so that we can do other things
faster, like provide the number of entries in an archive.
2021-11-20 10:36:32 +00:00
Robert Haines 765cb316f1 Fix reading unknown extra fields.
When loading extra fields from both the central directory and local headers,
unknown fields were not merged correctly. They were being appended, which
means that we end up with the two versions stuck together - in some
cases duplicating the field completely.

This broke all kinds of things (like calculating the size of a local
header) in subtle ways.

This commit fixes this by implementing a new `Unknown` extra field type,
and making sure that when reading local and central extra fields they
are stored and preserved correctly. We cannot assume the unknown fields
use the same data in the local and central headers.

Fixes #505.
2021-11-19 19:53:38 +00:00
Robert Haines f7cd692e15 Fix reading zip files with max length file comment.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the End of Central Directory Record and therefore
cannot read the rest of the file.

This commit fixes this by making sure that we look far enough back into
the file from the end to find the EoCDR. Test added to catch
regressions.

Fixes #508.
2021-11-19 19:35:36 +00:00
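A minimal sketch of the search described above, assuming the whole archive is available as a string. The method name `find_eocd_offset` and the constants are hypothetical, not rubyzip's code. The point is that the search window must cover the 22-byte fixed record plus the largest possible comment.

```ruby
# Hypothetical sketch; not rubyzip's implementation.
EOCD_SIGNATURE = [0x06054b50].pack('V') # the bytes "PK\x05\x06"
EOCD_MIN_SIZE = 22                      # fixed-size part of the record
MAX_COMMENT_SIZE = 0xFFFF               # comment length is a 16-bit field

# Return the byte offset of the EOCD record in +data+, or nil if none is found.
def find_eocd_offset(data)
  data = data.b
  # The record can start up to 22 + 65,535 bytes before the end of the file.
  window_start = [data.bytesize - (EOCD_MIN_SIZE + MAX_COMMENT_SIZE), 0].max
  pos = data.bytesize - EOCD_MIN_SIZE

  while pos >= window_start
    if data.byteslice(pos, 4) == EOCD_SIGNATURE
      # Accept the candidate only if its comment runs exactly to the end.
      comment_length = data.byteslice(pos + 20, 2).unpack1('v')
      return pos if pos + EOCD_MIN_SIZE + comment_length == data.bytesize
    end
    pos -= 1
  end

  nil
end
```

Checking the comment length against the end of the data is one way to avoid matching signature bytes that happen to appear inside the comment itself.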
Robert Haines bc6523ec43 Unpick changes from v2.3.1. 2021-07-05 22:22:01 +01:00
Robert Haines c3b1e5d693 Pick changes from v2.3.1. 2021-07-03 13:45:22 +01:00
Robert Haines 54b7762c8f Don't silently alter zip files opened with `Zip::sort_entries`.
Fixes #329.
2021-06-30 23:18:59 +01:00
Robert Haines 66527ae10d Fix minor typo in `GPFBit3Error` message in `InputStream`. 2021-06-27 21:55:15 +01:00
Robert Haines 19e5f4a8ce Detect and raise GPFBit3Error in `InputStream.get_next_entry`.
We were previously trying to work out where the next entry would be,
even with GP bit 3 set, but the logic was flaky and cannot really be
correct given the data available. It's not expected behaviour, so raise
the error instead.

This means that we get rid of the incorrect `Entry.data_descriptor_size`
which was doing more harm than good.
2021-06-27 21:43:03 +01:00
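The check described above can be sketched as a test of bit 3 in an entry's general purpose flags. The error class and method name below are hypothetical stand-ins rather than rubyzip's own; the bit position comes from the zip format, where bit 3 signals that sizes and CRC follow the data in a data descriptor.

```ruby
# Hypothetical names for illustration; not rubyzip's classes.
GP_FLAG_DATA_DESCRIPTOR = 0x0008 # bit 3 of the general purpose flags

class SketchGPFBit3Error < StandardError; end

# Raise if the local header says the sizes are stored after the data,
# because a forward-only stream cannot know where the entry ends.
def check_streamable!(gp_flags)
  return if (gp_flags & GP_FLAG_DATA_DESCRIPTOR).zero?

  raise SketchGPFBit3Error, 'general purpose flag bit 3 is set'
end
```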
Robert Haines aa646ef827 Use named params for `InputStream`. 2021-06-27 10:20:11 +01:00
Robert Haines f75eb61578 Use named parameters for `File#get_output_stream`. 2021-06-27 10:20:11 +01:00
Robert Haines debc9fda91 Use named parameters for `File::split`. 2021-06-27 10:20:11 +01:00
Robert Haines f033ae760d Use named parameters for `File::new`.
This is a breaking change, but now is the time to do this as we've
already done the same for `Entry::new`.
2021-06-27 10:20:11 +01:00
Robert Haines e1e1cab39c Fix some non-writable `StringIO`s. 2021-06-27 10:20:11 +01:00
Robert Haines 659db85bff `open` and `write_buffer` in `OutputStream` use named params. 2021-06-27 10:20:11 +01:00
Robert Haines 7ae90be63e Fix Style/OptionalBooleanParameter in `OutputStream`. 2021-06-27 10:20:11 +01:00
Robert Haines e7f0aba5ff Fix Style/OptionalBooleanParameter cop in `Entry`.
Just an internal API so safe, and makes things a lot neater.
2021-06-27 10:20:11 +01:00
Robert Haines 8699e356d4 Improve documentation for `File.glob`.
Closes #338.
2021-06-26 20:04:17 +01:00
Robert Haines a301d68eeb Raise an error if entry names exceed 65,535 characters.
Fixes #247.
2021-06-26 19:21:07 +01:00
Robert Haines 49e313629e Remove the `ZipXError` v1 legacy classes. 2021-06-26 17:39:25 +01:00
Robert Haines e000552deb Raise an error on reading a split archive with `InputStream`.
Fixes #349.
2021-06-26 12:39:08 +01:00
Robert Haines 193507b15a Adjust Layout/LineLength cop to 100 characters.
We'll get the line length down in stages...
2021-06-25 22:31:34 +01:00
Robert Haines 84b3e8c644 Ensure `InputStream` raises `GPFBit3Error` for OSX Archive files.
Fixes #493.
2021-06-25 17:53:18 +01:00
Robert Haines 78565db40c Simplify `InputStream.open_entry`.
Also ensure `@complete_entry` is initialized!
2021-06-25 17:53:18 +01:00
Robert Haines ac053bd787 Improve documentation and error messages for `InputStream`.
Closes #196.
2021-06-25 16:58:01 +01:00
Robert Haines 1183607ea1 Flush buffered `OutputStream` on close.
Fixes #265.
2021-06-23 22:24:44 +01:00
Robert Haines 71f2c90b20 Test that a corrupted cdir entry is caught. 2021-06-18 12:08:31 +01:00
Robert Haines afe1892208 Fix a mis-firing CentralDirectory test.
`test_read_from_truncated_zip_file` was not testing what it thought it
was. It was testing whether we caught an out-of-bounds cdir offset, not
whether we caught a corrupted cdir entry.

This commit embraces the actual behaviour and tests that we catch an
out-of-bounds error for both standard `IO`s and `StringIO`s.
2021-06-18 11:44:58 +01:00
Robert Haines 21ba82c67c Move the split signature to the constants file. 2021-06-12 16:29:25 +01:00
Robert Haines 80382135e5 Tidy up some of the file split code. 2021-06-12 16:29:25 +01:00
Robert Haines bd2f15e4bb Extract the `Zip::File::split` code into its own module.
This code is rarely used and may not even be correct according to the
standard. Also this de-clutters the `File` class.
2021-06-12 16:29:06 +01:00
Robert Haines 7df623fb0e Read EOCD record for Zip64 files.
Means we actually read in the file-level comment now!

Fixes #492.
2021-06-11 23:23:34 +01:00
Robert Haines d8111826bf Remove the now redundant `read_zip_*` methods.
We're unpacking headers in chunks now, using `unpack`.
2021-06-11 13:51:40 +01:00
Robert Haines dc27c99eb1 Refactor unpacking the Zip64 eocd record. 2021-06-11 13:50:09 +01:00
Robert Haines 7e254dc581 Refactor unpacking the eocd record.
The old version used some really obfuscated code to perform what is an
essentially fairly simple job.
2021-06-10 22:44:51 +01:00
Robert Haines cd9a3fcad1 Move all the `read_zip_*` methods out of `Entry`.
They were only ever used in `CentralDirectory` anyway.
2021-06-10 17:29:00 +01:00
Robert Haines c0f20321ae Merge branch 'fix_depreciation_warning' of https://github.com/bbuchalter/rubyzip into bbuchalter-fix_depreciation_warning
* 'fix_depreciation_warning' of https://github.com/bbuchalter/rubyzip:
  Use default ruby behavior for Array.join
  Remove OUTPUT_FIELD_SEPARATOR-related test behaviors
  Set OUTPUT_FIELD_SEPARATOR to nil in test
  Prefer OUTPUT_RECORD_SEPARATOR to $\
  Prefer OUTPUT_FIELD_SEPARATOR to $,
2021-06-07 20:02:15 +01:00
Robert Haines 2410f2889e Restore file timestamps on all platforms.
Was only being done on Unix-type filesystems for some reason. Moved code
so that it is run for all files, whatever the underlying platform.
2021-06-06 16:17:22 +01:00
Robert Haines a6c6345084 Set restoring permissions and times as the default. 2021-06-06 16:17:22 +01:00
Robert Haines 684b69f330 Move the restore options to the top level.
This will ensure consistency between `File` and `Entry`.
2021-06-06 16:17:22 +01:00
Robert Haines a4e51f15fc Use constants instead of literals for some `fstype` calls. 2021-06-06 15:02:49 +01:00
Robert Haines 26b7f98c08 Use octal for more obvious definition of file-modes. 2021-06-06 15:02:49 +01:00
Robert Haines 9d8fc05c43 Refactor `get_entry` in `FileSystem::File(::Stat)`.
Rename it to `find_entry` because that is ultimately what is called on
the underlying zip file. Make `FileSystem::File#find_entry` public as it
needs to be called from `FileSystem::File::Stat`, so now we can avoid
`__send__`. Neither class is documented anyway, so no harm done there.
2021-06-06 15:02:49 +01:00
Robert Haines 64a162ced4 Refactor `FileSystem::File::Stat.delegate_to_fs_file`.
Now uses `class_exec` instead of `class_eval`.
2021-06-06 15:02:49 +01:00
Robert Haines 99ecf3638f Remove spurious empty line at start of module. 2021-06-06 15:02:49 +01:00
Robert Haines 7b2e9c7970 Extract `FileSystem::File::Stat` from `FileSystem::File`. 2021-06-06 15:02:49 +01:00
Robert Haines d1329299c3 Extract `FileSystem::File` from the main filesystem file. 2021-06-06 15:02:49 +01:00
Robert Haines a1c9b63e61 Extract `FileSystem::Dir` from the main filesystem file. 2021-06-06 15:02:49 +01:00
Robert Haines 239baef845 Extract `DirectoryIterator` from the main filesystem file. 2021-06-06 15:02:49 +01:00
Robert Haines 204d084fdf Extract `ZipFileNameMapper` from the main filesystem file. 2021-06-06 15:02:49 +01:00
Jan-Joost Spanjers cdef4a5187 Prevent directory not empty error when running file_test on Windows
Fixed error:

ZipFileTest#test_open_buffer_no_op_does_not_change_file:
Errno::ENOTEMPTY: Directory not empty @ dir_s_rmdir - D:/a/_temp/d20210605-6612-1yi35sp
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1335:in `rmdir'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1335:in `block in remove_dir1'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1349:in `platform_support'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1334:in `remove_dir1'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1327:in `remove'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:689:in `block in remove_entry'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1384:in `ensure in postorder_traverse'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1384:in `postorder_traverse'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:687:in `remove_entry'
    C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/tmpdir.rb:101:in `mktmpdir'
    D:/a/rubyzip/rubyzip/test/file_test.rb:136:in `test_open_buffer_no_op_does_not_change_file'

Rationale:

File#dup does not behave like what you would expect from #dup on Ruby.
File#dup calls dup(2), which has OS-dependent behavior.

On Windows, calling File#dup seems to cause an extra reference
to an open file, which prevents deleting that file later.

With this commit, we leave out the call to File#dup on Windows.
It is not clear to me that removing this call has no undesired
consequences, but all other existing tests still succeed.
2021-06-06 14:44:20 +01:00
Jan-Joost Spanjers 8a5fef8074 Fix FileSystem::ZipFileNameMapper#expand_path on Windows
Fixes regression introduced by 0e4dc676a0.
2021-06-06 14:44:20 +01:00
Robert Haines b705085b09 `Entry#name_safe?` now allows Windows drive mappings. 2021-06-06 14:44:20 +01:00
Robert Haines 22a54853e6 Reinstate normalising pathname separators to `/`.
But only do it after we have set filename encoding appropriately to
avoid breaking multibyte characters with `\`s in them.

Fixes #324.
2021-06-04 16:22:45 +01:00
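To illustrate why the order matters, here is a small sketch using Shift_JIS, where some two-byte characters end in the byte 0x5C, which is a backslash on its own. The method names are hypothetical and the choice of Shift_JIS is an assumption made for the example.

```ruby
# Illustrative only; method names are hypothetical.

# Replace separators while treating the name as raw bytes.
def normalise_bytes(name)
  name.b.tr('\\', '/')
end

# Set the encoding first, then replace separators character by character.
def normalise_text(name, encoding)
  name.dup.force_encoding(encoding).tr('\\', '/')
end

# Bytes for a two-byte Shift_JIS character (0x95 0x5C), then "\" and "a".
raw_name = [0x95, 0x5C, 0x5C, 0x61].pack('C*')

bytewise = normalise_bytes(raw_name)                      # damages the first character
charwise = normalise_text(raw_name, Encoding::Shift_JIS)  # only the separator changes
```

With the encoding set, `tr` sees the first two bytes as a single character and leaves them alone.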
Robert Haines 1777a3ff53 Make sure `::Zip.force_entry_names_encoding` is reset.
It was the one option left out of `::Zip.reset!` for some reason.
2021-06-01 22:38:43 +01:00
Ariel Zelivansky f54e3b7f56 Fix improvement & fix NTFS 2021-05-30 10:28:27 +01:00
Ariel Zelivansky 01acd0488a Quick fix to prevent crash when mtime is nil 2021-05-30 10:28:27 +01:00
Robert Haines e70e1d3080 Add `InputStream#size`.
This will enable `InputStream` to be used with external APIs that expect
to be able to query the expected size of data they will receive, such as
S3.

Fixes #451.
2021-05-26 13:35:16 +01:00
Robert Haines ce08405c1a Fix (most) Style/MutableConstant cop errors.
The last one, in `ExtraField` needs a sizeable refactor to fix.
2021-05-25 21:50:06 +01:00
Robert Haines 530afe5d0c Fix Performance/BlockGivenWithExplicitBlock cop. 2021-05-25 21:24:50 +01:00
Robert Haines 55ed74c20e Fix/configure Style/AccessorGrouping cop. 2021-05-25 21:24:50 +01:00
Robert Haines fe998a5aec Fix Layout/EmptyLinesAroundAttributeAccessor cop. 2021-05-25 21:24:50 +01:00
Robert Haines efa23a84ba Fix Style/RedundantBegin cop. 2021-05-25 21:24:50 +01:00
Robert Haines 1b3f4bb7b8 Fix Style/HashConversion cop. 2021-05-25 21:24:50 +01:00
Robert Haines 606b5ffbb2 Fix Lint/EmptyBlock cop. 2021-05-25 21:24:50 +01:00
Robert Haines e10badf68e Fix Style/FrozenStringLiteralComment cop. 2021-05-25 21:24:50 +01:00
Taichi Ishitani 0e4dc676a0 fix frozen string literal error 2021-05-25 21:24:50 +01:00
Robert Haines 8702876e55 Set the default `Entry` time to the file's mtime on Windows.
For some reason this was being skipped on Windows, but not Linux or
MacOS.
2021-05-18 21:59:54 +01:00
Robert Haines 34237efc00 Ensure that `Entry#time=` sets times as `DOSTime` objects.
Fixes #481.
2021-05-18 19:57:03 +01:00
Robert Haines 2b2e0ee568 Bump version to 3.0.0.
There are breaking changes in the recent PRs that have been merged.
2021-05-18 19:23:56 +01:00
Robert Haines 4ed35cae94 DosTime#<=> should return `nil` if other is not comparable. 2021-05-17 19:57:16 +01:00
Robert Haines 6e9f2976d1 Add temporary fix for JRuby to workaround Time cmp bug.
Workaround jruby/jruby#6668 until fix is released.

Version 9.2.18.0 is hopefully the version that will fix this, but we can
adjust the version accordingly if not.
2021-05-17 19:55:13 +01:00
Oleksandr Simonov 7f3bb29487 Merge pull request #464 from hainesr/remove_dosequals
Replace and deprecate `Zip::DOSTime#dos_equals`.
2021-05-03 10:22:13 +03:00
Oleksandr Simonov a0345420d8 Merge branch 'master' into compression_level 2021-02-14 14:26:12 +02:00
Oleksandr Simonov e397af3e0d Merge pull request #447 from jlahtinen/fix_zlib_deflate_buffer_growth
Fix zlib deflate buffer growth
2021-02-14 14:24:36 +02:00
Oleksandr Simonov 9f29d09e02 Merge pull request #462 from hainesr/fix-zis-partial-read
Fix input stream partial read error.
2021-02-14 14:23:13 +02:00
Brian Buchalter ab9f546557 Use default ruby behavior for Array.join
Since we are using the default behavior of OUTPUT_FIELD_SEPARATOR
anyway, let's allow Ruby to manage that default for us, so if it should
change in the future, we don't have to change.
2021-01-26 04:43:26 -07:00
Robert Haines 5a4d1d8b6b Replace and deprecate `Zip::DOSTime#dos_equals`.
Having a specific 'does this instance equal another instance' method is
kind of annoying and breaks a number of things. Most obviously it breaks
comparing to `nil`: `nil.dos_equals(other)` will fail where
`nil == other` does not.

So this commit overrides `<=>` in `Zip::DOSTime` and deprecates
`dos_equals`.
2020-11-28 21:19:58 +00:00
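A sketch of the idea, using a hypothetical stand-in class rather than `Zip::DOSTime` itself. Comparing at two-second granularity is an assumption based on the resolution of DOS timestamps. Defining `<=>` and including `Comparable` gives `==` for free, and `==` then works from either side of a comparison with `nil`.

```ruby
# Hypothetical stand-in for illustration; not rubyzip's `Zip::DOSTime`.
class SketchTime
  include Comparable

  attr_reader :seconds

  def initialize(seconds)
    @seconds = seconds
  end

  # Return nil for anything that is not comparable, as `<=>` conventionally
  # does, so that `Comparable#==` answers false instead of raising.
  def <=>(other)
    return nil unless other.is_a?(SketchTime)

    # Assumption: times within the same two-second slot count as equal.
    (seconds / 2) <=> (other.seconds / 2)
  end
end
```

For example, `SketchTime.new(10) == SketchTime.new(11)` is true under this assumption, while `SketchTime.new(10) == nil` and `nil == SketchTime.new(10)` are both false.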
Robert Haines 2ea805c951 Check `number_of_bytes` before comparison in read.
If an input stream has been read from, and left some data in the
internal buffer, then a subsequent `read`, with no amount of bytes to be
read having been specified, will raise an error when comparing to `nil`.
This fix checks that the number of bytes specified in the `read` is not
`nil` before comparing with the size of the internal buffer.

Fixes #461.
2020-11-08 17:20:53 +00:00
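The guard described above might look something like the following. The method and its arguments are hypothetical, chosen only to show the `nil` check happening before the size comparison.

```ruby
# Hypothetical sketch; names are illustrative and not rubyzip's internals.
# Consumes and returns data from +buffer+, which must be a mutable String.
def read_from_buffer(buffer, number_of_bytes = nil)
  # Without the nil check, `buffer.length > nil` raises an ArgumentError.
  if !number_of_bytes.nil? && buffer.length > number_of_bytes
    buffer.slice!(0, number_of_bytes)
  else
    # No length given, or not enough buffered data: return all of it.
    buffer.slice!(0, buffer.length)
  end
end
```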
Robert Haines 8bafcbbc4d Remove dead code in extra_field/generic.rb (`==`).
From what I can tell this was erroneously copied out of extra_field.rb
during a refactor. It attempts to compare a non-existent Hash that used
to be inherited before the refactor. If this code had been left within
ExtraField it would make more sense, but as it's not needed there either
let's just remove it.

See 20d79feb99 for the refactor.
2020-10-03 18:27:20 +01:00
Robert Haines c2b9aa2893 Correctly read extra fields when opening a zip file.
Previously, only the extra fields stored in the central directory were
being read in. In reality it is often the case that the extra field in
the central directory is just a marker, and the full data is in the
local header. So we need to read both in and merge the two into the
final correct extra field. This merging infrastructure was already
implemented in the extra field code but we never actually read the
local extra fields in until now.

Reading the central directory headers and local entry headers seems
rather fragile, so we can't just read one over the other and hope to end
up with a correctly merged set of extra fields because this breaks other
things. So we need to specifically read the local extra field data and
merge just those bits.

This commit also fixes a couple of tests that were 'broken' by us now
reading extra fields in correctly!
2020-10-03 18:27:20 +01:00
Robert Haines f742994cf2 Abstract out reading extra fields in Entry.
Remove some (almost) duplicated code and get ready for the real fix.
2020-09-20 18:55:39 +01:00
Robert Haines fe1d3c8da0 Fix reading Ux extra field.
As previously implemented the `uid` and `gid` fields could only ever be
read as 0, because they were being initialized to zero and then
memoization (`@uid ||= uid`) was used to 'save' the new value. Using `nil`
as the initial value for either of these fields breaks so many tests, so I
have fixed this by not using memoization instead. This is safe because it
is only the local extra field that holds these values for this type of
extra field.
2020-09-20 18:54:23 +01:00
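The pitfall is easy to reproduce in isolation. In the hypothetical class below (not rubyzip's `Ux` extra field code), `||=` never replaces the initial `0` because `0` is truthy in Ruby, whereas plain assignment does.

```ruby
# Hypothetical class for illustration only.
class SketchIds
  attr_reader :uid

  def initialize
    @uid = 0
  end

  # `||=` assigns only when the current value is nil or false, so an
  # initial value of 0 is kept and the value that was read is discarded.
  def merge_memoized(uid)
    @uid ||= uid
  end

  # Plain assignment always stores the value that was read.
  def merge(uid)
    @uid = uid
  end
end
```

After `merge_memoized(501)` on a fresh object, `uid` is still `0`; after `merge(501)` it is `501`.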
Robert Haines f1dd724a3a Use constants for the compression level gp flags. 2020-08-31 17:48:08 +01:00
Robert Haines cf3f4339f6 Make sure that compression method is STORE for level 0.
Whatever the compression method that is set by the user, if the
compression level is set to 0 (no compression), then the entry should be
STORED. This mimics commandline tool behaviour and matches user
expectations.
2020-08-31 17:48:08 +01:00
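The rule above fits in a few lines. The constant and method names here are hypothetical; the numeric values are the zip format's method codes for stored (0) and deflated (8).

```ruby
# Hypothetical names for illustration; values follow the zip format.
SKETCH_STORED = 0
SKETCH_DEFLATED = 8

# Level 0 means "no compression", so the entry is stored whatever
# compression method was requested.
def effective_compression_method(requested_method, compression_level)
  compression_level.zero? ? SKETCH_STORED : requested_method
end
```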
Robert Haines 0620fba13d Don't use raw numbers for Entry compression types.
Constants for Store and Deflate are already available, so use them. It
might be sensible to remove these local versions, but they do have their
uses as a shortened form.
2020-08-31 17:48:08 +01:00