This was a fairly horrible method anyway, for a number of reasons. It
looked like a method that tested whether a name was a 'directory' name,
and it did, but it also had side effects: in some cases it would convert
the name *to* a directory name. Thankfully, nothing was using it any
more, and as it was private we can lose it safely. Gone.
When passing an `Entry` to `File#get_output_stream`, the entry is used
to create a `StreamableStream`, which preserves all the information in
the entry, such as its timestamp. But in `put_next_entry` all of that is
then lost, because of the test for `kind_of?(Entry)`, which a
`StreamableStream` fails. See #503 for details.
This change tests for `StreamableStream`s in `put_next_entry` and uses
them directly. Some set-up within `Entry` needed to be made more robust
to cope with this, but otherwise it's a low-impact change, and it does
fix the problem.
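A minimal sketch of the now-working scenario (file and entry names invented):

```ruby
require 'zip'

Zip::File.open('example.zip', create: true) do |zip|
  entry = Zip::Entry.new(zip.name, 'hello.txt')
  entry.time = Zip::DOSTime.new(2001, 2, 3, 4, 5, 6)

  # get_output_stream wraps the entry in a StreamableStream; with this
  # fix put_next_entry uses it directly, keeping the timestamp intact.
  zip.get_output_stream(entry) { |os| os.write('Hello!') }
end
```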
The reason this case was missed before is that the tests never exercised
`get_output_stream` with an `Entry` object, so I have added that test as
well.
Fixes #503.
`CentralDirectory` shouldn't be in the public API for rubyzip and
there's nothing that `CentralDirectory::read_from_stream` did that
couldn't be done by just initializing an object first. Keeping it around
risked things getting out of date as we streamline and fix other things.
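The equivalent two-step pattern, as a hedged sketch:

```ruby
# Instead of the removed class-level CentralDirectory.read_from_stream(io),
# initialise an object first and read into it.
cdir = Zip::CentralDirectory.new
cdir.read_from_stream(io)
```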
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the Zip64 End of Central Directory Locator and
therefore cannot read the rest of the file.
This commit fixes this by making sure that we look far enough back into
the file from the end to find this locator, and then use the
information in it to find the Zip64 End of Central Directory Record.
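A hedged sketch of the search (method and constant names invented here; the signature and sizes are from the zip spec):

```ruby
ZIP64_LOCATOR_SIG = [0x07064b50].pack('V') # 'PK', 0x06, 0x07
# Locator (20 bytes) + EOCD record (22 bytes) + max comment (65,535).
MAX_LOCATOR_SEARCH = 20 + 22 + 0xFFFF

def locate_zip64_eocd_locator(io)
  io.seek(0, IO::SEEK_END)
  window = [io.tell, MAX_LOCATOR_SEARCH].min
  io.seek(-window, IO::SEEK_END)
  io.read.rindex(ZIP64_LOCATOR_SIG) # offset within the window, or nil
end
```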
Test added to catch regressions.
Fixes #509.
There is now no direct access to the set of entries in a central
directory. This makes the interface cleaner because we now, for example,
add/delete things directly to/from the central directory, rather than
to/from the entry set contained within the central directory.
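A hedged sketch of the shape this is heading towards (method names illustrative, not the exact API):

```ruby
class CentralDirectory
  # Operations act on the directory itself; the entry set stays private.
  def add_entry(entry)
    @entry_set << entry
  end

  def delete_entry(entry)
    @entry_set.delete(entry)
  end
end
```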
It has bothered me for years that the central directory is exposed in
this way. A zip file should *have* a central directory, but it should
not *be* one.
This commit starts us down the path of properly separating the two.
This method provides a shortcut for finding out how many entries are in
an archive: it reads the count directly from the central directory
rather than iterating through the entire set of entries.
This allows the EOCD records to be read without automatically reading
all of the entry data as well, so that we can do other things faster,
such as reporting the number of entries in an archive.
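As a usage sketch (`count_entries` is a hypothetical name for how this shortcut might be exposed):

```ruby
# Read the entry count straight from the EOCD record, without
# enumerating the entries themselves.
puts Zip::File.count_entries('huge-archive.zip')
```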
When loading extra fields from both the central directory and the local
headers, unknown fields were not merged correctly. They were being
appended, so we ended up with the two versions stuck together - in some
cases duplicating the field completely.
This broke all kinds of things (like calculating the size of a local
header) in subtle ways.
This commit fixes this by implementing a new `Unknown` extra field type,
and making sure that when reading local and central extra fields they
are stored and preserved correctly. We cannot assume the unknown fields
use the same data in the local and central headers.
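A hedged sketch of the idea (the real class layout may differ):

```ruby
# Keep the central and local variants of an unrecognised extra field
# separate, rather than appending one to the other.
class Unknown
  def initialize
    @local_bin   = ''
    @central_bin = ''
  end

  def merge(binstr, local: false)
    if local
      @local_bin = binstr
    else
      @central_bin = binstr
    end
  end

  def to_local_bin
    @local_bin
  end

  def to_c_dir_bin
    @central_bin
  end
end
```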
Fixes #505.
If a zip file has a comment that is 65,535 characters long - which is a
valid length and the maximum allowable length - the initial read of the
archive fails to find the End of Central Directory Record and therefore
cannot read the rest of the file.
This commit fixes this by making sure that we look far enough back into
the file from the end to find the EoCDR. Test added to catch
regressions.
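The arithmetic, sketched (assuming a seekable `io` that responds to `size`):

```ruby
# The EoCDR is 22 bytes of fixed fields plus up to 65,535 bytes of
# comment, so its signature can be at most 65,557 bytes from EOF.
MAX_EOCD_SEARCH = 22 + 0xFFFF
io.seek(-[MAX_EOCD_SEARCH, io.size].min, IO::SEEK_END)
```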
Fixes #508.
We were previously trying to work out where the next entry would be,
even with GP bit 3 set, but the logic was flaky and could not really be
correct given the data available. Reading such entries this way is not
expected behaviour, so we now raise an error instead.
This means that we get rid of the incorrect `Entry.data_descriptor_size`
which was doing more harm than good.
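Hedged illustration, with an invented archive name:

```ruby
Zip::InputStream.open('streamed-entries.zip') do |is|
  is.get_next_entry
  # Entries written with GP bit 3 set keep their sizes in a trailing
  # data descriptor, so the entry boundary cannot be known up front;
  # reading now raises an error (Zip::GPFlagBit3Error) rather than
  # guessing where the next entry starts.
  is.read
end
```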
`test_read_from_truncated_zip_file` was not testing what it thought it
was. It was testing whether we caught an out-of-bounds cdir offset, not
whether we caught a corrupted cdir entry.
This commit embraces the actual behaviour and tests that we catch an
out-of-bounds error for both standard `IO`s and `StringIO`s.
* 'fix_depreciation_warning' of https://github.com/bbuchalter/rubyzip:
Use default ruby behavior for Array.join
Remove OUTPUT_FIELD_SEPARATOR-related test behaviors
Set OUTPUT_FIELD_SEPARATOR to nil in test
Prefer OUTPUT_RECORD_SEPARATOR to $\
Prefer OUTPUT_FIELD_SEPARATOR to $,
Rename it to `find_entry` because that is ultimately what is called on
the underlying zip file. Make `FileSystem::File#find_entry` public as it
needs to be called from `FileSystem::File::Stat`; now we can avoid
`__send__`. Neither class is documented anyway, so no harm done there.
Fixed error:
ZipFileTest#test_open_buffer_no_op_does_not_change_file:
Errno::ENOTEMPTY: Directory not empty @ dir_s_rmdir - D:/a/_temp/d20210605-6612-1yi35sp
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1335:in `rmdir'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1335:in `block in remove_dir1'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1349:in `platform_support'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1334:in `remove_dir1'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1327:in `remove'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:689:in `block in remove_entry'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1384:in `ensure in postorder_traverse'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:1384:in `postorder_traverse'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/fileutils.rb:687:in `remove_entry'
C:/hostedtoolcache/windows/Ruby/2.4.10/x64/lib/ruby/2.4.0/tmpdir.rb:101:in `mktmpdir'
D:/a/rubyzip/rubyzip/test/file_test.rb:136:in `test_open_buffer_no_op_does_not_change_file'
Rationale:
File#dup does not behave the way you would expect #dup to behave in
Ruby. File#dup calls dup(2), which has OS-dependent behaviour.
On Windows, calling File#dup seems to cause an extra reference
to an open file, which prevents deleting that file later.
With this commit, we leave out the call to File#dup on Windows.
It is not clear to me that removing this call has no undesired
consequences, but all other existing tests still succeed.
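The change amounts to a platform guard, sketched here using rubyzip's `RUNNING_ON_WINDOWS` constant:

```ruby
# Only call File#dup where dup(2) behaves as expected; on Windows the
# duplicated descriptor appears to keep the file open, blocking later
# deletion.
stream = Zip::RUNNING_ON_WINDOWS ? file : file.dup
```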
This will enable `InputStream` to be used with external APIs that expect
to be able to query the expected size of data they will receive, such as
S3.
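A hedged usage sketch (archive name invented):

```ruby
Zip::File.open('archive.zip') do |zip|
  io = zip.get_input_stream(zip.entries.first)

  # External APIs (e.g. S3 uploads) can now ask how much data to expect.
  puts io.size if io.respond_to?(:size)
end
```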
Fixes #451.
Work around jruby/jruby#6668 until a fix is released.
Version 9.2.18.0 is hopefully the version that will fix this, but we can
adjust the version accordingly if not.
Since we are using the default behaviour of `OUTPUT_FIELD_SEPARATOR`
anyway, let's allow Ruby to manage that default for us; then if it ever
changes in the future, we won't have to change anything.
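For reference, `Array#join` with no argument already falls back to `$,` (`OUTPUT_FIELD_SEPARATOR`), which defaults to `nil`:

```ruby
# join with no argument uses $,, and a nil separator means plain
# concatenation.
%w[a b c].join # => "abc"
```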
Having a specific 'does this instance equal another instance' method is
kind of annoying and breaks a number of things. Most obviously it breaks
comparing to `nil`: `nil.dos_equals(other)` will fail where
`nil == other` does not.
So this commit overrides `<=>` in `Zip::DOSTime` and deprecates
`dos_equals`.
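To illustrate the difference:

```ruby
t = Zip::DOSTime.now

nil == t          # => false; the standard operators handle nil fine
nil.dos_equals(t) # NoMethodError: undefined method `dos_equals' for nil
```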
If an input stream has been read from, and some data is left in the
internal buffer, then a subsequent `read` with no length specified
raises an error when the `nil` length is compared with the size of the
internal buffer.
This fix checks that the length passed to `read` is not `nil` before
comparing it with the size of the internal buffer.
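The failing sequence looked like this (archive name invented):

```ruby
Zip::InputStream.open('archive.zip') do |is|
  is.get_next_entry
  is.read(100) # a partial read leaves data in the internal buffer
  is.read      # length is nil; previously this comparison raised
end
```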
Fixes #461.
From what I can tell this was erroneously copied out of extra_field.rb
during a refactor. It attempts to compare a non-existent Hash that used
to be inherited before the refactor. If this code had been left within
ExtraField it would make more sense, but as it's not needed there either
let's just remove it.
See 20d79feb99 for the refactor.
Previously, only the extra fields stored in the central directory were
being read in. In reality it is often the case that the extra field in
the central directory is just a marker, and the full data is in the
local header. So we need to read both in and merge the two into the
final correct extra field. This merging infrastructure was already
implemented in the extra field code but we never actually read the
local extra fields in until now.
Reading the central directory headers and local entry headers together
is rather fragile, so we can't just read one over the other and hope to
end up with a correctly merged set of extra fields; doing so breaks
other things. Instead we need to specifically read the local extra field
data and merge just those bits.
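A hedged sketch of the targeted merge (method and helper names hypothetical):

```ruby
# After reading an entry from the central directory, go back to its
# local header and merge in only the extra field bytes found there.
def merge_local_extra_field(io, entry)
  io.seek(entry.local_header_offset)
  local_extra = read_local_extra_bytes(io) # hypothetical helper
  entry.extra.merge(local_extra, local: true)
end
```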
This commit also fixes a couple of tests that were 'broken' by us now
reading extra fields in correctly!
As previously implemented, the `uid` and `gid` fields could only ever be
read as 0, because they were initialized to zero and then memoization
(`@uid ||= uid`) was used to 'save' the new value. Using `nil` as the
initial value for either of these fields breaks too many tests, so I
have fixed this by not using memoization at all. This is safe because
only the local extra field holds these values for this type of extra
field.
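The underlying Ruby gotcha:

```ruby
@uid = 0
@uid ||= 1000 # no-op: 0 is truthy in Ruby, so @uid remains 0

# The fix: assign directly rather than memoizing.
@uid = 1000
```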
Whatever compression method the user sets, if the compression level is
set to 0 (no compression) then the entry should be STORED. This mimics
command-line tool behaviour and matches user expectations.
Constants for Store and Deflate are already available, so use them. It
might be sensible to remove these local versions, but they do have their
uses as a shortened form.
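A hedged sketch of the rule using those constants (`requested_method` and `compression_level` are illustrative names):

```ruby
# Compression level 0 means no compression, so store the entry
# regardless of which method was requested.
compression_method =
  compression_level.zero? ? Zip::Entry::STORED : requested_method
```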