This introduces a basic public YAML REST test plugin intended for use by external
Elasticsearch plugin authors. This is driven by #76215
- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins
Co-authored-by: Mark Vieira <portugee@gmail.com>
This change allows users to disable the GeoIP downloader using elasticsearch.yml, and deletes the .geoip_databases index if the downloader is disabled.
Closes #76586
This change adds an additional assertion in GeoIpDownloaderIT.testInvalidTimestamp, which makes sure that validity checks work both ways (going out of validity and back). It should also fix a race in the cleanUp method that led to occasional failures.
Closes #75221
Closes #74358
Adjust the GeoIpDownloaderIT test suite to wait for managed database files
to be removed after each test.
After each test the geoip downloader is disabled, which should eventually
remove the managed geoip database files. This happens in the background.
However, if a new test starts that assumes the builtin databases are in use,
that test can fail because its assertions will not hold. The changes
in this commit should address this.
Closes #74358
This change fixes a problem with GeoIpProcessor when there is a GeoIpTaskState present in the cluster state but no database matching the one used by the processor. This can happen when some, but not all, databases have already been updated.
This change updates the way we handle net new system indices, which are
those that have been newly introduced and do not require any BWC
guarantees around non-system access. These indices will not be included
in wildcard expansions for user searches and operations. Direct access
to these indices will also not be allowed for user searches.
The first index of this type is the GeoIp index, which this change sets
the new flag on.
Closes #72572
This PR changes the way GeoIpDownloader and GeoIpProcessor handle the situation when we are unable to update databases for 30 days. In that case:
- GeoIpDownloader will delete all chunks from the .geoip_databases index
- DatabaseRegistry will delete all files on ingest nodes
- GeoIpProcessor will tag the document with a tags: ["_geoip_expired_database"] field (the same way as in Logstash)
This change also fixes a bug that breaks DatabaseRegistry when it tries to download databases after only the timestamp was updated (GeoIpDownloader checks whether there are new databases and updates only the timestamp when the local databases are up to date).
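The tagging behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual GeoIpProcessor code: the class name, the method name, and the use of a plain Map as the document are assumptions, and appending to an existing tags list (rather than overwriting it) is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the processor's handling of an expired database.
final class ExpiredDatabaseTagSketch {
    static final String EXPIRED_TAG = "_geoip_expired_database";

    /** Adds the expired-database tag to the document's "tags" list, creating the list if absent. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> tagAsExpired(Map<String, Object> document) {
        Object existing = document.get("tags");
        // Copy any existing tags so that the caller's list is not mutated in place.
        List<Object> tags = existing instanceof List ? new ArrayList<>((List<Object>) existing) : new ArrayList<>();
        if (!tags.contains(EXPIRED_TAG)) {
            tags.add(EXPIRED_TAG);
        }
        document.put("tags", tags);
        return document;
    }
}
```

No geo fields are added in this sketch; the document is only marked so that downstream consumers can detect that the lookup was skipped.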
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.
relates #73784
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
relates #73784
The recent upgrade of the Azure SDK has caused a few test failures that
have been difficult to debug and do not yet have a fix. In particular, they
involve a change to resolving in the netty reactor
(https://github.com/reactor/reactor-netty/issues/1655). We need to wait
for a fix for that issue, so this reverts commit
6c4c4a0ecb.
relates #73493
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.
closes #66555
closes #67214
Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
Due to problems discovered in #72572 we have to disable the geoip downloader for now. We use ingest.geoip.downloader.enabled.default as a feature flag.
This change also reverts changes to docs.
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.
closes #66555
closes #67214
As required by the MaxMind license, we can't use databases that are older than 30 days, as we could miss a "don't sell" request.
This check was missing before and this change fixes that.
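A minimal sketch of such an age check is shown below. The class and method names are hypothetical, and treating a database of exactly 30 days as still valid is an assumption; the source only states that databases older than 30 days must not be used.

```java
import java.time.Duration;

// Hypothetical helper illustrating the 30-day validity rule.
final class DatabaseValiditySketch {
    static final Duration MAX_AGE = Duration.ofDays(30);

    /**
     * Returns true if the database was last confirmed up to date no more than
     * 30 days before "now". Both arguments are epoch milliseconds.
     */
    static boolean isValid(long lastCheckMillis, long nowMillis) {
        return nowMillis - lastCheckMillis <= MAX_AGE.toMillis();
    }
}
```

Passing the current time in as a parameter, instead of reading the system clock inside the method, keeps the check easy to test.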
Instead of doing a refresh as part of each index request, perform
this separately after all chunks have been indexed.
Also perform a flush, so that the translog is trimmed and
doesn't contain all these large write operations (1mb) until
an automatic refresh happens (which may take a while since
no other indexing will take place for a while).
This change removes an assertion from DatabaseRegistry. We can easily lose the .geoip_databases index while the persistent task state is still in the cluster state. This is not an assertion failure but an ordinary failure, and it should be signalled as one.
This also tries to fix packaging tests by avoiding duplicates in elasticsearch.yml.
Closes #71762
This PR adds documentation for GeoIPv2 auto-update feature.
It also changes the related setting names from geoip.downloader.* to ingest.geoip.downloader.* to follow the same convention as the current setting.
Relates to #68920
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This change enables GeoIP downloader by default.
It removes the feature flag but adds a flag that is used by tests to disable it again (as we don't want to hammer the GeoIP database service with every test cluster we spin up).
Relates to #68920
* Warn users if security is implicitly disabled
Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it can also lead to unintended insecure
clusters.
This change introduces clear warnings when security features are
implicitly disabled.
- a warning header in each REST response if security is implicitly
disabled;
- a log message during cluster boot.
This change fixes a number of problems in the GeoIPv2 code:
- closes streams from Files.list in GeoIpCli, which should fix tests on Windows
- makes sure that the total download time in GeoIP stats is non-negative (we serialize it as a vInt, which can cause problems with negative numbers, and a negative value can occur when the clock was changed during the operation)
- fixes handling of failed/simultaneous downloads. #69951 was meant to prevent two persistent tasks from indexing chunks at the same time, but it would also prevent any update if a single download failed mid-indexing. This change uses the timestamp (lastUpdate) as a sort of UUID. This should still prevent two tasks from stepping on each other's toes (overwriting chunks), and in the end only a single task should be able to update the task state (this is handled by the persistent tasks framework)
Closes #71145
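The first two fixes in the list above can be illustrated with a short sketch. The class and method names are hypothetical, and clamping the elapsed time to zero is an assumption about how non-negativity might be enforced; the source does not say how GeoIpCli or the stats code actually does it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helpers; not the actual GeoIpCli or stats code.
final class GeoIpFixesSketch {
    /** Elapsed time that never goes negative, even if the wall clock moved backwards. */
    static long elapsedMillis(long startMillis, long endMillis) {
        return Math.max(0L, endMillis - startMillis);
    }

    /** Counts directory entries, closing the Files.list stream so no directory handle stays open. */
    static long countEntries(Path directory) {
        try (Stream<Path> entries = Files.list(directory)) {
            return entries.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Files.list holds an open directory handle until the stream is closed, which is why the try-with-resources form matters on Windows, where open handles can block later deletes.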
Ensure that the index request is routed to the ingest node,
so that the lazy loading of the geoip database occurs
on that ingest node (which is what is asserted later on).
Otherwise the database is lazily loaded on a different node.
(Without this fix, this test fails reproducibly with
`-Dtests.seed=2E234CC71CE96F4F`.)
Closes #71251
This change adds an additional test to GeoIpDownloaderIT which checks that artifacts produced by the GeoIP CLI tool can be consumed by the cluster the same way as those from our original service.
It does so by running the tool from a fixture which then simply serves the generated files (this is exactly the way users are supposed to use the tool as well).
Relates to #68920
Today when creating an internal test cluster, we allow the test to
supply the node settings that are applied. The extension point to
provide these settings has a single integer parameter, indicating the
index (zero-based) of the node being constructed. This allows the test
to make some decisions about the settings to return, but it is too
simplistic. For example, imagine a test that wants to provide a setting,
but some values for that setting are not valid on non-data nodes. Since
the only information the test has about the node being constructed is
its index, it does not have sufficient information to determine if the
node being constructed is a non-data node or not, since this is done by
the test framework externally by overriding the final settings with
specific settings that dictate the roles of the node. This commit changes
the test framework so that the test has information about what settings
are going to be overridden by the test framework after the test provides
its test-specific settings. This allows the test to make informed
decisions about what values it can return to the test framework.
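A rough sketch of how a test might use the extra information is shown below. Plain maps stand in for the framework's settings objects, and the setting keys ("node.roles", "example.data_only.setting") as well as the class and method names are hypothetical placeholders chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a test's settings extension point.
final class NodeSettingsSketch {
    /**
     * @param nodeOrdinal   zero-based index of the node being constructed
     * @param otherSettings settings the framework will apply on top of the returned ones
     * @return the test-specific settings for this node
     */
    static Map<String, String> nodeSettings(int nodeOrdinal, Map<String, String> otherSettings) {
        Map<String, String> settings = new HashMap<>();
        // Only add the data-node-only setting when the framework will give this node the data role.
        if (hasRole(otherSettings.getOrDefault("node.roles", ""), "data")) {
            settings.put("example.data_only.setting", "true");
        }
        return settings;
    }

    private static boolean hasRole(String commaSeparatedRoles, String role) {
        for (String candidate : commaSeparatedRoles.split(",")) {
            if (candidate.trim().equals(role)) {
                return true;
            }
        }
        return false;
    }
}
```

With only the node index available, the branch on the data role would not be possible, which is the limitation this commit describes.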
Air-gapped environments can't simply use the GeoIp database service provided by Infra, so they have to either use a proxy or recreate a similar service themselves.
This PR adds a tool to make this process easier. The basic workflow is:
- download databases from the MaxMind site to a single directory (either .mmdb files or gzipped tarballs with a .tgz suffix)
- run the tool with $ES_PATH/bin/elasticsearch-geoip -s directory/to/use [-t target/directory]
- serve static files from that directory (for example with docker run -v directory/to/use:/usr/share/nginx/html:ro nginx)
- use the server above as the endpoint for GeoIpDownloader (geoip.downloader.endpoint setting)
- to update databases, simply put the new files in the directory and run the tool again
This change also adds support for relative paths in the overview JSON because the CLI tool doesn't know the address it will be served under.
Relates to #68920
In DatabaseRegistry we tried to replace a file that was still open. This is not a problem on Linux and macOS, but Windows doesn't like it.
It was caught by our CI with reproducible failures when WindowsFS was set up by Lucene.
Now we skip one temp file and use GZIPInputStream directly, which fixes this problem.
Marking as non-issue since the code has not been released yet.
Closes #70977
Closes #71006
We have to ship COPYRIGHT.txt and LICENSE.txt files alongside .mmdb files for legal compliance. Infra will pack these in a single .tgz (gzipped tar) archive provided by the GeoIP databases service.
This change adds support for that format to GeoIpDownloader and DatabaseRegistry.
A node in the GeoIpStats response may have no databases field if no databases have been downloaded to that node yet. We have to check that the key is there before processing it to avoid an NPE.
Fixes #70789
This change adds a _geoip/stats endpoint that can be used to collect basic data about the geoip downloader (successful, failed and skipped downloads, current db count and total time spent downloading).
It also fixes missing or wrong origins on clients, which would break when used with security.
Relates to #68920
This change adjusts where the geoip tmp directory is created
to avoid issues when running multiple nodes on the same machine.
In the java tmp dir, a 'geoip-databases' directory is created and
directly under this directory a directory with the node id as name is created.
This allows safely running multiple nodes on the same machine (this
happens mainly during tests).
Closes#69972
Relates to #68920
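The directory layout described above can be sketched as a simple path computation. The class and method names are hypothetical; only the 'geoip-databases' directory name and the use of the node id as the subdirectory name come from the description.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper showing the per-node tmp directory layout.
final class GeoIpTmpDirSketch {
    /** Resolves <javaTmpDir>/geoip-databases/<nodeId>. */
    static Path nodeTmpDir(Path javaTmpDir, String nodeId) {
        return javaTmpDir.resolve("geoip-databases").resolve(nodeId);
    }

    /** Same as above, rooted at the JVM's java.io.tmpdir system property. */
    static Path nodeTmpDir(String nodeId) {
        return nodeTmpDir(Paths.get(System.getProperty("java.io.tmpdir")), nodeId);
    }
}
```

Because the node id is part of the path, two nodes sharing the same java tmp dir end up with separate directories.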
This change modifies GeoIpDownloaderIT to keep waiting in assertBusy despite errors (by wrapping the whole body in try-catch) and adds an additional assertion to debug failures tracked in #69594
The test failure looks legit, because there is a possibility that the same database
was downloaded twice. See the added comment in the DatabaseRegistry class.
Relates to #69972
This test predefined the expected md5 hashes in constants, using the values produced with java15.
However, java16 creates different md5 hashes, so the expected md5 hashes don't match
the actual md5 hashes, which caused tests in this test suite to fail (when running
with java16 only).
The tests now generate the expected md5 hash during the test instead of using predefined constants.
Closes #69986
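Computing the expected hash at test time, rather than hard-coding it, might look like the sketch below. The class and method names are hypothetical; the idea is that the test hashes the very bytes it is about to serve, so the expectation follows whatever bytes the running JDK produces.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical test helper; not the code used in the actual test suite.
final class Md5Sketch {
    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                // Mask to an unsigned value so each byte prints as exactly two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        }
    }
}
```

A test would then compare md5Hex of the generated payload with the hash recorded by the code under test, instead of comparing against a constant.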
Wait for ingest threads to stop using the DatabaseReaderLazyLoader, so that during the next run the db update thread doesn't try to remove the db again (because the file hasn't yet been deleted).
Also delete the tmp dirs this test creates at the end of the test, so that repeating this test many times doesn't accumulate many directories with database files.
Closes #69980
This change switches the clean up in DatabaseRegistry.initialize from using Files.walk and stream operations to Files.walkFileTree, which can be made more robust in case of errors.
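As an illustration of why a visitor can be more robust, the sketch below deletes a directory tree and keeps going when individual entries fail. It is not the DatabaseRegistry implementation; the class name, the method name, and the choice to collect failures in a list are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical cleanup helper built on Files.walkFileTree.
final class CleanupSketch {
    /** Deletes everything under root (including root) and returns the paths that could not be removed. */
    static List<Path> deleteTree(Path root) {
        List<Path> failures = new ArrayList<>();
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    tryDelete(file, failures);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException e) {
                    // An unreadable entry is recorded instead of aborting the whole walk.
                    failures.add(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException e) {
                    // Directories are removed after their contents have been visited.
                    tryDelete(dir, failures);
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return failures;
    }

    private static void tryDelete(Path path, List<Path> failures) {
        try {
            Files.deleteIfExists(path);
        } catch (IOException e) {
            failures.add(path);
        }
    }
}
```

With Files.walk and stream operations, an exception thrown part-way through typically ends the traversal; the visitor callbacks give one place per entry to decide whether to continue.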
When the geoip.downloader.enabled setting changes we should try to start/stop the geoip task from a single node only, since requests from other nodes will definitely fail.
This change also extends the timeout in GeoIpDownloaderIT, as the current short one sometimes fails in CI.