elasticsearch/docs/reference/sql/endpoints/jdbc.asciidoc

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

244 lines
8.2 KiB
Plaintext
Raw Normal View History

2018-06-23 06:40:25 +08:00
[role="xpack"]
[[sql-jdbc]]
== SQL JDBC
{es}'s SQL jdbc driver is a rich, fully featured JDBC driver for {es}.
It is Type 4 driver, meaning it is a platform independent, stand-alone, Direct to Database,
pure Java driver that converts JDBC calls to {es-sql}.
[[sql-jdbc-installation]]
[discrete]
=== Installation
The JDBC driver can be obtained from:
Dedicated page::
https://www.elastic.co/downloads/jdbc-client[elastic.co] provides links, typically for manual downloads.
Maven dependency::
https://maven.apache.org/[Maven]-compatible tools can retrieve it automatically as a dependency:
["source","xml",subs="attributes"]
----
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>x-pack-sql-jdbc</artifactId>
<version>{version}</version>
</dependency>
----
from https://search.maven.org/artifact/org.elasticsearch.plugin/x-pack-sql-jdbc[Maven Central Repository],
or from `artifacts.elastic.co/maven` by adding it to the repositories list:
["source","xml",subs="attributes"]
----
<repositories>
<repository>
<id>elastic.co</id>
<url>https://artifacts.elastic.co/maven</url>
</repository>
</repositories>
----
[[jdbc-compatibility]]
[discrete]
=== Version compatibility
include::version-compat.asciidoc[]
[[jdbc-setup]]
[discrete]
=== Setup
The driver main class is `org.elasticsearch.xpack.sql.jdbc.EsDriver`.
2021-03-31 21:57:47 +08:00
Note the driver implements the JDBC 4.0 +Service Provider+ mechanism meaning it is registered automatically
as long as it is available in the classpath.
Once registered, the driver understands the following syntax as an URL:
["source","text",subs="attributes"]
----
jdbc:[es|elasticsearch]://[[http|https]://]?[host[:port]]?/[prefix]?[\?[option=value]&]*
----
`jdbc:[es|elasticsearch]://`:: Prefix. Mandatory.
`[[http|https]://]`:: Type of HTTP connection to make. Possible values are
`http` (default) or `https`. Optional.
`[host[:port]]`:: Host (`localhost` by default) and port (`9200` by default).
Optional.
`[prefix]`:: Prefix (empty by default). Typically used when hosting {es} under
a certain path. Optional.
`[option=value]`:: Properties for the JDBC driver. Empty by default.
Optional.
The driver recognized the following properties:
[[jdbc-cfg]]
[discrete]
===== Essential
[[jdbc-cfg-timezone]]
`timezone` (default JVM timezone)::
Timezone used by the driver _per connection_ indicated by its `ID`.
*Highly* recommended to set it (to, say, `UTC`) as the JVM timezone can vary, is global for the entire JVM and can't be changed easily when running under a security manager.
[[jdbc-cfg-network]]
[discrete]
===== Network
`connect.timeout` (default `30000`)::
Connection timeout (in milliseconds). That is the maximum amount of time waiting to make a connection to the server.
`network.timeout` (default `60000`)::
Network timeout (in milliseconds). That is the maximum amount of time waiting for the network.
`page.size` (default `1000`)::
Page size (in entries). The number of results returned per page by the server.
SQL: fix use of requestTimeout and pageTimeout query parameters (#79360) Resolves #72151 The _sql endpoint offers a `page_timeout` parameter for customizing how long scroll contexts should be kept open (if needed) and a `request_timeout` parameter which the docs describe as "Timeout before the request fails.". Currently, the value of the `page_timeout` parameter is used as the `timeout` in subsequent _search requests and not as the timeout in the `scroll` configuration. For the `scroll` configuration, SQL uses the `request_timeout` parameter. This PR addresses the issue by swapping the uses of `page_timeout` and `request_timeout` in querier. Additionally, the PR removes some unused artifacts that might have caused some confusion: - The `timeout` and `keepAlive` fields in `Querier`. Instead, `Querier` directly uses the according fields in `SqlConfiguration`. - The `SqlConfiguration` parameter from `ScrollCursor.clear`, it's not used but required an instance of `SqlConfiguration` with all default values. - One overloaded constructor of `SqlConfiguration` that was only used for calling `ScrollCursor.clear` (and some tests) and used default values for an (arbitrary?) subset of the fields. - The fields related to async requests in `SqlConfiguration`. I'm a bit unsure about this one but the fields are never read and it does not seem like an SQL specific concern. The whole creation of the async tasks is handled in `TransportSqlQueryAction` and the downstream components do not require the information.
2021-10-27 21:21:52 +08:00
`page.timeout` (default `45000`)::
Page timeout (in milliseconds). Minimum retention period for the scroll cursor on the server. Queries that require
a stateful scroll cursor on the server side might fail after this timeout. Hence, when scrolling through large result sets,
processing `page.size` records should not take longer than `page.timeout` milliseconds.
`query.timeout` (default `90000`)::
Query timeout (in milliseconds). That is the maximum amount of time waiting for a query to return.
[[jdbc-cfg-auth]]
[discrete]
==== Basic Authentication
`user`:: Basic Authentication user name
`password`:: Basic Authentication password
[[jdbc-cfg-ssl]]
[discrete]
==== SSL
`ssl` (default `false`):: Enable SSL
`ssl.keystore.location`:: key store (if used) location
`ssl.keystore.pass`:: key store password
`ssl.keystore.type` (default `JKS`):: key store type. `PKCS12` is a common, alternative format
`ssl.truststore.location`:: trust store location
`ssl.truststore.pass`:: trust store password
`ssl.truststore.type` (default `JKS`):: trust store type. `PKCS12` is a common, alternative format
`ssl.protocol`(default `TLS`):: SSL protocol to be used
[discrete]
==== Proxy
`proxy.http`:: Http proxy host name
`proxy.socks`:: SOCKS proxy host name
[discrete]
==== Mapping
`field.multi.value.leniency` (default `true`):: Whether to be lenient and return the first value (without any guarantees of what that
will be - typically the first in natural ascending order) for fields with multiple values (true) or throw an exception.
[discrete]
==== Index
`index.include.frozen` (default `false`):: Whether to include frozen indices in the query execution or not (default).
[discrete]
==== Cluster
`catalog`:: Default catalog (cluster) for queries. If unspecified, the
queries execute on the data in the local cluster only.
+
experimental:[] See <<modules-cross-cluster-search,{ccs}>>.
[discrete]
==== Error handling
`allow.partial.search.results` (default `false`):: Whether to return partial results in case of shard failure or fail the query throwing
the underlying exception (default).
[discrete]
==== Troubleshooting
`debug` (default `false`):: Setting it to `true` will enable the debug logging.
`debug.output` (default `err`):: The destination of the debug logs. By default, they are sent to standard error. Value `out` will redirect the logging to standard output. A file path can also be specified.
[discrete]
==== Additional
`validate.properties` (default `true`):: If disabled, it will ignore any misspellings or unrecognizable properties. When enabled, an exception
will be thrown if the provided property cannot be recognized.
To put all of it together, the following URL:
["source","text"]
----
jdbc:es://http://server:3456/?timezone=UTC&page.size=250
----
opens up a {es-sql} connection to `server` on port `3456`, setting the JDBC connection timezone to `UTC` and its pagesize to `250` entries.
=== API usage
One can use JDBC through the official `java.sql` and `javax.sql` packages:
[[java-sql]]
==== `java.sql`
The former through `java.sql.Driver` and `DriverManager`:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{jdbc-tests}/JdbcIntegrationTestCase.java[connect-dm]
--------------------------------------------------
<1> The server and port on which Elasticsearch is listening for
HTTP traffic. The port is by default 9200.
<2> Properties for connecting to Elasticsearch. An empty `Properties`
instance is fine for unsecured Elasticsearch.
[[javax-sql]]
==== `javax.sql`
Accessible through the `javax.sql.DataSource` API:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{jdbc-tests}/JdbcIntegrationTestCase.java[connect-ds]
--------------------------------------------------
<1> The server and port on which Elasticsearch is listening for
HTTP traffic. By default 9200.
<2> Properties for connecting to Elasticsearch. An empty `Properties`
instance is fine for unsecured Elasticsearch.
Which one to use? Typically client applications that provide most
configuration properties in the URL rely on the `DriverManager`-style
while `DataSource` is preferred when being _passed_ around since it can be
configured in one place and the consumer only has to call `getConnection`
without having to worry about any other properties.
To connect to a secured Elasticsearch server the `Properties`
should look like:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{security-tests}/JdbcSecurityIT.java[admin_properties]
--------------------------------------------------
Once you have the connection you can use it like any other JDBC
connection. For example:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{jdbc-tests}/SimpleExampleTestCase.java[simple_example]
--------------------------------------------------
[NOTE]
{es-sql} doesn't provide a connection pooling mechanism, thus the connections
the JDBC driver creates are not pooled. In order to achieve pooled connections,
a third-party connection pooling mechanism is required. Configuring and setting up the
third-party provider is outside the scope of this documentation.