elasticsearch

Commit Graph

Author	SHA1	Message	Date
Ed Savage	fd20027751	[ML] Performance improvements for categorization jobs (#89824) Categorization of strings which break down to a huge number of tokens can cause the C++ backend process to choke - see elastic/ml-cpp#2403. This PR adds a limit filter to the default categorization analyzer which caps the number of tokens passed to the backend at 100. Unfortunately this isn't a complete panacea to all the issues surrounding categorization of many-tokened / large messages as verification checks on the frontend can also fail due to calls to the datafeed _preview API returning an excessive amount of data.	2022-09-08 18:41:01 +01:00
Lisa Cawley	458ef91066	[DOCS] Move ML info and upgrade APIs (#84005)	2022-02-16 11:23:00 -08:00
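
Commit fd20027751 describes capping the number of tokens passed to the categorization backend at 100. The following is only a rough Python sketch of that idea, not the actual analyzer code: the whitespace tokenizer and the names `limit_tokens` and `analyze` are hypothetical stand-ins, and only the cap of 100 comes from the commit message.

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Cap taken from the commit message; everything else here is illustrative.
MAX_TOKEN_COUNT = 100


def limit_tokens(tokens: Iterable[str],
                 max_token_count: int = MAX_TOKEN_COUNT) -> Iterator[str]:
    """Yield at most max_token_count tokens and ignore the rest."""
    return islice(tokens, max_token_count)


def analyze(message: str) -> List[str]:
    """Hypothetical analyzer: split on whitespace, then apply the cap."""
    return list(limit_tokens(message.split()))


if __name__ == "__main__":
    # A message that breaks down into far more than 100 tokens.
    long_message = " ".join(f"tok{i}" for i in range(5000))
    capped = analyze(long_message)
    print(len(capped))
```

Because the cap is applied after tokenization, short messages pass through unchanged, while very long ones are truncated before they reach the downstream consumer.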

2 Commits