elasticsearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	14ca8fee20	[ML] add support for xlm_roberta tokenized models (#94089) Many multi-lingual and newer models use a tokenization scheme similar to sentence-piece. This PR adds support for one of those tokenization schemes, XLMRoBERTa. The main changes are: - Support for xlm_roberta tokenization configuration - Adding `scores` to the vocabulary document stored, requiring that scores be the same size as the vocabulary - Adding a new flat text file to resources that is the spm char normalizer.	2023-06-13 08:40:55 -04:00
David Roberts	6fa3d73fd5	[ML] Make native inference generally available (#92213) Previously this functionality was beta. This PR changes it to GA.	2022-12-12 15:43:30 +00:00
Nik Everett	6481342466	Fix sneaky docs test failure (#91829) This prevents docs files from starting with a "response" because when that happens the response is converted to an assertion and appended to the last snippet that was processed. If that last snippet was in a different file then it's very hard to reason about the tests. That goes double because the order we iterate files isn't defined... Anyway! This adds a guard in the build, removes the offending "response", and reenables the tests that we'd thought were failing here. Closes #91081	2022-12-07 11:02:44 -05:00
David Roberts	d9ea080d10	[ML] Release native inference functionality as beta (#90418) Previously this functionality was tech preview (aka experimental). This PR changes it to beta.	2022-09-28 11:09:02 +01:00
Lisa Cawley	89a3e18e10	[DOCS] Add preview admonition to infer API (#86486)	2022-05-05 13:49:02 -07:00
Benjamin Trent	258d2b71e2	[ML] add roberta/bart docs (#85001) adds roberta section to NLP tokenization documentation.	2022-03-17 12:14:57 -04:00
Lisa Cawley	429bdd9afc	[DOCS] Move trained model APIs out of dataframe analytics (#81315)	2021-12-03 09:21:09 -08:00

7 Commits