docetl2

Commit Graph

Author	SHA1	Message	Date
Shreya Shankar	0f6d9514e3	Add type checks to validation functions (#427) * feat: Add output type validation to map and reduce operations Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Refine API error message for validation failures Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Test map type validation for integer answers Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * fix test for validating schemas --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-09-12 16:18:21 -07:00
Shreya Shankar	4cfd371744	feat: add topk implementation (#410)	2025-08-13 13:27:54 -07:00
Shreya Shankar	79542259cb	Refactor sample operation for multiple stratify keys (#408) * Enhance sample operation with multi-key stratification and per-group sampling Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Refactor SampleOperation with improved validation and sampling methods Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Refactor sample operation with improved stratification and simplified config Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-08-13 10:16:55 -07:00
Shreya Shankar	f6ead4ea6b	Switch from poetry to uv (#402) * Switch from Poetry to uv for dependency management and packaging Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Update README with uv installation and dependency management instructions Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Optimize Docker build: improve dependency installation and caching Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * verify that uv works on my local installation --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-08-09 13:20:32 -07:00
Sid Jha	c89e269d67	Remove old typing imports (#389) * Remove old typing imports * Add future annotation * Add back in import * Use Iterator instead of Iterable * fix: small edits to fix broken tests --------- Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>	2025-07-20 12:01:17 -07:00
Shreya Shankar	4cee4d7817	Clean up and reorganize pytest tests (#377) * Replace api_wrapper with runner in test fixtures and configurations Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> * Refactor test fixtures and reorganize configuration in test files Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-07-03 22:15:33 -07:00
Shreya Shankar	b8d2beb602	feat: adding conditional gleaning (#375) * fix: improve caching and don't raise error for bad gather configs * fix: improve caching and don't raise error for bad gather configs * feat: adding conditional gleaning	2025-07-01 22:40:38 -07:00
Shreya Shankar	631ea0c34f	feat: Add calibration support to map operations for improved consistency (#365) * chore: run pre-commit * feat: add calibration to map ops * feat: add calibration to map ops * fix: comment out flaky test	2025-06-14 23:04:13 -07:00
shabie	4793c89e92	feat: add flag to stream map operation outputs to disk (#323) * feat: add flag to stream map operation outputs to disk * flush partial results default False * comment out print statement in test basic map * rewrite test to not use batched config --------- Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>	2025-03-19 19:17:56 -06:00
Shreya Shankar	05c4357c59	docs: add verbose param (#283)	2025-01-22 12:23:46 +01:00
Shreya Shankar	57d1cd78a1	refactor: DSLRunner now uses a pull-based execution model (#273) * partial commit * refactor: dslrunner is now a pull based execution model * refactor: dslrunner is now a pull based execution model * refactor: optimizer is now using the new pull based execution model * refactor: optimizer is now using the new pull based execution model * refactor: optimizer is now using the new pull based execution model * remove builder file * remove builder file and make tests pass * fix tests	2025-01-10 12:45:04 -08:00
Rohit Rawat	0e077aa740	added enum support (#254) * added enum support * tests: add test for enum type output * docs: update docs to support enum type schemas --------- Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>	2024-12-26 17:50:23 -06:00
Shreya Shankar	aa5c2a5c93	test: add new optimizer test	2024-11-13 22:11:37 -08:00
Shreya Shankar	7242da81f4	fix: allow user to pass in litellm completion kwargs	2024-11-11 14:04:40 -08:00
Shreya Shankar	2b9aea83dc	fix: allow reduce_key types to be lists	2024-11-11 13:14:19 -08:00
Shreya Shankar	33c24365b6	move tests into basic dir	2024-10-31 17:39:20 -07:00
Shreya Shankar	604ac257fe	feat: adding batching for map and filter calls	2024-10-29 19:14:30 -07:00
Shreya Shankar	50936a8901	fix test: comment out timeout test	2024-10-27 19:25:34 -07:00
Shreya Shankar	f6e3cd0473	mark as flaky test	2024-10-22 12:47:31 -07:00
Shreya Shankar	a6f72510cd	refactor: remove optimizer from configwrapper	2024-10-19 13:25:00 -07:00
Shreya Shankar	7e0deb354e	fix: tests check for cost	2024-10-13 22:02:43 -04:00
Shreya Shankar	eecbd41e44	refactor: address redhog feedback	2024-10-13 17:43:48 -04:00
Shreya Shankar	658b59e689	refactor: address redhog feedback	2024-10-13 17:37:32 -04:00
Shreya Shankar	64b5345043	fix: change gleaning prompt to validation_prompt	2024-10-13 17:17:45 -04:00
Shreya Shankar	307456b98c	feat: add reduce operation lineage	2024-10-13 17:17:45 -04:00
Shreya Shankar	38a073ff75	refactor: combine sampling and outlier operators	2024-10-12 17:51:27 -04:00
Shreya Shankar	042bdf2e05	update test	2024-10-12 12:35:28 -07:00
Shreya Shankar	c158ae11e8	refactor: move validation and gleaning into call llm	2024-10-11 17:17:32 -07:00
Shreya Shankar	70604cb08b	Merge staging to main (after adding cluster operator) (#88) * Parsers can now return any number of fields, and can access the whole item * nit: change gpt-4o to gpt-4o-mini in tests * feat: add verbose parameter for gleaning * feat: add verbose parameter for gleaning * fix: tokenizers should be wrapped in try catch * fix: resort to eval if ast eval does not work * docs: update docs to reflect new custom parsing API Co-authored-by: redhog <redhog@users.noreply.github.com> * Clustering (#84) * nit: change gpt-4o to gpt-4o-mini in tests * feat: add verbose parameter for gleaning * feat: add verbose parameter for gleaning * fix: tokenizers should be wrapped in try catch * fix: resort to eval if ast eval does not work * Merge staging to main (after parsers refactor) (#82) * Parsers can now return any number of fields, and can access the whole item * nit: change gpt-4o to gpt-4o-mini in tests * feat: add verbose parameter for gleaning * feat: add verbose parameter for gleaning * fix: tokenizers should be wrapped in try catch * fix: resort to eval if ast eval does not work * docs: update docs to reflect new custom parsing API --------- Co-authored-by: Egil <egil.moller@freecode.no> * Added new clustering operation * Reverse path * Added docs for cluster operator * Bugfix for docs formatting * docs: add sample parameter (#87) * Added new clustering operation * Reverse path * Added docs for cluster operator * Bugfix for docs formatting * add tests and link to doc --------- Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com> Co-authored-by: Egil <egil.moller@freecode.no> * fix: fixing params in test --------- Co-authored-by: Egil <egil.moller@freecode.no> Co-authored-by: redhog <redhog@users.noreply.github.com> Co-authored-by: Egil Möller <redhog@redhog.org>	2024-10-08 23:51:02 -07:00
Shreya Shankar	2e6997d646	Merge staging to main (after parsers refactor) (#82) * Parsers can now return any number of fields, and can access the whole item * nit: change gpt-4o to gpt-4o-mini in tests * feat: add verbose parameter for gleaning * feat: add verbose parameter for gleaning * fix: tokenizers should be wrapped in try catch * fix: resort to eval if ast eval does not work * docs: update docs to reflect new custom parsing API --------- Co-authored-by: Egil <egil.moller@freecode.no>	2024-10-07 21:33:32 -07:00
Shreya Shankar	69b491eebb	refactor: switch changes back to the runner object	2024-10-07 09:30:24 -07:00
Shreya Shankar	27de5f2615	refactor: improving tests and consistency in the rate limits refactor	2024-10-06 22:55:06 -07:00
Shreya Shankar	74fad079af	fix: enable gleaning llm calls to work	2024-10-05 09:20:59 -07:00
Shreya Shankar	5883fb354f	fix: enable gleaning llm calls to work	2024-10-05 09:00:25 -07:00
Shreya Shankar	d12bf3a004	docs: improving documentation for pipeline api	2024-10-04 09:07:38 -07:00
Shreya Shankar	4454949429	feat: support for gemini	2024-10-03 11:44:02 -07:00
Shreya Shankar	9846f01324	docs: update documentation for custom parsers	2024-09-30 22:14:39 -07:00
Shreya Shankar	674c64bf0f	test: reduce the character minimum for parsing test	2024-09-30 21:57:03 -07:00
Shreya Shankar	c3416753f0	feat: add custom dataset parsers	2024-09-30 18:32:04 -07:00
Shreya Shankar	e8694356b2	chore: refactor schemas.py to also include ops from api.py	2024-09-30 15:57:47 -07:00
Shreya Shankar	3e98bcfe9d	rebase with main	2024-09-30 15:14:49 -07:00

41 Commits