Shreya Shankar
4e9f4532ad
refactor: update MOAR documentation
2025-12-29 14:04:23 -06:00
Shreya Shankar
9eee1a2553
feat: add claude-code skill ( #469 )
...
* feat: add claude-code skill
* feat: add claude-code skill
* get ready to bump up version
2025-12-27 22:23:30 -06:00
Shreya Shankar
fa86e23cfb
add limit param to llm ops ( #466 )
2025-12-26 19:46:28 -08:00
Shreya Shankar
56a1b7e794
Feat: Add first-class LanceDB retrievers and prompt augmentation across LLM operators ( #460 )
...
* feat: adding retrievers
* feat: allow indexes to be built on datasets created by docetl pipelines
* Testing retriever and updating docs with logging
2025-12-26 16:48:48 -06:00
Lindsey Wei
81d110404d
Add MOAR optimizer to docetl ( #464 )
...
* feat: adding conditional gleaning (#375 )
* fix: improve caching and don't raise error for bad gather configs
* fix: improve caching and don't raise error for bad gather configs
* feat: adding conditional gleaning
* chore: bump up fastapi and python multipart (#376 )
* merge
* chore: bump up fastapi and python multipart
* chore: bump up fastapi and python multipart
* pipeline for chaining
* chaining + gleaning(map)
* Clean up and reorganize pytest tests (#377 )
* Replace api_wrapper with runner in test fixtures and configurations
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor test fixtures and reorganize configuration in test files
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* Refactor api.py for structured output (#378 )
* Refactor API wrapper with modular design for LLM calls and output handling
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor APIWrapper: Simplify LLM call logic and improve modularity
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor output mode handling in APIWrapper with flexible configuration
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add comprehensive tests for DocETL output modes with synthetic data
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor output modes tests with improved pytest structure and DSLRunner
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Fix runtime errors
* Add nested JSON parsing for string values in API response
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Handle nested JSON parsing by extracting matching key values
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Simplify JSON parsing logic in API utility functions
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add to tests
* Add documentation for DocETL output modes and configuration options
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add docs
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* Add operators to pandas API (#379 )
* baseline w/ 3 rewrites
* added sample
* 3 choices
* feat: add global bypass cache (#383 )
* baseline experiments
* baseline experiments
* update directives
* MCTS random expand
* MCTS v1
* MCTS V1 w/ instantiation check on chaining
* MCTS V2
* remove key
* delete key
* update evaluation code
* >1 instantiation for chaining & true f1 score
* true f1 score as acc
* hypervolume as value algo
* Add directives folder from PR #1
* tests for directives
* updates on instantiate_schemas.py
* adding experiment folder from P1
* added init for mcts
* partially merged PR1
* fix instantiate schema
* added chunking directives
* passed directive tests
* fix bug in doc_chunking instantiate
* fix doc_chunking bug
* update prompt for chunk_header_summary
* update gitignore
* fix mcts bugs when rebasing
* add chunk sampling directive
* print AUC at end
* reward update
merged
* fixed bug in doc_compression and avoid using chunk_header_summary twice
* fixed chunk sampling
* adding blackvault dataset and refactoring hard coded cuad statistics
* scaled cost, progressive widening
* feat: add head tail directive
* feat: add head tail directive
* feat: add head tail directive
* merge chunk sampling + doc chunking into one directive. also modify sample operator to stratify by multiple keys
* add game reviews experiment and fix doc chunking
* trim down some fat in game reviews eval logic
* syntax check for stratify key
* added split_key check and fixed bug in take_head_tail
* added **kwargs
* remove reduce in op fusion
* adding medec workload
* document_key check for doc_compression & load schema data
* add sustainability expt
* add reduce chaining
* update expand retry & op fusion
* remove hard coding of blackvault
* remove hard coding of blackvault
* adding agent for instantiation (that is allowed to read docs), and a clarify instructions directive
* adding agent for instantiation (that is allowed to read docs), and a clarify instructions directive
* add swap with code directive
* optimizing cost in first 5 iterations
* added mcts log and remove plot
* add memory along tree path and action reward in log
* add biodex and have train and test splits
* add biodex and have train and test splits
* added map reduce fusion, small changes to mcts
* added reference point for calculating hypervolume in evaluation
* fix bug in calculating hypervolume
* integrate with modal
* add run_all.py script
* add run_all.py script
* fix bug in baseline applying ops
* update mcts chat history
* update mcts chat history
* control message length to fit context window
* fix agent baseline loop
* remove print statement
* update run baseline
* adding extremely simple agent baseline that just gives us entirely new pipelines
* adding extremely simple agent baseline that just gives us entirely new pipelines
* update readme with simple / naive agent baseline
* update
* change modal image
* edit gitignore to operators doc
* adding retrieval based rewrite directive
* debug simple baseline
* fix baseline agent and top k chunking directive
* fix baseline agent and top k chunking directive
* fix baseline agent and top k chunking directive
* fix chunking topk directive and add hierarchical reduce
* merge the original plan execution of three methods
* add plot and hypervolume calculation
* feat: add cascade filtering directive
* feat: add arbitrary rewrite directive
* remove unnecessary field
* fixed fall back method for baseline & mcts
* adding some validation for arbitrary rewrite
* update sample operation
* change model choice
* updates on plot
* fix gpt-5 temperature in query engine
* update model choice in base
* use rp@5 for biodex eval
* separate cost & acc optimize change model directive
* add abacus to plot
* added memo table for each node and provide it to the agent for directive selection
* add facility dataset (we don't need to use for paper) and lotus baselines
* edit prompts in change model directive
* change models
* added rules on compositions of directives. clean up mcts using utils
* add multiple instantiation for selected directives
* fix gemini errors
* adding PZ baselines (still in progress)
* add all PZ baselines to main
* add all PZ baselines to main
* fix bugs about multiple instantiations
* add utility to plot the test set
* fix linter errors
* fix PZ script for medec dataset
* add original pipeline on test set to plot
* change color of mcts
* test all model on the input query before search
* add search cost calculation
* fixed bug in json file path
* add concurrency control; start searching from pareto model plans
* set reward to be vertical distance to the step frontier
* newest mcts version
* small changes during exp
* change to run lotus, add bootstrap
* Add needs_code_filter variable for operator fusion
* add validation step; randomly select when values have ties
* eval function change
* clean up run test frontier
* Delete bootstrap_evaluation.py
* Delete BioDEX_evaluate.py
* Delete CUAD_evaluate.py
* Delete CUAD_sample50.py
* Delete bootstrapping.py
* Delete evaluate_blackvault.py
* Delete exp_graph.py
* Delete exp_graph_max.py
* Delete re_evaluate_zero_scores.py
* Delete split_json_data.py
* Delete test_evaluation.py
* Delete test_gemini_models.py
* Delete test_operations_modal.py
* Delete test_validate_frontier.py
* Delete validate_pareto_frontier.py
* Delete docetl/reasoning_optimizer/Untitled-1.py
* Delete docetl/BioDEX_test.py
* Delete docetl/mcts/execute_res_HV directory
* Delete docetl/mcts/graph_baseline.py
* Delete docetl/mcts/graph.py
* remove acc comparator
* delete graph
* delete irrelevant files
* mcts clean up
* update naming
* clean up simple agent
* update readme
* add user specified eval function
* Delete experiments/reasoning/run_tests.py
* Delete experiments/reasoning/run_test_frontier.py
* Delete experiments/reasoning/run_baseline.py
* Delete experiments/reasoning/run_all.py
* Delete experiments/reasoning/plot_result.py
* Delete experiments/reasoning/plot_matrix.py
* Delete experiments/reasoning/combine_biodex_test_results.py
* Delete experiments/reasoning/create_biodex_test_summary.py
* Delete experiments/reasoning/generate_biodex_summary.py
* Delete experiments/reasoning/TEST_FRONTIER_README.md
* Delete experiments/reasoning/utils.py
* Delete experiments/reasoning/README.md
* Delete experiments/reasoning/othersystems directory
* Delete experiments/reasoning/outputs/blackvault_baseline
* Delete experiments/reasoning/outputs/blackvault_mcts
* Delete experiments/reasoning/outputs/cuad_lotus_evaluation.json
* Delete compute_words_per_document.py
* Delete docetl/moar/acc_comparator.py
* Delete docetl/moar/acc_comparator.py.backup
* Delete docetl/moar/instantiation_check.py
* Delete docetl/reasoning_optimizer/build_optimization.py
* Delete docetl/reasoning_optimizer/generate_rewrite_plan.py
* remove auc
* simplify util
* Delete experiments/reasoning/data directory
* requirements
* eval function cleaning
* update readme
* update readme
* update readme
* remove facility
* update readme
* Update api to be consistent with main
* remove unused files
* clean up relative imports and model choices
* change all printing to use the rich console
* add documentation for moar and CLI
* move evaluation code from experiments dir to moar dir
* fix documentation
---------
Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-110-173.wifi.berkeley.edu>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-109-39.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@lindseys-mbp-7.lan>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-111-186.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-110-112.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-44-111-72.wifi.berkeley.edu>
Co-authored-by: Lindsey Wei <152750390+LindseyyyW@users.noreply.github.com>
Co-authored-by: linxiwei <lindseywei@wifi-10-44-110-253.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-110-92.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-108-154.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-109-198.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@Lindseys-MacBook-Pro-7.local>
Co-authored-by: linxiwei <lindseywei@192.168.0.110>
2025-11-28 13:47:42 -06:00
Shreya Shankar
0110071cd5
fix: add code ops and extract to python api ( #463 )
2025-11-24 14:18:59 -06:00
Shreya Shankar
a184a3c1e9
fix: add code ops and extract to python api ( #462 )
...
* fix: add code ops and extract to python api
* fix: add code ops and extract to python api
2025-11-24 14:10:07 -06:00
Shreya Shankar
c3ef5684d5
Add fallback models documentation to user guide ( #454 )
...
* Add fallback models documentation and configuration
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Remove operation-specific models from fallback docs
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Docs: Add content warning errors to fallback model triggers
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Remove outdated best practices from fallback models documentation
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-14 13:32:31 -08:00
Shreya Shankar
ea0013d7fd
Implement LiteLLM fallback models for reliability ( #453 )
...
* feat: Add LiteLLM Router for fallback models
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add model to litellm_params for LiteLLM Router
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Prioritize operation model in LiteLLM Router fallbacks
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Cache LiteLLM Routers in APIWrapper
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Separate completion and embedding routers
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add fallback models example configuration
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* feat: Add fallback models to LiteLLM Router
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor Router fallbacks to use list of dicts
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Remove example fallback config file
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Feat: Add version constraint for paddlepaddle
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-14 13:17:47 -08:00
Shreya Shankar
50e45cf7db
Refactor: Use num_tokens instead of token_count ( #430 )
...
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-09-15 15:32:33 -07:00
Shreya Shankar
d1dab86466
Fix cosine similarity blocking in resolve operation ( #428 )
...
* fix: resolve blocking
* fix: resolve blocking
2025-09-14 20:24:58 -07:00
Shreya Shankar
4cfd371744
feat: add topk implementation ( #410 )
2025-08-13 13:27:54 -07:00
Shreya Shankar
6dafd46fba
Update Pandas API to Use New Output Parameter Format ( #409 )
...
* chore: update pandas api and docs
* chore: update pd accessors
2025-08-13 10:41:49 -07:00
Shreya Shankar
79542259cb
Refactor sample operation for multiple stratify keys ( #408 )
...
* Enhance sample operation with multi-key stratification and per-group sampling
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor SampleOperation with improved validation and sampling methods
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor sample operation with improved stratification and simplified config
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-13 10:16:55 -07:00
Shreya Shankar
f6ead4ea6b
Switch from poetry to uv ( #402 )
...
* Switch from Poetry to uv for dependency management and packaging
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Update README with uv installation and dependency management instructions
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Optimize Docker build: improve dependency installation and caching
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* verify that uv works on my local installation
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-09 13:20:32 -07:00
Shreya Shankar
84c4807009
chore: switching to cloudbank for blob storage ( #400 )
...
* add seo
* add seo
* switching to cloudbank for blob storage
2025-07-31 14:28:00 -07:00
Shreya Shankar
0f0253be16
Add structured output mode support to pandas API ( #394 )
...
* Add structured output mode support to pandas API
This commit addresses issue #393 by adding the ability to specify structured
output mode in the pandas semantic accessor.
Key changes:
- Enhanced map() method to accept 'output' parameter with schema and mode
- Added support for 'structured_output' mode alongside existing 'tools' mode
- Maintained backward compatibility with existing 'output_schema' parameter
- Added comprehensive parameter validation and error handling
New API format:
```python
df.semantic.map(
    prompt="Extract data: {{input.text}}",
    output={
        "schema": {"name": "str", "age": "int"},
        "mode": "structured_output"  # Optional, defaults to "tools"
    }
)
```
Tests added:
- test_semantic_map_structured_output: Tests new structured output functionality
- test_semantic_map_invalid_output_mode: Tests output mode validation
- test_semantic_map_structured_output_vs_tools: Compares both output modes
- test_semantic_map_backward_compatibility: Ensures old API still works
- test_semantic_map_parameter_validation: Comprehensive parameter validation
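Illustrative sketch (not the code from this commit): one way the new `output` parameter and the legacy `output_schema` parameter could be reconciled and validated, as described under "Key changes" above. The helper name `normalize_output`, the `VALID_MODES` constant, and the exact error messages are assumptions made for illustration; only the parameter names, the two mode names, and the "tools" default come from the commit message.

```python
from typing import Any, Dict, Optional

# Mode names taken from the commit message; the constant itself is hypothetical.
VALID_MODES = {"tools", "structured_output"}


def normalize_output(
    output: Optional[Dict[str, Any]] = None,
    output_schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a dict with 'schema' and 'mode' keys.

    Accepts either the new `output` dict or the legacy `output_schema`
    dict, but not both, and rejects unknown modes.
    """
    if output is not None and output_schema is not None:
        raise ValueError("Pass either 'output' or 'output_schema', not both")
    if output is None and output_schema is None:
        raise ValueError("One of 'output' or 'output_schema' is required")

    if output is None:
        # Legacy call style: wrap the bare schema and use the default mode.
        return {"schema": output_schema, "mode": "tools"}

    if "schema" not in output:
        raise ValueError("'output' must contain a 'schema' key")

    mode = output.get("mode", "tools")  # "mode" is optional
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid output mode: {mode!r}")

    return {"schema": output["schema"], "mode": mode}
```

With a helper of this shape, a legacy call that passes only `output_schema` and a new-style call that omits `"mode"` would both resolve to the "tools" mode, which matches the backward compatibility noted above.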
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update pandas documentation for structured output mode
- Updated pandas/index.md to document new output parameter format
- Added section explaining output modes (tools vs structured_output)
- Updated examples throughout pandas/operations.md and pandas/examples.md
- Maintained backward compatibility examples
- Added guidance on when to use structured output mode
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-22 08:54:44 -07:00
Sid Jha
19a8286978
Improve syntax_check by leveraging pydantic validation ( #392 )
...
* Work on other operators
* Fix
* Add ValidationInfo
* Bug fix
* docs: fix errors in equijoin documentation
---------
Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2025-07-20 18:54:59 -07:00
Shreya Shankar
9028efc8d4
Update cluster documentation to use inputs iteration in summary prompt ( #386 )
...
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-07-14 18:12:22 -07:00
Shreya Shankar
d995c534b5
feat: add global bypass cache ( #383 )
2025-07-08 16:53:40 -07:00
Shreya Shankar
9b836ee228
Add operators to pandas API ( #379 )
2025-07-04 11:59:24 -07:00
Shreya Shankar
1e4709a112
Refactor api.py for structured output ( #378 )
...
* Refactor API wrapper with modular design for LLM calls and output handling
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor APIWrapper: Simplify LLM call logic and improve modularity
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor output mode handling in APIWrapper with flexible configuration
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add comprehensive tests for DocETL output modes with synthetic data
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor output modes tests with improved pytest structure and DSLRunner
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Fix runtime errors
* Add nested JSON parsing for string values in API response
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Handle nested JSON parsing by extracting matching key values
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Simplify JSON parsing logic in API utility functions
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add to tests
* Add documentation for DocETL output modes and configuration options
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add docs
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-07-04 10:58:15 -07:00
Shreya Shankar
9a72a6b729
chore: bump up fastapi and python multipart ( #376 )
...
* merge
* chore: bump up fastapi and python multipart
* chore: bump up fastapi and python multipart
2025-07-02 07:37:06 -07:00
Shreya Shankar
b8d2beb602
feat: adding conditional gleaning ( #375 )
...
* fix: improve caching and don't raise error for bad gather configs
* fix: improve caching and don't raise error for bad gather configs
* feat: adding conditional gleaning
2025-07-01 22:40:38 -07:00
Shreya Shankar
7071ade539
fix: improve caching and don't raise error for bad gather configs ( #373 )
...
* merge
* fix: improve caching and don't raise error for bad gather configs
* fix: improve caching and don't raise error for bad gather configs
* fix: improve caching and don't raise error for bad gather configs
2025-06-30 23:19:17 -07:00
Shreya Shankar
ea60f38afd
docs: improve gleaning description ( #371 )
...
* docs: improve gleaning description
* docs: improve gleaning description
* docs: improve gleaning description
* docs: fix spacing in gleaning docs
2025-06-26 21:25:26 -07:00
Shreya Shankar
d156351c69
docs: improve gleaning description ( #370 )
...
* docs: improve gleaning description
* docs: improve gleaning description
* docs: improve gleaning description
2025-06-26 21:19:51 -07:00
Shreya Shankar
631ea0c34f
feat: Add calibration support to map operations for improved consistency ( #365 )
...
* chore: run pre-commit
* feat: add calibration to map ops
* feat: add calibration to map ops
* fix: comment out flaky test
2025-06-14 23:04:13 -07:00
Shreya Shankar
8a7b4e1566
feat: add extract operator ( #361 )
...
* feat: add extract operator
* feat: add extract operator
* feat: add extract operator
* feat: add extract operator
2025-05-13 17:43:46 -07:00
Shreya Shankar
bb5bdff9d1
feat: adding `api_base` to yaml ( #359 )
...
* feat: adding api base to yaml
* feat: adding api base to yaml
* docs: add api base docs
2025-05-11 18:40:29 -07:00
Shreya Shankar
1a41df6cb5
fix: add system prompt and other config vars to python api ( #356 )
2025-05-03 18:05:22 -07:00
Shreya Shankar
3781edebfa
docs: update for rank op ( #344 )
...
* update documentation for rank op
* update documentation for rank op
2025-04-21 18:24:37 -07:00
Shreya Shankar
9ad27afa12
update documentation for rank op ( #343 )
2025-04-21 18:22:21 -07:00
Shreya Shankar
0b2d5b0324
feat: add rank operation card in docwrangler ( #341 )
...
* feat: add rank operation card in docwrangler
* feat: add rank operation card in docwrangler
* feat: add rank operation card in docwrangler
2025-04-21 16:02:19 -07:00
Shreya Shankar
c960d6fcc0
feat: add rank operation ( #340 )
...
* feat: add order by operator
* adding a bunch of order functions
* refactor: move around tests for rank operator
* add anthropic hh test for rank
* refactor: move around tests for rank operator
* refactor: move around tests for rank operator
2025-04-21 13:48:41 -07:00
Shreya Shankar
d20c3982e4
feat: add TPM rate limits & pipeline settings to DocWrangler ( #338 )
...
* add TPM rate limit
* feat: add rate limiting in docwrangler; tpm
* feat: add rate limiting in docwrangler; tpm
* feat: add rate limiting in docwrangler; tpm
2025-04-19 22:38:48 -07:00
Shreya Shankar
1c4e181201
docs: add python api quickstart ( #337 )
...
* docs: add python api quickstart
* docs: add python api quickstart
2025-04-19 09:49:50 -07:00
Shreya Shankar
e7fd306d99
Add python docs ( #334 )
2025-04-17 12:23:24 -07:00
shabie
4793c89e92
feat: add flag to stream map operation outputs to disk ( #323 )
...
* feat: add flag to stream map operation outputs to disk
* flush partial results default False
* comment out print statement in test basic map
* rewrite test to not use batched config
---------
Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2025-03-19 19:17:56 -06:00
Shreya Shankar
fa5460b4c0
feat: add n parameter to output ( #320 )
...
* feat: add n parameter to output
* feat: add n parameter to output
* feat: add n parameter to output
* feat: add n parameter to output
2025-03-13 23:25:03 -07:00
Shreya Shankar
080e7f75ec
feat: refactoring map optimizer ( #311 )
2025-02-18 12:54:45 -08:00
Shreya Shankar
f4abe55191
feat: support pdf upload and add tutorial ( #309 )
2025-02-09 21:19:17 -08:00
Shreya Shankar
2a259a0d93
feat: add pandas df accessor ( #287 )
...
* feat: add pandas df accessor
* feat: add pandas df accessor
* feat: add pandas df accessor
2025-01-24 16:54:49 -08:00
Shreya Shankar
05c4357c59
docs: add verbose param ( #283 )
2025-01-22 12:23:46 +01:00
Shreya Shankar
eb995fdc77
feat: add llmstxt ( #267 )
...
* feat: adding llms.txt
* feat: adding llms.txt
* feat: adding llms.txt
* feat: adding llms.txt
2025-01-07 21:43:17 -08:00
Shreya Shankar
8b3e1ce640
fix: bypass vercel serverless functions when uploading datasets ( #266 )
2025-01-06 20:52:37 -08:00
Shreya Shankar
f38fbb8960
docs: update playground docs ( #259 )
...
* rebrand to docwrangler
* refactor: rebranding to docwrangler
* refactor: rebranding to docwrangler
* refactor: edit vercel.json
* fix: map optimizer should work
* docs: update playground docs
2025-01-02 00:33:05 -08:00
Shreya Shankar
662d6a2c5a
feat: add azure openai for the FE assistants ( #256 )
...
* chore: ui nits
* chore: ui nits
* feat: add tutorial for the supreme court hearings
* fix: output csv writing should work even if not all documents have all the keys
* feat: add azure openai for prompt improvement and chat, with logging
* Add system prompts to documentation
* fix: add setSystemPrompt to restore pipeline context
* docs: add vercel json
2025-01-01 11:41:43 -08:00
Shreya Shankar
af70998e6f
chore: add param to skip LLM calls when they fail ( #255 )
...
* feat: add skip_on_error param to map operations
* chore: add more helpful logging
* chore: prettier printing
* better logging when skipping on error
* better logging when skipping on error
2024-12-29 23:53:37 -06:00
Rohit Rawat
0e077aa740
added enum support ( #254 )
...
* added enum support
* tests: add test for enum type output
* docs: update docs to support enum type schemas
---------
Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2024-12-26 17:50:23 -06:00