Commit Graph

162 Commits

Author SHA1 Message Date
Shreya Shankar 4e9f4532ad refactor: update MOAR documentation 2025-12-29 14:04:23 -06:00
Shreya Shankar 9eee1a2553
feat: add claude-code skill (#469)
* feat: add claude-code skill

* feat: add claude-code skill

* get ready to bump up version
2025-12-27 22:23:30 -06:00
Shreya Shankar fa86e23cfb
add limit param to llm ops (#466) 2025-12-26 19:46:28 -08:00
Shreya Shankar 56a1b7e794
Feat: Add first-class LanceDB retrievers and prompt augmentation across LLM operators (#460)
* feat: adding retrievers

* feat: allow indexes to be built on datasets created by docetl pipelines

* Testing retriever and updating docs with logging
2025-12-26 16:48:48 -06:00
Lindsey Wei 81d110404d
Add MOAR optimizer to docetl (#464)
* feat: adding conditional gleaning (#375)

* fix: improve caching and don't raise error for bad gather configs

* fix: improve caching and don't raise error for bad gather configs

* feat: adding conditional gleaning

* chore: bump up fastapi and python multipart (#376)

* merge

* chore: bump up fastapi and python multipart

* chore: bump up fastapi and python multipart

* pipeline for chaining

* chaining + gleaning(map)

* Clean up and reorganize pytest tests (#377)

* Replace api_wrapper with runner in test fixtures and configurations

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor test fixtures and reorganize configuration in test files

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* Refactor api.py for structured output (#378)

* Refactor API wrapper with modular design for LLM calls and output handling

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor APIWrapper: Simplify LLM call logic and improve modularity

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor output mode handling in APIWrapper with flexible configuration

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add comprehensive tests for DocETL output modes with synthetic data

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor output modes tests with improved pytest structure and DSLRunner

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Fix runtime errors

* Add nested JSON parsing for string values in API response

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Handle nested JSON parsing by extracting matching key values

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Simplify JSON parsing logic in API utility functions

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add to tests

* Add documentation for DocETL output modes and configuration options

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add docs

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* Add operators to pandas API (#379)

* baseline w/ 3 rewrites

* added sample

* 3 choices

* feat: add global bypass cache (#383)

* baseline experiments

* baseline experiments

* update directives

* MCTS random expand

* MCTS v1

* MCTS V1 w/ instantiation check on chaining

* MCTS V2

* remove key

* delete key

* update evaluation code

* >1 instantiation for chaining & true f1 score

* true f1 score as acc

* hypervolume as value algo

* Add directives folder from PR #1

* tests for directives

* updates on instantiate_schemas.py

* adding experiment folder from P1

* added init for mcts

* partially merged PR1

* fix instantiate schema

* added chunking directives

* passed directive tests

* fix bug in doc_chunking instantiate

* fix doc_chunking bug

* update prompt for chunk_header_summary

* update gitignore

* fix mcts bugs when rebasing

* add chunk sampling directive

* print AUC at end

* reward update
merged

* fixed bug in doc_compression and avoid using chunk_header_summary twice

* fixed chunk sampling

* adding blackvault dataset and refactoring hard coded cuad statistics

* scaled cost, progressive widening

* feat: add head tail directive

* feat: add head tail directive

* feat: add head tail directive

* merge chunk sampling + doc chunking into one directive. also modify sample operator to stratify by multiple keys

* add game reviews experiment and fix doc chunking

* trim down some fat in game reviews eval logic

* syntax check for stratify key

* added split_key check and fixed bug in take_head_tail

* added **kwargs

* remove reduce in op fusion

* adding medec workload

* document_key check for doc_compression & load schema data

* add sustainability expt

* add reduce chaining

* update expand retry & op fusion

* remove hard coding of blackvault

* remove hard coding of blackvault

* adding agent for instantiation (that is allowed to read docs), and a clarify instructions directive

* adding agent for instantiation (that is allowed to read docs), and a clarify instructions directive

* add swap with code directive

* optimizing cost in first 5 iterations

* added mcts log and remove plot

* add memory along tree path and action reward in log

* add biodex and have train and test splits

* add biodex and have train and test splits

* added map reduce fusion, small changes to mcts

* added reference point for calculating hypervolume in evaluation

* fix bug in calculating hypervolume

* integrate with modal

* add run_all.py script

* add run_all.py script

* fix bug in baseline applying ops

* update mcts chat history

* update mcts chat history

* control message length to fit context window

* fix agent baseline loop

* remove print statement

* update run baseline

* adding extremely simple agent baseline that just gives us entirely new pipelines

* adding extremely simple agent baseline that just gives us entirely new pipelines

* update readme with simple / naive agent baseline

* update

* change modal image

* edit gitignore to operators doc

* adding retrieval based rewrite directive

* debug simple baseline

* fix baseline agent and top k chunking directive

* fix baseline agent and top k chunking directive

* fix baseline agent and top k chunking directive

* fix chunking topk directive and add hierarchical reduce

* merge the original plan execution of three methods

* add plot and hypervolume calculation

* feat: add cascade filtering directive

* feat: add arbitrary rewrite directive

* remove unnecessary field

* fixed fall back method for baseline & mcts

* adding some validation for arbitrary rewrite

* update sample operation

* change model choice

* updates on plot

* fix gpt-5 temperature in query engine

* update model choice in base

* use rp@5 for biodex eval

* separate cost & acc optimize change model directive

* add abacus to plot

* added memo table for each node and provide it to the agent for directive selection

* add facility dataset (we don't need to use for paper) and lotus baselines

* edit prompts in change model directive

* change models

* added rules on compositions of directives. clean up mcts using utils

* add multiple instantiation for selected directives

* fix gemini errors

* adding PZ baselines (still in progress)

* add all PZ baselines to main

* add all PZ baselines to main

* fix bugs about multiple instantiations

* add utility to plot the test set

* fix linter errors

* fix PZ script for medec dataset

* add original pipeline on test set to plot

* change color of mcts

* test all model on the input query before search

* add search cost calculation

* fixed bug in json file path

* add concurrency control; start searching from pareto model plans

* set reward to be vertical distance to the step frontier

* newest mcts version

* small changes during exp

* change to run lotus, add bootstrap

* Add needs_code_filter variable for operator fusion

* add validation step; randomly select when values have ties

* eval function change

* clean up run test frontier

* Delete bootstrap_evaluation.py

* Delete BioDEX_evaluate.py

* Delete CUAD_evaluate.py

* Delete CUAD_sample50.py

* Delete bootstrapping.py

* Delete evaluate_blackvault.py

* Delete exp_graph.py

* Delete exp_graph_max.py

* Delete re_evaluate_zero_scores.py

* Delete split_json_data.py

* Delete test_evaluation.py

* Delete test_gemini_models.py

* Delete test_operations_modal.py

* Delete test_validate_frontier.py

* Delete validate_pareto_frontier.py

* Delete docetl/reasoning_optimizer/Untitled-1.py

* Delete docetl/BioDEX_test.py

* Delete docetl/mcts/execute_res_HV directory

* Delete docetl/mcts/graph_baseline.py

* Delete docetl/mcts/graph.py

* remove acc comparator

* delete graph

* delete irrelevant files

* mcts clean up

* update naming

* clean up simple agent

* update readme

* add user specified eval function

* Delete experiments/reasoning/run_tests.py

* Delete experiments/reasoning/run_test_frontier.py

* Delete experiments/reasoning/run_baseline.py

* Delete experiments/reasoning/run_all.py

* Delete experiments/reasoning/plot_result.py

* Delete experiments/reasoning/plot_matrix.py

* Delete experiments/reasoning/combine_biodex_test_results.py

* Delete experiments/reasoning/create_biodex_test_summary.py

* Delete experiments/reasoning/generate_biodex_summary.py

* Delete experiments/reasoning/TEST_FRONTIER_README.md

* Delete experiments/reasoning/utils.py

* Delete experiments/reasoning/README.md

* Delete experiments/reasoning/othersystems directory

* Delete experiments/reasoning/outputs/blackvault_baseline

* Delete experiments/reasoning/outputs/blackvault_mcts

* Delete experiments/reasoning/outputs/cuad_lotus_evaluation.json

* Delete compute_words_per_document.py

* Delete docetl/moar/acc_comparator.py

* Delete docetl/moar/acc_comparator.py.backup

* Delete docetl/moar/instantiation_check.py

* Delete docetl/reasoning_optimizer/build_optimization.py

* Delete docetl/reasoning_optimizer/generate_rewrite_plan.py

* remove auc

* simplify util

* Delete experiments/reasoning/data directory

* requirements

* eval function cleaning

* update readme

* update readme

* update readme

* remove facility

* update readme

* Update api to be consistent with main

* remove unused files

* clean up relative imports and model choices

* change all printing to use the rich console

* add documentation for moar and CLI

* move evaluation code from experiments dir to moar dir

* fix documentation

---------

Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-110-173.wifi.berkeley.edu>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-109-39.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@lindseys-mbp-7.lan>
Co-authored-by: linxiwei <lindseywei@visitor-10-57-111-186.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-110-112.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-44-111-72.wifi.berkeley.edu>
Co-authored-by: Lindsey Wei <152750390+LindseyyyW@users.noreply.github.com>
Co-authored-by: linxiwei <lindseywei@wifi-10-44-110-253.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-110-92.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-108-154.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@wifi-10-41-109-198.wifi.berkeley.edu>
Co-authored-by: linxiwei <lindseywei@Lindseys-MacBook-Pro-7.local>
Co-authored-by: linxiwei <lindseywei@192.168.0.110>
2025-11-28 13:47:42 -06:00
Shreya Shankar 0110071cd5
fix: add code ops and extract to python api (#463) 2025-11-24 14:18:59 -06:00
Shreya Shankar a184a3c1e9
fix: add code ops and extract to python api (#462)
* fix: add code ops and extract to python api

* fix: add code ops and extract to python api
2025-11-24 14:10:07 -06:00
Shreya Shankar c3ef5684d5
Add fallback models documentation to user guide (#454)
* Add fallback models documentation and configuration

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor: Remove operation-specific models from fallback docs

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Docs: Add content warning errors to fallback model triggers

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Remove outdated best practices from fallback models documentation

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-14 13:32:31 -08:00
Shreya Shankar ea0013d7fd
Implement LiteLLM fallback models for reliability (#453)
* feat: Add LiteLLM Router for fallback models

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add model to litellm_params for LiteLLM Router

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor: Prioritize operation model in LiteLLM Router fallbacks

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor: Cache LiteLLM Routers in APIWrapper

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor: Separate completion and embedding routers

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add fallback models example configuration

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* feat: Add fallback models to LiteLLM Router

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor Router fallbacks to use list of dicts

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Remove example fallback config file

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Feat: Add version constraint for paddlepaddle

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-14 13:17:47 -08:00
Shreya Shankar 50e45cf7db
Refactor: Use num_tokens instead of token_count (#430)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-09-15 15:32:33 -07:00
Shreya Shankar d1dab86466
Fix cosine similarity blocking in resolve operation (#428)
* fix: resolve blocking

* fix: resolve blocking
2025-09-14 20:24:58 -07:00
Shreya Shankar 4cfd371744
feat: add topk implementation (#410) 2025-08-13 13:27:54 -07:00
Shreya Shankar 6dafd46fba
Update Pandas API to Use New Output Parameter Format (#409)
* chore: update pandas api and docs

* chore: update pd accessors
2025-08-13 10:41:49 -07:00
Shreya Shankar 79542259cb
Refactor sample operation for multiple stratify keys (#408)
* Enhance sample operation with multi-key stratification and per-group sampling

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor SampleOperation with improved validation and sampling methods

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor sample operation with improved stratification and simplified config

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-13 10:16:55 -07:00
Shreya Shankar f6ead4ea6b
Switch from poetry to uv (#402)
* Switch from Poetry to uv for dependency management and packaging

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Update README with uv installation and dependency management instructions

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Optimize Docker build: improve dependency installation and caching

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* verify that uv works on my local installation

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-08-09 13:20:32 -07:00
Shreya Shankar 84c4807009
chore: switching to cloudbank for blob storage (#400)
* add seo

* add seo

* switching to cloudbank for blob storage
2025-07-31 14:28:00 -07:00
Shreya Shankar 0f0253be16
Add structured output mode support to pandas API (#394)
* Add structured output mode support to pandas API

This commit addresses issue #393 by adding the ability to specify structured
output mode in the pandas semantic accessor.

Key changes:
- Enhanced map() method to accept 'output' parameter with schema and mode
- Added support for 'structured_output' mode alongside existing 'tools' mode
- Maintained backward compatibility with existing 'output_schema' parameter
- Added comprehensive parameter validation and error handling

New API format:
```python
df.semantic.map(
    prompt="Extract data: {{input.text}}",
    output={
        "schema": {"name": "str", "age": "int"},
        "mode": "structured_output"  # Optional, defaults to "tools"
    }
)
```

Tests added:
- test_semantic_map_structured_output: Tests new structured output functionality
- test_semantic_map_invalid_output_mode: Tests output mode validation
- test_semantic_map_structured_output_vs_tools: Compares both output modes
- test_semantic_map_backward_compatibility: Ensures old API still works
- test_semantic_map_parameter_validation: Comprehensive parameter validation
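
The parameter handling these tests exercise can be pictured with a minimal sketch. The helper below is hypothetical and not DocETL's actual code: the name `normalize_output`, the error messages, and the choice to reject calls that pass both `output` and `output_schema` are assumptions made for illustration. Only the two mode names, the `"tools"` default, and the legacy `output_schema` parameter come from the commit message above.

```python
from typing import Optional

# Mode names taken from the commit message; "tools" is the stated default.
VALID_MODES = {"tools", "structured_output"}


def normalize_output(
    output: Optional[dict] = None,
    output_schema: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: fold the new `output` argument and the legacy
    `output_schema` argument into one {"schema": ..., "mode": ...} dict."""
    # Assumption: passing both forms at once is treated as an error.
    if output is not None and output_schema is not None:
        raise ValueError("Pass either 'output' or 'output_schema', not both")

    # Legacy path: a bare schema keeps working and uses the default mode.
    if output is None:
        if output_schema is None:
            raise ValueError("An output schema is required")
        return {"schema": output_schema, "mode": "tools"}

    # New path: the schema is required, the mode is optional.
    if "schema" not in output:
        raise ValueError("'output' must contain a 'schema' key")
    mode = output.get("mode", "tools")
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid output mode: {mode!r}")
    return {"schema": output["schema"], "mode": mode}
```

Under these assumptions, `normalize_output(output_schema={"name": "str"})` and `normalize_output(output={"schema": {"name": "str"}})` yield the same dict, which is the backward-compatibility property the commit describes.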

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update pandas documentation for structured output mode

- Updated pandas/index.md to document new output parameter format
- Added section explaining output modes (tools vs structured_output)
- Updated examples throughout pandas/operations.md and pandas/examples.md
- Maintained backward compatibility examples
- Added guidance on when to use structured output mode

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-07-22 08:54:44 -07:00
Sid Jha 19a8286978
Improve syntax_check by leveraging pydantic validation (#392)
* Work on other operators

* Fix

* Add ValidationInfo

* Bug fix

* docs: fix errors in equijoin documentation

---------

Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2025-07-20 18:54:59 -07:00
Shreya Shankar 9028efc8d4
Update cluster documentation to use inputs iteration in summary prompt (#386)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-07-14 18:12:22 -07:00
Shreya Shankar d995c534b5
feat: add global bypass cache (#383) 2025-07-08 16:53:40 -07:00
Shreya Shankar 9b836ee228
Add operators to pandas API (#379) 2025-07-04 11:59:24 -07:00
Shreya Shankar 1e4709a112
Refactor api.py for structured output (#378)
* Refactor API wrapper with modular design for LLM calls and output handling

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor APIWrapper: Simplify LLM call logic and improve modularity

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor output mode handling in APIWrapper with flexible configuration

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add comprehensive tests for DocETL output modes with synthetic data

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Refactor output modes tests with improved pytest structure and DSLRunner

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Fix runtime errors

* Add nested JSON parsing for string values in API response

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Handle nested JSON parsing by extracting matching key values

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Simplify JSON parsing logic in API utility functions

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add to tests

* Add documentation for DocETL output modes and configuration options

Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>

* Add docs

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-07-04 10:58:15 -07:00
Shreya Shankar 9a72a6b729
chore: bump up fastapi and python multipart (#376)
* merge

* chore: bump up fastapi and python multipart

* chore: bump up fastapi and python multipart
2025-07-02 07:37:06 -07:00
Shreya Shankar b8d2beb602
feat: adding conditional gleaning (#375)
* fix: improve caching and don't raise error for bad gather configs

* fix: improve caching and don't raise error for bad gather configs

* feat: adding conditional gleaning
2025-07-01 22:40:38 -07:00
Shreya Shankar 7071ade539
fix: improve caching and don't raise error for bad gather configs (#373)
* merge

* fix: improve caching and don't raise error for bad gather configs

* fix: improve caching and don't raise error for bad gather configs

* fix: improve caching and don't raise error for bad gather configs
2025-06-30 23:19:17 -07:00
Shreya Shankar ea60f38afd
docs: improve gleaning description (#371)
* docs: improve gleaning description

* docs: improve gleaning description

* docs: improve gleaning description

* docs: fix spacing in gleaning docs
2025-06-26 21:25:26 -07:00
Shreya Shankar d156351c69
docs: improve gleaning description (#370)
* docs: improve gleaning description

* docs: improve gleaning description

* docs: improve gleaning description
2025-06-26 21:19:51 -07:00
Shreya Shankar 631ea0c34f
feat: Add calibration support to map operations for improved consistency (#365)
* chore: run pre-commit

* feat: add calibration to map ops

* feat: add calibration to map ops

* fix: comment out flaky test
2025-06-14 23:04:13 -07:00
Shreya Shankar 8a7b4e1566
feat: add extract operator (#361)
* feat: add extract operator

* feat: add extract operator

* feat: add extract operator

* feat: add extract operator
2025-05-13 17:43:46 -07:00
Shreya Shankar bb5bdff9d1
feat: adding `api_base` to yaml (#359)
* feat: adding api base to yaml

* feat: adding api base to yaml

* docs: add api base docs
2025-05-11 18:40:29 -07:00
Shreya Shankar 1a41df6cb5
fix: add system prompt and other config vars to python api (#356) 2025-05-03 18:05:22 -07:00
Shreya Shankar 3781edebfa
docs: update for rank op (#344)
* update documentation for rank op

* update documentation for rank op
2025-04-21 18:24:37 -07:00
Shreya Shankar 9ad27afa12
update documentation for rank op (#343) 2025-04-21 18:22:21 -07:00
Shreya Shankar 0b2d5b0324
feat: add rank operation card in docwrangler (#341)
* feat: add rank operation card in docwrangler

* feat: add rank operation card in docwrangler

* feat: add rank operation card in docwrangler
2025-04-21 16:02:19 -07:00
Shreya Shankar c960d6fcc0
feat: add rank operation (#340)
* feat: add order by operator

* adding a bunch of order functions

* refactor: move around tests for rank operator

* add anthropic hh test for rank

* refactor: move around tests for rank operator

* refactor: move around tests for rank operator
2025-04-21 13:48:41 -07:00
Shreya Shankar d20c3982e4
feat: add TPM rate limits & pipeline settings to DocWrangler (#338)
* add TPM rate limit

* feat: add rate limiting in docwrangler; tpm

* feat: add rate limiting in docwrangler; tpm

* feat: add rate limiting in docwrangler; tpm
2025-04-19 22:38:48 -07:00
Shreya Shankar 1c4e181201
docs: add python api quickstart (#337)
* docs: add python api quickstart

* docs: add python api quickstart
2025-04-19 09:49:50 -07:00
Shreya Shankar e7fd306d99
Add python docs (#334) 2025-04-17 12:23:24 -07:00
shabie 4793c89e92
feat: add flag to stream map operation outputs to disk (#323)
* feat: add flag to stream map operation outputs to disk

* flush partial results default False

* comment out print statement in test basic map

* rewrite test to not use batched config

---------

Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2025-03-19 19:17:56 -06:00
Shreya Shankar fa5460b4c0
feat: add n parameter to output (#320)
* feat: add n parameter to output

* feat: add n parameter to output

* feat: add n parameter to output

* feat: add n parameter to output
2025-03-13 23:25:03 -07:00
Shreya Shankar 080e7f75ec
feat: refactoring map optimizer (#311) 2025-02-18 12:54:45 -08:00
Shreya Shankar f4abe55191
feat: support pdf upload and add tutorial (#309) 2025-02-09 21:19:17 -08:00
Shreya Shankar 2a259a0d93
feat: add pandas df accessor (#287)
* feat: add pandas df accessor

* feat: add pandas df accessor

* feat: add pandas df accessor
2025-01-24 16:54:49 -08:00
Shreya Shankar 05c4357c59
docs: add verbose param (#283) 2025-01-22 12:23:46 +01:00
Shreya Shankar eb995fdc77
feat: add llmstxt (#267)
* feat: adding llms.txt

* feat: adding llms.txt

* feat: adding llms.txt

* feat: adding llms.txt
2025-01-07 21:43:17 -08:00
Shreya Shankar 8b3e1ce640
fix: bypass vercel serverless functions when uploading datasets (#266) 2025-01-06 20:52:37 -08:00
Shreya Shankar f38fbb8960
docs: update playground docs (#259)
* rebrand to docwrangler

* refactor: rebranding to docwrangler

* refactor: rebranding to docwrangler

* refactor: edit vercel.json

* fix: map optimizer should work

* docs: update playground docs
2025-01-02 00:33:05 -08:00
Shreya Shankar 662d6a2c5a
feat: add azure openai for the FE assistants (#256)
* chore: ui nits

* chore: ui nits

* feat: add tutorial for the supreme court hearings

* fix: output csv writing should work even if not all documents have all the keys

* feat: add azure openai for prompt improvement and chat, with logging

* Add system prompts to documentation

* fix: add setSystemPrompt to restore pipeline context

* docs: add vercel json
2025-01-01 11:41:43 -08:00
Shreya Shankar af70998e6f
chore: add param to skip LLM calls when they fail (#255)
* feat: add skip_on_error param to map operations

* chore: add more helpful logging

* chore: prettier printing

* better logging when skipping on error

* better logging when skipping on error
2024-12-29 23:53:37 -06:00
Rohit Rawat 0e077aa740
added enum support (#254)
* added enum support

* tests: add test for enum type output

* docs: update docs to support enum type schemas

---------

Co-authored-by: Shreya Shankar <ss.shankar505@gmail.com>
2024-12-26 17:50:23 -06:00