* feat: Add user confirmation for non-Jinja prompts
This commit introduces a confirmation step for prompts that do not contain Jinja2 syntax. It also modifies strict_render to automatically append document context when Jinja syntax is absent.
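The behavior described above could look roughly like the following. This is a minimal sketch, not the actual docetl code: only the names has_jinja_syntax and strict_render come from this commit, while the regular expression, the append_context_if_plain helper, and the format of the appended context are assumptions made for illustration.

```python
import json
import re

# Assumed heuristic: treat {{ ... }}, {% ... %}, or {# ... #} as Jinja2 syntax.
_JINJA_PATTERN = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}", re.DOTALL)


def has_jinja_syntax(prompt: str) -> bool:
    """Return True if the prompt appears to contain Jinja2 delimiters."""
    return bool(_JINJA_PATTERN.search(prompt))


def append_context_if_plain(prompt: str, document: dict) -> str:
    """Hypothetical helper: add the document as context to a non-Jinja prompt."""
    if has_jinja_syntax(prompt):
        # Templated prompts are left untouched for the normal rendering path.
        return prompt
    # Assumed context format: the document serialized as indented JSON.
    return f"{prompt}\n\nDocument:\n{json.dumps(document, indent=2)}"
```

A prompt such as "Summarize {{ input.text }}" would be returned unchanged, while "Summarize the document" would get the serialized document appended.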
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Move DOCETL_CONSOLE import to function scope
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor: Move has_jinja_syntax to docetl.utils
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* feat: Add web scraper UI and API integration
Implement a web scraper with a UI for user interaction and an API for data collection.
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* feat: Integrate Modal for code execution and data scraping
This commit integrates Modal for executing Python code in a sandbox environment. It also adds the Tavily search tool and updates the system prompt to guide the AI in data scraping workflows.
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* feat: Implement agent loop control and GPT-5 default
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refactor scraper UI to show form conditionally
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* add web scraper code
* finish web scraper tool
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* feat: Add pipeline visualization builder component
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* feat: Add visual editor for pipeline configuration
This commit introduces a visual editor for configuring pipelines, allowing users to manipulate blocks, properties, and stacks through a user-friendly interface. The JSON editor remains available as an alternative.
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Add visualization builder
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* feat: Add output type validation to map and reduce operations
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Refine API error message for validation failures
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Test map type validation for integer answers
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* fix test for validating schemas
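For illustration, output type validation of this kind could be sketched as below. The function name, the type-string mapping, and the error messages are assumptions, not the code added in these commits; the sketch only shows the general idea of checking each output field against its declared type, including the integer case mentioned above.

```python
# Assumed mapping from schema type strings to Python types (illustrative subset).
_TYPE_MAP = {"str": str, "int": int, "float": (int, float), "bool": bool}


def validate_output_types(output: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the output matches."""
    errors = []
    for key, type_name in schema.items():
        if key not in output:
            errors.append(f"missing key '{key}'")
            continue
        expected = _TYPE_MAP.get(type_name)
        if expected is None:
            continue  # unknown type strings are skipped in this sketch
        value = output[key]
        # bool is a subclass of int, so reject it explicitly for non-bool fields.
        if isinstance(value, bool) and type_name != "bool":
            errors.append(f"'{key}' should be {type_name}, got bool")
        elif not isinstance(value, expected):
            errors.append(f"'{key}' should be {type_name}, got {type(value).__name__}")
    return errors
```

With a schema of {"answer": "int"}, the output {"answer": 3} passes, while {"answer": "3"} yields one error.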
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
The output schema has just three fields, but the formatting instructions
and the example in the prompt include extra fields. As these extra
fields are not used downstream in the pipeline, this patch removes them
from the prompt.
* Checkpoint before follow-up message
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Remove temperature for GPT-5 models in gleaning process
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* Switch from Poetry to uv for dependency management and packaging
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Update README with uv installation and dependency management instructions
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* Optimize Docker build: improve dependency installation and caching
Co-authored-by: ss.shankar505 <ss.shankar505@gmail.com>
* verify that uv works on my local installation
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* Add structured output mode support to pandas API
This commit addresses issue #393 by adding the ability to specify structured
output mode in the pandas semantic accessor.
Key changes:
- Enhanced map() method to accept 'output' parameter with schema and mode
- Added support for 'structured_output' mode alongside existing 'tools' mode
- Maintained backward compatibility with existing 'output_schema' parameter
- Added comprehensive parameter validation and error handling
New API format:
```python
df.semantic.map(
    prompt="Extract data: {{input.text}}",
    output={
        "schema": {"name": "str", "age": "int"},
        "mode": "structured_output",  # Optional, defaults to "tools"
    },
)
```
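The parameter handling listed under the key changes could be approximated as follows. This is a sketch under stated assumptions: resolve_output_config is a hypothetical helper, and the exact error messages and the choice to reject both parameters together are guesses, not the accessor's confirmed behavior. Only the 'output' and 'output_schema' parameter names, the two mode names, and the "tools" default come from this commit message.

```python
VALID_MODES = {"tools", "structured_output"}


def resolve_output_config(output=None, output_schema=None):
    """Hypothetical helper: normalize the new and legacy parameters to one dict."""
    if output is not None and output_schema is not None:
        # Assumption: passing both forms at once is treated as an error.
        raise ValueError("Pass either 'output' or 'output_schema', not both")
    if output is None:
        if output_schema is None:
            raise ValueError("One of 'output' or 'output_schema' is required")
        # Legacy form: schema only, with the default mode.
        return {"schema": output_schema, "mode": "tools"}
    if not isinstance(output, dict) or "schema" not in output:
        raise ValueError("'output' must be a dict with a 'schema' key")
    mode = output.get("mode", "tools")
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid output mode: {mode!r}")
    return {"schema": output["schema"], "mode": mode}
```

Normalizing both forms into one dictionary keeps the rest of the map() logic independent of which parameter the caller used.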
Tests added:
- test_semantic_map_structured_output: Tests new structured output functionality
- test_semantic_map_invalid_output_mode: Tests output mode validation
- test_semantic_map_structured_output_vs_tools: Compares both output modes
- test_semantic_map_backward_compatibility: Ensures old API still works
- test_semantic_map_parameter_validation: Comprehensive parameter validation
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update pandas documentation for structured output mode
- Updated pandas/index.md to document new output parameter format
- Added section explaining output modes (tools vs structured_output)
- Updated examples throughout pandas/operations.md and pandas/examples.md
- Maintained backward compatibility examples
- Added guidance on when to use structured output mode
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>