open-notebook/open_notebook/utils
LUIS NOVO 71b8d13b24 docs: generate comprehensive CLAUDE.md reference documentation across codebase
Create a hierarchical CLAUDE.md documentation system for the entire Open Notebook
codebase with focus on concise, pattern-driven reference cards rather than
comprehensive tutorials.

## Changes

### Core Documentation System
- Updated `.claude/commands/build-claude-md.md` to distinguish between leaf and
  parent modules, with special handling for prompt/template modules
- Established clear patterns:
  * Leaf modules (40-70 lines): Components, hooks, API clients
  * Parent modules (50-150 lines): Architecture, cross-layer patterns, data flows
  * Template modules: Pattern focus, not catalog listings

### Generated Documentation
Created 15 CLAUDE.md reference files across the project:

**Frontend (React/Next.js)**
- frontend/src/CLAUDE.md: Architecture overview, data flow, three-tier design
- frontend/src/lib/hooks/CLAUDE.md: React Query patterns, state management
- frontend/src/lib/api/CLAUDE.md: Axios client, FormData handling, interceptors
- frontend/src/lib/stores/CLAUDE.md: Zustand state persistence, auth patterns
- frontend/src/components/ui/CLAUDE.md: Radix UI primitives, CVA styling

**Backend (Python/FastAPI)**
- open_notebook/CLAUDE.md: System architecture, layer interactions
- open_notebook/ai/CLAUDE.md: Model provisioning, Esperanto integration
- open_notebook/domain/CLAUDE.md: Data models, ObjectModel/RecordModel patterns
- open_notebook/database/CLAUDE.md: Repository pattern, async migrations
- open_notebook/graphs/CLAUDE.md: LangGraph workflows, async orchestration
- open_notebook/utils/CLAUDE.md: Cross-cutting utilities, context building
- open_notebook/podcasts/CLAUDE.md: Episode/speaker profiles, job tracking

**API & Other**
- api/CLAUDE.md: REST layer, service architecture
- commands/CLAUDE.md: Async command handlers, job queue patterns
- prompts/CLAUDE.md: Jinja2 templates, prompt engineering patterns (refactored)

**Project Root**
- CLAUDE.md: Project overview, three-tier architecture, tech stack, getting started

### Key Features
- Zero duplication: Parent modules reference child CLAUDE.md files, don't repeat them
- Pattern-focused: Emphasizes how components work together, not component catalogs
- Scannable: Short bullets, code examples only when necessary (1-2 per file)
- Practical: "How to extend" guides, quirks/gotchas for each module
- Navigation: Root CLAUDE.md acts as hub pointing to specialized documentation

### Cleanup
- Removed unused `batch_fix_services.py`
- Removed deprecated `open_notebook/plugins/podcasts.py`
- Updated .gitignore for documentation consistency

## Impact
New contributors can now:
1. Read root CLAUDE.md for system architecture (5 min)
2. Jump to specific layer documentation (frontend, api, open_notebook)
3. Dive into module-specific patterns in child CLAUDE.md files (1 min per module)
All documentation is lean, reference-focused, and avoids duplication.
2026-01-03 16:27:52 -03:00
..
CLAUDE.md docs: generate comprehensive CLAUDE.md reference documentation across codebase 2026-01-03 16:27:52 -03:00
README.md
__init__.py
context_builder.py
text_utils.py fix: strip <think> tags from chat responses 2025-12-18 16:31:23 -05:00
token_utils.py fix: enhance chat reference links and prevent text overflow (#173) 2025-10-19 15:38:59 -03:00
version_utils.py

README.md

ContextBuilder

A flexible and generic ContextBuilder class for the Open Notebook project that can handle any parameters and build context from sources, notebooks, insights, and notes.

Features

  • Flexible Parameters: Accepts any parameters via **kwargs for future extensibility
  • Priority-based Management: Automatic prioritization and sorting of context items
  • Token Counting: Built-in token counting and truncation to fit limits
  • Deduplication: Automatic removal of duplicate items based on ID
  • Type-based Grouping: Separates sources, notes, and insights in output
  • Async Support: Fully async for database operations

Basic Usage

from open_notebook.utils.context_builder import ContextBuilder, ContextConfig

# Simple notebook context
builder = ContextBuilder(notebook_id="notebook:123")
context = await builder.build()

# Single source with insights
builder = ContextBuilder(
    source_id="source:456",
    include_insights=True,
    max_tokens=2000
)
context = await builder.build()

Convenience Functions

from open_notebook.utils.context_builder import (
    build_notebook_context,
    build_source_context,
    build_mixed_context
)

# Build notebook context
context = await build_notebook_context(
    notebook_id="notebook:123",
    max_tokens=5000
)

# Build single source context
context = await build_source_context(
    source_id="source:456",
    include_insights=True
)

# Build mixed context
context = await build_mixed_context(
    source_ids=["source:1", "source:2"],
    note_ids=["note:1", "note:2"],
    max_tokens=3000
)

Advanced Configuration

from open_notebook.utils.context_builder import ContextConfig

# Custom configuration
config = ContextConfig(
    sources={
        "source:doc1": "insights",
        "source:doc2": "full content", 
        "source:doc3": "not in"  # Exclude
    },
    notes={
        "note:summary": "full content",
        "note:draft": "not in"  # Exclude
    },
    include_insights=True,
    max_tokens=3000,
    priority_weights={
        "source": 120,  # Higher priority
        "note": 80,     # Medium priority  
        "insight": 100  # High priority
    }
)

builder = ContextBuilder(
    notebook_id="notebook:project",
    context_config=config
)
context = await builder.build()

Programmatic Item Management

from open_notebook.utils.context_builder import ContextItem

builder = ContextBuilder()

# Add custom items
item = ContextItem(
    id="source:important",
    type="source",
    content={"title": "Key Document", "summary": "..."},
    priority=150  # Very high priority
)
builder.add_item(item)

# Apply management operations
builder.remove_duplicates()
builder.prioritize()
builder.truncate_to_fit(1000)

context = builder._format_response()

Flexible Parameters

The ContextBuilder accepts any parameters via **kwargs, making it extensible for future features:

builder = ContextBuilder(
    notebook_id="notebook:123",
    include_insights=True,
    max_tokens=2000,
    
    # Custom parameters for future extensions
    user_id="user:456",
    custom_filter="advanced",
    experimental_feature=True
)

# Access custom parameters
user_id = builder.params.get('user_id')

Output Format

The ContextBuilder returns a structured response:

{
    "sources": [...],           # List of source contexts
    "notes": [...],             # List of note contexts  
    "insights": [...],          # List of insight contexts
    "total_tokens": 1234,       # Total token count
    "total_items": 10,          # Total number of items
    "notebook_id": "notebook:123",  # If provided
    "metadata": {
        "source_count": 5,
        "note_count": 3,
        "insight_count": 2,
        "config": {
            "include_insights": true,
            "include_notes": true,
            "max_tokens": 2000
        }
    }
}

Architecture

The ContextBuilder follows these design principles:

  1. Separation of Concerns: Context building, item management, and formatting are separate
  2. Extensibility: Uses **kwargs and flexible configuration for future features
  3. Performance: Token-aware truncation and efficient deduplication
  4. Type Safety: Proper type hints and data classes for structure
  5. Error Handling: Graceful handling of missing items and database errors

Integration

The ContextBuilder integrates seamlessly with the existing Open Notebook architecture:

  • Uses existing domain models (Source, Notebook, Note)
  • Leverages the repository pattern for database access
  • Follows the same async patterns as other services
  • Integrates with the token counting utilities

Error Handling

The ContextBuilder handles errors gracefully:

  • Missing notebooks/sources/notes are logged but don't stop execution
  • Database errors are wrapped in DatabaseOperationError
  • Invalid parameters raise InvalidInputError
  • All errors include detailed context information