[FEATURE] Implement extract_data skill #5

@edenreich

Description

Summary

Implement the extract_data skill to provide web scraping and data extraction functionality. This skill allows the agent to extract structured data from web pages using various selectors and output formats.
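As a rough illustration of what "multiple extractors in one operation" could look like, here is a minimal sketch. Everything in it is an assumption for discussion, not the skill's actual interface: the `Extractor` fields, the `extract_data` signature, and the use of the standard library's `xml.etree.ElementTree` (which only understands a limited XPath subset and requires well-formed markup). A real implementation would need a proper HTML parser and CSS selector support.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Extractor:
    """Hypothetical extractor spec; field names are illustrative only."""
    name: str
    selector: str             # limited XPath subset understood by ElementTree
    attribute: str = "text"   # "text", or an attribute name such as "href"
    multiple: bool = False    # return every match instead of only the first


def _read(element, attribute):
    # "text" means the concatenated text content; anything else is an attribute lookup.
    if attribute == "text":
        return "".join(element.itertext()).strip()
    return element.get(attribute)


def extract_data(markup, extractors):
    """Run every extractor against one document and collect results by name."""
    root = ET.fromstring(markup)
    result = {}
    for ex in extractors:
        values = [_read(el, ex.attribute) for el in root.findall(ex.selector)]
        if ex.multiple:
            result[ex.name] = values  # empty list when nothing matches
        else:
            result[ex.name] = values[0] if values else None  # None when missing
    return result


# Well-formed sample markup, chosen so the stdlib XML parser accepts it.
PAGE = """
<html><body>
  <h1>Products</h1>
  <ul>
    <li><a href="/a">Alpha</a></li>
    <li><a href="/b">Beta</a></li>
  </ul>
</body></html>
"""

data = extract_data(PAGE, [
    Extractor("title", ".//h1"),
    Extractor("links", ".//a", attribute="href", multiple=True),
    Extractor("price", ".//span[@class='price']"),  # not present in PAGE
])
print(data)
```

In this sketch a missing element yields `None` (single) or an empty list (multiple) rather than raising, which is one possible reading of "handle missing elements gracefully".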

Acceptance Criteria

  • Support CSS selector and XPath for data extraction
  • Extract various attributes (text, href, src, custom attributes)
  • Handle single and multiple element extraction
  • Support multiple extractors in one operation
  • Implement JSON, CSV, and text output formats
  • Handle missing elements gracefully
  • Support extraction from dynamically loaded content
  • Add data cleaning and normalization options
  • Implement extraction from tables and lists
  • Add comprehensive logging for extraction operations
  • Return structured data with metadata
  • Add unit tests covering various extraction scenarios
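Several of the criteria above (output formats, data cleaning, metadata, missing values) could fit together roughly as in the sketch below. The function names `clean` and `render`, the `source_url` parameter, and the shape of the metadata envelope are all hypothetical choices made for illustration; the issue does not specify them.

```python
import csv
import io
import json
import re
from datetime import datetime, timezone


def clean(value):
    """Collapse runs of whitespace and trim; None (missing element) passes through."""
    if value is None:
        return None
    return re.sub(r"\s+", " ", value).strip()


def render(records, fmt="json", source_url=None):
    """Serialize extracted records as JSON (with metadata), CSV, or plain text."""
    rows = [{k: clean(v) for k, v in r.items()} for r in records]
    if fmt == "json":
        envelope = {
            "data": rows,
            "metadata": {  # assumed metadata fields, not taken from the issue
                "source_url": source_url,
                "count": len(rows),
                "extracted_at": datetime.now(timezone.utc).isoformat(),
            },
        }
        return json.dumps(envelope, indent=2)
    if fmt == "csv":
        buf = io.StringIO()
        fields = list(rows[0].keys()) if rows else []
        writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)  # None is written as an empty cell
        return buf.getvalue()
    if fmt == "text":
        return "\n".join(
            ", ".join(f"{k}={'' if v is None else v}" for k, v in row.items())
            for row in rows
        )
    raise ValueError(f"unsupported format: {fmt}")


records = [
    {"name": "  Alpha \n widget", "price": "9.99"},
    {"name": "Beta", "price": None},  # price element was missing on the page
]
print(render(records, "csv"))
print(render(records, "text"))
```

Keeping cleaning and serialization separate from selector matching would also make the unit-test criterion easier to satisfy, since each format can be checked against fixed input records without fetching a page.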

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
