Selector h4+ p is incorrectly translated to xpath #28

@jstray

On the page https://www.kpu.ca/calendar/2018-19/courses/jrnl/index.html, I'm trying to select the course description paragraph that follows each course title. For example, "Students will explore how journalism fits in a media landscape..."

I can successfully highlight the appropriate elements in SelectorGadget by clicking on this paragraph, then clicking on one of the "Prerequisites" elements to exclude them. This results in the correct CSS selector: h4+ p

However, when I translate this to XPath, I get //h4+//p, which is not correct: in XPath, + is the arithmetic addition operator, not a sibling combinator. I would expect this to translate to something like //h4/following::p[1], which gives the correct result.
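To make the expected output concrete, here is a minimal sketch of how the adjacent-sibling combinator could be translated. It is not SelectorGadget's actual code; the function name `adjacent_sibling_to_xpath` is hypothetical, and it only handles bare tag names joined by `+`. Note that it emits `following-sibling::*[1][self::p]` rather than `following::p[1]`: the former matches only a p that is the immediately following sibling, which is what `h4 + p` means in CSS, while the latter picks the first p anywhere after the h4 in document order.

```python
import re


def adjacent_sibling_to_xpath(css: str) -> str:
    """Translate a CSS selector made only of tag names joined by '+' into XPath.

    Hypothetical helper for illustration; anything other than bare tag
    names and the '+' combinator is rejected.
    """
    tags = [t.strip() for t in css.split("+")]
    if not all(re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", t) for t in tags):
        raise ValueError(f"unsupported selector: {css!r}")

    # Start from any element with the first tag name.
    xpath = "//" + tags[0]
    for tag in tags[1:]:
        # CSS 'A + B': B must be the very next sibling element of A.
        xpath += f"/following-sibling::*[1][self::{tag}]"
    return xpath


print(adjacent_sibling_to_xpath("h4+ p"))
# prints: //h4/following-sibling::*[1][self::p]
```

On the page above, either form should select the description paragraphs, assuming each description is the element directly after its h4.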

We have been advising people to use SelectorGadget to write the expressions for the Xpath Extractor in Workbench (http://help.workbenchdata.com/steps/scrape/xpath-extractor) as a way to avoid learning XPath syntax, so it's unfortunate that this case is mistranslated.
