Commit f520b40

Fix looking up anchors from non-canonical URIs.
E.g. if we have http://example.com, a retrieval URI, we need to be able to look up anchors within it even if the resource has some internal ID.
1 parent c9de6c2 commit f520b40

3 files changed: 16 additions and 4 deletions

docs/changes.rst

Lines changed: 6 additions & 0 deletions
@@ -2,6 +2,12 @@
 Changelog
 =========
 
+v0.27.0
+-------
+
+* Support looking up anchors from non-canonical URIs.
+  In other words, if you add a resource at the URI ``http://example.com``, then looking up the anchor ``http://example.com#foo`` now works even if the resource has some internal ``$id`` saying its canonical URI is ``http://somethingelse.example.com``.
+
 v0.26.4
 -------
 

referencing/_core.py

Lines changed: 9 additions & 3 deletions
@@ -349,14 +349,20 @@ def remove(self, uri: URI):
 
     def anchor(self, uri: URI, name: str):
         """
-        Retrieve the given anchor, which must already have been found.
+        Retrieve a given anchor from a resource which must already be crawled.
         """
-        value = self._anchors.get((uri, name))
+        resource = self.get(uri)
+        if resource is None:
+            canonical_uri = uri
+        else:
+            canonical_uri = resource.id() or uri
+
+        value = self._anchors.get((canonical_uri, name))
         if value is not None:
             return Retrieved(value=value, registry=self)
 
         registry = self.crawl()
-        value = registry._anchors.get((uri, name))
+        value = registry._anchors.get((canonical_uri, name))
         if value is not None:
             return Retrieved(value=value, registry=registry)
         if "/" in name:
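
The effect of the change to `anchor` can be illustrated with a small standalone model. This is only a sketch: `ToyResource`, `ToyRegistry`, `add` and `anchor_before_fix` are hypothetical names invented for this example and are not part of `referencing`. It also assumes that anchors are indexed under the resource's canonical URI (its `$id`) when it declares one, which is what the fix implies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class ToyResource:
    """Hypothetical stand-in for a resource: just its parsed contents."""

    contents: Dict[str, Any]

    def id(self) -> Optional[str]:
        # The canonical URI the resource declares for itself, if any.
        return self.contents.get("$id")


@dataclass
class ToyRegistry:
    """Hypothetical simplified registry; not the library's real class."""

    resources: Dict[str, ToyResource] = field(default_factory=dict)
    anchors: Dict[Tuple[str, str], Any] = field(default_factory=dict)

    def add(self, uri: str, resource: ToyResource) -> None:
        # Store the resource under the URI it was retrieved from.
        self.resources[uri] = resource
        # Assumption: anchors are indexed under the canonical URI if present.
        base = resource.id() or uri
        for subschema in resource.contents.get("$defs", {}).values():
            if "$anchor" in subschema:
                self.anchors[(base, subschema["$anchor"])] = subschema

    def anchor_before_fix(self, uri: str, name: str) -> Any:
        # Old behaviour: look up using the URI exactly as given.
        return self.anchors.get((uri, name))

    def anchor(self, uri: str, name: str) -> Any:
        # New behaviour: translate the retrieval URI to the canonical one.
        resource = self.resources.get(uri)
        if resource is None:
            canonical_uri = uri
        else:
            canonical_uri = resource.id() or uri
        return self.anchors.get((canonical_uri, name))


schema = {
    "$id": "http://somethingelse.example.com",
    "$defs": {"foo": {"$anchor": "foo", "type": "string"}},
}
registry = ToyRegistry()
registry.add("http://example.com", ToyResource(schema))

missed = registry.anchor_before_fix("http://example.com", "foo")
found = registry.anchor("http://example.com", "foo")
```

In this model the old lookup misses because the anchor is stored under `http://somethingelse.example.com`, while the new lookup first maps the retrieval URI `http://example.com` to that canonical URI. A URI with no stored resource falls through unchanged, so looking up by the canonical URI directly keeps working.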
