diff --git a/docs/concepts/how-it-works.md b/docs/concepts/how-it-works.md index 55cd2d1..edff926 100644 --- a/docs/concepts/how-it-works.md +++ b/docs/concepts/how-it-works.md @@ -68,14 +68,48 @@ We don't use LLMs or semantic similarity because: ## Reference Fetching +The validator supports multiple reference types: + ### PubMed (PMID) For `PMID:12345678`: 1. Queries NCBI E-utilities API 2. Fetches abstract and metadata -3. Parses XML response with BeautifulSoup -4. Caches as markdown with YAML frontmatter +3. Attempts to retrieve full-text from PMC if available +4. Parses XML response with BeautifulSoup +5. Caches as markdown with YAML frontmatter + +### DOI (Digital Object Identifier) + +For `DOI:10.1234/journal.article`: + +1. Queries Crossref API for metadata +2. Fetches abstract and bibliographic information +3. Extracts title, authors, journal, year +4. Caches abstract and metadata as markdown + +### URLs + +For `URL:https://example.com/page` or `https://example.com/page`: + +1. Makes HTTP GET request to fetch web page +2. Extracts title from `` tag +3. Converts HTML to plain text (removes scripts, styles, navigation) +4. Normalizes whitespace +5. Caches as markdown with content type `html_converted` + +**Use cases for URLs:** +- Online book chapters +- Educational resources +- Documentation pages +- Any static web content + +**Limitations:** +- Works best with static HTML content +- Does not execute JavaScript +- Cannot access content behind authentication +- Complex dynamic pages may not extract well ### PubMed Central (PMC) diff --git a/docs/how-to/validate-urls.md b/docs/how-to/validate-urls.md new file mode 100644 index 0000000..a85f617 --- /dev/null +++ b/docs/how-to/validate-urls.md @@ -0,0 +1,253 @@ +# Validating URL References + +This guide explains how to validate references that use URLs instead of traditional identifiers like PMIDs or DOIs. + +## Overview + +The linkml-reference-validator supports validating references that point to web content, such as: + +- Book chapters hosted online +- Educational resources +- Documentation pages +- Blog posts or articles +- Any static web content + +When a reference field contains a URL, the validator: + +1. Fetches the web page content +2. Extracts the page title +3. Converts HTML to plain text +4. Validates the extracted content against your supporting text + +## URL Format + +URLs can be specified in two ways: + +### Explicit URL Prefix + +```yaml +my_field: + value: "Some text from the web page..." + references: + - "URL:https://example.com/book/chapter1" +``` + +### Direct URL + +```yaml +my_field: + value: "Some text from the web page..." + references: + - "https://example.com/book/chapter1" +``` + +Both formats are equivalent. If a reference starts with `http://` or `https://`, it's automatically recognized as a URL reference. + +## Example + +Suppose you have an online textbook chapter at `https://example.com/biology/cell-structure` with the following content: + +```html +<html> + <head> + <title>Chapter 3: Cell Structure and Function + + +
</title>
+  </head>
+  <body>
+    <h1>Cell Structure and Function</h1>
+    <p>The cell is the basic structural and functional unit of all living organisms.</p>
+    <p>Cells contain various organelles that perform specific functions...</p>
+ + +``` + +You can validate text extracted from this chapter: + +```yaml +description: + value: "The cell is the basic structural and functional unit of all living organisms" + references: + - "https://example.com/biology/cell-structure" +``` + +## How URL Validation Works + +### 1. Content Fetching + +When the validator encounters a URL reference, it: + +- Makes an HTTP GET request to fetch the page +- Uses a polite user agent header identifying the tool +- Respects rate limiting (configurable via `rate_limit_delay`) +- Handles timeouts (default 30 seconds) + +### 2. Content Extraction + +The fetcher extracts content from the HTML: + +- **Title**: Extracted from the `` tag +- **Content**: HTML is converted to plain text using BeautifulSoup +- **Cleanup**: Removes scripts, styles, navigation, headers, and footers +- **Normalization**: Whitespace is normalized for better matching + +### 3. Content Type + +URL references are marked with content type `html_converted` to distinguish them from other reference types like abstracts or full-text articles. + +### 4. Caching + +Fetched URL content is cached to disk in markdown format with YAML frontmatter: + +```markdown +--- +reference_id: URL:https://example.com/biology/cell-structure +title: "Chapter 3: Cell Structure and Function" +content_type: html_converted +--- + +# Chapter 3: Cell Structure and Function + +## Content + +The cell is the basic structural and functional unit of all living organisms. +Cells contain various organelles that perform specific functions... +``` + +Cache files are stored in the configured cache directory (default: `.linkml-reference-validator-cache/`). + +## Configuration + +URL fetching behavior can be configured: + +```yaml +# config.yaml +rate_limit_delay: 0.5 # Wait 0.5 seconds between requests +email: "your-email@example.com" # Used in user agent +cache_dir: ".cache/references" # Where to cache fetched content +``` + +Or via command-line: + +```bash +linkml-reference-validator validate \ + --cache-dir .cache \ + --rate-limit-delay 0.5 \ + my-data.yaml +``` + +## Limitations + +### Static Content Only + +URL validation is designed for static web pages. It may not work well with: + +- Dynamic content loaded via JavaScript +- Pages requiring authentication +- Content behind paywalls +- Frequently changing content + +### HTML Structure + +The content extraction works by: + +- Removing navigation, headers, and footers +- Converting remaining HTML to text +- Normalizing whitespace + +This works well for simple HTML but may not capture content perfectly from complex layouts. + +### No Rendering + +The fetcher downloads raw HTML and parses it directly. It does not: + +- Execute JavaScript +- Render the page in a browser +- Follow redirects automatically (may be added in future) +- Handle dynamic content + +## Best Practices + +### 1. Use Stable URLs + +Choose URLs that are unlikely to change: + +- ✅ Versioned documentation: `https://docs.example.com/v1.0/chapter1` +- ✅ Archived content: `https://archive.example.com/2024/article` +- ❌ Blog posts with dates that might be reorganized +- ❌ URLs with session parameters + +### 2. Verify Content Quality + +After adding a URL reference, verify the extracted content: + +```bash +# Check what was extracted +cat .linkml-reference-validator-cache/URL_https___example.com_page.md +``` + +Ensure the extracted text contains the relevant information you're referencing. + +### 3. 
Cache Management + +- Commit cache files to version control for reproducibility +- Use `--force-refresh` to update cached content +- Periodically review cached URLs to ensure they're still accessible + +### 4. Mix Reference Types + +URL references work alongside PMIDs and DOIs: + +```yaml +findings: + value: "Multiple studies confirm this relationship" + references: + - "PMID:12345678" # Research paper + - "DOI:10.1234/journal.article" # Another paper + - "https://example.com/textbook/chapter5" # Textbook chapter +``` + +## Troubleshooting + +### URL Not Fetching + +If URL content isn't being fetched: + +1. Check network connectivity +2. Verify the URL is accessible in a browser +3. Check for rate limiting or IP blocks +4. Look for error messages in the logs + +### Incorrect Content Extraction + +If the wrong content is extracted: + +1. Inspect the cached markdown file +2. Check if the page uses complex JavaScript +3. Consider if the page structure requires custom parsing +4. File an issue with the page URL for improvement + +### Validation Failing + +If validation fails for URL references: + +1. Check the cached content to see what was extracted +2. Verify your supporting text actually appears on the page +3. Check for whitespace or formatting differences +4. Consider if the page content has changed since caching + +## Comparison with Other Reference Types + +| Feature | PMID | DOI | URL | +|---------|------|-----|-----| +| Source | PubMed | Crossref | Any web page | +| Content Type | Abstract + Full Text | Abstract | HTML converted | +| Metadata | Rich (authors, journal, etc.) | Rich | Minimal (title only) | +| Stability | High | High | Variable | +| Access | Free for abstracts | Varies | Varies | +| Caching | Yes | Yes | Yes | + +## See Also + +- [Validating DOIs](validate-dois.md) - For journal articles with DOIs +- [Validating OBO Files](validate-obo-files.md) - For ontology-specific validation +- [How It Works](../concepts/how-it-works.md) - Core validation concepts +- [CLI Reference](../reference/cli.md) - Command-line options diff --git a/docs/index.md b/docs/index.md index 52be20f..ba92ac8 100644 --- a/docs/index.md +++ b/docs/index.md @@ -2,7 +2,7 @@ **Validate quotes and excerpts against their source publications** -linkml-reference-validator ensures that text excerpts in your data accurately match their cited sources. It fetches references from PubMed/PMC and performs deterministic substring matching with support for editorial conventions like brackets `[...]` and ellipsis `...`. +linkml-reference-validator ensures that text excerpts in your data accurately match their cited sources. It fetches references from PubMed/PMC, DOIs via Crossref, and URLs, then performs deterministic substring matching with support for editorial conventions like brackets `[...]` and ellipsis `...`. ## Key Features diff --git a/docs/quickstart.md b/docs/quickstart.md index 5962b70..c4a9b90 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -86,6 +86,33 @@ linkml-reference-validator validate text \ This works the same way as PMID validation - the reference is fetched and cached locally. 
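+
+You can also keep the excerpt and its DOI together in a data file and validate the whole file in one pass. A minimal sketch (the slot names here are placeholders for whatever your own schema defines):
+
+```yaml
+# my-data.yaml (hypothetical record)
+description:
+  value: "Some text quoted from the article..."
+  references:
+    - "DOI:10.1234/journal.article"
+```
+
+Then run `linkml-reference-validator validate my-data.yaml` to check every reference in the file.
+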
+## Validate Against a URL + +For online resources like book chapters, documentation, or educational content: + +```bash +linkml-reference-validator validate text \ + "The cell is the basic structural and functional unit of all living organisms" \ + https://example.com/biology/cell-structure +``` + +Or with explicit URL prefix: + +```bash +linkml-reference-validator validate text \ + "The cell is the basic unit of life" \ + URL:https://example.com/biology/cells +``` + +The validator will: +1. Fetch the web page content +2. Extract the title from the `<title>` tag +3. Convert HTML to plain text (removing scripts, styles, navigation) +4. Cache the content locally +5. Validate your text against the extracted content + +**Note:** URL validation works best with static HTML pages and may not work well with JavaScript-heavy or dynamic content. + ## Key Features - **Automatic Caching**: References cached locally after first fetch @@ -94,6 +121,7 @@ This works the same way as PMID validation - the reference is fetched and cached - **Deterministic Matching**: Substring-based (not AI/fuzzy matching) - **PubMed & PMC**: Fetches from NCBI automatically - **DOI Support**: Fetches metadata from Crossref API +- **URL Support**: Validates against web content (books, docs, educational resources) ## Next Steps diff --git a/mkdocs.yml b/mkdocs.yml index 60aaa19..d9510ed 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -33,6 +33,7 @@ nav: - How-To Guides: - Validating OBO Files: how-to/validate-obo-files.md - Validating DOIs: how-to/validate-dois.md + - Validating URLs: how-to/validate-urls.md - Concepts: - How It Works: concepts/how-it-works.md - Editorial Conventions: concepts/editorial-conventions.md diff --git a/src/linkml_reference_validator/etl/reference_fetcher.py b/src/linkml_reference_validator/etl/reference_fetcher.py index 1a7b90c..5fc377b 100644 --- a/src/linkml_reference_validator/etl/reference_fetcher.py +++ b/src/linkml_reference_validator/etl/reference_fetcher.py @@ -86,6 +86,8 @@ def fetch(self, reference_id: str, force_refresh: bool = False) -> Optional[Refe content = self._fetch_pmid(identifier) elif prefix == "DOI": content = self._fetch_doi(identifier) + elif prefix == "URL": + content = self._fetch_url(identifier) else: logger.warning(f"Unsupported reference type: {prefix}") return None @@ -100,7 +102,7 @@ def _parse_reference_id(self, reference_id: str) -> tuple[str, str]: """Parse a reference ID into prefix and identifier. 
Args: - reference_id: Reference ID like "PMID:12345678" + reference_id: Reference ID like "PMID:12345678" or URL Returns: Tuple of (prefix, identifier) @@ -114,13 +116,27 @@ def _parse_reference_id(self, reference_id: str) -> tuple[str, str]: ('PMID', '12345678') >>> fetcher._parse_reference_id("12345678") ('PMID', '12345678') + >>> fetcher._parse_reference_id("URL:https://example.com/book/chapter1") + ('URL', 'https://example.com/book/chapter1') + >>> fetcher._parse_reference_id("https://example.com/direct") + ('URL', 'https://example.com/direct') """ - match = re.match(r"^([A-Za-z_]+)[:\s]+(.+)$", reference_id.strip()) + stripped = reference_id.strip() + + # Check if it's a direct URL (starts with http or https) + if stripped.startswith(('http://', 'https://')): + return "URL", stripped + + # Standard prefix:identifier format + match = re.match(r"^([A-Za-z_]+)[:\s]+(.+)$", stripped) if match: return match.group(1).upper(), match.group(2).strip() - if reference_id.strip().isdigit(): - return "PMID", reference_id.strip() - return "UNKNOWN", reference_id + + # Plain numeric ID defaults to PMID + if stripped.isdigit(): + return "PMID", stripped + + return "UNKNOWN", stripped def _fetch_pmid(self, pmid: str) -> Optional[ReferenceContent]: """Fetch a publication from PubMed by PMID. @@ -234,6 +250,65 @@ def _fetch_doi(self, doi: str) -> Optional[ReferenceContent]: doi=doi, ) + def _fetch_url(self, url: str) -> Optional[ReferenceContent]: + """Fetch content from a URL. + + Fetches web content, extracts title and converts HTML to text. + Intended for static pages like book chapters. + + Args: + url: The URL to fetch + + Returns: + ReferenceContent if successful, None otherwise + + Examples: + >>> config = ReferenceValidationConfig() + >>> fetcher = ReferenceFetcher(config) + >>> # Would fetch in real usage: + >>> # ref = fetcher._fetch_url("https://example.com/book/chapter1") + """ + time.sleep(self.config.rate_limit_delay) + + headers = { + "User-Agent": f"linkml-reference-validator/1.0 (mailto:{self.config.email})", + } + + try: + response = requests.get(url, headers=headers, timeout=30) + if response.status_code != 200: + logger.warning(f"Failed to fetch URL:{url} - status {response.status_code}") + return None + + soup = BeautifulSoup(response.text, "html.parser") + + # Extract title + title_tag = soup.find("title") + title = title_tag.get_text().strip() if title_tag else None + + # Convert HTML to text + # Remove script and style elements + for script in soup(["script", "style", "nav", "header", "footer"]): + script.decompose() + + # Get text content + content = soup.get_text() + + # Clean up text - normalize whitespace + lines = (line.strip() for line in content.splitlines()) + content = "\n".join(line for line in lines if line) + + return ReferenceContent( + reference_id=f"URL:{url}", + title=title, + content=content if content else None, + content_type="html_converted", + ) + + except Exception as e: + logger.error(f"Error fetching URL:{url}: {e}") + return None + def _parse_crossref_authors(self, authors: list) -> list[str]: """Parse author list from Crossref response. 
@@ -466,8 +541,11 @@ def _get_cache_path(self, reference_id: str) -> Path: >>> path = fetcher._get_cache_path("PMID:12345678") >>> path.name 'PMID_12345678.md' + >>> path = fetcher._get_cache_path("URL:https://example.com/book/chapter1") + >>> path.name + 'URL_https___example.com_book_chapter1.md' """ - safe_id = reference_id.replace(":", "_").replace("/", "_") + safe_id = reference_id.replace(":", "_").replace("/", "_").replace("?", "_").replace("=", "_") cache_dir = self.config.get_cache_dir() return cache_dir / f"{safe_id}.md" diff --git a/tests/test_reference_fetcher.py b/tests/test_reference_fetcher.py index ee9673d..a78c8d2 100644 --- a/tests/test_reference_fetcher.py +++ b/tests/test_reference_fetcher.py @@ -308,3 +308,144 @@ def test_save_and_load_doi_from_disk(mock_get, fetcher, tmp_path): assert result2.reference_id == "DOI:10.9999/cached.doi" assert result2.title == "Cached DOI Article" assert result2.doi == "10.9999/cached.doi" + + +def test_parse_url_reference_id(fetcher): + """Test parsing URL reference IDs.""" + assert fetcher._parse_reference_id("URL:https://example.com/book/chapter1") == ("URL", "https://example.com/book/chapter1") + assert fetcher._parse_reference_id("url:https://example.com/article") == ("URL", "https://example.com/article") + assert fetcher._parse_reference_id("https://example.com/direct") == ("URL", "https://example.com/direct") + + +@patch("linkml_reference_validator.etl.reference_fetcher.requests.get") +def test_fetch_url_success(mock_get, fetcher): + """Test fetching URL reference successfully.""" + mock_response = MagicMock() + mock_response.status_code = 200 + mock_response.text = """ + <html> + <head> + <title>Chapter 1: Introduction to Biology + + +
</title>
+    </head>
+    <body>
+    <h1>Chapter 1: Introduction to Biology</h1>
+    <p>Biology is the natural science that studies life and living organisms.</p>
+    <p>This chapter provides an overview of cellular structure and function.</p>
+    <p>The cell is the basic unit of life.</p>
+    </body>
+    </html>
+    """
+    mock_get.return_value = mock_response
+
+    result = fetcher.fetch("URL:https://example.com/biology-book/chapter1")
+
+    assert result is not None
+    assert result.reference_id == "URL:https://example.com/biology-book/chapter1"
+    assert result.title == "Chapter 1: Introduction to Biology"
+    assert result.content_type == "html_converted"
+    assert "Biology is the natural science" in result.content
+    assert "basic unit of life" in result.content
+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_fetch_url_no_title(mock_get, fetcher):
+    """Test fetching URL with no title tag."""
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.text = """
+    <html>
+    <body>
+    <h1>Main Heading</h1>
+    <p>Content without title tag.</p>
+    </body>
+    </html>
+    """
+    mock_get.return_value = mock_response
+
+    result = fetcher.fetch("URL:https://example.com/no-title")
+
+    assert result is not None
+    assert result.reference_id == "URL:https://example.com/no-title"
+    assert result.title is None
+    assert "Main Heading" in result.content
+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_fetch_url_http_error(mock_get, fetcher):
+    """Test fetching URL that returns HTTP error."""
+    mock_response = MagicMock()
+    mock_response.status_code = 404
+    mock_get.return_value = mock_response
+
+    result = fetcher.fetch("URL:https://example.com/not-found")
+
+    assert result is None
+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_fetch_url_request_exception(mock_get, fetcher):
+    """Test fetching URL that raises request exception."""
+    mock_get.side_effect = Exception("Network error")
+
+    result = fetcher.fetch("URL:https://example.com/error")
+
+    assert result is None
+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_fetch_url_malformed_html(mock_get, fetcher):
+    """Test fetching URL with malformed HTML.
+
+    BeautifulSoup is very forgiving and will parse even malformed HTML.
+    This test verifies that the fetcher doesn't crash on malformed input.
+    """
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.text = "<html><title>Test</title><body><p>Content without closing tags"
+    mock_get.return_value = mock_response
+
+    result = fetcher.fetch("URL:https://example.com/malformed")
+
+    assert result is not None
+    assert result.title == "Test"
+    assert "Content without closing tags" in result.content
+
+
+def test_url_cache_path(fetcher):
+    """Test cache path generation for URLs."""
+    path = fetcher._get_cache_path("URL:https://example.com/book/chapter1")
+    assert path.name == "URL_https___example.com_book_chapter1.md"
+
+    path = fetcher._get_cache_path("URL:https://example.com/path?param=value")
+    assert path.name == "URL_https___example.com_path_param_value.md"
+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_save_and_load_url_from_disk(mock_get, fetcher, tmp_path):
+    """Test saving and loading URL reference from disk cache."""
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.text = """
+    <html>
+    <head><title>Cached URL Content</title></head>
+    <body><p>This content should be cached.</p></body>
+ + """ + mock_get.return_value = mock_response + + # First fetch - this should save to disk + result1 = fetcher.fetch("URL:https://example.com/cached") + assert result1 is not None + + # Clear memory cache + fetcher._cache.clear() + + # Second fetch - should load from disk without making HTTP request + with patch("linkml_reference_validator.etl.reference_fetcher.requests.get") as mock_no_request: + result2 = fetcher.fetch("URL:https://example.com/cached") + mock_no_request.assert_not_called() + + assert result2 is not None + assert result2.reference_id == "URL:https://example.com/cached" + assert result2.title == "Cached URL Content" + assert "This content should be cached" in result2.content
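+
+
+@patch("linkml_reference_validator.etl.reference_fetcher.requests.get")
+def test_fetch_direct_url_without_prefix(mock_get, fetcher):
+    """Sketch of an additional case: a bare https:// reference (no URL: prefix).
+
+    Assumes only the behavior shown above: _parse_reference_id normalizes a
+    direct URL to the URL prefix, so fetch() should return content whose
+    reference_id carries the explicit prefix.
+    """
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.text = "<html><head><title>Direct</title></head><body><p>Direct URL body.</p></body></html>"
+    mock_get.return_value = mock_response
+
+    result = fetcher.fetch("https://example.com/direct")
+
+    assert result is not None
+    assert result.reference_id == "URL:https://example.com/direct"
+    assert result.title == "Direct"
+    assert "Direct URL body" in result.content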