Allow gitlab URL link shortening from non-gitlab/github.com domains #2068

Open: wants to merge 17 commits into base: main (changes from 10 commits)
2 changes: 2 additions & 0 deletions docs/user_guide/source-buttons.rst
Original file line number Diff line number Diff line change
@@ -4,6 +4,8 @@ Source Buttons

Source buttons are links to the source of your page's content (either on your site, or on hosting sites like GitHub).

.. _add-edit-button:

Add an edit button
==================

22 changes: 21 additions & 1 deletion docs/user_guide/theme-elements.md
@@ -212,7 +212,7 @@ All will end up as numbers in the rendered HTML, but in the source they look lik

## Link shortening for git repository services

Many projects have links back to their issues / PRs hosted on platforms like **GitHub** or **GitLab**.
Many projects have links back to their issues / PRs hosted on platforms like **GitHub**, **GitLab**, or **Bitbucket**.
Instead of displaying these as raw links, this theme does some lightweight formatting for these platforms specifically.

In **reStructuredText**, URLs are automatically converted to links, so this works automatically.
@@ -252,5 +252,25 @@ There are a variety of link targets supported, here's a table for reference:
- `https://gitlab.com/gitlab-org`: https://gitlab.com/gitlab-org
- `https://gitlab.com/gitlab-org/gitlab`: https://gitlab.com/gitlab-org/gitlab
- `https://gitlab.com/gitlab-org/gitlab/-/issues/375583`: https://gitlab.com/gitlab-org/gitlab/-/issues/375583
- `https://gitlab.com/gitlab-org/gitlab/-/merge_requests/174667`: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/174667

**Bitbucket**

- `https://bitbucket.org`: https://bitbucket.org
- `https://bitbucket.org/atlassian/workspace/overview`: https://bitbucket.org/atlassian/workspace/overview
- `https://bitbucket.org/atlassian/aui`: https://bitbucket.org/atlassian/aui
- `https://bitbucket.org/atlassian/aui/pull-requests/4758`: https://bitbucket.org/atlassian/aui/pull-requests/4758

Links provided with a text body won't be changed.

If you have links to GitHub, GitLab, or Bitbucket repository URLs that are on non-standard domains
(i.e., not on `github.com`, `gitlab.com`, or `bitbucket.org`, respectively), then these will be
shortened if the base URL is given in the `html_context` section of your `conf.py` file (see
{ref}`Add an edit button <add-edit-button>`), e.g.,

```python
html_context = {
    "gitlab_url": "https://gitlab.mydomain.com",  # your self-hosted GitLab
    # ... other html_context entries
}
```
28 changes: 28 additions & 0 deletions src/pydata_sphinx_theme/__init__.py
@@ -275,6 +275,33 @@ def _fix_canonical_url(
context["pageurl"] = app.config.html_baseurl + target


def _add_self_hosted_platforms_to_link_transform_class(app: Sphinx) -> None:
    if not hasattr(app.config, "html_context"):
        return

    # Use list() to force the iterator to completion because the for-loop below
    # can modify the dictionary.
    platforms = list(short_link.ShortenLinkTransform.supported_platform.values())

    for platform in platforms:
        # {platform}_url -- e.g.: github_url, gitlab_url, bitbucket_url
        self_hosted_url = app.config.html_context.get(f"{platform}_url", None)
        if self_hosted_url is None:
            continue
        parsed = urlparse(self_hosted_url)
        if parsed.scheme not in ("http", "https"):
            raise Exception(
                f"If you provide a value for html_context option {platform}_url,"
                " it must begin with http or https."
            )
        if not parsed.netloc:
            raise Exception(
                f"Unsupported URL provided for html_context option {platform}_url."
                f" Could not get domain (netloc) from {self_hosted_url}."
            )
        short_link.ShortenLinkTransform.add_platform_mapping(platform, parsed.netloc)


def setup(app: Sphinx) -> Dict[str, str]:
    """Setup the Sphinx application."""
    here = Path(__file__).parent.resolve()
@@ -286,6 +313,7 @@ def setup(app: Sphinx) -> Dict[str, str]:

    app.connect("builder-inited", translator.setup_translators)
    app.connect("builder-inited", update_config)
    app.connect("builder-inited", _add_self_hosted_platforms_to_link_transform_class)
    app.connect("html-page-context", _fix_canonical_url)
    app.connect("html-page-context", edit_this_page.setup_edit_url)
    app.connect("html-page-context", toctree.add_toctree_functions)
7 changes: 6 additions & 1 deletion src/pydata_sphinx_theme/assets/styles/base/_base.scss
@@ -48,7 +48,8 @@ a {

// set up an icon next to the shortened links from GitHub, GitLab, and Bitbucket
&.github,
&.gitlab {
&.gitlab,
&.bitbucket {
&::before {
color: var(--pst-color-text-muted);
font: var(--fa-font-brands);
@@ -63,6 +64,10 @@ a {
&.gitlab::before {
content: var(--pst-icon-gitlab);
}

&.bitbucket::before {
content: var(--pst-icon-bitbucket);
}
}

%heading-style {
@@ -20,6 +20,7 @@ html {
--pst-icon-search-minus: "\f010"; // fa-solid fa-magnifying-glass-minus
--pst-icon-github: "\f09b"; // fa-brands fa-github
--pst-icon-gitlab: "\f296"; // fa-brands fa-gitlab
--pst-icon-bitbucket: "\f171"; // fa-brands fa-bitbucket
--pst-icon-share: "\f064"; // fa-solid fa-share
--pst-icon-bell: "\f0f3"; // fa-solid fa-bell
--pst-icon-pencil: "\f303"; // fa-solid fa-pencil
26 changes: 24 additions & 2 deletions src/pydata_sphinx_theme/short_link.py
@@ -12,8 +12,8 @@

class ShortenLinkTransform(SphinxPostTransform):
"""
Shorten link when they are coming from github or gitlab and add an extra class to
the tag for further styling.
Shorten link when they are coming from github, gitlab, or bitbucket and add
an extra class to the tag for further styling.

Before:
.. code-block:: html
@@ -37,9 +37,15 @@ class ShortenLinkTransform(SphinxPostTransform):
    supported_platform: ClassVar[dict[str, str]] = {
        "github.com": "github",
        "gitlab.com": "gitlab",
        "bitbucket.org": "bitbucket",
    }
    platform = None

    @classmethod
    def add_platform_mapping(cls, platform, netloc):
        """Add domain->platform mapping to class at run-time."""
        cls.supported_platform[netloc] = platform

    def run(self, **kwargs):
        """Run the Transform object."""
        matcher = NodeMatcher(nodes.reference)
@@ -96,6 +102,22 @@ def parse_url(self, uri: ParseResult) -> str:
                    if parts[2] in ["issues", "pull", "discussions"]:
                        text += f"#{parts[-1]}"  # element number

        elif self.platform == "bitbucket":
            # split the url content
            parts = path.split("/")

            if len(parts) > 0:
                text = parts[0]  # organisation
            if len(parts) > 1 and not (
                parts[-2] == "workspace" and parts[-1] == "overview"
            ):
                if len(parts) > 1:
                    text += f"/{parts[1]}"  # repository
                if len(parts) > 2:
                    if parts[2] in ["issues", "pull-requests"]:
                        itemnumber = parts[-1]
                        text += f"#{itemnumber}"  # element number

Collaborator

this is hard to follow, and feels brittle to me. Why aren't we using regex for this? I worked up this gist as a POC and it seems to capture the behavior that we want:

https://gist.github.com/drammock/78d2d3c9837aafd1259866c7b936b9e4

presumably similar things will work for other forges (though IIRC gitlab is a bit more complex). @gabalafou WDYT?
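
To make the regex idea concrete, here is a minimal sketch. It is not the contents of the linked gist; the pattern and the helper name `shorten_bitbucket_path` are hypothetical and only cover the four Bitbucket cases exercised by the tests in this PR. Anything the pattern does not recognise returns `None`, meaning the link would be left untouched.

```python
import re
from typing import Optional

# Hypothetical pattern: workspace, then either the workspace overview page
# or a repository with an optional issue / pull request number.
BITBUCKET_PATH = re.compile(
    r"""
    ^/(?P<workspace>[^/]+)          # workspace (organisation)
    (?:
        /workspace/overview         # workspace landing page
      |
        /(?P<repo>[^/]+)            # repository
        (?:/(?:issues|pull-requests)/(?P<number>\d+))?
    )?
    /?$
    """,
    re.VERBOSE,
)


def shorten_bitbucket_path(path: str) -> Optional[str]:
    """Return short link text for a recognised path, else None."""
    if path in ("", "/"):
        # bare domain: show the platform name only
        return "bitbucket"
    match = BITBUCKET_PATH.match(path)
    if match is None:
        return None
    text = match["workspace"]
    if match["repo"]:
        text += f"/{match['repo']}"
    if match["number"]:
        text += f"#{match['number']}"
    return text
```

With this sketch, `/atlassian/aui/pull-requests/4758` becomes `atlassian/aui#4758`, while an unrecognised path such as `/atlassian/aui/src/master/` yields `None`.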

Collaborator

I agree. The code in this PR follows patterns already established for GitHub and GitLab, but I think those patterns are bad.

The way I think this should work is that we should be rather picky about which URL patterns we recognize/support and only shorten those. All other URLs should not be shortened. But right now, the code takes the opposite approach: as soon as it sees github.com or gitlab.com, it tries to shorten the URL.

For example, let's say we were just starting to support GitHub URLs. But in this scenario, let's start only with supporting pull request URLs.

Then we would convert the following link, like so:

https://github.com/pydata/pydata-sphinx-theme/pull/101
  => pydata/pydata-sphinx-theme#101

But we would not convert any of the following links (if we were only supporting pull request links):

https://github.com/pydata/pydata-sphinx-theme/issues
https://github.com/pydata/pydata-sphinx-theme/issues/
https://github.com/pydata/
https://github.com/pydata/pydata-sphinx-theme/commit/3caf346cacd2dad2a192a83c6cc9f8852e5a722e

None of those links would get shortened until we specifically add each type of URL that we want to support.
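
A rough sketch of that allow-list approach, assuming only pull request URLs are supported at first. The names `GITHUB_PULL` and `shorten_github_url` are hypothetical and do not exist in the theme; the point is only that an unrecognised URL returns `None` and is left as-is.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allow-list: only pull request URLs are recognised.
GITHUB_PULL = re.compile(
    r"^/(?P<org>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)/?$"
)


def shorten_github_url(url: str) -> Optional[str]:
    """Return short text for a supported URL, or None to leave it unchanged."""
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        return None
    match = GITHUB_PULL.match(parsed.path)
    if match is None:
        # not an explicitly supported pattern: do not shorten
        return None
    return f"{match['org']}/{match['repo']}#{match['number']}"
```

Supporting a new kind of link (issues, commits, organisation pages) would then mean adding one more explicit pattern, rather than relying on whatever the generic path splitting happens to produce.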

Collaborator

If we want to go down the route of a rewrite of the link shortening logic, I would be happy to do it. In that case, I would just close this PR and open a new one.

I would abandon the feature in this PR to turn off link shortening because I suspect that part of the motivation for adding a config value to turn it off is because our current link shortener is bad. Perhaps with a better shortener, there would be little to no demand for a config setting to turn it off. What do you think, @mattpitkin?

Collaborator

> The code in this PR follows patterns already established for GitHub and GitLab, but I think those patterns are bad. [...] If we want to go down the route of a rewrite of the link shortening logic, I would be happy to do it. In that case, I would just close this PR and open a new one.

Since I think I've basically solved it for bitbucket in the linked Gist, I feel like it might be the right call to just rewrite the others too. +1 to close and open a new PR to refactor what we have and also add bitbucket. See also #2215 which adds link shortening for codeberg/forgejo and gitea.

Author

@mattpitkin Jun 19, 2025

> I would abandon the feature in this PR to turn off link shortening because I suspect that part of the motivation for adding a config value to turn it off is because our current link shortener is bad. Perhaps with a better shortener, there would be little to no demand for a config setting to turn it off. What do you think, @mattpitkin?

Do you mean, why did I put `gitlab_url` etc. in the `html_context` dictionary of the `conf.py` file, rather than at the root level? In part, I just found that other information was already conveyed in that dictionary, so it seemed a safe place to add new information that wouldn't interfere with anything else. If that information can go at root level, then I don't have an issue with that. You'd still have to specify the required URL bases to turn URL shortening on or off for them though. Remember that this is specifically for URLs that are not on the standard gitlab.com, github.com, or bitbucket.org base domains, and they don't even have to have, e.g., gitlab in the domain name (see, for example, https://git.ligo.org).
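
As a simplified illustration of why the base URL has to be given explicitly, the sketch below mirrors the idea of the function added in this PR but as a standalone, hypothetical helper (`register_self_hosted` and the module-level `supported_platform` dict are assumptions made for this example, not the theme's API). The platform cannot be guessed from a domain such as `git.ligo.org`, so the `<platform>_url` key supplies it.

```python
from urllib.parse import urlparse

# Default domain -> platform mapping, as in the PR's `supported_platform`.
supported_platform = {
    "github.com": "github",
    "gitlab.com": "gitlab",
    "bitbucket.org": "bitbucket",
}


def register_self_hosted(html_context: dict) -> None:
    """Add the domain of each `<platform>_url` entry to the mapping."""
    # set(...) makes a copy, so the dict can be modified inside the loop
    for platform in set(supported_platform.values()):
        base_url = html_context.get(f"{platform}_url")
        if base_url is None:
            continue
        parsed = urlparse(base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Invalid {platform}_url: {base_url!r}")
        supported_platform[parsed.netloc] = platform


register_self_hosted({"gitlab_url": "https://git.ligo.org"})
print(supported_platform["git.ligo.org"])  # prints "gitlab"
```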

Collaborator

@gabalafou Jun 19, 2025

@mattpitkin, sorry, my comment actually didn't make any sense.

I was writing it in a hurry yesterday and I got two different PRs mixed up in my head: this one versus #2109. But the other PR is very much related to this PR because it provides an option to turn off link shortening.


        elif self.platform == "gitlab":
            # cp. https://docs.gitlab.com/ee/user/markdown.html#gitlab-specific-references
            if "/-/" in path and any(
9 changes: 9 additions & 0 deletions tests/sites/base/page1.rst
@@ -31,3 +31,12 @@ Page 1
    https://gitlab.com/gitlab-org/gitlab/-/merge_requests/84669
    https://gitlab.com/gitlab-org/gitlab/-/pipelines/511894707
    https://gitlab.com/gitlab-com/gl-infra/production/-/issues/6788

**Bitbucket**

.. container:: bitbucket-container

    https://bitbucket.org
    https://bitbucket.org/atlassian/workspace/overview
    https://bitbucket.org/atlassian/aui
    https://bitbucket.org/atlassian/aui/pull-requests/4758
22 changes: 22 additions & 0 deletions tests/sites/self_hosted_version_control/conf.py
@@ -0,0 +1,22 @@
"""Test conf file."""

# -- Project information -----------------------------------------------------

project = "Test Self Hosted Version Control URLs"
copyright = "2020, Pydata community"
author = "Pydata community"

root_doc = "index"

# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = []
html_theme = "pydata_sphinx_theme"
html_context = {
    "github_url": "https://github.pydata.org",
    "gitlab_url": "https://gitlab.pydata.org",
    "bitbucket_url": "https://bitbucket.pydata.org",
}
13 changes: 13 additions & 0 deletions tests/sites/self_hosted_version_control/index.rst
@@ -0,0 +1,13 @@
Test conversion of a self-hosted GitHub URL
===========================================

This test ensures that a site using PyData Sphinx Theme can set a self-hosted
version control URL via ``html_context`` and that, when the site is built,
the URLs that point to that self-hosted version control domain are properly
shortened (just like for github.com, gitlab.com, and bitbucket.org).

.. toctree::
   :caption: My caption
   :numbered:

   links
42 changes: 42 additions & 0 deletions tests/sites/self_hosted_version_control/links.rst
@@ -0,0 +1,42 @@
Test Self Hosted Version Control URLs
=====================================

**normal link**

- https://pydata-sphinx-theme.readthedocs.io/en/latest/

**GitHub**

.. container:: github-container

    https://github.pydata.org
    https://github.pydata.org/pydata
    https://github.pydata.org/pydata/pydata-sphinx-theme
    https://github.pydata.org/pydata/pydata-sphinx-theme/pull/1012
    https://github.pydata.org/orgs/pydata/projects/2

**GitLab**

.. container:: gitlab-container

    https://gitlab.pydata.org
    https://gitlab.pydata.org/gitlab-org
    https://gitlab.pydata.org/gitlab-org/gitlab
    https://gitlab.pydata.org/gitlab-org/gitlab/-/issues/375583
    https://gitlab.pydata.org/gitlab-org/gitlab/issues/375583
    https://gitlab.pydata.org/gitlab-org/gitlab/-/issues/
    https://gitlab.pydata.org/gitlab-org/gitlab/issues/
    https://gitlab.pydata.org/gitlab-org/gitlab/-/issues
    https://gitlab.pydata.org/gitlab-org/gitlab/issues
    https://gitlab.pydata.org/gitlab-org/gitlab/-/merge_requests/84669
    https://gitlab.pydata.org/gitlab-org/gitlab/-/pipelines/511894707
    https://gitlab.pydata.org/gitlab-com/gl-infra/production/-/issues/6788

**Bitbucket**

.. container:: bitbucket-container

    https://bitbucket.pydata.org
    https://bitbucket.pydata.org/atlassian/workspace/overview
    https://bitbucket.pydata.org/atlassian/aui
    https://bitbucket.pydata.org/atlassian/aui/pull-requests/4758
32 changes: 32 additions & 0 deletions tests/test_build.py
@@ -852,6 +852,38 @@ def test_shorten_link(sphinx_build_factory, file_regression) -> None:
    gitlab = sphinx_build.html_tree("page1.html").select(".gitlab-container")[0]
    file_regression.check(gitlab.prettify(), basename="gitlab_links", extension=".html")

    bitbucket = sphinx_build.html_tree("page1.html").select(".bitbucket-container")[0]
    file_regression.check(
        bitbucket.prettify(), basename="bitbucket_links", extension=".html"
    )


def test_self_hosted_shorten_link(sphinx_build_factory, file_regression) -> None:
    """Check that self-hosted version control URLs get shortened.

    Example:
        conf.py
            html_context = {"github_url": "https://github.example.com"}

        example_page.rst

            In https://github.example.com/pydata/pydata-sphinx-theme/pull/101,
            we refactored stylesheets and updated typography.

        example_page.html

            In <a href="https://github.example.com/pydata/pydata-sphinx-theme/pull/101">
            pydata/pydata-sphinx-theme#101</a>, we refactored stylesheets and
            updated typography.
    """
    sphinx_build = sphinx_build_factory("self_hosted_version_control").build()
    urls_page = sphinx_build.html_tree("links.html").select("article")[0]
    file_regression.check(
        urls_page.prettify(),
        basename="self_hosted_version_control_links",
        extension=".html",
    )


def test_math_header_item(sphinx_build_factory, file_regression) -> None:
    """Regression test for math items in a header title."""
16 changes: 16 additions & 0 deletions tests/test_build/bitbucket_links.html
@@ -0,0 +1,16 @@
<div class="bitbucket-container docutils container">
<p>
<a class="bitbucket reference external" href="https://bitbucket.org">
bitbucket
</a>
<a class="bitbucket reference external" href="https://bitbucket.org/atlassian/workspace/overview">
atlassian
</a>
<a class="bitbucket reference external" href="https://bitbucket.org/atlassian/aui">
atlassian/aui
</a>
<a class="bitbucket reference external" href="https://bitbucket.org/atlassian/aui/pull-requests/4758">
atlassian/aui#4758
</a>
</p>
</div>