@tz-lom tz-lom commented Jul 26, 2025

This PR enables support for self-hosted GitHub installations

I've tested it with the following setup:

repo = Remotes.GitHub("XXX/YYY.git";host="AAA.com")

deploydocs(;
    deploy_config = Documenter.GitHubActions(
        "gh.asml.com",
        "https://pages.AAA.com/XXX/YYY"
    ),
    repo = repo,
    push_preview = true,
)

Both deployment and notification worked well.

This was made during the JuliaCon 2025 hackathon!

@tz-lom tz-lom force-pushed the feature/support_selfhosted_github branch 2 times, most recently from 60c5931 to f3853c9 Compare July 27, 2025 14:50
@tz-lom tz-lom force-pushed the feature/support_selfhosted_github branch from f3853c9 to ef38eb7 Compare July 27, 2025 14:50
@mortenpi
Member

This looks fine to me! Could we also make sure that it's documented in the relevant docstrings please?

  • """
    GitHubActions <: DeployConfig
    Implementation of `DeployConfig` for deploying from GitHub Actions.
    The following environment variables influences the build
    when using the `GitHubActions` configuration:
    - `GITHUB_EVENT_NAME`: must be set to `push`, `workflow_dispatch`, or `schedule`.
    This avoids deployment on pull request builds.
    - `GITHUB_REPOSITORY`: must match the value of the `repo` keyword to [`deploydocs`](@ref).
    - `GITHUB_REF`: must match the `devbranch` keyword to [`deploydocs`](@ref), alternatively
    correspond to a git tag.
    - `GITHUB_TOKEN` or `DOCUMENTER_KEY`: used for authentication with GitHub,
    see the manual section for [GitHub Actions](@ref) for more information.
    The `GITHUB_*` variables are set automatically on GitHub Actions, see the
    [documentation](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/store-information-in-variables#default-environment-variables).
    """
  • """
    GitHub(user :: AbstractString, repo :: AbstractString)
    GitHub(remote :: AbstractString)
    Represents a remote Git repository hosted on GitHub. The repository is identified by the
    names of the user (or organization) and the repository: `GitHub(user, repository)`. E.g.:
    ```julia
    makedocs(
        repo = GitHub("JuliaDocs", "Documenter.jl")
    )
    ```
    The single-argument constructor assumes that the user and repository parts are separated by
    a slash (e.g. `JuliaDocs/Documenter.jl`).
    """

@tz-lom
Author

tz-lom commented Jul 27, 2025

> This looks fine to me! Could we also make sure that it's documented in the relevant docstrings please?

I've updated the documentation and, hopefully, the coverage.
But I want to mention that the documentation is misleading at the moment:

makedocs(
    repo = GitHub("JuliaDocs", "Documenter.jl")
)

The one above will probably work, but the one below will not (not because of my changes):

deploydocs(
    repo = GitHub("JuliaDocs", "Documenter.jl")
)

Is it just me reading the documentation the wrong way?

@tz-lom tz-lom force-pushed the feature/support_selfhosted_github branch from 2964c67 to 438901a Compare July 28, 2025 01:18
@tz-lom tz-lom requested a review from mortenpi July 29, 2025 12:40
@mortenpi
Member

Do we have somewhere where we advertise passing e.g. GitHub() to deploydocs's repo=? I think right now we just accept a string there. I agree that it's a bit unfortunate that you can't just pass the object form there (we should probably expand that), but I am not sure there's anything misleading anywhere?

@tz-lom
Author

tz-lom commented Aug 1, 2025

> Do we have somewhere where we advertise passing e.g. GitHub() to deploydocs's repo=? I think right now we just accept a string there. I agree that it's a bit unfortunate that you can't just pass the object form there (we should probably expand that), but I am not sure there's anything misleading anywhere?

No, I can't find anything in the documentation, but in the build output I see this:
[image: build output]
With the repo parameter in deploydocs, this is totally misleading.

Anyway, I've double-checked that this PR works in a self-hosted environment. Do you need anything else, or can you merge it?

@tz-lom
Author

tz-lom commented Aug 12, 2025

@mortenpi, can you review this please?

Member

@mortenpi mortenpi left a comment

Sorry for the delay. Unfortunately, I do have a couple of questions. I think my main concern is to make sure this doesn't affect existing setups.

"""
GitHub(user :: AbstractString, repo :: AbstractString)
GitHub(user :: AbstractString, repo :: AbstractString, [host :: AbstractString])
GitHub(remote :: AbstractString)
Member

I took the liberty of removing host for the single-argument method. I think it would be better if it stays single-argument. If we want to add support for overriding the host, then let's figure out some syntax for the string, like github.selfhosted:Org/Repo.jl, which we can regex.
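
A minimal sketch of how such a string could be parsed with a regex. The `parse_remote` name, the `host:Org/Repo` separator, and the `github.com` default are assumptions for illustration only; none of this is Documenter's actual API:

```julia
# Hypothetical helper, not part of Documenter: split "host:Org/Repo" into parts.
# The "host:" prefix is optional; without it the host falls back to `default_host`.
function parse_remote(remote::AbstractString; default_host::AbstractString = "github.com")
    pattern = r"^(?:(?<host>[^:/\s]+):)?(?<user>[^:/\s]+)/(?<repo>[^:/\s]+)$"
    m = match(pattern, remote)
    m === nothing && throw(ArgumentError("cannot parse remote string: $(repr(remote))"))
    # An optional group that did not participate in the match is `nothing`.
    host = m["host"] === nothing ? default_host : m["host"]
    return (host = String(host), user = String(m["user"]), repo = String(m["repo"]))
end

parse_remote("github.selfhosted:Org/Repo.jl")  # host taken from the prefix
parse_remote("JuliaDocs/Documenter.jl")        # host falls back to the default
```

Keeping the host inside the string would leave the single-argument constructor single-argument, at the cost of reserving `:` as a separator.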

Comment on lines 339 to 350
github_api = get(ENV, "GITHUB_API_URL", "") # https://api.github.com

# Compute GitHub Pages URL from repository
parts = split(github_repository, "/")
github_pages_url = if length(parts) == 2
    owner, repo = parts
    "https://$(owner).github.io/$(repo)/"
else
    ""
end

return GitHubActions(github_repository, github_event_name, github_ref, "github.com", github_api, github_pages_url)
Member

If we hard-code github_host here, it feels like we should also hard-code the API URL? I also have a minor security concern here: for normal GitHub-hosted repos, this could allow someone to redirect the API calls (including the token) somewhere else by somehow attacking the GITHUB_API_URL environment variable.

Suggested change
- github_api = get(ENV, "GITHUB_API_URL", "") # https://api.github.com
- # Compute GitHub Pages URL from repository
- parts = split(github_repository, "/")
- github_pages_url = if length(parts) == 2
-     owner, repo = parts
-     "https://$(owner).github.io/$(repo)/"
- else
-     ""
- end
- return GitHubActions(github_repository, github_event_name, github_ref, "github.com", github_api, github_pages_url)
+ # Compute GitHub Pages URL from repository
+ parts = split(github_repository, "/")
+ github_pages_url = if length(parts) == 2
+     owner, repo = parts
+     "https://$(owner).github.io/$(repo)/"
+ else
+     ""
+ end
+ return GitHubActions(github_repository, github_event_name, github_ref, "github.com", "https://api.github.com", github_pages_url)
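
One way to express the concern above is to pin the API URL whenever the host is `github.com` and to read the environment only for other hosts. This is a rough sketch; `resolve_api_url` is a hypothetical name, not a function in Documenter:

```julia
# Hypothetical helper: choose the API base URL for a given host.
# `env` is any dictionary-like object and defaults to the process environment.
function resolve_api_url(host::AbstractString, env = ENV)
    # For github.com, ignore the environment so that a tampered GITHUB_API_URL
    # cannot redirect authenticated requests somewhere else.
    host == "github.com" && return "https://api.github.com"
    # For any other host, use the value provided by the runner (empty if unset).
    return get(env, "GITHUB_API_URL", "")
end

resolve_api_url("github.com", Dict("GITHUB_API_URL" => "https://attacker.example"))
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.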

github_repository = get(ENV, "GITHUB_REPOSITORY", "") # "JuliaDocs/Documenter.jl"
github_event_name = get(ENV, "GITHUB_EVENT_NAME", "") # "push", "pull_request" or "cron" (?)
github_ref = get(ENV, "GITHUB_REF", "") # "refs/heads/$(branchname)" for branch, "refs/tags/$(tagname)" for tags
github_api = get(ENV, "GITHUB_API_URL", "") # https://api.github.com
Member

Is GITHUB_API_URL something that is configurable for a self-hosted instance, or can we assume it's api.$(host)?

There's also GITHUB_SERVER_URL, which could potentially be used for determining the host automatically? Or would that not be reliable and/or too automagical?

Author

I don't think we can assume anything about it. I don't know for certain whether it is configurable, but I would expect so, as some orgs have strict rules about domain names.

I need to check GITHUB_SERVER_URL. For some reason I haven't used it; I don't know whether that is because we don't have it or because I missed it.

If something like that exists, I would indeed prefer to use it instead of specifying the host manually.

Collaborator

Note that GITHUB_API_URL is mentioned in the GitHub manual as a variable they set in GitHub Action workflows.

And in the corresponding enterprise docs at https://docs.github.com/en/[email protected]/actions/reference/workflows-and-actions/variables the example value given is http(s)://HOSTNAME/api/v3 -- so no, we can't just assume it is api.$(host) (I assume that's the URL one gets if one enables "subdomain isolation")
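
For the GITHUB_SERVER_URL idea discussed in this thread, the host could be derived roughly as follows. This is only a sketch: `host_from_server_url` is a hypothetical name, and it assumes the variable holds either a full URL such as `https://github.com` or a bare host name:

```julia
# Hypothetical helper: derive the host name from the value of GITHUB_SERVER_URL.
function host_from_server_url(server_url::AbstractString; default::AbstractString = "github.com")
    isempty(server_url) && return default
    # Skip an optional scheme ("https://") and capture everything up to the next "/".
    m = match(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?([^/]+)", server_url)
    return m === nothing ? default : String(m[1])
end

host = host_from_server_url(get(ENV, "GITHUB_SERVER_URL", ""))
```

The API URL would still have to come from GITHUB_API_URL, since it cannot be inferred from the host.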

@tz-lom
Author

tz-lom commented Sep 3, 2025

@mortenpi I have a number of issues in testing which I will address, but can you provide initial feedback on the URL parsing that I implemented based on your ideas? What do you think about it?

Collaborator

@fingolfin fingolfin left a comment

Thank you, I think this is quite useful and sensible. I still have a bunch of comments :-)

Implementation of `DeployConfig` for deploying from GitHub Actions.
For self-hosted GitHub installation use `GitHubActions(host, pages_url)` constructor
Collaborator

Suggested change
- For self-hosted GitHub installation use `GitHubActions(host, pages_url)` constructor
+ For self-hosted GitHub installation use the `GitHubActions(host, pages_url)` constructor

Implementation of `DeployConfig` for deploying from GitHub Actions.
For self-hosted GitHub installation use `GitHubActions(host, pages_url)` constructor
to specify the host name and a **full path** to the GitHub pages location.
Collaborator

Is the "full path" referring to pages_url? I find that confusing: a path is not a URL. Maybe clarify the text?

Author

  • Review this description; it seems that the API update is not reflected here (is there a constructor that takes host as an argument?)

parts = split(github_repository, "/")
github_pages_url = if length(parts) == 2
    owner, repo = parts
    "https://$(owner).github.io/$(repo)/"
Collaborator

@fingolfin fingolfin Oct 30, 2025

Of course you are just moving this around, and that's already a good step; so what I say below is not a change request, but a general thought (perhaps we should record it in an issue).

Instead of hardcoding the pages URL like that (which won't work if e.g. a custom domain name is used, as e.g. Documenter.jl does), one could also query the pages URL from the GitHub API. E.g. if the gh tool is installed:

$ gh api repos/JuliaDocs/Documenter.jl --jq '.homepage'
https://documenter.juliadocs.org

But of course one can also do that without it, using just the github_api:

curl -s https://api.github.com/repos/JuliaDocs/Documenter.jl | jq -r '.homepage'

Obviously in Julia we'd use the JSON module, not jq, to parse this data.

Author

That would be great; however, I can't find any public API that will tell me where the GitHub Pages site is located.
homepage is not reliable and can point somewhere else.
There is a private API, /repos/{owner}/{repo}/pages, but it requires authentication.

# Regular tag build with GITHUB_TOKEN
withenv(
-     "GITHUB_EVENT_NAME" => "push",
+     "GITHUB_EVENT_NAME" => "push", "GITHUB_SERVER_URL" => "github.com",
Collaborator

Let's keep this to one per line?

Suggested change
- "GITHUB_EVENT_NAME" => "push", "GITHUB_SERVER_URL" => "github.com",
+ "GITHUB_EVENT_NAME" => "push",
+ "GITHUB_SERVER_URL" => "github.com",

and then the same everywhere below.

Author

  • Check that this is not caused by auto-formatting, and apply the change uniformly

logger = SimpleLogger(buffer, Logging.Debug)
with_logger(logger) do
withenv(
"GITHUB_EVENT_NAME" => "push", "GITHUB_SERVER_URL" => "github.com",
Collaborator

Suggested change
- "GITHUB_EVENT_NAME" => "push", "GITHUB_SERVER_URL" => "github.com",
+ "GITHUB_EVENT_NAME" => "push",
+ "GITHUB_SERVER_URL" => "github.com",

@fingolfin fingolfin added the Status: Waiting for Author The issue or pull request needs some action by its author label Oct 31, 2025