Commit b8c9daf

Migrate 'private repository support'
1 parent 270d1b3 commit b8c9daf

1 file changed

+139
-0
lines changed

@@ -0,0 +1,139 @@
# Private repository support for a fixed set of private GitHub organizations

### Written by

- Jonathan West (@jgwest)
- Originally written on September 26, 2023

At present, within AppStudio, all users' GitOps repositories are created within the [https://github.com/redhat-appstudio-appdata](https://github.com/redhat-appstudio-appdata) organization. The Application Service (HAS) component of AppStudio has the credentials for this organization, and uses the GitHub REST API to create/delete these repositories, and the Git API to push to them.

As of this writing, those 'redhat-appstudio-appdata' repositories are all public, but this is primarily because GitOps Service did not support private Git repositories in the early stages of the AppStudio project (and we have not been asked to add AppStudio private repository support since).

As part of [RHTAP-1023](https://issues.redhat.com/browse/RHTAP-1023), however, Git repositories may now be private, in order to support embargoed content. We in GitOps Service thus need to ensure that GitOps Service configures Argo CD to pull from AppStudio-managed private Git repositories.

Unlike [GITOPSRVCE-28](https://issues.redhat.com/browse/GITOPSRVCE-28), which allows users to provide credentials for their own private repos, this epic is limited to providing a single, global pool of tokens to be used for organization-managed GitOps repositories, such as [https://github.com/redhat-appstudio-appdata](https://github.com/redhat-appstudio-appdata).

* AFAIK this org is the ONLY org we need to provide private repository support for, at this time.

This means the epic is much more limited in scope than the more open, user-focused GITOPSRVCE-28.

# Out of Scope

As above, with this epic, there are no changes to the following behaviours of AppStudio:

* Users will not be able to provide their own private repository credentials.
* Users will not be able to provide their own GitOps repository URL.
* Users cannot customize their GitOps repository (beyond the ability to provide a custom devfile).

# Proposed Workflow

**1\) AppStudio maintains a list of GitHub API tokens (personal access tokens, PATs), either shared, or per team**

* Ideally we would be able to share HAS’ token pool, which would obviate the need for GitOps Service to maintain its own token pool.
* BUT, this requires consensus between the teams.
* The actual shared list of tokens is stored in app-sre’s HashiCorp Vault instance.
* An [External Secrets resource](https://github.com/redhat-appstudio/infra-deployments/blob/main/components/has/base/external-secrets/has-github-token.yaml) reads the secret from HashiCorp Vault, and writes it to a Secret in the 'gitops' Namespace.
* HashiCorp Vault \-\> External Secrets is the standard AppStudio mechanism for this.
* See below for the format of the Secret (based on the HAS team's format).

**2\) The 'cluster-agent' component of GitOps Service should be the only component that needs access to this token pool**

* Add env var(s) to cluster-agent, referencing the token list Secret in the Namespace.
* That would look like this:

```yaml
# cluster-agent's Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
spec:
  template:
    spec:
      containers:
      - command:
        - gitops-service-cluster-agent
        # (...)
        name: manager
        env:
        # One group of env vars for each org containing private repos. I currently expect we'll need only one, for 'http://github.com/redhat-appstudio-appdata'
        # TOKEN_POOL_1_*
        - name: TOKEN_POOL_1_ORG_URL
          value: "http://github.com/redhat-appstudio-appdata"
        - name: TOKEN_POOL_1_SECRET
          value: "token-pool-tokens" # reference to Secret in Namespace

        # (...)
        # TOKEN_POOL_N_*
        - name: TOKEN_POOL_N_ORG_URL
          value: "(...)"
        - name: TOKEN_POOL_N_SECRET
          value: "(...)"

---

# Token Pool Secret (actual contents coming from HashiCorp Vault via External Secrets)

apiVersion: v1
kind: Secret
metadata:
  name: token-pool-tokens
data:
  # Secret format from HAS
  tokens: "token1:(...),token2:(...),tokenN:(...),token7:(...)"
```
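
To make the configuration above concrete, here is a minimal Go sketch of how cluster-agent might read it. Everything in it is an assumption for illustration: the helper names `loadTokenPools` and `parseTokens` are hypothetical, the pools are assumed to be numbered contiguously from 1, and the `tokens` value is assumed to be a comma-separated list of `name:value` pairs, as the HAS format shown above suggests.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// tokenPool pairs an organization URL with the name of the Secret holding its tokens.
type tokenPool struct {
	OrgURL     string
	SecretName string
}

// loadTokenPools reads TOKEN_POOL_<i>_ORG_URL / TOKEN_POOL_<i>_SECRET pairs,
// starting at i=1 and stopping at the first index with no ORG_URL set.
func loadTokenPools(getenv func(string) string) ([]tokenPool, error) {
	var pools []tokenPool
	for i := 1; ; i++ {
		orgURL := getenv(fmt.Sprintf("TOKEN_POOL_%d_ORG_URL", i))
		if orgURL == "" {
			return pools, nil
		}
		secret := getenv(fmt.Sprintf("TOKEN_POOL_%d_SECRET", i))
		if secret == "" {
			return nil, fmt.Errorf("TOKEN_POOL_%d_SECRET is not set", i)
		}
		pools = append(pools, tokenPool{OrgURL: orgURL, SecretName: secret})
	}
}

// parseTokens splits an assumed "name1:value1,name2:value2" string and returns
// the token values in their original order. Errors mention only the position
// of a bad entry, never its content, so tokens do not leak into logs.
func parseTokens(raw string) ([]string, error) {
	var tokens []string
	for i, entry := range strings.Split(raw, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		name, value, found := strings.Cut(entry, ":")
		if !found || name == "" || value == "" {
			return nil, fmt.Errorf("malformed token entry at position %d", i)
		}
		tokens = append(tokens, value)
	}
	if len(tokens) == 0 {
		return nil, fmt.Errorf("token list is empty")
	}
	return tokens, nil
}

func main() {
	pools, err := loadTokenPools(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("found %d token pool(s)\n", len(pools))

	// Placeholder values, not real tokens.
	tokens, err := parseTokens("token1:aaa,token2:bbb")
	fmt.Println(len(tokens), err)
}
```

Keeping the token order stable matters for the later steps: the token chosen for a repository depends on its position in this list, so reordering the Secret would reassign repositories to different tokens.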

**3\) In cluster-agent, whenever cluster-agent creates/modifies an Argo CD Application CR (via an Application Operation), AND the '.spec.source.repoURL' field of the Argo CD Application CR matches one of the TOKEN\_POOL\_X\_ORG\_URLs defined in the environment variables, we should do the following:**

Create (ensure there exists) a [Repository Credential Secret](https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#repository-credentials) for that repository:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: "repo-cred-(sha-256 hash of repo url)"
  namespace: gitops-service-argocd
  labels:
    argocd.argoproj.io/secret-type: repo-creds

stringData:
  type: git
  url: "http://github.com/redhat-appstudio-appdata/(repo URL value from .spec.source field of Argo CD Application)"
  password: "(token from token pool, chosen using below algorithm)"
  username: username
```

**4\) Which value should we use for "(token from token pool)" in the previous step? We can use the following algorithm to determine which token to use:**

Pseudocode:

```go
// Pseudocode; requires "crypto/sha256"

githubTokenListFromSecret := []string{ /* read from has-github-token Secret */ }

// hash the URL (Sum256 takes a byte slice and returns a [32]byte)
hashedValue := sha256.Sum256([]byte(gitRepositoryURL))

// use the first byte of the hashed value to index into the token list
// (convert the byte to int so it can be combined with len)
secretIndex := int(hashedValue[0]) % len(githubTokenListFromSecret)

repositoryTokenValToUse := githubTokenListFromSecret[secretIndex]
```

**TL;DR**: hash the Git URL and use the hash to index into the token list, which spreads repositories roughly evenly across the tokens.
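
As a runnable illustration of the selection step, the sketch below wraps the same first-byte-of-SHA-256 logic in a function and counts how a batch of repository URLs spreads across a small pool. The function name `selectToken`, the empty-list check, the token values, and the `repo-%d` URLs are all assumptions made for the example; 341 is simply the Application count quoted elsewhere in this document.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// selectToken picks a token for gitRepositoryURL by hashing the URL with
// SHA-256 and using the first byte of the hash, modulo the list length,
// as the index. The same URL and list always yield the same token.
func selectToken(gitRepositoryURL string, tokens []string) (string, error) {
	if len(tokens) == 0 {
		return "", errors.New("token list is empty")
	}
	hashed := sha256.Sum256([]byte(gitRepositoryURL))
	index := int(hashed[0]) % len(tokens)
	return tokens[index], nil
}

func main() {
	// Placeholder token values, not real credentials.
	tokens := []string{"token-a", "token-b", "token-c"}

	// Count how many synthetic repository URLs land on each token.
	counts := make(map[string]int)
	for i := 0; i < 341; i++ {
		url := fmt.Sprintf("https://github.com/redhat-appstudio-appdata/repo-%d", i)
		tok, err := selectToken(url, tokens)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		counts[tok]++
	}
	fmt.Println(counts)
}
```

Because only one byte of the hash is used, the index can take at most 256 distinct values. That is fine for a small pool, but tokens beyond the 256th would never be selected, and when the pool size does not divide 256 the lower indices are chosen slightly more often than the higher ones.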

# Alternatives Considered

**Why not just define a GitOpsDeploymentRepositoryCredential CR in each Namespace, containing the credentials for the repo URL?**

* GitOpsDeploymentRepositoryCredential works great for cases where users have their own private Git repository, and their own private credentials.
* However, in this case, the user does not have the credentials for the private GitOps repository (these are known only to Red Hat).
* With GitOpsDeploymentRepositoryCredential, the token is stored in a Secret in the user’s ‘(username)-tenant’ namespace.
* In AppStudio, users can view Secrets in their own Namespace.
* Thus, with this approach, the GitHub PAT that we use to communicate with the repo would necessarily be viewable by the user.
* Thus, the only way this would work would be if we generated a PAT per user, which would be excessive.

**Rather than defining an Argo CD Repository Secret for each repository, why not define a single Argo CD Repository Secret to be shared by all the repos?**

* Since we have a large number of users on AppStudio, we want to ensure that we do not overuse a single PAT for all our Git requests, but rather distribute that work over multiple accounts (tokens).
* On multi-tenant prod, there are 341 Argo CD Applications (and roughly the same number of Git repos).
* Per-repository Secrets allow us to distribute the work roughly evenly (via SHA-256 hashes indexing into the token list) across all available tokens.