## Calculating code line changes for files in the last commit
This query reports how many lines of actual code (only code, not comments, blank lines or text) changed in each file of the last commit of each repository. It is similar to the previous example: `COMMIT_STATS` is, in effect, an aggregation over the result of `COMMIT_FILE_STATS`.

We will only report files whose language has been identified.

```
SELECT
    repo,
    JSON_UNQUOTE(JSON_EXTRACT(stats, '$.Path')) AS file_path,
    JSON_UNQUOTE(JSON_EXTRACT(stats, '$.Language')) AS file_language,
    JSON_EXTRACT(stats, '$.Code.Additions') AS code_lines_added,
    JSON_EXTRACT(stats, '$.Code.Deletions') AS code_lines_removed
FROM (
    SELECT
        repository_id AS repo,
        EXPLODE(COMMIT_FILE_STATS(repository_id, commit_hash)) AS stats
```

|`commit_stats(repository_id, [from_commit_hash], to_commit_hash) json`|returns the stats between two commits for a repository. If from is empty, it will compare the given `to_commit_hash` with its parent commit. Vendored files stats are not included in the result of this function. This function is more thoroughly explained later in this document.|
|`commit_file_stats(repository_id, [from_commit_hash], to_commit_hash) json array`|returns an array with the stats of each file in `to_commit_hash` since the given `from_commit_hash`. If from is not given, the parent commit will be used. Vendored files stats are not included in the result of this function. This function is more thoroughly explained later in this document.|
|`is_remote(reference_name) bool`| check if the given reference name is from a remote one |
|`is_tag(reference_name) bool`| check if the given reference name is a tag |
|`is_vendor(file_path) bool`| check if the given file name is a vendored file |
@@ -18,6 +18,7 @@ To make some common tasks easier for the user, there are some functions to inter
|`uast_extract(blob, key) text array`| extracts information identified by the given key from the uast nodes |
|`uast_children(blob) blob`| returns a flattened array of the children UAST nodes from each one of the UAST nodes in the given array |
|`loc(path, blob) json`| returns a JSON map, containing the lines of code of a file, separated in three categories: Code, Blank and Comment lines |
|`version() text`| returns the gitbase version in the following format `8.0.11-{GITBASE_VERSION}` for compatibility with MySQL versioning |

## Standard functions

These are all functions that are available because they are implemented in `go-mysql-server`, used by gitbase.
@@ -165,3 +166,121 @@ Nodes that have no value for the requested property will not be present in any w
Also, if you want to retrieve values from a non-common property, you can pass it directly:

> uast_extract(nodes_column, 'some-property')

## How to use `commit_file_stats`

`commit_file_stats` returns statistics about the line changes in every file in the given range of commits, classifying them into 4 categories: code, comments, blank lines and other.

It can be used in two ways:
- To get the statistics of the files in a specific commit: `COMMIT_FILE_STATS(repository_id, commit_hash)`
- To get the statistics of the files in a commit range: `COMMIT_FILE_STATS(repository_id, from_commit, to_commit)`

The result of this function is an array of JSON documents with the following shape:

```
{
    "Path": file path,
    "Language": file language,
    "Code": {
        "Additions": number of code additions in this file,
        "Deletions": number of code deletions in this file
    },
    "Comment": {
        "Additions": number of comment line additions in this file,
        "Deletions": number of comment line deletions in this file
    },
    "Blank": {
        "Additions": number of blank line additions in this file,
        "Deletions": number of blank line deletions in this file
    },
    "Other": {
        "Additions": number of other additions in this file,
        "Deletions": number of other deletions in this file
    },
    "Total": {
        "Additions": number of total additions in this file,
        "Deletions": number of total deletions in this file
    }
}
```
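
As a rough illustration of that shape, the Python sketch below parses one such document on the client side and reads a few of its fields. The sample path, language and counts are invented for the example and are not real gitbase output; `net_change` is a hypothetical helper, not part of gitbase.

```python
import json

# Hypothetical document with the shape described above; all values are invented.
sample = """
{
  "Path": "cmd/main.go",
  "Language": "Go",
  "Code": {"Additions": 12, "Deletions": 3},
  "Comment": {"Additions": 2, "Deletions": 0},
  "Blank": {"Additions": 1, "Deletions": 1},
  "Other": {"Additions": 0, "Deletions": 0},
  "Total": {"Additions": 15, "Deletions": 4}
}
"""


def net_change(doc, category):
    """Additions minus deletions for one category of a file-stats document."""
    return doc[category]["Additions"] - doc[category]["Deletions"]


doc = json.loads(sample)
categories = ("Code", "Comment", "Blank", "Other", "Total")
summary = {category: net_change(doc, category) for category in categories}

print(doc["Path"], doc["Language"], summary)
```

Every category shares the same `Additions`/`Deletions` pair, so a single helper can handle all five.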

**NOTE:** Files that are considered vendored are ignored when computing these statistics. Note that `.gitignore` is considered a vendored file.

Because the result of this function is an array of JSON documents, we need two functions to use its data effectively:
- `EXPLODE`, which gives each element of the array its own row
- `JSON_EXTRACT`, to get data from inside the documents
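
The two steps can be pictured with a small Python analogy. This is only a sketch of the idea: `explode` and `json_extract` below are hypothetical helpers written for this example (they are not gitbase APIs), the path handling covers only simple `$.A.B` paths, and the rows are invented.

```python
def explode(rows, column):
    """Yield one output row per element of the array stored in `column`."""
    for row in rows:
        for element in row[column]:
            # Copy the other columns and replace the array with one element.
            yield {**row, column: element}


def json_extract(doc, path):
    """Follow a simple '$.A.B' path into an already parsed JSON document."""
    value = doc
    for key in path.lstrip("$").strip(".").split("."):
        value = value[key]
    return value


# One input row whose `stats` column holds an array of per-file documents.
rows = [
    {
        "repository_id": "repo-a",
        "stats": [
            {"Path": "a.py", "Language": "Python", "Code": {"Additions": 12, "Deletions": 3}},
            {"Path": "b.go", "Language": "Go", "Code": {"Additions": 5, "Deletions": 0}},
        ],
    },
]

exploded = list(explode(rows, "stats"))
additions = [
    (row["repository_id"], json_extract(row["stats"], "$.Code.Additions"))
    for row in exploded
]
print(additions)
```

The single input row becomes two rows, one per file, and each can then be queried field by field.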

For example, to get the stats of the HEAD commits:

```sql
SELECT
    repository_id,
    EXPLODE(COMMIT_FILE_STATS(repository_id, commit_hash)) AS stats
FROM refs
WHERE ref_name = 'HEAD'
```

`EXPLODE` here makes sure a single row is returned for each result of `COMMIT_FILE_STATS`, instead of one array with all of them combined.

Then, to extract code additions from this:

```sql
SELECT
    repository_id,
    JSON_EXTRACT(stats, '$.Code.Additions')
FROM (
    SELECT
        repository_id,
        EXPLODE(COMMIT_FILE_STATS(repository_id, commit_hash)) AS stats
    FROM refs
    WHERE ref_name = 'HEAD'
) t
```

**NOTE:** Because of the way `JSON_EXTRACT` works, extracting `Path` or `Language` with it yields a quoted result (e.g. `"Python"` instead of `Python`). For these two string fields, combine `JSON_EXTRACT` with `JSON_UNQUOTE`, as in `JSON_UNQUOTE(JSON_EXTRACT(stats, '$.Path'))`.
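
The quoting comes from how JSON represents strings. The Python sketch below shows the same effect with the standard `json` module. It is only an analogy for the behavior described in the note, not gitbase code.

```python
import json

language = "Python"

# A JSON string value is serialized with its surrounding double quotes,
# which is what the quoted result described in the note looks like.
quoted = json.dumps(language)

# Decoding the JSON text removes the quotes again, analogous to JSON_UNQUOTE.
unquoted = json.loads(quoted)

# Numbers carry no quotes in JSON, so numeric fields need no unquoting.
additions = json.dumps(12)

print(quoted)     # "Python"
print(unquoted)   # Python
print(additions)  # 12
```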

## How to use `commit_stats`

`commit_stats` returns statistics about the line changes in the given range of commits, classifying them into 4 categories: code, comments, blank lines and other.

It can be used in two ways:
- To get the statistics of a specific commit: `COMMIT_STATS(repository_id, commit_hash)`
- To get the statistics of the diff of a commit range: `COMMIT_STATS(repository_id, from_commit, to_commit)`

`commit_stats` is pretty much an aggregation of the result of `commit_file_stats`. While `commit_file_stats` has the stats for each file in a commit, `commit_stats` has the global stats of all files in the commit. As a result, it outputs a single structure instead of an array of them.
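
To make the relationship concrete, here is a minimal Python sketch of such an aggregation. It assumes the commit-level numbers are plain sums of the per-file numbers and that `Files` is the number of per-file entries. The text only says "pretty much an aggregation", so treat both as assumptions. The `aggregate` and `stats` helpers and the sample data are hypothetical.

```python
CATEGORIES = ("Code", "Comment", "Blank", "Other", "Total")


def stats(additions, deletions):
    """Build one Additions/Deletions pair."""
    return {"Additions": additions, "Deletions": deletions}


def aggregate(file_stats):
    """Sum per-file stats into one commit-level structure (assumed behavior)."""
    result = {"Files": len(file_stats)}
    for category in CATEGORIES:
        result[category] = stats(
            sum(f[category]["Additions"] for f in file_stats),
            sum(f[category]["Deletions"] for f in file_stats),
        )
    return result


# Invented per-file documents in the shape returned by commit_file_stats.
file_stats = [
    {"Path": "a.py", "Language": "Python", "Code": stats(12, 3), "Comment": stats(2, 0),
     "Blank": stats(1, 1), "Other": stats(0, 0), "Total": stats(15, 4)},
    {"Path": "b.go", "Language": "Go", "Code": stats(5, 0), "Comment": stats(1, 1),
     "Blank": stats(0, 0), "Other": stats(0, 0), "Total": stats(6, 1)},
]

commit = aggregate(file_stats)
print(commit)
```

The output is a single structure with no `Path` or `Language`, since those only make sense per file.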

The shape of the result returned by this function is the following:

```
{
    "Files": number of files changed in this commit,
    "Code": {
        "Additions": number of code additions in this commit,
        "Deletions": number of code deletions in this commit
    },
    "Comment": {
        "Additions": number of comment line additions in this commit,
        "Deletions": number of comment line deletions in this commit
    },
    "Blank": {
        "Additions": number of blank line additions in this commit,
        "Deletions": number of blank line deletions in this commit
    },
    "Other": {
        "Additions": number of other additions in this commit,
        "Deletions": number of other deletions in this commit
    },
    "Total": {
        "Additions": number of total additions in this commit,
        "Deletions": number of total deletions in this commit
    }
}
```

**NOTE:** Files that are considered vendored are ignored when computing these statistics. Note that `.gitignore` is considered a vendored file.

The result returned by this function is a JSON document, so `JSON_EXTRACT` is needed to access its fields.

For example, code additions would be accessed like this: