This repository was archived by the owner on Oct 12, 2020. It is now read-only.

Documenting the algorithm and providing justification evidence #45

@changkun

Thank you for this very interesting project. Here I share a few of my tests while using the project.

I first tested a personal project of mine, which has about 3.9k stars. The result was not good.

$ docker run -t -e GITHUB_TOKEN=$GITHUB_TOKEN -v "/Users/changkun/dev/mct:/data/" ullaakut/astronomer changkun/modern-cpp-tutorial
Beginning fetching process for repository changkun/modern-cpp-tutorial
Pre-fetching all stargazers...ok
  > Selecting 200 first stargazers out of 3930
  > Selecting 800 random stargazers out of 3930
Fetching contributions for 1000 users up to year 2013
Building trust report...ok

Averages                             Score           Trust
--------                             -----           -----
Weighted contributions:              4132              E
Private contributions:               65                E
Created issues:                      9                 D
Commits authored:                    238               C
Repositories:                        37                A
Pull requests:                       6                 E
Code reviews:                        2                 E
Account age (days):                  1444              B
5th percentile:                      9                 A
10th percentile:                     24                A
15th percentile:                     59                A
20th percentile:                     85                B
25th percentile:                     111               C
30th percentile:                     157               C
35th percentile:                     194               D
40th percentile:                     328               C
45th percentile:                     436               C
50th percentile:                     541               D
55th percentile:                     770               D
60th percentile:                     899               D
65th percentile:                     1255              D
70th percentile:                     1579              D
75th percentile:                     2599              D
80th percentile:                     3652              D
85th percentile:                     5277              E
90th percentile:                     6836              E
95th percentile:                     14190             E
----------------------------------------------------------
Overall trust:                                         D

✔ Analysis successful. 1000 users computed.
GitHub badge available at https://img.shields.io/endpoint.svg?url=https%3A%2F%2Fastronomer.ullaakut.eu%2Fshields%3Fowner%3Dbilibili%26name%3Dkratos

Then I picked another project from the GitHub trending page:

$ docker run -t -e GITHUB_TOKEN=$GITHUB_TOKEN -v "/Users/changkun/dev/mct:/data/" ullaakut/astronomer bilibili/kratos
Beginning fetching process for repository bilibili/kratos
Pre-fetching all stargazers...ok
  > Selecting 200 first stargazers out of 5739
  > Selecting 800 random stargazers out of 5739
Fetching contributions for 1000 users up to year 2013
Building trust report...ok

Averages                             Score           Trust
--------                             -----           -----
Weighted contributions:              2536              E
Private contributions:               71                E
Created issues:                      6                 D
Commits authored:                    137               D
Repositories:                        30                A
Pull requests:                       6                 D
Code reviews:                        1                 E
Account age (days):                  1545              B
5th percentile:                      9                 A
10th percentile:                     25                A
15th percentile:                     43                A
20th percentile:                     55                C
25th percentile:                     74                D
30th percentile:                     106               D
35th percentile:                     146               D
40th percentile:                     188               D
45th percentile:                     245               D
50th percentile:                     349               D
55th percentile:                     490               D
60th percentile:                     638               E
65th percentile:                     832               E
70th percentile:                     1092              E
75th percentile:                     1577              E
80th percentile:                     2072              E
85th percentile:                     3117              E
90th percentile:                     5329              E
95th percentile:                     9192              E
----------------------------------------------------------
Overall trust:                                         D

✔ Analysis successful. 1000 users computed.
GitHub badge available at https://img.shields.io/endpoint.svg?url=https%3A%2F%2Fastronomer.ullaakut.eu%2Fshields%3Fowner%3Dbilibili%26name%3Dkratos

OK, then let's test TensorFlow.

$ docker run -t -e GITHUB_TOKEN=$GITHUB_TOKEN -v "/Users/changkun/dev/mct:/data/" ullaakut/astronomer tensorflow/tensorflow
Beginning fetching process for repository tensorflow/tensorflow
Pre-fetching all stargazers...ok
  > Selecting 200 first stargazers out of 131149
  > Selecting 800 random stargazers out of 131149
Fetching contributions for 1000 users up to year 2013
Building trust report...ok

Averages                             Score           Trust
--------                             -----           -----
Weighted contributions:              7495              D
Private contributions:               190               C
Created issues:                      18                B
Commits authored:                    198               D
Repositories:                        16                C
Pull requests:                       10                D
Code reviews:                        3                 D
Account age (days):                  1145              C
5th percentile:                      1                 E
10th percentile:                     2                 E
15th percentile:                     5                 E
20th percentile:                     10                E
25th percentile:                     22                E
30th percentile:                     32                E
35th percentile:                     40                E
40th percentile:                     59                E
45th percentile:                     76                E
50th percentile:                     114               E
55th percentile:                     153               E
60th percentile:                     217               E
65th percentile:                     368               E
70th percentile:                     707               E
75th percentile:                     1076              E
80th percentile:                     2109              E
85th percentile:                     3390              E
90th percentile:                     14580             D
95th percentile:                     30685             D
----------------------------------------------------------
Overall trust:                                         D

✔ Analysis successful. 1000 users computed.
GitHub badge available at https://img.shields.io/endpoint.svg?url=https%3A%2F%2Fastronomer.ullaakut.eu%2Fshields%3Fowner%3Dtensorflow%26name%3Dtensorflow

Issues with the Algorithm

This repo proposes an algorithm for judging repositories without any prior study of the algorithm's rationale. As a user of your algorithm, I would particularly expect the following evidence that it is accurate:

  1. Show a theoretical analysis of the influence of each defined factor, and provide a regression analysis and evidence of the algorithm's statistical stability.

  2. Benchmark various projects and illustrate how the algorithm matches the theoretical analysis for the top 10 most valuable open source projects, such as golang/go and torvalds/linux.

    "Those random stargazers can then sometimes be responsible for slight changes in the results, but they usually represent a difference of 1% to 3%, which is negligeable." -- README.md

    May I ask how you reached this conclusion? How large is your test sample? Which repositories does it include?

  3. Conduct a user study, which is an important way to evaluate usability. A single score typically cannot express many different aspects, and it is hard to tell from it whether a repo's stars are seriously fake or undeserved. Analyze quantitatively, for example, how other users feel about the score the algorithm produces: does the score match their mental expectation? Why? How could we help? These questions should be seriously considered.
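
The "1% to 3%" claim questioned in point 2 could be checked by repeating the random selection many times and measuring how much the averaged score moves between runs. Below is a minimal sketch of that idea, assuming the "200 first + 800 random" selection shown in the logs above. The function names are hypothetical, the scores are synthetic rather than real GitHub data, and this is not astronomer's actual implementation.

```python
import random
import statistics


def sample_stargazers(scores, n_first=200, n_random=800, rng=random):
    """Keep the first n_first scores and draw up to n_random more from the rest."""
    first = scores[:n_first]
    rest = scores[n_first:]
    return first + rng.sample(rest, min(n_random, len(rest)))


def relative_spread(scores, runs=50, seed=0):
    """Repeat the sampling and return (max - min) / mean of the per-run averages."""
    rng = random.Random(seed)
    means = [
        statistics.mean(sample_stargazers(scores, rng=rng)) for _ in range(runs)
    ]
    return (max(means) - min(means)) / statistics.mean(means)


if __name__ == "__main__":
    # Synthetic, heavy-tailed contribution counts (assumption, not measured data).
    gen = random.Random(42)
    population = [int(gen.lognormvariate(5, 2)) for _ in range(3930)]
    print(f"relative spread over 50 runs: {relative_spread(population):.1%}")
```

Running the same measurement on real stargazer data for several repositories, and reporting the number of runs and repositories used, would answer the sample-size question directly.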
