Skip to content

Conversation

iamYashSinha
Copy link

@iamYashSinha iamYashSinha commented Apr 24, 2023

  1. Added a new field (github_user_id) to the Firestore database.
    Screenshot from 2023-04-24 20-58-26

2. Refer to the video

2023-04-24.21-31-14.mp4
  1. Tests Added:
  • UNIT TESTS
    ✔ should add the github_user_id to the user collection
    ✔ should update the github_user_id in the user collection (40ms)
    ✔ should be stored correctly in the database

  • INTEGRATION TESTS
    ✔ Should update the user
    ✔ Should return an unauthorized error when not logged in

    Test Coverage Stats

    Controllers

    image
  1. RFC DOCUMENT

  2. MIGRATION DOCUMENT

  3. Ticket for cleanup strategy Remove temporary/one-time-use migration endpoint for GitHub duplication fix #1331

@iamYashSinha iamYashSinha marked this pull request as draft April 24, 2023 15:31
@iamYashSinha iamYashSinha changed the title fixing_doc_duplication GitHub duplication Fixed Apr 24, 2023
@iamYashSinha iamYashSinha marked this pull request as ready for review April 24, 2023 16:37
Copy link
Contributor

@prakashchoudhary07 prakashchoudhary07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done 🚀

Copy link
Contributor

@prakashchoudhary07 prakashchoudhary07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing, please check

@prakashchoudhary07
Copy link
Contributor

Can you please add tests for the case where the GitHub id is updated?

@vikhyat187
Copy link
Contributor

The test are failing rest things, look good.

@iamYashSinha
Copy link
Author

Yes working on the test cases.

@prakashchoudhary07
Copy link
Contributor

@iamYashSinha Please share the migration plan for existing uses, how will we add github_user_id to them?

@iamYashSinha
Copy link
Author

iamYashSinha commented Apr 26, 2023

Migration Plan

initially, the plan was to revoke all the users & tell them to reauthorize by visiting the Real Dev Squad website. But this method ended up creating another document of that user and then adding the github_user_id field to it. As of now, I've written the logic for the new users, but in order to do migration I've to find a way where can add the new github_user_id field to the existing user without creating any other document
@prakashchoudhary07

@iamYashSinha
Copy link
Author

iamYashSinha commented Apr 26, 2023

Can you please add tests for the case where the GitHub id is updated?

Tests Added:
Screenshot from 2023-04-26 06-25-26

@iamYashSinha
Copy link
Author

iamYashSinha commented Apr 26, 2023

@iamYashSinha Please share the migration plan for existing uses, how will we add github_user_id to them?
Ans - will be matching the github_id (which is storing the GitHub username) of two documents and if a match is found, will be inserting the github_user_id field. Refer to the attached video @prakashchoudhary07

2023-04-26.21-20-29.mp4

Copy link
Contributor

@heyrandhir heyrandhir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One concern that comes to mind is the possibility of the request timing out. Since we are performing multiple operations for all users, there is a risk that it could impact the production data or what if some docs are changed and some fail before the request is timed out 👀

Comment on lines 638 to 642
const invalidUsers = {
userId: `${userId}`,
username: `${username}`,
githubUsername: `${githubUsername}`,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : we could have just used the shorter format of creating objects instead of template literals

const invalidUsers = {userId, username, githubUsername };

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

})
.catch((error) => {
if (error.response && error.response.status === 404) {
countUserNotFound++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't count not found ++ and usersNotFound.push happen before the if block as if it doesnt go inside the if block then there will be a mismatch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out. Updated the code

Copy link
Member

@iamitprakash iamitprakash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add test coverage stats

};

// one time script function to perform the migration - adding github_user_id field to the document
const migrate = async (req, res) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use a better name for this please

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the name to addGtihubId

Comment on lines +608 to +611
const batchCount = Math.ceil(totalUsers / 500);
// Create batch write operations for each batch of documents
for (let i = 0; i < batchCount; i++) {
const batchDocs = usersSnapshot.docs.slice(i * 500, (i + 1) * 500);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than doing this why not directly fetch just 500 from db itself? something like paginated ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will lead to extra reads on our database as in firestore it reads all the documents up from the startAt document id and also this is simpler in terms of code readability

data: {
totalUsers: totalUsers,
usersUpdated: countUserFound,
usersNotUpdated: countUserNotFound,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are sending this i think we should add the reason as well for not updating.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be different reasons. So IMO, in such a case we can check our logs for the reason

Sinon.restore();
});

it("Should update the user", async function () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the title must be should add github_id to the user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the title

.post(`/users/migrate`)
.set("Cookie", `${cookieName}=${superUserAuthToken}`);
expect(usersMigrateResponse).to.have.status(200);
expect(usersMigrateResponse.body).to.eql({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use to deep equal or strict equal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code

router.get("/suggestedUsers/:skillId", authenticate, authorizeRoles([SUPERUSER]), users.getSuggestedUsers);

// WARNING!! - One time Script/Route to do migration.
router.post("/migrate", authenticate, authorizeRoles([SUPERUSER]), users.addGithubId);

Check failure

Code scanning / CodeQL

Missing rate limiting

This route handler performs [authorization](1), but is not rate-limited. This route handler performs [authorization](2), but is not rate-limited. This route handler performs [authorization](3), but is not rate-limited.
Copy link
Contributor

@RitikJaiswal75 RitikJaiswal75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make changes of removing the github_user_id from a few users and then testing if your script adds the field to the users.

router.get("/suggestedUsers/:skillId", authenticate, authorizeRoles([SUPERUSER]), users.getSuggestedUsers);

// WARNING!! - One time Script/Route to do migration.
router.post("/migrate", authenticate, authorizeRoles([SUPERUSER]), users.addGithubId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrate represents an action/verb verbs should not be a part of the API name

Copy link
Contributor

@shubham-y shubham-y Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

github-id-migrator or user-data-migrator
Is this fine?

@shubham-y
Copy link
Contributor

Please make changes of removing the github_user_id from a few users and then testing if your script adds the field to the users.

Updated the test to add a user without github_user_id field and testing if the script added the field or not

beforeEach(async function () {
await addUser(userWithoutGithubUserId);
fetchStub = Sinon.stub(axios, "get");
});

const usersResponse = await chai.request(app).get(`/users`).set("cookie", `${cookieName}=${superUserAuthToken}`);
expect(usersResponse).to.have.status(200);
usersResponse.body.users.forEach((document) => {
expect(document).to.have.property(`github_user_id`);
});

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC] GitHub Duplication
10 participants