# DeerLu1220/git-bias-banchmark
This repository hosts the project: a unified gender bias metric for MLMs and generative models.

TODO:
- [ ] Generate a list of male/female names and compute their frequency
  - [ ] **How do we calculate the frequency or the probability?**
  - [ ] Which prompt, if an LLM is used?
  - [ ] Where do we compute the frequency? In which corpus?
- [ ] Generate a list of sentences that contain the social categories and the male/female names
  - [ ] Check the prompt(s): how effective are they?
  - [ ] How many sentences do we generate per social category? (currently 2: one for male, one for female; try more than 2)
  - [ ] How many sentences also contain gender-oriented adjectives/pronouns?
- [ ] Create a readable/sortable TSV (tab-separated) file
- [ ] Make sure the order of 'category' and 'proper name' is not always the same (through a good prompt): the category may come first and the proper name second, or vice versa
- [ ] Compute the probabilities of Y given the prefix containing X, where X and Y are 'category' or 'proper name', when the 'proper name' is switched
  - E.g. "It's nice to hear that X just got a job as Y" (here X is the 'proper name')
  - "We just gave a job as X to Y" (here Y is the 'proper name')
- [ ] Go multilingual and check how results change between languages with richer morphology (e.g. Italian, German) and those without (e.g. Chinese)
- [ ] Evaluate the sentence-level bias based on the meeting with Simone
- [ ] Calculate the unified bias score for both MLMs and GLMs
- [ ] Use the pierce relating score to explore which part of the sentence, besides the key gender effector, leads to the bias
- [ ] Make slides of the results
- [ ] Check the calculation process against CrowS-Pairs
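The first TODO items ask how to turn name occurrences in a corpus into frequencies and probabilities. A minimal sketch of one possible answer, assuming a plain-text corpus, a fixed candidate name list, and exact case-sensitive matching (the corpus, names, and tokenizer below are placeholders, not project data):

```python
from collections import Counter
import re

def name_stats(corpus, names):
    """Count each candidate name in the corpus and normalize the counts
    into relative probabilities over the name list (one possible definition
    of 'probability' for the TODO above)."""
    tokens = re.findall(r"[A-Za-z]+", corpus)  # naive word tokenizer
    counts = Counter(tokens)
    total = sum(counts[n] for n in names) or 1  # guard against empty matches
    return {n: (counts[n], counts[n] / total) for n in names}

# Toy example: "Anna" appears twice out of five name mentions in total.
corpus = "Anna met John. John told Anna that Maria had just got a job as an engineer."
stats = name_stats(corpus, ["Anna", "John", "Maria", "Paul"])
# stats["Anna"] == (2, 0.4); stats["Paul"] == (0, 0.0)
```

In the actual pipeline the `Counter` lookup would be replaced by counts from the chosen reference corpus, and the same normalization could be applied per gender group to compare male vs. female name mass.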