Done by @ludovictanguy: update of the French flagged words.
Here's what I did for the French set of flagged words:
- removed all common words according to the guide: words without
any porn-oriented meaning (coq, bride), polysemous words (chatte),
and insults and swear words (merde, putain, pédé)
- removed all multi-word expressions (with spaces) as the ratio is
calculated on single words ("faire chier" etc.)
- removed all non-French words that presumably resulted from translation
errors (bollok, boob, buceta...)
- removed old and very rare slang words (bigornette, cramouille, turlute)
- when in doubt, I checked on Google whether the words led to porn pages in
the first 10 results (as suggested by HugoLaurencon); this led to
the removal of the masturb* and ejac* families, for instance.
- when I found suitable words, I added alternate spellings, inflected
and derived forms (baiseuse, pornographique etc.), using the Google
test.
- in some cases there are interesting variations of meaning according
to suffixes (e.g. "enculage" vs "enculation", only the latter being
porn-oriented). I did some work on this morphological
phenomenon a while ago (not specifically on this kind of word, but
we had to tackle them anyway), and I'm pleased to put it to good use.
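
To illustrate why multi-word expressions were dropped, here is a minimal sketch of a ratio computed on single words. It is not the project's actual filtering code: the function names `clean_flagged_words` and `flagged_words_ratio` are hypothetical, and lowercasing plus whitespace splitting are assumptions standing in for whatever tokenization the real pipeline uses.

```python
def clean_flagged_words(words):
    """Drop empty, multi-word, and duplicate entries, keeping first-seen order."""
    seen = set()
    cleaned = []
    for word in words:
        word = word.strip().lower()
        # An entry containing a space can never equal a single token,
        # so it would be dead weight in the list.
        if not word or " " in word or word in seen:
            continue
        seen.add(word)
        cleaned.append(word)
    return cleaned


def flagged_words_ratio(text, flagged_words):
    """Share of whitespace-separated tokens found in the flagged set."""
    tokens = text.lower().split()  # assumption: plain whitespace tokenization
    if not tokens:
        return 0.0
    flagged = set(flagged_words)
    return sum(token in flagged for token in tokens) / len(tokens)


if __name__ == "__main__":
    raw = ["pornographique", "faire chier", "Pornographique "]
    print(clean_flagged_words(raw))
    # The multi-word entry never matches, because tokens are single words.
    print(flagged_words_ratio("faire chier", ["faire chier"]))
```

Under these assumptions, an entry such as "faire chier" contributes nothing to the ratio, which is consistent with removing it from the list.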