Reddit bot that replies with the most relevant counterargument to debunk common anti-vegan myths.
The bot is summoned using mentions on Reddit: reply to the comment the bot should respond to, with /u/animalsupportbot somewhere in the reply. The bot then replies to that parent comment.
For example:
Non-vegan user: <Comment with myth(s)>
└── Vegan User: /u/animalsupportbot
└── animalsupportbot: <response>
You can also help the bot match arguments by 'hinting' when summoning it. This is done by adding helpful phrases or keywords to the summoning reply, separated by commas or full stops:
Non-vegan user: <Comment with myth(s)>
└── Vegan User: /u/animalsupportbot plants feel pain. vegan pets
└── animalsupportbot: <response, using hints to match more easily>
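As a rough illustration of how hints could be pulled out of a summoning reply, here is a minimal sketch. The function name `extract_hints` and the exact splitting rules are assumptions made for this example, not the bot's actual implementation; the only behaviour taken from the description above is that hints follow the mention and are separated by commas or full stops.

```python
import re
from typing import List

# The mention used to summon the bot, as described above.
BOT_MENTION = "/u/animalsupportbot"


def extract_hints(mention_text: str) -> List[str]:
    """Split the text of a summoning reply into hint phrases (hypothetical helper)."""
    # Drop the mention itself, wherever it appears in the reply.
    remainder = re.sub(re.escape(BOT_MENTION), " ", mention_text, flags=re.IGNORECASE)
    # Hints may be separated by commas or full stops.
    parts = re.split(r"[.,]", remainder)
    # Discard empty fragments and surrounding whitespace.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    print(extract_hints("/u/animalsupportbot plants feel pain. vegan pets"))
```

A reply containing only the mention yields an empty list, which corresponds to summoning the bot without hints.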
Information for contributing to responses can be found by clicking here
Tested on Python 3.6.9 using a virtualenv. Requirements can be found in requirements.txt.
pip install -r requirements.txt
python argmatcher.py
This populates ./preload_dicts/ with embeddings for each example for each myth. This saves us having to calculate them every time we want to restart the bot.
To run this step, config.yaml must exist in the repo directory. This contains the configuration for the classifier, in addition to things like secret keys to authenticate with the Reddit API. It looks something like this:
# Matching options
threshold: 0.8
certain_threshold: 0.9
n_neighbors: 3
# Hint matching options
hint_arg_threshold: 0.3
hint_threshold: 0.4
hint_certain_threshold: 0.8
hint_n_neighbors: 7
# Redditbot options
refresh_rate: 60
# User Info
user_info:
  client_id: "XXXXXXXX"
  client_secret: "XXXXXXXX"
  password: "XXXXXXXX"
  user_agent: "XXXXXXXX"
  username: "animalsupportbot"
# Whitelisted Subreddits
whitelisted:
- testanimalsupportbot
- vegan
- debateavegan
- vegancirclejerk
- veganforcirclejerkers
- veganuk
# Blacklisted Subreddits (in addition to defaults)
blacklisted:
- depression
- suicidewatch

- threshold is a float that determines the minimum similarity score that an input sentence must have to be declared a successful match.
- certain_threshold is a float that determines the similarity score above which the top-1 neighbor is selected, instead of using a weighted vote.
- n_neighbors is an int that determines how many neighbors to consider in the weighted-vote argument classification.
- hint_arg_threshold is a float that determines the similarity score that the mention text must have to be matched with an argument.
- hint_threshold is a float that determines the minimum similarity score that a sentence must have to be declared a successful match to a hinted argument.
- hint_certain_threshold and hint_n_neighbors are the same as certain_threshold and n_neighbors above, except that they apply to hinted argument matching.
- refresh_rate is an int that determines how long the bot will sleep for after checking mentions, in seconds.
- whitelisted is a list of subreddits in which the bot can respond to comments.
- blacklisted is a list of subreddits in which the bot is explicitly stopped from responding (depression and suicidewatch are both hard-coded, and not actually required in the config file).
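To make the roles of threshold, certain_threshold and n_neighbors more concrete, here is a minimal sketch of a similarity-weighted nearest-neighbor vote. It is an illustration under stated assumptions, not the bot's actual classifier: the function name `classify`, the `(label, similarity)` input format, the use of `>=` for comparisons, and the choice to drop neighbors below threshold before voting are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def classify(
    neighbors: List[Tuple[str, float]],
    threshold: float = 0.8,
    certain_threshold: float = 0.9,
    n_neighbors: int = 3,
) -> Optional[str]:
    """Pick an argument label from (label, similarity) pairs (hypothetical helper)."""
    # Keep only the n_neighbors most similar examples.
    top = sorted(neighbors, key=lambda pair: pair[1], reverse=True)[:n_neighbors]
    if not top:
        return None

    # If the best neighbor is similar enough, select it outright (top-1).
    best_label, best_similarity = top[0]
    if best_similarity >= certain_threshold:
        return best_label

    # Otherwise, let neighbors that clear the minimum threshold vote,
    # each weighted by its similarity (assumed weighting scheme).
    votes: Dict[str, float] = defaultdict(float)
    for label, similarity in top:
        if similarity >= threshold:
            votes[label] += similarity

    # No neighbor was similar enough: report no match.
    if not votes:
        return None
    return max(votes, key=lambda label: votes[label])


if __name__ == "__main__":
    print(classify([("taste", 0.88), ("plants", 0.85), ("plants", 0.84)]))
```

In this sketch, two moderately similar neighbors with the same label can outvote a single slightly closer neighbor, unless that neighbor clears certain_threshold.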
Once this exists, the reddit bot can be run with the following command:
python redditbot.py
After Step 1 has been completed, meaning embeddings have been precomputed, the argument matcher can be tested on the command-line using the following:
python argmatcher.py --test
This starts an interactive mode which can test various input sentences, for example:
Enter test sentence: but bacon though
Num neighbours with vote: 3
[{'input_sentence': 'but bacon though',
'matched_arglabel': 2,
'matched_argument': 'I Love How Animals Taste!',
'matched_text': 'but bacon though',
'reply_text': 'I think one of the most disconcerting things about the taste '
'excuse though, is that it is an excuse that bluntly admits '
'that the personal desires of an individual’s taste preference '
'matter more than the morality surrounding an animal’s life '
'and unquestionably horrific death. However, it doesn’t mean '
'that the person using this excuse necessarily believes that '
'their taste preferences are more important than an animal’s '
'life (most people I talk to don’t) but because they’ve never '
'been asked about it before they’ve never had to confront the '
'fact that through their actions they are placing their taste '
'higher. This is why when people say to me “I love the taste '
'of meat.” or, “I could never give up cheese.”, I like to ask '
'them “do you value your taste buds higher than the life of an '
'animal?” - most people will say no, but if they do say yes '
'make sure to ask them why. One of the big issues with this '
'excuse is that it seeks to validate a non-vegan diet by '
'claiming that we shouldn’t be held responsible for our '
'immoral activities because our selfish impulses are too '
'strong to be suppressed and as such, we can’t be held morally '
'accountable for the actions that we make. But where do we '
'draw the line?',
'similarity': 1.0}]
The information for the various arguments that the bot can match and respond to is located in knowledge. This folder has the following structure, with the myth "Plants Feel Pain" used as an example:
└── knowledge
├── myths
| ├── plants_feel_pain.yaml
| └── ...
└── responses
├── plants_feel_pain.md
└── ...
The .yaml files in knowledge/myths contain auxiliary information about the argument, and the .md files in knowledge/responses contain the actual response text.
Inside knowledge/myths are .yaml files containing the following information:
key: plants_feel_pain
title: Plants Feel Pain
full_comment: true
enable_resp: true
link: <URL>
examples:
- what if plants feel pain
- plants feel pain too
- how do you know plants don't feel pain

- key is the unique identifier for this argument. The response text in knowledge/responses must have the filename <key>.md.
- title is the formatted title for this argument.
- full_comment is a boolean which indicates whether or not the full response should be posted. If this is false, then the most similar sentence to the input in the response text is selected (along with the proceeding 5 sentences).
- enable_resp determines whether an argument should be responded to, if matched. This flag exists mainly to disable unfinished responses.
- link is an optional link to highlight the argument title with in the response, such as a YouTube video. If there is no link, this must be set to nan.
- examples are the example sentences/phrases which should link to this argument. These examples make up the "training set" for the nearest neighbor classifier.
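When adding a new myth, it can help to check the entry against the rules above before restarting the bot. The following is a minimal sketch of such a check, assuming the .yaml file has already been parsed into a plain dictionary; `validate_myth` and `REQUIRED_FIELDS` are hypothetical names for this example and are not part of the repository.

```python
from typing import Dict, List, Set

# Fields described above, with the Python type each is expected to parse to.
REQUIRED_FIELDS = {
    "key": str,
    "title": str,
    "full_comment": bool,
    "enable_resp": bool,
    "examples": list,
}


def validate_myth(myth: Dict[str, object], response_files: Set[str]) -> List[str]:
    """Return a list of problems found in one parsed myth entry (hypothetical helper)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in myth:
            problems.append(f"missing field: {field}")
        elif not isinstance(myth[field], expected_type):
            problems.append(f"wrong type for field: {field}")

    # The link field must be present; it is set to nan when there is no link.
    if "link" not in myth:
        problems.append("missing field: link")

    # The response text must live in a file named <key>.md.
    key = myth.get("key")
    if isinstance(key, str) and f"{key}.md" not in response_files:
        problems.append(f"missing response file: {key}.md")
    return problems


if __name__ == "__main__":
    myth = {
        "key": "plants_feel_pain",
        "title": "Plants Feel Pain",
        "full_comment": True,
        "enable_resp": True,
        "link": "nan",
        "examples": ["what if plants feel pain", "plants feel pain too"],
    }
    print(validate_myth(myth, {"plants_feel_pain.md"}))
```

Here `response_files` stands for the set of filenames found in knowledge/responses; how that set is collected is left out of the sketch.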
The responses are all stored in .md files in knowledge/responses. These should be written in the markdown style that Reddit uses.
Performance metrics for the bot can be obtained by using eval.py. If it has not been done before (or if changes to the knowledge base have been made), the pre-computed embeddings should be repopulated using:
python argmatcher.py
A test data CSV file can then be evaluated by running something like the following command:
python eval.py --eval-csv <test_data.csv> --n-neighbors 3 --threshold 0.5 --certain-threshold 0.9
This test_data.csv should be a two-column, comma-separated file with the headings: text, label. An example of such a file can be found here.
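For reference, a file in this two-column layout can be read with Python's standard csv module. The sketch below is only an illustration of the format: `load_test_rows` is a hypothetical helper, not the loader used by eval.py, and the label values in the usage example are placeholders.

```python
import csv
import io
from typing import List, TextIO, Tuple


def load_test_rows(fileobj: TextIO) -> List[Tuple[str, str]]:
    """Read (text, label) pairs from a two-column CSV file (hypothetical helper)."""
    reader = csv.reader(fileobj)
    header = next(reader, None)
    # The file must start with the headings: text, label.
    if header is None or [name.strip() for name in header] != ["text", "label"]:
        raise ValueError("expected the headings: text, label")

    rows = []
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"row {line_number} does not have exactly two columns")
        rows.append((row[0], row[1].strip()))
    return rows


if __name__ == "__main__":
    # Sentences containing commas must be quoted so they stay in one column.
    sample = io.StringIO(
        'text,label\n'
        'but bacon though,example_label\n'
        '"what if plants feel pain, though",another_label\n'
    )
    print(load_test_rows(sample))
```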
This script will provide the following outputs in a unique eval_logs/<eval_run_id>/ folder:
- Results summary: results.csv
  - Balanced accuracy
  - Averaged Precision and Recall ("micro", "macro", "weighted": see the sklearn docs)
- Confusion matrices:
  - cm_true.png: Normalized by true label (diagonal is per-class recall)
  - cm_pred.png: Normalized by pred label (diagonal is per-class precision)
  - cm_none.png: Un-normalized
- Every example evaluated: raw_results.csv
  - Contains predicted label, true label
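To show how these metrics relate to one another, here is a small pure-Python sketch that computes per-class recall, per-class precision and balanced accuracy from lists of true and predicted labels. It is an illustration only, and the function names are made up for this example; eval.py itself may rely on scikit-learn for these values. Dividing the confusion-matrix diagonal by the true-label totals gives per-class recall, dividing it by the predicted-label totals gives per-class precision, and balanced accuracy is the mean of the per-class recalls.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Diagonal of the confusion matrix normalized by true label."""
    counts = Counter(zip(y_true, y_pred))
    true_totals = Counter(y_true)
    return {label: counts[(label, label)] / total for label, total in true_totals.items()}


def per_class_precision(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Diagonal of the confusion matrix normalized by predicted label."""
    counts = Counter(zip(y_true, y_pred))
    pred_totals = Counter(y_pred)
    return {label: counts[(label, label)] / total for label, total in pred_totals.items()}


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


if __name__ == "__main__":
    y_true = ["a", "a", "a", "b"]
    y_pred = ["a", "a", "b", "b"]
    print(per_class_recall(y_true, y_pred))
    print(per_class_precision(y_true, y_pred))
    print(balanced_accuracy(y_true, y_pred))
```

In the usage example, class "a" has recall 2/3 and precision 1, while class "b" has recall 1 and precision 1/2.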