If possible, and for privacy reasons, I would split the message storage in two:
- the long term storage: whatever is needed for analytics (how many emails per period/campaign) and to train the model (the embedding vector)
- the short term storage: in principle, most of the personal information of the citizen (including the message) should be kept only until the politician has replied
We might need to adjust this a bit, for instance to allow the politician to delay the reply until after the final vote, but conceptually I'd like to treat them as two different stores with different purposes.
# long term storage
- messageid (UUID)
- channel (which provider if REST API, which email server if stalwart)
- timestamp
- senderid: hash of the sender's email
- campaign
- bge-m3 vector
- date replied
- id of the reply sent
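
A minimal sketch of what such a long term record could look like, to make the field list concrete. The class name `LongTermRecord`, the helper `hash_sender`, the attribute names, and the use of a keyed HMAC-SHA256 (rather than a plain hash) are all assumptions for illustration; the issue only says the sender id is a hash of the email.

```python
import hashlib
import hmac
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def hash_sender(email: str, secret: bytes) -> str:
    """Return a keyed hash of the normalised sender address (assumed scheme)."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalised, hashlib.sha256).hexdigest()


@dataclass
class LongTermRecord:
    channel: str                 # provider (REST API) or email server (stalwart)
    sender_id: str               # hash of the sender's email, never the email itself
    campaign: str
    embedding: list[float]       # bge-m3 vector
    message_id: uuid.UUID = field(default_factory=uuid.uuid4)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    replied_at: Optional[datetime] = None   # date replied
    reply_id: Optional[str] = None          # id of the reply sent


# Hypothetical values, only to show the shape of a record.
record = LongTermRecord(
    channel="stalwart",
    sender_id=hash_sender("Citizen@Example.org ", b"demo-secret"),
    campaign="example-campaign",
    embedding=[0.0, 0.1, 0.2],
)
```

Normalising the address before hashing keeps the same citizen mapped to the same `sender_id` across messages, which is what the per-campaign analytics would rely on.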
notes:
- one citizen can write to many representatives as part of the same campaign
- one citizen can write many times to the same representative
We should probably store them. So far we are using dupe_rank (duplicate rank) for the latter: anything more than 0 means the citizen is a bit spammy ;)
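
As a rough illustration of the dupe_rank idea, the sketch below numbers each message by how many earlier messages the same sender already sent to the same representative. The function name `assign_dupe_ranks`, the `representative_id` key, and the exact ranking rule are assumptions; the current implementation may compute it differently.

```python
from collections import defaultdict


def assign_dupe_ranks(messages):
    """Map message_id -> dupe_rank.

    `messages` is an iterable of (message_id, sender_id, representative_id)
    tuples, ordered oldest first. Rank 0 is the first message from a sender
    to a representative; each repeat increments the rank.
    """
    seen = defaultdict(int)
    ranks = {}
    for message_id, sender_id, representative_id in messages:
        key = (sender_id, representative_id)
        ranks[message_id] = seen[key]
        seen[key] += 1
    return ranks


# Hypothetical ids: alice writes twice to rep-a, once to rep-b.
ranks = assign_dupe_ranks([
    ("m1", "alice", "rep-a"),
    ("m2", "alice", "rep-b"),
    ("m3", "alice", "rep-a"),
    ("m4", "bob", "rep-a"),
])
```

Under this rule, writing to several representatives in one campaign keeps a rank of 0 for each, and only repeats to the same representative raise it.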
# short term storage
at least name + email + timestamp of the sender: whatever is needed to reply
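
A minimal sketch of the short term side, assuming an in-memory dict keyed by message id stands in for the real store. `ShortTermRecord`, `purge_replied`, and the field names are hypothetical; the point is only that the personal data is deleted once a reply has gone out, while the long term record stays.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ShortTermRecord:
    message_id: str
    name: str
    email: str
    received_at: datetime
    body: str  # the message itself, kept only until the reply is sent


def purge_replied(store: dict, replied_ids: set) -> int:
    """Delete records whose message has been replied to; return how many were removed."""
    to_delete = [mid for mid in store if mid in replied_ids]
    for mid in to_delete:
        del store[mid]
    return len(to_delete)


# Hypothetical data: m1 has been answered, m2 has not.
now = datetime.now(timezone.utc)
store = {
    "m1": ShortTermRecord("m1", "Alice", "alice@example.org", now, "Please vote yes."),
    "m2": ShortTermRecord("m2", "Bob", "bob@example.org", now, "Please vote no."),
}
removed = purge_replied(store, replied_ids={"m1"})
```

If politicians need to delay replies until after the final vote, the same function could take a cutoff date instead of a set of replied ids.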