storing messages #4

@tttp

If possible, and for privacy reasons, I would split the messages into two stores:

  1. the long term storage: whatever is needed for analytics (how many emails per period/campaign) and to train the model (the embedding vector)

  2. the short term storage: in principle, most of the citizen's personal information (including the message) should be kept only until the politician has replied

We might need to adjust this a bit, for instance to allow the politician to delay the reply until after the final vote, but conceptually I'd like to treat them as two different stores with different purposes

long term storage

  • messageid (UUID)
  • channel (which provider if REST API, which email server if Stalwart)
  • timestamp
  • senderid: hash of the sender's email
  • campaign
  • bge-m3 vector
  • date replied
  • id of the reply sent
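
A minimal sketch of what one long term record could look like, assuming Python dataclasses. The field names follow the list above; the `sender_id` helper, the choice of HMAC-SHA256 with a server-side secret, and the lower-casing of the address are assumptions for illustration, not something decided in this issue.

```python
import hashlib
import hmac
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def sender_id(email: str, secret: bytes) -> str:
    """Keyed hash of the sender address (assumption: HMAC-SHA256).

    The address is trimmed and lower-cased first so that the same
    citizen always maps to the same senderid.
    """
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalised, hashlib.sha256).hexdigest()


@dataclass
class LongTermMessage:
    channel: str                 # provider (REST API) or mail server (Stalwart)
    senderid: str                # output of sender_id(), never the raw email
    campaign: str
    vector: list[float]          # bge-m3 embedding of the message
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    messageid: uuid.UUID = field(default_factory=uuid.uuid4)
    date_replied: Optional[datetime] = None   # filled in once a reply is sent
    reply_id: Optional[str] = None            # id of the reply sent


record = LongTermMessage(
    channel="rest:example-provider",          # hypothetical channel label
    senderid=sender_id("citizen@example.org", b"server-side-secret"),
    campaign="example-campaign",
    vector=[0.0, 0.0, 0.0, 0.0],              # shortened placeholder vector
)
```

A keyed hash rather than a plain one is suggested here because email addresses are easy to guess and re-hash; whether that extra step is needed is open for discussion.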

notes:

  1. one citizen can write to many representatives as part of the same campaign
  2. one citizen can write many times to the same representative

We should probably store both. So far we are using dupe_rank (duplicate rank) for the latter: anything more than 0 means the citizen is a bit spammy ;)
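
One way dupe_rank could be computed, as a sketch only: the rank is the number of earlier messages from the same sender to the same representative within the same campaign, so the first message gets 0. The `recipient` key is hypothetical (the long term field list above does not name one), and the grouping by campaign is an assumption.

```python
from collections import defaultdict


def assign_dupe_rank(messages):
    """Return {messageid: dupe_rank} for an iterable of message dicts.

    Each dict needs messageid, senderid, recipient, campaign and timestamp.
    Messages are ranked in timestamp order within each
    (senderid, recipient, campaign) group, starting at 0.
    """
    seen = defaultdict(int)
    ranks = {}
    for msg in sorted(messages, key=lambda m: m["timestamp"]):
        key = (msg["senderid"], msg["recipient"], msg["campaign"])
        ranks[msg["messageid"]] = seen[key]
        seen[key] += 1
    return ranks


messages = [
    {"messageid": "m1", "senderid": "s1", "recipient": "rep-a",
     "campaign": "c1", "timestamp": 1},
    {"messageid": "m2", "senderid": "s1", "recipient": "rep-b",
     "campaign": "c1", "timestamp": 2},   # same citizen, other representative
    {"messageid": "m3", "senderid": "s1", "recipient": "rep-a",
     "campaign": "c1", "timestamp": 3},   # repeat to rep-a
]
ranks = assign_dupe_rank(messages)
# ranks == {"m1": 0, "m2": 0, "m3": 1}
```

With this definition, writing to many representatives (note 1) does not raise the rank; only repeats to the same representative (note 2) do.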

short term storage

at least name + email + timestamp of the sender, plus whatever else is needed to reply
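
A rough sketch of the short term side, assuming an in-memory store for illustration: the personal data lives only in `ShortTermStore` and is deleted as soon as the reply is recorded. The class and method names are hypothetical, and the `body` field reflects the earlier point that the message itself belongs in short term storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ShortTermMessage:
    messageid: str
    name: str
    email: str
    timestamp: datetime
    body: str                    # the message text, needed to write the reply


class ShortTermStore:
    """Holds personal data only until the politician has replied."""

    def __init__(self) -> None:
        self._rows: dict[str, ShortTermMessage] = {}

    def add(self, msg: ShortTermMessage) -> None:
        self._rows[msg.messageid] = msg

    def get(self, messageid: str) -> Optional[ShortTermMessage]:
        return self._rows.get(messageid)

    def mark_replied(self, messageid: str) -> datetime:
        """Delete the personal data and return the reply time.

        The returned time is what would go into "date replied"
        on the long term record.
        """
        self._rows.pop(messageid, None)
        return datetime.now(timezone.utc)


store = ShortTermStore()
store.add(ShortTermMessage(
    messageid="m1",
    name="Example Citizen",
    email="citizen@example.org",
    timestamp=datetime.now(timezone.utc),
    body="Please vote for the proposal.",
))
replied_at = store.mark_replied("m1")
```

If the reply may be delayed until after the final vote, the deletion could be tied to a deadline instead of the reply itself; that variation is not shown here.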
