we have a lot of code duplication in the udf code. to reduce it, we should check what is duplicated and what should be pulled out into shared code
e.g.:
- both sequence classification udfs are basically identical. keep calling them via separate templates, but consolidate the implementation? or pull everything that is duplicated out into a separate file and import it into both udfs?
- the code for generating the rank column appears in multiple udfs
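
a rough sketch of the "separate file" option, to make the discussion concrete. everything here is an assumption for illustration: the module name, the function names, the row format (a list of dicts with a `score` key), and the idea that rank means "1-based position by descending score". the real udfs may look quite different.

```python
# shared_udf_utils.py  (hypothetical shared module, name is an assumption)
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]


def add_rank_column(rows: List[Row], score_key: str = "score",
                    rank_key: str = "rank") -> List[Row]:
    """Return copies of the rows with a 1-based rank, highest score first.

    Assumption: this is what "generating the rank column" means in the udfs.
    """
    ordered = sorted(rows, key=lambda r: r[score_key], reverse=True)
    return [{**row, rank_key: i} for i, row in enumerate(ordered, start=1)]


def classify_sequences(texts: Sequence[str],
                       predict: Callable[[str], Sequence[float]],
                       labels: Sequence[str]) -> List[List[Row]]:
    """Shared body that both sequence classification udfs could call.

    Assumption: `predict` returns one score per label for a given text.
    """
    results = []
    for text in texts:
        scores = predict(text)
        rows = [{"label": l, "score": s} for l, s in zip(labels, scores)]
        results.append(add_rank_column(rows))
    return results
```

with something like this, each udf (still called over its own template) would shrink to a thin wrapper that imports `classify_sequences` and passes in its own model and labels, and any other udf that needs a rank column would import `add_rank_column` instead of carrying its own copy.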
goal: investigate what else is duplicated and how best to improve this. write tickets accordingly