MLPP Featuring 1.1.0
Code run at the CNAM to get MLPP features.
Steps to run the featuring:
1) Run the jar with spark-submit. Example:
spark-submit \
--executor-memory 110G \
--class fr.polytechnique.cmap.cnam.filtering.mlpp.MLPPProvisoryMain \
./SNIIRAM-flattening-assembly-1.0.jar cnam 10 30
Note: the expected arguments are, in order, environment, lagCount and bucketSize (in days).
2) The CSV features will be written to /shared/mlpp_features/<broad|narrow>/csv/ on HDFS, so a call to hdfs dfs -get is needed to copy them locally. Example:
mkdir mlpp_broad && cd mlpp_broad
hdfs dfs -get /shared/mlpp_features/broad/csv/*
3) Copy the MLPP_featuring.py script into the directory that holds the local features and run it from there. Example (run from the directory that contains both the script and mlpp_broad):
cp MLPP_featuring.py mlpp_broad && cd mlpp_broad
python MLPP_featuring.py
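The three steps above can be assembled programmatically. The following is only a sketch: the helper name featuring_commands and its parameter names are hypothetical, and it assumes that lagCount and bucketSize are integers, as in the example (10 and 30). It builds and prints the commands without running them.

```python
import shlex

# Values copied from the examples above.
MAIN_CLASS = "fr.polytechnique.cmap.cnam.filtering.mlpp.MLPPProvisoryMain"
JAR = "./SNIIRAM-flattening-assembly-1.0.jar"


def featuring_commands(environment, lag_count, bucket_size_days, variant="broad"):
    """Return the commands for steps 1 to 3 as argument lists (hypothetical helper)."""
    if variant not in ("broad", "narrow"):
        raise ValueError("variant must be 'broad' or 'narrow'")
    # Step 1: positional arguments are environment, lagCount, bucketSize (days).
    # int() is an assumption: the example passes whole numbers.
    spark_submit = [
        "spark-submit",
        "--executor-memory", "110G",
        "--class", MAIN_CLASS,
        JAR,
        environment, str(int(lag_count)), str(int(bucket_size_days)),
    ]
    # Step 2: fetch the CSV features from HDFS into the current directory.
    hdfs_get = ["hdfs", "dfs", "-get", f"/shared/mlpp_features/{variant}/csv/*"]
    # Step 3: run the script from the directory that holds the local features.
    run_script = ["python", "MLPP_featuring.py"]
    return [spark_submit, hdfs_get, run_script]


commands = featuring_commands("cnam", 10, 30, variant="broad")
for command in commands:
    # shlex.quote protects the '*' so that the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping each command as a list of arguments makes it straightforward to pass to subprocess.run later, once the printed commands have been checked by hand.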
Note: results.tar contains the results of the longitudinal multinomial model implemented in MLPP-147. The archive contains an HTML export of the notebook used to produce the results, and the coefficients obtained for several parameters. The coefficients were saved in text files using numpy.savetxt.
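Files written with numpy.savetxt can be read back with numpy.loadtxt. The sketch below shows the round trip on made-up values. The file name coefficients_example.txt and the helper load_coefficients are hypothetical, and the actual file names and array shapes inside results.tar are not documented here.

```python
import os
import tempfile

import numpy as np


def load_coefficients(path):
    """Load an array that was written with numpy.savetxt (hypothetical helper)."""
    return np.loadtxt(path)


# Made-up values, used only to demonstrate the round trip.
coeffs = np.array([[0.5, -1.25, 0.0],
                   [2.0, 0.75, -0.5]])

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; the names used in results.tar may differ.
    path = os.path.join(tmp, "coefficients_example.txt")
    np.savetxt(path, coeffs)          # default format: whitespace-separated text
    loaded = load_coefficients(path)  # 2-D input comes back as a 2-D float array

print(loaded.shape)
```

If the files in the archive were saved with a non-default delimiter, the same delimiter has to be passed to numpy.loadtxt.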