Hello, I am new to Python and machine learning but need to use this library for a project. I read the website and the sample code, but I am still unsure how to retrieve the features that each of the Relief algorithms selects.
Apologies if the site covers this, but I could not find any information on it. I have a couple of questions:
- How do we get back the features selected by each algorithm?
- The sample code below for the ReliefF algorithm prints a number at the end of the run. Is this number relevant to feature selection?
    import pandas as pd
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from skrebate import ReliefF
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    genetic_data = pd.read_csv('https://github.com/EpistasisLab/scikit-rebate/raw/master/data/'
                               'GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1.tsv.gz',
                               sep='\t', compression='gzip')

    features, labels = genetic_data.drop('class', axis=1).values, genetic_data['class'].values

    clf = make_pipeline(ReliefF(n_features_to_select=2, n_neighbors=100),
                        RandomForestClassifier(n_estimators=100))

    print(np.mean(cross_val_score(clf, features, labels)))
    >>> 0.795
Thanks for any help. I've been searching online for a couple of weeks to figure out this code but have not really gotten anywhere.
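On the second question: `cross_val_score` on a classifier pipeline returns one accuracy value per fold, so the printed number is the mean cross-validated accuracy of the whole pipeline, not a feature score. On the first question, here is a minimal sketch, assuming a fitted selector exposes one score per feature (the scikit-rebate README describes a `feature_importances_` attribute for this) and that the selected features are simply the `n_features_to_select` highest-scoring ones. The helper `rank_features` and the example names and scores are hypothetical, made up for illustration, and are not output from the dataset above.

```python
def rank_features(names, scores, n_features_to_select):
    """Return the n highest-scoring (name, score) pairs, best first.

    `names` and `scores` are parallel sequences: one score per feature.
    This helper is hypothetical and not part of scikit-rebate.
    """
    if len(names) != len(scores):
        raise ValueError("names and scores must have the same length")
    # Sort by score, highest first, then keep the top n entries.
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:n_features_to_select]


# Made-up names and scores for illustration only; real values would come
# from a fitted selector, not from this list.
example_names = ["N0", "N1", "P1", "P2"]
example_scores = [-0.01, 0.002, 0.11, 0.09]

selected = rank_features(example_names, example_scores, 2)
for name, score in selected:
    print(name, score)
```

With the code from the question, the call would presumably look like `fs = ReliefF(n_features_to_select=2, n_neighbors=100).fit(features, labels)` followed by `rank_features(genetic_data.drop('class', axis=1).columns, fs.feature_importances_, 2)`. Fitting the selector on its own, outside the pipeline, keeps the scores easy to reach; inside `cross_val_score` the pipeline is refit on every fold and the fitted steps are not returned.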