Skip to content

Questions about sample code? #57

@megancooper

Description

@megancooper

Hello, I am new to python and machine learning but need to use the library for a project. I read the website and the sample code but am still confused on how I can retrieve the features that have been (selected?) by each of the Relief algorithms.

Apologies if the site goes over this, but I didn't see any information on this. I had a couple questions:

  1. How do we get back the features selected by each algorithm?
  2. The sample code below for the ReliefF algorithm prints a number at the end of running the code, is this number relevant to feature selection?
import pandas as pd
import numpy as np
from sklearn.pipeline import make_pipeline
from skrebate import ReliefF
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

genetic_data = pd.read_csv('https://github.com/EpistasisLab/scikit-rebate/raw/master/data/'
                           'GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1.tsv.gz',
                           sep='\t', compression='gzip')

features, labels = genetic_data.drop('class', axis=1).values, genetic_data['class'].values

clf = make_pipeline(ReliefF(n_features_to_select=2, n_neighbors=100),
                    RandomForestClassifier(n_estimators=100))

print(np.mean(cross_val_score(clf, features, labels)))
>>> 0.795

Thanks for any help, I've been trying to figure out this code using the internet for a couple weeks now but have not really gotten anywhere

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions