
Performs automatic feature selection based on correlation and feature importance.

Usage

mi_filter_feat(data, cor_thresh = 0.7, imp_thresh = 0.99, union = FALSE)

Arguments

data

The data frame returned by mi_to_numer().

cor_thresh

The threshold for the Pearson correlation. If the correlation between two features exceeds this threshold, the two features are treated as redundant and one of them is removed.
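The idea behind a correlation threshold can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the use of the absolute correlation, and the rule of flagging the later column of each redundant pair are all assumptions made for the example.

```r
# Toy data (hypothetical): x2 is almost a copy of x1, x3 is independent noise.
set.seed(1)
x1 <- rnorm(100)
toy <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(100, sd = 0.1),
  x3 = rnorm(100)
)

cor_thresh <- 0.7

# Absolute Pearson correlation between every pair of columns (assumption:
# strong negative correlation also counts as redundancy).
cm <- abs(cor(toy, method = "pearson"))

# Zero the upper triangle and diagonal so each pair is considered once.
cm[upper.tri(cm, diag = TRUE)] <- 0

# For each pair above the threshold, flag the later of the two columns.
redundant <- rownames(cm)[apply(cm, 1, function(r) any(r > cor_thresh))]
redundant
```

Which member of a redundant pair is dropped is a design choice; here the earlier column is kept purely for simplicity.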

imp_thresh

The threshold for feature importance. Features are ranked by importance, and the lowest-ranked ones are removed according to how the remaining importance compares with imp_thresh.
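One plausible reading of an importance threshold is a cumulative cutoff: keep the most important features until their share of total importance reaches imp_thresh, and drop the rest. The sketch below follows that reading. The importance scores are made up, and the cumulative rule itself is an assumption rather than the documented behaviour.

```r
# Hypothetical importance scores; not produced by the package.
imp <- c(a = 0.50, b = 0.30, c = 0.15, d = 0.045, e = 0.005)
imp_thresh <- 0.99

# Rank features from most to least important.
imp <- sort(imp, decreasing = TRUE)

# Share of total importance covered by the top 1, 2, ... features.
cum_share <- cumsum(imp) / sum(imp)

# Smallest number of top features whose cumulative share reaches the threshold.
n_keep <- which(cum_share >= imp_thresh)[1]

# Everything ranked below that point is a candidate for removal.
low_imp <- names(imp)[-seq_len(n_keep)]
low_imp
```

Under this reading, a threshold close to 1 removes only the features that contribute almost nothing.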

union

How to combine the decisions of the correlation method and the importance method. If TRUE, a feature selected by either method is returned. If FALSE, only features selected by both methods are returned.
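The two combination modes correspond to a set union and a set intersection. A small base R sketch, using a hypothetical helper `combine_removals()` and made-up feature names, shows the difference:

```r
# Hypothetical helper, not part of the package: combine two sets of
# candidate features according to a logical flag.
combine_removals <- function(by_cor, by_imp, union = FALSE) {
  if (union) {
    base::union(by_cor, by_imp)      # selected by either method
  } else {
    base::intersect(by_cor, by_imp)  # selected by both methods
  }
}

by_cor <- c("x2", "x5")  # made-up result of the correlation method
by_imp <- c("x5", "x9")  # made-up result of the importance method

either <- combine_removals(by_cor, by_imp, union = TRUE)
both   <- combine_removals(by_cor, by_imp, union = FALSE)
```

With union = FALSE the result is never larger than with union = TRUE, so the default is the more conservative choice.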

Value

The names of the features that should be removed.
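Because the function returns names rather than a filtered data frame, the caller still has to drop those columns. A minimal sketch, assuming the returned names match column names of the data and using a hand-written vector `to_remove` as a stand-in for the return value:

```r
# Toy data frame standing in for the numeric data.
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# Stand-in for the value returned by mi_filter_feat() (assumed, not computed).
to_remove <- c("x2")

# Keep every column whose name is not in to_remove; drop = FALSE keeps a
# data frame even if only one column survives.
filtered <- df[, setdiff(names(df), to_remove), drop = FALSE]
names(filtered)
```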