
Is it possible to toggle a certain step in an sklearn pipeline?


I wonder if we can set up an "optional" step in sklearn.pipeline. For example, for a classification problem, I may want to try an ExtraTreesClassifier both with and without a PCA transformation ahead of it. In practice, this could be a pipeline with an extra parameter that toggles the PCA step, so that I can optimize over it via GridSearch, etc. I don't see such an implementation in the sklearn source, but is there any workaround?

Furthermore, since the valid parameter values of a later step in the pipeline might depend on the parameters of an earlier step (e.g., valid values of ExtraTreesClassifier.max_features depend on PCA.n_components), is it possible to specify such a conditional dependency in sklearn.pipeline and sklearn.grid_search?

Thank you!


Solution
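
As a rough sketch of one possible workaround (not necessarily the approach described in this answer): recent scikit-learn versions let a pipeline step be replaced by the string 'passthrough', and GridSearchCV accepts a list of parameter grids. Assuming scikit-learn 0.21 or later, the PCA step can be toggled by treating the step itself as a searchable parameter, and the dependency between max_features and n_components can be expressed by giving each n_components value its own grid. The step names "pca" and "clf", the synthetic data, and the specific grid values below are illustrative assumptions only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data for illustration only (10 input features).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)

# Step names "pca" and "clf" are arbitrary labels chosen for this sketch.
pipe = Pipeline([
    ("pca", PCA()),
    ("clf", ExtraTreesClassifier(n_estimators=20, random_state=0)),
])

# A list of grids: each dict is searched independently, so parameter
# combinations that would be invalid together are never generated.
param_grid = [
    # PCA switched off: the classifier sees all 10 original features.
    {"pca": ["passthrough"], "clf__max_features": [2, 5, 10]},
    # PCA switched on: max_features is kept <= n_components in each grid.
    {"pca__n_components": [4], "clf__max_features": [2, 4]},
    {"pca__n_components": [8], "clf__max_features": [2, 4, 8]},
]

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

If the best parameters contain "pca": "passthrough", the search preferred the pipeline without PCA. The cost of this design is that the conditional dependency has to be enumerated by hand, one grid per n_components value.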