Documentation for Preprocessing

Special kind of Branch.

If a Preprocessing pipe is added to a PHOTONAI Hyperpipe, all transformers are applied to the data ONCE BEFORE cross validation starts in order to prepare the data. Every added element should be a transformer PipelineElement.

Examples:

pre_proc = Preprocessing()
pre_proc += PipelineElement('OneHotEncoder', sparse=False)
my_pipe += pre_proc
Some transformations should be performed once, bundled at the beginning of the pipeline. The OneHotEncoder is a typical example: after the cross-validation split, some categories may be absent from individual subsets, so an encoder trained on one subset could fail on another. Applying the encoder through a Preprocessing object before the split avoids this problem.
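
The failure mode described above can be reproduced without PHOTONAI. The following is a minimal sketch: `TinyOneHotEncoder` is a hypothetical, simplified stand-in (it is neither the scikit-learn class nor PHOTONAI code), and the fold indices are chosen by hand so that one category appears only in the test fold.

```python
class TinyOneHotEncoder:
    """Simplified stand-in for a one-hot encoder (hypothetical, for illustration)."""

    def fit(self, values):
        # Assign one output column to each category seen during fitting.
        self.columns_ = {cat: i for i, cat in enumerate(sorted(set(values)))}
        return self

    def transform(self, values):
        rows = []
        for value in values:
            if value not in self.columns_:
                raise ValueError(f"unknown category: {value!r}")
            row = [0] * len(self.columns_)
            row[self.columns_[value]] = 1
            rows.append(row)
        return rows


colors = ["red", "green", "red", "blue", "green", "red"]
# Hand-picked split: "blue" occurs only in the test fold.
train_idx, test_idx = [0, 1, 2, 4], [3, 5]

# Fitting inside the fold: the encoder never sees "blue" and fails on the test fold.
fold_encoder = TinyOneHotEncoder().fit([colors[i] for i in train_idx])
try:
    fold_encoder.transform([colors[i] for i in test_idx])
    failed_inside_fold = False
except ValueError:
    failed_inside_fold = True

# Encoding once before the split: every fold shares the same, complete set of columns.
encoded = TinyOneHotEncoder().fit(colors).transform(colors)
test_rows = [encoded[i] for i in test_idx]
```

Encoding the full data set once means every later subset is simply a selection of rows that already have the same columns, which is the situation the Preprocessing object is meant to establish.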

Source code in photonai/base/photon_elements.py
class Preprocessing(Branch):
    """
    Special kind of Branch.

    If a Preprocessing pipe is added to a PHOTONAI Hyperpipe,
    all transformers are applied to the data ONCE
    BEFORE cross validation starts in order to prepare the data.
    Every added element should be a transformer PipelineElement.

    Example:
        ``` python
        pre_proc = Preprocessing()
        pre_proc += PipelineElement('OneHotEncoder', sparse=False)
        my_pipe += pre_proc
        ```
        Some transformations should be performed once, bundled at the beginning.
        The OneHotEncoder is a typical example: after the cross-validation split,
        some categories may be absent from individual subsets.
        An encoder trained on one subset could therefore fail on another.
        Applying it through the Preprocessing object avoids this problem.

    """
    def __init__(self):
        """Initialize the object."""
        super().__init__('Preprocessing')
        self.has_hyperparameters = False
        self.needs_y = True
        self.needs_covariates = True
        self._name = 'Preprocessing'
        self.is_transformer = True
        self.is_estimator = False

    def __iadd__(self, pipe_element: PipelineElement):
        """
        Add an element to the sub pipeline.

        Parameters:
            pipe_element:
                The transformer object to add.

        """
        if not hasattr(pipe_element, "transform"):
            raise ValueError("Pipeline Element must have transform function")
        # Validate before adding so that a rejected element is not left in the pipe.
        if len(pipe_element.hyperparameters) > 0:
            raise ValueError("A preprocessing transformer must not have any hyperparameter "
                             "because it is not part of the optimization and cross validation procedure")
        super(Preprocessing, self).__iadd__(pipe_element)
        return self

    def predict(self, data, **kwargs):
        warnings.warn("There is no predict function of the preprocessing pipe, it is a transformer only.")
        pass

    @property
    def _estimator_type(self):
        return

__iadd__(self, pipe_element) special

Add an element to the sub pipeline.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pipe_element | PipelineElement | The transformer object to add. | required |

Source code in photonai/base/photon_elements.py
def __iadd__(self, pipe_element: PipelineElement):
    """
    Add an element to the sub pipeline.

    Parameters:
        pipe_element:
            The transformer object to add.

    """
    if not hasattr(pipe_element, "transform"):
        raise ValueError("Pipeline Element must have transform function")
    # Validate before adding so that a rejected element is not left in the pipe.
    if len(pipe_element.hyperparameters) > 0:
        raise ValueError("A preprocessing transformer must not have any hyperparameter "
                         "because it is not part of the optimization and cross validation procedure")
    super(Preprocessing, self).__iadd__(pipe_element)
    return self
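
The two acceptance rules enforced by `__iadd__` (the element must offer `transform`, and it must not carry hyperparameters) can be tried out in isolation. The sketch below uses hypothetical stand-in classes (`FakeElement`, `FakeEstimator`, `FakePreprocessing`, `try_add`) rather than the PHOTONAI classes, so it runs without the library. In this sketch an element is stored only after both checks pass.

```python
class FakeElement:
    """Hypothetical stand-in for a transformer PipelineElement."""

    def __init__(self, name, hyperparameters=None):
        self.name = name
        self.hyperparameters = hyperparameters or {}

    def transform(self, X):
        return X


class FakeEstimator:
    """Hypothetical element that has no transform method."""

    hyperparameters = {}


class FakePreprocessing:
    """Hypothetical stand-in that mirrors only the two checks described above."""

    def __init__(self):
        self.elements = []

    def __iadd__(self, element):
        if not hasattr(element, "transform"):
            raise ValueError("Pipeline Element must have transform function")
        if len(element.hyperparameters) > 0:
            raise ValueError("A preprocessing transformer must not have any hyperparameter")
        self.elements.append(element)
        return self


def try_add(pre_proc, element):
    """Return True if the element was accepted, False if it was rejected."""
    try:
        pre_proc += element
        return True
    except ValueError:
        return False


pre_proc = FakePreprocessing()
accepted_plain = try_add(pre_proc, FakeElement("OneHotEncoder"))
accepted_tunable = try_add(pre_proc, FakeElement("PCA", {"n_components": [2, 5]}))
accepted_estimator = try_add(pre_proc, FakeEstimator())
```

Only the first element is accepted: the second carries a hyperparameter grid, which has no meaning outside the optimization loop, and the third cannot transform data at all.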

__init__(self) special

Initialize the object.

Source code in photonai/base/photon_elements.py
def __init__(self):
    """Initialize the object."""
    super().__init__('Preprocessing')
    self.has_hyperparameters = False
    self.needs_y = True
    self.needs_covariates = True
    self._name = 'Preprocessing'
    self.is_transformer = True
    self.is_estimator = False