
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/ensemble/plot_forest_importances_faces.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_ensemble_plot_forest_importances_faces.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_ensemble_plot_forest_importances_faces.py:


=================================================
Pixel importances with a parallel forest of trees
=================================================

This example shows the use of a forest of trees to evaluate the
impurity-based importance of the pixels in an image classification task on
the faces dataset. The hotter the pixel, the more important it is.

The code below also illustrates how the construction of the forest and the
computation of the predictions can be parallelized across multiple jobs.

.. GENERATED FROM PYTHON SOURCE LINES 16-24

Loading the data and model fitting
----------------------------------
First, we load the Olivetti faces dataset and limit it to the first five
classes. Then we train a random forest on the dataset and evaluate the
impurity-based feature importance. One drawback of this method is that it
cannot be evaluated on a separate test set. This is acceptable for this
example because we are interested in representing the information learned
from the full dataset. We also set the number of cores to use for the tasks.

.. GENERATED FROM PYTHON SOURCE LINES 24-26

.. code-block:: Python

    from sklearn.datasets import fetch_olivetti_faces








.. GENERATED FROM PYTHON SOURCE LINES 27-29

We select the number of cores to use to perform parallel fitting of
the forest model. ``-1`` means use all available cores.
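
As a rough illustration of how a negative ``n_jobs`` value maps to a worker
count, the sketch below follows the joblib convention, in which ``-1`` means
all cores and ``-2`` means all cores but one. The helper name
``resolve_n_jobs`` is hypothetical and is not part of scikit-learn or joblib;
it only shows the arithmetic, and it assumes that ``None`` falls back to a
single worker.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Hypothetical helper: translate an ``n_jobs`` value into a worker count."""
    if n_cpus is None:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        n_cpus = os.cpu_count() or 1
    if n_jobs is None:
        # Assumption: no explicit setting means a single worker.
        return 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> n_cpus, -2 -> n_cpus - 1, and so on, but never fewer than 1.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


print(resolve_n_jobs(-1, n_cpus=8))  # 8
print(resolve_n_jobs(-2, n_cpus=8))  # 7
print(resolve_n_jobs(4, n_cpus=8))   # 4
```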

.. GENERATED FROM PYTHON SOURCE LINES 29-31

.. code-block:: Python

    n_jobs = -1








.. GENERATED FROM PYTHON SOURCE LINES 32-33

Load the faces dataset.

.. GENERATED FROM PYTHON SOURCE LINES 33-36

.. code-block:: Python

    data = fetch_olivetti_faces()
    X, y = data.data, data.target







.. GENERATED FROM PYTHON SOURCE LINES 37-38

Limit the dataset to 5 classes.

.. GENERATED FROM PYTHON SOURCE LINES 38-42

.. code-block:: Python

    mask = y < 5
    X = X[mask]
    y = y[mask]
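
The filtering step above relies on NumPy boolean masking. The sketch below
shows the same pattern on a small synthetic array, so it can be run without
downloading the faces data. The names ``X_demo``, ``y_demo`` and
``mask_demo`` are made up for this illustration.

```python
import numpy as np

# Synthetic stand-in: 12 samples, 3 features, labels cycling through 0..5.
X_demo = np.arange(36, dtype=float).reshape(12, 3)
y_demo = np.arange(12) % 6

# True for every sample whose label is one of the first five classes (0..4).
mask_demo = y_demo < 5

# Indexing with the mask keeps the rows of X and the entries of y aligned.
X_small = X_demo[mask_demo]
y_small = y_demo[mask_demo]

print(X_small.shape)       # (10, 3)
print(np.unique(y_small))  # [0 1 2 3 4]
```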


.. GENERATED FROM PYTHON SOURCE LINES 43-44

A random forest classifier will be fitted to compute the feature importances.

.. GENERATED FROM PYTHON SOURCE LINES 44-50

.. code-block:: Python

    from sklearn.ensemble import RandomForestClassifier

    forest = RandomForestClassifier(n_estimators=750, n_jobs=n_jobs, random_state=42)

    forest.fit(X, y)


.. GENERATED FROM PYTHON SOURCE LINES 51-61

Feature importance based on mean decrease in impurity (MDI)
-----------------------------------------------------------
Feature importances are provided by the fitted attribute
``feature_importances_``. Within each tree, the impurity decrease is
accumulated per feature, and the forest reports the mean of these values
across its trees.

.. warning::
    Impurity-based feature importances can be misleading for **high
    cardinality** features (many unique values). See
    :ref:`permutation_importance` as an alternative.
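
To make the notion of impurity decrease concrete, here is a minimal sketch
that computes the weighted Gini impurity decrease of a single split, assuming
the Gini criterion (the default for ``RandomForestClassifier``). The function
names are hypothetical and this is not scikit-learn's implementation. MDI
accumulates such decreases per feature over the nodes that split on it.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_impurity_decrease(parent, left, right, n_total):
    """Impurity decrease of one split, weighted by the share of samples
    that reach the parent node (hypothetical helper)."""
    n_parent = len(parent)
    child_impurity = (
        len(left) * gini(left) + len(right) * gini(right)
    ) / n_parent
    return n_parent / n_total * (gini(parent) - child_impurity)


# A root node with two balanced classes, split perfectly into pure children.
parent = [0, 0, 0, 1, 1, 1]
left, right = [0, 0, 0], [1, 1, 1]
decrease = weighted_impurity_decrease(parent, left, right, n_total=len(parent))
print(decrease)  # 0.5
```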

.. GENERATED FROM PYTHON SOURCE LINES 61-77

.. code-block:: Python

    import time

    import matplotlib.pyplot as plt

    start_time = time.time()
    img_shape = data.images[0].shape
    importances = forest.feature_importances_
    elapsed_time = time.time() - start_time

    print(f"Elapsed time to compute the importances: {elapsed_time:.3f} seconds")
    imp_reshaped = importances.reshape(img_shape)
    plt.matshow(imp_reshaped, cmap=plt.cm.hot)
    plt.title("Pixel importances using impurity values")
    plt.colorbar()
    plt.show()


.. GENERATED FROM PYTHON SOURCE LINES 78-79

Can you still recognize a face?
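
The plot depends on mapping the flat importance vector back onto the pixel
grid. The sketch below shows that mapping with a synthetic importance vector
standing in for ``forest.feature_importances_``, and also lists the
coordinates of the most important pixels. It assumes 64x64 images, as in the
Olivetti dataset; all ``*_demo`` names are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
img_shape_demo = (64, 64)  # the Olivetti images are 64x64 pixels

# Synthetic stand-in for the importances: non-negative and summing to 1.
importances_demo = rng.random(img_shape_demo[0] * img_shape_demo[1])
importances_demo /= importances_demo.sum()

# One importance per pixel, laid out in the same row-major order as the image.
imp_image = importances_demo.reshape(img_shape_demo)

# Row/column coordinates of the 5 most important pixels, most important first.
top_flat = np.argsort(importances_demo)[::-1][:5]
rows, cols = np.unravel_index(top_flat, img_shape_demo)
for r, c in zip(rows, cols):
    print(f"pixel ({r}, {c}): importance {imp_image[r, c]:.6f}")
```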

.. GENERATED FROM PYTHON SOURCE LINES 81-90

The limitations of MDI are not a problem for this dataset because:

 1. All features are (ordered) numeric and will thus not suffer from the
    cardinality bias.
 2. We are only interested in representing the knowledge that the forest
    acquired on the training set.

If these two conditions are not met, it is recommended to use
:func:`~sklearn.inspection.permutation_importance` instead.
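
For cases where those conditions do not hold, a minimal sketch of the
permutation-based alternative is shown below. It uses a synthetic dataset from
``make_classification`` rather than the faces data, and the sample sizes,
``n_estimators`` and ``n_repeats`` values are arbitrary choices made for this
illustration. Unlike MDI, the importances here are evaluated on a held-out
test set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: 8 features, of which 3 are informative.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, random_state=0
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Shuffle each feature in turn on the test set and measure the score drop.
result = permutation_importance(
    clf, X_test, y_test, n_repeats=5, random_state=0
)

for i in result.importances_mean.argsort()[::-1]:
    mean, std = result.importances_mean[i], result.importances_std[i]
    print(f"feature {i}: {mean:.3f} +/- {std:.3f}")
```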


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.002 seconds)


.. _sphx_glr_download_auto_examples_ensemble_plot_forest_importances_faces.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_forest_importances_faces.ipynb <plot_forest_importances_faces.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_forest_importances_faces.py <plot_forest_importances_faces.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_forest_importances_faces.zip <plot_forest_importances_faces.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
