Next, you will see a statistical overview and a set of visualizations for each feature.
They include descriptive statistics, feature distribution visuals, distribution of the feature in time, and distribution of the feature in relation to the target. What's cool here:
- For each feature type (numerical, categorical, and datetime), the report generates a set of custom visualizations. They highlight what is most relevant for a given feature type.
- If you are performing the comparison, it also helps you detect changes quickly. For example, notice the number of new categories for a categorical feature.
- The visualization of the feature's relationship with the target helps build intuition on how useful the feature is or detect a target leak.
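To make the "new categories" idea concrete, here is a minimal sketch of how such a comparison could be computed for one categorical feature. It is not how Evidently implements it; the function name `category_summary` and the sample `season` values are hypothetical and used only for illustration.

```python
from collections import Counter


def category_summary(reference, current):
    """Compare the categories of one feature across two datasets.

    Illustrative helper, not part of Evidently.
    """
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    # Categories seen only in the current data
    new = sorted(set(cur_counts) - set(ref_counts))
    # Categories seen only in the reference data
    missing = sorted(set(ref_counts) - set(cur_counts))
    return {
        "unique_reference": len(ref_counts),
        "unique_current": len(cur_counts),
        "new_categories": new,
        "missing_categories": missing,
    }


# Hypothetical values for a categorical "season" feature
reference = ["winter", "spring", "winter", "spring"]
current = ["spring", "summer", "summer", "fall"]

summary = category_summary(reference, current)
print(summary)
```

A non-empty `new_categories` list is the kind of signal the report surfaces at a glance: the current data contains values the reference data never had.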
What's more, each plot is interactive! Evidently uses Plotly on the back end, so you can zoom in and out as needed or, for instance, switch a feature distribution between logarithmic and linear scale.
For example, here is how the summary widget for a numerical feature might look: