Replies: 11 comments
-
I don't think this is a great fit for seaborn. It's already in pandas (as you note) and also [...]
-
@mwaskom Coincidentally, I might have an interesting use case for this, where it would be beneficial to have an easy way to add additional axes (or at least a second one similar to [...]).

I want to visualize the result of a grid search on a regression model while tracking two metrics/scores. The catch is that one metric (max_error) is absolute and the other (MAPE) is a percentage. Both metrics are useful to me because together they give an estimate of overall performance and worst-case performance. One way I can currently do this is by using a facet over metrics:

    import seaborn.objects as so

    (
        so.Plot(grid_result, x="max_depth", y="score")
        .facet(col="metric")
        .add(so.Line(), so.Agg())
        .add(so.Band())
        .share(y=False)
    )

This is nice, but a bit hard to read, because I need to go back and forth between figures. With base matplotlib, I can put the two metrics on separate y-axes:

    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
    sns.lineplot(data=grid_result.query("metric == 'mape'"), x="max_depth", y="score", color="tab:blue", ax=ax1)
    sns.lineplot(data=grid_result.query("metric == 'max_error'"), x="max_depth", y="score", color="tab:red", ax=ax2)
    ax1.set_ylabel("mape (blue)")
    ax2.set_ylabel("max_error (red)")

It would be nice if we could get this done in seaborn without having to drop down to matplotlib, especially because this would free up the dimensions used by a facet to be used by other variables, e.g., grid search parameters.
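
For anyone who wants to run the two snippets above: they assume a long-form `grid_result` with `max_depth`, `metric`, and `score` columns. A minimal sketch of building such a frame follows. The scores are made-up placeholders and the `fold` column is a hypothetical addition, neither comes from the original post.

```python
import pandas as pd

# Placeholder scores, invented purely for illustration.
wide = pd.DataFrame({
    "max_depth": [2, 2, 4, 4],
    "fold": [0, 1, 0, 1],  # hypothetical cross-validation fold index
    "mape": [0.21, 0.19, 0.15, 0.17],
    "max_error": [40.0, 44.0, 31.0, 35.0],
})

# One row per (max_depth, fold, metric), matching
# x="max_depth", y="score", and col="metric" in the plots above.
grid_result = wide.melt(
    id_vars=["max_depth", "fold"],
    value_vars=["mape", "max_error"],
    var_name="metric",
    value_name="score",
)
print(grid_result)
```

With several folds per `max_depth`, `so.Agg()` has something to aggregate and `so.Band()` has a range to show.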
-
I think a [...]
-
Isn't [...]
-
I'm having trouble seeing it that way. In a parallel coordinates plot there isn't a separate [...]

    import matplotlib.pyplot as plt
    import seaborn as sns
    import seaborn.objects as so

    (
        sns.load_dataset("iris")
        .rename_axis("example")
        .reset_index()
        .melt(["example", "species"])
        .pipe(so.Plot, x="variable", y="value", color="species")
        .add(so.Lines(alpha=.5), group="example")
    )

BTW

This seems to work for me? (Of course it has the same limitations of not playing nicely with faceting, etc., as the function interface.)

    healthexp = sns.load_dataset("healthexp")
    f, ax1 = plt.subplots()
    ax2 = ax1.twinx()
    p = so.Plot(healthexp, x="Year", group="Country")
    p.add(so.Line(), so.Agg(), y="Spending_USD").on(ax1).plot()
    p.add(so.Line(color="r"), so.Agg(), y="Life_Expectancy").on(ax2).plot()
-
In the first plot above, would it be possible to (minmax) normalise the data on the Y-axis?
-
Right! I had indeed misunderstood the parallel coordinates plot; they are separate things. Sorry about that. @mwaskom Should I create a new issue/feature request to track [...]

Cool! Then this was user error on my side. I didn't call [...]

    import matplotlib.pyplot as plt
    import seaborn as sns
    import seaborn.objects as so

    healthexp = sns.load_dataset("healthexp")
    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()
    (
        so.Plot(healthexp, x="Year", group="Country", y="Spending_USD")
        .add(so.Line(color="tab:blue"), so.Agg())
        .on(ax1)
        .plot()  # explicitly draw onto ax1
    )
    (
        so.Plot(healthexp, x="Year", group="Country", y="Life_Expectancy")
        .add(so.Line(color="tab:red"), so.Agg())
        .on(ax2)
        .plot()  # explicitly draw onto ax2
    )

@EwoutH Absolutely. Just transform your data before handing it over to the plot :)

    import numpy as np
    import pandas as pd
    import seaborn as sns
    import seaborn.objects as so

    iris: pd.DataFrame = sns.load_dataset("iris")

    def normalize(df, columns):
        # Min/max normalization of each selected column to the range [0, 1]
        normalized = df.loc[:, columns].apply(
            lambda data: (data - np.min(data)) / np.ptp(data)
        )
        return df.assign(**{col: normalized[col] for col in normalized})

    (
        iris.rename_axis("example")
        .reset_index()
        .pipe(
            normalize,
            columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
        )
        .melt(["example", "species"])
        .pipe(so.Plot, x="variable", y="value", color="species")
        .add(so.Lines(alpha=0.5), group="example")
    )
-
This isn't good enough tracking for you? :) (Line 602 in 021a20f)

You don't need to invoke [...] The key thing is explicitly calling [...]
-
You could also do this with a move transform:

    # Assumes `iris` and `so` as defined in the snippet above
    class NormByOrient(so.Move):
        def __call__(self, df, groupby, orient, scales):
            # Rescale the non-orient variable to [0, 1] within each orient level
            other = {"x": "y", "y": "x"}[orient]
            return df.assign(**{
                other: df.groupby(orient)[other]
                .transform(lambda x: (x - x.min()) / (x.max() - x.min()))
            })

    (
        iris
        .rename_axis("example")
        .reset_index()
        .melt(["example", "species"])
        .pipe(so.Plot, x="variable", y="value", color="species", group="example")
        .add(so.Lines(alpha=.5), NormByOrient())
    )

I'm 👎 on adding a move transform that does this specifically, but open to having it work within a more general operation. The existing [...]

But also, I suspect that in most cases where you're doing a parallel coordinates plot your data are going to be in "wide form", as that's how you'd hand them to an ML library, so the [...]
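
To make the wide-form versus long-form point concrete, here is a small sketch on a made-up three-row frame (the values are placeholders, not the iris data). It melts wide data to long form and then applies the same per-variable min/max rescaling that the move transform above performs through `groupby(orient)`.

```python
import pandas as pd

# Toy wide-form data: one row per example, one column per variable.
wide = pd.DataFrame({
    "species": ["a", "a", "b"],
    "sepal_length": [4.0, 5.0, 6.0],
    "petal_length": [1.0, 3.0, 2.0],
})

# Long form: one row per (example, variable) pair.
long_df = (
    wide.rename_axis("example")
    .reset_index()
    .melt(["example", "species"])
)

# Rescale "value" to [0, 1] separately within each "variable".
long_df["value"] = long_df.groupby("variable")["value"].transform(
    lambda x: (x - x.min()) / (x.max() - x.min())
)
print(long_df)
```

Each variable ends up spanning exactly 0 to 1, so the vertical axes of a parallel coordinates plot become comparable regardless of the original units.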
-
Indeed, that's the crux. I actually think the documentation is fine as is; it's just a bit easy to miss because it is part of the detailed explanation of [...]

If you are willing to accept a PR for this, I can look into that.
-
Duplicating the information doesn't sound like a great idea, but maybe "Notes" would be a better section. Then again, the numpydoc standard says:
Of course, the docs don't really adhere to that standard religiously... |
-
When visualizing high-dimensional datasets, parallel coordinates plots are sometimes very useful. I would love for seaborn to have a built-in function to do this!
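
As a point of reference for the request, a reply in this thread notes that pandas already ships this plot type. A minimal sketch using `pandas.plotting.parallel_coordinates` is below; the four-row frame is made-up placeholder data and the colors are an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Placeholder measurements, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "sepal_length": [5.1, 4.9, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.3, 2.7],
    "petal_length": [1.4, 1.4, 6.0, 5.1],
})

fig, ax = plt.subplots()
# One polyline per row, colored by the class column.
parallel_coordinates(df, class_column="species", ax=ax, color=["tab:blue", "tab:red"])
```

Call `fig.savefig(...)` or switch to an interactive backend to view the result. Note that all variables share one y-axis here, so columns on very different scales may need rescaling first.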
Resources