Please consider this code for plotting multiple lines:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = [1,2,3]
y = [ [30, 4, 50], [300,400,500], [350,450,550] ]
plt.plot(x, y)
that produces:
I could not figure out how to do it in plotnine, so I asked a famous LLM and received a complex answer, which I simplified as follows:
import numpy as np
import plotnine as p9
import pandas as pd
import matplotlib.pyplot as plt
xx = np.array(x * len(y))
yy = np.ravel(y)
yyy = [val for sublist in y for val in sublist]
gg = [i+1 for i in range(len(y)) for _ in range(len(x))]
data = pd.DataFrame({'x':xx, 'y':yy, 'gg':gg})
plot = (
    p9.ggplot(data, p9.aes(x='x', y='y', color='factor(gg)'))
    + p9.geom_line()
)
plot.draw(True)
The two images are different and the correct one is the first, built by matplotlib.
So the question: how am I supposed to do this simple plot with plotnine?
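Another reason not to use the infamous LLM. When y is 2D, plt.plot(x, y) draws one line per column of y, whereas the code above groups the values by row. Here's what you can do: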
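xx = np.repeat(x, len(y[0]))  # each x value once per line (column of y)
yy = np.ravel(y)  # row-major flattening of y
gg = np.tile(np.arange(len(y[0])), len(x))  # column index identifies the line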
data = pd.DataFrame({'x':xx, 'y':yy, 'gg':gg})
Or with pure pandas:
data = (
    pd.DataFrame(y, index=x)       # your data, one column per line
    .stack()                       # to long form for p9
    .rename_axis(['x', 'gg'])      # name the `x` and group levels
    .reset_index(name='y')         # name the value column `y`
)
Then
plot = (
    p9.ggplot(data, p9.aes(x='x', y='y', color='factor(gg)'))
    + p9.geom_line()
)
plot.draw(True)
Output:
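As a sanity check, here is a minimal sketch that builds the long-form table both ways for a non-square y (3 x values, 2 lines) and confirms they agree. The helper names `long_numpy` and `long_pandas` are made up for illustration, and the sketch assumes y has one row per x value, as in the question.

```python
import numpy as np
import pandas as pd


def long_numpy(x, y):
    """Long-form frame built with NumPy; one group per column of y."""
    y = np.asarray(y)
    n_rows, n_cols = y.shape  # assumption: n_rows == len(x)
    return pd.DataFrame({
        "x": np.repeat(x, n_cols),                  # each x once per column
        "gg": np.tile(np.arange(n_cols), n_rows),   # column index cycles within each row
        "y": y.ravel(),                             # row-major flattening
    })


def long_pandas(x, y):
    """Same table via stack(); column labels become the group id."""
    return (
        pd.DataFrame(y, index=x)
        .stack()
        .rename_axis(["x", "gg"])
        .reset_index(name="y")
    )


x = [1, 2, 3]
y = [[30, 4], [300, 400], [350, 450]]  # 3 x values, 2 lines

a = long_numpy(x, y)
b = long_pandas(x, y)

# Both constructions should yield identical rows in the same order.
same = bool((a.to_numpy() == b.to_numpy()).all())

# The first line is the first column of y, matching what plt.plot(x, y) draws.
first_line = a.loc[a["gg"] == 0, "y"].tolist()
print(same, first_line)
```

Either frame can be passed to the p9.ggplot call above unchanged, since both expose the columns x, gg and y.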