While doing EDA of Titanic dataset in Kaggle I combined "Parch" and "SibSp" values into single feature "relative" containing total no of relatives of each passenger.
data['relative'] = data['Parch'] + data['SibSp']
On plotting barplot relatives vs Survived using Seaborn I have got following output
Query
I can't understand this barplot as Survived feature in this dataset has only two values 1 and 0 and relative also has 6 values ranging from 1 to 6 so how this barplot is generated and What information it is showing?
From seaborn.barplot
documentation.
help(seaborn.barplot)
The plot is showing you the mean of the 'Survived' values per bin ('N of relatives'). You could specify the estimator
parameter to change this behaviour (e.g. estimator=median
).
A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle and provides some indication of the uncertainty around that estimate using error bars. Bar plots include 0 in the quantitative axis range, and they are a good choice when 0 is a meaningful value for the quantitative variable, and you want to make comparisons against it.