I am trying to plot the PDF of some data on the angular orientation of particles in Python, using seaborn (sns). The data cover the range from -180 to 180 degrees, and I have problems with the fit of the distribution, especially around the edges.
This is the snippet of code I use for the plotting:
for shear_value, color in zip(unique_shear_values, shear_palette_colors):
    subset = orientations[orientations['Shear'] == shear_value]
    sns.histplot(data=subset['Angle'], stat="density", kde=True, label=f'Shear: {shear_value}', color=color, bins=128)
Is there any way to obtain a PDF that better fits the behavior of the data, especially at the edges?
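One trick is to repeat the data periodically: copy all angles below 0 with 360° added, and copy all angles at or above 0 with 360° subtracted. The data then span -360° to +360°, so create the histogram over that range with twice the number of bins, and limit the x-axis to the range -180° to +180°. Because the KDE now sees the data on the other side of the ±180° boundary, the curve is no longer distorted at the edges.
A drawback is that the density is normalized over twice as many points, so everything on the y-axis is drawn at half its true height. Plotting on a dummy twin axis and setting the visible axis to twice its limits shows the correct y-scaling.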
The following code illustrates the idea:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# create some reproducible test data
np.random.seed(42)  # fixed seed (arbitrary value) so the test data are the same on every run
y, x = np.random.randn(2, 600, 3).cumsum(axis=1).reshape(2, -1)
angles = np.degrees(np.arctan2(y, x))
# double the range by repeating the angles to the left and to the right
angles = np.concatenate([angles, angles[angles < 0] + 360, angles[angles >= 0] - 360])
fig, ax = plt.subplots()
ax_twinx = ax.twinx()
sns.histplot(angles, kde=True, stat='density',
             binrange=(-360, 360), bins=360, kde_kws={'bw_adjust': 0.5}, ax=ax_twinx)
ax.set_xlim(-180, 180)
ax.set_ylim(0, ax_twinx.get_ylim()[1] * 2) # set the limits to twice the limits of the dummy axis
ax_twinx.set_yticks([]) # remove the ticks of the dummy axis
ax.set_ylabel(ax_twinx.get_ylabel()) # copy the y-label
ax_twinx.set_ylabel(None) # remove the label of the dummy y-axis
ax.set_title('repeating angles left and right')
plt.show()
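The factor of two on the y-axis can be checked numerically without plotting. The following is a minimal sketch using only NumPy; `wrap_angles` is a hypothetical helper name (not part of seaborn or NumPy), and the von Mises test data, the seed and the 10° bin width are assumptions chosen for illustration.

```python
import numpy as np

def wrap_angles(angles):
    """Return the angles plus one shifted copy of each, so the data span [-360, 360).

    Hypothetical helper: negative angles are repeated at +360,
    non-negative angles are repeated at -360.
    """
    angles = np.asarray(angles, dtype=float)
    return np.concatenate([angles,
                           angles[angles < 0] + 360,
                           angles[angles >= 0] - 360])

rng = np.random.default_rng(0)
# synthetic angles concentrated near the +/-180 degree edge, mapped into [-180, 180)
angles = (np.degrees(rng.vonmises(np.pi, 2.0, size=5000)) + 180) % 360 - 180

wrapped = wrap_angles(angles)

# density histograms with the same 10-degree bin width on both ranges
dens_orig, _ = np.histogram(angles, bins=36, range=(-180, 180), density=True)
dens_wrap, _ = np.histogram(wrapped, bins=72, range=(-360, 360), density=True)

# bins 18..53 of the wrapped histogram cover [-180, 180)
central = dens_wrap[18:54]

# every angle was copied exactly once, so the central density is half the original
print(len(wrapped) == 2 * len(angles))
print(np.allclose(2 * central, dens_orig))
```

Since the central part of the wrapped histogram is exactly half the original density, doubling the y-limits of the visible axis (as done with the twin axis above) restores the correct scale.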