I have a Postgres SQL table that looks like the following:
The index_constituents table:
id date opening_closing ISIN
1 2016-05-03 O AAAAAA
2 2018-04-03 C ABDFGJ
5 2020-01-02 C HIFHSA
10 2024-01-01 C FOHOAS
111 2022-05-05 O JIOFHS
15 2016-05-03 C AAAAAA
16 2018-04-03 O ABDFGJ
20 2020-01-02 C HIFHSA
....
The opening_closing field will only have vals of O and C.
The real table has over 13k rows and something like 2k dates in total.
What I am trying to do is check for each date. If the ISIN appears in both the O and C rows. If it appears in one but not the other, then I would like to return the ISIN.
So In the example above, ISIN=AAAAA appears in id 1 and 15, where row 1 has O and row 15 has C, so that ISIN is okay and I DO NOT want it to be returned.
Whereas ISIN=FOHOAS only appears in a C row (id=10) but does not appear in a row with O, so I would like it to be returned in the final result.
The final result would just be all the ISINs that appear in either a C row or O row but not both.
For the above example, I would expect FOHOAS and JIOFHS to be returned only.
You could group by the date and the isin, and then count the distinct number opening_closing values it has:
SELECT date, isin
FROM index_constituents
GROUP BY date, isin
HAVING COUNT(DISTINCT opening_closing) < 2