I'm of the opinion that a "user perceived character" (henceforth UPC) iterator would be very useful in a Unicode library. By UPC I mean the sense discussed in Unicode Standard Annex #29: what a user perceives as a single character, which might be represented in Unicode as one codepoint or as a multi-codepoint grapheme cluster. Since I typically work with Latin-script languages, I always come up with examples like "I want to handle ü as one UPC, regardless of whether that UPC is encoded as a grapheme cluster or as a single codepoint".
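To pin down what I mean, here is a minimal sketch using the Rust unicode-segmentation crate (the choice of library is just for illustration, it isn't the point). Both encodings of ü come out as a single grapheme cluster, while codepoint iteration sees them differently:

```rust
use unicode_segmentation::UnicodeSegmentation;

fn main() {
    let precomposed = "\u{00FC}"; // "ü" as the single codepoint U+00FC
    let decomposed = "u\u{0308}"; // "ü" as 'u' + U+0308 COMBINING DIAERESIS

    // Codepoint iteration reports two different lengths...
    assert_eq!(precomposed.chars().count(), 1);
    assert_eq!(decomposed.chars().count(), 2);

    // ...but both are exactly one extended grapheme cluster (one UPC).
    assert_eq!(precomposed.graphemes(true).count(), 1);
    assert_eq!(decomposed.graphemes(true).count(), 1);
}
```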
Colleagues who are against a UPC iterator (or grapheme cluster iterator, take your pick) counter with "you can normalize to NFC and then use codepoint iteration" and "there is no use case for grapheme cluster iteration".
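Their workaround does hold for ü: NFC recomposes the pair into the single codepoint, after which codepoint iteration gives the count I want. A quick sketch with the unicode-normalization crate (again, the particular crate is only an assumption for illustration):

```rust
use unicode_normalization::UnicodeNormalization;

fn main() {
    let decomposed = "u\u{0308}";                 // 'u' + COMBINING DIAERESIS
    let nfc: String = decomposed.nfc().collect(); // recompose under NFC

    // NFC collapses this particular pair into the single codepoint U+00FC,
    // so plain codepoint iteration now gives the "right" count.
    assert_eq!(nfc, "\u{00FC}");
    assert_eq!(nfc.chars().count(), 1);
}
```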
I keep thinking of Latin-centric use cases, which maybe don't translate well to the wider Unicode universe. For example: I'm writing terminal output and want to pad a column to N columns, so I need to know how many UPCs are in a string...
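A sketch of that use case (same illustrative crate as above): pad by counting grapheme clusters rather than codepoints. The pad_to helper is hypothetical, and real column arithmetic would also need East Asian Width data, which I'm ignoring here:

```rust
use unicode_segmentation::UnicodeSegmentation;

// Hypothetical helper: pad `s` on the right to `width` "characters",
// counting UPCs (grapheme clusters) rather than codepoints. Real
// terminal column math would also consult East Asian Width data.
fn pad_to(s: &str, width: usize) -> String {
    let len = s.graphemes(true).count();
    let mut out = s.to_string();
    out.extend(std::iter::repeat(' ').take(width.saturating_sub(len)));
    out
}

fn main() {
    let s = "mu\u{0308}"; // "mü" in decomposed form: 3 codepoints, 2 UPCs
    assert_eq!(s.chars().count(), 3);
    assert_eq!(pad_to(s, 5).graphemes(true).count(), 5);
}
```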
I think what I want to know is:

1. Are there grapheme clusters that cannot be reduced to a single codepoint, even after NFC normalization?
2. Are there real use cases for grapheme cluster (UPC) iteration?
It's unclear how your questions are not answered by UAX #29:
There are many such grapheme clusters, even for languages that use only the Latin alphabet, since not all combining marks have precomposed compositions with all other letters/forms (see, for example, the gaps in this table on Wikipedia). Table 1a in UAX #29 has several non-Latin examples.
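For example, q + COMBINING DIAERESIS has no precomposed codepoint, so NFC cannot collapse it, yet it is still one grapheme cluster. A sketch, assuming the Rust unicode-normalization and unicode-segmentation crates purely for illustration:

```rust
use unicode_normalization::UnicodeNormalization;
use unicode_segmentation::UnicodeSegmentation;

fn main() {
    // 'q' + U+0308 COMBINING DIAERESIS has no precomposed codepoint,
    // so NFC leaves it alone...
    let s: String = "q\u{0308}".nfc().collect();
    assert_eq!(s.chars().count(), 2); // still two codepoints after NFC

    // ...but it is still a single grapheme cluster (one UPC).
    assert_eq!(s.graphemes(true).count(), 1);
}
```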
This is the purpose of UAX #29: to generalise grapheme cluster operations to all languages supported by Unicode.
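For instance, two of the samples from Table 1a, using the same illustrative crate as above; the same graphemes(true) call segments them correctly with no script-specific logic:

```rust
use unicode_segmentation::UnicodeSegmentation;

fn main() {
    // Hangul syllable "han" spelled as three conjoining jamo
    // (U+1112 U+1161 U+11AB): three codepoints, one grapheme cluster.
    let han = "\u{1112}\u{1161}\u{11AB}";
    assert_eq!(han.chars().count(), 3);
    assert_eq!(han.graphemes(true).count(), 1);

    // Tamil "ni" (U+0BA8 + vowel sign U+0BBF): two codepoints, one
    // cluster, and there is no precomposed form for NFC to produce.
    let ni = "\u{0BA8}\u{0BBF}";
    assert_eq!(ni.graphemes(true).count(), 1);
}
```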