On Chrome for Linux, code such as the following
speak('<?xml version="1.0"?><speak>Intro <break time="200ms"/>the rest.</speak>');
makes the TTS engine read the XML markup aloud. On Android browsers the engine understands the markup and inserts a pause.
I don't want to browser-sniff, but I can't see what test I could use to take advantage of SSML where it is understood while serving something plainer where it isn't.
I don't know about Chrome on Android; perhaps it's using Google's online TTS. As far as I know, the only browser that officially supports SSML is the old (pre-Blink) Edge, not the new Blink-based one. See: https://github.com/WICG/speech-api/issues/37#issuecomment-416923362
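Since I'm not aware of any reliable feature test for SSML, one workaround is to send plain text by default and only pass SSML through when you have some other reason to trust the engine (a user setting, for instance). Here is a minimal sketch of that idea. The names `ssmlToPlainText` and `speakPortable` and the `ssmlSupported` flag are made up for illustration and are not part of any API; the tag stripping is regex-based, so it is only a rough approximation and does not decode XML entities.

```javascript
// Hypothetical helper: reduce an SSML string to plain text so that engines
// which do not understand SSML do not read the markup aloud.
function ssmlToPlainText(ssml) {
  return ssml
    .replace(/<\?xml[^>]*\?>/g, "")        // drop the XML declaration
    .replace(/<break\b[^>]*>/g, ", ")      // approximate a pause with a comma
    .replace(/<[^>]+>/g, "")               // drop all remaining tags
    .replace(/\s+,/g, ",")                 // tidy whitespace before commas
    .replace(/\s+/g, " ")                  // collapse runs of whitespace
    .trim();
}

// Hypothetical wrapper: `ssmlSupported` is an assumption supplied by the
// caller (e.g. a user preference), not something detected from the browser.
function speakPortable(ssml, ssmlSupported = false) {
  const text = ssmlSupported ? ssml : ssmlToPlainText(ssml);
  // Only call the Web Speech API where it actually exists.
  if (typeof speechSynthesis !== "undefined" &&
      typeof SpeechSynthesisUtterance !== "undefined") {
    speechSynthesis.speak(new SpeechSynthesisUtterance(text));
  }
  return text; // returned so the caller can log or inspect what was spoken
}
```

With the string from the question, `speakPortable(...)` would speak "Intro, the rest." on every browser, and `speakPortable(..., true)` would pass the markup through unchanged.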