More commonly, bulk micromachined MEMS devices are monocrystalline.
> and is very fragile at that
Normalizing for density, monocrystalline silicon is stronger than steel!
All the MEMS that are actually used in practice avoid this: they are effectively entirely flexural structures. Nothing rubs against anything else; the parts just flex in place. This kind of structure can last almost indefinitely (especially since they're usually made of a single crystal, which makes the usual creep/cracking wear mechanisms mostly a non-issue).
Fundamentally, an oscillator/resonator is the sensing of nothing.
You want the output frequency to be completely independent of acceleration, strain, rotation, pressure, magnetic/electric field, etc. That's really hard to do and involves a combination of building very robust silicon packaging, minimizing (and making symmetric) all contacts to the outside to shield it, and compensating for every possible effect you can measure.
If xMEMS is that good, it should be easy to verify in frequency response and harmonic distortion plots.
I think there are graphs out there showing some pretty good distortion measurements for xMEMS tweeters (I googled [0]), but for what it's worth I'm not too big on FR and THD curves these days, as I've not found them to be a great predictor of actual subjective listening pleasure. For example, some of my favourite audio gear, like my Letshuoer EJ07 earphones or Audio.gd NFB-28.38 DAC, looks like absolute trash on paper in terms of FR/THD; yet, for whatever reason, they're the ones I _enjoy_ listening to the most and can get lost in for hours.
I feel like I'm about to go off on one and risk sounding like one of those gold-plated HDMI cable people, and this next bit is going to be tenuous and probably not entirely correct, but I'm gonna write it down anyway. I also feel like I need to point out that the most expensive cable I own cost like 30 quid, and that I'm far from a neuroscientist and would love to be corrected on any of the assumptions that follow. With that said -
I have come to believe that there's more to listening experience and auditory immersion than the measurements that are typically published and that most people look for on paper. This is why one of the characteristics of MEMS speakers that I find most interesting, and that isn't typically measured, is the phase accuracy and coherence that they're capable of across their entire frequency response, probably due to their smaller size, lighter moving mass, flat driver surface & more consistent silicon-based manufacturing process.
I've always found the way our brains process audio signals fascinating. Think about binaural beats, and the fact that we're able to hear them _at all_. If you were to take a 'binaural beat' recording and listen to just the left channel, you'd hear a drone; listen to the right on its own and you get the same thing at a slightly different pitch. But put each channel to each ear and you'll also hear this lower frequency oscillation in the middle of your head. Sure, you'd expect this from a mixing desk or a pair of speakers a few feet away, but these are earphones - there's no crosstalk, the waves never interact, they don't have a chance to interfere. There's no actual physics involved. Instead, somewhere in there, your brain is summing the two _perceived signals_, and what you're hearing is the interference pattern of the waves interacting. _In your brain_. The fact that we don't have mixing desks in our heads - our ears aren't microphones, there are no line level signals being summed - yet we can still somehow hear that interference, has always blown my mind a bit.
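For anyone who wants to poke at this, here's a rough Python sketch of the "mixing desk" case only: two drones summed physically, which swell and fade at the difference frequency. The 200 Hz and 210 Hz tones, the sample rate and the helper names are all made up for illustration, and this says nothing about how the brain gets to the same percept over earphones.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary, for illustration)

def tone(freq_hz, duration_s=0.2):
    """Return a list of sine samples at freq_hz."""
    n = int(duration_s * SAMPLE_RATE)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

def beat_frequency(left_hz, right_hz):
    """Rate at which the physically summed signal swells and fades."""
    return abs(left_hz - right_hz)

def peak(samples, start, stop):
    """Largest absolute sample value in samples[start:stop]."""
    return max(abs(s) for s in samples[start:stop])

left = tone(200.0)   # hypothetical left-channel drone
right = tone(210.0)  # hypothetical right-channel drone, slightly higher

# What a mixing desk would do: add the two signals sample by sample.
mixed = [l + r for l, r in zip(left, right)]

print("beat frequency:", beat_frequency(200.0, 210.0), "Hz")
print("level near 0.05 s (fade):", round(peak(mixed, 390, 411), 3))
print("level near 0.10 s (swell):", round(peak(mixed, 780, 821), 3))
```

Each channel on its own has a constant level; only the sum pulses, which is why hearing the pulse with fully isolated channels is the surprising part.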
So whenever I've thought about this stuff, I've always kind of assumed that it's that same part of our brain that's responsible for our auditory spatial localisation, and that the delay/phase shift between the left and right signal that we get from off-centre sound sources must create some kind of comb filter in our heads that our brains use to say 'ah ok, that sound came from over there'. I'd assume this makes less of a difference at higher frequencies, given that the wavelength of a 2 kHz sound is about the distance between our ears or shorter, and obviously there is more to auditory spatial perception than just phase (volume, relative frequency response from our faces being in the way, etc.). Otherwise we'd be able to produce realistic binaural recordings without having to use those fake human heads with mics in their ears.
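A quick back-of-envelope check on the wavelength point, as a sketch. The 0.18 m ear spacing is my assumption (it varies between people), and the delay is the simple straight-line worst case for a sound coming directly from one side, ignoring diffraction around the head.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C
EAR_SPACING = 0.18      # m, assumed value; varies between people

def wavelength_m(freq_hz):
    """Wavelength in air for a given frequency."""
    return SPEED_OF_SOUND / freq_hz

def max_interaural_phase_deg(freq_hz):
    """Phase shift between the ears for a sound arriving from directly
    to one side, using a straight-line path between the ears."""
    max_delay_s = EAR_SPACING / SPEED_OF_SOUND
    return 360.0 * freq_hz * max_delay_s

for f in (200, 500, 1000, 2000, 4000):
    print(f"{f:>5} Hz: wavelength {wavelength_m(f):.3f} m, "
          f"max interaural phase {max_interaural_phase_deg(f):.0f} deg")
```

Under these assumptions the phase shift passes a full 360 degrees a little below 2 kHz, so above that a phase difference on its own no longer points to a single direction.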
Anyway, I think this is why planar magnetic headphones, and to an extent electrostatic speakers (where the driver diaphragm is completely flat, way lighter than conventional cones with magnets attached to them, and hence can move much more uniformly, keeping phase distortion almost non-existent), are capable of projecting a soundstage that our brains can, at the height of listening enjoyment, perceive as wider or more 'multi-dimensional' than what we expect based on our auditory experience of the world around us. My assumption here is that when we overload our brains with an unexpected amount of detail that's too coherent and consistent to be discarded as noise, they start to make things up to help us interpret that detail, and that's where the real listening magic happens. I guess it's a bit like when you're having a really good coffee or wine, and you're convinced that you can taste peach or cinnamon, but you know for a fact that there's no peach or cinnamon in there - it's just your brain communicating its interpretation of detail that it doesn't know how to process.
So the xMEMS earphones I've got are far from my favourite & they were a fraction of the price of some of my other gear. However, in terms of soundstage they're really impressive, especially given that they're just Bluetooth earbuds. Along the same lines, I also think that they have the most unobtrusive ANC of any earphones I've listened to, which makes sense if you think about how ANC works & the part that phase accuracy plays there (especially with the higher frequencies, which is where you tend to notice bad ANC the most).
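To put a rough number on the phase point, here's an idealised sketch: a single tone cancelled by an inverted copy that arrives slightly late. The 20 microsecond timing error and the function name are assumptions picked for illustration, not a measurement of any real product or a description of how any real ANC system is built.

```python
import math

def residual_db(freq_hz, timing_error_s):
    """Leftover level, relative to the original tone, when an inverted
    copy of equal amplitude arrives timing_error_s late."""
    phase_error = 2 * math.pi * freq_hz * timing_error_s
    # |1 - e^(j*phase)| simplifies to 2*|sin(phase/2)| for equal amplitudes.
    residual = abs(2 * math.sin(phase_error / 2))
    if residual == 0:
        return float("-inf")  # perfect cancellation
    return 20 * math.log10(residual)

TIMING_ERROR_S = 20e-6  # assumed fixed delay in the anti-noise path

for f in (100, 500, 1000, 5000):
    print(f"{f:>5} Hz: residual {residual_db(f, TIMING_ERROR_S):6.1f} dB")
```

The same fixed delay is a bigger fraction of a cycle at higher frequencies, so cancellation gets worse as frequency goes up, and a positive result would mean the "cancelling" signal is adding noise rather than removing it.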
I'm really curious to hear what xMEMS tweeters are capable of in a well-tuned wired earphone. I'm pretty sure that it's the Sonion EST electrostatic tweeter in my EJ07 that makes them sound 'special' to me, but that's expensive and heavy (and is annoying me because it's detached itself from the shell so it's just rattling around in there). So if I could get something smaller/cheaper in a wired set I'd be very interested.
Anyway, TL;DR - I don't think perceived soundstage has anything to do with frequency response graphs, and probably little to do with THD measurements. Although phase accuracy would affect those measurements on paper, I don't think it can be inferred from them unless specifically measured. I'm not even sure if anything I wrote here makes sense, but if you've read this far, thanks for coming to my TED talk.
Where can I read more about that?
You do not want to be in the quartz crystal business going forward; it's almost as dead as vacuum tubes, even if the manufacturers don't know it yet. Nothing will be left to fight over but the very cheapest commodity parts.