In case you didn't know already, I'll say it here again: I'm not very keen on Wayland and the mentality it was created in. If you had to summarize that mentality in one sentence, it would be this:
"Graphic clients take full control of the rendering process, the graphics system shall be only a thin layer that passes control of the output device over to the clients."
In Wayland terms, Wayland manages shared memory (framebuffers) and defines a protocol to pass those regions of graphics memory to clients, which then use their method of choice to draw on them. Sounds good, right?
WRONG!
High-quality graphics rendering is a notoriously difficult subject. To make matters worse, many parameters directly depend on the output device. Just to name a few:
The high-quality printer business has known this for years. All the better printers expect their input data as either PostScript or PDF. Why? Because those are not ready-rendered raster images (which is what a printer actually puts on the medium), but abstract descriptions of the final result. Similarly, the original TeX outputs a format called DVI (DeVice Independent). A raster image processor (RIP) then turns such a description into a raster image, carefully tailored to the characteristics of the device. There are businesses specialized in the development of device-specific RIPs.
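To make the idea concrete, here is a minimal sketch of what "abstract description first, rasterize per device later" means. It is not how any real RIP works; the names `Rect`, `Device` and `rasterize` are made up for illustration, and the only device characteristic considered is resolution.

```python
from dataclasses import dataclass

MM_PER_INCH = 25.4


@dataclass(frozen=True)
class Rect:
    """A filled rectangle described in millimetres, not in pixels."""
    x_mm: float
    y_mm: float
    w_mm: float
    h_mm: float


@dataclass(frozen=True)
class Device:
    """Hypothetical output device, reduced to a name and a resolution."""
    name: str
    dpi: float


def rasterize(rect: Rect, device: Device) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) in pixels of one specific device."""
    scale = device.dpi / MM_PER_INCH  # pixels per millimetre
    return (
        round(rect.x_mm * scale),
        round(rect.y_mm * scale),
        max(1, round(rect.w_mm * scale)),  # never let a shape vanish
        max(1, round(rect.h_mm * scale)),
    )


# The same abstract 0.2 mm hairline, sent to two very different devices.
page_rule = Rect(x_mm=25.4, y_mm=50.8, w_mm=101.6, h_mm=0.2)
screen = Device("screen", dpi=96)
printer = Device("printer", dpi=1200)

on_screen = rasterize(page_rule, screen)
on_paper = rasterize(page_rule, printer)
```

The client only ever says "a 0.2 mm rule"; it comes out 1 pixel tall at 96 dpi and 9 pixels tall at 1200 dpi. A client that had rendered the pixels itself would have had to know the device up front.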
So let's say we get all those issues solved in the Wayland infrastructure. Then you're still stuck with the Wayland compositor being responsible for window management and input event retrieval/disposal. You read that right, folks: The Wayland compositor is responsible for reading events from /dev/input/*, processing them (a nasty business BTW, because many input devices out there are really fucked up), and handing them out to the clients.
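To give an idea of the lowest layer of that job, here is a rough sketch of decoding raw records as they come out of a /dev/input/event* node. The record layout is an assumption: it matches `struct input_event` on 64-bit Linux (a `struct timeval` followed by type, code and value), and other ABIs differ. The helper names `decode_events` and `key_presses` are mine, not part of any library.

```python
import struct

# Assumed layout of `struct input_event` on 64-bit Linux:
#   struct timeval time; __u16 type; __u16 code; __s32 value;
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

# Event types as defined in linux/input-event-codes.h
EV_SYN, EV_KEY, EV_REL, EV_ABS = 0x00, 0x01, 0x02, 0x03


def decode_events(raw: bytes):
    """Yield (timestamp, type, code, value) for every complete record."""
    usable = len(raw) - len(raw) % EVENT_SIZE  # drop a trailing partial record
    for sec, usec, ev_type, code, value in struct.iter_unpack(
        EVENT_FORMAT, raw[:usable]
    ):
        yield sec + usec / 1_000_000, ev_type, code, value


def key_presses(raw: bytes):
    """Return the key codes of press events (value 1), ignoring the rest."""
    return [
        code
        for _, ev_type, code, value in decode_events(raw)
        if ev_type == EV_KEY and value == 1
    ]


# Synthetic data instead of a real device: press and release of KEY_A (30),
# each followed by the synchronization record the kernel would send.
sample = b"".join(
    struct.pack(EVENT_FORMAT, *fields)
    for fields in [
        (10, 500000, EV_KEY, 30, 1),
        (10, 500000, EV_SYN, 0, 0),
        (10, 600000, EV_KEY, 30, 0),
        (10, 600000, EV_SYN, 0, 0),
    ]
)
pressed = key_presses(sample)
```

On a real system the bytes would come from something like `open("/dev/input/event0", "rb")`, which normally requires elevated permissions. And this is only the trivial part: device quirks, keymaps, key repeat and pointer acceleration all come on top of it.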
Then it defines the whole window management behavior. Yes! With Wayland you can't simply switch your window manager, leaving the rest of the system untouched. If you want another window manager, you need to implement a full Wayland compositor, taking care of all the intricacies involved. Even worse, if the Linux developers decide to create a new input device interface, say to address some upcoming issues, all Wayland compositors will need to be updated. There's a nice German term for this: "Arbeitsbeschaffungsmaßnahme" (a make-work scheme).
And after all of this, it's responsible for compositing the windows to the screen(s), which is where all the "eye candy" (I call it distractions) comes in.
Wayland severely violates one of the core principles of X11: The separation of method and policy. A Wayland compositor mixes method and policy.
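A toy sketch of what that separation buys you follows. It is not X11: in real X11 the window manager is a separate client process that the server redirects requests to, whereas here the policy is just an in-process object. All class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Geometry:
    x: int
    y: int
    w: int
    h: int


class FloatingPolicy:
    """Policy: give every client exactly the geometry it asked for."""

    def place(self, requested, screen, index, count):
        return requested


class TilingPolicy:
    """Policy: ignore the request and split the screen into equal columns."""

    def place(self, requested, screen, index, count):
        col_w = screen.w // count
        return Geometry(index * col_w, 0, col_w, screen.h)


class DisplayServer:
    """Method only: keeps track of windows, decides nothing about placement."""

    def __init__(self, screen, policy):
        self.screen = screen
        self.policy = policy
        self.requests = {}  # window id -> geometry the client asked for

    def map_window(self, win_id, requested):
        self.requests[win_id] = requested

    def layout(self):
        """Ask whatever policy is currently attached where each window goes."""
        count = len(self.requests)
        return {
            win_id: self.policy.place(req, self.screen, i, count)
            for i, (win_id, req) in enumerate(self.requests.items())
        }


server = DisplayServer(Geometry(0, 0, 1920, 1080), FloatingPolicy())
server.map_window("term", Geometry(10, 10, 640, 480))
server.map_window("editor", Geometry(100, 100, 800, 600))
floating = server.layout()

# Swap the policy; the server and its windows stay untouched.
server.policy = TilingPolicy()
tiled = server.layout()
```

Changing from floating to tiling is a one-line swap here because `DisplayServer` never contained any placement logic to begin with. Once method and policy live in the same code, that swap means rewriting the server.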
I'll tell you where this will end: in a plugin/module system. A core/mainline Wayland server (managing buffers of square pixel framebuffer memory regions), to which modules are attached that deal with input processing, window management and composition effects. For stability reasons those will run in separate processes, communicating with Wayland through some IPC mechanism (and if Murphy applies, this will probably be D-Bus). Then, to tackle all those problems with device-dependent rendering, an abstract rendering protocol/library will be introduced. Congratulations! You've just reinvented X11! The whole complexity of X11 is not some anachronistic burden, it's a necessity. I fully understand that X11, as we have it now, doesn't implement all the things we'd require for true device independence as I outlined above, mostly because X11 itself is pixel based.
On the Wayland homepage you can find this picture:
X architecture according to Wayland developers.
However, this picture is terribly simplified.
First, it should be said that the Compositor, just like the Window Manager (they don't need to be the same!), is just a regular X client itself. Designating the Compositor as something "special" is just wrong.
Second, it completely omits the fact that the X server is not monolithic and does not access all the different kernel interfaces from the same core code. There are in fact several modules, often in multiple instances, each responsible for one specific interface or device.
This is a much more accurate picture of what the X architecture looks like:
A more accurate picture of the X architecture.
If you want to replace X11 with something better (not worse), be my guest, you're preaching to the choir.
My dream graphics system would be completely abstract. Creating a window wouldn't involve selecting visual formats or framebuffer configurations. It would be just "a window". Only when actual content is involved would I want to tell the rendering subsystem which color space I use. Ideally all applications would work in a connection color space (e.g. CIE XYZ or Lab), but sending images in some arbitrary color space, together with color profile information, should be possible too. Fonts/glyphs would be rendered by some layer close to the hardware, to carefully adjust the rasterizing to the output device's properties. And last but not least, the whole system should be distributed. Being able to "push" some window from one machine's display to another machine's (with this action triggering a process migration) would be the pinnacle. Imagine you begin writing an email on your smartphone, but you realize you'd prefer using a "usable" keyboard. Instead of saving a draft, closing the mail editor on the phone, transferring the draft to the PC, opening it and editing it there, imagine you'd simply hold your smartphone beside your PC's monitor, an NFC (near field communication) system in phone and monitor detects their relative position, and you flick the email editor over to the PC, allowing you to continue your edit there. Now imagine that this happens absolutely transparently to the programs involved, that this is something managed by the operating system.
This is where I want to go. Not those cheap-effects lollipop desktops that Ubuntu/Canonical, Intel, RedHat/Fedora and Gnome(3) aim for.