2011-09-30

A Case against Wayland

If you didn't know already, I'm telling it here again: I'm not very keen on Wayland and the mentality it was created in. If I had to summarize that mentality in one sentence, it would be this:

"Graphic clients take full control of the rendering process, the graphics system shall be only a thin layer that passes control of the output device over to the clients."

In Wayland terms, Wayland manages shared memory (framebuffers) and defines a protocol to pass those regions of graphics memory to clients which then use their method of choice to draw on it. Sounds good, right?

WRONG!

High quality graphics rendering is a notoriously difficult subject. To make matters worse many parameters directly depend on the output device. Just to name a few:

  • subpixel arrangement

    Just do a Google image search for "subpixel arrangement" to get an idea of what's in the wild already and what will hit us in the near future. The 5 subpixel layouts offered by FreeType ({horizontal,vertical}{RGB,BGR}, None) do not nearly account for what's out there. To make matters worse, a system may be confronted with different subpixel arrangements on the same workspace (think multihead configuration). If rendering (of subpixel dithering/antialiasing) happens client side, you'll get serious problems if you have connected, say, an AMOLED screen and a video projector (no subpixels at all) in clone mode: you'll get weird colour seam artifacts on the projector if you render for the AMOLED, and blurry fonts on the AMOLED if you render for the projector. Oh, and if you add a pivot function (display rotation), you'll have to switch arrangements in situ. So far, on Linux you have to restart applications after such a switch for it to take effect.
  • color management

    Same problem as with subpixels, but in another subspace. If you get direct access to the framebuffer, you write exactly the values that are sent to the device, unless your framebuffer is in a so-called "connection color space" and a color transformation happens before it is sent to the device. Admittedly, one could do color management in the compositor.
  • physical resolution

    Pixels are not necessarily square, and in the future perhaps not even rectangular. I hope that one day we'll work with displays using the tightest circle packing arrangement for pixels (honeycombs), with >200 pixels per cm; you would not need antialiasing on such high-resolution displays, but you would still need subpixel dithering. Now imagine a honeycomb pixel display sharing the workspace with a square pixel video projector.
  • rendering performance

    Most of the time the suggested rendering backends for applications are OpenGL and/or OpenVG. Unfortunately OpenGL is not the best choice when it comes to font rendering, yet (font) glyphs are the major information carrier on your typical computer screen. There are some attempts at high quality font rendering with OpenGL (Google "Vector Texture Maps" or "Valve Distance Maps"), but they're not very efficient: a 600kB vector font file blows up to several MB of texture data for just one single glyph size. Technically, modern graphics cards are more than capable of rendering vector fonts (filled curves, that is) directly; the Cairo developers are working on a curve renderer based on OpenGL fragment shaders. But then you have two additional layers of indirection, and it still doesn't solve the problem of device dependence.
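
To make the subpixel point concrete, here is a minimal sketch in Python. It assumes a glyph row that has been rasterized at three times the horizontal resolution and then packed into pixels for a given layout. The function name `pack_row` and the layout strings are made up for illustration; this is not FreeType's API, and a real renderer would also filter the samples to reduce colour fringes.

```python
# Hypothetical sketch: pack a 3x horizontally oversampled coverage row
# into pixels for different subpixel layouts. Not FreeType's actual API.

def pack_row(coverage, layout):
    """coverage: list of 0..255 samples, three per output pixel.
    layout: "rgb", "bgr" or "none". Returns a list of (r, g, b) tuples."""
    if len(coverage) % 3 != 0:
        raise ValueError("need three coverage samples per pixel")
    pixels = []
    for i in range(0, len(coverage), 3):
        left, middle, right = coverage[i:i + 3]
        if layout == "rgb":
            pixels.append((left, middle, right))
        elif layout == "bgr":
            pixels.append((right, middle, left))
        elif layout == "none":
            # No subpixels (e.g. a projector): fall back to plain grayscale.
            gray = (left + middle + right) // 3
            pixels.append((gray, gray, gray))
        else:
            raise ValueError("unknown layout: " + layout)
    return pixels

# A glyph edge that covers only the right third of the first pixel.
edge = [0, 0, 255, 255, 255, 255]
for layout in ("rgb", "bgr", "none"):
    print(layout, pack_row(edge, layout))
```

The same edge yields three different pixel values for the first pixel, so a buffer rendered by the client for one layout is simply wrong data for a display with another layout.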

The high quality printer business has known this for years. All the better printers expect their input data as either PostScript or PDF. Why? Because those are not readily rendered raster images (which is what a printer actually puts on the medium), but abstract descriptions of the final result. Similarly, the original TeX output a format called DVI (DeVice Independent). A raster image processor (RIP) then turns the description into a raster image, carefully tailored to the characteristics of the device. There are businesses specialized in the development of device specific RIPs.
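
The principle can be sketched in a few lines of Python. This is a toy, not a real RIP: the description is a single rectangle given in millimetres, and the two devices with their pixel densities are hypothetical values chosen for illustration.

```python
# Toy "raster image processor": the description is in millimetres,
# the device parameters are hypothetical.
import math

def rasterize_rect(rect_mm, device):
    """rect_mm: (x, y, width, height) in millimetres.
    device: dict with pixels per millimetre along each axis.
    Returns the covered pixel box as (x0, y0, x1, y1), end-exclusive."""
    x, y, w, h = rect_mm
    x0 = math.floor(x * device["px_per_mm_x"])
    y0 = math.floor(y * device["px_per_mm_y"])
    x1 = math.ceil((x + w) * device["px_per_mm_x"])
    y1 = math.ceil((y + h) * device["px_per_mm_y"])
    return (x0, y0, x1, y1)

monitor = {"px_per_mm_x": 4.0, "px_per_mm_y": 4.0}     # square pixels
projector = {"px_per_mm_x": 0.5, "px_per_mm_y": 0.25}  # non-square pixels

square = (10.0, 10.0, 20.0, 20.0)  # a 20 mm square, 10 mm from the origin
print(rasterize_rect(square, monitor))    # (40, 40, 120, 120)
print(rasterize_rect(square, projector))  # (5, 2, 15, 8)
```

The application only ever states "a 20 mm square"; which pixels that means is decided per device, at the last possible moment.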

So let's say we get all those issues solved in the Wayland infrastructure. Then you're still stuck with the Wayland compositor being responsible for window management and input event retrieval/disposal. You read that right, folks: the Wayland compositor is responsible for reading events from /dev/input/*, processing them (a nasty business BTW, because many input devices out there are really fucked up), and handing them out to the clients.
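
For a taste of what reading from /dev/input/* means, here is a minimal Python sketch that decodes raw event records. It assumes the native `struct input_event` layout of a typical 64-bit Linux (a `struct timeval` followed by a 16-bit type, a 16-bit code and a 32-bit value); the helper names are made up, and the device path passed to `read_events` is whatever node your system exposes, which usually requires elevated permissions to open.

```python
import struct

# Assumed native layout of struct input_event on a typical 64-bit Linux:
# struct timeval (two longs), __u16 type, __u16 code, __s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01  # event type for key and button events

def decode_event(raw):
    """Decode one raw event record into a dict."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {"time": sec + usec / 1e6, "type": ev_type,
            "code": code, "value": value}

def read_events(path, count):
    """Read up to `count` events from an event device node."""
    events = []
    with open(path, "rb", buffering=0) as device:
        for _ in range(count):
            raw = device.read(EVENT_SIZE)
            if raw is None or len(raw) < EVENT_SIZE:
                break
            events.append(decode_event(raw))
    return events

# Decode a synthetic key-press record instead of touching real hardware.
sample = struct.pack(EVENT_FORMAT, 1, 500000, EV_KEY, 30, 1)
print(decode_event(sample))
```

This is only the easy part. Everything that makes input handling nasty (device quirks, keymaps, focus, grabs) comes after the decoding.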

Then it defines the whole window management behavior. Yes! With Wayland you can't simply switch your window manager and leave the rest of the system untouched. If you want another window manager, you need to implement a full Wayland compositor and take care of all the intricacies involved. Even worse, if the Linux developers decided to create a new input device interface, say to address some upcoming issues, all Wayland compositors would need to be updated. There's a nice German term for this: "Arbeitsbeschaffungsmaßnahme" (a job-creation scheme).

And after all of this it's responsible for compositing the windows to the screen(s), which means all the "eye candy" (I call it distractions) applies.

Wayland severely violates one of the core principles of X11: the separation of mechanism and policy. A Wayland compositor mixes the two.
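
The separation of method (mechanism) and policy can be illustrated with a toy model in Python. This is not the X11 API; the `Server` class and both policy functions are hypothetical. The server only offers the mechanism of placing a window at a rectangle, and the decision where windows go lives in a replaceable client.

```python
# Toy model, not the X11 API: the server offers only the mechanism
# (place a window at a rectangle); the policy lives in a replaceable client.

class Server:
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.geometry = {}  # window id -> (x, y, w, h)

    def configure(self, window, x, y, w, h):
        self.geometry[window] = (x, y, w, h)

def tiling_policy(server, windows):
    """Split the screen into equal vertical columns."""
    if not windows:
        return
    column = server.width // len(windows)
    for index, window in enumerate(windows):
        server.configure(window, index * column, 0, column, server.height)

def fullscreen_policy(server, windows):
    """Give every window the whole screen."""
    for window in windows:
        server.configure(window, 0, 0, server.width, server.height)

server = Server(1200, 800)
for policy in (tiling_policy, fullscreen_policy):
    policy(server, ["editor", "terminal"])
    print(policy.__name__, server.geometry)
```

Swapping the policy function changes the behavior without touching `Server`, which is the property that is lost when window management is baked into the compositor.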

I'll tell you where this will end: in a plugin/module system. A core/mainline Wayland server (managing buffers of square pixel framebuffer memory regions), to which modules are attached that deal with input processing, window management and composition effects. For stability reasons those will run in separate processes, communicating with Wayland through some IPC mechanism (and if Murphy applies this will probably be D-Bus). Then, to tackle all those problems with device dependent rendering, an abstract rendering protocol/library will be introduced. Congratulations! You've just reinvented X11! The whole complexity of X11 is not some anachronistic burden, it's a necessity. I fully understand that X11, as we have it now, doesn't implement all the things we'd require for true device independence as I outlined above, mostly because X11 itself is pixel based.

On the Wayland homepage you can find this picture:

[Figure: X architecture according to Wayland developers.]

However, this picture is terribly simplified.
First, it should be said that the Compositor, just like the Window Manager (they don't need to be the same!), is just a regular X client itself. Designating the Compositor as something "special" is just wrong.
Second, it completely omits the fact that the X server is not monolithic and does not access all the different kernel interfaces from the same core code. There are in fact several modules, often in multiple instances, each responsible for one specific interface or device.

This is a much more accurate picture of what the X architecture looks like:

[Figure: A more accurate picture of the X architecture.]

If you want to replace X11 with something better (not worse), be my guest, you're preaching to the choir.

My dream graphics system would be completely abstract. Creating a window wouldn't involve selecting visual formats or framebuffer configurations. It would be just "a window". Only when actual content is involved do I want to tell the rendering subsystem which color space I use. Ideally all applications would work in a connection color space (e.g. CIE XYZ or Lab), but sending images in some arbitrary color space, together with color profile information, would be possible too. Fonts/glyphs would be rendered by some layer close to the hardware, which carefully adjusts the rasterizing to the output device's properties. And last but not least, the whole system should be distributed. Being able to "push" a window from one machine's display to another machine's (with this action triggering a process migration) would be the pinnacle. Imagine you begin writing an email on your smartphone, but you realize you'd prefer using a "usable" keyboard. Instead of saving a draft, closing the mail editor on the phone, transferring the draft to the PC, opening it and editing it there, imagine you'd simply hold your smartphone beside your PC's monitor; an NFC (near field communication) system in phone and monitor detects their relative position, and you flick the email editor over to the PC and continue your edit there. Now imagine that this happens absolutely transparently to the programs involved, that this is something managed by the operating system.
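
The colour part of that idea can be sketched as follows: applications hand over device-independent CIE XYZ values, and a layer close to the hardware converts them for each output. The matrix and transfer function below are the commonly published XYZ (D65) to sRGB conversion; the "profile" dictionaries and the second, linear device are hypothetical stand-ins for real colour profiles.

```python
# Sketch: applications hand over CIE XYZ; a layer close to the hardware
# converts to each output's own encoding. The profile dicts are hypothetical.

XYZ_TO_LINEAR_SRGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)

def srgb_encode(c):
    """sRGB transfer function for one linear component in 0..1."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1 / 2.4) - 0.055

def xyz_to_device(xyz, profile):
    """Convert one XYZ colour to the device values of the given profile."""
    out = []
    for row in profile["matrix"]:
        linear = sum(m * v for m, v in zip(row, xyz))
        linear = min(1.0, max(0.0, linear))  # clip out-of-gamut values
        out.append(profile["encode"](linear))
    return tuple(out)

SRGB_MONITOR = {"matrix": XYZ_TO_LINEAR_SRGB, "encode": srgb_encode}
LINEAR_DEVICE = {"matrix": XYZ_TO_LINEAR_SRGB, "encode": lambda c: c}

gray = (0.2, 0.2, 0.2)  # the same abstract colour for both outputs
print(xyz_to_device(gray, SRGB_MONITOR))
print(xyz_to_device(gray, LINEAR_DEVICE))
```

The application states the colour once; the two outputs receive different numbers for it, and neither the application nor its window needs to know which display it ends up on.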

This is where I want to go. Not those cheap-effects lollipop desktops that Ubuntu/Canonical, Intel, RedHat/Fedora and Gnome(3) aim for.


PGP Key: 479F 96E1 2B49 8B0D 69F1 94EE F11B E194 2E6C 2B5E (the last 32 bits of the fingerprint are the ID).

Impressum

This is a noncommercial private web presence / weblog (blog) of

Wolfgang Draxinger

Please obtain the technical and administrative contact information from the WHOIS data of this domain.