Sun, 01 May 2005

It's rough, it's dirty, it's still pretty cool. Diana and I originally put together a mockup for the RH blog aggregator, to replace the existing monstrosity:

It sounds like the RH blog aggregator will be moving to Planet soon anyway, so I thought I'd get an early start on the template/CSS munging. And since I was already doing it for Planet, I figured, why not take a whirl at the Planet I know and love?

Here's Planet Gnome with the new template/css. Obviously I'm not running the planet.py updater periodically, so it's fixed where I left it last night. Lots of little cleanups left to do, like using day names instead of dates, hardcoding the little image sizes to improve render speed, making it work at narrower sizes, making titles link, etc. There are also a couple of visual details off relative to the mockup (some of the spacing is wrong, and the blog entry titles should be darker, which looks better and is easier to read). I chose the colors 10 seconds ago, so they suck :-)

This design isn't just purty, it's designed to improve reading too.

  • The primary contextual information used for orienting the content of a blog entry is "who". I think most regular readers are picking up the "who" from photos, so those are visually distinct off to the left.
  • The name and face are also grouped closely together, which should help people build the association.
  • Your eye can skim down the un-noisy left hand side (also note that we break days using a color band with the day name, so the lightly coloured bar at the left is basically the orientation/skimming bar) to find entries.
  • Once you've located the start of an entry (this isn't just for "searching" through the page, it's a frequent orientation procedure while you read), your eye shifts over to the text in a familiar left-to-right reading direction (compare with the existing layout where people's eyes tend to try and scan on the right which is mixed with noisy text, then once they are oriented/context loaded as to the person, have to scan left and find the start of the text line to read... the little bits of extra work add up :-).
  • Most of the white on the page is inside the actual blog entry content boxes (in the word balloons). Restricting white like this draws your eye into the boxes. This reduces some of the visual overload problems in the existing pgo (it's even worse in the RH blog aggregator right now). In other words, the layout draws your eye into the text, which is what the blog is really about, and also keeps the amount of text from seeming overwhelming (which is what "wandering eye with no strong visual reference" tends to do to people). The existing pgo has a strong wandering eye effect, which seriously discourages people from actually reading, whereas if the text seems more manageable people are more likely to dive in.
  • I think the titles of blog entries are usually useless, so I almost dropped them altogether, but didn't, as you can tell.
  • We avoided strong visual lines and dividers to make it easier to pleasantly "read through" the whole page. Lines get in the way of your eye, so they should only be used when you actually want to disrupt or control the eye's flow.
  • As far as high level design goals, I think the "word balloon" increases the feeling of attribution. It's suggestive that there's a real person saying these things. I think it's less of an issue for pgo, but for Red Hat that both improves the humanness of the blogs (the main reason companies are starting to have them, I think), and it makes it clear that the statements are individual opinions. It's subtle, but I think it has impact on how people interpret the information.
  • The width of text is restricted. It's easier to read relatively narrow text columns.

Sat, 26 Mar 2005

Since he caught a glimpse of Kristian's wobbly windows, Bryan has stalked Red Hat's dark and hallowed halls, breathing fire, demanding his chance in the directorial seat. So it is that we bring you Monkey Hoot Productions' first, uh, production. Since a lot of people have asked, these videos show Luminocity running on two different laptops, both with fairly slow/old video cards (Intel i830 and ATI Radeon 7500 mobility) and open source drivers.

Luminocity


Theora | MPEG4
Kristian showing off his spring-modeled "wobbly windows" effect in Luminocity, Owen's crack-tastic OpenGL-based window/compositing manager. This is the only effect that requires GL hardware acceleration in Luminocity (and not even much at that; Kristian's development machine uses an embedded Intel video card). Notice that menus and tooltips are also animated as they pop on and off the screen. The animation effects on window impulses are implementable in a modular manner, allowing anyone to write new effects. Monkey Hoot Productions would like to thank "The Blair Witch Project" for its inspirational camera work and lighting, and apologize to our viewers.


Physics Models for Window Moving


Theora | MPEG4 | MJPEG
The wobbly window effect is mildly addictive. Kristian hasn't gotten much work done since he wrote it. He (and now I) spends all day moving windows around and watching them settle. This video shows off the motion a little better. It also demonstrates Luminocity's live workspace switcher (aka pager) which updates in sync with the screen. We were surprised by how much more tangible windows felt when they gave a little (i.e. less than in this video) as you moved them (like a real world object). Of course, we turned the effect on "high" for this demo so it'd be very visible.
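To give a feel for what "spring-modeled" means, here is a minimal sketch in Python. It is not Luminocity's code: it assumes a single point tied to its target position by a damped spring, and the stiffness, damping and frame-rate values are made up for illustration. A real effect would presumably run something like this over a grid of points per window.

```python
def settle(position, target, stiffness=120.0, damping=12.0, dt=1.0 / 60, steps=600):
    """Simulate one point pulled toward `target` by a damped spring.

    All constants are illustrative assumptions, not values from Luminocity.
    Returns the list of positions, one per simulated frame.
    """
    velocity = 0.0
    trail = []
    for _ in range(steps):
        # Hooke's law pulls toward the target; damping bleeds off energy.
        acceleration = stiffness * (target - position) - damping * velocity
        # Semi-implicit Euler: update velocity first, then position.
        velocity += acceleration * dt
        position += velocity * dt
        trail.append(position)
    return trail


# "Drop" a point 100 pixels away from where it should rest.
trail = settle(position=0.0, target=100.0)
overshoot = max(trail) - 100.0
```

With these constants the spring is underdamped, so the point shoots past the target, wobbles, and then comes to rest. Raising `damping` makes the motion stiffer, which is roughly the difference between the "high" setting used for the demo and a subtler everyday setting.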


 

Live Updating Workspace Switcher


Theora | MPEG4 | MJPEG
The workspace switcher in Luminocity is updated in sync with the window contents. Also notice that the workspace switcher renders each window rather than just "capturing" what each workspace looks like (this can be seen in the absence of a background in the pager), allowing us to do nice UI tricks in the future. Since it's just re-using the existing window textures, applying them to a new (smaller) surface, the workspace switcher has basically no performance overhead when using hardware accel (other than a few new surfaces for your graphics card to render, no biggie for the card). It's a little hard to see in this video, but Luminocity also has a nifty workspace switching animation. It zooms out as it pans down to the next workspace and then zooms back in. Of course, since it's also the compositing manager, any on screen action doesn't freeze as you switch workspaces. Watch for this as we switch into the 3rd workspace containing an animated circle-o-icons: the icons keep spinning as you switch.


 

Movies Still Play as the Window is Warped


Theora | MJPEG
A GStreamer movie pipeline rendering into Luminocity. Notice that it's warping the movie as it plays without slowdown (and of course, updating the workspace switcher live, which is just re-using the same GL texture rendered onto a smaller surface).


 

OpenGL Accelerated Alpha Compositing


Theora | MPEG4 | MJPEG
 

JPEG
Luminocity uses GL for hardware accelerated alpha compositing. It works well with software GL implementations too. Of course, since Luminocity is a technology testbed, we use it for "unfocused windows" here, probably not a very good long term use ;-). In one of his earlier demos Owen hijacked the mouse scroll wheel to control window transparency. Bad Owen! This video also has another nice demonstration of wobbly menus. They feel really nice, though they'll probably need to be faster in a "real world" version. The screenshot shows fdclock rendering in Luminocity. fdclock, unlike the video, actually uses a 32-bit ARGB visual to specify where (and how much) transparency it wants. No videos because our camera man is tired (you can run it yourself with "fdclock -ts").
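For anyone wondering what alpha compositing actually computes per pixel, here is a small Python sketch of the standard "over" operator. It is only an illustration of the arithmetic, not what Luminocity or the GL driver literally runs, and it assumes premultiplied alpha (colour channels already scaled by the alpha value) with every channel expressed as a float between 0 and 1.

```python
def over(src, dst):
    """Composite `src` over `dst`.

    Both pixels are (alpha, red, green, blue) tuples of floats in 0..1,
    assumed to be premultiplied by their alpha.
    """
    keep = 1.0 - src[0]  # how much of the destination shows through
    return tuple(s + d * keep for s, d in zip(src, dst))


opaque_blue = (1.0, 0.0, 0.0, 1.0)
half_red = (0.5, 0.5, 0.0, 0.0)  # 50% transparent red, premultiplied

blended = over(half_red, opaque_blue)  # (1.0, 0.5, 0.0, 0.5)
```

A window with an ARGB visual supplies its own alpha per pixel, so the compositor can apply the same formula everywhere instead of picking one opacity for the whole window.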


 

Border/Contents Resize Synchronization

No videos yet, alas.
Wicked, naughty, camera man.
And there's only one punishment...
When you resize a window inside Luminocity it doesn't redraw the borders until the application is done redrawing the window contents. This means that they feel like "one piece" instead of the staggered redraw you see on traditional window managers (where the border gets ahead of the contents, and then they catch up). The effect on the perceived "reality" of windows on the screen is excellent, i.e. windows feel more like solid real objects (the same sort of improvement as double-buffering of widgets gave). It also, ironically, makes window resizing feel smoother (though each redraw is slightly slower and not progressive).
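The ordering is easier to see in a toy model. The Python sketch below is purely illustrative: the class and method names are hypothetical and it says nothing about how Luminocity and the application actually signal each other. It only shows the rule "remember the requested size, but change nothing on screen until the application reports its contents are redrawn".

```python
class Frame:
    """Toy model of a window frame; every name here is hypothetical."""

    def __init__(self, width, height):
        self.shown = (width, height)  # size currently on screen (border + contents)
        self.pending = None           # size requested but not yet painted by the app

    def request_resize(self, width, height):
        # The user drags the border: remember the new size, but draw nothing yet.
        self.pending = (width, height)

    def contents_redrawn(self):
        # The application reports its contents are ready at the pending size,
        # so border and contents can reach the screen in the same frame.
        if self.pending is not None:
            self.shown = self.pending
            self.pending = None


frame = Frame(640, 480)
frame.request_resize(800, 600)
size_while_app_is_busy = frame.shown  # still the old size
frame.contents_redrawn()
size_after_redraw = frame.shown       # border and contents switch together
```

A traditional window manager would update the border inside `request_resize`, which is exactly where the "border gets ahead of the contents" stagger comes from.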


 

Cairo

While not as sexy as the Luminocity videos, here (finally) are screenshots of GTK+ themes rendering with Cairo enhancements. Cairo both increases the rendering quality of GTK+ widgets, and allows for widgets that scale beautifully to different sizes (of course, we also have a Cairo driven SVG renderer, knock yourself out). When you get your 600 dpi monitor, we'll be there :-)

Dynamic Themes - each widget unique

Tiger Stripes

Planet Rings

Sketch
In my last X rendering post I discussed dynamic theme rendering, where every time a widget is rendered it looks slightly different. By writing algorithmic renderers rather than fixed pixbuf based widgets, we can increase how dramatic the visual effects are without driving people nuts. For example, the tiger stripe buttons have proved very reasonable for long term use. However, any single rendering of a tiger stripe button would get old very quickly when repeated all over the screen ad nauseam. Currently visual designers are extremely restricted in what they can do without a theme being unusable. That's largely the reason all themes look basically the same. We hope dynamic themes will allow visual designers to increase the variety of their palette without producing themes that wear quickly. Of course, it's still easy to go overboard, *grin*. By providing higher level drawing primitives, Cairo makes it much easier to implement dynamic themes.


 

Resolution Independent Rendering


A large checkbox rendered with Cairo. This would look fantastic as a checkbox "normal size" on a next-gen 600dpi display. :-)


Cairo makes it easy to draw well rendered custom-widgets. Here's an example of how the GTK+ color picker looked before and after Cairo integration.

Getting Luminocity

It took me about half an hour of work (and some compiling time) to get Luminocity running using jhbuild. Eventually we'll add a jhbuild target for compiling Luminocity. Luminocity is not intended to turn into a real world window/compositing manager. Instead, it's a technology test bed. We're trying stuff out in Luminocity and will be rolling the results into Metacity (and hence stock GNOME) as they mature. Don't expect Luminocity to have the frills and smarts you'd expect from a normal window manager. You'll need hardware GL acceleration enabled to have wobbly windows work, though you can try the other bits of Luminocity without it. Embedded Intel video cards (which have open source DRI drivers) will work just fine. ATI and NVidia cards, of course, work even better.

This section has been superseded by the Luminocity wiki page, which has simpler, more up-to-date build instructions.

  1. If you have not used jhbuild, get jhbuild from Gnome CVS module 'jhbuild'. Then run jhbuild bootstrap to compile basic tools such as autoconf and automake (just agree with its defaults).
  2. Run jhbuild build xserver Xcomposite Xdamage Xrender Xext Xcursor X11 Xtst. This will build the freedesktop.org xserver, including the damage and composite extensions, and the Xephyr/Xfake nested X servers.
  3. Apply a small patch that evilly hacks around some issues with DAMAGE in the X server
  4. Checkout module "luminocity" from Gnome CVS.
  5. With the jhbuild buildroot at the start of your PATH (so you get autoconf, automake, etc from the buildroot): from the luminocity directory run ./autogen.sh --prefix=PATH_TO_JHBUILD_TREE, then make and finally make install. Alternatively, see README.jhbuild for instructions on adding a "luminocity" target to jhbuild (eventually we should just include this in jhbuild).
  6. Now to get things running. Luminocity grabs windows from an existing X server and renders them in its own GL context. This technique is not intended to be particularly efficient, but it works surprisingly well for a development testbed. We will use "Xfake" as the X server. Xfake doesn't display windows sent to it, so they only get rendered on screen once (by Luminocity). Xfake is included in the "xserver" module built by jhbuild above. If you are running at 1024x768, run jhbuild run Xfake -ac -screen 1024x3072x32 :1 to start Xfake on display :1. Basically, use XRESOLUTIONx(4*YRESOLUTION)x32. This is because Luminocity starts with 4 workspaces by default.
  7. Now let's display something on the Xfake display. From a new window, set DISPLAY to ":1" (e.g. export DISPLAY=:1). Then run any program you want to use, e.g. gnome-terminal. Of course, you won't see a window since it's displaying to the fake X server.
  8. Start Luminocity with luminocity -f :1 PATH_TO_BACKGROUND_IMAGE. Including a background image is important since a bug in the wobbly windows rendering code causes major performance problems when the background is missing. Luminocity should now be running fullscreen, displaying whatever application you launched earlier (in this example, gnome-terminal). Another bug in wobbly windows increases the animation timeout every time you open a new window. This means that for every window you open, wobbly windows get jerkier and jerkier. Oops! Don't worry, this is a silly bug and not a sign that we're overloading your card or something.
  9. Sometimes windows start with their titlebars off screen. To move them onto the screen, you'll need to drag them while holding down the super key (if you have a Windows key on your keyboard, try this). You may have to remap your super key to make this work, esp. if you have no Windows key, e.g. xmodmap -e 'keycode 95=Super_L', which will then allow you to move windows by dragging them while holding down the F11 key.
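The screen-size rule in step 6 is easy to get wrong by hand, so here is a small Python sketch that builds the Xfake command line for a given resolution. The helper names are made up for this example; the only things taken from the steps above are the command itself (jhbuild run Xfake -ac -screen ... :1), the 32-bit depth, and the default of 4 workspaces.

```python
def xfake_screen(width, height, workspaces=4, depth=32):
    """Return the -screen argument: same width, height multiplied by workspace count."""
    return f"{width}x{height * workspaces}x{depth}"


def xfake_command(width, height, display=":1"):
    """Return the full command from step 6 as an argument list."""
    return ["jhbuild", "run", "Xfake", "-ac", "-screen",
            xfake_screen(width, height), display]


command_line = " ".join(xfake_command(1024, 768))
# command_line == "jhbuild run Xfake -ac -screen 1024x3072x32 :1"
```

The argument list can be handed to `subprocess.run` as-is if you want to script the whole startup, but that part is left out here since it needs the jhbuild tree from the earlier steps.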

If you need help, or you're interested in contributing to Luminocity etc, you can probably find some knowledgeable people on #fedora-desktop on irc.gnome.org (naturally, you don't have to be running Fedora ;-) Eventually we'll probably have a channel for this. People to look out for are: "owen", "ssp" (Soeren Sandmann), "krh" (Kristian Høgsberg) and "seth" (though I'm just a user, *grin*).

Update: I wrote a little more explaining how Luminocity relates to xcompmgr/metacity/Xgl in another blog entry

Update: People have been asking what sort of hardware this was done on. Videos were shot on a mix of an IBM thinkpad X30 (with a paltry Intel i830 video card using open source drivers) and an IBM thinkpad T41 (with a slightly beefier but still pretty old Radeon Mobility 7500, also using open source drivers). Everything we're doing so far is light on hardware requirements. FYI, a locking bug was introduced in Luminocity that causes wobbly windows to get increasingly jerky as more windows are opened (or if there's no background image present, go figure!). This is not related to its CPU or graphics card use, and should be easy to fix without major codebase changes.

Update: If you're having build problems, I've updated the "jhbuild" line to include more luminocity dependencies than just xserver. Also added a note about "jhbuild bootstrap" for building the initial dev environment (auto*, etc).

Update: Build section now superseded by the Luminocity Wiki page.

Just created a wiki page for Luminocity with improved build instructions. Should be a lot easier now, esp. thanks to all the people who have reported problems and found solutions on #fedora-desktop. It's basically "jhbuild build xserver luminocity" at this point, except that a patch has to be applied to xserver first.

Thu, 24 Mar 2005

Relation to Metacity

When it has proved itself, Luminocity's compositing manager will probably be moved into Metacity (along with any effects / extra features we consider good and stable). We originally considered doing the work in Metacity itself, but didn't want to destabilize it until various approaches were tested. Luminocity is, effectively, a testbed for Metacity. It provides a smaller/simpler codebase to test interesting rendering code with, and means we don't have to worry about fucking up Metacity in the process. Soeren's computer is (as of tonight, at least, that's the first I saw of it) running a version of Metacity that's apparently using the compositing manager code from Luminocity to render to a GL context.

Relation to xcompmgr

Luminocity has an internal compositing manager that performs the same function as xcompmgr. The compositing manager / window manager integration allows Luminocity to do things that an individual compositing manager or window manager couldn't. Of course, Luminocity composites using OpenGL, unlike xcompmgr. This apparently can be an upside and a downside, but I don't know anything about it so I'll shut my trap.

Relation to Xgl

This is the complicated one :-). I'm loath to stick my toes in these waters because I'm the wrong person to do it, but I'm also afraid that we're going to end up with two rendering infrastructures down the road and no clarity for application developers as to which (if either) they can use. I don't know if that's where we're headed, I hope not, but I have this vague (probably, hopefully unfounded) fear... The effect will be slow adoption and general suck. I should preface this by saying that I have no direct knowledge of the Xgl codebase. I have knowledgeable sources, and I know what Xgl generally is, but I haven't personally used Xgl, let alone looked at its codebase (I've barely looked at the Luminocity codebase either, for that matter).

Xgl is an X server implementation that, rather than directly accessing chip specific hardware drivers, does its low-level drawing using OpenGL calls. That means Xgl is functionally equivalent to a traditional X server, it just uses a different rendering path. Put another way, Xgl is to X11 as Glitz is to Cairo: it provides the same APIs rendered in a much smarter way.

Luminocity, on the other hand, is a compositing manager / window manager fusion that composites using OpenGL. Compositing and window managing are all about what you do with client-rendered windows. Luminocity doesn't know what's inside windows, and it doesn't care. Xgl, on the other hand, I would characterize as primarily being about how the contents of windows are drawn (in this case: quickly and with less CPU load, *grin*). Xgl can do some other non-inside-window things like drop shadows, but I'm going to argue later those are mostly expedient demos of cool technology and Xgl is probably not the place we want to be doing those things long term. From the perspective that Luminocity is mostly about rendering windows and Xgl is mostly about rendering window contents, they are theoretically complementary. At the moment, they cannot be used in conjunction with one another (since they both want to directly drive the GL hardware), but their goals are at least compatible.

Neither Xgl nor Luminocity is complete on its own. Xgl provides an X server and requires a window manager (and a compositing manager?) (and an X server for doing GL calls into, but see below, that will hopefully cease to be an issue eventually). Luminocity provides a window manager and a compositing manager but requires an X server (currently using Xfake or Xephyr, though supposedly there's some plan for modifying the core fd.o X server so Luminocity will work using only the host X server?). With some hand waving (in particular there's no way to hand OpenGL textures residing in the video card between processes), perhaps we could get Xgl to render windows into textures on the video card, and then use Luminocity to figure out what to do with those textures. All graphics computations are done by the card, and data flows only once to the card. Perfect! Other than those niggly make-or-break technical details ;-)

As far as I know (and I'm pretty sure of this), there is no systematic way (besides GLX inside a running XFree86 / fd.o X server) to access the "hardware accelerated OpenGL drivers". That means that Xgl and Luminocity are currently forced to have a traditional host X server, open a fullscreen window on the host server and draw into it using OpenGL. Both Luminocity and Xgl are premised on OpenGL as the standard API through which vendors can provide graphics hardware acceleration (as opposed to, say, RENDER).

Update: Soeren, one of our X hackers, thinks that Xgl actually includes no cross-window stuff but just uses an existing compositing manager (and of course, accelerates its rendering). In that case, the next couple paragraphs are totally unnecessary. Like I said above, I don't know anything about the Xgl codebase.

In addition to traditional X server features, Xgl performs some cross-window effects (such as drop shadows). This is the main area where Luminocity and Xgl could be seen as overlapping. As I mentioned before, I would argue that the X server (including Xgl) should not be doing these things long term, for a few reasons. I am not sure if David considers this point contentious or not. It could well be that he too considers these effects just a quick way to get some neat effects in play, not the best way long term, I have no idea.

  1. Drawing drop shadows on windows in the X server is equivalent to drawing titlebars on windows in the X server (instead of the window manager). One (dumb) example is that this will mean they are outside the purview of themes (short of having an "X server theme", *wink*). If you believe in the separation of window manager and xserver (fwiw, I think it's valid to believe that wm and xserver should be merged), that's an argument against doing this sort of effect in Xgl.
  2. The X server does not have high-level information available to it, compared with the information made available to the compositing/window managers. For example, using our drop shadow example again, window manager hints will let applications tell the window manager not to shadow something (say, the gnome panel). An X server like Xgl is forced to resort to guessing (of course, sometimes window managers resort to guessing too since WM hints are often vague and implemented differently ;-). To give another example, consider the window border/contents synchronization on resize feature of Luminocity. This relies on WM<->application communication to specify when a redraw has been completed so the WM doesn't draw its borders to the screen until the application is redrawn, and compositing manager support to double buffer the change when it's actually applied, removing the last little bit of flicker. If it's even possible to do this in the X server, it's going to require some serious hackery (with the emphasis on hack), and probably some guessing in addition.
  3. Loosely related to both #1 and #2, putting this stuff in the X server means you have to upgrade your xserver (or add some sort of effects plugin system to the xserver) to get changes to the visuals. It sort of defeats the idea of the X server as the low-level no-nonsense piece.

I would not take something I say here as authoritative! My knowledge of this stuff only scratches the surface. But many people have been saying even less informed things, so I wanted to get slightly more accurate info out there (esp. on online forum comments). Enjoy :-)

Fri, 18 Feb 2005

The three immediate design stakeholders in the 'enterprise desktop' are: end users, help desk staff, and desktop system administrators. Most design work for GNOME has gone into improving the experience of end users, who are really the dominant stakeholder of those three. Some improvements aimed at end-users, like promoting preferences instead of settings you can get wrong, have also made life a little easier for help desk staff (as people are that much less likely to hose things). Recently Mark's work on Vino has added a very large improvement for help desk staff: the ability to remotely view and operate users' desktops (there is nothing more frustrating than blindly stepping people through computer operations over the phone).

So what about sysadmins? Sabayon is GNOME's first major design targeted at improving the user experience for people who administer GNOME systems, and hopefully the start of an initiative toward designing for this important group of users. I'm jazzed about Sabayon as the first step toward a historic goal: GNOME as the definitive desktop management experience for sysadmins. We have a long way to go, but if there's a hundred possible improvements to make over Windows and MacOS/X toward the end-user experience, there's a thousand for admins. But big things start with small steps, right? I see promise for Sabayon as the ground floor of the revolution! <seth takes a deep breath and returns back to earth> In any case, whatever the future holds, this is fertile territory because the status quo is so much worse than it needs to be.

GConf, with its support for mandatory settings and system defaults, was supposed to be a big improvement for system administrators, but it ended up being something of a boondoggle because the features were hard for sysadmins to use. In most cases it actually made things harder as sysadmins struggled through the giant XML files for defaults (most probably tried to edit schemas instead, which isn't even the right thing, but it's not their fault because we didn't publicize this well). Even apart from the XML files being long and verbose, remember that most sysadmins in the world (think Windows), esp. desktop sysadmins, are not uber-leet Unix haxors who adore vi and the command-line.

Speaking of leetness, two super-leet Red Hat desktop hackers with funny accents are kicking off work on Sabayon: Mark McLoughlin (panel maintainer, etc) and Daniel Veillard (libxml & gamin maintainer). There was a tussle over the name, but the French (what with their centuries of cultural sophistication and all) beat out the elves. As Mark explains it, DV probably just wanted to be able to say, "Hello I'm Daniel Veillard and I pronounce Sabayon 'Sa-ba-yon'". Our Irish hackers really are like little elves that write code. You go to bed and when you wake up in the morning a bunch of code has magically appeared. In retaliation, I was assigned the mythical character of a "Troll" by DV, but this does not hinder my speaking the truth. I may be a troll, but I am a truthful troll. The only thing that serves to dampen Mark's elf-nature is when he lights up like a chimney stack, strangles me with scarves, whacks me with bats, drives through red lights and otherwise engages in behavior liable to result in death. But back to Sabayon.

Humble Beginnings, What Sabayon Does Today

First and foremost, Sabayon provides a sane way to edit GConf defaults and GConf mandatory keys: the same way you edit your desktop. Sabayon launches profiles in an Xnest window. Any changes you make in the Xnest window are saved back to the profile file, which can then be applied to users' accounts. Want to add a new applet to the panel? Right click on the panel and add one just like you usually would. Of course, you're also free to use gconf-editor to change keys at a lower level, or download any GNOME setting tweaking program from the internet and use that. Sabayon also uses gamin to watch changes you make to the filesystem. So if you want to change the font for your users, you can drag a TTF to ~/.fonts, change it in "Font Preferences", and voila. When you're done making changes, you can save the profile. A change log will automatically be generated so an organization with a number of sysadmins can track down what changed when. Hopefully in the future we'll also have revision support for desktop profiles.

Right now Sabayon has support for tracking: GConf settings, panel applet addition/removal, general files and special Firefox profile support.

The Illustrated Tour of Sabayon HEAD

  1. First we launch Sabayon (if run as a non-root user, it uses console helper to get root).

  2. Let's create a new profile for panty-waist designers. We base it off our existing Office Desktop profile.

  3. Sabayon starts an instance of that profile in an Xnest, including the sabayon monitor window.
  4. Designers need to be coddled, so we create a welcoming text file for them and save it to the desktop.
  5. In response to saving the new text file, two new entries appear in the sabayon monitor. We don't actually want to change the recently used list, so we tell sabayon to ignore that setting.
  6. We drag a new Gimp launcher to the panel. Gimp is like crack for designers.
  7. In response to the new launcher, sabayon monitor shows some new entries (and I have a continuity error in taking screenshots, there should still be the two items for creating the text file because we haven't yet saved, oops). Notice that Sabayon records a "Panel object added" change rather than a dozen GConf keys being added. Sabayon can be taught to aggregate standard groups of changes together to make it clearer to admins what's going on when they read through the change log.
  8. Designers like pretty things, let's change the background. (As a total aside... the background capplet rewrites its GConf keys constantly a couple times a second whether they have changed or not, which makes the sabayon monitor flash a bunch in the background. The monitor has been interesting in revealing a lot of apps that are setting keys / saving settings files at weird times, which would be sucky in a networked environment.)

  9. And, as expected, the Sabayon monitor shows a bunch of GConf keys being changed. We've also gone ahead and checked the keys for adding the Gimp launcher to be "mandatory". That means users that have this profile applied will be unable to remove the Gimp launcher. Unexpectedly, there's a bunch of ".fonts.cache" files in the list too. Sabayon has a list of files and directories to ignore, but it's not complete yet. For now, some operations will generate a bunch of file change noise.
  10. If we just quit now, the all-in-one Desktop Designer.zip profile in /etc/desktop-profiles would not have been updated. If we're happy with the changes, we can save them back to the profile.
  11. The profile can then be distributed to computer(s) and applied to the relevant users' homedirs. We haven't started working on the mechanisms for this yet; Sabayon is the first piece in a bigger framework. For example, once we get the Netscape directory server code released and have a robust free ldap server, we can potentially host e.g. the GConf settings there and push to the server instead of applying bits to actual hard drives (or NFS shares).

    In the interim, the SabayonProfile class already knows how to apply profiles onto a directory. Actually, every time you edit a profile, a new temp directory is created first, and the profile is then applied to it. Consequently, it should be pretty easy for sysadmins to cook up their own python scripts using the SabayonProfile class that work on their custom systems today.
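As a rough idea of what such a script could look like, here is a Python sketch that applies a zipped profile onto a directory. Important caveats: it deliberately does not use the SabayonProfile class (its API isn't shown here, and guessing at it would be worse than useless), and it assumes the profile is a plain zip whose entries are paths relative to the user's home directory, which may well not match the real profile layout. The file names are stand-ins so the example runs anywhere.

```python
import os
import tempfile
import zipfile


def apply_profile(profile_zip, target_dir):
    """Unpack every entry of the profile archive under target_dir.

    Assumption: the profile is an ordinary zip of home-directory-relative paths.
    Returns the sorted list of entries that were applied.
    """
    with zipfile.ZipFile(profile_zip) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())


# Build a throwaway stand-in profile so the sketch is self-contained.
workdir = tempfile.mkdtemp()
profile_path = os.path.join(workdir, "Designer.zip")
with zipfile.ZipFile(profile_path, "w") as archive:
    archive.writestr("Desktop/welcome.txt", "Welcome, designers!\n")

# Mirror the edit-time behaviour described above: apply into a fresh temp dir.
scratch_home = tempfile.mkdtemp()
applied = apply_profile(profile_path, scratch_home)
```

A site-specific script would loop over real home directories instead of a temp dir, and would need to fix up file ownership afterwards; both are left out because they depend entirely on the local setup.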

To Infinity, And Beyond!

Sabayon is just the first step in improving the manageability of GNOME. We (well, I) wanted to get something concrete landed that will help us focus on sysadmins as users, rather than designing a bunch of abstract features. It also exposes manageability features GNOME has theoretically had, but never exposed in a way people could easily exploit, which is good. I'm rambling now, again, but here are some random things markmc, dv and jdennis might be working on in the future:

  • Making sabayon solid. It's still a very young project (its one-month birthday is tomorrow), and is rather rough around the edges. Things are falling into place pretty quickly now, but there's a lot of work still to go just in making the current feature set work better. Some simple improvements like expanding the "ignore changes to these directories" list will make things a lot better. We also have a number of UI features that aren't in the current codebase.
  • Supporting revision history on profiles
  • Figure out how Stateless Linux (in a nutshell, where the root partition is mounted read-only and synched transparently with a central source, giving the central-state advantages of thin client with the low hardware and network infrastructure costs of cheap-intel-box thick client) and Sabayon work together. Stateless Linux makes it easier for one admin to support many machines. Sabayon (particularly sabayon of the future) will make it easier for one admin to support many users. The intersection of these two is a very nice place to be!
  • We might try figuring out a short-term solution to distributing profiles to user home-directories (whether those be on an NFS share or spread across a couple dozen computer hard drives).
  • A real icon and a logo, because self-respecting GNOME projects these days need kewl logos from day one. By showing the world the icon I barfed up, Diana will be forced to make us a new icon, pronto. Designers can't stand ugly graphics.
  • Backing GConf with some sort of network store, perhaps LDAP. If we could get a drop in and run GConf server using the better-be-freed-soon netscape directory code, that would be awesome.
  • Reducing the pain of panel management and upgrading by moving to a new layout/storing model where applets are either "on" or "off". Panel cursors allow control over where applets go. This means adding/removing/changing applets in upgrades becomes possible. Currently it breaks everything, which is a management nightmare for distros, let alone the lone sysadmin.
  • Figuring out how to improve manageability of the Frankendesktop (word thanks to Luis). OO.o and Firefox mean that GConf support alone isn't enough for now. But if we're tied into supporting all these systems, we may never have the ability to do something as nice and universal as Windows group policy. So one project is to figure out if we can back OO.o and Firefox preferences using GConf. Then we can support GConf with all our heart, soul and mind in the tools and on the server.
  • Extend GConf to support features that allow small numbers of admins to support hundreds or thousands of users (such as group policy). We don't just want to copy giant technical architectures blindly, and we haven't started looking into this design yet, so it's very vague for now.

Getting Sabayon

Sabayon is a little buggy atm, but it's pretty easy to get running :-). Python source is available from the sabayon module in GNOME cvs. The major dependencies are pygtk and the gamin python bindings (these are available in fedora core HEAD, but gamin-python is not in FC3, I think). I think the GConf parts will still work even if you don't have the gamin python bindings, but YMMV. You'll also have to paste in two one-line text files in /etc/gconf/2 as per the README, but it's pretty easy.

Thu, 17 Feb 2005

And now for a less sexy blog post. I just sent this message to desktop-devel, but as per the message, I know many GNOME hackers no longer read lists completely, soooo....:

Revitalizing the Urban Center of GNOME

We need to get desktop-devel back to the useful hacker exchange it once was (probably only in the soft glow of memory, but hey). That means not only do GNOME enthusiasts need to be more restrained, but we (the core hacking community) need to get back on the list, start using shared channels like #gnome-hackers (even for hacker-to-hacker social purposes) again, etc.

Wed, 16 Feb 2005

Forward: For a drawn out post on next-generation X rendering, this blog entry is really short on eye candy. I apologize, but I'm at home, separated from my beloved eye candy, and figured I should write this while I felt motivated. As a way of forcing my own hand, I'm making a link now to a blog entry I haven't yet written that will contain screenshots in the future :-)

Next-Generation Rendering For the Free Desktop

For the past half year or so Red Hat's desktop team has had people working toward making accelerated graphics rendering on the free desktop badass, but doing an ass job of actually talking about what they're doing in a larger public / GNOME context. They've been doing a combination of experimentation (from that cracktastic OpenGL compositing/window manager luminocity to xsnow for the Xcomposite generation) and knuckle-down no-holds-barred infrastructure work (like making Win32 GTK work on Cairo so GTK can move to cairo as the default backend). With RHEL4 kicked out the door we've been able to rebalance day-to-day work on GTK and X onto other people to give the nextgenren hackers free hands. Currently the full-time nextgenren team at Red Hat is Owen Taylor (gtk/pango maintainer), Søren Sandmann (x hacker), Diana Fong (visual designer), Kristian Høgsberg (x hacker) and Carl Worth (cairo maintainer).

I'm really excited because these guys' expertise is across a broad chunk of the rendering pipeline, from the toolkit down to the x server, which is going to give this effort the ability to work on this from a global perspective rather than optimizing the bits where we happen to have influence. I'm doubly excited because other companies (well, Novell at least, but hopefully others will join) are starting to invest in this effort too!

I'm hoping to drag Owen into spinning this off into an umbrella effort (ala project utopia) to help maintain a coherent story/platform even as lots of people pour work into lots of different packages and distros. There are so many different ways to attack the X rendering issue that I'm a little worried about seeing a lot of fragmentation of effort and the result not being particularly coherent. I do hope people experiment with lots of different approaches, but I also really hope that we can give developers a consistent platform for doing cool graphics on the free desktop. It would be a real shame to end up with the message in two years being "well, platform X has the feature you want, but you have to worry about also working with Y because X won't work well on distro Z". This sort of technology-choice morass can really dampen developers playing with this stuff and adding support all over GNOME, which is exactly the sort of quick-fiddling big-payoff stuff I think we'll see a lot of as soon as this stuff starts landing. In other words, let's push toward the point where people can feel confident and start hacking up cool things for this system inside GNOME.

What It Might Look Like

A really good system needs to have lots of pieces in place all hooked together... it's not something that can be hacked apart and replaced by arbitrary random incompatible bits (though there are points of commonality, such as OpenGL or Render). For example the pieces in one imaginable architecture - by no means the decided-upon final one or anything - might look like:

  • A sophisticated drawing layer (cairo using glitz/opengl or render as backends)
  • Stock renderers built on top of that drawing layer (pdf/ps rendering backed by cairo - such as Alex Larsson's xpdf fork in evince, svg rendering backed by cairo, etc)
  • A toolkit that aggressively takes advantage of the features in the drawing layer, exposing them to applications and themes (gtk+)
  • A window+compositing manager that can work closely with the toolkit but essentially takes the window contents as a static image in compositing (metacity with luminocity-like GL compositing manager features fused in to deal with window effects, synching up smooth resizing, live window thumbnailing, crazy pagers, etc)
  • A hardware driver system to expose a low-level hardware accelerated rendering path to the drawing layer (opengl or render with hardware accel)

With that model we can implement things like:

  • Toolkit themes that draw with layer blending effects, delightful bezier curves, and irritating alpha gradients
  • Indiana Jones buttons that puff out smoothly animated clouds of smoke when you click on them
  • Alpha transparency in applications whenever and wherever the urge strikes us
  • Live window thumbnails
  • Hardware accelerated PDF viewers
  • Hundreds of spinning soft snowflakes floating over your screen.... without messing up nautilus
  • A photograph of a field of long dry savanna grass as your desktop background... where the grass is gently swooshed around by a breeze created by moving your mouse across the background
  • Windows that shrink scale and move all over the fucking place with cool animations
  • Synchronized smooth resizing so there's no disjunct between window borders moving and the contents redrawing (you should see the demos of this in luminocity... it really makes a difference in how real the interface feels, just as double-buffering did for stuff moving)
  • A shared path between on-screen display and printing (using Cairo's PDF/PS backends)
  • Vector icons with very occasional super subtle animations rendered in realtime...a tiny fly which buzzes around the trash every several minutes, etc... think mood animations as in Riven (which as a total random aside is still a shockingly beautiful and atmospheric game years after it came out, postage stamp sized multimedia videos notwithstanding)
  • Workspace switching effects so lavish they make Keynote jealous
  • Brush stroke / Sumi-e, tiger striped, and other dynamically rendered themes where every button, every line looks a little different (need to post shots / explanation of this stuff, but another day)
  • Progress bars made with tendrils of curves that smoothly twist and squirm like a bucket of snakes as the bar grows
  • Text transformed and twisted beyond recognition in a manner both unseemly and cruel
  • A 10% opaque giant floating head of tigert overlaid above all the windows and the desktop.
  • etc etc. In short: awesome.

And that's a conservative approach to this: each window essentially renders into a texture, and the textures are then combined in a separate rendering pass by the compositing manager. A lot of the work Diana does challenges our assumptions about what this rendering system should be able to do. For example, something as simple as a swoosh that cuts across both the window and the titlebar is currently very tricky. Diana's work has illustrated something that may be obvious, but seems to be forgotten in the excitement to build the One True Graphics Pipeline (this does not exist!): It's very important to figure out many of the things you want to do with the graphics system before you get in too deep and dirty, because there are a lot of directions we could go that call for rather different architectural choices. To give one example, if we decided we really cared about having lots of animations throughout GNOME (this isn't something we're pushing, but we talked about it) that would dictate a very different approach from a graphics system where we really really cared about printing. You can't always have your cake and eat it too... especially not when you consider implementation constraints.
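
To make the "render into a texture, then combine in a separate pass" idea concrete, here is a toy sketch in plain Python. It is not how luminocity or metacity do it (they would use OpenGL or Render); the function names and the straight-alpha pixel format are assumptions picked for illustration. The compositing pass simply paints window textures back to front with the standard "over" operator.

```python
def over(src, dst):
    """Porter-Duff 'over' for straight-alpha (r, g, b, a) pixels, floats 0..1."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


def composite(width, height, background, windows):
    """Combine window textures into one frame.

    windows is a back-to-front list of (x, y, texture), where texture is a
    list of rows of pixels already rendered by the window's own toolkit.
    """
    frame = [[background for _ in range(width)] for _ in range(height)]
    for x0, y0, texture in windows:
        for ty, row in enumerate(texture):
            for tx, pixel in enumerate(row):
                x, y = x0 + tx, y0 + ty
                if 0 <= x < width and 0 <= y < height:  # clip to the screen
                    frame[y][x] = over(pixel, frame[y][x])
    return frame
```

The point of the split is visible even in the toy: the compositor never looks inside a window, it only sees finished pixels, which is also why effects that cut across the window and its titlebar are awkward in this model.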

Another example of how prioritizing "what do we want to improve with this" can change the direction: Since taking advantage of these new toys would require a new theme system, Havoc and I have been talking about how a very different theme / widget rendering system might work with this that allows for custom design of any window, widget, or anything in between. One of the things we designers have been experimenting with behind closed doors is what you can do with a window's design when it's not drawn out of a bunch of stock widgets but you have a freer hand. (This does not mean visual inconsistency, just as a magazine can maintain a consistent look but still do a fresh layout for each page using a mix of stock and new elements.) The results can be really good. No matter how good the artist, you can only get so far designing a crude palette of some fixed number of widgets which are then used in preset ways. A good theme/widget rendering framework would help us negotiate this balance between re-using stock elements, and overriding the rendering of widgets at appropriate points to customize how a "Control Center Preference Page" is drawn or to simply shift the text in buttons over 10 pixels to the left. Figuring out how this stuff works, or if we just want to leave the theming issue alone (which would sort of be a shame given how much of the old flooring we're tearing up around it), may also have a significant impact on the final architecture.

A radical model (which also avoids multi-pass rendering without opening up security issues present in sharing direct access to existing graphic cards between processes) might involve a centrally rendered scene-graph where each client is given a subtree to add higher-level primitives. That could give us access to candy like pixel and vertex shaders (which we experimented with several months ago as part of rendering subtle but live backgrounds of grass fields, etc), which are attached to nodes on the render tree. Of course, there are many paths for leveraging shaders short of a full scene graph system. The scene graph model has a lot of significant concerns that are not as relevant to, say, 3D games where this model is common. Text rendering is one example.
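
For the curious, a centrally rendered scene graph with per-client subtrees could be sketched roughly like this. Everything here (the class names, the "shader" label on a node, the draw_order walk) is invented for illustration and says nothing about what an actual X-side design would look like.

```python
class Node:
    """One node in the render tree; shader is a hypothetical effect label."""

    def __init__(self, name, shader=None):
        self.name = name
        self.shader = shader
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


class SceneGraph:
    """Central tree; each client is handed one subtree to fill in."""

    def __init__(self):
        self.root = Node("root")
        self.subtrees = {}

    def attach_client(self, client_id):
        subtree = self.root.add(Node("client:" + client_id))
        self.subtrees[client_id] = subtree
        return subtree

    def draw_order(self):
        """Depth-first walk done once, centrally, for the whole screen."""
        order = []

        def visit(node):
            order.append((node.name, node.shader))
            for child in node.children:
                visit(child)

        visit(self.root)
        return order
```

Because one process walks the whole tree, there is a single rendering pass and clients never get direct access to the graphics card; they only describe what they want drawn.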

Owen and company have slides from the X dev conf, but the punks did them as SVGs so unless you have their k-rad Cairo backed SVG slide presentation program, or if you're willing to view slides in Inkscape... they're not much good (though it is cool that you can find the slide you need using Nautilus thumbnails, but I digress) (hmmm, you can also open them in eog). Honestly, not the most inspiring OR detailed slides in the world either. I don't think they'd had much sleep when they wrote them up. *grin*

Anyway... I'm rambling. I've given a couple points too much depth, most points not enough depth, many points I've missed, and doubtless some I've gotten wrong, but I knew if I waited to write the perfect post on this there'd be only more backlog of material to share... so a braindump it was. :-) I guess in the end I'm pretty excited. It feels like we're running the last couple miles to get to the giant great-rendering payoff Keith Packard kicked off in the X world several years ago.

Code and stuff

  • Cairo I think everyone knows about... writing for Cairo in Python or Mono is especially cool. It's really easy to get something that looks good going in short order. If you haven't played with it, you should!
  • Luminocity is in GNOME cvs with the module name 'luminocity'
  • Metacity compositing work is in 'metacity' with the branch 'spiffifity'
  • GTK+ / Cairo integration.... gtk+ HEAD!

Apparently they also have a jhbuild setup that'll build all this stuff that's headed for CVS in fairly short order.

And for my last point...

Hula!

Sat, 22 Jan 2005

I promised my next blog manifesto would be handed over to The Journal, and, behold, the latest GNOME Journal is upon us.

Read Experimental Culture

In it, I chronicle the rise and fall of GNOME. It's a rousing tale of charred corpses and classical chrome starring Enlightenment as the wayward prostitute and George Jirka as Her Royal Majesty the Queen of England. Cameos by Beagle, PyGTK, and the cultural revolution.

In all seriousness (well, more seriousness, at least), I hope after reading the article people will at least talk about the problem: GNOME is sort of boring right now. When you interpret usability solely as restraint and polishing it can really dampen project enthusiasm over time. All work and no play makes Jack a dull boy.

Design not Usability

The partial solution I would proffer is to focus on design instead of usability. There's a big difference. I'm sure there will be a big hoopla over Apple today owing to the expo, and they deserve it. I think it would be very hard to argue that the things Apple does are not interesting. Part of the reason Apple is interesting is because they encourage designs that change market norms. Good design is challenging. I mean that two ways: both that it is hard to do, and that it tends to shake things up.

Extreme shaftation is an oft used and effective approach to producing really good designs. That's part of the reason it's far harder to do a good design in a non-1.0 product. In a 1.0 product you don't have existing users, there's nobody to shaft. You can choose who you want to target, and do it well (unless you position yourself, say, as a Microsoft Word replacement in which case you inherit the set of expectations!). As soon as you have users, it's very very hard to drop things from the requirements list. The point of the shafting isn't to remove individual features, or to increase simplicity (necessarily). Simplicity sucks if it doesn't do anything. The point is to expand the scope of possible designs; it's to let you do new and more interesting things.

Focusing on usability devolves into a sort of bean counting. You divide up the "requirements list" and figure out how to cram all of it in, and then try to organize the minutiae (button labels, menu organization, etc) so it somehow still all makes sense. The result isn't very sexy, and is aggressively mediocre. Every point on the requirements list pins you down. In the end the requirements list does the design instead of you. When everybody else is producing nutso apps with a billion buttons and no sort of consistency (cf. GNOME 1.x), the result of usability looks pretty good. But by shedding some constraints, losing most of the requirements, and focusing carefully you can usually make something much better.

Shedding the Requirements List by Zeroing User Expectations (MS Office)

Microsoft Office exemplifies usability in action. They have a huge list of features that Office must have or users will be angry. They have done a good job of taking that massive list and producing something sane. I am sure that every dialogue and menu in MS Office is pored over with excruciating care: "Will that wording confuse people?", "What are people most likely to be looking for in this menu?" etc. It shows. Office is very polished. It's also a very poor design.

If I were commissioned by Microsoft to dramatically improve Office, my first step would be to position the project not as a next-generation Microsoft Office, but as a new product. I might even start with the Office codebase, but I sure as hell couldn't work with the smothering mantle of user expectations that looms over Office. Done well, I think you'd largely displace Office in the market (assuming this was a Microsoft product, I don't mean to imply that anybody could just make a better product and trounce Office in the market). So you are meeting the goals people have in using Office. What you're not doing is slogging through trying to meet the specific needs people have of the existing software. If you do that, you'll just end up writing Office again.

New Software Resets the Requirements List Anyway (E-mail)

It's important to understand that most 'feature' or 'requirements' lists are a reflection of users' needs and desires relative to existing implementations. If you improve the model enough, most of this is renegotiable.

E-mail is a great example of this. Let's say the internet hadn't appeared until 2004. You are right now in the process of designing the first E-mail app. Clearly users need the ability to make tables, right? I mean, that's "word processing 101". And to format them precisely, oh and insert drawings. And equations. And to edit graphs inline, and to set the margins and page settings. etc etc.

You could easily end up with the requirements list for Microsoft Word: a design for creating multi-page labour intensive laid-out documents. These are the requirements you'd extract from the "word processor + postal mail" model. But E-mail totally renegotiated this. Short little messages are the norm, not multi-page documents. You receive many dozens of mails a day, not several. There's no question that being able to insert a table here and there would be nice, but it's by no means a requirement. E-mail's one compelling feature, instant and effortless transmission of text, renders the old model's "must have requirements" list a moot point.

Mon, 17 Jan 2005

Dear Professor Harris, your course has been remarkably useful to me. I recently discovered I can view archived copies of your past lectures through Stanford Online. Reliving those memories has helped me recapture something I had lost since leaving your class. Now whenever I find myself off-center, struggling with my personal demon, I log on to the website and help is only a few key clicks away. (P.S. the issue I've been struggling with is insomnia)

Wed, 12 Jan 2005

Just released gnome-blog 0.8. New features include drag and drop uploading of images (to compatible blog software), spell checking, more blogs supported, and proxy support. Currently we are known to support: pyblosxom, advogato.org, blogger.com, movable type, livejournal.com, and wordpress. It should work with any MetaWeblog or bloggerAPI compatible blog, but YMMV.

See the gnome-blog web site for more info, tarballs, rpms, etc

Sat, 18 Dec 2004

I wrote this article the better part of a year ago and forgot about it. I just noticed it was pushed live:

Improving Usability: Principles and Steps for Better Software

Actually, I don't see any steps in there. Apparently I was also interested in the history of design at the time (which is a cool topic, really, so I guess I'm still interested). But I enjoyed rereading it, and it's nice to notice that, while I would have written the article from a very different angle today, the principles are still the same. You know it's been a good year when your principles are still the same at the end of it. :-)

Executive Summary:
The article covers a number of design principles, situating them in the historical context that made the principle relevant. The principles are:

  1. User Knowledge Principle Figure out who your user is, what they do, and what they need.
  2. Feature Bloat Principle Recognize the cost of each feature you add and each exceptional use case you accommodate.
  3. Focus Principle Good design requires editing. Focus the design on one principal class of users.
  4. Abstraction Principle Keep track of the conceptual model your software requires, and work at making it simpler. Reduce cognitive friction.
  5. Direct Manipulation Principle Enable the illusion of direct manipulation when there is a reasonable physical metaphor.

Then the article dives through four of the most important phases (I suppose this is the wrong word since they often overlap, repeat, occur simultaneously, etc) of software design.

Sun, 12 Sep 2004


Jamie's Silhouette in Prague Castle
I was on vacation last week in the Czech republic with Jamie. We ended up spending most of the week in Prague, but did escape into the countryside a little.

Photographic Glut
Before we left Jamie said she was bringing a digital camera. I delivered my usual spiel about how "anything worth remembering doesn't require a photograph to remember it". Alas, while my relationship with the camera was initially frigid I warmed up to it. Eventually it possessed me, and I'm afraid I might be hooked on photography now. Jamie can attest that I tried to stop: "this is the last photograph, last photograph, really. this time." Currently I am staving off the desire to drop $1400 on a Nikon D70 digital SLR. Bad Seth! So that is my excuse why this blog entry is a series of photographs instead of a long winded trip journal. Actually, that doesn't sound so bad. Nobody reads long textual things anyway. In fact, I doubt anybody has read this far (except my grandmother) and has instead preferred scanning through the photos.

I'm too lazy to make all these pictures in the blog link to the full sized image, but they're all found in my "best of prague" 2004 designer collection. I've also got a larger 70 photo album.


Czech Countryside


The Shows
I can not recommend Prague too highly if you like "high art" performances and/or are a miser. There are at least a half dozen chamber music performances every night, the opera is cheap (we paid $15/ticket for very reasonable seats), and unconventional performance art forms abound (of note were the national marionette theatre and Laterna Magicka).

The marionette theatre performed Mozart's Don Giovanni, which might sound dry, but it was amply laced with humour and was somewhat vulgar - true to traditional puppetry. The entire audience was in hysterics by the end. That said, their performance didn't make a mockery of the opera at all. They found a perfect balance between sucking you into the drama, and then breaking up the boring bits with comic relief. This is particularly impressive because, of course, all the spoken (well, sung) words were in Italian; though I'm familiar with the material, so that might have aided with the dramatic bits.


Prague in one photo: gothic spire, quaint old buildings, 1960s Soviet cement block apartments


Laterna Magicka is possibly the best performance I have ever seen. It was certainly the weirdest. It is basically ballet with some silent (good) acting. The catch is that they use three movie projectors projecting onto white cloth to construct the "set". The characters move in and out of the "movie" part seamlessly and interact across the boundary. For example, a live actor will run through the sheet and suddenly pop up in that location on the projected image. They'll then turn around and continue a conversation with a live actor on the stage. Laterna also has a penchant for flying objects and people on ropes. For example, they'll remove the middle projection cloth, and a character on the left projection will toss a rose to the right. A physical rose will then go flying through the air (and do a loop or something) in the middle. It's very hard to explain, but the net effect is abstract, colorful, and a total mindfuck. In a way, I would say that Laterna is a spiritual extension of (the impressive but often tedious) non-narrative cinema that uses the presence of physical actors to draw the audience in and keep them interested. It's engaging high art. Very cool.

In the realm of the national opera, we were fortunate enough to catch Verdi's Aida, one of the "great operas". The national opera was having a Verdi week with a different opera of his each day. This was definitely the most well known, and we were able to warp our schedules to make it (thanks Jamie!). I was not familiar enough with Aida to closely follow the plot (what opera has a good plot and libretto anyway?!? I think if we're honest most operas' plots suck. it's about the music, stupid). However, the music was absolutely terrific, and the performances were top notch. I really liked the vocal performance of the lead tenor (who played Radames), but his acting was terribly rigid. He didn't seem able to emote and/or move and sing at the same time. Oh well. The mezzo who played Amneris was both a fluid actress and delivered a phenomenal vocal performance. Aida herself was also good, though her voice lost some resonance and seemed thin in its upper register (of course, she had resonance to lose...). Oh how easy it is to be a critic *grin*. Anyway, the long and short is that they delivered a "world class" opera performance at prices that mortals can absorb without getting a nosebleed.

Speaking of Verdi, we sadly missed a performance of Verdi's Requiem in favor of visiting the Church of St. Nicholas...which turned out to be closed. Too bad because it's one of my favorite choral works, and the performance was in a large gothic church which would have doubtless contributed an interesting mood (not to mention the effect on the timbre!).


Rail Control Station


Charles Bridge over the Vltava River

Carved Doorway in a Sidestreet

The Sights
My favorite sights were non-historical: sitting on a bench and watching the river, walking down random sidestreets in Prague, riding the underground aimlessly and popping out at random stations to see what's there, visiting a department store to czech out the latest clothing fashions (I swear I will never use that pun again, please keep reading) and grocery items, watching people cavort around the town square at 1am, and strolling through the countryside. Jamie was more into visiting all the "must see" locations, and this generated a little friction for the first few days. Fortunately we resolved this and the rest of the trip was marvelous.


Rowboats on the Vltava River

Of course, many of the historical things we saw were incredible too. I was particularly pleased with the St. Vitus Cathedral and Karlštejn Castle. Many period religious structures (*cough* church of st. nicholas) are terribly ornate. I tend not to appreciate structures just because they are old. Many grand and/or famous old structures do have beautiful designs that tickle my fickle modern aesthetic sensibilities. Many do not. In any case, the Cathedral, while painfully gothic and overwrought on the outside, is composed internally with sparse shapely arches and the best stained glass I have ever seen.

The tower of the St. Vitus Cathedral is quite a ways up and is accessed by a narrow spiral staircase with no windows or railing. The tower is, I am sure, eminently defensible, but not pleasant when jammed with people going up and down with barely enough room. It was particularly unpleasant when the lights went out. However, the view from the top paid us back with double dividends. Many of the best photos from the trip were taken from the tower, which affords a panoramic view of the city with few obstructions. It's also perfectly situated along the river to capture many of the arching stone bridges.


Old Town Square in Prague

Karlštejn Castle

Food & Beer
What can I say, beer was literally cheaper than water. Food was a mixed bag. I wasn't blown away by "local cuisine" (I mean, goulash is fine, but it's not thrilling). On the other hand, restaurants were very cheap and Prague has reasonably good foreign food (particularly a lot of Italian). Lots of hitting cafes at night for hot chocolate or coffee. We ate at KFC once (I take full responsibility for this, I was stressed out, hungry, and things were closed. Jamie was dragged there). We hit a grocery store and went through grabbing things that looked interesting. The result was a basket piled with chocolate and junk food. The cashier looked at us funny. I am pleased to report that the Czechs apparently share my affinity for hazelnuts. Juice, tarts, and other fruity things were a highlight of the trip, particularly for Jamie. I probably should have indulged in juice more, but I was too happy to have cheap decent espresso.

We finally stumbled upon an absolutely stupendous "fancy restaurant" one night for dessert. We'd gone up the funicular railway into the hills around Prague at night to see the view of the city. We never got a really satisfactory view (though it did provide a nice walk), but one of the stops was for a high class restaurant. We got out on a whim and grabbed dessert and coffee there. Fresh raspberry, pear and lemon sorbets, and a dessert cheese filled with pear chunks and drizzled in a tangy sauce, open night view of Prague, live piano, a silky cappuccino, cool night air. Completely-off-the-charts sort of good. We came back the next day (our last) for dinner, and had 4 incredible courses for $15 a head (starting with those sorbets... yum). Main courses (Saffron, mint. Enough said.), which we split for maximum effect, were not just delicious but beautifully arranged. It was a perfect way to end the trip... we left for the airport 5 hours later.


St. Vitus Cathedral over the Vltava River

Statue At Bat

Broken Ankle
My broken ankle has mostly mended, so I was able to get around pretty well. Some days it didn't feel as good as others and I wore my "robo leg" brace, but most days I got away with a shoe-compatible brace designed for sprained ankles. We mostly took the underground around prague, and walked from point to point. Prague also has a nice tram and bus system, but we didn't figure out the routes until the last couple days. Too bad, it would have been interesting to ride a tram around the town. Just yesterday I extracted begrudging permission from the physical therapist to start cycling again. She walked back in a minute later and asked how far I was planning to ride. Busted! We compromised on 10 miles. I'm going stir crazy: haven't ridden this whole year. First it was winter, and then right when spring was coming and I had gotten my bike back into shape (lost the rear wheel in transit across the US) I did my ankle in.

Taking pictures of people still eludes me. It didn't help that the camera had a 4 second delay from when you squeezed the button to when it took the picture; ruins the possibility of capturing spontaneous moments, save by freak accident.


Jamie in the Great Hall of Prague Castle

Jamie Outside Something-or-Other

A 30 photo album is here, which is a subset of a larger 70 photo album. All the photos here are in the 30 photo album.

Tue, 24 Aug 2004

Let the record show...

I'd just like to state, for the record, that Owen Taylor has sullied his fancy-pants GTK engineering self. Not content merely to perpetuate and even initiate nasty hacks on python internals, his lust for evil not sated by working on an IRC bot. No! Owen had to go and work on an X-Chat plugin. Is this really a man you'd trust your widgets with?

Whiteboard!

So a number of us (owen, colin, jrb, bryan, blizzard, j5 and myself) hacked this weekend on an allegedly multiprotocol whiteboard that currently supports direct TCP connections and, most notably, IRC. Hopefully we'll get jabber support and gossip integration too. There's an X-Chat plugin for it. There's also a plugin for SupyBot for keeping a whiteboard with persistent state sitting on a channel. It doesn't look pretty atm, but it's a pretty good technology foundation.

Despite my constant bitching and moaning about having to implement the whiteboard protocol in the model, it's actually pretty cool. Clients broadcast actions to create/delete generic objects or modify their properties. Currently we only support text and stroke objects, but it should be pretty easy to add others to the system now that the base infrastructure is in place. The protocol looks something like this:

  WHITEBOARD [channelname] 0+ <create ><text requestId="[uuid]" x="0" y="0", text="Hello"/>

If you're using the SupyBot, it serves as the authoritative "master client" and echoes back actions if it accepts them or rejects them. The client-side model (this is the part I'm obsessed with because it's where I spent most of my time) journals actions it initiates, and can snoop the channel when other clients broadcast (so you don't have to wait for the server to echo, reduces latency which is important w/ IRC rate limiting) but only commits the changes as authoritative when the master client confirms them (otherwise they are rolled back).
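To make the journal/commit/rollback idea concrete, here's a minimal sketch in Python. It is not the actual whiteboard code: `Action`, `ClientModel`, and all the method names are made up for illustration, and it assumes every action carries a request id that the master client echoes back when it accepts or rejects it.

```python
from dataclasses import dataclass


@dataclass
class Action:
    request_id: str   # hypothetical: the id the master client echoes back
    object_id: str    # which whiteboard object is touched
    changes: dict     # property name -> new value


class ClientModel:
    """Illustrative client-side model: tentative journal over committed state."""

    def __init__(self):
        self.committed = {}  # object_id -> properties confirmed by the master
        self.journal = []    # tentative actions, in the order they were seen

    def record(self, action):
        # Used both for actions we initiate and for actions snooped from
        # other clients: they show up immediately but are not authoritative.
        self.journal.append(action)

    def confirm(self, request_id):
        # Master accepted the action: move it from the journal to committed.
        for i, action in enumerate(self.journal):
            if action.request_id == request_id:
                del self.journal[i]
                self.committed.setdefault(action.object_id, {}).update(action.changes)
                return True
        return False

    def reject(self, request_id):
        # Master rejected the action: dropping it from the journal is the rollback.
        self.journal = [a for a in self.journal if a.request_id != request_id]

    def view(self):
        # What the user sees: committed state with tentative actions layered on top.
        state = {oid: dict(props) for oid, props in self.committed.items()}
        for action in self.journal:
            state.setdefault(action.object_id, {}).update(action.changes)
        return state
```

Because `view()` is recomputed from the committed state plus whatever is still in the journal, a rollback doesn't need to undo anything: the rejected action just disappears from the next view.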

I'm particularly proud to be able to say that I'm doing transaction stream compression by smooshing sequential modifications together before committing them to the journal. "Look mommy, I'm Hans Reiser!". OK, so it's not really that hard, but it sounds 31337. Humor me, ok?
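For the curious, the smooshing can be sketched in a few lines of Python. This is an illustration rather than the real code: it assumes a modification is just an `(object_id, changes)` pair, and the `compress` name is made up.

```python
def compress(modifications):
    """Merge runs of consecutive modifications that touch the same object.

    `modifications` is a list of (object_id, changes) pairs, where `changes`
    maps property names to new values. Later values win within a run.
    """
    compressed = []
    for object_id, changes in modifications:
        if compressed and compressed[-1][0] == object_id:
            # Same object as the previous entry: fold the changes together.
            merged = dict(compressed[-1][1])
            merged.update(changes)
            compressed[-1] = (object_id, merged)
        else:
            compressed.append((object_id, dict(changes)))
    return compressed


stream = [
    ("stroke1", {"x": 1}),
    ("stroke1", {"x": 2, "y": 3}),
    ("text1", {"text": "Hello"}),
    ("stroke1", {"x": 9}),
]
smooshed = compress(stream)
```

Only adjacent modifications get merged, so the last `stroke1` entry stays separate from the first two: reordering across the `text1` change would alter what other clients see in between.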

Code is in CVS module 'whiteboard'. It's all written in Python with pygtk and shouldn't need anything special to work except for Cairo and pycairo. Now that we've done a good pass at the base pieces I think the actual drawing bits will get some more love/features in the next few days. High on my list are: erasing (*cough*), variable line width, hand-drawn-shape smoothing, and a highlighter. I've also got most of the pieces done for adding graphics tablet support. All that should be pretty easy except maybe shape smoothing (just don't know how hard the algorithms for doing this are).

Wed, 21 Jul 2004

The Unix Credo: We strive to never improve, hacks excluded.

Fri, 16 Jul 2004

  • 2:00 AM: Set alarm for 10 am (physical switch)
  • 2:03 AM: Tape piece of paper over alarm with the text "Why are you ruining my life?"
  • 2:07 AM: Go to bed
  • 2:15 AM: Fall asleep
  • ??? AM: Alarm is switched off, and the piece of paper is retaped over the alarm by a mysterious force. Abducted by aliens? Gremlins? Cruel alter ego?
  • 12:17 PM: Wake up with no memory of the alarm being disabled. Paper is still taped over the alarm like the alarm was never turned off (?!?)
  • 12:18 PM: Perform thorough exam for signs of alien abduction: scars, incisions, chips in the back of my neck, probes in various orifices. Results, negative
  • 12:19 PM: Inspect apartment security fixtures. Deadbolt: in place. Physical chain slider thing: in place. Pole blocking sliding glass door: in place. Grill over fan vent in bathroom: in place. Gremlin trap: empty

Conclusion: I have a cruel alter ego who wakes up when the alarm goes off, disables it for who knows what reason, laughs mischievously, and then goes back to bed.

Solution: Tie myself up before going to bed.

Problem: How do I get out of bed when I'm back to my calm mild mannered normal self?

Wed, 14 Jul 2004

I get a lot of messages asking me to compare and contrast Storage, WinFS, and sometimes Dashboard and Medusa. More recently, I've gotten a lot of questions about Spotlight and Beagle. I've generally avoided commenting (which usually means not answering the e-mail...) on these things both because it's impossible for me to do an unbiased comparison, and because the goals seem to be quite different.

  • Medusa, Beagle & Spotlight are similar, though of course Spotlight is much more mature. I would call them metadata index systems.
  • Storage & WinFS are similar, though of course WinFS is much more mature. I would call them document stores.

Caveat: If indexing and search were the primary goals, a document store would be a ridiculously overengineered approach. The medusa/beagle/spotlight model is much more sane if this is your only or primary goal. I'm not saying this to suggest document stores are better or worse than metadata indexing systems, only to point out that there's an element of apple-orange comparison at work here.

Metadata Index Systems

Medusa:

Medusa was originally written by Eazel, integrated tightly with Nautilus 1.0, and was slated for inclusion with the GNOME 1.4 release. It was primarily written by Rebecca Schulman, but also had major contributions from Maciej Stachowiak & some by myself. Medusa ran as root, which worried some people (but of course, so does updatedb for slocate...), but unfortunately had a major bug that caused it to be pulled from GNOME 1.4 at the last minute. Rebecca fixed the bug after the release, and re-architected Medusa to run as a normal user. But unfortunately Eazel collapsed before GNOME 2.0 and nobody promoted its inclusion. Curtis Hovey & I ported it to GNOME 2.x platform later, and Curtis is currently maintaining it and adding lots of new features / fixes. In particular he seems to be working on a UI for it. Medusa allowed very fast searches over large indexes. Indexes were built by scanning the disk every night (like slocate, unlike Spotlight which does things better). It also provided a search: URI scheme that allowed creation of dynamic "search folders". So you could have a "Spreadsheets" folder for example that always contained any spreadsheets on your system. The biggest hurdle for Medusa today is that the set of indexers is not very extensible, and so it doesn't know how to index very many different file types.

Spotlight:

Of course I haven't looked at Spotlight's code or used it, so what I know about it is from what Apple has published and discussions with friends at Apple. Spotlight appears to be a sophisticated, well-implemented approach to building a metadata layer on top of an existing file system. Changes to files appear to be noticed at the kernel layer, and indexers are quickly run to update the metadata cache (with information about filename, album name, size, file contents, keywords, etc). I don't know whether it is guaranteed that indexers will be run before the data can be accessed, but it is supposed to happen very quickly in any case so it appears instant to the user. Spotlight is the work of (among others, there are probably more people I just don't know) Pavel Cisler (BeOS tracker & Eazel Nautilus) & Dominic Giampaolo (BeOS BFS, which had a similar sophisticated metadata system). Spotlight also has a lot of work gone into the UI, for doing grouping, measuring relevance, etc. It's easy to underestimate how much work this is, in some ways the "indexing" is the easy part. Spotlight appears to index a lot more than just the filesystem, including things like calendar and mail, but I don't know the full extent of what it can do.

Beagle:

My knowledge of Beagle is based on playing with it and reading through a fair bit of the code, but I could definitely be missing large aspects because I haven't talked with Jon. Beagle's code appears to be fairly immature at the moment, but I would expect it to grow. It uses a port of Apache Jakarta's Lucene. Lucene primarily provides a way to *store* indexed metadata and do fast *searches* over lots of metadata (including full text, of course), but it doesn't provide the indexers for specific file types. In some sense, Lucene is a specialized "database" for storing the results of indexers. Currently Beagle has indexers for HTML, JPEG, MP3, OpenOffice.org (very cool) and Text. Unlike Medusa (I have no idea about Spotlight for this) Beagle is designed to index "byte streams" rather than files, so it can index, e.g. "The current page you are looking at in Epiphany". This makes it very compatible w/ Dashboard, since Dashboard wants to index any and all contextual data, not just things on the hard disk. At the moment Beagle appears to contain only very simple UI, so it's primarily a document indexing system.

On the filesystem side, Beagle currently works like Medusa and requires a "crawler" to update its metadata cache (say nightly), vs. Spotlight which updates instantly. Beagle also has crawlers for Mail and IM logs. Beagle also includes a renderer system for displaying the relevant metadata etc for different file type results. AFAIK, Jon Trowbridge at Novell is the person mainly hacking on Beagle atm, but I think the code was refactored out of Dashboard, and a number of other contributors are listed.

Document Stores

Both WinFS & Storage are aimed at doing a lot more than document indexing... in many ways document indexing is only a nice side effect of their larger aims. Storage and (AFAICT) to a lesser extent WinFS both intend to store the actual documents themselves inside the store. That means that more than just metadata is inside the store. Both WinFS & Storage provide a query system, though WinFS' has developed a nice object oriented language (which I think they compile to SQL) whereas Storage currently uses straight SQL which is harder for other developers to use.

Storage:

I know most about this so I'll talk about it most of course ;-) Storage is fairly immature, and the architecture has shifted a lot in the past few months.

"storage-store" provides a DBus service that allows fetching objects over the FreeDesktop DBus getting their attributes, relating them to eachother, running queries etc. "storage-store" uses postgresql to store the structured objects and perform queries. Because objects are accessed "live" rather than as "buffers", changes are instantly propagated across the bus, so multiple applications or users can work on the same document and instantly see changes other people make.

I'm currently working on architecture to tie storage-store into standard IM presence information so you will be able to see buddy icons of other people and what part of the document they are working on inside storage applications. I have a lot of user experience goals for Storage (or more accurately, for applications and desktop that use storage). You can find information about most of them on my blog and at the storage homepage. Though these goals are more important to me than document indexing, and have a lot more impact on Storage's architecture as a result, I will focus on document indexing in order to compare and contrast with the other systems.

libstorage-translators provides a framework for translators that can take structured object data in the store (metadata and the actual data itself) and translate it to and from byte streams (such as files). The goal is not indexing files, but for providing a way to move files in and out of the store. So for example, if your friend sent you a PDF file by e-mail, you could drag that file into your local store and the libstorage-translators will automatically decompose the information for placing in the store (and of course extract lots of metadata like album name, description, image width, etc etc in the process). Currently I have only worked on the "importer" side of translators, not the "exporter", so they are effectively like indexers. There are currently importers for: DocBook, HTML, any image format supported by gdk-pixbuf (JPEG, PNG, BMP, GIF, and several more obscure formats), PDF, text, and any format supported by gstreamer (MP3, OGG, AVI, MPEG2, etc). Importers can also create thumbnails for the data for convenient display later. Storage also includes a renderer system for displaying the relevant metadata etc for different sorts of results to a query. A major drawback is that I don't have translators for common document formats like Gnumeric or OO.o at the moment.

Queries can either be performed using an SQL-like format (slightly higher level than SQL but not much, it gets translated to SQL) or using natural language queries. A large chunk of storage code is currently in its NL system which uses very sophisticated HPSG grammars and other techniques to translate human language phrases into the SQL query format.

A storage:/// VFS URI is provided which automatically invokes translators when files are dragged into the store. That means you can, e.g. open a nautilus window to storage:/// and drag files in to add them to the store. It also provides query folders like Medusa. So for example you can have a folder "spreadsheets" or "songs by John Lennon that don't have the word 'love' in them" that is live updated to contain objects matching those criteria.
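A rough way to picture a query folder: it's a saved predicate that gets re-evaluated against the store every time you look inside, not a fixed list of files. The Python sketch below is only an illustration under that assumption; `QueryFolder` and the dict-based "store" are made up and have nothing to do with how Storage or Medusa actually implement this.

```python
class QueryFolder:
    """Illustrative 'live' folder: a name plus a predicate over store objects."""

    def __init__(self, store, name, predicate):
        self.store = store          # hypothetical: any list of dict-like objects
        self.name = name
        self.predicate = predicate  # callable: object -> bool

    def contents(self):
        # Re-run the query on every access, so new objects show up on their own.
        return [obj for obj in self.store if self.predicate(obj)]


store = [
    {"type": "song", "artist": "John Lennon", "title": "Imagine"},
    {"type": "song", "artist": "John Lennon", "title": "Love"},
    {"type": "spreadsheet", "title": "Budget"},
]

lennon_no_love = QueryFolder(
    store,
    "songs by John Lennon that don't have the word 'love' in them",
    lambda obj: obj.get("type") == "song"
    and obj.get("artist") == "John Lennon"
    and "love" not in obj.get("title", "").lower(),
)

before = [obj["title"] for obj in lennon_no_love.contents()]
store.append({"type": "song", "artist": "John Lennon", "title": "Mother"})
after = [obj["title"] for obj in lennon_no_love.contents()]
```

A real store would push change notifications instead of rescanning everything on each access, but the user-visible behaviour is the same: the folder's contents follow the criteria.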

WinFS:

I know the least about WinFS of any of the systems discussed here. I need to read up on it more... but the last time I looked at it heavily was more than a year ago when MS was still very elusive. It looks like a lot of info is up on the web now, so what I'm saying could be out of date. WinFS is backed by both NTFS & Microsoft's SQL server. It provides a very nice API for querying and working with objects. Currently the set of object types it can use is fixed and predefined by MS (but the list is long). In the future they will probably open this up and allow anyone to define new object types. AFAICT, WinFS is currently targeting primarily the storage of metadata, though it is tightly coupled to the files themselves stored as byte streams in NTFS. It does look like in the future they intend to more completely store things in WinFS. WinFS provides a very cool set of hooks for performing actions in response to changes in the store. WinFS uses this to provide indexing services, but users can also define their own actions (e.g. you could say, "whenever an e-mail from George is created, copy it into my "to burn" directory").

Tue, 29 Jun 2004

Unfortunately my ankle was fractured pretty badly and it was important I have surgery on Wednesday. Unfortunately this precluded my flying to Norway for GUADEC on Saturday. I actually proposed that I fly to Norway on Saturday to my orthopaedic surgeon. He gave me a look that was darker than oil at midnight, and went back to what he was doing without saying anything. Some people tell me I should have interpreted this as a "sounds ok". However, he later said some things about our goal being to "reduce the chance of having arthritis in the ankle for the rest of your life". That scared me into behaving.

There's a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are more informative for getting at the soul of the material. In my experience that's often true of talks vs accompanying papers. So I'm including my speaking notes here. I blame oxycontin for any incoherent bits. They're a little random but I hope you press through because some of the good stuff is near the middle/end ;-). Maybe I'll do sketches on whiteboards for all the places I was going to do live sketches and take pictures, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea what I'm talking about from the text. I've fleshed it out past the notes in some places where it was totally incomprehensible:

  • Storage is designed to support a more general user experience than just “find files more easily”. Storage isn't a silver bullet, but it can serve as a toolkit for making new user experiences easier to extend across the desktop. In the process it helps dissolve the application/desktop boundary a little.


The Experience

  1. Intro: Related to many existing systems

    1. Wiki – anybody can edit or work with information. Information is not super formal to start with, but can become “formalized”. Unlike wiki, allow for rich in place editing and better tie in to the OS for noticing changes and tracking “change threads” (which are themselves communication often).

    2. Whiteboard – support quick informal live collaborations. Don't force things into a particular “format” or medium but allow people to mix it up. Share a space with lots of presence information, etc. Also envision this working when people are in the same place.

    3. Groupware – handle objects that people need to deal with to get their job done. People, teams, projects, tasks, deadlines. These are more central to knowledge workers than even documents. Like groupware, track threads of communication, but don't tie people down to text messages. Let them respond with people, projects, tasks, etc. Rather than “posting to lists” you just append items to a topic in the (or a) central store.

    4. Bugzilla – tasks, and schedules, process, status, owner, etc. Track more interesting metadata in a way that people can shape to their organization.

  2. Build “objects people care about”

    1. This is more about what gets built on top of Storage, but it's a major part of the overall experience. The file manager (atop the filesystem) is about managing formal documents and folders to group documents in large concrete chunks. The <some name here> (atop storage) should focus on objects that fill people's daily lives.

    2. People, Projects, Teams, Tasks, Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc (and yes, Documents too) are objects people care about. Many others that are specific to particular industries and job roles. Some of these objects currently live in specialized applications like evolution, and most of these will still be handled primarily through a specialized interface. <sketch the two specialized interfaces>.

    3. It's usually a good idea to have specialized tools for targeting specific use cases.

    4. OTOH, although we work on text documents mostly in the office suite, we still expose common operations to the base OS (the filemanager mostly in this case). How can we extend the set of useful things that can be done with information across the information boundary? In a less generic sense, can we build support for the objects people deal with on a day to day basis more deeply into the OS? It doesn't have to be done by a universal component system, but base libraries like storage can make it easier to support the important “one off” optimizations in the base OS (such as for projects).

  3. Support informal work

    1. Most office applications are focused on producing deliverables: formal documents. But deliverables are the exception. Most knowledge workers spend most of their time processing, sharing, and extending information, not producing deliverables. We want to build interfaces that allow for some degree of information soup. <sketch the process flow for organizing SubsByTheInch2005>

    2. Informal work can eventually turn into formal deliverables. Make this process as convenient as possible.

  4. Information is information, don't force large chunks

    1. We currently have odd granularities of information. “Files” in the case of “formal documents” (but since we don't have informal constructs, many things are pushed into this).

  5. Access items within large bodies of information

    1. The storage “research-y” solution to this is object reference using human language phrases

    2. This aspect of storage still interests me, and has been where most of the work has gone until now.... but it is more researchy because it is prone to being technically infeasible (jury is still out ;-). As such, other parts of storage are not predicated on it.

  6. Provide the components for collaboration

    1. If storage is the physics, social interaction is the chemistry. Storage needs to provide some very basic structures that will give rise (when people, environments, tasks, etc. are thrown into the mix) to social interactions. Rather than trying to control things rigidly, as traditional computer environments have done, we allow social behaviors to regulate things more (as things work normally outside computer world).

    2. Presence information is the substrate for coordinating social interactions. Who is where and doing what is the most relevant context for social interactions.

    3. Access by multiple threads/computers/people. Rather than “versioning” documents and the associated problems (e.g. merging is a nearly insoluble UI problem) we allow “live” (or at least effectively live) access to documents.

    4. Fine granularity. If we have access from multiple places, the temptation is to use locking of “documents”. Even inside formal documents, however, this will greatly limit collaborative ability. If we have rich fine grained presence information, combined with very fine grained data access, we can provide the ability to socially manage interactions rather than requiring “forced” lockouts.

  7. Track information flow

    1. E-mail showed the importance of threads of communication between people. An e-mail thread morphs into a task (like a bug), which morphs into a few more tasks (which might have discussions associated with them), which turns into a full fledged project with an associated team, which eventually produces a policy document. All this stays tied together. <show interface idea>


A Brief History (aka excuse):

  • Storage was initially implemented as project Gargamel by a team of Stanford CS (and one EE, and yours truly) students as a senior project. Brian Quistorf, James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able point before they graduate.

  • It gets even more finished as I work on it after graduation while not looking for work. Web page is written, screenshots made, etc.

  • I foolishly decide to rewrite the NL parser (and lose the old CVS history when importing to cvs.gnome.org). I get sidetracked writing the NL parser.

  • Slashdot etc hit. Lots of developer interest, but I'm snowed for other reasons and don't successfully get development moving with other people. Plus I still have to finish the NL rewrite before things will function again.

  • The summer is completely crazy, and I stop working on Storage for 8 months.

  • Today: NL rewrite is now done. It's a much stronger foundation, but the semantic grammar is still small. However, even with the small grammar it can do very sophisticated (correct) interpretations of phrases like “songs that aren't by 'John Lennon' but have the word 'love' in them”. This would be very difficult to parse with a traditional “naive” scavenging search interpretation. Marco is also contributing to Storage, as well as some other Epiphany dudes. Things are starting to pick up, and I'm determined to not kill storage by bottlenecking again. I'm looking for a “project manager”.


What's there today:

Non-NL

  • storage-store – manages the postgresql server, handles notification

  • libstorage – GObject interface to store items

  • libstorage-translators – serializes / deserializes data streams from / to storage items

  • GnomeVFS module – automatically invokes translators on read/write into the store allowing existing GNOME apps to use the store like a normal filesystem

NL

  • PET – parses sentences into Head-Phrase Structure Grammar (HPSG) trees, by Dr. Ullrich Callmeier.

  • libmrs – interface to the Minimal Recursion 'Semantics' information in the HPSG tree

  • libmrs-converters – translates MRS into a more meaningful XML statement using a client chosen semantic grammar

  • libstorage-nl – translates using storage-specific semantic grammar into the intermediate XML form, and then to an SQL query


What's in the near future:

  • Currently libstorage, the VFS module, and some translators directly access the postgresql server. This is undesirable: it means permissions on a shared store would have to be enforced using a collection of SQL views, it means locking becomes very tricky, and it means that libstorage and other things link directly against postgresql libraries (though this could be addressed by gnome-db).

  • Support for NL searches in select non-English languages (probably Spanish first, but perhaps Japanese). Storage is built on a “language neutral framework”, but grammar engineering is a very difficult task. Some of the availability of NL searches will depend on what the linguistics community produces and distributes freely.

  • A nifty collaborative application to provide a test bed for the collaboration/locking framework. <sketch collaborative whiteboard/wiki design> (also shows informal work) Ideas? ;-)


<demo NL search interface>


<show NL slides and explain basic NL process>

Sun, 20 Jun 2004

A sliver more than two weeks ago I "sprained" my left ankle playing barefoot soccer. The ankle felt like it was getting better, but as swelling receded my foot felt awful. I could step on it, but the first few steps hurt like crazy. Other steps hurt but I didn't have to brace myself for them. I finally caved and decided to see a doctor.

Unfortunately, as predicted, this turned out to be a very frustrating affair. I know some people like lots of choice in Doctor, etc. But I sort of like the Kaiser-Permanente (HMO? in CA) model where they have their own big buildings with everything in it. You show up, and they'll figure out what to do with you. Anyway, I called the Blue Cross Blue Shield of North Carolina (*sigh*, since RH is based in Raleigh) advice nurse line twice. They were both confident I should go to an urgent care facility, and were basically unwilling to believe there are no urgent care facilities here.

There are tons in Connecticut, there are tons in Rhode Island, there are tons in North Carolina, and there are tons in California. There are almost no urgent care facilities in New Hampshire or Massachusetts. Some puritans probably made a law against them a few hundred years ago. Or maybe it's because human life is oh so critically important that even non-emergencies should go to the emergency room "just in case". Or maybe it's because the NE sucks in general. I dunno.

I finally decided to just drive to Rhode Island, because I really hate the thought of going to an emergency room for a non-emergency. Despite having to clutch with my hurt ankle/foot (which fortunately you don't have to do much on an interstate... just stay in 5th), it was actually a very positive drive. I was feeling pretty blue, and driving into the sunset in the outdoors is really nice. So I drive across Massachusetts and show up at this small town urgent care center. They X-Ray my foot, and my ankle. Nothing seems wrong, which surprises them given how my foot looks. Anyway, they X-Rayed my leg and it turns out my fibula (the small bone of the lower leg) is pretty badly fractured. So they splinted the area, and told me to go to an orthopaedic surgeon.

Lovely. Very strange that the pain was in my foot. I'm still a little paranoid that there's an occult fracture of the fifth metatarsal causing the foot pain. So the good side to all this is they gave me the X-Rays to take to the orthopaedic surgeon. I've been studying them and reading medical research papers from medline about what I see. I'm finding this very interesting. Ankle fractures (and sprains) turn out to be extremely varied. Looking at the damage from lots of different angles has also made it possible to reconstruct in more detail how I must have fallen.

Anyway, I don't have a scanner but I (very appropriately) gimped an online X-Ray of a healthy ankle to be a fairly good replica of mine. I cheated a little because I made it look like my posterior projection of my left foot. The online image is, I believe, a front projection of a right foot. From other projections it looks like this may be a spiral fracture, but from this projection it looks mostly like an oblique fracture. I'm not really sure either way, they apparently often look very similar from non-axial projections. I also labelled some stuff to give bearings.

The yellow areas are the (from left to right) lateral and medial malleolus. That's the bony bump on the left and right of your ankle. The pink area is the tibiofibular syndesmosis, which connects the fibula (smaller bone) and tibia (the larger weight bearing bone) together. Sprains are often a result of stretching this. Anyway, because the fracture is proximal to the tibiofibular syndesmosis, this is probably a supination with external rotation (Weber B). That means the injury probably occurred with the weight leaned on the outside edge of the foot, and then the foot was rotated. It is possible that it's pronation with external rotation (a form of Weber C).

So the bad news is that most Weber C injuries require open reduction (reduction is placing the bones so they align for healing). That would mean cutting my poor ankle open, and possibly even using syndesmotic screws that would have to be removed some weeks later :-/ The other problem with open reduction, besides the fact I'd need surgery, is that studies of outcomes suggest that open reduction results in a far slower recovery and goes awry far more often. With any luck it's a Weber B.

Thu, 10 Jun 2004

It's 2:30 pm, I've been awake for a little over an hour, and this is turning out to be a very miserable day. Actually, take that back, it's an aggressively bad day. The context for this is that I managed to sprain my ankle pretty badly over the weekend. So I figured there are two common sets of "bad things" in daily life: things that are annoying, and things that hurt. Sprained ankles have a way of taking the set of things that are annoying and making them also lie in the set of things that hurt. Trying to fall asleep is one of those things. I've been having trouble sleeping because, despite the ankle not hurting much during the day anymore, it always manages to throb at night. So I wake up to get a drink, hobble over to the sink, and then lie awake for another couple hours.

So to start my day, last night I forgot to reset my alarm clock which had been wiped by a power outage. Anyone who knows me knows the results of this: rather than waking up at 10am, I woke up at 1:15pm (and I'm lucky it wasn't 4pm) and missed an important meeting. I feel totally shitty about this. So I stormed off to the shower (well, limped aggressively), and turned it on w/o thinking. It was freezing cold because I forgot to warm it a little first. In my panic, I put more weight than I should have on my hurt ankle, and fell. My head just missed the faucet, but I did manage to hit my head into the wall and felt dazed for a minute or two.

After a hasty shower (which I hate and makes my eyes feel sleepy the rest of the day, but I hold out hope that I won't entirely miss all the meeting) I go to get Tylenol and a drink of water from the kitchen. In the process, I knock over the knife block and it falls to the floor. One of the knives hits handle first with the weight of the block behind it and the blade bends. Fortunately one of the cheaper crappy knives, but I still have enough Scotsman in me to be very grumpy about this.

And to add insult to injury, I get stuck behind a dump truck going 20 mph for 2/3 of my commute. Normally I'm very intentional about not getting worked up about this sort of thing, because it doesn't really matter, but I get really annoyed. This only makes things worse. Of course, it's also one of those ratty diesel things, so I'm stuck between sucking down fumes or roasting in the car with the windows shut and only internal ventilation on (no A/C).

So it's now 2:41pm. I'll probably be at work until midnight, and head straight to bed. That gives the day 9 more hours to take me down.

Tue, 01 Jun 2004

Apparently, according to an article in the Economist, cicadas have prime-numbered life cycles of 17 or 13 years. Simplistically, when the number of prey increases, after some time lag the number of predators increases, driving the number of prey down, resulting in equilibrium. Call this a smoothly varying population. Prey that spike every n years rather than varying smoothly have a leg up on a smoothly varying predator. When cicadas bloom, the number of predators is appropriate to no cicadas. They reproduce before predator numbers rise, and then disappear. Effectively, non-smoothly varying populations can avail of the time lag before predator numbers rise to match prey.

This results in selection pressure for predators that have the same length of cycle as the prey. While same length is perfect, a predator cycle that is a factor of the prey length will also work. E.g. if prey has a cycle of 6 years, a predator with a cycle of 3 years can still arise in numbers to consume the prey. So the problem, from the predator phenotype's non-existent perspective, is to guess (through random mutations, etc) the cycle length that overlaps most frequently with the prey. Factors of the prey's cycle length will, of course, overlap more frequently. The best length is the largest factor of the prey's cycle length, namely the prey's cycle length itself. (As an aside, interesting abstract algebra connections with cyclic groups, etc.)

From the prey phenotype's non-existent perspective: It now becomes an information hiding game. Given a cyclic group of order t (constant time between cycles), how do we minimize the overlap with groups of all other orders? The answer is to choose a large t with the fewest factors, i.e. to choose a long cycle that is also a prime number. Cicadas' long prime cycles are a very rudimentary form of encryption to keep random mutations in predators from "guessing" a compatible cycle length. Cool!

Now of course, using a non constant function for time_between_cycles(cycle_number) would work even better. And in some sense cicadas have that too by having two different cycle lengths. According to the Economist article, populations have even been observed shifting from a 17 year to a 13 year cycle in response to selection pressures caused by a fungus that developed a 17 year cycle.
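To make the factor argument concrete, here's a rough sketch. It is my own toy model, not anything from the Economist article: it assumes a predator with cycle length p "matches" a prey with cycle length t exactly when p divides t, and it ignores annual (p = 1) predators since those are the smoothly varying kind. The function name is made up for illustration.

```python
def compatible_predator_cycles(prey_cycle):
    """Return predator cycle lengths (in years) that divide the prey cycle.

    Toy assumption: a predator peaks in every prey emergence year only if its
    cycle length is a factor of the prey's. Cycle length 1 is skipped because
    that is the smoothly varying predator the prey already outruns.
    """
    return [p for p in range(2, prey_cycle + 1) if prey_cycle % p == 0]


if __name__ == "__main__":
    # Compare candidate prey cycle lengths around the cicadas' 13 and 17.
    for t in range(12, 19):
        print(t, compatible_predator_cycles(t))
```

In this toy model 13 and 17 each leave only one compatible predator cycle (their own length), while 12 and 18 leave five each.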

Wed, 19 May 2004

Argument In Brief

  1. Microsoft's C#/CLI licensing people, at high levels, are aware of us.
  2. Microsoft can choose to do damaging things in the current C#/CLI licensing ambiguity.
  3. Microsoft considers the free software / Linux community to be a major competitive threat
  4. Microsoft does not "compete" gently
  5. 1 + 2 + 3 + 4 = ?

The word pile amassed below defends points (1) and, in particular, (2). I take points (3) and (4) as given. I leave point (5) an exercise for the reader. ;-)

Stupid Disclaimer

Since I'm not a lawyer, I don't know if these disclaimers are important. But given the nature of the topic, I'll play it safe and write one. I'm not a lawyer, and this ain't legal advice, it's just a dump of my current thinking on an issue. It does not represent my employer's opinion. It may represent my cat's opinion, but only on the second Tuesday of summer months.

Restatement of the Issue

Miguel has repeatedly stated that the patents necessary to implement the standards ECMA-334 (C#) and ECMA-335 (CLI) are available from Microsoft "RAND + Royalty Free". This seems like an effective open patent grant and encouraged me initially that we could do Mono. I really like Mono. It's terrific technically, and I'd love to be able to use it. But two problems have come up on further consideration over the past couple months:

  1. I've not seen an official statement by Microsoft that will let me trust the royalty free assertion. I think we are remiss if we do not assume Microsoft is looking for ways to, quite frankly, screw us. So unless there is a statement from Microsoft that they will have to stick to in a court, I feel (at the very least) uncomfortable.
  2. "RAND + royalty free" can still seriously screw Free Software. I think this is more important than the first point. Even with RAND + royalty free you still have to execute a license agreement with Microsoft, and license agreements can stipulate things that are RAND from a corporation perspective but still screw over Free Software. Also, there is evidence that key Microsoft people are already aware of (or planned?) incompatibilities between the licensing scheme for C#/CLI and, at least, the GPL. The eye of Sauron is upon us. RAND + royalty free is very different from a patent grant.

In short, we are in an adversarial situation. Microsoft does not want us to succeed. Thus we cannot trust Microsoft, even if we'd like to, and must consider Mono based upon the question "What is the worst thing MS can reasonably do?". We can only trust Mono if we are convinced Microsoft doesn't have weasel room. The current situation appears, to me, to have lots of weasel room. The technical merits of Mono are basically irrelevant if it's a trojan horse in the long term.

The Horror Story

So here's the obligatory horror story based upon what I see as our current course. Actually, I don't think this is taken to extremes at all. The GNOME actions look to me like the path we are currently on, and the Microsoft actions are not out of character, and look legally tenable based on what I know today. Microsoft can choose to not exercise these actions, but they will have the possibility (and will be more likely to the more successful the Linux desktop is).

  • Act 1 - Novell hackers continue to push Mono. Novell hackers code most new independent programs/functionality in Mono and gradually start writing extensions to software like Evolution in Mono. Evolution's core continues to remain Mono free, but if you want features X, Y, and Z you have to use Mono. A few GNOME hackers write apps in Mono, some as toys, and perhaps a couple more serious. Red Hat hackers complain. Some try to weakly push Java and some stick with working in C & Python. Sun makes noise, and does their own thing, starts some wacky projects, tries to push Java with OpenOffice.org, and is generally ineffectual.
  • Act 2 - As the number of Mono-only features grows, Red Hat's unwillingness to ship Mono begins affecting sales. Novell holds a competitive advantage (self-inflicted by Red Hat) because Red Hat-written features can be shipped by SuSE, but Novell written features require Mono. A couple years down the line, Red Hat caves and begins shipping Mono. Evolution or some other major GNOME application begins to convert their core to Mono. Maybe a couple do. GNOME starts to move toward Mono.

So far, no real problems. We've got a better technical infrastructure, and new features are developed more quickly. There are some road bumps and schedule slippage as major GNOME apps (or core) begin to convert pieces more aggressively to Mono. There might be a loss of focus on user features for a while, such as happened with GNOME 2, but it won't be terribly bad, and the gains will be substantial.

  • Act 3 - It's been 4 years. Desktop Linux has made a large impact on the market, and Microsoft is even more determined. Large pieces of GNOME are written in Mono, and other parts of the Linux stack are considering it. Some may already be using it. Microsoft starts gently nudging companies, reminding them that they are required to license the C#/CLI patents. Novell already has a license so it can distribute Mono, and Red Hat is in the process of finishing the agreement with MS. (As an aside, notice that this doesn't totally screw over corporations, so beware treating their willingness or unwillingness to use Mono as a useful indicator of whether Mono is safe.)
  • Act 4 - Eventually Microsoft starts dropping barbs, saying things in the press, etc reminding people that to distribute C#/CLI implementations you need a license from Microsoft. It slowly works up to the point that they've made it very clear that individual contributors not working for their corporation etc all need to execute license agreements with Microsoft. In the best case, these can be done by individuals, in the worst case, RAND excludes license agreements that are "too small". In either case, people have to work with Microsoft to get a license (who stalls and takes a long time) and agree to terms that include restrictions on sub-licensing. Microsoft uses other license features to exert leverage in irritating ways. In the worst case (and this is unlikely for MS PR reasons) Microsoft actually drops the royalty free bit.

At the end of the day, big chunks of GNOME are based upon technology that is substantially encumbered. Microsoft has used the tactic of allowing technically illegal behavior and only later coming down to exert control / extract money in the past. For example, from an article by Dave Malcolm, "Chris Williams, Microsoft's director of product development, explained his attitude to software piracy in the Far East: 'We're just flooding the market with copies... The goal is... that when people actually end up having to buy software, they [will] already know our software and it's the one they will have to buy when the laws get passed. We're basically getting market share. As soon as we start to get a return on that investment, it will be humongous'."

From a paranoid conspiracy theory perspective, the current ambiguity affords Microsoft the most future possibility. I don't consider it ridiculous that it could even be an intentional trojan horse (of course, it's dangerous whether intentional or not). If they came out and declared C#/CLI unencumbered in a satisfactory way, we would adopt Mono and life would be good. If they came out and gave the license terms which were in fact damaging, we would not adopt Mono and life would be OK. By providing just enough hooks to make those of us who really like the technology ignore the danger, but without providing details or statements that stand up in court, we buy into Mono without Microsoft having to give up a useful hold over us.

Can We Trust it will Always be Royalty Free?

The number of online MS-affiliated official-seeming sources (that I can find) suggesting that Microsoft will offer necessary patent licenses under royalty free terms is very small. I do not doubt that Microsoft will offer royalty free licenses at the present date. However, I can't find strong evidence that would legally lock them in (serving as promissory estoppel, or something like that) to providing the patent licenses royalty free in the future.

The Mono FAQ links to a posting to the "dotnet-sscli" mailing list from Jim Miller, one of the CLR architects. The relevant sentence (a fifth of the message!) is:

"But Microsoft (and our co-sponsors, Intel and Hewlett-Packard) went further and have agreed that our patents essential to implementing C# and CLI will be available on a "royalty-free and otherwise RAND" basis for this purpose."

The message contains almost no detail. Further, and perhaps more importantly, it doesn't seem to involve Dr. Miller representing Microsoft in an official capacity. In fact, in the first sentence he says "...I'd like to explain why I've never felt [emphasis mine] the two are in conflict." To me the whole message is premised as being a personal opinion. It is not Microsoft's official promise to provide "royalty-free and otherwise RAND". This could probably serve as a bit of evidence that Microsoft was presenting its licensing terms as royalty free. It's not the smoking gun, and probably won't serve as promissory estoppel. In short, this doesn't seem like the sort of evidence we should trust to protect us as a Free Software project.

Miguel wrote in a message to desktop-devel, "But lets not waste our time on this discussion on the mailing list, for legal matters, you should get legal counsel. Have your lawyers engage Microsoft on this topic, that is the only way of getting a solid answer."

It's not enough that Red Hat or Novell's lawyers can call Microsoft and be told "we will license this to you royalty free". As a Free Software project, I want legal weight with public accountability to hold Microsoft to royalty free, not "call Microsoft and they'll tell you". That means some sort of public (web, preferably) legally binding page that says: we will offer the technology to anyone on these terms. Given the need to still execute a license, knowing it's royalty free isn't enough. I think we need a public statement as to the terms of the license itself.

The other source of semi-official Microsoft statement about being royalty free, that I've found, is a ZDNet article from 2002 by David Berlind. He talks to Michele Herman, "Microsoft's director of intellectual property". She states that Microsoft "...will be offering a conventional non-royalty non-fee RAND license". This is a pretty good source, and if it was on an official Microsoft PR website, I would agree Microsoft is probably locked in to royalty free. I'm a little more dubious about it coming from a magazine article, esp. given how historically off the mark I've found magazine articles to be about technology. Nonetheless, I do find it more reassuring than Dr. Miller's message. I don't know if this would hold up in a court, but it's a lot closer.

Now the message from Dr. Miller does suggest that Microsoft (and Intel and Hewlett Packard) have made an official agreement somewhere to provide the patents "royalty-free and otherwise RAND" (perhaps on some ECMA form somewhere???). I would absolutely love to see that in some sort of official form! I've looked, but it doesn't seem to be available online. If somebody has a solid source (online or otherwise), on a Microsoft web page, some sort of ECMA statement or form or assurance, or in a direct verifiable form from a Microsoft spokesperson (not processed through some reporter), please email me.

RAND + Royalty Free Isn't Enough

ECMA pretty clearly requires companies to agree to license patents under RAND (Reasonable & Non-Discriminatory) terms. If you don't, you get booted out of ECMA. ECMA explicitly does not define what is and what is not RAND license terms. If I were a corporation, I would still feel pretty safe here, because I believe there is sufficient legal precedent defining what is reasonable and what is non-discriminatory (in the context of dealing with licensing to other companies). I am convinced Microsoft is going to provide licenses to C#/CLI patents under terms that corporations will find perfectly acceptable. So unlike the "royalty free" part where things look murky from what I know now, it looks like Microsoft is locked into "RAND".

Unfortunately, what is reasonable and non-discriminatory toward a corporation may not prove particularly reasonable for, and may discriminate against, Free Software. To my knowledge, there's no precedent suggesting that you have to accommodate Free Software to avoid being discriminatory. RAND, as I understand it, is clearly premised for a corporate context.

So let's assume that Microsoft was locked into "royalty free", and will provide "RAND + Royalty Free" license terms on C#/CLI to anyone, anywhere, for all time. What can MS do to Free Software now? RAND + Royalty Free still means you have to get a license. Licenses can stipulate things. And that's where the problems come in. Big problems. Compared to this, the question of Microsoft's commitment to "royalty free" seems like small fry.

Having to get a license at all is a major burden for hobbyists, individual contributors, and even small companies. At the very least, it's irritating. At worst, if acquiring a license takes a bunch of paperwork and two months, it will very effectively deter (legal) individual contribution. Interestingly, in the aforementioned ZDNet article, Herman (MS director of IP) provides a number of quotes about what the royalty free + RAND license might stipulate in the case of C#/CLI:

  • "Reciprocity": Herman says this means, "This is where I say I will license a royalty-free license to my essential patents, and in return I expect you to license your essential patents to me on an royalty-free basis." Now GNOME doesn't have any patents to license back to MS, but what this does suggest to me is that getting this license entails a real legal agreement where parties negotiate back and forth. It's not some shrinkwrap "sign on the dotted line" deal where MS is prepared to rubber stamp anyone who wants a license. You still gotta play with Microsoft, even if RAND means they have to play nice with other companies.
  • "Defensive suspension": They have the right to revoke the license if you sue them. Pretty standard, though I'd still rather not have to agree to it.
  • "Field of use limitation": you only get the patent license for implementing the standard. Not a huge irritation practically, though it is GPL incompatible.
  • "Sub-licensing prohibition": you can't transfer the license, or use somebody else's license. This is the major practical problem. The sub-licensing raises all sorts of issues as to who has to get a license when mixed with free software. Do you only need a license for distribution (ala GPL)? Or do you need a license for each user (ala MP3)? If Microsoft wanted to really screw Free Software, could they require "per user licensing" (albeit royalty free)? In either case, each person, org, and company that wants to redistribute the software will probably have to license directly from MS (and given the indications from reciprocation, this probably isn't a totally trivial process)

Herman (remember, MS director of IP) explicitly says, "the field of use (...) and the prohibition on sub-licensing are inconsistent with the requirements of Sec. 7 of the GPL. Sec. 7 of the GPL says that if you do not have the rights to distribute the code as required under the GPL then you do not have the right to distribute at all. The GPL says you must have the rights to sublicense and to freely modify outside the field of use limitation." The GPL incompatibility presumably isn't a big problem for Mono since (I think) it's under an X style license, and GPL'd apps can still run atop it. However, this underscores that Microsoft knows full well that their particular terms have interactions with free software. Given the potential for sub-licensing to wreak havoc (as outlined above), I'm very worried that we, the free software community, are not flying "under the radar".

Until I read that quote, I thought there was basically no chance this was an intentional trojan horse, that this was all just dangerous possibility. I'm not so sure anymore.

In conclusion, I refer you back to my opening argument.

Thu, 08 Apr 2004

So I got a steering wheel and pedals that are compatible with Linux for $25. No force feedback, but makes TORCS a lot more fun.

Fri, 12 Mar 2004

Usability testing (perhaps more aptly called "learnability testing") is all the rage. If you look on the web for information on usability you'll be bombarded with pages from everybody from Jakob Nielsen to Microsoft performing, advocating, and advising on usability testing. It's unsurprising that people who have been educated about HCI primarily by reading web sites (and books recommended on those web sites) equate usability with usability testing. Don't get me wrong, usability testing is a useful tool. But in the context of software design it's only one of many techniques, isn't applicable in many situations, and even when it is applicable is often not the most effective.

  • Why is usability testing lauded all over the internet? The most visible and growing area of HCI is web site usability, because it has received broader corporate adoption than applying usability to other things (e.g. software). In other words: most usability discussed on the internet today is in the context of web page usability, and web page usability is profoundly improved by usability testing. Thus it is not surprising that much usability discussed on the internet today deals with usability testing.

    Desktop software usually presents a substantially different problem space from web pages. Compared to each other, desktop software represents more complex and varied operations where long term usability is crucial, whereas web sites represent a simple operation (very similar to 100 other websites users have used) where "walk up and use perfectly" is crucial. Design of infrequently used software, like tax software, is much more similar to web site design. One simple example... In most web pages, learnability is paramount: if on the first time visiting a web site users don't get what they want almost instantly and without making mistakes they will just leave. Learnability is the single most important aspect of web page design, and usability tests (aka learnability tests) do a marvelous job at finding learning problems. In a file open dialog learnability is still important, but how convenient the dialog is to use after the 30th use is more important.

  • A good designer will get you much farther than a bad design that's gone through lots of testing. (A good design that had testing applied to it is even better, but more comments on this later.) Usability testing tends to see the trees instead of the forest. You tend to figure out "that button's label is confusing" not "movie and music players represent fundamentally different use cases". Because of this, usability testing tends to get stuck on local maxima rather than moving toward global optimization. You get all the rough edges sanded, but the product is still not very good at the high level. Microsoft is a poster child for this principle: they spend more money on usability than anyone else (by far), but they tend to spend it post-development (or at least late in development). It's not an efficient use of resources, and even after many iterations (even over multiple versions) the software often still sucks. A good designer will also predict and address a strong majority of "that button's label is confusing" type issues, so if you do perform usability testing you'll be starting with 3 problems to find instead of 30. That's especially important because a single usability test can only find several of the most serious issues: you can't find the smaller issues until the biggies have been fixed. In summary: with a designer you're a lot more likely to end up optimizing toward a global maximum rather than a local maximum, AND if you do testing it will require far less usability testing to get the little kinks out.

  • Usability testing is not the best technique for figuring out the big picture. Sometimes you will get an "aha" experience triggered by watching people using your software in a usability test, but typically you can get the same experience by watching people using your competitor's software too. Also a lot of these broad observations are contextual, they require an understanding of goals and how products fit into people's lives that is absent in typical usability tests. Ethnographic research is typically a much more rewarding technique for gaining this sort of insight.

  • Producing a good design requires more art than method. I think a lot of people are more comfortable with usability testing because it seems like a science. It's methodical, it produces numbers, it's verifiable, etc. Many designers advocate usability testing less because it improves the design, and more because it's a useful tool for convincing reluctant engineers that they need to listen: usability testing sounds all scientific. Usability testing can be a very useful technique for trying to get improvements implemented in a "design hostile environment". This is part of why I pushed/did more usability testing early on in GNOME usability. Companies would love it if there were a magic series of steps you could follow to produce genuine guaranteed ultra usable software. Alas, just like programming, there isn't. A creative insightful informed human designing the software will do much better than any method.

  • Usability tests can't, in general, be used to find out "which interface is better". I mention this because people periodically propose a usability test to resolve a dispute over which way to do things is right. Firstly, you'll only be comparing the learnability. There are many other important factors that will be totally ignored by this. Secondly, usability tests usually don't contain a sufficiently large sample of users to allow rigorous comparison. Sure, if on interface A 10 people used it without trouble, and on interface B 10 people used it and 40 serious problems were reported you can confidently say that interface A was way more learnable (and at these sort of extremes you can probably even assert it's much better overall) than interface B. But it's rarely like that.

    Example: We test interface A on 10 people and we find one problem that affects 8 of the people, but only causes serious problems for 2 people, and 3 serious problems that affect one person each. We test interface B on 10 people and we find one serious problem that affects 3 people, another serious problem that affects 2 people, and 3 serious problems that affect one person each. Which interface is better? It's a little harder to tell. So let's say we argue it out and agree that interface A is better on usability tests. But we've only agreed that interface A is more learnable! Let's say our designer asserts interface B promotes a more useful conceptual model, and that conceptual model is more important than learnability here. How do we weigh this evidence against the usability test? We're a little better off than we were before the test, but not a lot, because we still have to weigh the majority of evidence that's not directly comparable. If we always accept "hard data!" as being the final authority (which people often, somewhat erroneously, do in cases of uncertainty), even when the data only covers a subset of the problem consideration, then we are worse off than before the test.
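To give a feel for the sample size problem, here's a small sketch that runs Fisher's exact test on counts of users who hit a serious problem on each interface. The choice of test is mine (any small-sample test would make the same point), the counts are hypothetical, and the function name is made up for illustration.

```python
from math import comb


def fisher_exact_two_sided(hit_a, n_a, hit_b, n_b):
    """Two-sided Fisher exact p-value for 'hit_a of n_a' vs 'hit_b of n_b'."""
    total_hits = hit_a + hit_b
    total = n_a + n_b

    def weight(k):
        # Number of ways k of the total hits land in group A (unnormalized).
        return comb(n_a, k) * comb(n_b, total_hits - k)

    lowest = max(0, total_hits - n_b)
    highest = min(n_a, total_hits)
    observed = weight(hit_a)
    # Sum every outcome that is no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(total, total_hits)


if __name__ == "__main__":
    # Hypothetical: 2 of 10 users hit a serious problem on A, 3 of 10 on B.
    print(fisher_exact_two_sided(2, 10, 3, 10))
    # Hypothetical extreme: nobody struggles on A, everybody struggles on B.
    print(fisher_exact_two_sided(0, 10, 10, 10))
```

With 10 users per interface, a 2 vs. 3 split gives a p-value of 1.0 under this test, i.e. no evidence of a difference at all. Only the extreme case comes out as clearly significant.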

So am I saying that usability testing is bad or doesn't improve software? No! If you take a good design, usability test it to learn about major problems, and use that data and experience to improve your design (remembering that design is still about compromise, and sometimes you compromise learnability for other things)... you will end up with a better design. Every design, even very good ones, that a designer pulls out of their head has some mistakes and problems. Usability testing will find many of them.

So why don't I advocate usability testing everything? If you don't have oodles of usability people, up front design by a good designer provides a lot more bang for buck than using that same designer to do usability tests. You get diminishing returns (in terms of average seriousness of problems discovered) as you do more and more fine grained tests. It's all about tradeoffs: Given n people hours across q interface elements (assuming all people involved were equally skilled at testing and design, which is obviously untrue) what is the optimum ratio of hours spent on design vs. hours spent on testing? For small numbers of people hours across large numbers of interface elements, I believe in shotgun testing, and spending the rest of the time on design. Shotgun testing is testing the interface in huge chunks, typically by taking several large high-level tasks that span many interface aspects and observing people trying to perform them.

An example high-level task might be to give somebody a fresh desktop and say: "Here's a digital camera, an e-mail address, and a computer. Take a picture with this camera and e-mail it to this address". You aim at a huge swath of the desktop and *BLAM* you find its top 10 usability problems.

Anyway, like practically everything I write this is already too long, but I have a million more things to say. Oh well ;-)

Tue, 17 Feb 2004

GNOME's desktop-devel-list today is just what gnome-hackers list used to be. It's not like this is a new problem. Lists start out good, but then too many people get on them, so we eventually restrict who can be on the list, and then some people think we are too elitist and start a new list. Which is non-elitist and has a high signal to noise ratio... until the effects of non-elitism creep in, and we have these problems all over again.

  1. Having a central desktop list seems like a thing that happens naturally, and is also the list I'm most likely to read. Thus I personally at least consider it good.
  2. Restricting access removes much of the cluelessness, but at the cost of greater administrative burden, and locking out valuable potential contributors
  3. Restricting access does not typically make lists regain their old "high signal to noise ratio" status. For example, gnome-hackers was periodically prone to extended technical discussions (by clueful people) that became tiresome for most people on the list and ideally would have jumped list. They were often good discussions to have, but not everybody needed to be party to them.
  4. Fragmented lists tend to be ignored, even by the people they are most relevant to (such as the relevant maintainers, often)

In short, it is best to have fewer lists, but we need to alleviate the problems that make a few central lists occasionally painful.

It seems that the real problem is not the variety of the threads, but that some threads don't die which we'd really rather not have on the list-that-everyone-reads (or at least, used to read ;-). Flags and the recent release name discussions come to mind. What if there was a way to create quick temporary break out discussion lists? Something that required no admin maintenance. That way rather than fragmenting general discussion, we could create immediate outlets for in-depth (and sometimes important, other times not) discussions that most people don't want to read (or in my case Mark As Read).

Rather than fragmenting lists by "general topic", which seems not to work, why don't we fragment list traffic on a per discussion basis? Very few discussions will need this, but for the few that do, we can avoid destroying the public list's readability for the week+ the discussion takes to run its course.

Say we have 10 or so responsible people who can create a breakout discussion "list". There's a little web form one of these people can use to break-out a discussion. The person enters the subject of the discussion into the web form, and an "End of Discussion" ultimatum/message gets automatically posted to ddl. In this ultimatum is a link. When the message is received, clicking on the link pops up a form where you can enter your e-mail address, and *poof* you're in on the discussion in the breakout list. Every post to the breakout list has a link appended for leaving the list. Basically people interested in the conversation can keep at it, but the conversation moves off-list in a convenient manner. Some considerations for the breakout lists:

  • We don't try to do any security or passwords or confirmation e-mails for adding/removing people from lists, because these things are supposed to be cheap, dirty, and ephemeral. They need to have a ridiculously low barrier of entry.
  • We don't want too many people who can create breakout lists, or any discussion that generates a dozen messages will get somebody trying to break out the discussion. When breakouts happen too often, people will start ignoring the breakout ultimatum and will keep posting on d-d-l, destroying the efficacy of the technique when it is really needed. On the other hand, we need enough people who can create lists that at least a few of them are active on the list every day. That way it doesn't become some onerous task for which an "admin" has to be tracked down, and coaxed to waste their precious time on (for example, its not like trying to get somebody to do CVS surgery for you).
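Here's a very rough sketch of how little machinery the breakout idea would need. Everything in it (the class name, the method names, the lists.example.org URLs) is made up for illustration. It keeps state in memory and, per the considerations above, deliberately does no passwords or confirmation e-mails.

```python
class BreakoutLists:
    """In-memory sketch of ephemeral breakout discussion lists (hypothetical)."""

    def __init__(self, creators):
        self.creators = set(creators)   # the ~10 responsible people
        self.members = {}               # list id -> set of e-mail addresses
        self.subjects = {}              # list id -> discussion subject

    def break_out(self, creator, subject):
        """Create a breakout list and return the announcement to post."""
        if creator not in self.creators:
            raise PermissionError(f"{creator} may not create breakout lists")
        list_id = len(self.members) + 1
        self.members[list_id] = set()
        self.subjects[list_id] = subject
        join_url = f"https://lists.example.org/breakout/{list_id}/join"
        return (f"End of Discussion: '{subject}' is moving off this list.\n"
                f"To keep following it, join here: {join_url}")

    def join(self, list_id, email):
        # No confirmation step on purpose: cheap, dirty, and ephemeral.
        self.members[list_id].add(email)

    def leave(self, list_id, email):
        self.members[list_id].discard(email)

    def footer(self, list_id):
        """Text appended to every breakout post so leaving is one click."""
        return ("Leave this discussion: "
                f"https://lists.example.org/breakout/{list_id}/leave")
```

The only access control is on who can break out a discussion. Joining and leaving stay wide open, which is the point.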

It is interesting to compare the "list problem" to how discussions work in the "real world". In the real world we would have serious trouble if everybody had to listen to every discussion involving more than four participants. The fragmented lists suggestion is somewhat akin to having 25 separate rooms, each devoted to a particular topic. This is a sort of weird division, and people are probably going to drift into larger rooms (or have off topic conversations). Naturally people control conversation and topic interest pretty well by drifting in and out of groups. Basically, breakout discussion lists are a way to try and accommodate that sort of ephemeral shift.

Mon, 16 Feb 2004

January 22, 1984: the Apple Macintosh is unleashed on the world. The world blinks and keeps on turning.

The release of the Macintosh wasn't the revolution, it was a symbol of the revolution. It wasn't merely the introduction of an "insanely great" product line but the debutante ball of the process that birthed it. And at the heart of that process (human-centered design) was a paradigm shift. The question was no longer "What will this computer's specs be?" but "What will people do with this product?". That question is as relevant (and almost as frequently overlooked) today as it was twenty years ago. The importance of the revolution was less in Windows Icons Menus and Pointer and more in approaching product development from the right direction. Until widespread development and design in the computer industry is focused on a question like that, the Macintosh revolution is far from over.


The Star desktop, circa 1981

There is widespread disagreement as to when and where this revolution began, but it is not contentious that the ideas took root in the feracious ground of Xerox PARC in the 70s. The end result was the Xerox 8010 (aka Star) desktop, released in 1981. To a large extent the Star interface is extant in modern desktops, but this belies the importance of the Star: it was the result of human-centered design. Engineers and researchers at Xerox tried to create a computer that could be used to "do people things" rather than just crunch numbers. Focus was not on specs and technology but on what Star could accomplish.


The Alto's "Executive", circa mid 1970s

It is interesting to compare the Star interface with the interface of the Executive program from the equally famous Xerox Alto (from the mid 70s). The Alto was a technical marvel, with a bitmapped display, windows, a mouse, and ethernet. But while the Star really adds nothing to this impressive list of technology, the difference between the two, in terms of user experience, is like night and day. Technological invention can enable real improvement, but it's not enough (usually it's not even necessary). Anyway, enough historical meandering. The story of the Macintosh, Star and Alto is very interesting, and there are a lot of period documents dealing with that subject... maybe I'll post a list of links another day. But back to my agenda: :-)

At best I think most people ask "What could people do with this computer?". That's a very different question from "What will people do with this computer?"... there are so many nifty features that people could use if they pushed themselves, but that have a high enough barrier to entry that people don't bother.

Example: I have a nice thermostat in my apartment. It's fairly well designed and has quick push buttons for "Daytime", "Night" and "Vacation". It was even straightforward to set these to my preferred temperatures for "In the apartment, awake" and "Out of the apartment or asleep" (I haven't bothered with the vacation button). Now I have noticed that I don't like to get out of bed in the morning because it is sort of cold. In fact, sometimes I'll lie in bed for 30+ minutes because it's cold, which is a big waste of time (I'm not very rational when I'm waking up). I have noticed that my thermostat supports scheduling changes between day and night temperature. I even looked at the instructions beneath the faceplate, and it looks like it'd be fairly easy to program. But I haven't done it. The device is usable in the sense that if I wanted to, I could program it, and probably get it right on the first or second try. It's not hard to use. But it's a little too inconvenient, because I'd have to special case my weekend schedule, and I'd have to set several different times using the fairly slow "up", "down", "next item" interface for setting time (as on most alarm clocks, etc.). The point is, it's not hard to figure out, but it's still too much hassle. So while I could program the thermostat, I won't. There's always something that seems better to do with my time, and I can't be bothered (even though rationally I know it'd be better overall if I just programmed the silly thing).

The Macintosh revolution, at least how I see it, was about conceiving your (computer related) product in terms of what people will do with it. Sometimes we need to "get back to the basics"...

Wed, 28 Jan 2004

 

The GEGL!!!
 

Tue, 27 Jan 2004

Lesson: Do not leave soda cans in the car. It might seem obvious to those of you accustomed to cold climates, but I finally realized I have to think of my car as a freezer. So I left 12 cans of Mountain Dew in my car right behind the driver's seat. What happened? Of course about half the cans exploded when they froze. So that's not good... unfortunately I didn't notice this for three weeks. Well, turns out one of these days must have gotten above freezing, because the Mountain Dew in the exploded cans melted. Of course by now it's all frozen into my carpet. I hate this place.

Lesson: Microfiber pants really help. Thanks to a suggestion from Carl-Christian Salvesen in Norway I've swapped out jeans for microfiber slacks if I'm walking around in the cold. They're much lighter so I sort of assumed they wouldn't work as well as jeans. Not so, they appear to trap a lot more heat. When it's really windy I have windbreaker-pant things I can pull over them.

Lesson: If you want to buy gloves and scarves you have to do it before winter. People in this place exhibit extraordinary wishful thinking during the winter, it seems. The stores have already had their "get rid of everything" winter clearance sales. Sears had not a single scarf or pair of gloves left. The stores are filled with people bundled up to the rafters in coats, gloves, scarves, etc. buying... swim suits and light skirts. It's ridiculous. I mean, I know you start selling before the season starts, but it's not even February yet! What are you supposed to do if you lose your gloves? (anyway, I finally found some nice black leather gloves, but I tried a bunch of stores that used to have them but don'nomo')

Wed, 21 Jan 2004

Today is a bad day for banking.

So this morning (not 20 minutes ago) I pulled up to a bank's drive-up ATM with the intent of withdrawing $40. I ended up with $400. I also managed to lose my bank card.

Despite the fact that most ATMs only handle money in multiples of $20, they still require you to enter the "cents". So asking for $40 entails the button sequence 4-0-0-0. I almost always withdraw $40, so I perform this series of presses without a lot of higher brain involvement. Unfortunately, this ATM fixed a "bug" in the way 95% of ATMs work: it only lets you enter whole dollar values.

So I pressed 4-0-0.... and caught myself before pressing the final 0 based on the feedback on screen (actually, if I'd pressed the final 0 things would have turned out better because ATMs won't give you $4000 in a single transaction).
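The difference between the two entry conventions is tiny in code, which is maybe why nobody thinks about it. A minimal sketch (the function names are mine, not anything an actual ATM runs): the same key presses are read either with implied cents or as whole dollars.

```python
def cents_from_keys(keys, implied_cents):
    """Interpret a string of digit presses as an amount in cents.

    implied_cents=True:  the last two digits are cents (the common convention).
    implied_cents=False: every digit is a whole dollar (this ATM's convention).
    """
    value = int(keys)
    return value if implied_cents else value * 100


def fmt(cents):
    # Render an integer number of cents as dollars and cents.
    return f"${cents // 100}.{cents % 100:02d}"


# The habitual sequence on a typical ATM:
print(fmt(cents_from_keys("4000", implied_cents=True)))   # $40.00
# The first three presses of that habit on a whole-dollar ATM:
print(fmt(cents_from_keys("400", implied_cents=False)))   # $400.00
```

Same muscle memory, a tenfold difference in cash.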

Most ATMs have on screen commands and buttons along the sides of the screen that are supposed to line up with the commands (press the button and it executes the command "next" to it). The problem is they often have the buttons far enough away from the edge of the screen, and the buttons are raised. The net effect is that at different heights the buttons line up differently. Additionally, even when there are only two options, they tend to put them on buttons that are right next to each other. Even given the flawed physical design, the chance of error could be dramatically reduced if the options were always kept as far from each other as possible.

Well, I was flustered, because it's disturbing to know that one button press will dump $400 in cash on you: I wanted that $400 off the screen pronto! In my haste I did not account for the button line up (my car is really small and hence low... buttons must have been designed for an SUV), and pressed the accept option instead of cancel. 30 seconds later I'm flush with benjamins.

So my first reaction is "put this cash somewhere safe", so I find a place to stow it temporarily. Then I glance over and see the receipt and grab it, because I sure want a record of this transaction until I count the loot. Then I realize that because I'm getting an apt soon, I really want the money in the bank ASAP, so I back up and grab a deposit envelope. What did I forget? Oh yes, I forgot to take my card.

So I pull away from the ATM into the bank's parking lot to fill out the deposit envelope, stuff the cash into it, and head back to the machine. I fumble around my wallet for my card. Can't find it. Then I realize that I might have put the card loose on the seat next to me (which I sometimes do if I've already stowed my wallet in my pocket). So I drive back to the parking lot and dismantle my car looking for the card. Then it hits me :-(

So I head into the bank, and the nice man at the desk gets the manager and they go to check the ATM. Oh sorry, your card was with another bank, so we can't give it back to you. Unfortunately my credit union has no branches within a thousand miles of here, so it's going to take time to get a new card.

$(#*&(&*(*&!!! On the upside, I got $400 out of the account before losing my card, and I guess I can still write checks from that account to get an apt, so it's not the end of the world.

I can't believe I did this because I've always been grumpy and conscious of the button-line-up usability problem present in many ATMs. Good ATMs have the buttons close to the screen and at the same height as the screen so it all lines up no matter what angle you look at it from...or they're touch screens (which has other downsides at times, but overall I think is an improvement).

Fri, 16 Jan 2004

Well.... I have learned some valuable lessons. This is my first experience with True Cold[TM]. So last night I decided to walk 2 miles from my friend's place back to the bed & breakfast... at 1 am. At -30F (with windchill). It was... very cold. Lesson one was that jeans cannot, in fact, be worn in any weather. By the time I got home my legs were very, very cold. Lesson two is that a scarf is a worthwhile thing to have (at least, I'm guessing it would be) because by the time I got home sans scarf I could no longer feel my nose. Lesson three is that you look funny the next day after exposing yourself to cold. My skin is all red and is flaking.

On the upside, my jacket held up well, and I learned that socks can actually work better than gloves because they keep all your fingers together (thanks Josh!).

Tue, 13 Jan 2004

Still trying to develop a rhythm for what I'll be doing. It's weird, but whereas I know how to change whatever in GNOME (who to talk to, who to avoid, etc.)... when it comes to messing with things outside GNOME in Red Hat I really have no clue where to go. So I'm slowly figuring that sort of stuff out.

In other news, been looking for an apartment. I made the mistake of agreeing to stay a month at the Bed and Breakfast. The downside is that they really don't have enough parking and I'm double parked in their driveway, which means I have to get up early to make sure that my car doesn't hem somebody else's in. Oh well.

Currently I'm planning to live somewhere "close to Red Hat". Have looked at a number of apartments... currently leaning toward living in Nashua, NH though that is notably farther from Boston than, say, Kennsington apartments (which is absolutely stellar... except its $300/mo more than I want to pay).

Mon, 05 Jan 2004

Yes, my weblog has grown silent. Yes, important messages clamour for my attention amidst the congestion that is my inbox. What has been going on you ask?

Last week I spent on the road driving ~12 hours a day. My brother foolishly agreed to go along and had to put up with a week of me whinging about his speeding and swerving (sorry about that Drew!). I think I would have gone nuts without the company. Along the way I stayed with a good HS friend (Kenny Martens) who I hadn't seen in 4 years, a close Stanford friend who I'd never gotten a chance to say goodbye to (Jamie Fitz), and another good Stanford friend who graduated early and went east (Brian Shieh).

We spent two nights sleeping in rest areas... in AZ the temperature dropped to -15C. Drew was wise enough to sleep in the car.... I on the other hand was huddled on the cement next to a picnic table in my sleeping bag (can't stand sleeping in cars). Arrived at my destination on Friday. In a fit of compunction I flew my brother home instead of selling him to a passing ship as I had originally intended.

Where was I driving to you ask? San Francisco to Boston by way of Dallas. "Boston? Boston?!?" you exclaim, "Shan't you perish in a blizzard of ice and bad driving?" Yes! But sacrifices must be made. "But why Boston???"

Funny you should ask. That brings me to the next tidbit of news. As of today I'm now working at Red Hat in Westford, MA ("Boston", MA for some definitions of Boston) as an interaction designer.

The original plan was to arrive in Boston on the 1st, stay with my friend Maisy before she disappeared back to Stanford, and find a place to live by the 5th when RH threw me on a plane to Raleigh for employee orientation (which is overall inconvenient and disorienting, but c'est la vie). This plan failed. It's been considerably harder to find a place to live in the Boston area than it was in either Minnesota or the Bay Area. Most people apparently use a realtor who charges first month's rent in fees (!!!). I'm way too Scotch to throw my money at somebody who I consider to be the ultimate middleman. Anyway, current plan is to stay at a bed and breakfast when I get back from Raleigh while I figure out where I want to live. It'll also be good to get a taste of the ~40 minute commute from Cambridge area to Westford before I sign a 12 mo. lease locking me into either the boonies or into a ridiculously long commute. ;-)

Tue, 30 Dec 2003

Best design principle I've heard in a while: "What can we fit on the form?"

Mon, 22 Dec 2003

Got back from a terrific week in Los Angeles hanging out with Nancy. It was a little eclectic and frantic (perhaps too many activities), but still the best vacation I've had in a long time...and probably the last for a long while! She took me to (among other places) the Getty, which was the richest structure I have ever experienced. I thought places like that only existed in computer renderings like Myst and people's imaginations. In particular, I'm enamored with the musical rocks and the central garden; but the whole thing is an experience. Being there with Nancy magnified my enjoyment greatly, because I feel like she (more than most people I know) really appreciates simple sensory experiences like the texture of fabrics, etc.

On another note, I'll join Jeff in sucking by way of having a queue of important messages that I need to get to. As Jeff said: " Apologies to anyone who is stuck in that queue, but I have most likely marked your email as important, and will be getting to it this weekend". Except that I suck way more because I most likely won't be getting to it "this weekend". But after the dust settles in the next two/three weeks I hope you'll find I'm a sane and functional human being again (well, as sane and functional as I ever am, at least).

Fri, 05 Dec 2003

AOL has recently allowed multiple clients to be logged into AIM at the same time. Previously if you logged in from a new location the old login would be disconnected. As a wireless IM user, this change is generally welcomed: I previously maintained two separate accounts so that I could be constantly logged in from my cell phone, whilst still communicating using the rather faster keyboard when I was sitting in front of a computer.

Unfortunately, as the explanation of how the "routing behaves" makes clear, it's a bit of a hack. If there's support in the protocol... it's sure not visible now. Instead, AOL plays games server side to try and guess where to route messages. If one client is marked AWAY, then messages get routed only to the other client. It appears that if both clients are present, both get the message. What happens if both clients are marked AWAY? Need to experiment. Lilly has complained that her laptop was left logged in (I think while she went away on Thanksgiving break?), and people have been sending her messages for a while believing that she was ignoring them. Basically, the point is that instant messages are currently fairly transient... they might get to the receiver, they might not. You can't rely on them.
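The routing behavior described above can be written down as a tiny rule. This is only a sketch of what I've observed from the outside, not AOL's actual server logic; the `route` function and its input shape are made up for illustration, and the all-AWAY case is left explicitly unknown.

```python
def route(clients):
    """Guess which logged-in clients receive an incoming message.

    clients maps a client name to True if that client is marked AWAY.
    Returns a sorted list of recipients, or None when every client is
    AWAY (that case has not been observed, so no behavior is claimed).
    """
    present = sorted(name for name, away in clients.items() if not away)
    if present:
        # One AWAY client: only the other gets it. None AWAY: both get it.
        return present
    return None  # unknown: needs experimenting


print(route({"laptop": True, "phone": False}))   # ['phone']
print(route({"laptop": False, "phone": False}))  # ['laptop', 'phone']
print(route({"laptop": True, "phone": True}))    # None
```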

Unfortunately it's a lot worse than just being transient. If instant messages were reliably transient it would enable useful social behaviors. For example, if I want to get in touch with somebody quickly (say I want to borrow their rutabaga in a hurry) but I don't want to be seen as desperate or harassing, I can't safely send them a message every hour to see if they've sat at their computer. Physically "dropping by" allows for totally transient communication. If the person isn't there, they never know about it (unless you want them to, in which case you leave a note, which is pretty useful: you can select whether the "communication" is transient pretty reliably). Phones used to be transient (if you didn't connect, they never knew you called), though with caller ID and call logs that's no longer quite true.

In a perfect world, I think AIM would be more like IMAP. Messages would be stored server side, so that you could get at any messages you wanted from any client. Of course, like IMAP, you'd want to make sure it was easy for clients to "cache" the history, so they'd only grab new messages (or handle pending deletions). Conversations would persist, so that when you close a window, when you open it up again the last conversation is still there (no matter what machine you had it on). If you needed to get rid of some particularly incriminating information from the "log", you could just select it and DELETED (sorry, too much strongbad). IM would be dependable, and the quirky issues of handling multiple machines would be solved. So that handles the case of totally persistent messages.
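Here is a minimal sketch of the IMAP-ish idea, assuming a server that numbers messages with ever-increasing ids and clients that remember the highest id they have seen. The class and method names (`ConversationStore`, `Client`, `fetch_since`) are invented for this example and are not part of AIM or IMAP.

```python
class ConversationStore:
    """Server side: the single authoritative copy of a conversation."""

    def __init__(self):
        self._messages = {}   # uid -> (sender, text)
        self._deleted = set() # uids removed from the log
        self._next_uid = 1

    def append(self, sender, text):
        uid = self._next_uid
        self._next_uid += 1
        self._messages[uid] = (sender, text)
        return uid

    def delete(self, uid):
        if self._messages.pop(uid, None) is not None:
            self._deleted.add(uid)

    def fetch_since(self, last_uid):
        # Only messages the client has not seen, plus pending deletions.
        new = {u: m for u, m in self._messages.items() if u > last_uid}
        return new, set(self._deleted)


class Client:
    """Any machine: keeps a local cache and syncs incrementally."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.last_uid = 0

    def sync(self):
        new, deleted = self.store.fetch_since(self.last_uid)
        self.cache.update(new)
        for uid in deleted:
            self.cache.pop(uid, None)
        if new:
            self.last_uid = max(new)
        return len(new)
```

With something like this, closing a window loses nothing: reopening it on any machine just means another `sync()` against the same store, and a deletion on one machine shows up everywhere on the next sync.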

Allowing for transient pings in the most ideal fashion requires client cooperation. I don't know how many people really use non-AOL clients... if it's small then you can pretend you have client cooperation (just like caller ID used to be rare and unexpected). Anyway, what you'd like to happen is to have a way to send a message so it pops up on the other person's screen, stays there for 5 minutes or so, and then disappears with no trace or log entry. You could use this to send "Hey, you there?" type messages.
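On the client side, a transient ping needs little more than an expiry time and a promise not to log it. A minimal sketch, assuming a cooperating client; `TransientPing` and the 300 second default are my own illustration of the "5 minutes or so" above.

```python
import time


class TransientPing:
    """A message that is shown for a while and then vanishes, unlogged."""

    def __init__(self, sender, text, ttl_seconds=300, now=None):
        self.sender = sender
        self.text = text
        start = time.time() if now is None else now
        self.expires_at = start + ttl_seconds

    def visible(self, now=None):
        current = time.time() if now is None else now
        return current < self.expires_at


def on_screen(pings, now=None):
    # Expired pings are simply dropped; nothing is written to any history.
    return [p for p in pings if p.visible(now)]
```

The `now` parameter exists only so the behavior can be checked without waiting five minutes.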

Another feature that could be nice for making IM appropriate for more serious transactions is a "federal express" feature where people have to "sign for" a particularly important message. This is the sort of thing I don't imagine being used terribly often, but it does allow for certain sorts of more serious transactions (it's basically a step up from persistence). When a "sign-for" message arrives, the recipient sees a dialogue that says "SoAndSo has sent you a message. Clicking Read Message will send an automatic confirmation of receipt to SoAndSo."

A more aggressive variation on this theme allows the sender to include a short agreement that will be presented in the initial dialogue, with an "I Agree" button instead of "Read Message"; the basic idea being to allow short contracts prior to receipt of information. For example, "SoAndSo has sent you a message which requires a legal agreement. They stipulate: This message can not be shared with anyone, blah blah blah blah. If you agree to these terms, click I Agree to read the message (an automatic confirmation of agreement will be sent to SoAndSo). Otherwise click I Disagree."
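Both the plain "sign-for" message and the agreement variation boil down to the same thing: the body is withheld until the recipient presses the accepting button, and pressing it sends a confirmation back. A minimal sketch, with `SignForMessage` and the `outbox` list invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class SignForMessage:
    sender: str
    body: str
    agreement: str = ""  # optional short terms; empty means plain sign-for

    def prompt(self):
        """Return the dialogue text and the buttons to show."""
        if self.agreement:
            text = (
                f"{self.sender} has sent you a message which requires a legal "
                f"agreement. They stipulate: {self.agreement}"
            )
            return text, ("I Agree", "I Disagree")
        text = (
            f"{self.sender} has sent you a message. Clicking Read Message will "
            f"send an automatic confirmation of receipt to {self.sender}."
        )
        return text, ("Read Message",)

    def respond(self, button, outbox):
        """Release the body only on acceptance, and queue the confirmation."""
        accept = "I Agree" if self.agreement else "Read Message"
        if button != accept:
            return None  # body stays sealed, nothing is sent
        kind = "agreement" if self.agreement else "receipt"
        outbox.append((self.sender, f"confirmation of {kind}"))
        return self.body
```

The important property is that there is no way to get at `body` through `respond` without the confirmation also going out.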

Sun, 23 Nov 2003

Well... Just got done talking at SCALE. It just made me wish even more that I wasn't missing a day of the summit. Ah well, it's important to respect your commitments, right? Felt isolated, didn't know anybody. Talk went well, I feel, though of course I was speaking on a pretty controversial topic (or at least my premise was controversial). *shrug* Anyway, it sucked to get off the plane, find no-one there (my bad, I stupidly assumed I would be met), pay for a shuttle to the convention center... Hang around not knowing anybody for 4 hours (on 4 hrs sleep, mind you), give a stressful talk wherein half the questions were either complaints about things I can't change or age-old annoying rants (some very good comments/questions too tho), and then watch everybody leave the expo and make my way back to the airport alone....(actually, on the bus to the airport now ;-)

oh, one thing positive that came out of this was I learned how invaluable having wireless internet connectivity is for getting info in strange situations. I'd be at my wits' end now w/o a cell phone (and inet was icing on the cake).

the past is behind me.... New York Ho (hm, make that "ny bound")

Mon, 10 Nov 2003

Chema cared so much about...everything that I knew him to be involved with. I can only assume this passion extended to every aspect of his life. And it wasn't just projects, Chema really concerned himself with people. I was always really excited about any opportunity to hang out with Chema in person.

I remember one point very early on when the usability project was having real trouble. I was getting frustrated and impatient, and there was a lot of conflict on mailing lists. I guess Chema noticed too, because he popped on IRC and grabbed me to talk things over. What really struck me was he wanted to figure out not just how to make GNOME usability more effective, but also how to make sure I wasn't getting fried.

Every discussion I've had with Chema got me excited, whether about social organization of online communities, or how GST fits into the overall desktop design. That dude had the most infectious energy, and will be very tangibly missed.

So in case you haven't telepathically inferred the life details I've neglected blogging about, I'm a visiting researcher at PARC (the artist formerly known as Xerox PARC) for the time being. I know, I know: "Hey ya damn hippie, go get a real job". "Shoophlah!", I say, "Research does wonders for the soul. Oh you greenback grubbing capitalists, my academic heart sheds crocodile tears for you, ensnared by tales of pleasure, power, and fame into its endless pursuit. Just because a week of my labours costs The Man less than a day of yours doesn't mean that....Look here, I don't have time for this, I need to be working on my grant proposal."

Aaaaanyway, this is all well and good. But for the first time in a long time I'm working under s33kretive conditions wherein I can't talk about how I spend a sizable chunk of my time. What a strange symptom of the information economy (I would say "information age" save that I fear the outpouring of mob justice that would most assuredly stem from the utterance of such a cliché. Um, where was I?).

Millions of people are asked to compartmentalize a major piece of their time, of their life: their work. Knowledge of our own activities becomes intricately intertwined with the secrets of the corporation, and so itself becomes the property of the corporation. It is no longer mine to blog about, it is owned by PARC. Now it is one thing to have a few secrets... even secrets you hold for your employer's sake. But it is a strange thing indeed to not even be able to disclose the topic (let alone the details) of the cause of so much pre-occupation on your part...

So instead of ramblings, you get meta-ramblings. Ramblings about why I cannot ramble. Be content, dear child, with the bread crumbs.

Fri, 31 Oct 2003

I think you're saying I am glossing the implementation phase.

I'm saying most of GNOME and free software development culture in general are glossing the conceptualization (aka idea generation, aka planning, aka big-D Design) phase.

They are both substantial, hard and, in the context of building *good* original software, both essential.

If you skip implementation you end up with NO software. If you skip conceptualization you end up with crappy software (or a clone, which is better than crappy software, but if Linux takes over the world and can only clone, then software will stop evolving and I will shoot myself... you can't wake up one day and suddenly start inventing after a decade of copying). Unfortunately people seem to fall into two categories: either they skip conceptualization (pragmatists, Myers-Briggs S's) or they skip implementation (dreamers, Myers-Briggs N's). The people who have programmed the software stack we rely upon and are hence respected have, of course, tended to be pragmatists. The dominant culture that grew out of this has rightfully observed that crappy is better than nothing. Unfortunately, they have fallaciously extended this argument and decided that implementation is better than conceptualization.

Given a choice between one or the other, that is true. But they're not mutually exclusive. To generate software that's better than the status quo you NEED both... and both are difficult and substantial tasks that you need to work hard at.

With a big project that's composed of lots of people, you can, in theory, have both. But in ecological terms, this does not often happen. Most open source projects start as a very small group of people, and given some success they grow larger. Unfortunately, people tend to hit it off with people who are "like them" so the dreamers tend to only work with dreamers, and the pragmatists tend to only work with pragmatists. Dreamer groups always die young because of course nothing is worse than crappy. Pragmatist groups like GNOME, by virtue of observing the failures of dreamers, develop a culture that is HOSTILE toward dreamers (and the only "dreamers" who stick around as a result are those that don't have anything better to do than get shot down... namely loser dreamers... which only reinforces the feeling of the pragmatists that dreamers are a waste of air).

Havoc: "if you have an idea" to me is glossing over the developing an idea phase as badly as I gloss over the implementation phase. ;-)

The people who do the best implementation are not necessarily those who do the best ideas. In fact, they're often not. We can't expect to develop a culture of implementation relying on cloning for half a decade, and then magically once we've got market share based on our key feature (and we do agree that freedom is at the core of what we're doing... though I would go off and shoot myself if we get freedom and desktops stop making progress entirely ;-) turn on the invention spigot. These are things that need cultivating...

Wed, 29 Oct 2003

Havoc: The point is not what the Apollo program was able to do, but to contrast the Apollo program with the expenditures on "modern NASA" which is only 10% less funded (accounting for inflation) and does a lot less. It's an anecdote, the point doesn't really rise and fall on NASA, it's just meant to illustrate on a grandiose scale ;-)

I'm not arguing for haste, I'm arguing for direction. Linux 2.0 was a laughable toy, but without the direction provided by copying the features of other Unix systems, Linux 2.6 could not exist. Maybe they did it better, maybe they didn't, but in terms of core features, it's the same thing. The cool things happened in a thousand steps, but a thousand steps in various directions will not get you anywhere particularly cool.

So people are making the X server support smooth 3D graphics and in the future we'll implement the next generation UI on top of it. But we have no idea what the next generation UI is, so how do you know that supporting smooth 3D graphics is needed for it? We only know because we have seen that those are the components Apple used to make their next generation UI. Our only roadmap right now is cloning. And in UIs (perhaps unlike kernels) I believe that the cloner is pretty much doomed to greater crappiness, and to being at least a release behind.

Firstly, that there are a hundred people sitting around saying "we should do something cool" doesn't mean there are a hundred people sitting around proposing genuinely good things to do. Secondly, good ideas are NOT cheap, and ideas don't just magically happen. Making good ideas takes work, research, lots of reflection, argument, etc. As I've said before, mailing lists are great for the reality-check phase of idea generation, and terrible for the brainstorming phase.

That there are already thousands of *cough* enthusiasts suggesting ideas (or worse, suggesting that "something cool should happen") does not mean that we should avoid developing good ideas in favour of just sticking our nose to the grindstone (in the same way as the number of usability enthusiasts does not reduce the need for good UI design in GNOME). Avoiding idea generation because the unwashed masses love to engage in it is reactionary.

A straw man interpretation of what you are saying is basically: "stop worrying about ideas and get back to the REAL work". Behind almost every important and useful product was not just an idea, but ideas. And they weren't just selected from the myriad ideas floating around the world, they were developed... they were nurtured... they competed with each other... etc. This is important and real work. Knowing what you're building is pretty important, and it takes real work to figure that out. GNOME has (perhaps in reaction to avoiding being like the l0z3rz) imo strayed way too much toward the "shut up and build!" side of things. Yes, that's PART of what makes things get done, but it's not all of it.

Christian: "There needs to be something to build upon before the grand vision stage is plausible"... which interestingly enough is exactly how the last decade of NASA expenditures is justified. They say they are building "generic space tools" that could be used "to build anything". But unfortunately they've done much and accomplished relatively little (even in the tool expansion department). The blueprint comes first, the foundation comes second, and interior decorating can come third (though in reality I think this is often done between blueprint and building, it doesn't necessarily have to be).

I think people assume we are talking interior decorating. You know, you build all the real manly libraries you need to do any sort of computing work, and then you build your little GUI as wall paper atop that. The thing is that the particulars of the libraries and the base system highly constrain the interface... not just in terms of what's possible technically, but what developers will actually write (for example, the HIG is oft criticized as being ineffective because some of the HIG suggestions for controls require more lines-of-code).

HIG btw is NOT a grand vision. HIG is reflective not visionary for the most part. The several little changes we've made that are not reflective of GNOME but instead improve the status quo are always sources of controversy too.

Murray: finding a (good) direction won't be a result of figuring out the average of all the "direction vectors" that GNOME developers want to move in. This is bound to be some composite of "what windows does" and "what OS/X does". Long term goals of the sort I'm thinking of have not even been posted to gnome planet blogs, to my knowledge. A goal in the way I'm using the word is NOT "let's have a cool configuration database", that's a task (a huge one). A goal is "Let's go to the moon" or "Let's make a way for mass installations of GNOME to be remotely admin-able" (and this would really be a sub-sub-goal of a much larger picture). Most of the good goals are going to take serious thinking, I don't think people have concretely formed them yet. And yes, most of them should be human-centered "interface"/"interaction" (even if that's not GUI) goals.