Tue, 29 Jun 2004
Unfortunately my ankle was fractured pretty badly and it was important I have surgery on Wednesday. Unfortunately this precluded my flying to Norway for GUADEC on Saturday. I actually proposed to my orthopaedic surgeon that I fly to Norway on Saturday. He gave me a look that was darker than oil at midnight, and went back to what he was doing without saying anything. Some people tell me I should have interpreted this as a "sounds ok". However, he later said some things about our goal being to "reduce the chance of having arthritis in the ankle for the rest of your life". That scared me into behaving.
There's a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are more informative for getting at the soul of the material. In my experience that's often true of talks vs accompanying papers. So I'm including my speaking notes here. I blame oxycontin for any incoherent bits. They're a little random but I hope you press through because some of the good stuff is near the middle/end ;-). Maybe I'll do sketches on whiteboards for all the places I was going to do live sketches and take pictures, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea what I'm talking about from the text. I've fleshed it out past the notes in some places where it was totally incomprehensible:
Storage is designed to support a more general user experience than just “find files more easily”. Storage isn't a silver bullet, but it can serve as a toolkit for making new user experiences easier to extend across the desktop. In the process it helps dissolve the application/desktop boundary a little.
The Experience
Intro: Related to many existing systems
Wiki – anybody can edit or work with information. Information is not super formal to start with, but can become “formalized”. Unlike wiki, allow for rich in place editing and better tie in to the OS for noticing changes and tracking “change threads” (which are themselves communication often).
Whiteboard – support quick informal live collaborations. Don't force things into a particular “format” or medium but allow people to mix it up. Share a space with lots of presence information, etc. Also envision this working when people are in the same place.
Groupware – handle objects that people need to deal with to get their job done. People, teams, projects, tasks, deadlines. These are more central to knowledge workers than even documents. Like groupware, track threads of communication, but don't tie people down to text messages. Let them respond with people, projects, tasks, etc. Rather than “posting to lists” you just append items to a topic in the (or a) central store.
Bugzilla – tasks, and schedules, process, status, owner, etc. Track more interesting metadata in a way that people can shape to their organization.
Build “objects people care about”
This is more about what gets built on top of Storage, but it's a major part of the overall experience. The file manager (atop the filesystem) is about managing formal documents and folders to group documents in large concrete chunks. The <some name here> (atop storage) should focus on objects that fill people's daily lives.
People, Projects, Teams, Tasks, Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc (and yes, Documents too) are objects people care about. There are many others that are specific to particular industries and job roles. Some of these objects currently live in specialized applications like Evolution, and most of these will still be handled primarily through a specialized interface. <sketch the two specialized interfaces>.
It's usually a good idea to have specialized tools for targeting specific use cases.
OTOH, although we work on text documents mostly in the office suite, we still expose common operations to the base OS (the filemanager mostly in this case). How can we extend the set of useful things that can be done with information across the information boundary? In a less generic sense, can we build support for the objects people deal with on a day to day basis more deeply into the OS? It doesn't have to be done by a universal component system, but base libraries like storage can make it easier to support the important “one off” optimizations in the base OS (such as for projects).
Support informal work
Most office applications are focused on producing deliverables: formal documents. But deliverables are the exception. Most knowledge workers spend most of their time processing, sharing, and extending information, not producing deliverables. We want to build interfaces that allow for some degree of information soup. <sketch the process flow for organizing SubsByTheInch2005>
Informal work can eventually turn into formal deliverables. Make this process as convenient as possible.
Information is information, don't force large chunks
We currently have odd granularities of information. “Files” in the case of “formal documents” (but since we don't have informal constructs, many things are pushed into this).
Access items within large bodies of information
The storage “research-y” solution to this is object reference using human language phrases
This aspect of storage still interests me, and has been where most of the work has gone until now.... but it is more researchy because it is prone to being technically infeasible (jury is still out ;-). As such, other parts of storage are not predicated on it.
Provide the components for collaboration
If storage is the physics, social interaction is the chemistry. Storage needs to provide some very basic structures that will give rise to social interactions (when people, environments, tasks, etc. are thrown into the mix). Rather than trying to control things rigidly, as traditional computer environments have done, we allow social behaviors to regulate things more (as things work normally outside computer world).
Presence information is the substrate for coordinating social interactions. Who is where and doing what is the most relevant context for social interactions.
Access by multiple threads/computers/people. Rather than “versioning” documents and the associated problems (e.g. merging is a nearly insoluble UI problem) we allow “live” (or at least effectively live) access to documents.
Fine granularity. If we have access from multiple places, the temptation is to use locking of “documents”. Even inside formal documents, however, this will greatly limit collaborative ability. If we have rich fine grained presence information, combined with very fine grained data access, we can provide the ability to socially manage interactions rather than requiring “forced” lockouts.
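To make the "socially managed" idea a bit more concrete, here's a toy sketch in Python. It is not Storage code: the `PresenceBoard` class, its methods, and the item ids are all invented for illustration. The point is only that presence is tracked per fine-grained item (say, a paragraph) and nobody is ever locked out; people just get to see who else is there.

```python
class PresenceBoard:
    """Hypothetical tracker of who is working on which fine-grained item."""

    def __init__(self):
        # item id -> set of people currently working on that item
        self._present = {}

    def enter(self, person, item_id):
        self._present.setdefault(item_id, set()).add(person)

    def leave(self, person, item_id):
        self._present.get(item_id, set()).discard(person)

    def others_at(self, person, item_id):
        # Nothing is refused here; we only report who else is around.
        return sorted(self._present.get(item_id, set()) - {person})


board = PresenceBoard()
board.enter("alice", "proposal/para-3")
board.enter("bob", "proposal/para-7")
print(board.others_at("alice", "proposal/para-3"))  # nobody else in this paragraph

board.enter("bob", "proposal/para-3")
print(board.others_at("alice", "proposal/para-3"))  # alice can now see bob is here
```

With document-level locking, Bob would have been shut out the moment Alice opened the proposal. Here they only need to sort things out between themselves once they land in the same paragraph.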
Track information flow
E-mail showed the importance of threads of communication between people. An e-mail thread morphs into a task (like a bug), which morphs into a few more tasks (which might have discussions associated with them), which turns into a full fledged project with an associated team, which eventually produces a policy document. All this stays tied together. <show interface idea>
A Brief History (aka excuse):
Storage was initially implemented as project Gargamel by a team of Stanford CS (and one EE, and yours truly) students as a senior project. Brian Quistorf, James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able point before they graduate.
It gets even more finished as I work on it after graduation while not looking for work. Web page is written, screenshots made, etc.
I foolishly decide to rewrite the NL parser (and lose the old CVS history when importing to cvs.gnome.org). I get sidetracked writing the NL parser.
Slashdot etc hit. Lots of developer interest, but I'm snowed for other reasons and don't successfully get development moving with other people. Plus I still have to finish the NL rewrite before things will function again.
The summer is completely crazy, and I stop working on Storage for 8 months.
Today: NL rewrite is now done. It's a much stronger foundation, but the semantic grammar is still small. However, even with the small grammar it can do very sophisticated (correct) interpretations of phrases like “songs that aren't by 'John Lennon' but have the word 'love' in them”. This would be very difficult to parse with a traditional “naive” scavenging search interpretation. Marco is also contributing to Storage, as well as some other Epiphany dudes. Things are starting to pick up, and I'm determined to not kill storage by bottlenecking again. I'm looking for a “project manager”.
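For a feel of why keyword scavenging falls over on that phrase, here's a toy comparison in Python. None of this is Storage code: the song list, `naive_search`, and `structured_search` are made up for illustration, and the structured query is written by hand rather than produced by a parser.

```python
# Toy data: (title, artist) pairs, invented for this example.
SONGS = [
    ("Love", "John Lennon"),
    ("Real Love", "John Lennon"),
    ("Love Me Do", "The Beatles"),
    ("Crazy Little Thing Called Love", "Queen"),
    ("Imagine", "John Lennon"),
    ("Bohemian Rhapsody", "Queen"),
]


def naive_search(keywords, songs):
    """Keyword scavenging: keep anything that mentions any keyword at all."""
    hits = []
    for title, artist in songs:
        text = f"{title} {artist}".lower()
        if any(k.lower() in text for k in keywords):
            hits.append(title)
    return hits


def structured_search(songs, not_artist, title_word):
    """What the phrase actually asks for: a negation plus a word constraint."""
    return [
        title
        for title, artist in songs
        if artist != not_artist and title_word.lower() in title.lower().split()
    ]


naive = naive_search(["John Lennon", "love"], SONGS)
structured = structured_search(SONGS, not_artist="John Lennon", title_word="love")
print(naive)
print(structured)
```

The naive version throws away "aren't" and "but", so it happily returns the Lennon songs the phrase was trying to exclude. The structured version only works because something upstream understood the negation, which is the job the parser is there to do.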
What's there today:
Non-NL
storage-store – manages the postgresql server, handles notification
libstorage – GObject interface to store items
libstorage-translators – serializes / deserializes data streams from / to storage items
GnomeVFS module – automatically invokes translators on read/write into the store allowing existing GNOME apps to use the store like a normal filesystem
NL
PET – parses sentences into Head-driven Phrase Structure Grammar (HPSG) trees, by Dr. Ulrich Callmeier.
libmrs – interface to the Minimal Recursion Semantics (MRS) information in the HPSG tree
libmrs-converters – translates MRS into a more meaningful XML statement using a client chosen semantic grammar
libstorage-nl – translates using storage-specific semantic grammar into the intermediate XML form, and then to an SQL query
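Here's a rough sketch in Python of what that last step (intermediate XML to SQL query) might look like. Big caveats: the XML element and attribute names, the `songs` table, and the `to_sql` / `build_query` helpers are all invented for this example, and SQLite stands in for the postgresql server so the sketch runs on its own. The real intermediate form and schema are not shown in these notes.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical intermediate form for "songs that aren't by 'John Lennon'
# but have the word 'love' in them". Element names are assumptions.
QUERY_XML = """
<query>
  <and>
    <not><match field="artist" op="equals" value="John Lennon"/></not>
    <match field="title" op="contains" value="love"/>
  </and>
</query>
"""

ALLOWED_FIELDS = {"artist", "title"}  # never splice unknown names into SQL


def to_sql(node, params):
    """Recursively turn one XML node into a parameterized SQL condition."""
    if node.tag == "and":
        return "(" + " AND ".join(to_sql(child, params) for child in node) + ")"
    if node.tag == "not":
        return "(NOT " + to_sql(node[0], params) + ")"
    if node.tag == "match" and node.get("field") in ALLOWED_FIELDS:
        field, op, value = node.get("field"), node.get("op"), node.get("value")
        if op == "equals":
            params.append(value)
            return f"{field} = ?"
        if op == "contains":
            # Substring match via LIKE; cruder than a real "word" match.
            params.append(f"%{value}%")
            return f"{field} LIKE ?"
    raise ValueError(f"unsupported node: {node.tag}")


def build_query(xml_text):
    root = ET.fromstring(xml_text)
    params = []
    where = to_sql(root[0], params)
    return f"SELECT title FROM songs WHERE {where} ORDER BY title", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (title TEXT, artist TEXT)")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?)",
    [
        ("Love", "John Lennon"),
        ("Love Me Do", "The Beatles"),
        ("Crazy Little Thing Called Love", "Queen"),
        ("Imagine", "John Lennon"),
    ],
)

sql, params = build_query(QUERY_XML)
rows = [row[0] for row in conn.execute(sql, params)]
print(sql)
print(rows)
```

Values go in as bound parameters rather than being pasted into the query string, and field names are checked against a whitelist, since everything here ultimately comes from text a person typed.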
What's in the near future:
Currently libstorage, the VFS module, and some translators directly access the postgresql server. This is undesirable: it means permissions on a shared store would have to be enforced using a collection of SQL views, it means locking becomes very tricky, and it means that libstorage and other things link directly against postgresql libraries (though this could be addressed by gnome-db).
Support for NL searches in select non-English languages (probably Spanish first, but perhaps Japanese). Storage is built on a “language neutral framework”, but grammar engineering is a very difficult task. Some of the availability of NL searches will depend on what the linguistics community produces and distributes freely.
A nifty collaborative application to provide a test bed for the collaboration/locking framework. <sketch collaborative whiteboard/wiki design> (also shows informal work) Ideas? ;-)
<demo NL search interface>
<show NL slides and explain basic NL process>