<?xml version="1.0"?>
<rss version="2.0">
<channel>
	<title>The Gnumeric sockpuppet</title>
	<link>http://www.gnome.org/~jody/blog/</link>
	<description>Make sure it is plugged in.</description>
	<language>en</language>
	<managingEditor>jody@gnome.org</managingEditor>
	<webMaster>jody@gnome.org</webMaster>
	<image>
		<url>http://www.gnome.org/img/gnome-16.png</url>
		<title>The Gnumeric sockpuppet</title>
		<link>http://www.gnome.org/~jody/blog/</link>
		<width>16</width>
		<height>16</height>
	</image>
	<item>
		<title>Shooting Fish in a Barrel</title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2006-07-25</guid>
		<link>http://www.gnome.org/~jody/blog/2006-07-25</link>
		<description>
&lt;p&gt;Back at Linuxworld Boston Michael
&lt;a href=&quot;http://www.whatpc.co.uk/vnunet/news/2153630/openoffice-zooms-lagging&quot;&gt;
mentioned a teensy performance problem&lt;/a&gt; with an internal spreadsheet (sorry,
it&apos;s confidential and can&apos;t be posted).
&lt;/p&gt;

&lt;img src=&quot;http://www.gnome.org/~jody/images/2006-07-25-ooo-pivot-perf-sample.png&quot; alt=&quot;OOo Pivot Sample&quot; style=&quot;border: none;&quot;&gt;
&lt;p&gt;This partially autogenerated 50M xls monster has been chock full of useful
compatibility tests for OOo.  Unfortunately, one of my recent patches was
forcing the pivot tables to regenerate on load, rather than only on demand
later, and drove load time up into the 3 &lt;b&gt;hour&lt;/b&gt; range.  MS Excel could
load it in &amp;lt; 1 minute.
&lt;/p&gt;
&lt;p&gt;
The first step was to throw speedprof (properly patched for OOo) at it.  Why
not use a sexier tool like cachegrind or oprofile, you wonder?  The short
answer is simplicity.  For a rough first cut speedprof is good enough and
doesn&apos;t have much time/space overhead.  The result showed a hotspot in the xls
importer itself with lots of code of the form
&lt;/p&gt;
&lt;pre class=&quot;code-example&quot;&gt;
  long nCount = aMemberList.Count();
  for (long i=0; i&amp;lt;nCount; i++) {
    const ScDPSaveMember* pMember = (const ScDPSaveMember*)aMemberList.GetObject(i);
    if ( pMember-&amp;gt;GetName() == rName )
      return pMember;
  }
&lt;/pre&gt;
&lt;p&gt;
A quick check showed that &apos;aMemberList&apos; really was a list.  Once I&apos;d bandaged
my forehead and checked the monitor for damage the first patch was obvious.
This code was wrong on several levels.  Let&apos;s count the complexity.&lt;/p&gt;
&lt;p&gt;1) List::Count : Why not just iterate on the list directly and save the lookup?&lt;/p&gt;
&lt;p&gt;2) List::GetObject(i) : Again, why start from the beginning of the list each time when you just want to iterate through each element?&lt;/p&gt;
&lt;p&gt;3) if (pMember-&amp;gt;GetName() == rName) : Why look things up in order when what you want is to look them up by name?&lt;/p&gt;
&lt;p&gt;The first patch was conceptually simple: get a hash in place of that list.
It took some spelunking into the data structures to make that possible but in
the end
&lt;a href=&quot;http://cvs.gnome.org/viewcvs/ooo-build/patches/src680/sc-xls-pivot-optimization.diff?rev=1.1.2.1&amp;view=markup&quot;&gt;Patch1&lt;/a&gt;
brought us down to 45 minutes without bloating the memory usage much.
&lt;/p&gt;
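&lt;p&gt;
The shape of that change can be sketched in standalone C++.  This is not the
code from the patch: Member, findByIndexScan and MemberIndex are made-up
stand-ins, std::list stands in for the OOo List, and the sketch assumes that
indexing into the list walks from the head each time, as point 2 above suggests.
&lt;/p&gt;
<![CDATA[
```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for ScDPSaveMember; only the name matters here.
struct Member {
    std::string name;
};

// The slow shape: index into a linked list, restarting from the head
// for every index, then compare names one by one.
const Member* findByIndexScan(const std::list<Member>& members,
                              const std::string& name) {
    const std::size_t count = members.size();
    for (std::size_t i = 0; i < count; ++i) {
        // std::next walks i nodes from the head, so this step alone is O(i).
        auto it = std::next(members.begin(), static_cast<std::ptrdiff_t>(i));
        if (it->name == name) return &*it;
    }
    return nullptr;
}

// The replacement shape: build a name-to-member index once, then look up by name.
class MemberIndex {
public:
    explicit MemberIndex(const std::list<Member>& members) {
        // emplace keeps the first entry for a duplicate name,
        // which matches what the scan above would return.
        for (const Member& m : members) byName_.emplace(m.name, &m);
    }

    const Member* find(const std::string& name) const {
        auto it = byName_.find(name);
        return it == byName_.end() ? nullptr : it->second;
    }

private:
    // Pointers into a std::list stay valid while the list keeps its nodes.
    std::unordered_map<std::string, const Member*> byName_;
};
```
]]>
&lt;p&gt;
In the sketch the scan pays for a walk per index on top of the name
comparisons, while the index is built in one pass and then answers each lookup
in roughly constant time.  The cost is one extra map entry per member.
&lt;/p&gt;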
&lt;p&gt;
The next speedprof run made it seem as if the construction of the datapilots was uniformly
slow, but a bit of digging showed that one particular pivot table was dominating.
It had a field with 30,000 unique strings.  The code used idioms similar to the previous block.
&lt;/p&gt;
&lt;pre class=&quot;code-example&quot;&gt;
  ScDPItemData aMemberData;
  long nCount = aMembers.Count();
  for (long i=0; i&amp;lt;nCount; i++) {
    ScDPResultMember* pMember = aMembers[(USHORT)i];
    pMember-&amp;gt;FillItemData( aMemberData );
    if ( bIsDataLayout || aMemberData.IsNamedItem( target ) )
&lt;/pre&gt;
&lt;p&gt;
Thankfully it used an array in place of a list, but it threw an extra object
copy in the heart of the loop to keep things comfortably inefficient.  One more
&lt;a href=&quot;http://cvs.gnome.org/viewcvs/ooo-build/patches/src680/sc-dp-hash-items.diff?rev=1.2&amp;view=markup&quot;&gt;patch&lt;/a&gt;
and we were down to 10 minutes.  Still not good, but it&apos;s an improvement.  The
next steps will be to see why OOo is using 900M vs 90M for XL (and that&apos;s
with &lt;b&gt;wine&lt;/b&gt;), and to see about using a set of indices for the pivot data,
rather than a set of strings.
&lt;/p&gt;
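&lt;p&gt;
The last idea, indices in place of strings, might look something like the
sketch below.  StringPool and its methods are hypothetical names for
illustration only; nothing here is taken from OOo or from a later patch.
&lt;/p&gt;
<![CDATA[
```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning table: each distinct string gets a small integer id,
// so the pivot data can store and compare ids instead of whole strings.
class StringPool {
public:
    // Returns the id for text, adding it to the pool the first time it is seen.
    std::size_t intern(const std::string& text) {
        auto found = ids_.find(text);
        if (found != ids_.end()) return found->second;
        const std::size_t id = strings_.size();
        strings_.push_back(text);
        ids_.emplace(text, id);
        return id;
    }

    // Recovers the original string, for display or export.
    const std::string& lookup(std::size_t id) const { return strings_.at(id); }

    // Number of distinct strings seen so far.
    std::size_t size() const { return strings_.size(); }

private:
    std::vector<std::string> strings_;                   // id -> string
    std::unordered_map<std::string, std::size_t> ids_;   // string -> id
};
```
]]>
&lt;p&gt;
With something like this, a field with 30,000 unique strings holds each string
once in the pool, and the rows carry integers that are cheap to copy and compare.
&lt;/p&gt;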

&lt;img src=&quot;http://www.gnome.org/~jody/images/2006-07-25-ooo-pivot-perf.png&quot; alt=&quot;OOo Pivot Performance&quot; style=&quot;border: none;&quot;&gt;
</description>
		<pubDate>Tue, 25 Jul 2006 15:59 -0400</pubDate>
	</item>
	<item>
		<title>OpenFormula</title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2005-10-19</guid>
		<link>http://www.gnome.org/~jody/blog/2005-10-19</link>
		<description>
&lt;p&gt;Looks like the
&lt;a href=&quot;http://sourceforge.net/projects/openformula&quot;&gt;OpenFormula&lt;/a&gt;
initiative is going to restart with the goal of improving the interoperability
of spreadsheets (different implementations and versions).  The traffic got off
to a fast start and we&apos;ve quickly hit an 
&lt;a href=&quot;http://sourceforge.net/mailarchive/message.php?msg_id=13502014&quot;&gt;impasse.&lt;/a&gt;
How comprehensively should we define the evaluation mechanism for conforming
spreadsheets?  To my mind any file format that claims to be portable must
calculate the same results with different versions/implementations.
Differences need to be explained.  Others seem to think that the calculations
are just for display, akin to different kerning in a word processor.&lt;/p&gt;

&lt;p&gt;Microsoft finds the discussion amusing, and
&lt;a href=&quot;http://blogs.msdn.com/brian_jones/archive/2005/10/04/477127.aspx&quot;&gt;claims&lt;/a&gt;
that their lovely new Office 12 formats don&apos;t have this problem.  I can&apos;t check
that because the schemas come with a
&lt;a href=&quot;http://blogs.msdn.com/brian_jones/archive/2005/09/25/473815.aspx&quot;&gt;GPL-incompatible license&lt;/a&gt;.
However, I would be &lt;i&gt;very&lt;/i&gt; surprised if MS included an appendix listing
all standard functions and rigorously defined their behaviours.&lt;/p&gt;
</description>
		<pubDate>Thu, 20 Oct 2005 01:25 -0400</pubDate>
	</item>
	<item>
		<title></title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2005-08-25</guid>
		<link>http://www.gnome.org/~jody/blog/2005-08-25</link>
		<description>I&apos;ve been quiet for too long now.  It&apos;s time to say hi, re-join the
community, and do a bit of spreadsheet blogging.

For the last few months I&apos;ve been working on OOo&apos;s spreadsheet.  Given the
choice between working on OOo and leaving free software I swallowed my pride
and made the leap.

To paraphrase the late Douglas Adams:

&quot;OOo is big. Really big. You just won&apos;t believe how vastly
hugely mindbogglingly big it is. I mean you may think there&apos;s a lot of code
in emacs, but that&apos;s just peanuts compared to OOo.&quot;

There&apos;s lots of neat stuff in here, and Michael has done some amazing work
getting it building mostly painlessly.

Gnumeric is still alive and well.  The team is on track to release 1.6.0 (with
several nice improvements) along with gnome 2.12 in a few weeks.  With luck
I&apos;ll be able to cross-pollinate the projects.

My current project in OOo is to add support for R1C1 style references.  The
core of the patch was simple.  I was able to lift a blob of Gnumeric code I&apos;d
written a few months back and dual-license it for inclusion in OOo.  The tricky
bit is turning out to be the interface change that propagates the choice of
address style.
</description>
		<pubDate>Thu, 25 Aug 2005 02:59 -0400</pubDate>
	</item>
	<item>
		<title></title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2004-08-09</guid>
		<link>http://www.gnome.org/~jody/blog/2004-08-09</link>
		<description>&lt;b&gt;libgoffice&lt;/b&gt;
The vacation gave me a bit of time to bite the bullet and start working on
pulling this out of gnumeric in earnest.  Both kids got sick, and the
resulting sleepless exhaustion limited development time, but at least the end
is in sight.  The remaining elements are
&lt;ul&gt;
&lt;li&gt;Expand GOFont to include attributes like strike/underline/color&lt;/li&gt;
&lt;li&gt;Finish the font selector widget to use the attributes&lt;/li&gt;
&lt;li&gt;Pull down the date convention code (needed for value formatting)&lt;/li&gt;
&lt;li&gt;Move the heart of the value formatter down&lt;/li&gt;
&lt;li&gt;Move the format selector down&lt;/li&gt;
&lt;li&gt;Get the last of the GOPlugin code in place&lt;/li&gt;
&lt;li&gt;Move the plugin manager dialog down&lt;/li&gt;
&lt;/ul&gt;
That&apos;s still lots of work but it should be possible within 1-2 weekends.

&lt;p&gt;
&lt;b&gt;Gnumeric&lt;/b&gt;
I had tidied up escher export a few weeks ago to enable chart export to xls.
Jon Kare picked that up and has been working on image export,
something people have been asking for for a long time.  Sitting in my inbox
was also some absolutely lovely work by Emmanuel to complete his work on axis
mapping (invert, log) for the 1.5d charts (col/bar/line/area).  While he was at it
he ripped out all the piecewise patching for libart antialiasing fuzziness,
and consolidated it into the pixbuf renderer.  The results look
&lt;a href=&quot;http://emmanuel.pacaud.free.fr/screenshots/gnumeric/barcol-100%25.png&quot;&gt;awesome&lt;/a&gt;.
Couple in Kasal&apos;s recent gsf-janitor work to polish up the msole exporter and
we&apos;re looking good for a release.  There are still a few win32 porting patches
to merge in, but on the whole we should be able to release gnumeric with
gnomeoffice-1.2 in conjunction with gnome-2.8.
</description>
		<pubDate>Mon, 09 Aug 2004 14:58 -0400</pubDate>
	</item>
	<item>
		<title></title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2004-06-05</guid>
		<link>http://www.gnome.org/~jody/blog/2004-06-05</link>
		<description>Not sleeping well so I spent a few hours doing some mindless coding to
complete the sax-style xml exporter for gnumeric.  I&apos;ll make it the default
for 1.3.1 to get some testing.  There&apos;s a huge speed win for large files.  Not
allocating 4*uncompressed size is apparently helpful.&lt;p&gt;

Had an interesting chat with Ryan, Charlie, and Mark to marvel at the existence
and virtues of Trelane&apos;s work on the new Gnumeric &lt;a
href=&quot;http://www.gnumeric.org/new-design-testing&quot;&gt;website&lt;/a&gt;.  It can
certainly be polished in spots, but the architecture is clean, and it&apos;s a hell
of a lot better than the monkey see monkey do crud I&apos;ve been putting up.  It is
fantastic to finally have some web knowledgeable folk available to put up a
more polished gnome-office website.
</description>
		<pubDate>Sat, 05 Jun 2004 15:24 -0400</pubDate>
	</item>
	<item>
		<title></title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2004-06-02</guid>
		<link>http://www.gnome.org/~jody/blog/2004-06-02</link>
		<description>A quiet day.&lt;p&gt;
Walked through the backlanes to the library with Ryan.  It went quickly even
though I only carried him part way.  The innocence of a two year old with a
serious case of the &apos;Why?&apos;s is an excellent balm for the soul.  I don&apos;t think
we&apos;ll tell him about bub.  It seems too soon for him to come to grips with the
permanence of mortality.  A nice copy of Wynken, Blynken, and Nod suits him
better just now.&lt;p&gt;
I&apos;d best get back to working on a eulogy.
</description>
		<pubDate>Thu, 03 Jun 2004 03:28 -0400</pubDate>
	</item>
	<item>
		<title></title>
		<guid isPermaLink="true">http://www.gnome.org/~jody/blog/2004-06-01</guid>
		<link>http://www.gnome.org/~jody/blog/2004-06-01</link>
		<description>&lt;img class=&quot;left&quot; src=&quot;http://www.gnome.org/~jody/images/ahf.jpg&quot;&gt;
Beatrice Gittle Goldberg Sep 23 1922-Jun 1 2004&lt;br&gt;
&lt;br&gt;
My grandmother (on my father&apos;s side) passed away a few moments ago after
struggling with cancer for several months.  I&apos;m very lucky the kids had a
chance to meet her.
</description>
		<pubDate>Tue, 01 Jun 2004 20:15 -0400</pubDate>
	</item>
	</channel>
</rss>
