I've been meaning to post a response to George Ou's performance critique of Open Office in his web log.
Performance Analysis of OpenOffice and MS Office

I was galvanized by the questions in Harry McCracken's posting.
Is OpenOffice.org bloatware?

Here is my posted response to Mr. McCracken and, by extension, Mr. Ou.


Mr. McCracken,

I'm familiar with Mr. Ou's test, and have downloaded and worked with his Excel file. You had two questions in your posting.

1. Should speed tests of common tasks be in a future comparison of office suites? Yes, because it is common tasks that tell people about real-world performance. Mr. Ou's test is arguably not real-world.

2. My own performance experience with OOo? I use it daily. It's fine, but does take longer to open the first document.

As for Mr. Ou's article:

1. Yes, raw XML requires more storage space than binary representation.
2. Open Office files are smaller than MS Office files only because the OOo files are compressed using Zip.
3. The OOo files must be uncompressed into memory.
4. A 50MB spreadsheet is exceptionally large. I searched all mine and the largest was about 6MB.

The last point is the most important. It's true that Open Office is slower than MS Office. However, that slowdown is only obvious when working with abnormally large files. Working with a 6MB spreadsheet, the difference is negligible.

So, why so slow with a large file? Easy. That 50MB Excel file converts to a 4MB(!) OOo file. But the OOo file must be uncompressed before it can be worked on, and it uncompresses to 300MB! That is a large memory hit, and can cripple a machine. But, again, who routinely works with 50MB Excel files?

You can test this yourself. Just open an OOo file using WinZip or another Zip utility. You'll see the "file" is a set of XML files and directories.
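If you'd rather script the check than click through a Zip utility, here is a minimal Python sketch using only the standard library. The function name summarize_archive is my own invention for illustration, and the sketch assumes the document is an ordinary Zip archive. It lists the members and compares the compressed size on disk with the size once unpacked.

```python
import zipfile


def summarize_archive(path):
    """Return (member names, compressed bytes, uncompressed bytes).

    `path` may be a file name or an open binary file object.
    Hypothetical helper: the name and return shape are assumptions.
    """
    with zipfile.ZipFile(path) as archive:
        infos = archive.infolist()
        names = [info.filename for info in infos]
        # compress_size is the space each member takes inside the archive;
        # file_size is what it expands to once uncompressed.
        compressed = sum(info.compress_size for info in infos)
        uncompressed = sum(info.file_size for info in infos)
    return names, compressed, uncompressed
```

Calling summarize_archive on a saved spreadsheet and dividing the third number by the second gives the expansion factor. Keep in mind this measures the raw XML only; what the application builds in memory from that XML may be larger still.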

Open Office keeps document elements in memory even after the document is closed. This is unexpected, but can be controlled in the Options > General > Memory settings by reducing how long data is stored. In my view, though, the memory should be released as soon as the file is closed.

Mr. Ou argues against XML for size. This is like arguing against HTML for its size. The point of XML is standardized data storage and exchange. In other words, even though I'd never want to have to extract my OpenDocument data without the application, the point is I could do so years from now as long as I can uncompress using standard Zip, then read the XML as plain text. I could easily read the information, even without the formatting.
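To make that concrete, here is a rough Python sketch of pulling the text out of a document without the application. It assumes the content lives in a member named content.xml and that paragraphs and cell text sit in elements whose local name is "p"; the function name extract_text is hypothetical. It ignores formatting entirely and matches on the local name only, so it does not depend on any particular namespace.

```python
import zipfile
import xml.etree.ElementTree as ET


def extract_text(path):
    """Return one string per paragraph-like element in content.xml.

    Assumption: text is held in elements with local name "p".
    `path` may be a file name or an open binary file object.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    lines = []
    for element in root.iter():
        # Tags look like "{namespace}p"; keep only the part after "}".
        if element.tag.rsplit("}", 1)[-1] == "p":
            # itertext() also gathers text nested in child elements.
            lines.append("".join(element.itertext()))
    return lines
```

Nothing here is specific to one vendor's software: a Zip reader and an XML parser are all it takes, which is the whole argument for an open, text-based format.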

I believe office suites should have to compete on features, not on how they store the data. One document standard, many commercial options for reading those documents. Since documents are data, it's reasonable to use XML.