Who's in control?
or
What you see is what you get

by Greg Lehey


Since this is my first contribution to the Daemon's Advocate, I suppose a kind of introduction is in order. I'm not going to repeat the stuff that's on my Home page, but it might help you to understand the articles if you know where I'm coming from.

Compared to Wes Peters, I'm a relative newcomer to UNIX: it wasn't until late 1986 that my company started marketing a UNIX box. I came to UNIX from Tandem's Guardian platform, now known as NonStop Kernel, and moving to UNIX was quite an experience. I liked what I saw, though, and moved to UNIX completely before leaving Tandem in 1992 to become an independent consultant.

The problem

Nearly 20 years ago, my friend Hellwig Raffel called me at work, excited about a new program he had for his Apple II. Hellwig was a local electronics distributor, and at the time microcomputers were just starting up. The program he was talking about was so difficult to describe that I couldn't bring up any enthusiasm for it, and I never even talked to him about it again. Only years later, when I got my first PC with IBM's PC-DOS and made acquaintance with Lotus 1-2-3, did I realise that Hellwig must have been talking about VisiCalc.

I rather liked using Lotus, and though I moved on to Quattro when it came, spreadsheets remained one of the main things I used computers for. In early 1992, after leaving Tandem, I spoke to Peter Collinson about the copy of BSD/386 (later to become BSD/OS) that I had just bought, and expressed the opinion that I wouldn't be able to leave DOS behind completely because of things like spreadsheets and word processors. Peter said that he did everything on UNIX, and that he had no problems.

I carried on using DOS for fewer and fewer things. When I started writing I had to change to troff, but I still used Quattro for my expense reports. After a while, though, I changed to UNIX-based spreadsheets like sc, and later ss, which provided the functionality, but there was something missing. I always had trouble inputting data.

The solution

Finally, a few weeks back, I made the obvious choice. Obvious, you ask? I'll discuss that in more detail later on. Anyway, instead of inputting the data directly, I wrote it in a pseudo-spreadsheet form with Emacs, my trusty holy-war-inspiring do-everything editor.

But Emacs doesn't perform spreadsheet calculations, right? Right. To perform the spreadsheet calculation, I had to convert this form into something that ss would understand. That was straightforward enough: a 20 line awk script did just what I wanted.

But why?

By now, you may be asking yourself ``Why did this guy do things the hard way when he could have done it directly with much less effort?''.

That's the theme of this article. Why?

sc offers a functionality comparable with early versions of Lotus 1-2-3. Not much, you might think, but then, it's a spreadsheet, not an editor, a word processor, a database or a graphics package. In any case, that's not the problem: it does exactly what I want. But its editing capabilities are rather less than those of Emacs.

So I did all this just to be able to use Emacs? Well, no, not just because of that, but that was one of the considerations. In fact, there were three:

We'll look at each of these in more detail.

The editing commands

On the face of it, it looks like a wimp's way out to change everything just so you can use your favourite editor. In fact, it's anything but. Commercial ``integrated software'' usually offers you an editor which will do the job--barely. It does this at considerable cost:

I've used Netscape in this example, but these points are typical of most commercial software packages.

The format

Have you ever taken a look at the files that monolithic office packages produce? Here's one representation of the start of a Microsoft Word document:

-D"I^Q`a,i+-^Z'a^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@>^@^C^@pb^~?^@^F^@^@^@^@^@ ^@^@^@^@^@^@^C^@^@^@^@^@^@^@^@^@^@^@^@^P^@^@^C^@^@^@^A^@^@^@pb^~?^~?^~?^@^ @^@^@^A^@^@^@^~?^@^@^@^A^A^@^@^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^ ~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~ ?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~? ^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^ ~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~ ?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?^~?

It's difficult to tell what this is, although if you persevere you'll find the text Microsoft Word 6.0 a bit further on in the gibberish. It's obvious that this format was not designed for easy interpretation by mere humans. Now Microsoft is infamous for its undocumented documentation formats, which appear to change--always only for the best technical reasons--between each release of Word, thus forcing people to upgrade en masse. Still, other packages use similar formats. For a while my wife used StarOffice for word processing, while I used troff. We couldn't transfer data, which basically killed our use of StartOffice.

The issue of control

These points may look bad enough already, but they're just a minor inconvenience compared to the third: the question have I done this correctly? Let's consider a few scenarios:

Let's look at my spreadsheet again. It looks something like this:

It's pretty primitive stuff: type in the data in the columns on the left, calculate the fees in column H, create sums of each row in column J and or each column in row 16. Still, the opportunity exists for error. What if I made a special price for one particular customer, and forgot to take account of it in the spreadsheet? There's nothing in this display, short of comparing the results, to tell me that I've charged the wrong rate. Sure, I can go and check the cell formulae, but that's a lot of work, and to do it right I have to check every one. What if I added row 12 later and forgot to update the formulae in row 16 to include it? I'd end up sending in a bill for less than the full amount. Again, I can add up the columns manually and check, but then what do I need a spreadsheet for?

This is the real reason why I changed the way I did things: accountability. My awk script recognizes different kinds of lines: one kind for headers, one for the ``meat'' of the spreadsheet, and a different one for the telephone bills. Here's the input:

C Sunroot telco Benchmark in Austin # Hours Hotel Lunch Dinner Incidentals Transport 8 June 1998 8 8 9 June 1998 10 15 10 June 1998 9.5 50 34.62 11 June 1998 11 7.04 25.85 12 June 1998 10.5 6.5 13 June 1998 2 20 55 5 12.16 T Telephone 19.68

Out of this, awk makes a spreadsheet which ss can understand. It creates the formulae for the calculations itself, so I can check in the script if it has charged the correct rate, and I know that the ranges for the additions are correct because they're programmed that way. Yes, I know that there's still ample opportunity for bugs, but that's a different order of magnitude: with this method, I have control.

Goodbye stress

If you've been around computers for any length of time, you'll know all about stress. One of the main causes of stress is lack of control over what you're doing.