Greg
Greg's diary
December 2003

Monday, 1 December 2003 Echunga

Finally managed to finish off my article for Daemon News, and back to the Vinum paper for Linux.conf.au. That'll keep me busy for a while.


Tuesday, 2 December 2003 Echunga

More work on the Vinum paper today, mainly performance measurements. In the process I'm putting together stuff which can be committed to the source tree which will prepare graphs showing differences. Given the concerns that many people have had lately that Vinum's performance isn't up to scratch, it was a pleasant discovery that yes, indeed, performance looks much better than with normal disks. I still need to compare other software, such as ccd and RaidFrame, and also Vinum RAID-5, but things look relatively encouraging.


Wednesday, 3 December 2003 Echunga

Once again more work on the Vinum paper. Finally I'm making some headway (and finding and correcting some bugs in the configuration interface). The most interesting one, which seems to have been there for over five years, is that I note when plexes and subdisks are opened, but not when they're closed, so if you access a plex or subdisk directly at all, you won't be able to stop Vinum cleanly.
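The effect of noting opens but not closes can be shown with a toy model. This is only a sketch in Python, not Vinum's code: the class, its fields and the `can_stop` helper are hypothetical, and the only assumption is that a clean stop is refused while any object is still recorded as open.

```python
class Volume:
    """Toy stand-in for a plex or subdisk; not Vinum's real data structure."""

    def __init__(self, name, track_close=True):
        self.name = name
        self.open_count = 0
        self.track_close = track_close  # False models the bug described above

    def open(self):
        self.open_count += 1

    def close(self):
        # The buggy variant notes opens but forgets to note closes.
        if self.track_close and self.open_count > 0:
            self.open_count -= 1


def can_stop(objects):
    """A clean stop is only allowed when nothing is recorded as open."""
    return all(obj.open_count == 0 for obj in objects)


buggy = Volume("plex0", track_close=False)
buggy.open()
buggy.close()

fixed = Volume("plex1")
fixed.open()
fixed.close()

print(can_stop([buggy]), can_stop([fixed]))
```

In the buggy variant a single direct open leaves the count stuck above zero for good, which matches the symptom of not being able to stop cleanly.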

Also playing around with gnuplot for the first time in years. The documentation is terrible, and I had lost all my old scripts for preparing the graphs, so I had to start all over again. That may not be the worst thing: looking back at what I was doing five years ago, the poor quality of the work surprises me.


Thursday, 4 December 2003 Echunga

It may look boring to write the same thing every day: Once again more work on the Vinum paper. Finally I'm making some headway. It's very satisfying, though. On the other hand, I'm still seeing cases of I/O errors with rawio, and the more I look at it, the more I'm beginning to think that it's a system bug. For no apparent reason, specific I/O requests fail with an I/O error or an invalid parameter. Since it happens in random child processes, it's difficult to catch. If I can work round it for now, I'll look at it after I've finished my paper.

Also more looking at the Internode machine. The problems we're seeing are very clearly related to ACPI, but other systems seem to be able to work around them. We really need to get our act together and at least handle buggy ACPI implementations before releasing 5.2.


Friday, 5 December 2003 Echunga

In view of the impending deadline for the Linux.conf.au papers on Monday, I decided to drop my performance measurements and concentrate on the paper instead. Or at least, that was the theory. Somehow it didn't work out, and I spent most of the day working on the graph plotting. Ah well, it needs doing sooner or later, and this way looks like being less work.


Saturday, 6 December 2003 Echunga

It's been six months since Yvonne brought home two Muscovy ducklings. One of them didn't last long: she was eaten by a fox, we think. The other one became more and more of a nuisance: you can't house-train ducks, and after observing Donald (original name, eh?) for some time I decided that ducks are mud filters. Today was the big day: we found somebody to take him away, so our verandah is clean for the first time in months.

Finally got to working on the paper instead of the graphs, and noted how my paper had atrophied in the course of time. I had started with the version of the paper I presented at the USENIX 1999 Annual Technical Conference. Reformatting today did some strange things to the line spacing and point sizes in the examples, and it occurred to me that it's about time I formalized my groff macros, which I started spinning off from the GNU groff .mm macros nearly ten years ago. Spent some time doing that, and ended up with something usable, though I really need some documentation as well.


Sunday, 7 December 2003 Echunga

Summer is here! A nice balmy day without much wind. The weather should be like this more often.

My father has arrived in Australia to spend the summer here, so down to Adelaide to pick him up. Didn't do much else of interest.


Monday, 8 December 2003 Echunga

Today was the deadline for the Linux.conf.au paper, so I had planned to spend all day on it. As usual, though, other things got in the way. Made quite a bit of progress on the performance testing front, and the paper now looks a lot better, but it's not ready: in the afternoon, got a phone call from an ISP in Sydney with a crashed system running Vinum, and spent some time talking the operator through recovery procedures. It's interesting how so many of these cases involve a system that was set up by some now-departed admin, and somebody else has to solve the problem. Beyond a failed disk, not a technical issue at all.


Tuesday, 9 December 2003 Echunga

More work on the paper, somewhat delayed by the work I had been doing yesterday, and carried on with the performance graphs, in the process discovering multiple problems with rawio, so spent some time fixing them.

Yet more hardware bites the dust. Came in and found the monitor (one of the 21" iiyamas I brought back with me from Germany years ago) on my test setup dead. It hadn't been too happy since the Alpha power supply blew up a week or so ago, and now it didn't power on at all. Went and got an even more ancient and decidedly nastier Alfaskop monitor and plugged it in too: also dead. Traced it to a dead UPS (only 3 years old, too; hardware seems to be dying much earlier nowadays). I don't really need a UPS on the test machines, but all the connectors are male instead of female, so I need the back panel of a UPS. Replaced it with an APC machine (also ancient), whose battery is dead. Plugged that in and started things up, and immediately an alarm went off. Took the UPS apart to discover a pretty normal 12V lead-acid battery and plenty of empty space, but couldn't locate the beeper. Finally located enough power cords and started the system without the UPS.

Still the beeper! It came from one of the disk trays from my Sun Disk Array. Grrr! Another dead power supply. That makes a total of 7 out of 12 power supplies which have failed. I can't get them repaired by Sun: that would cost more than the whole thing is worth. I wish I could find some independent person who could repair them. What a day! At least there's hope that the iiyama monitor isn't completely dead.

Also brewed another Weißbier, somewhat complicated by inadequate mashing. Ended up with a significantly stronger brew than intended, and still darker. This malt extract is too dark in colour.


Wednesday, 10 December 2003 Echunga

Spent most of the day working on the Vinum performance measurements, which took much more time than I expected, and for some reason I couldn't summon up the energy to do anything else. Still, the graphs are looking good, and I should be able to get the whole thing finished by the weekend.

To the AUUG SA chapter meeting tonight. They're back at Marcellina's in Hindley St., where I swore I would never go again after they canceled a reservation on the same day. I still don't like the place much, but I seem to be in the minority. Presented, yet again, my “Why I hate OpenOffice” talk, this time very well received.


Thursday, 11 December 2003 Echunga

More work on the performance testing today, and still more bugs showing up. For some reason I have never been able to get my mind round automated test procedures, and I kept coming up with more bugs.

While I'm wondering what to do with this Internode machine, got a call from somebody else who wanted to run some performance tests on it. His application uses a lot of RAM; it died after the data segment hit its limit of 512 MB. That's OK, I thought: we can change that with ulimit -d. Well, yes, but how? It seems it's almost completely undocumented. ulimit(1) is linked to tcsh(1), and setrlimit(2) is quiet on the subject and doesn't point to any other documentation. Finally read the source code and discovered that there are two loader tunables called kern.maxdsiz and kern.dfldsiz which set the maximum and default data segment size in bytes. Kernel memory starts at 0xc0000000, so I put this into my /boot/loader.conf and tried again:

kern.maxdsiz=3221225472
kern.dfldsiz=2147483647

To be very careful, I made sure that the default was just shy of 2 GB. The results were less than impressive: the system didn't finish booting. malloc failed somewhere. Finally dropped back to 1 GB, which worked, but it's not clear that it's enough. I need to find out what the real limit is. It's a pity that this machine takes forever to boot.
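For reference, the arithmetic behind the two values can be checked quickly. This is a small Python sketch that reads the numbers as byte counts, which is what the "just shy of 2 GB" remark implies; the variable names are only illustrative.

```python
KERNEL_BASE = 0xC0000000   # start of kernel memory, from the entry above
GB = 1 << 30

maxdsiz = KERNEL_BASE      # the kern.maxdsiz value that was tried
dfldsiz = (1 << 31) - 1    # the kern.dfldsiz value that was tried

print(maxdsiz, maxdsiz / GB)       # size in bytes and in GB
print(dfldsiz, 2 * GB - dfldsiz)   # size in bytes and shortfall from 2 GB
```

So the maximum was set to the whole 3 GB of address space below the kernel, and the default to one byte less than 2 GB.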

News from the conference that I have plenty more time before I have to submit my paper, so I should have time to do some testing on other platforms as well. Looks like I'm in for getting rawio to run on other platforms.


Friday, 12 December 2003 Echunga

More performance testing today, and found yet more bugs in rawio. This is really quite a mess; high time I fix it. In particular, some changes I made years ago appear to have converted the write tests into read tests (uninitialized variable). Fixed that and continued. Also found an issue in initializing RAID-4 and RAID-5 plexes in Vinum: it was writing a maximum of a stripe at a time, which is unnecessarily slow for small stripes.

Also fired up my Debian box, which on misinterpretation of a suggestion by Rasmus Lerdorf I'm calling brynhild. Built a new kernel with MD, RAID and LVM support and tried to install it, but it didn't want to update the boot sector, claiming that the system disk didn't exist. Only then did I realize that the system had somehow managed to net boot. sigh.


Saturday, 13 December 2003 Echunga

Still more work on the performance tests. It doesn't help that the complete set runs for 10½ hours. Thought out some new tests, particularly comparing RAID-4 and RAID-5. The results were even more obvious than I was expecting. Here's the difference in read performance:

 
[Image: raid4raid5read.png, RAID-4 versus RAID-5 read performance]

I'm assuming that the drop in performance is because the data is coming from only six drives instead of seven. Some time ago Peter Wemm claimed that the drive cache should make up for this deficiency. At least in this example, it didn't.
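The "six drives instead of seven" point can be illustrated with a small model of parity placement. This is a Python sketch under stated assumptions: the function names are hypothetical, the dedicated RAID-4 parity disk is assumed to be the last one, and the RAID-5 rotation shown is just one common scheme, not necessarily the layout Vinum uses.

```python
def parity_disk(stripe, ndisks, level):
    """Return the index of the disk holding parity for a given stripe."""
    if level == 4:
        return ndisks - 1                       # dedicated parity disk (assumed last)
    if level == 5:
        return (ndisks - 1 - stripe) % ndisks   # assumed rotation; real layouts vary
    raise ValueError("only RAID-4 and RAID-5 are modelled")


def data_disks_used(nstripes, ndisks, level):
    """Set of disks that hold data (not parity) in at least one stripe."""
    used = set()
    for stripe in range(nstripes):
        p = parity_disk(stripe, ndisks, level)
        used.update(d for d in range(ndisks) if d != p)
    return used


NDISKS = 7
raid4 = data_disks_used(NDISKS, NDISKS, 4)
raid5 = data_disks_used(NDISKS, NDISKS, 5)
print(len(raid4), len(raid5))
```

With seven disks, the RAID-4 model only ever reads data from six of them, while RAID-5 spreads data over all seven.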

As expected, the difference in write performance was more obvious. It was even obvious during the test: watching a RAID-4 test is interesting. On reads, one disk LED is completely out (the parity disk, of course). On writes, it's fully on, while the others flicker.

 
[Image: raid4raid5write.png, RAID-4 versus RAID-5 write performance]

Apart from that, spent some time setting up my Linux box (brynhild) for rawio tests. Finally found out how to access raw disk devices in Linux: for some reason, instead of just implementing the raw device nodes, you need a program raw to bind a block device to a predetermined raw device:

=== root@brynhild (/dev/pts/0) ~ 70 -> raw -q /dev/raw2
/dev/raw/raw2:  bound to major 0, minor 0
=== root@brynhild (/dev/pts/0) ~ 71 -> raw /dev/sdb /dev/raw2
raw device '/dev/sdb' is not a character dev
=== root@brynhild (/dev/pts/0) ~ 72 -> raw /dev/raw2  /dev/sdb
/dev/raw/raw2:  bound to major 8, minor 16
=== root@brynhild (/dev/pts/0) ~ 73 -> ls -l /dev/raw2  /dev/sdb
crw-rw----    1 root     disk     162,   2 Dec 13 16:47 /dev/raw2
brw-rw----    1 root     disk       8,  16 Mar 15  2002 /dev/sdb
=== root@brynhild (/dev/pts/0) ~ 74 -> raw -q /dev/raw2
/dev/raw/raw2:  bound to major 8, minor 16

What a strange way to do it! Tried some tests, but they didn't work, perhaps because, for some reason, Linux disk character devices also want their data buffers in memory aligned on page boundaries. That doesn't make any hardware sense.
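To show what a page-aligned buffer means in practice, here is a minimal Python sketch. It does not touch any raw device and is not how rawio allocates its buffers; it only relies on the fact that anonymous mmap regions start on a page boundary. The helper names are hypothetical.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE


def is_page_aligned(address):
    """True if the address is a multiple of the system page size."""
    return address % PAGE == 0


def buffer_address(buf):
    """Address of the first byte of a writable buffer object."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


# An anonymous mapping gives a buffer that starts on a page boundary.
aligned = mmap.mmap(-1, 4 * PAGE)
addr = buffer_address(aligned)
print(hex(addr), is_page_aligned(addr))
```

An address inside the buffer, such as `addr + 512`, is still sector-aligned but no longer page-aligned, which is the sort of buffer a stricter raw device interface would presumably reject.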


Sunday, 14 December 2003 Echunga

Didn't do much work today. The thought of investigating why rawio and Linux don't mix was a little daunting, so did little of interest, though I did get round to riding Darah again, for the first time in nearly two months. Also did a little work on some brewing software, stealing the parser from the Vinum command line utility. It's interesting to note that it effectively allows the main function of standalone programs to be incorporated in another program, though the question of uninitialized variables raises its ugly head.


Monday, 15 December 2003 Echunga

The weather's getting warmer again, about 33° C today, and once again we had trouble with the air conditioner, this time the one in the extension. Despite the outside temperature, and despite setting the air conditioner to 16°, it wouldn't cool. Called up service, who were surprised, and confirmed things like filter cleanliness and such. Finally placed a service call, which appeared to shock the machine into working again. I'm always left with this uncertain feeling that the positively stupid control setup is to blame: the temperature sensor (only one!) is in the return air duct, which requires one room to always be on. If the temperature in the roof is low enough, it makes no difference how hot it is in the rooms themselves. I'm completely baffled how the entire Australian air conditioning industry can use such an archaic method.

More work on rawio on Linux, and got it working: apart from alignment constraints, differences in the mmap(2) implementation were to blame. After that, did some testing in which Linux came out nearly twice as fast as FreeBSD. I'm still pondering whether this is real, and that there's something wrong with FreeBSD, or whether Linux is cheating somehow.


Tuesday, 16 December 2003 Echunga

More work on the rawio testing today. It took all day and showed that both Linux and NetBSD were significantly faster than FreeBSD, at least for sequential transfers:

           Random read  Sequential read    Random write Sequential write
ID          K/sec  /sec    K/sec  /sec     K/sec  /sec     K/sec  /sec

FreeBSD    2738.1   176   2347.3   143    2103.4   135    1930.9   118
Linux      2753.7   146  14912.3   910    2126.9   118    2198.3   134
NetBSD     2704.4   163  14808.3   904    2533.7   120    2214.1   135

These results seem to point clearly to disabled drive cache: that would definitely hit the sequential reads the hardest. Unfortunately, the work kept me going all day (changing the way the tests are done, in particular with a static set of “random” locations and with different hardware), so I didn't have time to check.
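To put the table into proportion, the throughput figures can be expressed relative to FreeBSD. This is a small Python sketch; the K/sec values are copied from the table, while the dictionary layout and the `relative_to` helper are only illustrative.

```python
# K/sec figures copied from the rawio table.
results = {
    "FreeBSD": {"random read": 2738.1, "sequential read": 2347.3,
                "random write": 2103.4, "sequential write": 1930.9},
    "Linux":   {"random read": 2753.7, "sequential read": 14912.3,
                "random write": 2126.9, "sequential write": 2198.3},
    "NetBSD":  {"random read": 2704.4, "sequential read": 14808.3,
                "random write": 2533.7, "sequential write": 2214.1},
}


def relative_to(baseline, table):
    """Divide every figure by the corresponding figure of the baseline system."""
    base = table[baseline]
    return {name: {test: value / base[test] for test, value in row.items()}
            for name, row in table.items()}


ratios = relative_to("FreeBSD", results)
for name, row in ratios.items():
    print(name, {test: round(r, 2) for test, r in row.items()})
```

Only the sequential read column stands out: Linux and NetBSD are more than six times faster there, while the other three tests are within roughly 20% of FreeBSD.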

Some discussion on the IRC lists showed me that the photos of my test setup were somewhat out of date, so put up some new ones.


Wednesday, 17 December 2003 Echunga

More work on the Internode machine today. They're not happy with the progress, and we may end up having a satisfaction issue with FreeBSD 5.2, so dropped everything and looked at that. It's obviously an ACPI problem, but the whole ACPI code looks like one amorphous mess, and it's difficult to find my way around it. The time it takes this box to boot doesn't make it any easier, either.


Thursday, 18 December 2003 Echunga

More work on the Internode machine today, and confirmed that the problem is in the BIOS. Specifically, an attempt to execute the BIOS for some device causes it to smash the stack:

564         for (currdev = 0, left = ndevs; (currdev != 0xff) && (left > 0); left--) {
(kgdb)
566             bzero(pd, bigdev);
(kgdb)
567             pda->next = currdev;
(kgdb) p currdev
$1 = 11
(kgdb) p *pda
$2 = {
  next = 11,
  node = {
    size = 0,
    handle = 0 '\0',
    devid = 0,
    type = "\0\0",
    attrib = 0,
    devdata = 0xc824ad0e ""
  }
}
(kgdb) n
569             if ((error = bios16(&args, PNP_GET_DEVNODE, &pda->next, &pda->node, 1))) {
(kgdb)

Program received signal SIGSEGV, Segmentation fault.
0x0000d119 in ?? ()

After that, nothing is left. Now I have to find out what the device is and, more particularly, whether we can just ignore it. Slow going. Looking at the pda structure, I'm left wondering if it isn't completely broken.


Friday, 19 December 2003 Echunga

More work on the Internode box, and finally managed to work around the bug by ignoring the BIOS entry. On it went and... hung in the same place as before. Spent some time looking at that, in the process discovering that the gdb debug macros get installed in the kernel build directory by the gdbinit target. Do something like this:

# cd /usr/obj/usr/src/sys/GENERIC
# make gdbinit
# gdb -k kernel.debug
...
This GDB was configured as "i386-undermydesk-freebsd"...
Ready to go.  Enter 'tr' to connect to remote target
and 'getsyms' after connection to load kld symbols.
(kgdb) tr
Debugger (msg=0x26 <Address 0x26 out of bounds>) at machine/atomic.h:263
263     ATOMIC_STORE_LOAD(int,  "cmpxchgl %0,%1",  "xchgl %1,%0");
(kgdb) ps
  pid    proc    addr   uid  ppid  pgrp   flag stat comm         wchan
...
    0 c091b0c0 c0c1f000    0     0     0  000200  1  swapper      atareq c84bbf00
(kgdb) btp 0                           backtrace process 0
 frame 0 at 0xc0c21c40: ebp c0c21c98, eip 0xc0645eb8 <mi_switch+568>:             mov    %fs:0xc,%edi
 frame 1 at 0xc0c21c98: ebp c0c21cdc, eip 0xc064560f <msleep+1263>:               nop
 frame 2 at 0xc0c21cdc: ebp c0c21d00, eip 0xc04ca508 <ata_queue_request+344>:     testb  $0x1,0x2c(%ebx)
 frame 3 at 0xc0c21d00: ebp c0c21d34, eip 0xc04c9748 <ata_getparam+168>:          mov    0x40(%ebx),%edi
 frame 4 at 0xc0c21d34: ebp c0c21d48, eip 0xc04c99bf <ata_identify_devices+223>:  test   %eax,%eax
 frame 5 at 0xc0c21d48: ebp c0c21d60, eip 0xc04c9bbf <ata_boot_attach+47>:        cmpl   $0x0,0xb8(%ebx)
 frame 6 at 0xc0c21d60: ebp c0c21d80, eip 0xc0652d7b <run_interrupt_driven_config_hooks+43>:    mov    %ebx,%edx
 frame 7 at 0xc0c21d80: ebp c0c21d98, eip 0xc0616795 <mi_startup+181>:            mov    (%ebx),%eax

So it looks as if the bug was not in ACPI after all, but in the new ATA code. That's interesting, because other systems with the same basic hardware seem to get by just fine.


Saturday, 20 December 2003 Echunga

More air conditioner problems today, the same as Monday, for which we still haven't heard from a service person. I can only recommend anybody buying an air conditioner in Australia to insist on some maximum response time.

With some perusal of the Daikin web site, finally found the instruction manual for my air conditioner: the people who supplied the unit refused to give me the manual, claiming it would cause me to make silly mistakes. Given the stupid implementation of the control unit, that's a particular insult. Unfortunately, it didn't include information about the test features, some of which are needed to set particular features. I did discover what all the stupid symbols mean, though, and discovered that I could get the thing to work by first running it in fan-only mode, which presumably brought the temperature of the misplaced temperature sensor somewhere close to the room temperature. What stupidity!

More work on the Internode machine, and found that the problem was with the DVD drive. Further examination showed that my other DVD drive also didn't work, but that the CD-ROM drive did. And all work both under 5.1 and under 5.2-BETA on monorchid. Looks like we'll have fun with this one.

In the evening, more technical problems: what looked like a power failure, so put the system on the generator. Then noticed that the failure was apparently only on one phase. Spent some time documenting the switches and discovered that it's quite a mess: the captions are quite incorrect, and two of the switches appear to be switched in series. In addition, discovered that my expensive Liebert UPStation GXT UPS unit was faithfully resetting the machines connected to it every time I turned the switch over. The el-cheapo UPSs had no problems. This is not the first time the Liebert UPS has been worse than useless; two years ago it did the same thing. I'll need a lot of convincing to buy any of their equipment again.

Called ETSA, who sent some people out who confirmed that it wasn't their problem. While they were there, the generator ran out of petrol, so I refilled it—with diesel. No more emergency power. While Yvonne was calling electricians, discovered that the safety switch had tripped: that was the entire problem, well hidden by incorrect switchboard wiring which made it look as if the problem was further back in the supply circuit. Grrr. The result: all machines went down except battunga, which miraculously survived and has now been up for 465 days. What a day. Spent the rest of the evening draining the diesel from the generator and getting it running again.


Sunday, 21 December 2003 Echunga

Essey Deayton held a barbecue today to celebrate her housewarming. The weather didn't exactly cooperate: as I later discovered, we had 34 mm of rain, more than we usually get in the entire month of December. That wasn't even the worst of it: the wind was worse, and we had to demount the pavilion that they had erected outside, because it was in danger of blowing away. Ended up eating inside, not quite the ideal summer barbecue.

Home relatively early; maybe it was the stress of yesterday, but we were all pretty tired, and didn't do much for the rest of the day. I later discovered that the wind had been so strong that it had torn the ivy off the water tank:


[Image: torn-ivy.jpeg, torn ivy]

 
Monday, 22 December 2003 Echunga

Continued work on the Internode machine today, and confirmed that it was a matter of the jumpering of the DVD-ROM drive: if it was set as master, it worked fine. That alone didn't explain things, of course, so spent some time looking at the code, and made some modifications which might at least stop the system from hanging, but didn't get round to testing them.

Also looking at the read performance issues that we've seen with Vinum, but didn't make much headway. Spent far too much time trying to understand the caching mode page for SCSI drives, and played around with lots of parameters, but it didn't seem to make much difference to the throughput. There's obviously room for a lot of experimentation here.


Tuesday, 23 December 2003 Echunga

I've been meaning to go into town for over a week now, and today I finally made it, even if I wasn't able to do everything I planned. Had to pick up a new pair of glasses, and also looked for some cheaper hops than those offered by Grumpy's (without success). Also to see Dan Shearer, who has now found office space at inetd, and to the RAA for some travel information: my father and I are planning to take a couple of weeks off in early February to drive to Perth, something akin to the Asia trip that we did decades ago. Got a lot of information there; I wonder how it will work out.

Back home and spent most of the afternoon on a conference call about the future of AUUGN, the organ (pun intended) of AUUG. It's a relatively technical magazine, but it would be nice to get it to a wider audience. It should be possible to interest the other open source groups in the magazine.


Wednesday, 24 December 2003 Echunga

Quiet day. For some reason, it doesn't seem in the slightest like Christmas. Spent a bit of time investigating SCO's claims that Linux stole header files from them because the names are the same (the comments aren't). I'm sure there must be something in the POSIX.1 specification, but it's a nightmare to fight through. In any case, one way or another, it's pretty clear that once again they're wrong. It's quite instructive to see what the stock market is doing, though: obviously lack of understanding is more prevalent than I can understand.


Thursday, 25 December 2003 Echunga

For once, had a more or less conventional (if not traditional) Christmas: up late, down to Adelaide for Christmas dinner with the family, and back home to digest. Took a number of photos. A good time was had by all.


Friday, 26 December 2003 Echunga

Something broke swap on Vinum a few months back, and I haven't had time to look at it. Today, I decided, was the day. In the process, discovered that I had lost my instructions for debugging over firewire, so spent some time writing that up for my debugging tutorial. Got that working nicely, though I'd like to be able to automate it a bit more, but then ran into some problems loading kernel symbols with the getsyms macro in the kernel debugger tools. It looks as if a recent commit has assumed that the testing machine is also the tested machine, so it loads the symbols from the wrong machine. Started fixing that, but ran out of time: today we cooked our Christmas dinner, Canard à l'Orange.


Saturday, 27 December 2003 Echunga

Another quiet day. Spent some time working on the debug scripts, and got them to work both with remote and local debugging, but then gave up and built a new world to be sure I was testing the right thing. It's a real pain how long it takes to build the system now. I should take another look at Martin Pool's distcc.


Sunday, 28 December 2003 Echunga

More work on the debugging today. Discovered to my surprise that the swap on Vinum problem didn't go anywhere near Vinum: instead it disappeared into GEOM, and I'll have to spend some more effort following that.

In the meantime installed distcc, which was easier than I thought, but it's also not clear how much it buys. Although most of my machines are i386 (ia32) based, they're running a total of five different operating systems (FreeBSD 4, FreeBSD 5, Linux, NetBSD and OpenBSD), and it doesn't seem possible to cross-compile easily. Some tests showed things actually getting slower, though that may be due to non-responding systems. Definitely more work to be done: in one case I got a fivefold speed increase, so it's not all bad news.

Noted in this context that I had a spare Asus BP6 motherboard lying around, and also a 128 MB memory module. At one point I thought that the motherboard was defective, but I've noticed that a lot of problems have gone away since my cheap and nasty Deltec UPS died. I no longer have spontaneous crashes. After replacing the old BP6 with the new BP6, the stability problems didn't go away, but they did when I removed the UPS. There's also a good question as to whether the dead power supplies on the SS200 array are also due to the UPS.

In any case, put the components together and installed another machine, a mirror of zaphod which, for lack of imagination, I called beeble. Spent more time than intended installing a base 5.1 system to be updated to be a mirror of zaphod.

Also spent some time writing up documentation for kernel debugging. I'm feeling a lot more comfortable with firewire debugging now.


Monday, 29 December 2003 Echunga

More work on documenting kernel gdb today, which took up most of the day. At least it's looking more understandable now: I feel more comfortable with documentation, even if I wrote it myself. Didn't do much actual debugging, but did manage to tidy up the macros somewhat, in the process discovering macros I had completely forgotten about.


Tuesday, 30 December 2003 Echunga

More work on the debugging macros today, and finally committed the man pages. The loose ends are gradually tightening, and it's a lot easier to debug a kernel now.

In the process, found where my swapon request was dying. It's in g_dev_getprovider:

    if (devsw(dev) != &g_dev_cdevsw)
        return (NULL);

That's pretty categorical. It looks like I'll have to make some significant modifications to Vinum before we can put swap on it again. Spent some time looking at how to do that, but didn't finish.


Wednesday, 31 December 2003 Echunga

More work on man pages today, including a review of the firewire stuff. Shimokawa-san had reviewed the man page and given some useful input. Firewire is really becoming a nice way to debug things, and today I was able to add information about how to debug a passive machine. Now I just need to find a hang to demonstrate the technique with.

There's one problem with passive remote debugging: you need to set a couple of sysctl variables to specify who you want to talk to, and this is rather painful. Spent some time writing functionality in fwcontrol to do it for me, which makes things even easier.



$Id: diary-dec2003.php,v 1.59 2015/06/21 23:54:41 grog Exp $