UNIX or BSD?

by Greg Lehey

In the last article, I looked at the influence that Microsoft has on the BSD world. Two or three people disagreed with this, calling it Microsoft bashing. They weren't very specific, and none answered my question about what they objected to. Still, it isn't my intention to make other software look bad; I just want BSD to look good. Despite the title, my discussion last month wasn't really about Microsoft: it was about the way the industry is catering for only one software vendor.

This month, I'll look at a more comparable operating system: UNIX® System V. Most of the comparisons between BSD and System V that I have seen have looked at BSD from a System V perspective. In this article, I'll look at System V from a BSD perspective.

Where I'm coming from

That doesn't mean that I come from a BSD background. While at Tandem, I cut my UNIX teeth on Tandem's implementation of System V.2 and V.3, and in 1990 I bought some 80386-based PCs for my department to use as our own UNIX workstations.

We had some discussion whether we should use PCs or Sun 3 workstations. In any case, we compared the prices and decided we could get better value for money out of PCs than we could from Sun, so we bought 20 MHz 386s with 8 MB of memory and 20" monitors, and installed Interactive UNIX version 2.2, a release of System V.3.2, along with the X window system 11.3.

It would be incorrect to say that we were delighted by the machines in all respects. It took us experienced software professionals an average of two days each to install the software. Once the base system was installed, we needed to install TCP/IP and X, along with options like NFS. Each package came on its own set of floppies and a little slip of paper with two-part ``activation keys'' like ltd005269 and timzdfuz. Type one of them wrong, and you can restart the installation. I don't recall how many packages we had, but I have a list of the packages on a slightly later installation of SCO UNIX 2.2: there were a total of 77 floppies in 14 different sets.

Once we had the software installed, we had a quite reasonable system at our disposal. Sure, there were some rough edges, and it didn't quite measure up to Tandem's own NonStop-UX version of System V.3.2, but it ran rings round MS-DOS. And there were some strange little problems, but we found workarounds for them after we discovered that phone support cost additional money, and what they offered didn't help anyway.

Meeting BSD

Round about this time, I made the decision to leave Tandem and become a private consultant. One thing worried me: for the previous 10 years, I had always had access to the source code of the operating systems I supported. I knew what a UNIX license cost, and I knew that I wouldn't be able to pay for that as a fledgling consultant. Round about then I heard that people were selling the source code for 4.3BSD for only $1000. That sounded like a great deal, and I enquired. The company was called Berkeley Software Design Inc., and they had a product called BSD/386, which was currently in Beta. Based on my experience with Interactive UNIX, I didn't expect too much, but I thought it was worth trying, so in March 1992 I sent off my money and got a QIC-150 tape of BSD/386 0.3.1 and some photocopied documentation.

I was pleasantly surprised when the installation went smoothly. Well, much more smoothly than the System V installation, anyway. After all, there was just one tape to install from, and I only needed to decide which components to install. No multiple passes, no activation keys, but I did need a pocket calculator to work out my disk partitioning.

Using BSD/386 was even more of a revelation: it just worked. None of the messing around I had needed with Interactive. I was very impressed, and almost immediately wrote an article for the German magazine iX, which in those days specialized in UNIX. They had just published a ``review'' of two of the first Intel-based System V.4 offerings. The review was relatively short: they hadn't been able to install either of them.

Seven years later

That was a long time ago, of course. Things have changed on both sides, but some differences have remained. System V implementations tend to feel less homogeneous than BSD. Although they are the commercial implementation, they seem to have a number of rough edges. I have a theory that this is a result of the ``business case'' approach to support: it's not a bug unless a big customer complains. By comparison, I'd call the BSD approach the ``it gets on my nerves'' approach. Both have validity, but the BSD approach makes for a smoother running system.

System V features

There's another reason why System V seems less homogeneous: it's less homogeneous. To understand that, let's look at how System V evolved.

System V is descended from the Seventh Edition of Research UNIX. At the time it diverged from Research UNIX, BSD was primarily a distribution of user programs, rather like a Linux distribution nowadays, and it had little influence on the direction of kernel development. In the early 80s, that changed. BSD started to develop serious kernel improvements, in particular virtual memory, the fast file system (now known as the UNIX file system or ufs) and integrated Internet software. This even caused a break in the development of Research UNIX: the First to the Seventh Editions were an uninterrupted development, but the Eighth Edition was based on 4.1cBSD.

System V did not include any of the BSD kernel features. The BSD tapes were available, of course, and a majority of vendors included some of these features, typically calling them ``Berkeley extensions''. In the meantime, though, System V was developing its own version of networking support, based on Dennis Ritchie's STREAMS concept. By contrast, Berkeley networking was called sockets, since they were the most obvious difference from the user standpoint. Many implementations of System V have both STREAMS and sockets network interfaces, though nowadays there is a tendency to implement sockets as a library interfacing to STREAMS.

Another issue, a very political one about 12 years ago, was AT&T's attempts to woo Sun Microsystems to use System V. At the time, SunOS was firmly based on 4.2BSD, and AT&T saw this as serious competition. They had already come to an agreement with Microsoft to merge the functionality of XENIX into System V.3, giving rise to System V.3.2. The agreement which was finally reached with Sun involved merging all BSD functionality into System V.4. In fact, the term ``merge'' is misleading; System V.4 kernels were about the same size as a System V.3 kernel and a 4.3BSD kernel put together, so maybe the term ``add'' would have made more sense.

The real differences

But what were the real differences? Most people don't spend their evenings reading kernel source code, though it makes excellent bedtime reading. Most people wouldn't know a socket if it bit them, nor a STREAM if it streamed at them. Most people just use the machines. What's it like to be put in front of a System V machine after being used to BSD? Let's take a random walk-through.

The first time I used a Sun workstation (running SunOS 4) I had no manuals handy, and it drove me crazy that I couldn't even use the ps command, since it had different option letters. I typed ps -ef and got the message

ps: f: unknown option
ps: usage: ps [-acCegjklnrStuvwxU] [num] [kernel_name] [c_dump_file] [swap_file]

Why this gratuitous difference?

The Seventh Edition ps command had exactly four option letters: a (show all processes), l (produce long listing), x (include dæmons) and k (get information from a kernel core dump instead of the running system). 4.4BSD has the first three of these, but with the advent of virtual memory the k flag changed to M. It also adds a u flag for additional information which the Seventh Edition didn't show. System V has also dispensed with the k flag, and also with the x flag: instead of writing ax, you write -e. Instead of writing u, you write -f. Oh yes, and System V insists on a hyphen (-) before the option letters, so if on a System V machine (here SGI's IRIX 5.3) you write ps aux, you'll get the message

Usage: ps [ -edalfcjM ] [ -t termlist ] [ -u uidlist ]
          [ -p proclist ] [ -g grplist ] [ -s sidlist ]
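
The correspondence between the two spellings is small enough to write down. What follows is a minimal Python sketch, not part of either ps: the function name bsd_to_sysv and the table are invented for illustration, and they cover only the four letters discussed here (a, x, u and l).

```python
# Hypothetical helper: translate the BSD ps option letters discussed in the
# text into their System V spelling. Only a, x, u and l are covered.
BSD_TO_SYSV = {
    "a": "e",   # all processes ...
    "x": "e",   # ... including daemons: System V uses -e where BSD writes ax
    "u": "f",   # additional ("full") information
    "l": "l",   # the long listing is spelled the same way
}

def bsd_to_sysv(options):
    """Turn e.g. 'aux' into '-ef'; raise ValueError for unknown letters."""
    result = []
    for letter in options.lstrip("-"):
        if letter not in BSD_TO_SYSV:
            raise ValueError("no System V equivalent known for %r" % letter)
        sysv = BSD_TO_SYSV[letter]
        if sysv not in result:      # a and x both map to e; emit it once
            result.append(sysv)
    return "-" + "".join(result)

print(bsd_to_sysv("aux"))   # -ef
print(bsd_to_sysv("lax"))   # -le
```

The mapping is approximate: it says nothing about the output columns, which differ between the systems.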

OK, you might think, big deal. Right, but some people do think it's a big deal. In fact, of course, the ps command is one of the commands most intimately tied to the kernel's internal structure. Here's an example looking at the status of process 1, init:

SunOS 4.1.3:

$ ps uax
USER       PID %CPU %MEM   SZ  RSS TT STAT START  TIME COMMAND
root         1  0.0  0.0   52    0 ?  IW   Feb 16  0:00 /sbin/init -
$ ps lax
       F UID   PID  PPID CP PRI NI  SZ  RSS WCHAN        STAT TT  TIME COMMAND
20088000   0     1     0  0   5  0  52    0 child        IW   ?   0:00 /sbin/init -

FreeBSD 4.0:

$ ps up1
USER   PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
root     1  0.0  0.0   496   72  ??  Is   28Mar99   0:01.79 /sbin/init --
$ ps lp1
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
    0     1     0   0  10  0   496   72 wait   Is    ??    0:01.79 /sbin/init --

IRIX 5.3:

$ ps -fp1
     UID   PID  PPID  C    STIME TTY      TIME COMD
    root     1     0  0 13:00:36 ?        0:00 /etc/init 
$ ps -lp1
 F S   UID   PID  PPID  C PRI NI  P    SZ:RSS      WCHAN TTY      TIME COMD
30 S     0     1     0  0  39 20  *    68:40    801752c0 ?        0:00 init

You'll see a lot of differences between all three. Here's a summary:

SunOS doesn't understand the p flag, which specifies the specific process, so I had to list all processes and select the output for init.

IRIX seems to place the binary in /etc/init, whereas the BSD systems place it in /sbin/init. This is a coincidence: in fact, /sbin is a directory introduced in System V.4, and that's where IRIX stores it too, but there's a symbolic link /etc/init which points to /sbin/init. It appears that the startup scripts hadn't adapted yet.

The BSD systems show CPU and memory usage in percentages. This relates to the way the kernel scheduling works. It's difficult to extract this information in System V, so all System V implementations I know show only a field C, which is one of the counters used in calculating process priority. The more CPU time the process uses, the higher its value. The System V C field corresponds to the CPU field in the BSD ps l listing, not the %CPU field in the ps u listing.

The flags S (System V) and STAT (BSD) show the process state. These flags are very dependent on the implementation, and differences exist even between related implementations: for example, according to the online man pages OpenBSD has different flags from NetBSD and FreeBSD. In the examples we see here, SunOS has I, meaning idle. FreeBSD in addition has s, indicating a session leader. SunOS does not flag session leaders. IRIX has S, indicating that the process is sleeping. All these flags mean the same thing: the process is not doing anything right now.

System V and SunOS also have an additional field, F, which provides additional information, conveniently in hexadecimal. The value 30 for IRIX shows that both the process and its user area are in core. The value 20088000 for SunOS shows that the process has completed an exec system call, that it is orphaned (in other words, it has no parent, which is always the case for init) and, to quote the SunOS man page, ``init data space on demand, from vnode''. This last flag would probably require access to the sources to interpret correctly.
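
To see why each F value carries more than one meaning, it helps to split the hexadecimal word into its individual bits. The following Python sketch does only that; the function name set_bits is invented here, and the sketch deliberately does not guess which bit carries which meaning, since the listings don't say.

```python
def set_bits(flag_word):
    """Return the individual bits set in flag_word, lowest first."""
    bits = []
    bit = 1
    while bit <= flag_word:
        if flag_word & bit:
            bits.append(bit)
        bit <<= 1
    return bits

# F values copied from the ps listings above; ps prints them in hexadecimal.
for system, field in (("IRIX 5.3", "30"), ("SunOS 4.1.3", "20088000")):
    word = int(field, 16)
    print(system, [hex(b) for b in set_bits(word)])
```

The IRIX value splits into two bits and the SunOS value into three, which matches the number of conditions described for each system.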

When a process is waiting, WCHAN gives some indication of what it is waiting for. UNIX calls this sleeping, and a process always sleeps on a specific address. System V shows this address (in hexadecimal), whereas BSD shows a wait string which can give you some idea of what it is waiting for.

The fields SZ or VSZ (virtual process image size) and RSS (resident set size, or the amount of physical memory currently in use) look like they mean pretty much the same thing. Again, though, this is deceptive. In System V, the values are measured in pages (which, in IRIX, are 4 kB), whereas SunOS and FreeBSD measure them in kilobytes.
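
To compare the IRIX figures with the BSD ones, the page counts have to be multiplied by the page size. Here is a small Python sketch of that arithmetic, using the SZ:RSS value from the IRIX listing above; the function name irix_sizes_kb is invented for illustration, and the 4 kB page size is the one quoted in the text.

```python
IRIX_PAGE_KB = 4   # page size quoted in the text for IRIX

def irix_sizes_kb(sz_rss, page_kb=IRIX_PAGE_KB):
    """Convert an IRIX 'SZ:RSS' field (in pages) to (size_kb, rss_kb)."""
    sz_pages, rss_pages = (int(part) for part in sz_rss.split(":"))
    return sz_pages * page_kb, rss_pages * page_kb

# init on IRIX 5.3 showed 68:40 in the ps -l listing.
print(irix_sizes_kb("68:40"))   # (272, 160)
```

So the IRIX init occupies 272 kB of virtual memory with 160 kB resident, figures that can be set directly against the 496 and 72 kB that FreeBSD reports.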

This is only one example, of course. At the beginning, it looked as if FreeBSD and SunOS were strongly related, and that System V was rather different. By now it should be apparent that the differences are more complicated than they appeared.

What about networking?

All these machines I'm showing here are on the same network. What commands do you use to control them? When looking at network configuration, two commands do nearly everything you need: ifconfig and netstat. They both come from 4.2BSD, of course, but they're also the standard in System V. Here's what we see:

SunOS 4.1.3:

$ ifconfig -a
le0: flags=63<UP,BROADCAST,NOTRAILERS,RUNNING>
        inet 192.109.197.145 netmask ffffff00 broadcast 192.109.197.255
        ether 8:0:20:e:2c:98 
lo0: flags=49<UP,LOOPBACK,RUNNING>
        inet 127.0.0.1 netmask ff000000

FreeBSD 4.0:

$ ifconfig -a
ed2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 192.109.197.137 netmask 0xffffff00 broadcast 192.109.197.255
        ether 00:80:48:e6:a0:61 
ppp0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
        inet 139.130.136.133 --> 139.130.136.129 netmask 0xffff0000 
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000

IRIX 5.3:

$ ifconfig -a
ifconfig: ioctl (SIOCGIFFLAGS): no such interface
usage: ifconfig interface
        [ af [ address [ dest_addr ] ] [ up ] [ down ][ netmask mask ] ]
        [ metric n ]
        [ primary ]
        [ arp | -arp ]
$ ifconfig ec0
ec0: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>
        inet 192.109.197.146 netmask 0xffffff00 broadcast 192.109.197.255
$ ifconfig lo0
lo0: flags=1849<UP,LOOPBACK,RUNNING,MULTICAST,CKSUM>
        inet 127.0.0.1 netmask 0xff000000

Again, the most obvious thing we see here is that the output is the same, but different. IRIX doesn't understand the -a flag to ifconfig, so we have to ask for the interfaces one at a time. When we do, the output looks pretty much the same as for the BSD systems. The only difference is that it doesn't show the Ethernet address for the Ethernet interface. FreeBSD has an additional interface, ppp0, which is the gateway to the Internet, but that's not a feature of the operating system. The interface flags are similar, but not the same; the hexadecimal values show that most of the bit positions in the representation are the same for all three.
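
The claim about bit positions can be checked against the flags= values in the listings. The Python sketch below decodes the low-order bits using the values defined in BSD's <net/if.h>; the function name decode is invented here, and anything above those seven bits is simply reported as a leftover rather than interpreted.

```python
# Low-order interface flag bits as defined in BSD's <net/if.h>.
COMMON_FLAGS = {
    0x01: "UP",
    0x02: "BROADCAST",
    0x04: "DEBUG",
    0x08: "LOOPBACK",
    0x10: "POINTOPOINT",
    0x20: "NOTRAILERS",
    0x40: "RUNNING",
}
COMMON_MASK = 0x7f   # all of the bits listed above

def decode(flags):
    """Split flags into (names of common bits, leftover system-specific bits)."""
    names = [name for bit, name in sorted(COMMON_FLAGS.items()) if flags & bit]
    return names, flags & ~COMMON_MASK

# Ethernet interfaces from the three ifconfig listings above.
for label, value in (("SunOS le0", 0x63), ("FreeBSD ed2", 0x8843), ("IRIX ec0", 0xc63)):
    names, leftover = decode(value)
    print(label, ",".join(names), "leftover", hex(leftover))
```

The common bits decode to the names that each ifconfig prints. The leftovers are where the systems part company: every FreeBSD interface flagged MULTICAST has 0x8000 set, while the two IRIX interfaces flagged MULTICAST share only 0x800, so that flag evidently sits in a different bit on the two systems.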

How about the routing tables? Here they are:

SunOS 4.1.3:

$ netstat -r
Routing tables
Destination          Gateway              Flags    Refcnt Use        Interface
loopback             loopback             UH       1      1390       lo0
192.109.197.0        iskra                U        9      2872       le0

FreeBSD 4.0:

$ netstat -r
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use     Netif Expire
default            Cont0.way3.Adelaid UGSc       19    37438     ppp0
localhost          localhost          UH         11     7508      lo0
Cont0.way3.Adelaid freebie            UH         15        0     ppp0
widecast           ff:ff:ff:ff:ff:ff  UHLWb       0       19      ed2 =>
widecast           link#1             UC          0        0      ed2
allegro            0:0:c0:44:a5:68    UHLW        9 27709597      ed2    662
freebie            0:80:48:e6:a0:61   UHLW       12  1005213      lo0
yana               0:0:b4:33:6d:a2    UHLW        4 106587923      ed2    333
iskra              8:0:20:e:2c:98     UHLW        3     1794      ed2    531
raptor             8:0:69:1:7:7       UHLW        5     2136      ed2    363
panic              0:a0:24:37:c:bd    UHLW       12  6827834      ed2    998
papillon           link#1             UHLW        1 71551886      ed2
broadcast          ff:ff:ff:ff:ff:ff  UHLWb       2    84577      ed2

IRIX 5.3:

$ netstat -r
Routing tables
Destination      Gateway            Flags    MTU    RTT RTTvar    Use Interface
localhost.lemis. localhost.lemis.co UH         0      0      0      0  lo0
raptor.lemis.com localhost.lemis.co UH         0      0      0     62  lo0
default          freebie.lemis.com  UG         0      0      0      0  ec0
192.109.197      raptor.lemis.com   U          0      0      0   2002  ec0
BASE-ADDRESS.MCA raptor.lemis.com   U          0      0      0      0  ec0

The biggest difference here is FreeBSD, not IRIX. IRIX has a number of different fields, and it shows a path to itself (raptor.lemis.com) via the local host, and the names are fully qualified. This latter may be a matter of configuration.

The difference between the SunOS routing table and the FreeBSD routing table is a question of evolution. FreeBSD includes information on link-level routing (flagged with L in the display). In this example, it's all Ethernet (interface ed2), and the Gateway value is the Ethernet address. In the case of system papillon, arp has timed out, so it shows link#1 instead. The next time a packet goes to or from papillon, this value will be replaced by the Ethernet address. broadcast and widecast are broadcast addresses (widecast is address 0, required by some brain damage in SunOS 4). As a result, the Ethernet address is the Ethernet broadcast address.
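
Picking the link-level entries out of the FreeBSD table is a matter of looking for L in the Flags column. Here is a minimal Python sketch over a few rows copied from the listing above; the function name link_level_routes is invented for illustration, and the sketch assumes the first three whitespace-separated fields are destination, gateway and flags, as they are in this output.

```python
# A few rows copied from the FreeBSD routing table above.
ROUTES = """\
default            Cont0.way3.Adelaid UGSc       19    37438     ppp0
localhost          localhost          UH         11     7508      lo0
allegro            0:0:c0:44:a5:68    UHLW        9 27709597      ed2    662
papillon           link#1             UHLW        1 71551886      ed2
broadcast          ff:ff:ff:ff:ff:ff  UHLWb       2    84577      ed2
"""

def link_level_routes(table):
    """Yield (destination, gateway, resolved) for routes flagged with L."""
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        destination, gateway, flags = fields[:3]
        if "L" in flags:
            # An unresolved entry shows link#N instead of an Ethernet address.
            yield destination, gateway, not gateway.startswith("link#")

for destination, gateway, resolved in link_level_routes(ROUTES):
    print(destination, gateway, "resolved" if resolved else "ARP entry timed out")
```

Of the sample rows, allegro and broadcast come out as resolved and papillon as timed out; default and localhost are skipped because they carry no L flag.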

Those fully qualified IRIX names

I was puzzled by the fully qualified names from IRIX's netstat, so I checked with the hostname utility. Here are the results:

SunOS	iskra
IRIX	raptor
FreeBSD	freebie.lemis.com

Well, there was a difference, anyway. Let's see what IRIX's man page on hostname says:

hostname(5)                                                        hostname(5)

NAME
     hostname - hostname resolution description

DESCRIPTION
     Hostnames are domains, where a domain is a hierarchical, dot-separated
     list of subdomains; for example, the machine monet, in the Berkeley
     subdomain of the EDU subdomain of the Internet would be represented as

     monet.Berkeley.EDU

     (with no trailing dot).

Does this look like System V? What does it look like in FreeBSD?

HOSTNAME(7)        FreeBSD Miscellaneous Information Manual        HOSTNAME(7)

NAME
     hostname - host name resolution description

DESCRIPTION
     Hostnames are domains, where a domain is a hierarchical, dot-separated
     list of subdomains; for example, the machine monet, in the Berkeley sub-
     domain of the EDU subdomain of the Internet would be represented as

           monet.Berkeley.EDU

     (with no trailing dot).

In other words, nothing has changed except for the format and the manual section. Even the name of the system shouts ``BSD''. Interestingly, I have other man pages, from the Lachman implementation of System V STREAMS TCP/IP, which have changed the name: monet.Berkeley.EDU has become laiter.Lachman.COM. The rest of the text appears identical.

Anyway, I tried this on the IRIX machine:

$ hostname -s raptor.lemis.com
$ netstat -r
Routing tables
Destination      Gateway            Flags    MTU    RTT RTTvar    Use Interface
localhost        localhost          UH         0      0      0      0  lo0
raptor           localhost          UH         0      0      0     62  lo0
default          freebie            UG         0      0      0      0  ec0
192.109.197      raptor             U          0      0      0   2903  ec0
BASE-ADDRESS.MCA raptor             U          0      0      0      0  ec0

So yes, the fully qualified name is due to the host name. I can see the same effect in FreeBSD, but interestingly, not in SunOS.
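
One explanation consistent with both listings is that netstat works out the local domain from the host name and strips it from the names it prints. The Python sketch below models that behaviour; it is a guess at the logic, not taken from any netstat source, and the function name display_name is invented here.

```python
def display_name(name, local_hostname):
    """Strip the local domain from name, if local_hostname reveals one."""
    _, dot, domain = local_hostname.partition(".")
    if dot and name.endswith("." + domain):
        return name[: -len(domain) - 1]
    return name

# Host name "raptor": no domain is known, so nothing is stripped.
print(display_name("freebie.lemis.com", "raptor"))            # freebie.lemis.com
# Host name "raptor.lemis.com": the domain lemis.com is stripped.
print(display_name("freebie.lemis.com", "raptor.lemis.com"))  # freebie
```

With the short host name there is no domain to compare against, which would explain why IRIX originally printed every name in full.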

So what?

In this article, I've taken a quick look through three different UNIX systems. Oh, sorry, two UNIX systems and one BSD system. It's only one look through, and it's not supposed to be representative. But it does show that the differences aren't as big as they appear to be--at least, not for most users. There are bigger differences under the hood, but we'll look at them some other time.