History of UNIX Manpages

An appendix of Practical UNIX Manpages

Where do UNIX manpages come from? Who introduced the section-based layout of NAME, SYNOPSIS, and so on? And for manpage authors: where were those economical two- and three-letter instructions developed?

The many accounts available on the Internet lack citations and are at times inconsistent. In this article, I reconstruct the history of the UNIX manpage based on source code, manuals, and first-hand accounts.

Special thanks to Paul Pierce for his CTSS source archive; Bernard Nivelet for the Multics Internet Server; the UNIX Heritage Society for their research UNIX source reconstruction; Gunnar Ritter for the Heirloom Project sources; Alcatel-Lucent Bell Labs for the Plan 9 sources; BitSavers for their historical archive; and last but not least, Rudd Canaday, James Clark, Brian Kernighan, Douglas McIlroy, Nils-Peter Nelson, Jerome Saltzer, Henry Spencer, Ken Thompson, and Tom Van Vleck for their valuable contributions.

Please see the Copyright section if you plan on reproducing parts of this work.

Timeline

The development of UNIX manpages can be divided into the Prehistory, before UNIX; the Classical Age, during the development of UNIX; and the Renaissance, when traditional UNIX utilities were re-written. In this chart, I show all known formatters of manpages (and their logical precursors before manpages existed as such).
Timeline of UNIX manpage utilities

Prehistory

1964: RUNOFF (Jerome H. Saltzer)

Saltzer wrote the RUNOFF utility for MIT's IBM 7094 CTSS operating system in the MAD computer language. Its legacy is considerable: not only do contemporary manpages inherit from RUNOFF, many of them, in fact, use instructions identical to those specified in the original RUNOFF manual.

Input generally consists of English text, 360 or fewer characters to a line. Control words must begin a new line, and they begin with a period so that they may be distinguished from other text. RUNOFF does not print the control words.

Of the many abbreviated RUNOFF control words, several, such as sp and br, are still commonplace. According to Saltzer and the source literature, the syntax of RUNOFF inherits loosely from the prior DITTO, MEMO, and MODIFY utilities by M. J. Leslie Lowry, Fernando J. Corbató, and J. Richard Steinberg, 1963. The original purpose of RUNOFF was to format Saltzer's doctoral thesis proposal.
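The control-word convention quoted above (a leading period marks a line as an instruction, which is acted upon but never printed) can be illustrated with a minimal sketch. This is not Saltzer's code: it is written in Python rather than MAD, the function name render and the default width are hypothetical, and only br (break the current line) and sp (insert blank lines) are handled, with simplified semantics assumed for both.

```python
import textwrap


def render(source, width=60):
    """Fill text lines and obey a tiny, illustrative subset of control words."""
    out = []      # finished output lines
    pending = []  # words waiting to be filled into output lines

    def flush():
        # A "break": emit whatever words are pending as filled lines.
        if pending:
            out.extend(textwrap.wrap(" ".join(pending), width))
            pending.clear()

    for line in source.splitlines():
        if line.startswith("."):
            # Control line: acted upon, never copied to the output.
            parts = line[1:].split()
            word = parts[0] if parts else ""
            if word == "br":
                flush()
            elif word == "sp":
                flush()
                count = int(parts[1]) if len(parts) > 1 else 1
                out.extend([""] * count)
            # Any other control word is silently ignored in this sketch.
        else:
            pending.extend(line.split())
    flush()
    return out


if __name__ == "__main__":
    sample = "Hello\nworld.\n.br\nNext line\n.sp 2\nEnd."
    print("\n".join(render(sample)))
```

The point of the sketch is only the dispatch on the leading period; a real formatter also tracks margins, adjustment, page length, and many more control words.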

Sources:

1966: RUNOFF (Rudd Canaday) (UNCERTAIN)

While working at AT&T Bell Labs' Whippany centre, Canaday led a porting effort of the CTSS at MIT to the GE-635 (in 635 assembly). The RUNOFF utility is suspected to be part of this port. The ported CTSS was originally intended as a prototype (to be replaced by Nike hardware), but ended up being used for five more years. No sources could be located for this CTSS port.

Sources:

  • Rudd Canaday, the Old Days. 27 October, 2011. Email to Kristaps Dzonsons and Tom Van Vleck.
  • Rudd Canaday, the Old Days. 24 October, 2011. Email to Kristaps Dzonsons and Tom Van Vleck.

1967: roff (Robert Morris) (UNCERTAIN)

Little is known about this speculated port of RUNOFF, only that it was called roff and probably ran on Bell Labs' GCOS-II GE-635. Morris is commonly cited as participating in this port, and this is not disputed. Canaday is also mentioned as an author, though this is not the case by his own account. Both McIlroy and Saltzer speculate that this version, if written, was likely in BCPL, much like runoff.

Sources:

1967: SCRIPT (Stuart Madnick)

In 1967, while at IBM, Madnick ported the RUNOFF code to the IBM CP67/CMS as SCRIPT. The documentation of SCRIPT explicitly mentions the backspace-encoding convention used to this day by manpage formatters on UNIX terminals (of course, this was common practise on mechanical typewriters before then):

Thus the backspace key allows underscoring and overprinting at the terminal for SCRIPT files. The logical backspace character prints only when entered and does not take up a column in the record; it logically backspaces one column...
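To make the backspace convention concrete, here is a minimal Python sketch of the overstrike encoding as it is commonly seen in terminal output from nroff-style formatters: an underscore, a backspace, and the character for underlining, and the character struck twice for bold. The function names are hypothetical, and the sketch illustrates the later UNIX terminal convention rather than anything taken from the SCRIPT documentation.

```python
BACKSPACE = "\b"


def underline(text):
    # Underscore, backspace, then the character itself.
    return "".join("_" + BACKSPACE + ch for ch in text)


def bold(text):
    # The same character struck twice, with a backspace between.
    return "".join(ch + BACKSPACE + ch for ch in text)


def strip_overstrike(text):
    # Drop each backspace together with the character it overprints,
    # leaving only the last character struck in each column.
    out = []
    for ch in text:
        if ch == BACKSPACE:
            if out:
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    encoded = bold("NAME") + " " + underline("file")
    print(repr(encoded))
    print(strip_overstrike(encoded))
```

A printing terminal renders the overstruck columns directly; a pager or filter must interpret or strip the sequences, which is what the last function approximates.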

Source code for the original re-write of RUNOFF could not be located, although a considerable amount of documentation exists for this utility. The dates of Madnick's porting derive from his publication and independent accounts.

Sources:

1969: runoff (Douglas McIlroy)

In 1969, McIlroy released an influential BCPL port of RUNOFF to extend the runoff model to the GECOS GE-645 computer at AT&T Bell Labs, Murray Hill. In writing runoff, he did not refer to the CTSS RUNOFF source code, nor to any other speculated derivative of Saltzer's utility. The progress of this utility after 1969 is recorded in the Multics BCPL source as ported by R. F. Mabee:

The first ROFF for Multics was written in March, 1969, by Doug McIlroy of Bell Labs. Art Evans made extensive modifications to it in May and June, 1969, adding many comments and making various changes. Footnoting added by Dennis Capps in 1970. Maintained by Harwell Thrasher in 1971. Many new features added and bugs fixed by R Mabee in 1971-1972. RUNOFF and BCPL were brought over to the 6180 Multics (from 645) in May of 1973 by R F Mabee.

McIlroy's port is at times referred to as runoff, at times as roff. The reason for renaming is not entirely clear:

I see that I called it runoff in 1969. By 1971 it was roff. Now I'm not so sure I got the name from Morris. Conceivably it came from Thompson, who was big on shortening names. My 1971 description came in January; the first ediion [sic] Unix manual is dated December 1971.

Sources:

1974: compose (Dennis Capps)

The compose utility, né runoff, was a port of RUNOFF to PL/1 by Capps for the Multics Honeywell 6180. It was later tuned and improved by Ed Wallman. Tom Van Vleck, Wallman's manager in mid-1978, recounts the porting effort to produce photo-typeset manuals and eliminate dependence on BCPL. (Note: the primary source erroneously refers to Ossanna as writing the BCPL runoff; it was in fact McIlroy.)

Sources:

Classical Age

1969: roff (Brian Kernighan)

After being exposed to RUNOFF while at MIT in summer 1966, Kernighan wrote a port in Fortran (for an IBM System/360) while working on his doctoral thesis at Princeton (to format the thesis). By his account, the system was used for five more years by the student agency. The punch-card source has long since been lost.

Sources:

1970: rf (Ken Thompson)

Thompson wrote a PDP-7 port, derived either from the BCPL runoff or directly from the CTSS RUNOFF. This fork was an evolutionary dead end, replaced by the PDP-11 roff(1), which was written at around the same time. The source code for this port has long since been lost.

Sources:

1971: roff(1) (Dennis Ritchie)

Most of the programming team for Multics continued working with UNIX at AT&T Bell Labs, Murray Hill, so it's no surprise that Multics runoff was incorporated as UNIX roff(1) in Version 1 AT&T UNIX, 1971. This was a PDP-11 assembly-language port of the BCPL runoff. According to McIlroy, Ossanna convinced the AT&T patent department to use UNIX and roff(1) for formatting patent applications, and according to Thompson, that usage was the justification for the PDP-11 purchase.

There are many differing accounts of who wrote this version, but most settle on Ossanna, Ritchie, and Thompson. However, the first-hand accounts of McIlroy, Kernighan, and Thompson settle on Ritchie as the primary author. This is corroborated by Ritchie's expertise in BCPL, as roff(1) was based upon McIlroy's BCPL runoff.

Unfortunately, the first roff(1) version is lost. Some sources from Version 1 AT&T UNIX have been reconstructed from old tapes, but roff(1) was not among them. All that remains from this time are manual entries and first-hand accounts. However, the PDP-11 source for Version 5 AT&T UNIX still exists, and by first-hand accounts, is a modified form of the original.

Version 1 AT&T UNIX roff(1) is also notable, regarding manpages, for its release of the First Edition UNIX Programmer's Manual, which defines the manpage structure and layout enjoyed in the present day. Thompson conceived of this convention, inspired by the Multics MSPM, itself inspired by the CTSS manuals.

The source code of Version 1 and 2 AT&T UNIX manual pages may not have survived; only typeset versions are known.

Sources:

1972: nroff(1) (Joseph F. Ossanna)

Ossanna took over the PDP-11 roff(1) and built nroff(1), which focussed on outputting text onto terminals, for Version 2 AT&T UNIX. The exact motivations for this are unknown, but the loose consensus is that he wanted to evolve the utility and language to support more advanced formatting.

Unfortunately, a source record of the original effort is entirely lost; however, later versions of the PDP-11 source (for Version 6 AT&T UNIX) and manual page (Version 3 AT&T UNIX) are available.

Thompson mentions that nroff(1) introduced the notion of programmable macros. This is corroborated by the Version 6 AT&T UNIX source for nroff(1), where files under the prefix /usr/lib/tmac. are parsed for macros (see nroff.8 in the Version 6 AT&T archive). This same behaviour is not reproduced in the roff(1) sources from the same time, so it makes sense that this is the first appearance of macros.

On the other hand, the manual pages for Version 3 AT&T UNIX (Feb. 1973) were still written in raw roff without using any macro set.

Sources:

1973: troff(1) (Joseph F. Ossanna)

Ossanna started writing troff(1) from his PDP-11 nroff(1) sources for Version 4 AT&T UNIX. It's widely asserted that the driving motivation was to create output for the CAT phototypesetter.

Original sources for this version of AT&T UNIX are also lost; however, the manual still exists, and records of the PDP-11 source exist in raw tape data for Version 6 AT&T UNIX (which apparently couldn't be reconstructed).

From Version 4 AT&T UNIX (Nov. 1973) to Version 6 AT&T UNIX (May 1975), a simple macro set was used for writing manual pages, but the macros were completely incompatible with macro sets used today.

Sources:

1979: ditroff(1) (Joseph F. Ossanna, Brian Kernighan)

Around 1975, Ossanna re-wrote troff(1) in C. However, in 1977 this work was discontinued and the sources were untouched for nearly two years until resumed by Kernighan:

After Joe Ossanna died in late 1977, troff was static for probably close to two years, since no one had the time and courage to touch it. It was entirely in C at that point, somewhat over 9000 lines as I remember. I simply modified it, gradually getting around some of the limitations on fonts, making use of dynamic memory, and generating "device-independent" output for devices whose properties were specified in dynamically loaded files.

The result is the first intact UNIX troff(1) source. By Version 7 AT&T UNIX, both nroff(1) and troff(1) were built from the same C sources, separated by preprocessor conditionals, while roff(1) was still built from its PDP-11 assembly. The origin of the ditroff(1) name is unknown. Kernighan writes:

I'm pretty sure that I only talked about a "device independent troff"; the name "ditroff" came from somewhere else, and I've never been fond of it.

Kernighan's device-independent troff(1) was repackaged in commercial AT&T (and derivative) UNIX systems for years to come. Circa 1978, the DWB featured troff(1) as its mainstay; later, the WWB bolted on many additional word-processing utilities. These applications were repackaged in 1989 by USL, a subsidiary of AT&T Bell Laboratories. The DWB tools were later bought by SoftQuad, which rebranded troff(1) as sqtroff(1).

When Douglas McIlroy edited volume 1 of the manual pages for Version 7 AT&T UNIX, he revised the manual page macros substantially, first designing and implementing most of the macros that are still used in the man(7) language today.

Sources:

1991: troff(1) (Brian Kernighan)

The Plan 9 operating system, initially released in 1991 from AT&T Bell Labs, Murray Hill, is a research system extending the UNIX model. It included ditroff(1), which was steadily improved by Kernighan to include features such as UTF encoding. The troff(1) for Plan 9 was not free software until the Third Edition in June, 2000, when sources were licensed under the Plan 9 license. It was again re-licenced in 2002 under the Lucent Public Licence 1.02.

Sources:

2005: troff(1) (Gunnar Ritter)

In 2005, Sun Microsystems published a CDDL-licenced variant of their Solaris operating system called OpenSolaris. This included a re-licenced descendant of troff(1) as imported into AT&T UNIX System V in 1983. Ritter incorporated this software into the Heirloom project in August, 2005.

Renaissance

1989: groff(1) (James Clark)

GNU troff is popularly considered to be the most widespread troff in modern UNIX installations. It's bundled by default on most GNU/Linux operating systems. This port, renamed groff, was written by James Clark in 1989 specifically for the GNU project. At that time, no open-source implementation existed.

The groff port was based purely on troff documentation as shipped with SunOS 4.1.4, and was written in C++ on a Sun 4/110 (initial implementations of the GNU C++ compiler were on Sun, making it an ideal starting point).

Sources:

1991: awf(1) (Henry Spencer)

In 1991, Spencer sent a terse mail to USENET comp.sources.unix indicating that he'd written an interpreter in the AWK language for the man and ms macro packages documented in Version 7 AT&T UNIX nroff(1) and ditroff(1).

This is awf, the Amazingly Workable Formatter – a "nroff -man" or (subset) "nroff -ms" clone written entirely in (old) awk.

According to Spencer, awf was necessary for the portability of C News, a USENET news server written by Spencer and Geoff Collyer in 1987. The project wished to distribute manpage sources but, since ditroff(1) sources were still license-encumbered, needed to guarantee portability to unlicensed UNIX systems.

Sources:

1991: cawf(1) (Vic Abell)

Abell ported awf(1) to C in 1991 while at Purdue. The motivation for this port was to "[produce] a C language version that would run on small systems, particularly MS-DOS ones."

2008: mandoc(1) (Kristaps Dzonsons, Ingo Schwarze)

Like awf, mandoc(1) primarily reads ditroff(1) macro files (not general ditroff(1) input), although it has some capability for generalised input. It is the first fully semantic parser for manpages, exposing the annotated content of parsed documents. groff(1) manuals were the predominant basis for this port.

Disclaimer: the author of this document originally wrote mandoc(1).

Sources:

People

Abell, Vic

Wrote cawf while at Purdue University. Homepage: people.freebsd.org/~abe/.

Canaday, Rudd

May have ported RUNOFF to the GE-635 while at AT&T Bell Labs, Whippany.

Capps, Dennis

Wrote compose while at Honeywell.

Clark, James

Wrote groff(1) for the GNU project, groff.ffii.org. Homepage: www.jclark.com.

Dzonsons, Kristaps

Wrote mandoc(1), mdocml.bsd.lv. Homepage: kristaps.bsd.lv.

Kernighan, Brian

Completed ditroff(1) while at AT&T Bell Labs, Murray Hill. Homepage: www.cs.bell-labs.com/who/bwk/.

Madnick, Stuart

While at IBM, ported RUNOFF to the IBM CP67/CMS and renamed it SCRIPT. Homepage: web.mit.edu/smadnick/www/home.html.

McIlroy, Douglas

Ported RUNOFF to GECOS GE-645 while at AT&T Bell Labs, Murray Hill. Homepage: www.cs.dartmouth.edu/~doug.

Morris, Robert

Possibly ported RUNOFF to a GECOS GE-635.

Ossanna, Joseph F.

Worked on the PDP-11 assembly-language nroff(1) and troff(1) (also converted the latter to C) while working at AT&T Bell Labs, Murray Hill.

Ritchie, Dennis

Wrote the PDP-11 assembly-language roff(1) while working at AT&T Bell Labs, Murray Hill. Homepage: cm.bell-labs.com/who/dmr.

Ritter, Gunnar

Maintains the Heirloom Project, including their re-licenced troff(1).

Saltzer, Jerome H.

Wrote RUNOFF while working as a student at MIT. Homepage: mit.edu/saltzer.

Schwarze, Ingo

Took over mandoc(1) from Kristaps Dzonsons in 2013. Current project lead.

Spencer, Henry

Wrote the awf formatter for the man and ms macro sets. Homepage: www.hq.nasa.gov/alsj/henry.html.

Thompson, Ken

Contributed to roff(1), nroff(1), and troff(1) while at AT&T Bell Labs, Murray Hill. Homepage: cm.bell-labs.com/who/ken.