
World Wide Web Books

The first impression newcomers to the Internet and the World Wide Web get is that much net information feels very do-it-yourself. Is that good or bad? Some argue that all this volunteer information erodes the difference between the work of casual amateurs and the truly authoritative. Others contend that making this judgement has always been the responsibility of the reader, regardless of the medium, and the explosion of content on the Internet just makes more voices available to choose among. The Internet's detractors seem to be in the minority (although that doesn't diminish their arguments), and they can't stop Web serving from being a ton of fun. So we find many books on the market these days to help us do-it-yourself publishers. I recently had the pleasure of working with three.

Of these books, Morris's HTML for Fun and Profit is the only one that gives evidence of having been rushed to press (and it still missed its intended publication date by several weeks). The book has many distracting typos as well as a few truly embarrassing errors---for example, it's regrettable that Garrison Keillor has not read HTML for Fun and Profit, because it alleges that gopher originated at the University of Michigan. He should be able to get a good monologue out of that, on the topic of "Yet More Indignities Minnesota Suffers---Stoically, of Course."

HTML for Fun and Profit has the feel of a reference manual, with all the pros and cons that might suggest. It has many tables, and you probably won't come away from the book with the feeling that you're missing raw data.... Most of what Web authors and managers use most of the time is covered, with the exception of post-HTML 2.0 extensions. On the other hand, the prose is wooden, which makes for slow reading, and the onslaught of detail after detail creates a trees-obscuring-the-forest effect.

Morris's book is the only one of the three that comes with a CD-ROM, which in this case offers 500MB of the same stuff over and over again. The disc's top-level directory structure offers "mac", "WINNT", "sol1", "sol2", and disappointingly slim "docs" and "src" directories. The directories named after architectures ("sol1" and "sol2" are SunOS and Solaris 2, respectively) contain binaries for the platforms and all the examples from the book, mostly identical among the various architectures. Can ISO 9660 CD-ROMs have hard links? This one doesn't use any, and storing the duplicated files only once might have made some room for the things we'd like to see on a Web CD-ROM, like maybe giftrans, which the text itself spends more than one page documenting.

So what's here for Linux users? Well, all the text examples are here, though most are pretty trivial; source code for the CERN and NCSA servers is here; and there's source for Perl 4. No selection of Web browsers, no pbmplus, no xv, no giftrans. I was disappointed with the CD-ROM, as well as the book as a whole. Neither really connects the reader with the day-to-day experience of running a Web site.

By contrast, Build a Web Site is pervaded by experience. You sense on every page that you are being addressed by people who have worked long and hard with what they are discussing. For example, all the books mention that it is a faux pas to use "click here" as a hyperlink, but only this book bothers to give credible reasons why. Only in Build a Web Site will you find the magic incantation telnet hostname 80, the canonical way to test whether your Web server is happy. And only Build a Web Site explains why you should care about your server logs.
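The telnet check amounts to opening a TCP connection to port 80, typing a request by hand, and seeing whether the server answers. A rough Python sketch of the same idea follows. The function name check_web_server, the choice of a HEAD request, and the timeout value are illustrative assumptions of mine, not anything taken from the book.

```python
import socket


def check_web_server(host, port=80, timeout=5.0):
    """Return the first line of the server's reply, or "" if it sent nothing."""
    # A bare HTTP/1.0 HEAD request, much as one might type into a telnet session.
    request = "HEAD / HTTP/1.0\r\nHost: {}\r\n\r\n".format(host).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        reply = b""
        # Read until the first line is complete or the server closes the connection.
        while b"\n" not in reply:
            chunk = conn.recv(1024)
            if not chunk:
                break
            reply += chunk
    lines = reply.splitlines()
    return lines[0].decode("latin-1") if lines else ""
```

Calling check_web_server with your own server's hostname should return an HTTP status line if the server is answering; a refused connection or a timeout raises an exception instead, which is itself a useful diagnosis.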

Build a Web Site includes the specifications for HTTP, HTML, and URLs, which the cynical might assume were included so that the book would take up more shelf space than its competitors at Bookstar. But the authors have taken pains to insert cross-references into the specs, a nice gesture, and I found myself using the specs regularly.

One odd omission from Build a Web Site is server-parsed HTML, a topic covered very nicely in The HTML Sourcebook. The Sourcebook's focus is narrower than the others'; its intent is primarily to describe the world of Web authoring. It contains a section on Web servers, many of which are non-Unix, but the presentation is broad rather than deep. Interestingly enough, the Sourcebook includes a simple C program that pretends to be a Web server and displays what a client is sending. The program feels out of place, but it's useful nevertheless.
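To give a feel for what such a pretend server does, here is a small sketch of the idea in Python. It is not the Sourcebook's C program; the function names open_listener and show_one_request and the default port 8080 are hypothetical choices for illustration. It accepts a single connection, reads the request headers, and echoes them back as plain text.

```python
import socket


def open_listener(host="127.0.0.1", port=8080):
    """Create a listening TCP socket; host and port are placeholder defaults."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def show_one_request(listener):
    """Accept one client, echo its request headers back, and return them."""
    conn, _addr = listener.accept()
    with conn:
        data = b""
        # The request headers end at the first blank line.
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        # Reply with a minimal response whose body is the request itself.
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n" + data)
    return data.decode("latin-1")
```

Running print(show_one_request(open_listener())) and then pointing a browser at port 8080 on the same machine should show the request line and headers that browser sends.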

The Sourcebook also contains voluminous lists of tools, all with URLs for getting them. Since any paper guide to the Internet is going to be out of date the moment it hits the stands, these locations must be taken with a grain of salt, but you can at least use the locations as hints for archie use. Perhaps knowing this, the author included a section on how to use archie---although he assumes that you will be willing and able to install an archie client rather than telnetting to an archie server directly.

The HTML Sourcebook is the only book of the three to emphasize that the Web's native character set is ISO Latin-1, not ASCII. Build a Web Site ignores the issue (except in the included HTML spec), and HTML for Fun and Profit makes a muddle of the whole issue. The HTML Sourcebook is also the only one that mentions Linux.
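The practical difference between the two character sets shows up as soon as a page contains an accented letter. The short Python sketch below is my own illustration, with a made-up helper name fits_in and a sample word chosen for the purpose: ISO Latin-1 can represent the text in one byte per character, while ASCII cannot represent it at all.

```python
def fits_in(text, encoding):
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "café"  # sample word chosen for illustration

# The accented letter has no ASCII code, but Latin-1 assigns it byte 0xE9.
in_ascii = fits_in(sample, "ascii")
in_latin1 = fits_in(sample, "latin-1")
latin1_bytes = sample.encode("latin-1")
```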

Another strength of the Sourcebook is that it demonstrates by example the importance of previewing your Web content with a range of browsers, and the book includes the archive locations of many. Web authors often forget to do this test before releasing their material, so it's nice to see the word getting out; I wish Graham had been even more explicit. In fact, my favorite single page in any of these books is in HTML for Fun and Profit: it shows a Web home page whose authors neglected the possibility that visitors might have chosen to delay the loading of inline images. It's a train wreck, of course, and all the more satisfying because the page belongs to Sun.

Of these books, I recommend Build a Web Site for Linux audiences, possibly supplemented by The HTML Sourcebook for finding resources. Perhaps HTML for Fun and Profit will be improved in future editions. I also recommend that anyone contemplating Web work learn Perl, perhaps before reading these books.

Brian Rice (rice@kcomputing.com) is a Member of Technical Staff with K Computing, a nationwide Unix and Internet training firm.
