Bits and Pieces - SGML and CALS - 12/1993

In my wanderings lately, I have picked up some interesting items I'd like to bring to your attention in this issue. We'll cover SGML and other developments in this new electronic environment for technical information.

First, I was recently fortunate to attend a talk by Dr. Goldfarb, the "father" of SGML, presented to a meeting of the Northern California CALS Interest Group. Dr. Goldfarb reviewed his contributions to information technology and gave some insight into his view of current commercial developments in the field.

Much to my surprise, Dr. Goldfarb is a lawyer whose ideas originate from work done on a law office application for processing information. His mission, as it were, is to create electronic information in a format that is free from proprietary control by anyone other than the information's creator. If you have ever had to change the word processing software you use, and needed to translate legacy documentation without a software program to do it for you, you understand what I mean by proprietary control.

His first efforts were for IBM from 1969 to 1978, and resulted in GML (Generalized Markup Language), owned by IBM. IBM then sponsored Dr. Goldfarb's work with the International Organization for Standardization to create ISO 8879, SGML (Standard Generalized Markup Language), adopted in 1986.

SGML is an open system of tagging, where the tags define the elements of the document, not the format of the document (that is, tags define chapter headings, for example, not indents, italics, and bold fonts). All elements of style (fonts, etc.) are defined in a separate section. Dr. Goldfarb, having completed his concept of SGML, was interested in pushing the idea still further. He began work on tagging music in SGML to see if he could deal with the issue of time. Time is, of course, the underlying structural addition that makes multimedia possible, and music has already developed a system of notation that allows for relative time.
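To make the separation of element tags from style concrete, here is a minimal sketch in Python. It is an illustration only, not anything shown in Dr. Goldfarb's talk. Well-formed XML stands in for SGML because Python's standard library has no SGML parser, and the element names (manual, chapter, title, para, warning) and both style tables are hypothetical.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: each tag says what the text IS, not how it should look.
# Assumption: XML is used as a stand-in for SGML; element names are made up.
DOCUMENT = """
<manual>
  <chapter>
    <title>Engine Removal</title>
    <para>Disconnect the battery before starting.</para>
    <warning>Support the engine before removing the mounts.</warning>
  </chapter>
</manual>
"""

# Style is kept in separate tables keyed by element name (both hypothetical).
PRINT_STYLE = {
    "title": lambda text: text.upper(),
    "para": lambda text: "    " + text,
    "warning": lambda text: "*** " + text + " ***",
}

SCREEN_STYLE = {
    "title": lambda text: "== " + text + " ==",
    "para": lambda text: text,
    "warning": lambda text: "WARNING: " + text,
}


def render(source, style):
    """Apply a style table to a tagged document and return the output lines."""
    root = ET.fromstring(source)
    lines = []
    for element in root.iter():
        formatter = style.get(element.tag)
        # Container elements (manual, chapter) have no style entry and are skipped.
        if formatter is not None and element.text:
            lines.append(formatter(element.text.strip()))
    return lines


print_lines = render(DOCUMENT, PRINT_STYLE)
screen_lines = render(DOCUMENT, SCREEN_STYLE)
```

The same tagged source yields two different presentations, and changing the look of every warning means editing one style entry rather than the document itself.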

This work culminated in the development of HyTime (ISO/IEC 10744) in 1992, an SGML application that sets standards for hyperlinks to any information object (SGML or not), has real-time capability (necessary in video, for example), and provides for the isomorphic representation of time and space.

Dr. Goldfarb notes that his tagging schema can save companies tremendous sums of money by preserving their investment in information. For example, IBM recently took 11 million master pages of its technical manuals and distributed them on CD-ROM and on-line. Some of the information was 15 to 20 years old, tagged in the original GML format. Rework of the information amounted to only 20% of the whole, and Dr. Goldfarb estimated that with SGML the rework would have been cut in half again.

Dr. Goldfarb also affirmed that SGML is moving mainstream, with SGML products available or in progress from WordPerfect, Lotus AmiPro, Interleaf, Frame, and Microsoft. He suggested that buyers beware when it comes to SGML. The point of SGML is to have information that is free from proprietary word processing tagging systems. However, there is always an urge on the part of large organizations to optimize and specialize their products. In doing so, they may introduce elements into their SGML products that move them away from the open standard and back into the area of proprietary systems.

At the same meeting, two demonstrations were given, one of Boeing's REDARS system and United's EMSYS implementation, and the other of a prototype IETM (Interactive Electronic Technical Manual) developed by Ford. These two projects are of interest because they point out, once again, the leadership of the aerospace and automotive industries in maximizing the utility of information within the electronic environment.

The REDARS system is simply a computerized library of drawings that were previously stored on aperture cards. Cindy Bjornstad, the presenter from Boeing, noted that the first prototype of the system, released in 1987, was not well used, but did provide users with a chance to critique the product. Currently, Boeing is in the third phase, with approximately 170,000 views and 120,000 prints being made from the database daily. Within Boeing, access is directly from the user's desktop (i.e., workstation).

Of particular interest is that Boeing is not only using this information internally, but is also starting to supply it to users of its aircraft, as with the United EMSYS project. The goal of EMSYS is to integrate electronic information from manuals and from maintenance to improve the following areas: technical information management, planning and control, materials management, and cost and support. The external electronic information to be used in aircraft maintenance consists of both regulatory information and manufacturers' information (manuals). For example, at United, 25% of the job cards in the maintenance area have drawings attached.

For your information, the aerospace industry has adopted the ATA-100, the SGML standard for this industry.

Additionally, Ford presented a prototype Interactive Electronic Technical Manual. Of special interest was the ability of the system to update a drawing utilized within a number of manuals automatically upon updating the original. Ford is also working on robust drawings that can be rotated or varied in size depending on the needs of each publishing usage (manual, parts list, etc.). Due to the competitive needs of the automotive industry, we can expect more leadership in this area from this sector. The industry SGML standard is published by the Society of Automotive Engineers, SAE-J2008, which should be available soon.
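The automatic drawing update can be pictured as storing each drawing once and having every manual hold only a reference to it. The sketch below, in Python, is an assumption about how such a scheme might work in general, not a description of Ford's prototype. The drawing ID, manual names, and the publish function are all hypothetical.

```python
# One shared store holds each drawing exactly once, keyed by a hypothetical ID.
drawings = {"DRW-001": "Brake assembly, revision A"}

# Each publication stores only the drawing's ID, never a copy of the drawing.
manuals = {
    "service manual": ["Inspect the brake pads.", {"drawing": "DRW-001"}],
    "parts list": [{"drawing": "DRW-001"}, "Pad set, front."],
}


def publish(name):
    """Resolve drawing references at publish time and return the output lines."""
    output = []
    for item in manuals[name]:
        if isinstance(item, dict):
            # Look up the current version of the referenced drawing.
            output.append("[" + drawings[item["drawing"]] + "]")
        else:
            output.append(item)
    return output


before = publish("service manual")

# Update the original drawing once; no manual is edited.
drawings["DRW-001"] = "Brake assembly, revision B"

after_service = publish("service manual")
after_parts = publish("parts list")
```

Because the manuals never hold their own copy, both publications pick up revision B the next time they are published.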

IEEE presented a quick overview of their work in developing a suite of "Standards DTDs" to the ANSI SDSC (Standards and Data Services Committee) meeting in March. Rather than attempt to create one DTD (Document Type Definition) to cover all of their standards, they have created six individual ones, each for a specific section of the standard (definitions, referenced documents, etc.). IEEE is hoping to be able to accommodate both standards committees that can author in an SGML environment and those that work with word processing packages. Their goal is to define the sections so well with style sheets that the translation of word processed information into SGML format can be automated.
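The idea of automating translation through well-defined style sheets can be sketched as a lookup from word processor style names to SGML tags. The Python below is a hedged illustration under that assumption, not IEEE's actual tooling or DTDs. The style names, tag names, and the to_sgml function are hypothetical, and the sketch does not escape markup characters in the text.

```python
# Hypothetical mapping from word processor style names to SGML element names
# for one section of a standard (here, the definitions section).
STYLE_TO_TAG = {
    "Section Heading": "title",
    "Term": "term",
    "Definition Text": "def",
}


def to_sgml(section_tag, paragraphs):
    """Wrap each styled paragraph in the tag its style maps to."""
    lines = ["<" + section_tag + ">"]
    for style_name, text in paragraphs:
        tag = STYLE_TO_TAG.get(style_name)
        if tag is None:
            # An unmapped style is the case that blocks automation, so flag it.
            raise ValueError("no tag defined for style: " + style_name)
        lines.append("<" + tag + ">" + text + "</" + tag + ">")
    lines.append("</" + section_tag + ">")
    return lines


# Paragraphs as (style name, text) pairs, as a word processor might export them.
word_processed = [
    ("Section Heading", "Definitions"),
    ("Term", "markup"),
    ("Definition Text", "Text added to a document to describe its structure."),
]
tagged = to_sgml("definitions", word_processed)
```

The conversion works only if authors apply the agreed styles consistently, which is one reason a small set of choices per section makes automation more realistic.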

IEEE is definitely at the forefront in taking this stand. Most organizations are taking a wait-and-see attitude, some choosing word processing for now, and other front-runners opting for SGML with one overarching DTD to cover all situations.

IEEE is anticipating a future in which it will be an information provider, not a publisher. This is its main rationale for choosing SGML. With its information intelligently tagged, it should be able to accommodate a variety of third-party value-added products and services. Yet it will be able to serve its members and the public with the "raw" data quite easily as well.

To meet the goal without straining its resources, IEEE wants to narrow the choices available during the authoring process, to allow for ease of use. It has also analyzed how the committees actually go about the task of producing a written standard, and found that various sections of standards are handed off to different authors and then later consolidated into a single document. By limiting the choices to ones suiting each section, IEEE hopes to make the transition into an intelligently tagged authoring environment easier on its volunteers.

Although there are a number of organizations that have opted for SGML, it is not yet clear that all organizations are ready to go along. However, as Dr. Goldfarb pointed out, as SGML applications are brought to market by the big names in word-processing, tagging documents intelligently should become less exotic. Information so tagged should have a longer life-span and be more robust for a variety of uses.

Standards Developing Organizations, like IEEE, rely on volunteers to do the work of document authoring. Whatever path they take, they must be sensitive to the abilities of their membership, since these folks are the ones who will make or break the system.

Claudia Bach is President of Document Center, Inc., an information delivery service based in Belmont, CA. The company has complete collections of specifications and standards from both government sources and industry associations. She can be reached by phone at (650) 591-7600 or by fax at (650) 591-7617.

For more information on standards, see Document Center's home page. This article was originally published by the Society for Technical Communication, an association for technical writers.