This is the first draft of "standard.940", which is the short form of the 1994 Project Gutenberg standards file. You should also receive .941 and .942 later, which will be the medium and long forms. Given that this is an entirely new format, I am sure we have left a few things out, so your suggestions and corrections are of great value to us, as always.

The standard.gut files for 1994 will be provided in a new format [as are most Project Gutenberg files this year], including both a simplified procedure for those working alone [use a "program editor" to check your work] AND a few reintroduced mark-up procedures looking forward to the future, when bold, underline and italics will be a part of UNIcode or whatever ASCII superset comes over the horizon.

Here is the draft, in the simplest possible terms:


standard.940

Table of Contents   [Search for [[X]] to find that section]

[[1]] Goals
[[2]] How We Hope To Accomplish These Goals
[[3]] Current Standards For Release
[[4]] Standards For Future Etexts
[[5]] Hyphenation and Margination


standard.940


[[1]] Goals

The goal of Project Gutenberg is the creation and distribution of 10,000 Etexts to 100,000,000 people by the end of 2001, and after that a Public Domain Register that will include not just a LISTING of all the materials that are going into the Public Domain, but also the CONTENTS of some or all of them; I would hope the Public Domain Register would eventually include ALL Public Domain materials in both an Index and in Content.


[[2]] How We Hope To Accomplish These Goals

To accomplish this goal, Project Gutenberg encourages book-to-Etext conversion in a number of ways, and also maintains an extensive distribution network. All of this is done in a "hands off" manner, in which nearly total independence is granted to all those working with or for Project Gutenberg.

We create an Etext edition from a variety of sources that have been cleared of copyright restrictions by our legal volunteers [we have one of the finest legal teams in U.S. and International copyright, which also doubles its output each year to keep up with the doubling in the number of Etexts we distribute each year]. Since copyright is a vastly important subject to us, we request that you put us in contact with ANY lawyers you know who might have an interest in copyrights in any manner.


[[3]] Current Standards For Release

Currently most of our files are distributed in a Plain Vanilla ASCII file structure very similar to that used in email. Each line ends in a "hard return" [cr/lf=carriage return/line feed] so that it can be read in DOS, Mac and UNIX programs. The DOS programs want both cr and lf, Macs want a cr, UNIX wants a lf.

The reason for the Plain Vanilla ASCII format is simple: other formats simply won't REACH 95% of the computers out there, since NOT MORE THAN 5% of the 200,000,000 computers use any particular form of markup, with WordPerfect being the most popular form of markup at 5%, with 10,000,000 sales recorded. Our goal is to get these Etexts to EVERYONE on and off the net and on whatever hardware/software combinations they like.


[[4]] Standards For Future Etexts

In the future we hope the common ASCII character [super]set is going to include ways to include bold, italics and underscore, but for the present these methods of emphasis carry little for the average reader. General emphasis is currently represented by CAPITALIZING the emphasized words. The other, more technical reasons for using these, to indicate names of boats, newspapers, books, etc., are not used today in Project Gutenberg Etexts because they clutter up the page with little or no addition of meaning; they are merely conventions.

However, as we do hope that some easy and searchable UNIcodes, or whatever, will be developed, we still keep some of this for future use, and you are welcome to include an extra copy of an Etext with the following forms of markup:

~italics~. . .of the NON-emphatic nature: titles, etc.
~~italics~~. . .of the EMPHATIC nature, now done in CAPS.
*bold*
_underline_ or _underscore_

This simple form of markup will allow you [or us] to eliminate the extra markings after capitalizing the emphatic italics, and still preserve a second file for the future system. Remember, you can't find "To *be* or *not* to be" with "search programs" looking for "To be or not to be". . .yet. . .and the ability to search an Etext for what you are looking for in the space of a few seconds is one of the prime uses of Etexts.

Other than this, the primary differences you may note in Etexts formatted by Project Gutenberg are:

Two spaces after every sentence or after a colon [:].
Two hard returns after each paragraph.
Three for wide paragraph separations or between sections.
Four hard returns at the end of a chapter.

In a nutshell, this is it. We gratefully accept Etexts in ANY format, and are willing to do the work ourselves to get them into this format if needed.


[[5]] Hyphenation and Margination

You may also have noted that most Project Gutenberg Etexts have had most hyphenation at the ends of lines removed. We now have an experimental program to do this, and you are encouraged to see how this works on YOUR Etext by contacting Rick McGowan to put your Etext through this process. [rick_mcgowan@next.com]

You may also have noted that many Project Gutenberg Etexts are additionally marginated to eliminate "widows and orphans" on a line by line level, so that phrases or sentences do not have a word of importance dangling off on another line. This makes a Project Gutenberg Etext easier to read AND to search, as many/most search programs do NOT search from the end of one line to the beginning of the next.
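
For those working alone who want to try the end-of-line hyphen removal mentioned earlier in this section, here is a minimal sketch in Python. It is NOT the experimental program referred to above, whose workings are not described here; the function name join_line_end_hyphens and the rule it applies [a letter plus a hyphen at the end of a line, followed by a line starting with a lowercase letter] are assumptions made only for illustration.

```python
def join_line_end_hyphens(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Illustrative rule only (an assumption, not Project Gutenberg's program):
    a line ending in letter + "-" followed by a line that starts with a
    lowercase letter is treated as one split word. The rest of the word is
    pulled up from the next line, so line lengths change slightly.
    Line ends are written back as "\n" and a final line end is not kept.
    """
    lines = text.splitlines()
    for i in range(len(lines) - 1):
        if lines[i] is None:
            # This line was emptied when its only word moved up.
            continue
        current = lines[i].rstrip()
        head = lines[i + 1].lstrip()
        split_word = (
            len(current) >= 2
            and current.endswith("-")
            and current[-2].isalpha()   # skips "--" used as a dash
            and head[:1].islower()
        )
        if split_word:
            tail, _, rest = head.partition(" ")
            lines[i] = current[:-1] + tail
            # Drop the next line if nothing is left on it, so that no
            # false paragraph break [blank line] is created.
            lines[i + 1] = rest.lstrip() or None
    return "\n".join(line for line in lines if line is not None)
```

A rule this simple will also close up words that are properly hyphenated, such as "well-" followed by "known", so the result still needs to be checked by a human reader, and the lines may need remargination afterwards.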

If you consider that this article, such as it is, has somewhere just over 10 words to the average line, searching for a phrase of three or four words should get a "hit" only about twice as often as a "miss", because roughly a third of the time part of the phrase would be on a different line. This is a VERY important factor in using Etexts, and while few people enjoy doing the remargination, it can easily make Etext searching a much more powerful tool for our readers, and we hope that a new experimental margination program will also help you in this process.
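
The estimate above can be checked with a little arithmetic, sketched here in Python. The assumptions are mine and are only approximations: every line holds exactly the same number of words, and a phrase is equally likely to begin at any word on a line. Under those assumptions a phrase of k words is broken across two lines when it starts at one of the last k-1 words of a line. The names split_fraction and find_across_lines are hypothetical, and the second function only illustrates a search that treats a line end like a space; it does not describe any particular search program.

```python
import re


def split_fraction(phrase_words: int, words_per_line: int) -> float:
    """Fraction of starting positions at which a phrase runs past the line end.

    Assumes equal-length lines and a uniformly random starting word.
    """
    if phrase_words > words_per_line:
        return 1.0
    return (phrase_words - 1) / words_per_line


def find_across_lines(text: str, phrase: str) -> int:
    """Return the position of phrase in text, or -1 if it is not found.

    Any run of spaces or line ends between words counts as one separator,
    so a phrase broken across two lines is still found.
    """
    pattern = r"\s+".join(re.escape(word) for word in phrase.split())
    match = re.search(pattern, text)
    return match.start() if match else -1
```

With 10 words to the line, split_fraction(3, 10) gives 0.2 and split_fraction(4, 10) gives 0.3, which agrees with "about a third of the time" for four-word phrases. A plain search such as "To be or\nnot to be".find("To be or not to be") fails, while find_across_lines on the same two strings succeeds.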