% AroundTheBend.tex concatenation of Around The Bend \begin{filecontents}{bend.ist} % MakeIndex style file bend.ist for use with AroundTheBend.tex % @ may be a valid character in the index, use ? instead actual '?' \end{filecontents} %\documentclass[draft,openany]{memoir} \documentclass[openany]{memoir} \usepackage{comment} \usepackage{url} \ifpdf \usepackage[pdftex, plainpages=false, pdfpagelabels, bookmarksnumbered ]{hyperref} \else \usepackage[%pdf, plainpages=false, pdfpagelabels, bookmarksnumbered ]{hyperref} \fi \usepackage{graphicx} \settrimmedsize{11in}{210mm}{*}% min letterpaper/A4 sizes \setlength{\trimtop}{0pt} \setlength{\trimedge}{\stockwidth} \addtolength{\trimedge}{-\paperwidth} \settypeblocksize{7.75in}{33pc}{*} \setulmargins{4cm}{*}{*} \setlrmargins{1.25in}{*}{*} \setmarginnotes{17pt}{51pt}{\onelineskip} \setheadfoot{\onelineskip}{2\onelineskip} \setheaderspaces{*}{2\onelineskip}{*} \checkandfixthelayout %\addtolength{\textwidth}{1in} %\addtolength{\oddsidemargin}{-0.5in} %\addtolength{\evensidemargin}{-0.5in} \newcommand{\ed}[1]{\emph{(Ed: #1)}} \newcommand*{\oposted}[1]{Originally posted on #1} \newcommand*{\arch}[1]{Archived as {\normalfont \ttfamily #1}} \newenvironment{solution}[1]{% \begin{description} \item[#1]\mbox{}}% % {\par\noindent\textbf{End solution}\end{description}} {\end{description}\vspace{-0.5\onelineskip}\textbf{End solution}} \newcommand*{\pfile}[1]{\texttt{#1}}% print a file name \newfixedcaption{\freetabcaption}{table} \renewcommand*{\chaptername}{QA} \renewcommand*{\chaptername}{} % \piif{if...} print and index \if... 
\newcommand*{\piif}[1]{\cs{#1}\index{#1?\cs{#1}}} \makeatletter \newcommand*{\zeroseps}{% \topsep\z@ \partopsep\z@ \parskip\z@} \newlength{\gparindent} \gparindent 0.5\parindent \newenvironment{lcode}{\zeroseps \renewcommand{\verbatim@startline}% {\verbatim@line{\hskip\gparindent}}% \small\setlength{\baselineskip}{\onelineskip}\verbatim}% {\endverbatim \vspace{-\baselineskip}\noindent} \makeatother \nouppercaseheads \headstyles{bringhurst} %\setlength{\beforechapskip}{2\onelineskip} \chapterstyle{section} \setlength{\beforechapskip}{2\onelineskip} \setlength{\beforechapskip}{0pt} \setlength{\afterchapskip}{1\onelineskip} \settocdepth{subsubsection} \setsecnumdepth{subsubsection} \makeindex %\title{Around The Bend} %\author{Michael Downes \\ %(edited by Peter Wilson)} %\date{} \newlength{\drop} \providecommand*{\wb}[2]{\fontsize{#1}{#2}\usefont{U}{webo}{xl}{n}} \newcommand*{\titleAB}{\begingroup \drop=4\baselineskip \centering \vspace*{\drop} {\Huge AROUND THE BEND}\\[\drop] {\hspace*{1.5em}\scalebox{8}[1]{{\wb{10}{12}4}}}\\[\drop] {\Large\itshape A Collection of TeX Challenges by}\\[\baselineskip] {\Large MICHAEL DOWNES}\\[\baselineskip] {\wb{10}{12}4}\\[\baselineskip] {\Large\itshape edited by}\\[\baselineskip] {\Large Peter Wilson}\par \vfill {\hspace*{1.5em}\scalebox{8}[1]{{\wb{10}{12}4}}}\\[\drop] {\large The Herries Press}\\ {July 2008}\par \vspace*{\drop} \endgroup} %% normally \parindent = 1.5em, but 0pt in \titleAB \begin{document} \tightlists \raggedbottom \frontmatter %\maketitle \thispagestyle{empty} \titleAB \cleardoublepage \tableofcontents \chapter{Preface} In the early 90's the late and much missed Michael Downes (1958--2003) ran a column in the INFO-TeX mailing list called \emph{Around The Bend} where he proposed macro-related problems and then posted submitted solutions. Although it was archived on CTAN in \url{info/aro-bend} it is not well known which is a shame as it provides answers to many problems that keep cropping up. 
(The archive is now at \url{info/challenges/aro-bend}). This is an attempt to make his work more accessible by providing the collection as a single document. As much as possible what follows is what Michael wrote; I have tried to limit myself to marking up the original ASCII text emails but I have not repeated administrative elements such as email headers. In some cases the original TeX code was replete with comments explaining what was going on. Where the comments were long with respect to the code I have set them in the regular body type so as to make the actual code more obvious; this has a side effect of slightly decreasing the amount of paper required to print the document. If you want to use the code solutions I suggest that you cut and paste them from the original archived versions. I thought that there were eighteen Around the Bends as that is all that are archived on CTAN. However I googled the Google Groups \url{comp.text.tex} group and found three more, nos.~19, 20 and~21. I have included what I could find of these, but answers to no.~19 appear to be missing, which is a pity as I think that I could have put them to use. Perhaps some of you might be willing to take up the challenge on this, or on any of the others. {\raggedleft \textsc{PW}\\ July, 2008 \par} \chapter{Introduction} \ed{This is Michael's introduction to his scheme, originally posted on 1991/10/10 as the initial portion of exercise~1.} %%[Exercises 1,2,3 were originally posted together on 10 Oct 91] \begin{verbatim} Date: Thu 10 Oct 91 09:51:32-EST From: Michael Downes Subject: Around the bend To: info-tex@shsu.edu \end{verbatim} Proposal for a regular feature: AROUND THE BEND With the encouragement of George Greenwade (the INFO-TeX list owner), I would like to propose a regular department for INFO-TeX, called `Around the bend'. 
It will consist of macro-writing challenges on the level of the dangerous-bend exercises in the \emph{TeXbook}, with interested parties invited to collaborate and/or compete to find the best solution. My motivation for doing this is partly selfish: to get more feedback from other macro writers about some of the interesting macro-writing problems that I run into. I originally approached George for advice about setting up a separate mailing list, but he thought that INFO-TeX and comp.text.tex readers would be interested. Since INFO-TeX mail is also channeled to comp.text.tex, readers of the latter should let me know if they don't want the extra traffic (although I don't expect it to be that much). I don't currently have access to read comp.text.tex directly, although George has been investigating the possibility of piping it through the INFO-TeX mailing list. So if you object by posting to comp.text.tex, I may not see your objection; send me mail, instead. The sample below should give a pretty good idea of what `Around the bend' would be like. Solutions should be sent to me instead of to INFO-TeX or comp.text.tex, on the premise that people usually won't want to read others' solutions until they've had a chance to try their own hand. A summary of the results would then be posted to the INFO-TeX list after two or three weeks; to those who submit solutions before the deadline, I could forward without delay solutions submitted by other people, for comparison. I will try to keep the difficulty of the exercises down to something reasonable, let's say, on the level of a homework assignment which a university student must complete in two weeks, finding time in the normal way from the usual busy schedule of other homework, class attendance, sports, and social life. However, be warned that the challenges will be hard. 
I'm planning to follow a `hard and fast' format: one or two hard questions, followed by one or two fast questions, where if you don't know the answer off the top of your head, you can either look it up in the \emph{TeXbook} or find it by running a quick test. \mainmatter \chapter{Expansion} \section{Exercise (hard)} %%\input{ex001.tex} % ex001.tex \begin{comment} (Originally posted on 1991/10/10) [Exercises 1,2,3 were originally posted together on 10 Oct 91] Date: Thu 10 Oct 91 09:51:32-EST From: Michael Downes Subject: Around the bend To: info-tex@shsu.edu Proposal for a regular feature: AROUND THE BEND With the encouragement of George Greenwade (the INFO-TeX list owner), I would like to propose a regular department for INFO-TeX, called `Around the bend'. It will consist of macro-writing challenges on the level of the dangerous-bend exercises in the TeXbook, with interested parties invited to collaborate and/or compete to find the best solution. My motivation for doing this is partly selfish: to get more feedback from other macro writers about some of the interesting macro-writing problems that I run into. I originally approached George for advice about setting up a separate mailing list, but he thought that INFO-TeX and comp.text.tex readers would be interested. Since INFO-TeX mail is also channeled to comp.text.tex, readers of the latter should let me know if they don't want the extra traffic (although I don't expect it to be that much). I don't currently have access to read comp.text.tex directly, although George has been investigating the possibility of piping it through the INFO-TeX mailing list. So if you object by posting to comp.text.tex, I may not see your objection; send me mail, instead. The sample below should give a pretty good idea of what `Around the bend' would be like. 
Solutions should be sent to me instead of to INFO-TeX or comp.text.tex, on the premise that people usually won't want to read others' solutions until they've had a chance to try their own hand. A summary of the results would then be posted to the INFO-TeX list after two or three weeks; to those who submit solutions before the deadline, I could forward without delay solutions submitted by other people, for comparison. I will try to keep the difficulty of the exercises down to something reasonable, let's say, on the level of a homework assignment which a university student must complete in two weeks, finding time in the normal way from the usual busy schedule of other homework, class attendance, sports, and social life. However, be warned that the challenges will be hard. I'm planning to follow a `hard and fast' format: one or two hard questions, followed by one or two fast questions, where if you don't know the answer off the top of your head, you can either look it up in the TeXbook or find it by running a quick test. All right, here are the first three. \end{comment} %********************************************************************** %*** Exercise 1 (hard): \ed{\oposted{1991/10/10}. \arch{exercise.001}.}\\%[0.5\baselineskip] Given arbitrary \cmd{\b}, \cmd{\c}, \cmd{\d} (macros without arguments), for example \begin{lcode} \def\b{\c\c} \def\c{*} \def\d{\b\c} \end{lcode} figure out how to define \cmd{\a} so that its replacement text consists of \cmd{\b} fully expanded plus \cmd{\c} not expanded plus \cmd{\d} expanded exactly once. I.e., with the above definitions the replacement text of \cmd{\a} should be \begin{lcode} **\c\b\c \end{lcode} You may not use \cmd{\the} or \cmd{\noexpand} in your solution. This is Exercise 20.16 in the \emph{TeXbook}, except that there's an added restriction: your answer must also not use the \cmd{\halign}\texttt{\ldots}\cmd{\span} method given in the answer to 20.16. (Yes, that means you can't use \cmd{\valign} either!) 
Why would anyone want to do such a hard exercise? Answer: advanced macro
writing requires a thorough knowledge of expansion control principles.

\begin{comment}
[Exercise 2 moved to exercise.002]

[Exercise 3 moved to exercise.003]

Send answers to: Michael Downes mjd@math.ams.com (Internet)

A summary will be posted Friday, October 25, 1991.
\end{comment}
%%\endinput

\section{Answers}
%%\input{ans001.tex}
% ans001.tex

\ed{\oposted{1991/10/25}. \arch{answer.001}.}\\
\begin{comment}
[Solutions for exercises 1,2,3 were originally posted together on 25 Oct 91]

Date: Fri 25 Oct 91 15:19:44-EST
From: Michael Downes
Subject: `Around the bend' #1 solutions
To: info-tex@shsu.edu

Solutions to the exercises of `Around the bend' #1.

"*** Exercise 1 (hard):
"Given arbitrary \b, \c, \d (macros without arguments), for example
"
"  \def\b{\c\c} \def\c{*} \def\d{\b\c}
"
"figure out how to define \a so that its replacement text consists
"of \b fully expanded plus \c not expanded plus \d expanded exactly once.
"I.e., with the above definitions the replacement text of \a
"should be
"
"  **\c\b\c
"
"You may not use \the or \noexpand in your solution. This is Exercise
"20.16 in the TeXbook, except that there's an added restriction: your
"answer must also not use the \halign ... \span method given in the
"answer to 20.16. (Yes, that means you can't use \valign either!)
\end{comment}
The restrictions leave us with (essentially) three expansion-control
commands: \\
\cmd{\expandafter}, \cmd{\edef} and \cmd{\def}.
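
\ed{The following sketch is not part of the original post. Assuming the
sample definitions of \cmd{\b}, \cmd{\c} and \cmd{\d} from the exercise,
it illustrates how the three remaining commands treat the same replacement
text; \cmd{\x}, \cmd{\y} and \cmd{\z} are hypothetical scratch macros
introduced only for this illustration.}
\begin{lcode}
\def\b{\c\c} \def\c{*} \def\d{\b\c}
% \def stores its replacement text without expanding it:
\def\x{\d}
\message{\meaning\x}% macro:->\d 
% \edef expands the replacement text as far as possible:
\edef\y{\d}
\message{\meaning\y}% macro:->***
% A chain of \expandafter's reaches past \def, \z and the open
% brace, so that \d is expanded exactly once before \def acts:
\expandafter\def\expandafter\z\expandafter{\d}
\message{\meaning\z}% macro:->\b \c 
\end{lcode}
The exercise asks for all three behaviours to be combined inside a single
replacement text, which is why the solutions need either an auxiliary macro
or a longer chain of \cmd{\expandafter}'s.
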
%\begin{description}
%\item[Solution 1 {[Peter Schmitt]}] \mbox{}
\begin{solution}{Solution 1 (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
\edef\B{\b}
\def\defA#1{\def\defa##1##2{\def\a{#1##2##1}}}
\expandafter\defA\expandafter{\B}
\expandafter\defa\expandafter{\d}{\c}
\end{lcode}
\end{solution}
%%>>EndSolution

%\item[Solution 2 {[Donald Arseneau]}] \mbox{}
\begin{solution}{Solution 2 (Donald Arseneau)}\index{Arseneau, Donald}
\begin{lcode}
\edef\e{\b}
\expandafter \expandafter \expandafter \def\expandafter
   \expandafter \expandafter \a\expandafter \expandafter \expandafter
   {\expandafter \e\expandafter \c\d}
\end{lcode}
\end{solution}
%%>>EndSolution

%\item[Solution 3 {[mine]}] \mbox{}
\begin{solution}{Solution 3 (mine)}\index{Downes, Michael}
\begin{lcode}
\edef\a{\b}
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\a
\expandafter\expandafter\expandafter{\expandafter\a\expandafter\c\d}
\end{lcode}
\end{solution}
%%>>EndSolution
%\end{description}

My solution differed from Arseneau's only in using \cmd{\a} rather than
\cmd{\e} in the first step.

\begin{comment}
[Solution for exercise 2 moved to answer.002]

[Solution for exercise 3 moved to answer.003]

Michael Downes mjd@math.ams.com (Internet)
\end{comment}
%%\endinput

\chapter{Empty argument}
\section{Exercise (hard)}
%%\input{ex002.tex}
% ex002.tex
\begin{comment}
[Posted to info-tex on 10 Oct 91; see exercise.001]
**********************************************************************
*** Exercise 2 (hard):
\end{comment}
\ed{\oposted{1991/10/10}. \arch{exercise.002}.}\\
Define an `ifempty' macro that takes one argument and resolves
essentially to \piif{iftrue} if the argument is empty, and \piif{iffalse}
otherwise. This is useful for handling arguments given by users to
commands defined in a macro package.
Plain TeX or LaTeX-style solutions are both acceptable, that is,
\begin{lcode}
\ifempty{...}TRUE CASE\else FALSE CASE\fi
\end{lcode}
or
\begin{lcode}
\ifempty{...}{TRUE CASE}{FALSE CASE}
\end{lcode}
(In the former case you will need to do something to avoid problems in
the situation
\begin{lcode}
\iffalse ... \ifempty{...} ... \fi ... \fi
\end{lcode}
there are different possibilities here, so I will refrain from indicating
any particular one.)

Use the following test suite to verify the robustness of your solution:
\begin{lcode}
\long\def\test#1{\begingroup \toks0{[#1]}%
  \newlinechar`\/\message{/\the\toks0:
% LaTeX-style solution; modify the following line according
% to the syntax of your solution.
    \ifempty{#1}{EMPTY}{NOT empty}%
  }\endgroup}
\test{} \test{ }
\test{aabc} \test{-} \test{$} \test{\empty} \test{\endinput}
\test{\iftrue a\else b\fi} \test{\else} \test{#} \test{\par}
\halign{#\cr\test{&}\cr}
\test{\relax} \test{\relax\relax\relax}
\expandafter\iffalse\test{x}\fi
\test{{}}
\end{lcode}
%$
The two tests on the first line should produce a message `EMPTY' and the
remaining ones, `NOT empty'. The reason for saying that the second test
should return `EMPTY' is that (1) this is the ideal behavior for the
applications I've encountered so far; (2) at least one other person
working independently arrived before me at a solution essentially
identical to mine, including this behavior. The details and credit to the
other guy will be given at solution time.
%%\endinput

\section{Answers}
%%\input{ans002.tex}
% ans002.tex
\begin{comment}
[Posted to info-tex on 25 Oct 91; see answer.001]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
"*** Exercise 2 (hard):
"Define an "ifempty" macro that takes one argument and resolves
"essentially to \iftrue if the argument is empty, and \iffalse
"otherwise. This is useful for handling arguments given by
"users to commands defined in a macro package such as LaTeX.
"
"Plain TeX or LaTeX-style solutions are both acceptable, that
"is,
"
"  \ifempty{...}TRUE CASE\else FALSE CASE\fi
"
"or
"
"  \ifempty{...}{TRUE CASE}{FALSE CASE}
\end{comment}
\ed{\oposted{1991/10/25}. \arch{answer.002}.}\\
The LaTeX-style solution that I had prepared was, I thought, pretty good,
but Donald Arseneau\index{Arseneau, Donald} observed that it fails the test
\begin{lcode}
\test{{\iftrue a\else b\fi}}
\end{lcode}
which was not in my list of tests.

%\begin{description}
%\item[Solution 1 {[mine]}] \mbox{}
\begin{solution}{Solution 1 (mine)}\index{Downes, Michael}
\begin{lcode}
\catcode`\@=11
% \@car is actually already defined in latex.tex, but for
% maximum robustness it needs to have the \long prefix:
\long\def\@car#1#2\@nil{#1}
\long\def\@first#1#2{#1}
\long\def\@second#1#2{#2}
\long\def\ifempty#1{\expandafter\ifx\@car#1@\@nil @\@empty
  \expandafter\@first\else\expandafter\@second\fi}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
  \newlinechar`\/\message{/\the\toks0:
    \ifempty{#1}{EMPTY}{NOT empty}%
  }\endgroup}
\end{lcode}
\end{solution}
%%>>EndSolution

The advantage of using the auxiliary macros \cmd{\@first} and
\cmd{\@second}, together with the \cmd{\expandafter}'s, is that it allows
the true and/or false cases to end with arbitrary things, even macros that
require arguments that have not yet been read (any number of arguments,
even delimited arguments).

From here it is easy to implement an \piif{ifnotempty} test that has a
null false case. This is often useful in dealing with user-supplied
arguments: `If \#1 is empty, do nothing; otherwise, do the following with
\#1: ...'
\begin{lcode}
\long\def\ifnotempty#1{\ifempty{#1}{}}
\end{lcode}

%\item[Solution 2 {[Donald Arseneau]}]
\begin{solution}{Solution 2 (Donald Arseneau)}\index{Arseneau, Donald}
Don Arseneau came up with a plain TeX style solution, using an ingenious
device with \cmd{\then} to pass the test case
\begin{lcode}
\expandafter\iffalse\test{x}\fi
\end{lcode}
The comments in the solution are his.
\begin{lcode}
% \ifblank{...}\then  Test if a parameter is blank (null or spaces).
% Use the inaccessable "letter" @ to separate parameters. The two cases are:
%       _text_is_not_blank_             _text_is_blank_
% #1<-  whatever                        #1<-@
% #2<-  whatever (possibly null)        #2<-
% #3<-  @                               #3<-.
% #4<-  ..                              #4<-.
%       \if @.. {false}                 \if .. {true}
% In the {false} case, the extra period is skipped so it doesn't hurt.
\catcode`\@=11 % as in plain.tex
\let\then\iftrue
\long\def\ifblank#1\then{\Ifbl@nk#1@@..\then}%
\long\def\Ifbl@nk#1#2@#3#4\then{\if#3#4}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
  \newlinechar`\/\message{/\the\toks0:
    \ifblank{#1}\then EMPTY\else NOT empty\fi%
  }\endgroup}
\end{lcode}
\end{solution}
%%>>EndSolution

The good thing about this solution is that it doesn't subject any part of
the user-supplied argument to the \piif{ifx} test. Using @ with category
code of 11 as a delimiter for the user-supplied text is extremely safe
because even in internal code @ doesn't appear by itself, only as part of
control sequence names. In a partial solution, Peter
Schmitt\index{Schmitt, Peter} pushed the idea a little further by using
space with category code 3 as the delimiter.

There is another way of handling the problematic \piif{iffalse} test, in a
plain-TeX style solution, by using a suggestion of Donald Knuth that
appeared in TeXhax a while ago, in reply to a query of Stephan von
Bechtolsheim (texhax89, \#38 (post from svb, 17 Apr 89)).
%\item[Solution 3 {[Arseneau/Knuth]}] \mbox{}
\begin{solution}{Solution 3 (Arseneau/Knuth)}\index{Arseneau, Donald}\index{Knuth, Donald}
\begin{lcode}
% Usage: \if\blank{#1}...\else...\fi
\catcode`\@=11 % as in plain.tex
\long\def\blank#1{\bl@nk#1@@..\bl@nk}%
\long\def\bl@nk#1#2@#3#4\bl@nk{#3#4}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
  \newlinechar`\/\message{/\the\toks0:
    \if\blank{#1}EMPTY\else NOT empty\fi%
  }\endgroup}
\end{lcode}
\end{solution}
%>>EndSolution

At the end of Exercise 2 I wrote:
\begin{quote}
The two tests on the first line should produce a message `EMPTY' and the
remaining ones, `NOT empty'. The reason for saying that the second test
should return `EMPTY' is that (1) this is the ideal behavior for the
applications I've encountered so far; (2) at least one other person
working independently arrived before me at a solution essentially
identical to mine, including this behavior. The details and credit to the
other guy will be given at solution time.
\end{quote}

The name of the `other guy' is Michael Wester\index{Wester, Michael}; a
listing of his macros was published in the preprints for the July 1991 TUG
meeting in Dedham, Massachusetts (`Form Letter in LaTeX with 3-across
Mailing Labels Capability', joint paper with Jackie Damrau). In rereading
the preprint recently, it seems to me the presentation is more different
from Exercise 2 and its solutions than I had previously imagined, but the
essential ideas are there. See \cmd{\wcar}, \cmd{\wcdr} and related macros.

By the way, if anyone came up with a fully expandable test (suitable for
use inside a \cmd{\message}) for which \verb?\test{ }? came up false
instead of true, I would be interested to hear about it. I didn't mean to
eliminate that possibility in my original statement of the problem.
%%\endinput

\chapter{Discretionary}
\section{Exercise (fast)}
%%\input{ex003.tex}
% ex003.tex
\ed{\oposted{1991/10/10}.
\arch{exercise.003}.}\\
\begin{comment}
[Posted to info-tex on 10 Oct 91; see exercise.001]
**********************************************************************
*** Exercise 3 (fast):
\end{comment}
What's the most important difference between \cs{-} and
\begin{lcode}
\discretionary{-}{}{} ?
\end{lcode}
%%**********************************************************************
%%\endinput

\section{Answers}
%%\input{ans003.tex}
% ans003.tex
\ed{\oposted{1991/10/25}. \arch{answer.003}.}\\
\begin{comment}
[Posted to info-tex on 25 Oct 91; see answer.001]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
"*** Exercise 3 (fast):
"What's the most important difference between \- and
"\discretionary{-}{}{}?
\end{comment}
The most important difference between \cs{-} and
\cmd{\discretionary}\verb?{-}{}{}? is that the latter always puts in the
character from font position 45 ("2D, '55) of the current font when a word
must be broken at the end of a line; \cs{-} puts in the character from
font position \cmd{\hyphenchar} of the current font, which is NOT
NECESSARILY position 45. It would be rather unusual for \cmd{\hyphenchar}
to be something other than 45; in certain special applications, however
(possibly in some foreign languages as well?) a variant value of
\cmd{\hyphenchar} can be useful. I have an idea for using this in a future
exercise\ldots

Credit to Donald Arseneau\index{Arseneau, Donald} for a correct answer.

Thanks to Peter Schmitt\index{Schmitt, Peter} for providing the perfect
opening for another point I wanted to make:
\begin{quotation}
The \emph{TeXbook} states explicitly: \\
\cs{-} is equivalent to \verb?\discretionary{-}{}{}? \\
and both are internal.

I do not see where to the question aims:
\begin{itemize}
\item control symbol : control sequence
\item no parameters : three parameters
\item two characters : 21 characters to type
\item ???
\end{itemize}
\end{quotation}

Schmitt is quoting from the last page of Chapter 25; the point is that in
newer versions of the \emph{TeXbook} that sentence has been revised. I'm
not sure what the latest printing says, since I don't have a copy, but I
think it simply refers the reader to Appendix H, where the significance of
\cmd{\hyphenchar} is explained. \cmd{\hyphenchar} is a feature that was
added late in the development of TeX82 (\pfile{TeX82.bug} reveals that it
was not added until May 25, 1983). Even if the source files for the
\emph{TeXbook} were immediately updated by Knuth at that time, the changes
did not appear in the published version being sold to the general public
until some time later when the first revised edition was published, which
was no earlier than October 1984, the date of the \emph{TeXbook} copy that
I have on hand, and probably later.

The statement of purpose in `Around the bend' \#1 said something about
finding the `best solution', but conspicuously failed to define what
`best' should mean in this context. It was my intention to address this
question in future exercises; for now, let me just say that I don't intend
to arbitrarily rule out of consideration answers such as Schmitt's `two
characters : 21 characters to type', since depending on how you look at
it, it could be argued that this is much more significant than dumb old
\cmd{\hyphenchar} minutiae. I promised that these exercises would be
challenging; that means, among other things, that they won't always be
well-defined, well-bounded, or well-behaved, and part of the job of finding
the `best solution' will be to decide what parts of the problem need to be
specified further, and to examine the ramifications of alternatives.
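
\ed{The following sketch is not part of the original post. It assumes
plain TeX and picks `=' as an arbitrary, purely illustrative hyphen
character, so that the \cmd{\hyphenchar} distinction described in the
answer becomes visible wherever TeX chooses to break at the
discretionaries.}
\begin{lcode}
\hyphenchar\font=`\= % note: \hyphenchar assignments are global
\hsize=1pt \parindent=0pt % very narrow measure, to provoke breaks
% \- takes its pre-break character from \hyphenchar\font, so a
% break taken here should show `=' at the end of the line:
sup\-er\par
% The explicit form names the character at position 45 itself,
% so a break taken here should still show `-':
sup\discretionary{-}{}{}er\par
\hyphenchar\font=`\- % restore the usual value, 45
\bye
\end{lcode}
Because the assignment to \cmd{\hyphenchar} is global, the last line
resets it explicitly rather than relying on a group.
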
%%\endinput \chapter{What is `best'?} \section{Exercise (essay)} %%\input{ex004} % ex004.tex \begin{comment} [Exercises 4,5,6,7 were originally posted together on 4 Nov 91] Date: Mon 4 Nov 91 16:42:44-EST From: Michael Downes Subject: Around the bend #2 To: info-tex@shsu.edu \end{comment} \ed{\oposted{1991/11/04}. \arch{exercise.004}.} The statement of purpose in `Around the bend' \#1 said something about finding the `best solution', but failed to define what `best' should mean when comparing pieces of TeX code. I'll start by throwing out a few ideas. \begin{description} \item[Simplicity] A good solution gets hold of the essential idea of the problem and attacks it directly, rather than beating around the bush and resorting to separate clauses to handle troublesome subcases. \item[Economy] If two solutions compare equal in other respects, then the better solution is the one that uses less of TeX's resources (main memory, hash table, string pool, and so forth). Therefore I (immodestly) say that my solution to Exercise 1 was ever so slightly better than the other two given, because it avoided introducing any auxiliary macros that were not included in the original statement of the problem. \item[Robustness] If a solution only works under limited friendly circumstances, and otherwise blows up with an error message, that's not good. My solution to Exercise 2 was flawed in this respect, since D.A. found a test case that caused it to go wrong. \end{description} %%*********************************************************************** *** Exercise 4 (essay): What should `best' mean when comparing solutions to an `Around the bend' exercise? What qualities of a good solution are most important? Why? How can they be objectively measured? (Or can they?) On the negative side, what qualities indicate an inferior solution? 
%%***********************************************************************
\begin{comment}
[Exercise 5 moved to exercise.005]

[Exercise 6 moved to exercise.006]

[Exercise 7 moved to exercise.007]

Send answers to: Michael Downes mjd@math.ams.com (Internet)

A summary will be posted Tuesday, December 4, 1991. However, because of
the difficulty of E7, I will probably procrastinate on posting the
solutions for that exercise until the first or second week of December.
\end{comment}

Table of special characters, to verify accurate transmission:
\begin{lcode}
ASCII 33: ! exclamation point         ASCII  60: < left elbow
ASCII 34: " double quote              ASCII  61: = equals sign
ASCII 35: # number/pound sign         ASCII  62: > right elbow
ASCII 36: $ dollar sign               ASCII  63: ? question mark
ASCII 37: % percent sign              ASCII  64: @ at sign
ASCII 38: & ampersand                 ASCII  91: [ left square bracket
ASCII 39: ' right quote/apostrophe    ASCII  92: \ backslash
ASCII 40: ( left parenthesis          ASCII  93: ] right square bracket
ASCII 41: ) right parenthesis         ASCII  94: ^ circumflex/hat/caret
ASCII 42: * star/asterisk             ASCII  95: _ underscore
ASCII 45: - hyphen                    ASCII  96: ` left quote
ASCII 47: / slash                     ASCII 123: { left curly brace
ASCII 58: : colon                     ASCII 124: | vert bar
ASCII 59: ; semicolon                 ASCII 125: } right curly brace
                                      ASCII 126: ~ tilde
\end{lcode}
%$
%%\endinput

\section{Answers}
%%\input{ans004}
% ans004.tex
\ed{\oposted{1991/12/10}. \arch{answer.004}.}

\begin{comment}
[Solutions for exercises 4,5 were originally posted together on 5 Dec 91]

Date: Thu 5 Dec 91 10:26:58-EST
From: Michael Downes
Subject: `Around the bend' #2 solutions (4,5)
To: info-tex@shsu.edu

Answers to exercises 4 and 5 of `Around the bend' #2. Discussion of E6
will follow in a separate post because it is rather lengthy. Discussion of
E7 will follow in another couple of weeks (I'm going to be on vacation
next week.)
"***********************************************************************
"*** Exercise 4 (essay):
"
"What should `best' mean when comparing solutions to an `Around the
"bend' exercise? What qualities of a good solution are most important?
"Why? How can they be objectively measured? (Or can they?) On the
"negative side, what qualities indicate an inferior solution?
\end{comment}

Peter Schmitt\index{Schmitt, Peter} writes:
\begin{quotation}
What is to be rated as `best' clearly depends on the function used to
measure quality. And therefore the question makes sense only with respect
to some particular rating function.

Seemingly nothing is gained by this statement: Instead of discussing what
qualities are required for a good solution one has to discuss how the
rating system should be defined. But nevertheless this shifted point of
view has an important advantage. It makes clear that there is no unique
answer: Quality is not an absolute notion but a notion relative to some
(agreed) measure. This measure is not independent of the context --- under
different conditions different rating functions may be used.

One further important point must not be forgotten: If matters of personal
taste are to be excluded then the measuring function has to be precisely
defined --- demanding simplicity, without giving this notion a precise
(formal) meaning, is not sufficient.

Therefore I would like to split the original question into two separate
questions:

(a) What (formal and informal) rating functions are likely to be useful,
and under what circumstances?

(b) With respect to some formal rating function, is there always a best
solution?

Some answers to the first questions are the following (no completeness
claimed or even intended):

(1) the first solution: If some special effect is needed for a single
application then the best solution is the first solution (the solution
that can be realized with the least effort).
This is, however, a purely individual criterion that cannot be formalized.

(2) the most economic (in some sense) solution: Economic considerations
are important if a code is used frequently. Depending on the nature of the
applications, running time, memory usage, and others, may be relevant. But
the time spent for finding a good solution still cannot be neglected in a
real world situation. Of course, for theoretical investigations the time
spent for research does not matter.

(3) the more robust solution: If some set of macros is used by a large
number of people who not always know how to use them correctly (or even do
not care to know) then it is certainly an advantage if they are robust,
i.e. work in as many cases (even strange ones) as possible. But again, one
has to decide what price (in terms of resources) is acceptable for this
robustness. (In many cases the item (4) below will be more important.)

(4) ease-of-use: If a set of macros is used frequently (by one or more
persons) then ease-of-use is certainly a mark of quality: easy to remember
syntax, short commands, natural and good readable embedding into the
surrounding text, and similar criteria, decide about this.

(5) simplicity: Simple solutions certainly have a strong appeal --- but
what is a simple solution? Again this is hard to formalize, since
simplicity basically is an aesthetic value, closely related to the
concepts of elegance and beauty. (This is similar to the situation in
mathematics.) But be careful: Simple is not equivalent to short!

(6) the shortest solution: This may seem to be an easy rating function,
but is it? Should length be measured by the number of characters (probably
not!), or by the number of tokens, or by the number of control sequences?
Or by something else?

Most of the measures mentioned are difficult to formalize, or cannot be
formalized at all. Only the resources used (in (2)) and the length of a
code (in (6)) can be precisely defined.
Therefore, with respect to one of these cases two solutions of the same
problem can be compared. Furthermore, in many cases it will be possible to
prove that an optimal solution exists. (For instance, since the length of a
code (in any interpretation) is a positive integer, there must exist one or
more solutions with minimal length, provided there is at least one solution.)
But unfortunately this does not imply that one is able to construct an optimal
solution, or to decide whether a given piece of code is an optimal solution
(or at least near to one). And in some cases it may happen that no optimal
solution exists, e.g. if for every solution there is a better --- but longer!
--- one.

What is the conclusion of all this? That there may be a best solution relative
to some side conditions. But that there is no globally best solution.

This statement is, of course, not very satisfying. One would rather prefer to
have at least some notion (even a tentative one) of a best solution than none
at all. I propose therefore the following informal definition (often subject
to personal taste): If some code is optimal or near-optimal in more than one
category then it is probably as near to a globally optimal solution as is
possible.
\end{quotation}

My comments: I propose the following list, based on (1) [my interpretation of]
Knuth's ideas about good macro writing as demonstrated in the \emph{TeXbook}
and \pfile{plain.tex}, (2) various articles in TUGboat, (3) Schmitt's
comments, (4) discussions I've had in the past with other macro writers, and
so forth.
The characteristics of a good solution to an `Around the bend' exercise are
(in order of decreasing importance):
\begin{enumerate}
\item Robustness
\item Brevity (= minimal usage of TeX's main memory)
\item Simplicity
\item Ease of use
\item Suitable commentary
\item Speed
\item Minimal hash table load
\item Minimal save stack load
\item Minimal load in other categories of TeX's memory
\item Comprehensive test suite (when applicable)
\end{enumerate}

Schmitt's\index{Schmitt, Peter} point about `first solution' is well taken but
does not apply to `Around the bend' exercises, because of the stated goal of
finding a `best' solution, with the presumption that normally more than one
solution will be found.

Measurement of these qualities is not too difficult, I think, except for 3 and
5. Here's how I see the measurements:
\begin{description}
\item[1. Robustness] A solution is robust if no one who reads it offers a
counterexample that causes it to fail. If two solutions both fail, the one
with more counterexamples is less robust; if two solutions have different
counterexamples, the solution whose counterexample is more likely to occur in
normal use is the less robust solution.

\item[2. Brevity] Of two different solutions, the one that is
briefer/shorter/more compact is the one that uses less of TeX's main memory as
measured by \cmd{\tracingstats}.

\item[3. Simplicity] Of two different solutions, the shorter one (in the sense
of the previous item) is usually the simpler one, but not always. A solution
that condenses all the necessary operations into a dense, incomprehensible
Gordian knot is less simple than a longer solution that lays out the
operations in a series of easily comprehended steps. A solution that relies on
arcane dirty tricks is less simple than a solution that uses better-known
techniques in a straightforward approach.

\item[4.
Ease of use] I believe this will not be extremely hard to measure in the
context of the particular application; it can't sensibly be discussed out of
context.

\item[5. Suitable commentary] The commentary surrounding a solution should
explicitly mention any necessary assumptions. If the code is complex, the
commentary should give an outline or overview of the intended algorithm. It
should explain the operation of any macro if its operation is not evident from
the code. If an unusual construction is used where a different construction
would normally be expected, the commentary should give the reason.

\item[6. Speed] Of two solutions, the speedier one is the one that runs faster
on common computer systems. If one solution runs faster and slower than
another, depending on the system \ldots well, let's not cross that bridge
unless it turns out to be real.

\item[7,8,9. Minimal hash table load, save stack load, etc.] These can be
measured by \cmd{\tracingstats}.

\item[10. Comprehensive test suite] If two solutions are equal in other
respects, the one whose accompanying test suite covers more distinct cases
than the other's is better by that much.
\end{description}

It may be argued that I have not sufficiently answered the question of
subjectivity. For example, who's to decide what's an `arcane dirty trick' and
what's not? What does `suitable' mean in number 5? The answer is that I will
say that something is an `arcane dirty trick' if I think so, and anyone else
can do the same. In most cases I believe that there will be general agreement
on such a question; if not, and an ensuing discussion fails to reach a clear
settlement, then each of the solutions in question will be decreed
`subjectively just as good as the others'.

Other qualities of a good solution can be expressed in terms of the ones
listed above.
For example, self-sufficiency may be considered an aspect of robustness---if a solution is not entirely self-sufficient, it can easily be shown to be not robust by giving a counterexample that exploits the assumption that makes the solution non-self-sufficient. Elegance? If a solution is simple and easy to use, then I say it is elegant. A solution doesn't necessarily have to be robust in order to be elegant, nor even short (although of two solutions that are otherwise equal, the shorter one is undoubtedly more elegant). \begin{comment} [Solution for exercise 5 moved to answer.005] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Table of special characters (ASCII): 33: ! exclamation point; 59: ; semicolon; 34: " double quote; 60: < left elbow; 35: # number/pound sign; 61: = equals sign; 36: $ dollar sign; 62: > right elbow; 37: % percent sign; 63: ? question mark; 38: & ampersand; 64: @ at sign; 39: ' right quote/apostrophe; 91: [ left square bracket; 40: ( left parenthesis; 92: \ backslash; 41: ) right parenthesis; 93: ] right square bracket; 42: * star/asterisk; 94: ^ circumflex/hat/caret; 43: + plus sign; 95: _ underscore; 44: , comma; 96: ` left quote; 45: - hyphen; 123: { left curly brace; 46: . period/dot/point; 124: | vert bar; 47: / slash; 125: } right curly brace; 58: : colon; 126: ~ tilde %$ Michael Downes mjd@math.ams.com (Internet) \end{comment} %%\endinput \chapter{\cs{string} tokens} \section{Exercise (fast)} %%\input{ex005} % ex005.tex \ed{\oposted{1991/11/04}. \arch{exercise.005}.} \begin{comment} [Posted to info-tex on 4 Nov 91; see exercise.004] *********************************************************************** *** Exercise 5 (fast): \end{comment} Assuming a normal value for \cmd{\escapechar} \begin{lcode} \string\a \end{lcode} produces two character tokens. What is the category code of the second? Write an experiment (as short as possible) to demonstrate the correctness of your answer. 
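
As a minimal sketch of why the qualification about \cmd{\escapechar} matters
(plain TeX is assumed here; the sketch only shows how many characters
\cmd{\string} produces and says nothing about their category codes):
\begin{lcode}
\message{[\string\a]}% default \escapechar (backslash): shows [\a]
\begingroup
  \escapechar=-1 % no escape character is printed at all
  \message{[\string\a]}% shows [a]
\endgroup
\bye
\end{lcode}
With a negative \cmd{\escapechar} the control sequence name is written without
any prefix, so there would be no `second' token to ask about.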
%%%**********************************************************************
%%\endinput

\section{Answers}
%%\input{ans005}
% ans005.tex
\ed{\oposted{1991/12/05}. \arch{answer.005}.}

\begin{comment}
[Posted to info-tex on 5 Dec 91; see answer.004]

"***********************************************************************
"*** Exercise 5 (fast):
"
"Assuming a normal value for \escapechar,
"
"  \string\a
"
"produces two character tokens. What is the category code of the second?
"Write an experiment (as short as possible) to demonstrate the
"correctness of your answer.
\end{comment}

The category of the `a' token is 12. All tokens produced by \cmd{\string} have
category 12, except for space tokens, which have category 10.

\begin{solution}{Solution 1 (mine)}
\begin{lcode}
\def\answercheck#1#2{\message{#2: \ifcat0#2\else NOT \fi Category 12}}
\expandafter\answercheck\string\a
\answercheck bb
\end{lcode}
This produces on screen the following message:
\begin{lcode}
a: Category 12 b: NOT Category 12
\end{lcode}
\end{solution}
%%>>EndSolution

%%>>Solution 2 [Peter Schmitt]:
\begin{solution}{Solution 2 (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
\def\test#1#2#3{%
  \message{\ifcat#2#3 #2 and #3 have the same category code
           \else #2 and #3 have not the same category code \fi}}
\def\Test#1#2#3{%
  \ifcat#2#3 \message{#2 and #3 have the same category code}
  \else \message{#2 and #3 have not the same category code} \fi}
\catcode`\A12
\test 1aA
\Test 1aA
\expandafter\test\string\a A
\expandafter\Test\string\a A
\end{lcode}
Comment: \\
I have given two essentially equivalent tests --- \cmd{\test} and
\cmd{\Test}. (i) \cmd{\test} is slightly simpler because it contains only one
\cmd{\message} command, but I think that \cmd{\Test} is more appropriate
because it avoids performing the test inside the \cmd{\message} --- there
might be some side effect one is not aware of. (ii) Both tests are not as
short as possible --- the \piif{true} and \piif{false} cases could be much
shorter, e.g.
a T (for true) and a F (for false) would suffice --- the result could be checked in the dvi-file. (I regard this difference as inessential.) Furthermore, setting the catcode of the model character to 12 could easily be omitted (use some character that is known to be an `other character'), but I think it should be included: It makes the test independent of any assumption on the format running. This makes the solution more closed and selfsufficient, and therefore also simpler and more elegant (if I may say so). \end{solution} %%>>EndSolution %%\endinput \chapter{Counting arguments} \section{Exercise (hard)} %%\input{ex006} \begin{comment} [Posted to info-tex on 4 Nov 91; see exercise.004] ********************************************************************** *** Exercise 6 (hard): \end{comment} \ed{\oposted{1991/11/04}. \arch{exercise.006}.} Define a macro \cmd{\args} that can be used to fill in the proper number in the following sentence no matter how \cmd{\foo} is defined (except you may assume it is not \cmd{\outer}). The macro \verb?\tt\string\foo? has \verb?\args\foo? arguments. Is it possible to solve this if \cmd{\foo} is \cmd{\outer} also? Is it possible to make \cmd{\args} fully expandable, so that it could be used in a message: \begin{lcode} \message{The macro \noexpand\foo has \args\foo\space arguments.} \end{lcode} %%********************************************************************** %%\endinput \section{Answers} %%\input{ans006} % ans006.tex \begin{comment} Date: Mon 23 Dec 91 11:46:33-EST From: Michael Downes Subject: Answers to 'Around the bend' #2 Exercise 6 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List "*** Exercise 6 (hard): " "Define a macro \args that can be used to fill in the proper number "in the following sentence no matter how \foo is defined (except "you may assume it is not \outer). " " The macro {\tt\string\foo} has {\args\foo} arguments. " "Is it possible to solve this if \foo is \outer also? 
Is it possible "to make \args fully expandable, so that it could be used in a "message: " " \message{The macro \noexpand\foo has \args\foo\space arguments.} \end{comment} \ed{\oposted{1991/12/23}. \arch{answer.006}.} This was a tough one. All who sent in answers to this exercise (counting myself) used the approach of applying \cmd{\meaning} to \cmd{\foo} and analyzing the resulting string. There are some drawbacks to this. (1) In a \cmd{\meaning} string, all characters (other than spaces) have catcode 12. This means that all occurrences in a \cmd{\meaning} string of the character \# are indistinguishable, regardless of their true significance in the parameter text or replacement text of the macro in question. Consequently, an occurrence of a \# character, not category 6, followed by a number, in the parameter text of \cmd{\foo} can potentially make \cmd{\args} report an incorrect number of arguments. For example, in the following definitions \cmd{\foo} has no arguments, only delimiter text, in all three cases, but the \cmd{\meaning} string would appear to show that \cmd{\foo} has one argument: \begin{lcode} \def\foo\#1{} \expandafter\def\expandafter\foo\string #1{} \catcode`\#=12 \def\foo#1{} \end{lcode} (2) The following two examples produce identical \cmd{\meaning} strings: \begin{lcode} \def\foo&1{} % no arguments \catcode`\&=6 \def\foo&1{} % one argument \end{lcode} (The string is \verb?"macro:&1->"?.) I.e., characters other than \# can be used to create parameter markers in a macro definition, and such a parameter marker cannot be distinguished in a \cmd{\meaning} string from a normal use of the character in question. (3) There is no completely general way to isolate the parameter text of an arbitrary macro from the replacement text. The best you can do is remove the tail of the \cmd{\meaning} string---everything after the last occurrence of \verb?->? in the string---and say 'This is not part of the parameter text'. 
Likewise, anything preceding the first occurrence of \verb?->? is certainly
part of the parameter text. If there are two or more occurrences of
\verb?->? in the string, however, you cannot say for sure whether anything
between the first and last occurrences is parameter text or replacement text.
This raises a slight additional possibility that pseudo `parameter markers' in
the replacement text could cause \cmd{\args} to give an incorrect result. For
example:
\begin{lcode}
\edef\foo #1{\string#2->}
\end{lcode}
defining \cmd{\foo} with one argument, produces a \cmd{\meaning} string of
\begin{lcode}
macro:#1->#2->
\end{lcode}
which is exactly the same as the \cmd{\meaning} string for
\begin{lcode}
\def\foo#1->#2{}
\end{lcode}
where \cmd{\foo} has two arguments.

Speaking practically, however, rather than theoretically, using
\cmd{\meaning} to analyze the number of arguments of an arbitrary macro works
fine. Donald Arseneau's solution, below, is admirably brief and demonstrates
an easy way of handling an outer argument that I had never seen before.

\begin{solution}{Solution 1 (Donald Arseneau)}\index{Arseneau, Donald}
Here is my solution for counting arguments. It is totally expandable, and
relies on the fact that the parameter numbers must be in increasing order,
that they are only single digits, and that there is no parameter zero. Also
important is that \cmd{\meaning} of a macro defined by \verb?\def\x#{...}?
reports a syntax of \verb?{? rather than \#.
\begin{lcode}
{\catcode`\*=6 \catcode`\#=12 % use * for macro parameters while # is "other"
%
\gdef\args{\expandafter\Args\noexpand}% get rid of \outerness
%
\long\gdef\Args*1{\expandafter\countargs \meaning*1:->{}\end}%
% ... \meaning will display the parameter syntax (as "other" characters).
%
\gdef\countargs*1:*2->*3\end{\twoargs#0*2#0}% get just the parameter syntax
% ... in format #0junk#1junk...#njunk#0. \twoargs processes the list to
% ... give "n", the last number before #0.
\end{lcode} Here's what tests the parameter numbers, two at a time. (Thus, the two \verb?#0?'s in \cmd{\countargs}, so there are always at least two \verb?#n?'s detected.) When the second number of a comparison isn't zero, \cmd{\twoargs} re-executes itself to test the next pair; when the second \verb?n? is 0, the first \verb?n? is the highest parameter number, so it is output. \begin{lcode} \gdef\twoargs*1#*2*3#*4{\ifnum0=*4 *2\else % note the space to end the number \expandafter\twoargs\expandafter#\expandafter*4\fi} } \end{lcode} Here is my test suite. The character ``:'' works in a funny way: it confuses how \cmd{\countargs} reads its parameter list, and another colon gets into the supposed syntax. But it works because there are no parameters. The primitive \cmd{\halign} is reported to have no parameters because it is not a macro. This could be confusing to someone. The same confusion could arise with \cmd{\args} itself because it doesn't read the parameter right away. \begin{lcode} \def\test#1#{nothing} \def\Test[#1]#2:{\##1,#2##} \def\#{haha} \show\test \show\Test \end{lcode} (I condensed this test suite---MJD) \begin{lcode} \long\def\msg#1{\message{The object \string#1 has \args#1 arguments.}} \msg\mathpalette \msg\mathhexbox \msg\par \msg\halign \msg\args \msg\relax \msg # \msg\# \msg\test \msg\Test \msg : \msg\: \msg\csname \msg t \msg ~ \msg $ \msg ^ \end{lcode} (Outer macros---MJD) \begin{lcode} \message{The object \string\bye\space has \args\bye\space arguments.} \message{The object \string\newhelp\space has \args\newhelp\space arguments.} \bye % -- Donald Arseneau \end{lcode} \end{solution} %%>>EndSolution Although the problem statement only mentioned `macros' Arseneau earned some thoroughness points by including primitives \cmd{\halign}, \cmd{\relax}, and \cmd{\csname}, as well as characters \verb?# : t $ ^? in his tests. This is of some interest because of the difference in \cmd{\meaning} strings between macros and non-macros. 
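
To make that difference concrete, here is a small plain TeX sketch (a
throwaway illustration, with \cmd{\foo} defined only for the purpose) that
prints a few \cmd{\meaning} strings side by side:
\begin{lcode}
\def\foo#1#2{}
\message{[\meaning\foo]}%    [macro:#1#2->]
\message{[\meaning\relax]}%  [\relax]
\message{[\meaning\halign]}% [\halign]
\message{[\meaning t]}%      [the letter t]
\bye
\end{lcode}
Only the first string contains \verb?macro:? followed by a parameter text, so
a \cmd{\meaning}-based \cmd{\args} has to decide what to report when that
prefix is missing.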
In my solution for this exercise, I amused myself by trying to pack everything into as few control sequences as possible. Although I got it down to two, that's really only one less than Arseneau's four, because one control sequence in his solution is expended to handle outer macros, something my solution didn't attempt to do. %>>Solution 2 (mine) \begin{solution}{Solution 2 (mine)} \begin{lcode} % Use & instead of # temporarily. \catcode`\&=6 \catcode`\#=12 \long\def\args &1{\expandafter\countargs\meaning &1#\args->\countargs 0} \end{lcode} Analysis is restricted to the parameter text by chopping off everything after \verb?->? in the meaning string (this will leave possibly only part of the parameter text). Then we look in the parameter text for \# followed by a number (checking to make sure that the thing after \# is a number handles a few extra possibilities, such as \verb?\#? followed by non-number in the parameter text). If we find \# plus a number, we pass the number onward to the next invocation of \cmd{\countargs}, where it will end up as the returned value (argument \#5) if the next \cmd{\countargs} determines that the remaining parameter text contains no more parameter markers. \begin{lcode} \def\countargs &1#&2&3->&4\countargs &5{% \ifx\args&2&5% \else \ifodd0&21 % Then &2 is a number, carry forward. 
\countargs&3#\args->\countargs&2% \else % &2 not a number---ignore, carry forward last number instead \countargs&3#\args->\countargs&5% \fi \fi} \catcode`\#=6 \def\test{\message{The macro \noexpand\foo has \args\foo\space arguments (\meaning\foo).}} %\tracingmacros=2 \tracingcommands=2 % Success: \def\foo{No args}\test \def\foo#1{One arg}\test \def\foo#1#2{Two args}\test \def\foo./{No args, delimited}\test \def\foo#1#2#3#4#5#6#7#8#9{Nine args}\test \def\foo//#1#2#3#4#5#6#7#8#9//{Nine args, delimited}\test \def\foo#{Weird}\test \def\foo#1#{Weird, one arg}\test \def\foo#1#2#3#4#5#6#7#8#9#{Weird, nine args}\test \def\foo#1 {One arg, space delimited}\test \def\foo#1 #2 #3 #4 #5 #6 #7 #8 #9 {Nine args, space delimited}\test \def\foo/{\def\foo} \foo/ #1{Interesting}\test \edef\foo#1#2{\string #3\string #4}\test \edef\foo{\string #}\test \expandafter\edef\expandafter\foo \csname 0\string #\string #\endcsname#1#2{#1#2}\test % Failure: \def\foo->#1->#2->#3->#4->#5->#6->#7->#8->#9->{Nine args, devious delimiter}\test \expandafter\edef\expandafter\foo \csname 0\string #1\string #2\endcsname{...}\test \let\foo=\bye \test % \outer bomb \end{lcode} \end{solution} %%>>EndSolution When I originally posed this problem, I had seen far enough ahead to suspect that the drawbacks of \cmd{\meaning} mentioned above would be impossible to overcome. But \cmd{\meaning} is the only way to analyze a macro that has a nonsimple parameter text---that is, one containing delimited arguments. Another possibility I had in mind was restricting the analysis to macros with simple parameter texts---empty or having only nondelimited arguments---to see what might be done without \cmd{\meaning}. The best that I could manage in my experiments along these lines was a definition of \cmd{\args} with an unacceptably cumbersome call syntax. 
But it does have the virtue of correctly identifying any number of nondelimited arguments, no matter whether \cmd{\foo} was originally defined using \# (category 6) or some other category 6 character. %%>>Solution 3 (mine) \begin{solution}{Solution 3 (mine)} \begin{lcode} % This solution is not fully expandable, hence cannot be used % inside a \message. \def\args{\expandafter\argscontinue} \def\argscontinue{\begingroup \end{lcode} Make all digits have category 2 (= end of group) so that they will serve to end the token register assignment \verb?\global\toks1 ...? \begin{lcode} \catcode`\0=2 \catcode`\1=2 \catcode`\2=2 \catcode`\3=2 \catcode`\4=2 \catcode`\5=2 \catcode`\6=2 \catcode`\7=2 \catcode`\8=2 \end{lcode} We use \cmd{\afterassignment} to put an \cmd{\endgroup} after the token register assignment, so that numbers will revert to their ordinary catcodes. And we use \cmd{\aftergroup} to put a \cmd{\finishup} token after the \cmd{\endgroup}. Thus \cmd{\finishup} can look ahead to see what numbers are remaining; this information reveals how many arguments were used up by the \cmd{\foo} macro call. \begin{lcode} \aftergroup\finishup \afterassignment\endgroup \global\toks1\bgroup} \end{lcode} \cmd{\finishup} takes the first digit following it and returns it as the value of \cmd{\args}; any following numbers are discarded (note that \#2 is delimited by a space). \begin{lcode} \def\finishup#1#2 {%\showthe\toks1 #1} %\tracingmacros=2 \tracingcommands=2 \tracingonline=1 \def\foo{} The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments. \def\foo#1{} The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments. \edef\foo#1{\string #2\string #3\string #4->\string #4\string #3#1} The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments. \def\foo#1#2#3{a#1b#2c#3} The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments. \def\foo#1#2#3#4#5#6#7#8#9{#1#2#3#5#8bb#9} The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments. 
\end{lcode} \end{solution} %%>>EndSolution The fourth solution for Exercise 6 is by Peter Schmitt; it gets the robustness prize for carrying out a diligent analysis of \cmd{\meaning} strings that enables it to correctly handle a greater variety of exotic cases than the other solutions. Schmitt's original method of handling outer macros was effective, but more complicated than Arseneau's method, incorporated here as noted. Even though my approach was rather different from Schmitt's, some of the comments in Schmitt's solution inspired me in turn to improve my solution [2] from its previous much inferior state. %%>>Solution 4 (Peter Schmitt) \begin{solution}{Solution 4 (Peter Schmitt)}\index{Schmitt, Peter} \begin{lcode} % \args expands to: - if is not a macro % 0..9 according to the number of parameters % if the is a macro % \args is fully expandable and accepts outer macros as well. % It assumes, however, that the tested macro has been defined using the % standard parameter symbol #, % and that the current value of \escapechar is the standard backslash \. \end{lcode} The definition of the macros uses the expansion of \cmd{\meaning}\verb?\cs?: It is of the form: \begin{lcode} [..] macro: [parameter text] -> [replacement text] \end{lcode} and consists of `other characters'. The macro \cmd{\args} checks: \begin{enumerate} \item if the expansion contains `macro': \\ --- if not, then \verb?\cs? is not a macro and \cmd{\args} yields `-' \item if the expansion contains parameters \#1 etc. \\ --- if \verb?#n? is the first that is not present then \verb?\cs? takes (n-1) arguments and \cmd{\args} yields `n-1' \end{enumerate} The following special characters are chosen to make the definitions as readable as possible. Any characters having catcodes different from 12 will serve the same purpose: \begin{lcode} \catcode`\:3 \catcode`\/3 % : and / are used as parameter delimiters \catcode`\^3 % ^ is used to detect empty arguments \catcode`\?11 % ? 
is used to make the control sequences private
\end{lcode}

Since the occurrences of \# in the expansion of \cmd{\meaning}\verb?\cs? have
to be detected, \# has to be used as an `other character'. To avoid confusion
it has been replaced not only where necessary but throughout all the
definitions:
\begin{lcode}
\catcode`\#12 \catcode`\*6 % * is parameter character
\end{lcode}

\begin{itemize}
\item \verb|\?macro| is defined to be `macro' consisting of `other characters'
  using the expansion of \verb?\meaning\TeX?.
\item \verb|\?DEF| inserts these five characters into some definitions where
  they are used as parameter delimiters:
\begin{lcode}
\DEF\cs { } { }
\end{lcode}
  where the texts may contain *1 and **1 .. **9 yields
\begin{lcode}
\def\cs {}
\end{lcode}
  where *1 is replaced by `macro' and **1 yields *1 etc.
\end{itemize}
\begin{lcode}
\def\?macro *1:*2:{*1}
\edef\?macro{\expandafter\?macro\meaning\TeX:}
\def\?DEF *1*2{\def*1**1:{\long\def*1*2}\expandafter*1\?macro:}
\end{lcode}

\begin{itemize}
\item \cmd{\args} passes the \meta{token} unexpanded to \verb|\args?|
\item (taken from the solution by Donald Arseneau)
  \verb|\args?| takes one argument, expands its \cmd{\meaning} to TEXT and
  passes it to \verb|\macro?| after appending \verb|macro^|:
\item \verb|\macro?| checks the first token after the first occurrence of
  `macro': if this is \verb?^(3)?, then `macro' was not present in TEXT
  (output: -) otherwise TEXT is further investigated.
\end{itemize}
\begin{lcode}
\def\args{\expandafter\args?\noexpand}
\?DEF \args? {**1{\expandafter\macro?\meaning **1*1^:}}
\?DEF\macro? {**1*1**2:{\ifx^**2-\else\expandafter\purge? **2:\fi}}
\end{lcode}

The parameters taken by a control sequence all appear (once and in numerical
order) in the parameter text --- and no other occurrence of a pair
\verb?#n? is allowed in it. Moreover, only the same pairs \verb?#n? may occur
in the replacement text.
It is, however, not possible to simply look for occurrences of these pairs since there are tokens that may --- if followed by some number --- be (wrongly) interpreted as parameters: \begin{itemize} \item the token \verb?##? in the replacement text, and \item (as pointed out by Michael Downes) -the control symbol \verb?\#? both in the parameter text and the replacement text. \end{itemize} Since \verb?\\#n? has to be distinguished from \verb?\#n? the control symbol \verb?\\? is also important. Therefore \verb|\purge?| is used to remove all occurrences of these tokens. After that the search-macro \verb|\head?| is invoked, appending the sequence \verb?#n^(n-1)? for every possible parameter \verb?#n?. Since \verb|\purge?| has to identify the character \verb?\(12)? it is necessary to change the escapecharacter: \begin{lcode} \catcode`\!0 !catcode`!\=12 % ! is used as escape character \end{lcode} \verb|\purge?| appends \verb?## \#^ and \\^? to the TEXT as a means to stop the search for these tokens, and : as delimiter: \begin{enumerate} \item \verb|\backslash?| looks for the first occurrence of the character pair \verb?\\? in TEXT (this must be a token \verb?\\?) and replaces it by a space. If it is followed by \verb?^(3)? then the search is completed, otherwise the process is repeated. \item \verb|\numbersign?| looks for the first occurrence of the character pair \verb?\#? in the (in the meantime modified) TEXT (since all \verb?\\? have been removed this must correspond to a token \verb?\#?) and replaces it by a space. Again the process is stopped when it is followed by \verb?^(3)?. \item \verb|\parametersign?| truncates TEXT at the first occurrence of the character pair. Note that this pair must correspond to a parameter token \verb?##? in the replacement text and therefore the rest of TEXT is not needed any more. \end{enumerate} \begin{lcode} !def!purge? *1:{!backslash? *1##\#^\\^:} % \purge? could be avoided - \macro? could call \backslash? directly !def!backslash? 
*1\\*2*3:{!ifx^*2!expandafter!numbersign? !else !expandafter!backslash? !fi *1 *2*3:} !def!numbersign? *1\#*2*3:{!ifx^*2!expandafter!parametersign? !else !expandafter!numbersign? !fi *1 *2*3:} !catcode`!\0 \catcode`\!=12 % return to the normal use of backslash \def\parametersign? *1##*2:{% \head? *1^#1^0#2^1#3^2#4^3#5^4#6^5#7^6#8^7#9^8#0^9:} \end{lcode} For each n from 0 to 9 \verb|\head?| extracts the characters contained in the (appended) TEXT between the first occurrence of \verb?#n? and \verb?#(n+1)? and investigates them with \verb|\used?|. If \verb?#n? is not present in TEXT, then the first of these characters is \verb?^(3)?, taken from the appended string: \\ When this happens for the first time \verb|\used?| outputs the second character (the number of parameters) and calls \verb|\skip?| to hide all the remaining parts of the appended TEXT, otherwise \verb|\used?| checks the next item. Since eleven parameters are necessary to handle the ten cases (0..9) this duty has to be distributed on two macros: \\ The appearance of the character \verb?/(3)? is used to indicate that the second macro \verb|\tail?| has to be invoked by \verb|\used?|. \begin{lcode} \def\head? *1#1*2#2*3#3*4#4*5#5*6:{% \used? *2..:*3..:*4..:*5..:/.:% \expandafter\tail? *6://} \def\tail? *1#6*2#7*3#8*4#9*5#0*6:{\used? *2..:*3..:*4..:*5..:*6:} \def\used? *1*2*3:{\ifx^*1*2\expandafter\skip?\else\ifx/*1\else \expandafter\expandafter\expandafter\used?\fi\fi} \def\skip? *1//{} %% Finally, catcodes are turned back to normal: \catcode`\#6 \catcode`\*12 \catcode`\?12 \catcode`\:12 \catcode`\/12 \catcode`\^12 %%%%%%%%%%%%%%%%%%%%%% \long\def\test#1{ The macro {\tt\string#1} has {\args#1} arguments. 
\message{The macro \noexpand#1 has :\args#1:\space arguments.} } \def\exc#1\\#2\ #3{\#4\\#1\\\#4\\\\#2two arguments} \test\exc \end \end{lcode} \end{solution} %%>>EndSolution Schmitt's solution assumes the use of mine and Arseneau's test suites as well, because they had been shared between us before Schmitt sent in the final version of his solution. \begin{comment} Answers for Exercise 7 will follow next week. Michael Downes mjd@math.ams.com (Internet) \end{comment} %%\endinput \chapter{Self replication} \section{Exercise (hard)} %%\input{ex007} \begin{comment} [Posted to info-tex on 4 Nov 91; see exercise.004] ********************************************************************** *** Exercise 7 (hard): \end{comment} \ed{\oposted{1991/11/04}. \arch{exercise.007}.} In the September 1991 issue of Dr. Dobb's Journal, in an article `Little Languages, Big Questions' (pp. 16--25), Ray Vald\'es described a `little language' as a part of a more complex application that is \begin{quote} partitioned into two (or more) nested components: a core module that provides a primitive set of services for an application area (the ``engine''), and a surrounding module that provides programmatic access to these services. The surrounding module is typically a language interpreter for a simple, easily parsed computer language--a ``little language''. \end{quote} Since TeX seems to fall into this category, I wonder if any Dr. Dobb's readers who know TeX tried their hand at the challenge given in a sidebar (`How Strong Is Your Little Language')? \begin{quote} [An] informal benchmark of a language's computational power is the programming exercise that Ken Thompson (coauthor of Unix) used to pass the time in college. ... The goal is to write the shortest self-reproducing program: ``More precisely stated ... to write a source program that, when compiled and executed, will produce as output an exact copy of its source.'' \end{quote} When I tried it it turned out to be a real challenge for me. 
In the Unix world, for conventional compiled languages, the problem as
originally stated can assume output on the `standard output' stream; but TeX
already clutters up standard output with some of its built-in messages. This
leaves three alternatives in refining the statement of the problem to be
meaningful for TeX:

1. Write a TeX program that includes the built-in messages in its source in
such a way that it exactly fulfills the original problem statement with
standard output as the output stream.

2. Pretend the built-in messages don't exist and write a TeX program that
reproduces an exact copy of itself (with no extra garbage) in the middle of
the built-in messages.

3. Write on a different output stream.

Take your pick, any or all of the above, and see what you can come up with. I
have solutions for 2 and 3 but have not gotten around to really thinking about
1 yet. I believe it will require at least a different algorithm than the other
2, if it is not impossible.
%%%**********************************************************************
%%\endinput

\section{Answers}
%%\input{ans007}
% ans007.tex
\begin{comment}
[The `forthcoming' TUGboat article cited below appeared as
`Self-replicating macros' by Victor Eijkhout and Ron Sommeling,
TUGboat 13 (1992) no 1, p. 84]

Date: Tue 7 Jan 92 16:43:29-EST
From: Michael Downes
Subject: 'Around the bend' #2, Exercise 7, solutions
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List

"*** Exercise 7 (hard):
"
"In the September 1991 issue of Dr. Dobb's Journal, in an article
"`Little Languages, Big Questions' (pp. 16--25), Ray Vald\'es
"described a `little language' as a part of a more complex
"application that is
"
"  partitioned into two (or more) nested components: a core module
"  that provides a primitive set of services for an application area
"  (the ``engine''), and a surrounding module that provides
"  programmatic access to these services.
The surrounding module is " typically a language interpreter for a simple, easily parsed " computer language--a ``little language''. " "Since TeX seems to fall into this category, I wonder if any Dr. Dobb's "readers who know TeX tried their hand at the challenge given in a "sidebar (`How Strong Is Your Little Language')? " " [An] informal benchmark of a language's computational power is the " programming exercise that Ken Thompson (coauthor of Unix) used to " pass the time in college. ... The goal is to write the shortest " self-reproducing program: ``More precisely stated ... to write a " source program that, when compiled and executed, will produce as " output an exact copy of its source.'' " "When I tried it it turned out to be a real challenge for me. In the "Unix world, for conventional compiled languages, the problem as "originally stated can assume output on the `standard output' stream; "but TeX already clutters up standard output with some of its built-in "messages. This leaves three alternatives in refining the statement of "the problem to be meaningful for TeX: " "1. Write a TeX program that includes the built-in messages in its "source in such a way that it exactly fulfills the the original problem "statement with standard output as the output stream. " "2. Pretend the built-in messages don't exist and write a TeX program "that reproduces an exact copy of itself (with no extra garbage) "in the middle of the built-in messages. " "3. Write on a different output stream. " "Take your pick, any or all of the above, and see what you can come up "with. I have solutions for 2 and 3 but have not gotten around to really "thinking about 1 yet. I believe it will require at least a different "algorithm than the other 2, if it is not impossible. \end{comment} \ed{\oposted{1992/01/07}. \arch{answer.007}.} Plenty of good answers for this one. 
%%>>Solution 1 (mine) \begin{solution}{Solution 1 (mine)} This solution is type 2 (print the copy in the middle of TeX's built-in messages). It assumes \pfile{plain.tex} or similar has been loaded to set the catcodes of the left and right curly braces. The idea is to assign the text to the token register \cmd{\errhelp} (used merely because it is a convenient pre-existing token register), and then print out \cmd{\the}\cmd{\errhelp} twice. There is a bit of shuffling to ensure that \cmd{\errhelp} will swallow the last half of the file and that the last half of the file is equal to the first half, which contains all the preparations necessary to prepare \cmd{\errhelp} for that swallowing and the subsequent message-sending. A space is left after every control word, because this is easier than trying to prevent TeX from printing spaces after control words when the message is eventually printed on screen. The lines are carefully arranged to break at column 79 (including spaces) since this is the normal value for \verb?max_print_line?, a constant compiled into TeX which controls the length of screen output lines. It would be easy to make the lines work out nicely no matter what the working code required, by varying the length of the macro name \cmd{\selfcopy} and using, say, \cmd{\everyhbox} or \cmd{\everyjob} instead of \cmd{\errhelp}. The total number of tokens in this solution is 54. \begin{lcode} {\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end } \aftergroup \errhelp \afterassignment \selfcopy } {\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end } \aftergroup \errhelp \afterassignment \selfcopy } \end{lcode} %%>>EndSolution \end{solution} %%>>Solution 2 (mine) \begin{solution}{Solution 2 (mine)} This variation is Type 3, writing the copy to a disk file instead of to the screen. The total number of tokens in this solution is 126. 
\begin{lcode} \immediate \openout 0=\jobname .cpy {\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp }\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}} \newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12 \immediate \openout 0=\jobname .cpy {\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp }\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}} \newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12 \end{lcode} %%>>EndSolution \end{solution} I learned from Victor Eijkhout that he had submitted a short article to TUGboat discussing this very problem, well before I asked it here in 'Around the bend'. He kindly sent me a copy of the article, which contains a good discussion of the underlying ideas, and a couple of different solutions. To summarize briefly, he gave a Type 2 solution similar in length to mine, and also a solution that involved printing out the source file on PAPER! A 'Type 4' solution, in other words. I'm a little embarrassed that I didn't think of this, given that the whole idea of TeX is to print things on paper. %%>>Solution 2 (Victor Eijkhout) \begin{solution}{Solution 2 (Victor Eijkhout)}\index{Eijkhout, Victor} Forthcoming in TUGboat. It appeared as: \\ `Self-replicating macros' by Victor Eijkhout and Ron Sommeling, TUGboat 13 (1992) no 1, p. 84. %%>>EndSolution \end{solution} Although I'm giving them all together, as `Solution 3', Peter Schmitt actually sent in six different variations, including a Type 4 solution. His first solution, \pfile{log-pl.tex} is Type 2 like my first solution but comes in at 38 tokens, significantly shorter. His third solution is comparable to my second solution but once again significantly shorter (87 tokens). 
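
Schmitt's variations, which follow, recover the text of \cmd{\run} with
\cmd{\meaning} instead of storing it in a token register. The fragment
below is only a sketch of that single step for plain TeX, not part of any
submitted solution; the names \cmd{\demo} and \cmd{\strip} are made up for
the illustration. \cmd{\meaning}\cmd{\demo} expands to the characters
\verb?macro:->? followed by the replacement text, and the delimited macro
\cmd{\strip} discards everything up to and including \verb?:->?, so the
message should show just the replacement text of \cmd{\demo}.
\begin{lcode}
\def\demo{Hello \TeX }
% #1 swallows `macro', the delimiter :-> is dropped, and #2 hands back
% the first character of the replacement text unchanged
\def\strip#1:->#2{#2}
\message{\expandafter\strip\meaning\demo}
\end
\end{lcode}
Because \cmd{\meaning} produces only character tokens, nothing in the
recovered text is expanded further inside \cmd{\message} or \cmd{\write};
this is also why the spaces that follow control words have to be present in
the source file if the copy is to match it exactly.
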
\begin{solution}{Solution 3 (Peter Schmitt)}\index{Schmitt, Peter}
%%>>Solution 3 (Peter Schmitt)
The principal structure of the solution is the following:
\begin{lcode}
\def \run { \write { \def \run { } \run } } \run
\end{lcode}

The following TeX file \pfile{out-ini.tex}, when processed by INITeX,
produces a file \pfile{out-ini.out} that is identical to \\
\pfile{out-ini.tex} (case (3) below).
(The file consists of a single line; it is broken up here to make comments
possible. Each occurrence of the comment sign \% has to be removed
together with the rest of the line to produce identical output.)
\begin{lcode}
\catcode `\{1 \catcode `\}2 \catcode `\#6 % these \catcodes are required
\def \run {% a macro to be called at the end of the file
\immediate \openout 1=out-ini.out% % opens output
\def \select ##1:->##2{##2}% an auxiliary macro to extract the replacement text
\immediate \write 1{% write the output file
\catcode `\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 % % writes the first `line' of the output
\noexpand \def \noexpand \run % writes \def \run
{\expandafter \select \meaning \run }% writes the replacement text of \run
\noexpand \run }% writes the last `line' of the program
\immediate \closeout 1% close output file
\end }% close input
\run % start the macro
\end{lcode}

Comments:
\begin{enumerate}
\item \cmd{\immediate} prevents a dvi-file from being produced.
\item The TeX file can be shortened (fewer characters) by using shorter
  names, and perhaps also by using a control symbol for \cmd{\noexpand};
  neither possibility reduces the number of tokens.
  Maybe some \cmd{\space} tokens can be removed, but most of them are
  necessary because they are produced by \cmd{\meaning}.
\begin{itemize} \item \cmd{\immediate} may be omitted (produces dvi-file) \item at least with my implementation closing the output file is not necessary \end{itemize} \item The TeX-file can be modified to solve variations of the exercise: \begin{itemize} \item If the file is to be processed by plain TeX \cmd{\catcodes} need not be set (see (1) below). \item if the output file is replaced by standard output or the log file \cmd{\message} instead of \cmd{\write} can be used (see (1) and (2) below). Note that in this case macro names and spaces have to be adjusted so that the line breaks produced do not prevent processing the file (In the log file line breaks may occur even in control sequence names!) I have not (not yet?) been able to solve the exercise using more pleasant (predetermined) linebreaks. \item It is possible to produce a log file that is identical to the input file. But since the log file contains the time of processing this will be the case only at a specific date and time (see (4) below). (The time is output before the input file is read. Therefore it is impossible to change this part of output by the input.) \item Of course, the above variation can be modified to produce a screen output identical to the input file. \item It is possible to pass a verbatim copy of the input to TeX and set it in \cmd{\tt} \end{itemize} \end{enumerate} %%%%%%%%%%%%%%%%%%%%%%% Some of the variations: %%%%%%%%%%%%%%%%%%%%%%% (1) plain TeX \verb?-->? section of log file or standard output terminal \begin{lcode} %%% log-pl.tex: \def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run {\expandafter \select \meaning \run } \noexpand \run } \end } \run %%% log-pl.log This is TeX, Version 3.1(c)sb34 (preloaded format=plain3sm 91.4.28) 24 NOV 1991 02:15 ** &plain log-pl (log-pl.tex \def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run {\expandafter \select \meaning \run } \noexpand \run } \end } \run ) No pages of output. 
\end{lcode} (2) INITeX \verb?-->? section of log file or standard output terminal \begin{lcode} %%% log-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run %%% log-ini.log This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:16 ** log-ini.tex (log-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run ) No pages of output. \end{lcode} (3) INITeX \verb?-->? output file \begin{lcode} %%% out-ini.tex (Note: A single line broken at the %'s!) \catcode `\{1 \catcode `\}2 \catcode `\#6 \def \run {\immediate \openout % 1=out-ini.out\def \select ##1:->##2{##2}\immediate \write 1{\catcode % `\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 \noexpand \def % \noexpand \run {\expandafter \select \meaning \run }\noexpand \run }% \immediate \closeout 1\end }\run \end{lcode} (4) INITeX \verb?-->? log file \begin{lcode} %%% flog-ini.tex This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:17 ** flog-ini.tex (flog-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run [0] ) Output written on flog-ini.dvi (1 page, 512 bytes). 
%%% flog-ini.log This is TeX, Version 3.1(c)sb34 (INITEX) 24 NOV 1991 02:18 ** flog-ini.tex (flog-ini.tex \catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit ##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \} =2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter \selectit \meaning \run }\noexpand \run }\end }\run [0] ) Output written on flog-ini.dvi (1 page, 512 bytes). \end{lcode} (5) INI-TeX \verb?-->? log-file (formatted) \begin{lcode} %%% fmt-log.tex This is TeX, Version 3.1(c)sb34 (INITEX) 30 NOV 1991 13:13 ** fmt-log (fmt-log.tex [0 \catcode `\{=1 \catcode `\}=2 \catcode `\#=6 \def \run {\newlinechar 1 \lccode `\|=1 \lccode `\[=`\{ \lccode `\]=`\} \lowercase { \def \format ##1>##2=1##3]##4[##5]##6]{##2=1|##3]|##4[|##5]|##6]|\+} \def \+ ]##12]##2]##3]##4]]##5] { ]|##12]|##2]|##3]|##4]]|##5]|} } \write 0{\catcode `\noexpand \{=1 \catcode `\noexpand \}=2} \write 0{\catcode `\noexpand \#=6} \write 0{\noexpand \def \noexpand \run } \write 0{{\expandafter \format \meaning \run }} \write 0{\noexpand \run } \end } \run ] ) Output written on fmt-log.dvi (1 page, 512 bytes). \end{lcode} (6) INITeX \verb?-->? 
dvi-file \begin{lcode} %%% dvi-ini.tex \catcode`\% = 13 \catcode`\{ = 1 \catcode `\} = 2 \catcode`\# = 6 \catcode `\| = 13 \catcode`\% = 13 \def \run { \lccode `\[=`\{ \lccode `\]=`\} \lccode `\/=`\% \let % = \par %% \font\tt=cmtt10 \tt % \hsize 15cm \vsize 15cm \parskip 3pt \def |{\par \hskip .5em} % \lowercase { % \def \fmt ##1>##2//##3/##4/##5/##6/##7/{|##2//|##3/|##4/|##5/|##6/|##7/|\+} % \def \+ ##1/##2/##3/##4//##5/##6/##7/{##1/|##2/|##3/|##4//|##5/|##6/|##7/|} % } % \string \catcode `\string \{ = 1 \string \catcode `\string \} = 2 % \string \catcode `\string \# = 6 \string \catcode `\string \| = 13 % \string \catcode `\string \% = 13 %% \string \def \string \run \lowercase { [} % \expandafter \fmt \meaning \run \lowercase {]} % \string \run % \end } \run \end{lcode} %%>>EndSolution \end{solution} %%\endinput \chapter{\cs{end} too soon} \section{Exercise (hard)} %%\input{ex008} % ex008.tex \begin{comment} Date: 21 Jun 1993 09:49:27 -0400 (EDT) From: Michael Downes Subject: Around the Bend #8 To: info-tex@shsu.edu \end{comment} \ed{\oposted{1993/06/21}. \arch{exercise.008}.} A few readers of info-tex and comp.text.tex may recall some postings of mine under the name of `Around the Bend' more than a year ago. This was intended to be a regular quasi-monthly stream of challenging questions about TeX macro writing, but after a few appearances it fell into limbo because of too many other demands on my time. However I continue to encounter hard, interesting problems in my work so herewith wish to announce resumption of the `Around the Bend' postings on an occasional, slightly less ambitious basis. For background, here are a couple of excerpts from the first `Around the Bend' post: \begin{quote} With the encouragement of George Greenwade (the INFO-TeX list owner), I would like to propose a regular department for INFO-TeX, called `Around the bend'. 
It will consist of macro-writing challenges on the level of
the dangerous-bend exercises in the \emph{TeXbook}, with interested parties
invited to collaborate and/or compete to find the best solution. My
motivation for doing this is partly selfish: to get more feedback from
other macro writers about some of the interesting macro-writing
problems that I run into. \ldots

Solutions should be sent to me instead of to INFO-TeX or
comp.text.tex, on the premise that people usually won't want to read
others' solutions until they've had a chance to try their own hand. A
summary of the results would then be posted to the INFO-TeX list after
two or three weeks; to those who submit solutions before the deadline,
I could forward without delay solutions submitted by other people, for
comparison.
\end{quote}

And here's number 8.

%%***********************************************************************
%%*** Exercise 8 (hard):

Under certain conditions, TeX fails to give an error message for a
missing closing brace or \cmd{\endgroup} or \piif{fi}; it only gives an
unobtrusive warning message after the end of the TeX run, which is easy
to overlook:
\begin{lcode}
(\end occurred inside a group at level 1)
(\end occurred when \iffalse on line 6 was incomplete)
(\end occurred when \iftrue on line 3 was incomplete)
\end{lcode}
Is there any way to trap these conditions and give a true error
message? Suppose, say, that you are programming for a major macro
package like LaTeX and want to make sure these conditions are brought
to the user's attention.
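
A quick way to see these warnings for yourself, assuming plain TeX, is a
three-line file such as the one below. It is an illustrative sketch and
not part of the original posting. The run should finish without any error
message, and the warnings about the open group and the incomplete
\cmd{\iftrue} should appear only at the very end of the terminal output
and the log.
\begin{lcode}
\begingroup
\iftrue
\end
\end{lcode}
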
%%%*********************************************************************** \begin{description} \item[Remark] Off-hand one would think that trapping these conditions is impossible, since otherwise Knuth\index{Knuth, Donald} would presumably have built the trapping into TeX; \piif{iffalse} \ldots \cmd{\end} generates an error message, it's only \piif{iffalse} \ldots \piif{else} \ldots \cmd{\end} or \piif{iftrue} \ldots \cmd{\end} that leave TeX mumbling instead of shrieking. But in some cursory experiments, I found a not-too-bad solution for the missing end of group condition. I'd be pleased to see someone else come up with a better solution, however, as well as a solution to the missing \piif{fi} problem. \end{description} \begin{comment} Send answers to: Michael Downes mjd@math.ams.org (Internet) A summary will be posted circa July 12, 1993. \end{comment} %%\endinput \section{Answers} %%\input{ans008} % ans008.tex \begin{comment} [The addendum at bottom was not posted with the answer but added in my archives later ---mjd] Date: 22 Jul 1993 15:54:57 -0400 (EDT) From: Michael Downes Subject: Around the Bend #8 answers To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List Exercise 8 asked for a way to trap missing }, \endgroup, or \fi at the end of a [La]TeX document, in order to give error messages instead of the warning messages issued by TeX: (\end occurred inside a group at level 1) (\end occurred when \iffalse on line 6 was incomplete) \end{comment} \ed{\oposted{1993/07/22}. \arch{answer.008}.} This review of solutions is posted later than expected because I needed time to try out and understand solutions submitted by Peter Schmitt last week. For clarity's sake, I have split the solutions into two parts, one dealing with groups, the other with conditionals. 
\subsection{Groups}

Peter Schmitt\index{Schmitt, Peter} remarked that if TeX can give a warning
message for a missing endgroup there is nothing to prevent it from giving an
error message except the choice of TeX's author. In a cursory
perusal of \emph{TeX: the Program}, I wasn't able to find any explanation
from Knuth as to why he didn't make it a real error message instead
of just a warning. Perhaps someone else can shed some light here?

Now for solutions. The first one was submitted by Peter Schmitt. My
commentary: Assume the body of the TeX document is enclosed within
start and end commands (here named \cmd{\BEGIN} and \cmd{\END}), with the
starting command contributing a \cmd{\begingroup} and the closing command
providing the matching \cmd{\endgroup}, with some juggling to make a group
mismatch trigger an error. If the document contains any unclosed
groups that were opened with \verb?{? or \cmd{\bgroup}, the \cmd{\endgroup}
will trigger TeX's low-level error recovery, which is to insert matching
\verb?}?s ({\ttfamily `Missing \verb?}? inserted'}). Thus only the case of an
unmatched \cmd{\begingroup} needs to be handled. Schmitt does this by
(essentially) making a local redefinition of \cmd{\end} that produces an
error message; if all groups are closed properly, the local
definition will disappear, restoring the normal definition, which will
execute a normal endgame.

Here now is Schmitt's submitted solution. I have simplified it slightly
by disentangling some other material that will be discussed below.
\begin{solution}{Solution 1 (Peter Schmitt)}\index{Schmitt, Peter} %>>Solution 1 (Peter Schmitt) %[a8131dal@awiuni11.edvz.univie.ac.at, schmitt@awirap.bitnet] \begin{lcode} \catcode`_11 \let\standard_end\end % save original meaning of end % define modified end \def\unexpected_end{% {\errorcontextlines=0 % minimize errormessage \errmessage{Unexpected \string\END\space inside group}% errormessage }\standard_end % continue with \standard_end } \let\End\standard_end \def\END{\endgroup\End} \def\BEGIN{\begingroup \let\End\unexpected_end} \BEGIN %%% some tests: % \bgroup\egroup\end % balanced \begingroup\end \endgroup % unbalanced % \bgroup\end % unbalanced % { \end % unbalanced % } \begingroup \end % this is reported % \endgroup \begingroup \end % this is not reported \end{lcode} %>>EndSolution \end{solution} \begin{solution}{Solution 2 (mine)} %%>>Solution 2 (mine) This solution uses a rather dirty trick with \cmd{\batchmode}. Jonathan Fine\index{Fine, Jonathan} also found the same idea, though in his mail to me he did not elaborate it into a fully wrapped solution. Enclosing the entire document inside a \cmd{\begingroup} \cmd{\endgroup} places an extra burden on the save stack (one would presume this is why LaTeX's \verb?\begin{document}? and \verb?\end{document}? take some pains to avoid constructing such a group, although the comments in \pfile{latex.tex} don't provide an explicit reason). (Extra credit question: Just how much of a burden would it place on the save stack in, say, an average LaTeX document?) So my solution seeks to trap unmatched \verb?{? or \cmd{\begingroup} without enclosing the document body in a group. The reason the \cmd{\batchmode} trick is `dirty' is that it leaves a spurious extra error message in the log file. 
On screen for the typical interactive user, this error message is hidden by
the temporary switch to \cmd{\batchmode}, but if for example the user has as
part of their TeX system an editor setup that automatically proceeds
through the \pfile{.log} file to help the user take care of all error
messages, then the spurious error message will be somewhat inconvenient.

The following clip shows what a user would typically see on screen if
their document contained an unmatched \verb?{?.
\begin{lcode}
! Missing } added.
\bgrouperr ...ffalse {\fi \string } added}

\enddocument ...rgroup \bgrouperr \egroup
                                          \if \errorstopping \batchmo...
l.50 \enddocument

? h
There appears to be an unmatched opening brace or \bgroup somewhere
in your document.
?
)
No pages of output.
\end{lcode}

Here then is the code for the solution. As it stands, only the most
recent unmatched open-group is dealt with in the error message. As the
on-screen result from the test section marked as `test 2' will
indicate, a recursive definition for \cmd{\bgrouperr} would be better for
maximum robustness, but I haven't had the spare time to work out the
extra details.
\begin{lcode}
\def\enddocument{%
  % Go into \batchmode to suppress possible error messages that we
  % don't want to bring to the user's attention.
  \batchmode
  % Set a flag to enable us to handle the \endgroup properly if the
  % \egroup pairs up with an unmatched { or \bgroup.
  \def\errorstopping{TF}%
  % If the following \egroup matches with a preceding unmatched { or
  % \bgroup in the user document, then the aftergroup tokens
  % \errorstopmode \bgrouperr will be executed. Otherwise they will
  % go away into uncharted limbo.
  \aftergroup\errorstopmode\aftergroup\bgrouperr \egroup
  % If there was no unmatched { or \bgroup, then the preceding
  % \egroup was discarded by TeX. And \errorstopping is still false.
  % Otherwise we need to insert some new \aftergroup tokens.
  \if\errorstopping \batchmode
    \aftergroup\errorstopmode \aftergroup\begingrouperr
  \else \global\let\bgrouperr\begingrouperr
  \fi
  \endgroup
  \errorstopmode
  % Call two different versions of \end, just for convenient testing
  % with either plain TeX or LaTeX.
  \csname\string @\string @end\endcsname \end}

\def\bgrouperr{%
  \def\errorstopping{TT}%
  \errhelp{%
There appears to be an unmatched opening brace or \bgroup somewhere^^J%
in your document.}%
  \errmessage{Missing \iffalse{\fi\string} added}}

\def\begingrouperr{%
  \errhelp{%
There appears to be an unmatched \begingroup somewhere in your document.}%
  \errmessage{Missing \noexpand\endgroup added}}

\newlinechar=`\^^J
%
% Test 0: Leave the following three lines commented out.
%{ % Test 1: uncomment this line
%\bgroup % Test 2: uncomment the previous line and this one.
%\begingroup % Test 3: uncomment all three lines.

\enddocument
\end{lcode}
%%>>EndSolution
%\endinput
\end{solution}

\subsection{Conditionals}

Now, what about \piif{if} \ldots \piif{fi} matching? Can a method analogous
to the one for groups be applied here? Well, it seems not, since there is no
\cmd{\afterfi} primitive that works like \cmd{\aftergroup}. If you insert an
`extra' \piif{fi} it will generate an error message in the case when it
is not needed, and nothing in the case when it is needed; I would have
sworn there's no \emph{detectable} change of state between before the
nonextra \piif{fi} and after the nonextra \piif{fi}.

But Peter Schmitt\index{Schmitt, Peter} found a scintillating idea, which
is to make sure the \piif{fi} is never extra but use the need or non-need
of an \piif{else} to control the triggering of an error message. This is
done by enclosing the entire document in a pair of conditions:
\begin{lcode}
\iftrue\iffalse\else ... \fi...\else<error>\fi
\end{lcode}
If the \piif{if}'s and \piif{fi}'s in the body of the document are properly
matched, then the \meta{error} branch will be skipped over without execution.
But if an unmatched \piif{ifsomething} in the document body uses up the
\piif{fi} that is supposed to match up with the \piif{iffalse}\piif{else},
then the following \piif{else} will trigger an error message (which Schmitt
hides with \cmd{\batchmode}, using the same trick as discussed above in
Solution 2), then be discarded, and the \meta{error} branch will now be
executed. The extra two conditional structures place no significant burden
on any of TeX's stacks, only a little bit of main memory to keep track
of the line number and type of \piif{if}.

Peter had the group and conditional trapping combined in his original
solution; here is the conditional trapping part as I disentangled it.

\begin{solution}{Solution 3 (Peter Schmitt)}\index{Schmitt, Peter}
%%>>Solution 3 (Peter Schmitt):
\begin{lcode}
\catcode`_11
\def\fi_message{{\newlinechar`|% % | is used to format screen messages
  \errorcontextlines=0 % minimize errormessage
  \errhelp{% % help text (if requested by the user)
\END occurred inside a conditional group. |%
You probably have forgotten to close some \fi before.
}%
  \errmessage{Unexpected \string\END\space inside condition}% errormessage
}}
\def\BEGIN{\def\END{\fi\batchmode\else\errorstopmode\fi_message\fi
    \errorstopmode\end}%
  \iftrue\iffalse\else}

\BEGIN

%%% some tests:

% \iftrue \fi \END % balanced
\iftrue \END \fi % error message
% \iffalse \else \END \fi % error message
% \iftrue \iffalse \else \END \fi \fi % warning only
% \iftrue \iffalse \else \fi \END \fi % error message
% \iffalse \else \iffalse \else \END \fi \fi % error message
% \iffalse \else \iffalse \else \END \fi \fi % error message
\end{lcode}
%%>>EndSolution
\end{solution}

In closing, I want to point out that missing \piif{fi}'s or \cmd{\endgroup}'s
are more likely to arise from a TeX programmer's error than from
ordinary use of a macro package like LaTeX. So it might be minimally
sufficient to trap only the missing \verb?}?
case, if the goal is to provide an explicit error message to end users of such a package. %%Michael Downes PS. Hint for Exercise 10: Run the body of the posting through plain TeX. \begin{lcode} ASCII 32--64,65--126: !"#$%&'()*+,-./0123456789:;<=>?@ ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{lcode} \subsection{Addendum} I found this in \texttt{comp.text.tex}. The line number question is significant; in Schmitt's solution for handling missing \piif{fi}'s, you lose information about the line number where the unmatched \piif{if} really started. \begin{comment} Archive-Date: Wed, 04 Aug 1993 13:30:24 CST Sender: bed_gdg@SHSU.EDU From: morje@math.ohio-state.edu (Prabhav Morje) Reply-To: morje@math.ohio-state.edu (Prabhav Morje) Subject: "end occurs inside a group" error in LaTeX Date: 3 Aug 1993 22:36:30 -0400 Message-ID: <23n7be$e32@math.mps.ohio-state.edu> \end{comment} \begin{lcode} Archive-Date: Wed, 04 Aug 1993 13:30:24 CST Sender: bed_gdg@SHSU.EDU From: morje@math.ohio-state.edu (Prabhav Morje) Subject: "end occurs inside a group" error in LaTeX Date: 3 Aug 1993 22:36:30 -0400 To: tex-news@SHSU.EDU Hi, I sometimes get the error "\end occured while inside a group on level 1" while running LaTeX. I know it means there is an extra "{" somewhere. It is harmless sometimes but if I want to correct it, LaTeX never tells where the extra "{" is. Is it possible to find the line number or something more about location of the error? Any pointers will be greatly appreciated. - Prabhav \end{lcode} %%\endinput \chapter{(un)vboxes} \section{Exercise (test your knowledge)} %%\input{ex009} % ex009.tex \begin{comment} Date: 28 Jun 1993 14:57:21 -0400 (EDT) From: Michael Downes Subject: Around the Bend #9 To: info-tex@shsu.edu \end{comment} \ed{\oposted{1993/06/28}. 
\arch{exercise.009}.} Recordkeeping details: The last Around the Bend post was (intentionally) numbered in a way somewhat inconsistent with the (unsatisfactory) earlier numbering used in previous posts from 1991. I didn't draw attention to the change since I figured `who cares?' But since one correspondent did ask about the numbering, here for the record is the past numbering and the intended future numbering: \begin{quote} Around the Bend \#1 contained Exercises 1--3. \\ Around the Bend \#2 contained Exercises 4--7. \\ Around the Bend \#8 contained Exercise 8. \\ Around the Bend \#9 contains Exercise 9. \\ Around the Bend \#10 will contain Exercise 10. \\ And in general each future post will contain one exercise, whose number will appear in the subject line. \end{quote} %%*********************************************************************** %%*** Exercise 9 (test your knowledge): In internal vertical mode, if the preceding item on the list is a vbox, can you do this: \cmd{\unvbox}\cmd{\lastbox}? %%*********************************************************************** \begin{comment} An answer will be posted circa July 6, 1993. Michael Downes mjd@math.ams.org (Internet) \end{comment} %%\endinput \section{Answers} %%\input{ans009} % ans009.tex \begin{comment} Date: 07 Jul 1993 12:45:34 -0400 (EDT) From: Michael Downes Subject: Around the Bend #9, answer Sender: ITeX-Mgr@SHSU.edu To: info-tex@shsu.edu Reply-to: Michael Downes Message-id: <742063535.36965.MJD@math.ams.org> X-ListName: TeX-Related Network Discussion List "In internal vertical mode, if the preceding item on the list is a "vbox, can you do this: \unvbox\lastbox? \end{comment} \ed{\oposted{1993/07/07}. \arch{answer.009}.} The answer is no. If you tried it, you would have seen the error message: \begin{lcode} ! Missing number, treated as zero. \lastbox l.3 \unvbox\lastbox ? h A number should have been here; I inserted `0'. 
(If you can't figure out why I needed to see a number, look up `weird error' in the index to The TeXbook.) \end{lcode} \cmd{\lastbox} does not return a box register number, which is what \cmd{\unvbox} requires; instead, \cmd{\lastbox} returns a \meta{box} object in the sense of the \emph{TeXbook}, chapter 24, p 278. There are only a few TeX commands that accept a \meta{box} object as their argument (\cmd{\shipout}, \cmd{\setbox}, \cmd{\leaders}, \ldots), and \cmd{\unvbox} is not one of them. %%\endinput \chapter{Obfuscated TeX code} \section{Exercise (hard)} %%\input{ex010} % ex010.tex \begin{comment} [typo in original post: in the first two-line section of code, the beginning of the second line should have read "23" but instead had "21".] Date: 07 Jul 1993 16:11:31 -0400 (EDT) From: Michael Downes Subject: Around the Bend #10 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1993/07/07}. \arch{exercise.010}.} \begin{lcode} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \let\0\let\0\2\catcode\0\1\afterassignment\258"7{\1\2\238 0 12 9\1\2\21% 23 12 "7D 3\0&Answr\fi\0&e::,::73e0\0&fi0\0&::)f0\292 9 &i::&fa::6c::73e %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \end{lcode} %%%************************************************************************ %%%*** Exercise 10 (hard): (a) Obfuscated TeX code puzzle. Decipher the purpose of the lines above and below. (b) Why colon? 
%%%************************************************************************ %%%Send answers to: mjd@math.ams.org (Internet) \begin{lcode} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &Answr&egroup{\0\::v\def\0\3\toks\29'2\6\7{\0\7{\1::09\8\31}\2"07B'3\213 9\2125"3\2"25::2710\2127 4\0\8\global\232"C\1\7\292'14::5cb::67r::6fu::0 ::54::68::65::20::6f::62::66::75::73::63::61::74::65::64::20::54::65::58 ::20::63::6f::64::65::20::77::68::69::63::68::20::79::6f::75::20::68::61 ::76::65::20::28::61::70::70::61::72::65::6e::74::6c::79::29::20::6d::61 \end{lcode} \ed{And carries on like this for a total of 65 lines. All 65 lines are in the archived version if you need them. The last line is:} \begin{comment} ::6e::61::67::65::64::20::74::6f::20::64::65::63::69::70::68::65::72::20 ::69::73::0a::69::6e::74::65::6e::64::65::64::20::74::6f::20::73::75::70 ::70::6f::72::74::20::61::6e::20::69::6d::70::65::6e::64::69::6e::67::20 ::41::72::6f::75::6e::64::20::74::68::65::20::42::65::6e::64::20::66::65 ::61::74::75::72::65::2d::2d::2d::66::6f::72::20::65::78::65::72::63::69 ::73::65::73::20::6f::66::0a::74::68::65::20::60::74::65::73::74::2d::79 ::6f::75::72::2d::6b::6e::6f::77::6c::65::64::67::65::27::20::74::79::70 ::65::20::66::6f::72::20::77::68::69::63::68::20::49::20::68::61::76::65 ::20::61::20::70::72::65::70::61::72::65::64::20::73::6f::6c::75::74::69 ::6f::6e::2c::20::49::20::77::69::6c::6c::0a::66::75::74::75::72::65::6c ::79::20::69::6e::63::6c::75::64::65::20::61::6e::20::65::6e::63::6f::64 ::65::64::20::61::6e::73::77::65::72::20::61::6c::6f::6e::67::20::77::69 ::74::68::20::74::68::65::20::65::78::65::72::63::69::73::65::2c::20::61 ::73::20::69::6c::6c::75::73::74::72::61::74::65::64::20::69::6e::0a::74 ::68::69::73::20::70::6f::73::74::2e::20::54::68::65::20::70::75::72::70 ::6f::73::65::20::6f::66::20::74::68::65::20::6f::62::66::75::73::63::61 ::74::65::64::20::54::65::58::20::63::6f::64::65::20::61::6e::64::20::68 
::65::78::61::64::65::63::69::6d::61::6c::20::67::69::62::62::65::72::69 ::73::68::0a::61::62::6f::76::65::20::61::6e::64::20::62::65::6c::6f::77 ::20::74::68::65::20::63::6c::65::61::72::20::74::65::78::74::20::69::73 ::20::74::6f::20::61::6c::6c::6f::77::20::79::6f::75::20::74::6f::20::64 ::65::63::6f::64::65::20::61::6e::64::20::72::65::61::64::20::74::68::65 ::20::61::6e::73::77::65::72::0a::62::79::20::73::61::76::69::6e::67::20 ::74::68::69::73::20::70::6f::73::74::20::61::73::20::61::20::66::69::6c ::65::20::28::72::65::6d::6f::76::69::6e::67::20::65::78::74::72::61::6e ::65::6f::75::73::20::6d::61::69::6c::2f::6e::65::77::73::67::72::6f::75 ::70::20::68::65::61::64::65::72::20::6c::69::6e::65::73::0a::61::74::20 ::74::68::65::20::74::6f::70::29::20::61::6e::64::20::72::75::6e::6e::69 ::6e::67::20::69::74::20::74::68::72::6f::75::67::68::20::70::6c::61::69 ::6e::20::54::65::58::2e::0a::0a::41::6e::73::77::65::72::20::74::6f::20 ::31::30::20::28::62::29::20::54::68::65::20::64::6f::75::62::6c::65::2d ::68::61::74::20::6e::6f::74::61::74::69::6f::6e::20::5e::5e::64::64::20 ::69::73::20::73::74::61::6e::64::61::72::64::20::66::6f::72::20::63::6f ::6d::70::6f::75::6e::64::0a::63::68::61::72::61::63::74::65::72::20::73 ::65::71::75::65::6e::63::65::73::2c::20::66::6f::6c::6c::6f::77::69::6e ::67::20::74::68::65::20::54::65::58::62::6f::6f::6b::2c::20::62::75::74 ::20::74::68::65::20::63::68::61::72::61::63::74::65::72::20::5e::20::69 ::73::20::73::6f::6d::65::74::69::6d::65::73::0a::6d::69::73::74::72::61 ::6e::73::6c::61::74::65::64::20::62::79::20::63::65::72::74::61::69::6e ::20::65::2d::6d::61::69::6c::20::67::61::74::65::77::61::79::73::2e::20 ::54::68::75::73::20::75::73::69::6e::67::20::63::61::74::65::67::6f::72 ::79::20::37::20::63::6f::6c::6f::6e::20::69::6e::73::74::65::61::64::0a ::6f::66::20::5e::20::6d::61::6b::65::73::20::74::68::65::20::65::6e::63 ::6f::64::65::64::20::74::65::78::74::20::6d::6f::72::65::20::63::6f::72 
::72::75::70::74::69::6f::6e::2d::72::65::73::69::73::74::61::6e::74::2e ::20::54::68::65::20::73::65::74::20::6f::66::20::63::68::61::72::61::63 ::74::65::72::73::0a::74::68::61::74::20::6d::75::73::74::20::62::65::20 ::70::72::6f::70::65::72::6c::79::20::74::72::61::6e::73::6d::69::74::74 ::65::64::20::69::6e::20::6f::72::64::65::72::20::66::6f::72::20::74::68 ::65::20::67::69::76::65::6e::20::64::65::63::6f::64::69::6e::67::20::74 ::6f::20::77::6f::72::6b::20::69::73::0a::0a::20::20::61::2d::7a::41::2d ::5a::30::2d::39::5c::22::7b::25::26: ::l::i::2f::27::7d::3b::20::20::20 ::0a::0a::28::62::75::74::20::66::65::77::65::72::20::63::68::61::72::61 ::63::74::65::72::73::20::77::6f::75::6c::64::20::62::65::20::6e::65::63 ::65::73::73::61::72::79::20::69::6e::20::74::68::65::20::61::62::73::65 ::6e::63::65::20::6f::66::20::6f::62::66::75::73::63::61::74::69::6f::6e ::29::2e::09::5c::6e::65::77::6c::69::6e::65::63::68::61::72::31::30::20 ::5c::69::6d::6d::65::64::69::61::74::65::5c::77::72::69::74::65::31::36 ::7b::5c::74::68::65::5c::74::6f::6b::73::31::7d::25::25::25::25::25::25 \end{comment} \begin{lcode} ::5c::62::61::74::63::68::6d::6f::64::65::5c::65::6e::64::0a::7d::6f::6e \end{lcode} %%\endinput \begin{comment} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &Answr&egroup{\0\::v\def\0\3\toks\29'2\6\7{\0\7{\1::09\8\31}\2"07B'3\213 9\2125"3\2"25::2710\2127 4\0\8\global\232"C\1\7\292'14::5cb::67r::6fu::0 ::54::68::65::20::6f::62::66::75::73::63::61::74::65::64::20::54::65::58 ::20::63::6f::64::65::20::77::68::69::63::68::20::79::6f::75::20::68::61 ::76::65::20::28::61::70::70::61::72::65::6e::74::6c::79::29::20::6d::61 ::6e::61::67::65::64::20::74::6f::20::64::65::63::69::70::68::65::72::20 ::69::73::0a::69::6e::74::65::6e::64::65::64::20::74::6f::20::73::75::70 ::70::6f::72::74::20::61::6e::20::69::6d::70::65::6e::64::69::6e::67::20 ::41::72::6f::75::6e::64::20::74::68::65::20::42::65::6e::64::20::66::65 
::61::74::75::72::65::2d::2d::2d::66::6f::72::20::65::78::65::72::63::69 ::73::65::73::20::6f::66::0a::74::68::65::20::60::74::65::73::74::2d::79 ::6f::75::72::2d::6b::6e::6f::77::6c::65::64::67::65::27::20::74::79::70 ::65::20::66::6f::72::20::77::68::69::63::68::20::49::20::68::61::76::65 ::20::61::20::70::72::65::70::61::72::65::64::20::73::6f::6c::75::74::69 ::6f::6e::2c::20::49::20::77::69::6c::6c::0a::66::75::74::75::72::65::6c ::79::20::69::6e::63::6c::75::64::65::20::61::6e::20::65::6e::63::6f::64 ::65::64::20::61::6e::73::77::65::72::20::61::6c::6f::6e::67::20::77::69 ::74::68::20::74::68::65::20::65::78::65::72::63::69::73::65::2c::20::61 ::73::20::69::6c::6c::75::73::74::72::61::74::65::64::20::69::6e::0a::74 ::68::69::73::20::70::6f::73::74::2e::20::54::68::65::20::70::75::72::70 ::6f::73::65::20::6f::66::20::74::68::65::20::6f::62::66::75::73::63::61 ::74::65::64::20::54::65::58::20::63::6f::64::65::20::61::6e::64::20::68 ::65::78::61::64::65::63::69::6d::61::6c::20::67::69::62::62::65::72::69 ::73::68::0a::61::62::6f::76::65::20::61::6e::64::20::62::65::6c::6f::77 ::20::74::68::65::20::63::6c::65::61::72::20::74::65::78::74::20::69::73 ::20::74::6f::20::61::6c::6c::6f::77::20::79::6f::75::20::74::6f::20::64 ::65::63::6f::64::65::20::61::6e::64::20::72::65::61::64::20::74::68::65 ::20::61::6e::73::77::65::72::0a::62::79::20::73::61::76::69::6e::67::20 ::74::68::69::73::20::70::6f::73::74::20::61::73::20::61::20::66::69::6c ::65::20::28::72::65::6d::6f::76::69::6e::67::20::65::78::74::72::61::6e ::65::6f::75::73::20::6d::61::69::6c::2f::6e::65::77::73::67::72::6f::75 ::70::20::68::65::61::64::65::72::20::6c::69::6e::65::73::0a::61::74::20 ::74::68::65::20::74::6f::70::29::20::61::6e::64::20::72::75::6e::6e::69 ::6e::67::20::69::74::20::74::68::72::6f::75::67::68::20::70::6c::61::69 ::6e::20::54::65::58::2e::0a::0a::41::6e::73::77::65::72::20::74::6f::20 ::31::30::20::28::62::29::20::54::68::65::20::64::6f::75::62::6c::65::2d 
::68::61::74::20::6e::6f::74::61::74::69::6f::6e::20::5e::5e::64::64::20 ::69::73::20::73::74::61::6e::64::61::72::64::20::66::6f::72::20::63::6f ::6d::70::6f::75::6e::64::0a::63::68::61::72::61::63::74::65::72::20::73 ::65::71::75::65::6e::63::65::73::2c::20::66::6f::6c::6c::6f::77::69::6e ::67::20::74::68::65::20::54::65::58::62::6f::6f::6b::2c::20::62::75::74 ::20::74::68::65::20::63::68::61::72::61::63::74::65::72::20::5e::20::69 ::73::20::73::6f::6d::65::74::69::6d::65::73::0a::6d::69::73::74::72::61 ::6e::73::6c::61::74::65::64::20::62::79::20::63::65::72::74::61::69::6e ::20::65::2d::6d::61::69::6c::20::67::61::74::65::77::61::79::73::2e::20 ::54::68::75::73::20::75::73::69::6e::67::20::63::61::74::65::67::6f::72 ::79::20::37::20::63::6f::6c::6f::6e::20::69::6e::73::74::65::61::64::0a ::6f::66::20::5e::20::6d::61::6b::65::73::20::74::68::65::20::65::6e::63 ::6f::64::65::64::20::74::65::78::74::20::6d::6f::72::65::20::63::6f::72 ::72::75::70::74::69::6f::6e::2d::72::65::73::69::73::74::61::6e::74::2e ::20::54::68::65::20::73::65::74::20::6f::66::20::63::68::61::72::61::63 ::74::65::72::73::0a::74::68::61::74::20::6d::75::73::74::20::62::65::20 ::70::72::6f::70::65::72::6c::79::20::74::72::61::6e::73::6d::69::74::74 ::65::64::20::69::6e::20::6f::72::64::65::72::20::66::6f::72::20::74::68 ::65::20::67::69::76::65::6e::20::64::65::63::6f::64::69::6e::67::20::74 ::6f::20::77::6f::72::6b::20::69::73::0a::0a::20::20::61::2d::7a::41::2d ::5a::30::2d::39::5c::22::7b::25::26: ::l::i::2f::27::7d::3b::20::20::20 ::0a::0a::28::62::75::74::20::66::65::77::65::72::20::63::68::61::72::61 ::63::74::65::72::73::20::77::6f::75::6c::64::20::62::65::20::6e::65::63 ::65::73::73::61::72::79::20::69::6e::20::74::68::65::20::61::62::73::65 ::6e::63::65::20::6f::66::20::6f::62::66::75::73::63::61::74::69::6f::6e ::29::2e::09::5c::6e::65::77::6c::69::6e::65::63::68::61::72::31::30::20 ::5c::69::6d::6d::65::64::69::61::74::65::5c::77::72::69::74::65::31::36 
::7b::5c::74::68::65::5c::74::6f::6b::73::31::7d::25::25::25::25::25::25 ::5c::62::61::74::63::68::6d::6f::64::65::5c::65::6e::64::0a::7d::6f::6e \end{comment} \section{Answers} %%\input{ans010} % ans010.tex \begin{comment} Date: 13 Sep 1993 16:28:51 -0400 (EDT) From: Michael Downes Subject: Around the Bend #10, answer To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1993/09/13}. \arch{answer.010}.} Answer to 10(a). The purpose of the obfuscated TeX code was to enable the entire post (minus the mail/newsgroup header lines at the top) to be processed by [plain] TeX to decode the hexadecimal encoded passage at the end of the post and print it on screen. The contents of that passage were simply the answers to 10(a) and 10(b). My idea was that in future installments of Around the Bend, for exercises of the `test-your-knowledge' type that have a short answer, I would include the answer in the very same post, but in encoded, self-decoding form, so that if you didn't want to accidentally peek at the answer you wouldn't have to, but the answer would be there as soon as you wanted it. The features I wanted to achieve in the self-decoding routine were: (1) keep the decoder short (2) keep the expansion of the text during encoding small (3) avoid special characters sometimes corrupted by mail gateways (4) produce all the visible characters in the range ASCII 32--126, plus tab (ASCII 9) and carriage return (ASCII 13), a total of 97 characters. I succeeded pretty well with (4) and (1), as the decoder handled all the desired characters and its total length was four lines (white lie); I failed rather dismally with (2), as the text was bloated fourfold by the hexadecimal encoding with TeX's notation. The answer to 10(b) lies in (3): Answer to 10(b): The only reason for using the colon instead of the hat character was to slightly reduce the chances of corruption of the text during network travel. 
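
As a minimal illustration of the notation (an editorial sketch for plain TeX,
not Downes's decoder), the fragment below gives the colon category code 7, so
that a doubled colon followed by two lowercase hexadecimal digits is read the
same way as the usual \verb?^^hh? form. The three-letter sample text is an
arbitrary choice; it occupies twelve characters once encoded, which is the
fourfold expansion lamented under goal (2).
\begin{lcode}
% Editorial sketch for plain TeX; not taken from the original post.
\catcode`\:=7          % colon now behaves like ^ in the ^^hh notation
\message{::54::65::58}% the 12 characters ::54::65::58 are read as TeX
\bye
\end{lcode}
The category code must be changed on an earlier line than the encoded text,
because the conversion happens while TeX is reading the input line.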
Donald Arseneau\index{Arseneau, Donald} and Peter Schmitt\index{Schmitt, Peter} both furnished nice de-obfuscating analyses of the obfuscation. Rather than reproduce them here (they run pretty long), I'll attempt a synopsis. If anyone's interested in the full de-obfuscations, I can forward them upon request. Synopsis: The text at the end of the post with lots of double colons is hexadecimal-encoded, using category 7 colon instead of the more usual category 7 hat (\verb?^?) for TeX's special character notation. The goals are: (1) Skip over the clear text part at the top of the post. (2) Take the encoded text at the bottom of the post and write it on screen. Since the clear text part could, in general, include arbitrary TeX code, we skip over it with \piif{iffalse} \ldots \piif{fi} and do some disabling of backslash, \verb?^^L?, and certain other things. (The closing \piif{fi} is written with an alternate escape character, \verb?&?, instead of backslash, and a more unusual name, \verb?&Answr?, is substituted, for reasons too complicated to go into here.) Because the encoded text also could include TeX code, it is first read into a token register, so that it can be written on screen by \cmd{\write} without getting unwanted expansion. Catcodes of a few special characters \verb?\ { } % ~? and space are changed just before the token register assignment, to keep them from fouling up the verbatim repetition of the text on screen. 
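
The role of the token register can be seen in isolation in the following
editorial sketch for plain TeX. It is not part of the original decoder, and
\cmd{\sample} is a hypothetical macro introduced only for the demonstration.
A macro written directly inside \cmd{\write} is expanded, whereas the same
macro delivered through \verb?\the\toks0? is written out by name.
\begin{lcode}
% Editorial sketch for plain TeX; \sample is a made-up macro.
\def\sample{expanded!}
\immediate\write16{\sample}     % \write expands: shows  expanded!
\toks0={\sample}
\immediate\write16{\the\toks0}  % tokens from \the\toks0 are not expanded
                                % further: shows the name \sample
\bye
\end{lcode}
This covers only the expansion problem; characters such as \verb?%? still
need the catcode changes described above before they can be stored in the
register at all.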
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$
\end{comment}
%%\endinput

\chapter{Decoding obfuscated TeX code}

\section{Exercise (hard)}
%%\input{ex011}
% ex011.tex
\begin{comment}
Date: 15 Sep 1993 16:34:45 -0400 (EDT)
From: Michael Downes
Subject: Around the Bend #11
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}

\ed{\oposted{1993/09/15}. \arch{exercise.011}.}

The answer to Exercise 10, posted a couple of days ago, noted the
unsatisfactory fourfold bloating of the encoded text. This leads to
Exercise 11, which is rather difficult (double-dangerous bend level).

%%************************************************************************
%%***
Exercise 11 (hard): Write your own decoder to solve the problem I set for
myself in Exercise 10: Using as few lines of TeX code as possible, set up an
Around the Bend post containing a typical exercise so that it can be
processed by plain TeX to (a) skip over the exercise text and (b) decode an
embedded encoded answer. Come up with a better encoding idea than my
previous one, that doesn't increase the size of the text by 300\% during
encoding.
%%************************************************************************

Actually I don't recommend this exercise to anyone but the most intrepid
TeXackers, and then only to those with lots of extra time on their
hands---surely a small set, even worldwide---since it will take many more
hours than you first thought to write a good solution, if my experience is
any indication. Issuing the problem now as an exercise is more to place it on
record, since I'm working on it anyway, than to instigate serious attempts at
a solution by other people.
The answer to Exercise 10 mentioned four design goals: (1) small decoder;
(2) minimum expansion of text during encoding; (3) avoidance of special
characters that tend to be corrupted by mailers or network gateways;
(4) supported character set ASCII 9,13,32--126 in the text to be encoded.
However, in my ongoing efforts to wrassle with this problem, I have since
decided to drop ASCII 9 [tab] from (4), and to eliminate (3), because it
seems to be an independent issue: If mistranslated characters are a problem
for the reader then they are a problem for the unencoded exercise text as
well, and not just for the encoded answer.

So now I am assuming that the reader has in hand a reliable copy of the
posting with newlines and all visible ASCII 32--126 accurately transmitted,
and I am using basically a simple translation table for the encoding and
decoding (beware: oversimplification). Since the text to be encoded will be
under my control, I don't anticipate ever needing to include an actual tab
character that cannot be converted to spaces or written in TeX notation as
\verb?^^I?.

As things currently stand I am also using a TeX encoder to help me with
testing, but that is not a requirement; prospective solvers should feel free
to consider all possible encoding methods, including writing a short program
in C or another common language for encoding test material, or perhaps even
using a tool like uuencode or vvencode as the encoder and then seeing if a
short TeX decoder can be written.

A summary of solutions, or more likely, `the' solution (mine), will be posted
December 31, 1993. But you will probably see my solution, or evolutionary
solutions, before then in some upcoming Around the Bend postings, so don't
look too closely if you don't want your fresh, original outlook on the
problem to be contaminated by my ideas.

If any readers do have difficulties with mistranslated characters in Around
the Bend postings, I would like to hear the details.
For checking, I give an ordered list of the ASCII characters 32--126 below.

%%Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%mjd@math.ams.org (Internet)
\begin{lcode}
ASCII 32--54,55--126: !"#$%&'()*+,-./0123456789
:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$
\end{lcode}
%%\endinput

\section{Answers}
%%\input{ans011}
% ans011.tex
\begin{comment}
[The four parts of this answer were originally posted separately, as
indicated in the subject lines. Addendum 1 is the full text of Donald
Arseneau's solution, which appeared in abridged form in part 3. Also
addendum 2, containing a companion TeX encoder for my decoder, was not
posted.]

Date: 17 Aug 1994 16:24:12 -0400 (EDT)
From: Michael Downes
Subject: Around the Bend #11, solutions, part 1 of 4
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}

\ed{\oposted{1994/08/17} in four parts. \arch{answer.011}.}

\subsection{Part 1}

Exercise 11 (several months ago) asked for an encoding scheme and minimal
decoder that would permit setting up an Around the Bend post to include the
answer in encoded form, decodable by simply running the posting through plain
TeX. Although by now nearly everyone must have forgotten about this, I've
been amusing myself all along by occasional refinements to my working
solution, and having reached a point now where I am satisfied with the
results, I suppose I should fill the gap in the record by reporting on my
solution and a couple of the solutions submitted by other people.

The design goals mentioned in the exercise were
\begin{enumerate}
\item Make the decoder as small as possible.
\item Make the encoding scheme `compact', i.e., strive to keep the encoded
text not much larger than the unencoded version.
\item Allow ASCII 13,32--126 (at least) in the text to be encoded. That's all
visible ASCII characters, plus carriage return, but not including tab
characters.
(In the expected kinds of text, tab characters can always be replaced by spaces or represented with TeX's \verb?^^I? or \verb?^^09? notation.) \end{enumerate} My solution is demonstrated below. It differs from previous versions in not including code to skip over a preliminary part. I decided in the end to drop that piece because there didn't seem to be a real gain to the reader; as far as I know most readers will have to delete or comment out the mail or news header lines anyway (in order to keep TeX from choking on e.g. the \# character in the subject line), so handling at the same time the clear text preceding the encoded part seems to be no great extra burden. (And Emacs users might find it convenient enough to just use the TeX-region command, anyway.) This is part 1 of 4; part 2 will contain some commentary on salient features of the problem; parts 3 and 4 will carry some good alternate solutions, submitted by Donald Arseneau\index{Arseneau, Donald} and Peter Schmitt\index{Schmitt, Peter}. 
\begin{lcode}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%%%% Self-decoding example: run the following text through plain TeX %%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode \m
13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m
12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi
\else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1
6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\f"34\u\f\m\i\m32\u\f\m\c\m12\i\m35~
%T[D;[D;bRDK;#;DT(=K;K?DK$;?!1=n/K[!M;wn;D[M!#KR=?;p[!?D$;`T[1T;[!1pR8?4
#pp;KT?;1T#=#1K?=D;[!;KT?;DR//(=K?8;D?K244Q[1T#?p;o(`!?D;PPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP4wb8Sw#KT2#wD2(=M;e5!K?=!?Kl;Z
{h55;UN++c\$cc++GNj);~;~BBIPW^elsz$+29@GNU\cj4qx")07>ELSZahov}'.5ELSZahov}'.5
\end{lcode}

\begin{comment}
Subject: Around the Bend #11, solutions, part 2 of 4
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}

%Discussion of Around the Bend \#11; part 2.
\subsection{Part 2: Discussion}

%ENCODING
\subsubsection{Encoding}

The general form that I wanted the encoded text to have was: a solid block of
characters, split into lines at the 72-character limit that is imposed on all
Around the Bend postings. Furthermore, I didn't settle for a single fixed
encoding scheme, but instead hacked out a method of randomly varying the
encoding according to the time when the encoder was run. Thus each encoded
posting gets a different cipher.
\begin{quote}
Source character set: ASCII 13,32--126 \\
Target character set: ASCII 33--126
\end{quote}

Carriage return (13) cannot be included in the target set because of the
72-character limit on line length.
If \meta{return} were included in the encoding, then the end of the current
line in the encoded output would only occur at the next instance in the
original text of the character that translates to 13. And depending on what
that character is, who knows how long the encoded line could be? Perhaps as
long as the entire text.

Space (32) is not included in the target set for a subtler reason. If spaces
in the encoded text happen to fall at the end of a line, they will be dropped
by TeX during the decoding process, instead of decoded. So we either must
exclude them from the target set, or make sure that they never fall at the
end of a line. By excluding space from the target set, we make it possible
for the decoder to use a space as its argument delimiter. If we have only one
space, at the end of the encoded text, it is not so hard to ensure that it
does not fall at the end of a line. But note that the decoder must make sure
to change the catcode of space to something other than 10, so that it will
not disappear if it falls at the \emph{beginning} of a line.

Note that the target set 33--126 is smaller than the source set 13,32--126.
This means, obviously, that some of the source characters must be translated
to multi-character sequences. Given that \verb?~? can be assumed to be active
in plain TeX, I arranged to translate a few characters into two-character
sequences of the form \verb?~X? where potentially X is any character in the
target set (including \verb?~?). Then the decoding process can translate back
by giving \verb?~? a suitable definition.

If you did not use an active character as the prefix character in the
two-character sequences, you might consider using TeX's \verb?^^? notation to
handle the extra characters in the source set. Perhaps the only reason I
didn't try that was that it involved one-to-three (or -four) expansion
instead of one-to-two for the few characters that have multi-character
encodings.
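
As a quick check of the counting (an editorial aside; the labels $S$ for the
source set and $T$ for the target set are introduced here only for
convenience), reserving \verb?~? as the prefix character leaves $|T| - 1$
single-character codes:
\[
|S| = 1 + (126 - 32 + 1) = 96, \qquad
|T| = 126 - 33 + 1 = 94, \qquad
|S| - (|T| - 1) = 96 - 93 = 3 .
\]
So exactly three source characters need a two-character encoding, and the
encoded text is longer than the original by one character for each
occurrence of one of those three.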
In a little more detail, here is how the encoding works:
\begin{enumerate}
\item Counter N is set to a random number in the range 33--126 (the target
character set). Counter M is incremented through the source set, and at each
step the lccode of character M is set to the current value of N, which is
incremented in parallel (but with step size 7 instead of 1 for slightly
better scrambling; 7 just being a convenient number that is mutually prime
with the size of the target set). Then
\begin{lcode}
\lowercase{\immediate\write\outfile{...}}
\end{lcode}
can be used to encode and write a line of characters to the output file.
When counter N passes 125, it is wrapped around to the bottom of the range
(the decoder in Part 1 does this by subtracting 93 whenever the value
exceeds 125). Character 126 (\verb?~?) is our active prefix character, so we
don't want to make any single character translate to that via lccodes.

\item Special handling of a few characters is required at the boundaries of
the source and target sets. Let I = the initial value of N. Then we start the
encoding by setting lccode13 (return) = I and lccode32 (space) = I + 7, the
next value of N. Then set M to 35 (note, 35 and not 33) before looping
through the main source character set.

\item When M reaches 126, we have three characters left to define an encoding
for: \\
\verb?126 ~, 33 !, 34 "?. \\
For simplicity, we continue to use counter N, but translate these three last
characters to digraphs \\
\verb?~[N] ~[N+7] ~[N+14]?, \\
where \verb?[N]? means character N.
\end{enumerate}

%DECODING
\subsubsection{Decoding}

Given the method of encoding described above, decoding is pretty simple. We
just have to set up a suitable uccode table, and apply it. For a few
characters we have to make a suitable definition for \verb?~? so that
\verb?~x, ~y, ~z? (where x y z are random) will be translated back to
\verb?~ ! "?. Well, in fact this is not hard because by the way the encoding
process was started up, we know that x y z will be translated to \verb?^^M?,
space, and \# by the uppercasing, so we merely have to define \verb?~^^M?
to produce \verb?~?, \verb?~space? to produce \verb?!?, and \verb?~#? to
produce \verb?"?. (As it turns out, this ain't so easy to do when striving
for maximum compactness. My final version for this cost me no little work.)
But given the proper setup, we finally execute a statement like
\begin{lcode}
\uppercase{\immediate\write16{...ENCODED TEXT...}}\end
\end{lcode}
or actually, since the encoded text includes all characters in the range
33--126, but with a space character (32) at the end:
\begin{lcode}
\def\temp#1 {\uppercase{\immediate\write16{#1}}\end}
\temp
\end{lcode}

Clearly, this limits the amount of the encoded text to the currently
available main memory of TeX. This is no real drawback for the limited
application for which this decoder was written: encrypted answers to Around
the Bend exercises. Donald Arseneau mentions in his solution (part 3, to
follow) the idea of decoding line by line. This would not be too difficult,
but would probably slightly increase the length of the decoder (maybe making
it impossible for me to keep my own version of the decoder stuffed into the
current five lines).

\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$

Date: 18 Aug 1994 15:37:41 -0400 (EDT)
From: Michael Downes
Subject: Around the Bend #11, solutions, part 3 of 4
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}

\subsection{Part 3}

Some selections from Donald Arseneau's\index{Arseneau, Donald} solution and
commentary. The entire solution is rather long so I won't post it in full;
request it from Donald or me if you're interested.
%%======================================================================== %%Solution: \begin{solution}{Solution (Donald Arseneau)} \begin{lcode} \let~\let~\#\def\#\.{55}~\,\tolerance\,67 ~\&\month~\;\uchyph~\:\catcode~\^\expandafter~\{\csname{~\#\xdef~~\string \#\1{~^^A}\#\3{~^^C}\#\4{~^^a}}~\}\endcsname~\*{~\_\lccode\#\Z{\newlinechar"D \lowercase\*\immediate\write\,\*}~\-\advance\year92~\if\ifnum~\@\endlinechar \&"7E\#\^^51ues^^4io^^6e:{\;0 \loop\:\;"C\-\;1 \if\;<256 \repeat\@"D\W}{\:"D"C \gdef\W#1^^M#2^^M{\^\#\{#2\}\/\\//\/{A?^^M,Zz\over}\#\X##1^^M{\^\if^^8\{##1\^% \}\{#2\}\^\Y\else\^\X\fi}\X}}\#\Y{\;35\loop\_\,\;\if\;<\&\-\,\.\-\;1\if\,>\& \-\,-\year\fi\repeat\:1'0\:3"2\:33'7\_"20`"\_`""20\@-1\Z} \Question: *********************************************************************** *** Exercise 11 (hard): Write your own decoder to solve the problem I set for myself in Exercise 10: Using as few lines of TeX code as possible, set up an Around the Bend post containing a typical exercise so that it can be processed by plain TeX to (a) skip over the exercise text and (b) decode an embedded encoded answer. Come up with a better encoding idea than my previous one, that doesn't increase the size of the text by 300% during encoding. *********************************************************************** U"N5"M5[ZIm~f!!0dU!!0dU")"656"Yk3j"kH"jZ53"I"WZ5~m"I#kf"$Ej"WI34gj "XmI~~i"3Ij53H5m6x""]kEX!!0dU"$m46"Fk3j54#"FXkYFjm6"Ym"jk"3m46"5j"I 4iWIi"I46"I|k56"jZm"jmYFjIj5k4!!0dU"jk"3Fm46"YkXm"j5Ym"k4"5jx"")"lE 3j"Fk~53Zm6"5j"kHH"jk6Iix!!0dU!!0dU"KZIj")"WkE~6"~5Gm"jk"6k"53,!!0d U""A"YIGm"jZm"6m[k654#"YI[Xk3"3ZkXjmX"B4kjm"jZIj"54"Yi"HkXYIjf"I~~" jZm!!0dU""""YI[Xk[k6m"FXm[m6m3"jZm"}Em3j5k4f"WZ5[Z"~kkG3"WkX3m"jZI4 "ikEX"3k~Ej5k4xy!!0dU""A"93m"I[j5|m"[ZIXI[jmX3"XIjZmX"jZI4"J~kWmX[I [...] !!0d!!03!!03!!A{end!!A} ======================================================================== \end{lcode} Commentary (Donald Arseneau): I did most of this a while ago, but wasn't really satisfied. 
Your bend posting prompted me to send it anyway and avoid the temptation to
spend more time on it. I just polished it off today. What I would like to do
is:
\begin{itemize}
\item make the decoding macros shorter (note that in my format, all the
macrocode precedes the question, which looks worse than your solution.)
\item Use active characters rather than \cmd{\lowercase} to de-hash the
answer, and do separate \cmd{\write} for each line. That's to avoid memory
overflow.
\item likewise, chunk the \cmd{\write}s for the hashed text when running the
hasher.
\item \ldots
\end{itemize}

%===================================================================
This file should be clear! Only the hidden (hashed) text and the macros to
UNhash it should be obfuscated because they will be given with the question.

\noindent\textit{The hidden answer}

The printable characters \# through \verb?~? (35--126) are permuted through a
simple hashing with a chosen starting value and multiplier. Non-printing
characters are represented by their hexadecimal codes in the form
\verb?!!hh? (where h is a hex digit [higit?]); the \verb?!? character will
act like \verb?^? when the text is decoded. I don't want spaces in the coded
text, but I also don't want to use \verb?!!20? because there are likely many
spaces, so space is represented by \verb?"? and \verb?"? is represented as
\verb?!!20?.

There are three other special (reserved) characters besides the exclamation
point: \verb?^A?, \verb?^B?, \verb?^C? (ASCII 1, 2, 3). They are used as
follows:
\begin{lcode}
% character   use               coded as
% ---------   ---------------   -------------
%     !       superscript       \1 ( !!A1 )
%             (for hex codes)
%     "       space             !!20 (trades with space)
%    ^A       escape (\)        \2 ( !!A2 )
%    ^B       opening ({)       \3 ( !!A3 )
%    ^C       closing (})       \4 ( !!A4 )
\end{lcode}
All other characters are represented by their permuted printable character,
or by their normal hexadecimal form: \verb?!!15?, \verb?!!0a?, \verb?!!a4?,
\verb?!!7f? etc.
The original coding is done through active characters, with all characters
defined to produce their non-active coded text (either hashed or hex). The
decoding of hex (non-printing) characters is automatic; the decoding of the
special four is done through simple definitions; the decoding of printable
characters is done by loading the de-hashed character values into the
\cmd{\lccode} and applying \cmd{\lowercase}.

Some of the longest bits in the coder macro concern breaking the coded text
into lines of 64--68 characters. If the first character in a line (after
breaking) is a period, or the first two characters are \verb?--?, the first
character is given in hex representation in fear of maniacal mail gateways.
The other dangerous characters like \verb?^ ` \ ~? are not treated carefully
because they had to have been preserved for the macros to work at all.

\noindent\textit{The skipped question}

The question text is skipped with most special category codes turned off.
The only functioning input is \verb?^M? due to \cmd{\obeylines}. The active
\verb?^M? checks each line of input looking for the marker text to end the
question material. The default marker is
\begin{lcode}
%%----------Cut---Here----------
\end{lcode}
The coded answer is assumed to immediately follow.

\noindent\textit{The coder}

\verb? [...] the coder routine [...]? \\
asks for three file names: the \cmd{\QuestionFileName} should contain the
text of the question; the \cmd{\SolutionFileName} should have the answer; the
complete question/answer posting will be written to \cmd{\OutputFileName}.
(Run this file through plain TeX.)

\ldots

There are 92 characters that will be hashed (\verb?35=#? to \verb?126=~?).
The hashing multiplier must be mutually prime with $92 = 23 \times 2^2$ and
be less than 92. The start value (seed) can be anything in the range
35--126.

\ldots

All that's left to define are the skipper module and the decoder module.
They both are written into the posting to be executed by the receiver.
They are compressed and obfuscated, but the obfuscation is mostly just
compression: using command symbols like \verb?\,? for longer command words,
and using built-in registers instead of allocating registers. Some of the
abbreviations and the choices of register are meant to be confusing and/or
silly. Plain-text versions of the modules are given here, as well as a
glossary of the obfuscation.

Here is the skipper module. It is used in the form:
\begin{lcode}
% \Question:
% a special line of text
% anything that is skipped entirely,
% until again seeing
% a special line of text
\end{lcode}

\begin{lcode}
\def\Question:{\bgroup \aftergroup\end \allother \Skipper}
\end{lcode}

\cmd{\Skipper} starts the skipping by reading the delimiter text and defining
the macro `\cmd{\SkipLine}' to skip a line, testing for the end text. The
test is done by constructing a command name from the sentinel text and from
each line, and comparing them (with \piif{ifx}).
\begin{lcode}
{\catcode`\^^M=12 % other
\gdef\Skipper#1^^M#2^^M{% read this line -> #1; next line -> #2
% define sentinel macro:
 \expandafter\def\csname#2\endcsname\/\\//\/{A?^^M,Zz\over}%
% define macro to read line and compare it with sentinel:
 \def\SkipLine##1^^M{\expandafter%
   \ifx\csname##1\expandafter\endcsname\csname#2\endcsname%
     \expandafter \DecodeAnswer % finished skipping
   \else%
     \expandafter \SkipLine % keep skipping
   \fi}%
 \SkipLine}% begin skipping (cf. the trailing \X in the compressed code)
}
\end{lcode}

\cmd{\DecodeAnswer} unhashes the answer text and writes it to the screen. The
unprintable characters represented as \verb?!!hh? are left as they are (i.e.,
possibly unprintable!). \texttt{Control-M} (\verb?!!0d?) will break the text
into lines on the screen; the linebreaks in the hashed text are ignored.
\cmd{\HS} is set to the seed value before \cmd{\DecodeAnswer} is invoked.
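
The comparison at the heart of \cmd{\SkipLine} can be tried on its own. The
following is an editorial sketch for plain TeX, not part of Donald's
submission; the macro \cmd{\test} and the sentinel text \texttt{stop here}
are hypothetical names chosen for the demonstration. It also suggests why the
sentinel name is given a definition at all: a name that \cmd{\csname} has to
create is made equal to \cmd{\relax}, so without a definition for the
sentinel every line would compare equal to it.
\begin{lcode}
% Editorial sketch for plain TeX; \test and "stop here" are made-up names.
\expandafter\def\csname stop here\endcsname{unlikely body}
\def\test#1{%
  \expandafter\ifx\csname#1\expandafter\endcsname
                  \csname stop here\endcsname
    \message{[#1: match]}%
  \else
    \message{[#1: no match]}%
  \fi}
\test{stop here}  % same control sequence on both sides: match
\test{keep going} % \relax versus a macro: no match
\bye
\end{lcode}
The two \cmd{\expandafter}s make sure that both names have been built before
\cmd{\ifx} compares them, as in the module above.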
\end{solution} \begin{comment} Date: 18 Aug 1994 15:38:30 -0400 (EDT) From: Michael Downes Subject: Around the Bend #11, solutions, part 4 of 4 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} Here is Peter Schmitt's solution to Around the Bend \#11. \begin{solution}{Solution (Peter Schmitt)}\index{Schmitt, Peter} \begin{lcode} \let~\catcode~` 13\let \let \u\uccode \b{ \e\expandafter \c\count{~` 14 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \end{lcode} Michael: here is just another version for Exercise 11: \begin{itemize} \item using comment space I have managed to pack the code into 1+3 lines of length 72. \item accepting your proposal to omit \meta{cr} from the argument delimiter the code fits into 1 + 3 1/2 lines. \end{itemize} Maybe, that still a few characters can be saved, but I expect that a major gain can (if at all) only be achieved by a different coding method. best wishes, Peter P.S.: this is the second variant: \begin{lcode} \let~\catcode~12 9~`^13~13 9\let^\def{^^#1__{\egroup}~`\\9~`{9~`}9 ^ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% text to be skipped %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% __~` 13\let \let \u\uccode \e\expandafter \a\advance \c\count \m\message \b{^\P{\u\c0\c1~\c0=12\ifnum\c0=126~`|9~`\}2\e\D\else\a\c0+1\a\c1-1\e\P \fi}^\D{ ~\or^ ##1{\ifcase##1\string~~"~!~{~}{\newlinechar`!\m{!}}\m{~}% \e\end\fi}\uppercase\b\m\b}\c0`!\c1`}\P P.P.S.: I was lazy and have not prepared an updated version of the coded text. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% } \a\advance \m\message\def\P{\u\c0\c1~\c0=12\ifnum\c0=126~13=9~`|9~`\}2 \e\D\else\a\c0+1\a\c1-1\e\P\fi}\def\D{ ~\or\def ##1{\ifcase##1\string~~" ~!~{~}{\newlinechar`!\m{!}}\m{~}\e\end\fi}\uppercase\b\m\b}\c0`!\c1=`}\P jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy j~~B;=| *;/:9>B@@Rml j~~#B:98B.,9.=,9+35.#B;=*;/:9>BBml~B;=*;/:9>B#ml~B;=*;/:9>B!ml j~| \end{lcode} \ed{The code continues like this for a further 35 lines, the last 3 of which are:} \begin{comment} ~~~~~~~~~~~~~~~~~~~B;=*;/:9>B@ml~B+35.! j~~B;=*;/:9>B@@QmlB:98B+35.{m@@Q??#B97| ,/).!B.,9.=,9+35. jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyyyyyyy j+35..9:~*9&*~d~+35..507~5+~+*/..9:~<%~*'/~;/0+9;)*5(9~+)<+;,5.* j~| ~~~~~~~~~~~~~~;6=,=;*9,+~=*~*69~<97500507~/8~=~2509~d~?? j90;/:9:~*9&*~d~1)+*~| 90:~/0~/09~,576*~<,=;9~d~! jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyyyyyyyyyyyyyyyy jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyyyyyyyyyyyyyyy jyyy~*69~:9;/:507~1=;,/+ jyyy~*69~=;*)=2~1=;,/+~=,9~+2576*2| %~1/,9~;/1.25;=*9:~*/~=22/'~+6/,*9,~;/:9 jyyy~*69~*9&*~*/~90;/:9~1)+*~90:~/0~8| /,1899:~v]K[UU~mlu jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyyyyyyyyyyyyyy jB:98B.,9.=,9#B);;/:9~B;/)0*n~B;/)0*m j~~~~~~~~~~~~~B;=*;/:9| ~B;/)0*n~ml j~~~~~~~~~~~~~B580)1~B;/)0*n~a~mlh j~~~~~~~~~~~~~B;=*;/:9~>B@@Q~e j ~~~~~~~~~~~~~B;=*;/:9~>B"~e j~~~~~~~~~~~~~B;=*;/:9~>B!~l j~~~~~~~~~~~~~~~~~~~~| B9&.=0:=8*9,B:9;/:9 j~~~~~~~~~~~~~~B92+9~B=:(=0;9~B;/)0*~n~<%~~m j~~~~~~~~~~~~| ~~~~~~~~B=:(=0;9~B;/)0*~m~<%~qm j~~~~~~~~~~~~~~~~~~~~B9&.=0:=8*9,B.,9.=,9 j~~~| ~~~~~~~~~~~~B85! jB:98B:9;/:9#B;=*;/:9>B~B=;*5(9B)..9,;=+9B<7,/).B19++=79B<7,/| ).! jB;/)0*nakl jB;/)0*mamlh jB:98B02##B09'2509;6=,> lB19++=79# l!!B19++=79! 
j| B:98 n{m#B58;=+9B+*,507{mB+*,507 nB/, mB/, lB/,#B/,!B02#B/,!B9&.=0:=8*9,B90:B8| 5!y jB.,9.=,9 jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyyy jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyy jyyy~*69~90;/:507~1=;,/+ jyyyyyyyyyyyyyyyyyyyyyyy jB5119:5=*9B/.90/)*naB| 4/<0=19p;:: j~~~~~~~~~~~~B;=*;/:9> mB=;*5(9 j~~~~~~~~~~~~B;=*;/:9> lB=;*5(9 jB| :98B90;/:9~#B);;/:9>B n~a~B;/)0*n j~~~~~~~~~~~~~B);;/:9>B_~a~B;/)0*m j~~~~~~~~| ~~~~~B)..9,;=+9#B:98 n#B=::_m!B;=*;/:9>_B=;*5(9! j~~~~~~~~~~~~~B580)1~B;/)0*na| mli j~~~~~~~~~~~~~~~~~~~~B:98~ m#B=::#~1!l! j~~~~~~~~~~~~~~~~~~~~B:98~ l#B=::#| ~2!l! j~~~~~~~~~~~~~~~~~~~~B;=*;/:9>~B=;*5(9 j~~~~~~~~~~~~~~~~~~~~B;/)0*nan~B:| 98B2509#! j~~~~~~~~~~~~~~B92+9~B=:(=0;9B;/)0*n~<%~~m j~~~~~~~~~~~~~~~~~~~~B=:(| =0;9B;/)0*m~<%~qm j~~~~~~~~~~~~~~~~~~~~B9&.=0:=8*9,B90;/:9 j~~~~~~~~~~~~~~~~B8| 5 j~~~~~~~~~~~~~! jB:98B=::{m{l#B580)1~B;/)0*n~`~gf j~~~~~~~~~~~~~~~~~~~~B5119| :5=*9B',5*9n#B2509! j~~~~~~~~~~~~~~~~~~~~~B:98B2509#{m!~~~~~~B;/)0*na{l j~~~~~| ~~~~~~~~~B92+9~B9:98B2509#B2509{m!~B=:(=0;9B;/)0*n<%{l j~~~~~~~~~~~~~~~B85 j~~| ~~~~~~~~~~~B580)1~B;/)0*n~a~gf~B=::"m~B85 j~~~~~~~~~~~~! jB:98~~ n#B=::#~0!l! j B:98@@R#B=::#~5!lB5119:5=*9B',5*9n#B2509!B5119:5=*9B;2/+9/)*nB90:! j~~~~~~~~B;| \end{comment} \begin{lcode} =*;/:9>B@@QB=;*5(9~y jB:98@@Q#B=::#~4!l!~~~~~~~~~~~y jB;/)0*nakl~B;/)0*mamlh~B| 90;/:9 jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy| yyyyy j i This is trash: Text not displayed!} More Trash that is not displayed! \end{lcode} \end{solution} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %[Addendum 1: Full text of Donald Arseneau's solution. To read the %commentary you will need to run the text through TeX.] \subsection{Addendum 1} Full text of Donald Arseneau's solution. To read the commentary you will need to run the text through TeX. 
\begin{lcode} Date: 14 Oct 1993 01:52:26 -0800 (PST) From: Donald Arseneau Subject: Around the bends To: mjd@MATH.AMS.ORG \let~\let~\#\def\#\.{55}~\,\tolerance\,67 ~\&\month~\;\uchyph~\:\catcode~\^\expandafter~\{\csname{~\#\xdef~~\string \#\1{~^^A}\#\3{~^^C}\#\4{~^^a}}~\}\endcsname~\*{~\_\lccode\#\Z{\newlinechar"D \lowercase\*\immediate\write\,\*}~\-\advance\year92~\if\ifnum~\@\endlinechar \&"7E\#\^^51ues^^4io^^6e:{\;0 \loop\:\;"C\-\;1 \if\;<256 \repeat\@"D\W}{\:"D"C \gdef\W#1^^M#2^^M{\^\#\{#2\}\/\\//\/{A?^^M,Zz\over}\#\X##1^^M{\^\if^^8\{##1\^% \}\{#2\}\^\Y\else\^\X\fi}\X}}\#\Y{\;35\loop\_\,\;\if\;<\&\-\,\.\-\;1\if\,>\& \-\,-\year\fi\repeat\:1'0\:3"2\:33'7\_"20`"\_`""20\@-1\Z} \Question: *********************************************************************** *** Exercise 11 (hard): Write your own decoder to solve the problem I set for myself in Exercise 10: Using as few lines of TeX code as possible, set up an Around the Bend post containing a typical exercise so that it can be processed by plain TeX to (a) skip over the exercise text and (b) decode an embedded encoded answer. Come up with a better encoding idea than my previous one, that doesn't increase the size of the text by 300% during encoding. 
*********************************************************************** U"N5"M5[ZIm~f!!0dU!!0dU")"656"Yk3j"kH"jZ53"I"WZ5~m"I#kf"$Ej"WI34gj "XmI~~i"3Ij53H5m6x""]kEX!!0dU"$m46"Fk3j54#"FXkYFjm6"Ym"jk"3m46"5j"I 4iWIi"I46"I|k56"jZm"jmYFjIj5k4!!0dU"jk"3Fm46"YkXm"j5Ym"k4"5jx"")"lE 3j"Fk~53Zm6"5j"kHH"jk6Iix!!0dU!!0dU"KZIj")"WkE~6"~5Gm"jk"6k"53,!!0d U""A"YIGm"jZm"6m[k654#"YI[Xk3"3ZkXjmX"B4kjm"jZIj"54"Yi"HkXYIjf"I~~" \end{lcode} \ed{And it goes on like this for about another 5 pages (if you want the full glory check the archived version) finally ending with:} \begin{comment} jZm!!0dU""""YI[Xk[k6m"FXm[m6m3"jZm"}Em3j5k4f"WZ5[Z"~kkG3"WkX3m"jZI4 "ikEX"3k~Ej5k4xy!!0dU""A"93m"I[j5|m"[ZIXI[jmX3"XIjZmX"jZI4"J~kWmX[I 3m"jk"6mAZI3Z"jZm"I43WmXf!!0dU""""I46"6k"3mFIXIjm"JWX5jm"HkX"mI[Z"~ 54mx""^ZIjg3"jk"I|k56"YmYkXi"k|mXH~kWx!!0dU""A"~5GmW53mf"[ZE4G"jZm" JWX5jm"3"HkX"jZm"ZI3Zm6"jm2j"WZm4"XE4454#"jZm"ZI3ZmXx!!0dU!!0dU")"~ 5Gm"ikEX"YmjZk6"kH"[kE4j54#"jZm"3Fm[5I~"I[j5|m"[ZIXI[jmX"54"jZm!!0d U"}Em3j5k4"jm2j!!A4!!A4!!0dU""""AA"*k4I~6!!0dUuuuuuuuuuuuuuuuuuuuuu uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu!!0dU"^Z53"H5~m"3ZkE~ 6"$m"[~mIX!!A4""_4~i"jZm"Z566m4"BZI3Zm6y"jm2j"I46!!0dU"jZm"YI[Xk3"j k"9(ZI3Z"5j"3ZkE~6"$m"k$HE3[Ijm6"$m[IE3m"jZmi"W5~~!!0dU"$m"#5|m4"W5 jZ"jZm"}Em3j5k4x!!0dU!!0dU"^Zm"Z566m4"I43WmX!!0dU"AAAAAAAAAAAAAAAAA !!0dU!!0dU"^Zm"FX54jI$~m"[ZIXI[jmX3"C"jZXkE#Z"h"Bw-Ae@dy"IXm"FmXYEj m6!!0dU"jZXkE#Z"I"35YF~m"ZI3Z54#"W5jZ"I"[Zk3m4"3jIXj54#"|I~Em"I46 !!0dU"YE~j5F~5mXx"(k4AFX54j54#"[ZIXI[jmX3"IXm"XmFXm3m4jm6"$i"jZm5X!!0d U"Zm2I6m[5YI~"[k6m3"54"jZm"HkXY"!!A4!!A4ZZ"BWZmXm"Z"53"I"Zm2"65#5j !!0dU"oZ5#5j+%yc"jZm"!!A4"[ZIXI[jmX"W5~~"I[j"~5Gm"\"WZm4"jZm"jm2j"5 3!!0dU"6m[k6m6x"")"6k4gj"WI4j"3FI[m3"54"jZm"[k6m6"jm2jf"$Ej")"I~3k !!0dU"6k4gj"WI4j"jk"E3m"!!A4!!A4@."$m[IE3m"jZmXm"IXm"~5Gm~i"YI4i"3FI [m3f"3k!!0dU"3FI[m"53"XmFXm3m4jm6"$i"!!20"I46"!!20"53"XmFXm3m4jm6"I 3"!!A4!!A4@.x"^ZmXm!!0dU"IXm"jZXmm"kjZmX"3Fm[5I~"BXm3mX|m6y"[ZIXI[j mX3"$m356m3"jZm!!0dU"m2[~IYIj5k4"Fk54j,"\=f"\tf"\O"BI3[55"ef@fwyx"" 
^Zmi"IXm"E3m6"I3!!0dU"Hk~~kW3,!!0dU!!0dU"""""[ZIXI[jmX""""""E3m"""" """""""""""[k6m6"I3!!0dU"""""AAAAAAAAA"""AAAAAAAAAAAAAAA""""AAAAAAA AAAAAA!!0dU"""""""""!!A4"""""""3EFmX3[X5Fj"""""""""Je""B"!!A4!!A4=e "y!!0dU"""""""""""""""""BHkX"Zm2"[k6m3y!!0dU"""""""""!!20"""""""3FI [m"""""""""""""""!!A4!!A4@."BjXI6m3"W5jZ"3FI[my!!0dU""""""""\=""""" ""m3[IFm"BJy""""""""""J@""B"!!A4!!A4=@"y!!0dU""""""""\t"""""""kFm45 4#"B{y"""""""""Jw""B"!!A4!!A4=w"y!!0dU""""""""\O"""""""[~k354#"B1y" """"""""JR""B"!!A4!!A4=R"y!!0dU!!0dU!!0dU"=~~"kjZmX"[ZIXI[jmX3"IXm" XmFXm3m4jm6"$i"jZm5X"FmXYEjm6!!0dU"FX54jI$~m"[ZIXI[jmXf"kX"$i"jZm5X "4kXYI~"Zm2I6m[5YI~"HkXY,!!0dU"!!A4!!A4e-f"!!A4!!A4.If"!!A4!!A4IRf" !!A4!!A4?H"mj[x!!0dU!!0dU"^Zm"kX5#54I~"[k654#"53"6k4m"jZXkE#Z"I[j5| m"[ZIXI[jmX3f"W5jZ!!0dU"I~~"[ZIXI[jmX3"6mH54m6"jk"FXk6E[m"jZm5X"4k4 AI[j5|m"[k6m6"jm2j!!0dU"Bm5jZmX"ZI3Zm6"kX"Zm2yx""^Zm"6m[k654#"kH"Zm 2"B4k4AFX54j54#y!!0dU"[ZIXI[jmX3"53"IEjkYIj5[c"jZm"6m[k654#"kH"jZm" 3Fm[5I~"HkEX"53!!0dU"6k4m"jZXkE#Z"35YF~m"6mH545j5k43c"jZm"6m[k654#" kH"FX54jI$~m!!0dU"[ZIXI[jmX3"53"6k4m"$i"~kI654#"jZm"6mAZI3Zm6"[ZIXI [jmX"|I~Em3!!0dU"54jk"jZm"J~[[k6m"I46"IFF~i54#"J~kWmX[I3mx!!0dU!!0d U"'kYm"kH"jZm"~k4#m3j"$5j3"54"jZm"[k6mX"YI[Xk"[k4[mX43"$XmIG54#!!0d U"jZm"[k6m6"jm2j"54jk"~54m3"kH"dRAdv"[ZIXI[jmX3x"")H"jZm"H5X3j!!0dU "[ZIXI[jmX"54"I"~54m"BIHjmX"$XmIG54#y"53"I"FmX5k6f"kX"jZm"H5X3j!!0d U"jWk"[ZIXI[jmX3"IXm"AAf"jZm"H5X3j"[ZIXI[jmX"53"#5|m4"54"Zm2!!0dU"X mFXm3m4jIj5k4"54"HmIX"kH"YI45I[I~"YI5~"#IjmWIi3x""^Zm"kjZmX!!0dU"6I 4#mXkE3"[ZIXI[jmX3"~5Gm"\"n"J"h"IXm"4kj"jXmIjm6"[IXmHE~~i!!0dU"$m[I E3m"jZmi"ZI6"jk"ZI|m"$mm4"FXm3mX|m6"HkX"jZm"YI[Xk3"jk"WkXG!!0dU"Ij" I~~x!!0dU!!0dU"^Zm"3G5FFm6"}Em3j5k4!!0dU"AAAAAAAAAAAAAAAAAAAA!!0dU !!0dU"^Zm"}Em3j5k4"jm2j"53"3G5FFm6"W5jZ"Yk3j"3Fm[5I~"[Ijm#kXi"[k6m3 !!0dU"jEX4m6"kHHx""^Zm"k4~i"HE4j5k454#"54FEj"53"\M"6Em"jk"Jk$mi~54m3 x!!0dU"^Zm"I[j5|m"\M"[Zm[G3"mI[Z"~54m"kH"54FEj"~kkG54#"HkX"jZm"YIXG mX!!0dU"jm2j"jk"m46"jZm"}Em3j5k4"YIjmX5I~x""^Zm"6mHIE~j"YIXGmX"53 
!!0dU"UUAAAAAAAAAAOEjAAANmXmAAAAAAAAAA!!0dU"^Zm"[k6m6"I43WmX"53"I33EY m6"jk"5YYm65Ijm~i"Hk~~kWx!!0dU!!0dU!!0dU"^Zm"[k6mX!!0dU"AAAAAAAAA !!0dU!!0dU"NmXm"53"jZm"[k6mX"XkEj54mx"")j"53"3EFFk3m6"jk"$m"[~mIXx"") j!!0dU"I3G3"HkX"jZXmm"H5~m"4IYm3,""jZm"JqEm3j5k4<5~m(IYm"3ZkE~6!!0d U"[k4jI54"jZm"jm2j"kH"jZm"}Em3j5k4c""jZm"J'k~Ej5k4<5~m(IYm"3ZkE~6 !!0dU"ZI|m"jZm"I43WmXc""^Zm"[kYF~mjm"}Em3j5k4SI43WmX"Fk3j54#"W5~~"$m !!0dU"WX5jjm4"jk"J_EjFEj<5~m(IYmx""BLE4"jZ53"H5~m"jZXkE#Z"F~I54"^m&x y!!0d!!0dJ4mWXmI6Jq<5~m!!0dJ4mWXmI6J'<5~m!!0dJ4mWWX5jmJ_<5~m!!0d!!0d J4mW~54m[ZIXunT!!0dJYm33I#m{TKZIj"H5~m"[k4jI543"jZm"}Em3j5k4+1!!0d JXmI6ed"jk"JqEm3j5k4<5~m(IYm!!0dJkFm454Jq<5~muJqEm3j5k4<5~m(IYm!!0d !!0dJYm33I#m{KZIj"H5~m"[k4jI543"jZm"3k~Ej5k4+1!!0dJXmI6ed"jk"J'k~Ej 5k4<5~m(IYm!!0dJkFm454J'<5~muJ'k~Ej5k4<5~m(IYm!!0d!!0dJYm33I#m{KZIj "3ZkE~6"jZm"[kYF~mjm"Fk3j54#"$m"WX5jjm4"jk+1!!0dJXmI6ed"jk"J_EjFEj< 5~m(IYm!!0dJ5YYm65IjmJkFm4kEjJ_<5~muJ_EjFEj<5~m(IYm!!0d!!0dJ4mW5HJ5 H_;!!0d!!0dU"^ZmXm"IXm"Q@"[ZIXI[jmX3"jZIj"W5~~"$m"ZI3Zm6"Bw-uC"jk"e @duhyx!!0dU"^Zm"ZI3Z54#"YE~j5F~5mX"YE3j"$m"YEjEI~~i"FX5Ym"W5jZ"Q@"u "@w"T"@\@!!0dU"I46"$m"~m33"jZI4"Q@x""^Zm"3jIXj"|I~Em"B3mm6y"[I4"$m" I4ijZ54#!!0dU"54"jZm"XI4#m"w-Ae@dx!!0d!!0dJ4mW[kE4jJNM!!0dJ4mW[kE4j JjmYF!!0dJ[ZIX6mHJjkF["nJh"U"Z5#m3j"ZI3Zm6"[ZIXI[jmX"Be@dy!!0dJ[ZIX 6mHJ$kj["nJC"U"~kWm3j"ZI3Zm6"[ZIXI[jmX"Bw-y!!0dJ4mW[kE4jJXI4#m!!0dJ XI4#muJjkF["JI6|I4[mJXI4#mAJ$kj["JI6|I4[mJXI4#m"e"U"Q@!!0d!!0dJ6mHJ L{JXmI6ed"jk"JNI3ZME~j5F~5mX"JNMuJNI3ZME~j5F~5mXJXm~I2!!0d""J_;jXEm !!0d""J5H4EYJNMPJXI4#m"J_;HI~3mJH5!!0d""J5H4EYJNM">w"J_;HI~3mJH5!!0d ""JjmYFuJNM"J65|56mJjmYF"@w"JYE~j5F~iJjmYF"@w!!0d""J5H4EYJjmYFuJNM "J_;HI~3m"JH5"U"[Zm[G"[kYYk4"HI[jkX"kH"@w!!0d""JjmYFuJNM"J65|56mJjm YF"@"JYE~j5F~iJjmYF"@!!0d""J5H4EYJjmYFuJNM"J_;HI~3m"JH5"U"[Zm[G"[kY Yk4"HI[jkX"kH"@!!0d""J5H_;"Jm~3m"U"HI5~m6xxxXmFXkYFj!!0d"""""JYm33I #m{:~mI3m"m4jmX"I"4EY$mX"54"jZm"XI4#m"w"A"Q@!!0d""""""""jZIj"53"4kj "I"YE~j5F~m"kH"@"kX"@wx1JL!!0d""JH51!!0dJL!!0d!!0dJ4mW[kE4jJN'!!0dJ 
6mHJL{JXmI6ed"jk"JNI3Z'mm6"JN'uJNI3Z'mm6JXm~I2!!0d""J_;jXEm!!0d""J5 H4EYJN'"PJjkF["J_;HI~3mJH5!!0d""J5H4EYJN'">J$kj["J_;HI~3mJH5!!0d""J 5H_;"Jm~3m"U"HI5~m6xxxXmFXkYFj!!0d"""JYm33I#m{:~mI3m"m4jmX"I"4EY$mX "54"jZm"XI4#m!!0d"""""""""J4EY$mXJ$kj[J3FI[m"A"J4EY$mXJjkF[x1JL!!0d ""JH51!!0dJL!!0d!!0dU"(kW"Wm"W5~~"XmI6"jZm"3mFIXIjkX"jm2j"jXmIj54#" 3Fm[5I~"[ZIXI[jmX3!!0dU"I3"kX654IXi"k4m3x""(mm6"jk"6k"jZm"[kYYI463" 54"YI[Xk3"3k"[Ij[k6m!!0dU"[ZI4#m3"6k4gj"ZEXj"jZm"[kYYI463")"WI4j"jk "6k!!A4!!0d!!0dJ$m#54#XkEF!!0d""Jm3[IFm[ZIXuAeJ26mHJ'mF!!0d""{J3jX5 4#JUJ3jX54#JUAAAAAAAAAAJ3jX54#JOEjAAAJ3jX54#JNmXmAAAAAAAAAA1!!0d""J 6mHJ6kCe{J[Ij[k6mnCeue@"1!!0d""J6mHJL{{J6k3Fm[5I~3Jm46~54m[ZIXuAe !!0d""JYm33I#m{^Zm"3mFIXIj5k4"jm2j"53,"nJ'mFgx"1U!!0d""JYm33I#m{a4jmX "I"XmF~I[mYm4j"kX"lE3j"FXm33"LmjEX4,"T1U!!0d""JXmI6Ae"jk"JjmYF!!0d" "J5H2JjmYFJmYFji"Jm~3m""J26mHJ'mF{JjmYF1JH511!!0d""JL!!0dJm46#XkEF !!0d!!0dU"B[Ijm#kX5m3"$I[G"jk"4kXYI~y!!0dU!!0dU"(kW"Wm"IXm"XmI6i"jk" XmI6"jZm"}Em3j5k4"I46"I43WmXf"I46"WX5jm"jZm!!0dU"kEjFEjx""'54[m"I~~ "jZ53"53"6k4m"W5jZ"I~~"[ZIXI[jmX3"$m54#!!0dU"nkjZmXgf"6mH54m"YI[Xk3 "jk"6k"I~~"jZm"FXk[m3354#"$mHkXm"[ZI4#54#!!0dU"I~~"jZm"[Ij[k6m3x!!0d !!0dJ4mW[kE4jJON!!0d!!0dU"(kjm,"^Z53"YI[Xk"W5~~"I~3k"$m"WX5jjm4"54 "3ZkXj"HkXY"W5jZ"jZm!!0dU"I43WmX"6m[k6mXx!!0d!!0dJ6mHJI~~kjZmX{JONu ."U"3mj"I~~"[Ij[k6m3"u"nkjZmXg!!0d"J~kkF!!0d"""J[Ij[k6mJONue@!!0dU" "J~[[k6mJONuJON""U"k4~i"E3m6"HkX"6m[k6mX!!0d"""JI6|I4[mJON"$i"e!!0d """J5H4EYJON>@-d!!0d"JXmFmIj!!0d"Jm46~54m[ZIXuew"U"\M!!0d1!!0d!!0dU "Km"W5~~"4mm6"jk"[kFi"~54m3"HXkY"jZm"}Em3j5k4"H5~m"I46"WX5jm"jZmY !!0dU"jk"jZm"kEjFEj"H5~m"|mX$Ij5Yx!!0d!!0dJ6mHJOkFiqEm3j5k4{Jm46~54m[ ZIXAe"J4mW~54m[ZIXAe"JOq1!!0d!!0dJ6mHJOq{U"U"jZ53"#5|m3"mXXkX"k4"4E ~~"54FEj"H5~mx"")j"3ZkE~6!!A4!!0d"JXmI6Jq<5~m"jkJ~54m"U"|mX$Ij5Y"3Z kE~6"$m"k4"Ij"jZm"YkYm4j!!A4!!0d"J5HmkHJq<5~m"J5YYm65IjmJ[~k3m54Jq< 5~m!!0d"Jm~3m"J5YYm65IjmJWX5jm"J_<5~m"{J~54m1Jm2FI46IHjmX"JOq!!0d"J H51!!0d!!0d!!0dU"^Z53"YI[Xk"YIGm3"I~~"[ZIXI[jmX3"I[j5|mf"I46"6mH54m 
3"jZmY"I3"jZm5X!!0dU"Zm2"[k6m3,"!!A4!!A4ZZx!!0d!!0dJ6mHJ=~~=[jNm2{J 6mHJZm2ON{..1U!!0d""J~kkF!!0d""""J[Ij[k6m!!20JZm2ONuJI[j5|m!!0d"""" Jm6mHJZm2[Z{J~kWmX[I3m{Jm6mHJ4km2FI46JZm2[Z{JZm2ON111JZm2[Z!!0d"""" J(EYmX5[I~~iJm6mH{!!20JZm2ON1{!!A4!!A4JZm2[Z1U!!0d""""J5H4EY!!20JZm 2ON>!!20<JjkF[!!0d""""""JI6|I4[m"JjmYF"JNM"" U"I66"YE~j5F~5mX"jk"ZI3Z"|I~Emf""E354#xxx!!0d""""""J5H4EY"JjmYFPJjk F["JI6|I4[mJjmYFAJXI4#m"JH5"U"Yk6E~k"IX5jZYmj5[!!0d""""""JI6|I4[mJO N"e!!0d""JXmFmIj1!!0d!!0dU"(kWf"Wm"6mH54m"jZm"352"m2[mFj5k4"[ZIXI[j mX3!!0d!!0dJ6mHJa2[mFj{U!!0d""J(EYmX5[I~~iJ6mH{e1{!!A4!!A4=e1U!!0d" "J(EYmX5[I~~iJ6mH{@1{!!A4!!A4=@1U!!0d""J(EYmX5[I~~iJ6mH{w1{!!A4!!A4 =w1U!!0d""J(EYmX5[I~~iJ6mH{n!!A41{!!A4!!A4=R1U!!0d""J(EYmX5[I~~iJ6m H{nJ"1{!!201U!!0d""J(EYmX5[I~~iJ6mH{nJ!!201{!!A4!!A4@.1U!!0d1!!0d !!0d!!0dU"OkFi"jZm"3k~Ej5k4"HXkY"jZm"3k~Ej5k4"H5~mf"FmXHkXY"jZm"jXI43 HkXYIj5k43!!0dU"BE354#"Jm6mHyf"I46"WX5jm"kEj"54"IFFXk2x"dRA[ZIXI[jm X"~54m3x""^Zm"WZk~m!!0dU"3k~Ej5k4"YE3j"H5j"54"YmYkXi"$m[IE3m")"6k4g j"WI4j"jk"GmmF"[kE4j54#"jZm!!0dU"[ZIXI[jmX3"I46"kEjFEjj54#"jZmY"I"H mW"Ij"I"j5Ymx"")"6k4gj"3I|m"jZm"WZk~m!!0dU"Ym33"54"k4m"YI[Xk"jZkE#Z f"$m[IE3m"I6654#"jk"I"~k4#"~53j"#mj3"|mXi"3~kWx!!0d!!0dJ6mHJN56m'k~ Ej5k4{J6mHJ=~~{1JjmYFue"J4mW~54m[ZIXuew"Jm46~54m[ZIXew!!0d""J~mjJ !!A4JXm~I2"JN561!!0d!!0dJ6mHJN56{U"U!!0d"JXmI6J'<5~m"jkJ~54m!!0d"J5Hm kHJ'<5~m!!0d"""J5YYm65IjmJ[~k3m54J'<5~m"Jm2FI46IHjmX"JKX5jm'F~5j!!0d "Jm~3m!!0d"""Jm6mHJ=~~{J=~~"J!!A4{J4EY$mXJjmYF11U!!0d"""Jm2FI46IHj mXJm6mHJ[34IYm"rJ4EY$mXJjmYFJm46[34IYm{J~54m1U!!0d"""JI6|I4[mJjmYF" eJXm~I2!!0d"""Jm2FI46IHjmX"JN56!!0d"JH51!!0d!!0dU"^Zm"4m2j"YI[Xk3"I Xm"E3m6"jk"3F~5j"I"~53j"kH"[k6m"[ZIXI[jmX3!!0dU"54jk"I$kEj"dR"[ZIXI [jmX3,""jZm"H5X3j"hdR"54"I"YI[Xk"BCey"IXm!!0dU"WX5jjm4"jk"jZm"kEjFE j"H5~m"I46"jZm"XmYI546mX"IXm"~mHj"54"jZm!!0dU"YI[Xkx""^Zm"3F~5j"W5~ ~"4kj"54jmXXEFj"I4i"!!A4!!A4ZZ"3m}Em4[m3"BkX!!0dU"jZm"3Fm[5I~"!!A4 !!A4=w"3m}Em4[m3yx!!0d!!0dJ$m#54#XkEF!!0dJ[Ij[k6mewue@""U!!0dJ#6mHJ[ 
jX~Y{\\M1U!!0dJm46#XkEF!!0d!!0dJ6mHJKX5jm'F~5j{U!!0d""J6mHJ!!A4CCe{ J[34IYm"rCCeJm46[34IYm1U!!0d""Jm6mHJ=~~{J=~~""U"m2FI46"jk"XmI~"[ZIX I[jmX3!!0d""""!!A4!!A4.w!!A4!!A4.w!!A4!!A4=J3jX54#{m46!!A4!!A4=J3jX 54#11U"I66"jmXY54Ij5k4"[k6m3,!!0d""J6mHJx{1U""""""""""""""""""""""" """"U""11J[34IYm"m46Jm46[34IYm!!0d""J4mW~54m[ZIXuew"U"\M!!0d""Jm6mH J=~~{Jm2FI46IHjmXJK'J=~~"JxJxJxJxJxJxJxJxJxJxJm461!!0d""J5YYm65IjmJ WX5jmJ_<5~m{J=~~1U!!0d""J5YYm65IjmJ[~k3mkEjJ_<5~m!!0d1!!0d!!0dJ6mHJ K'{JfJfJfJfJfJfJfJfJOEjJXm~I21U"FI33"k|mX"v"T"v"u"dR"[ZIX!!0d!!0dJ6 mHJfCeJXm~I2"C@CwCRC-CdC?CvCQ{C@CwCRC-CdC?CvCQU"FI33"v"[ZIX!!0d""J5 H2CQJxJm2FI46IHjmXJm46m6mHJH5CeJXm~I21!!0d!!0dJ6mHJOEjJXm~I2CeC@Cw{ U")43mXj"~54mHmm6"[ZIXI[jmXf!!0d""J5H2Ce!!A4U"""""""""""U"$Ej"6k4gj "54jmXXEFj"I4i"!!A4!!A4ZZ!!0d""""J5H2C@!!A4J[jX~Y"CeC@CwJm~3m"CeC@C wJ[jX~Y"JH5!!0d""Jm~3m!!0d""""J5H2C@!!A4CeJ[jX~YC@CwJm~3m"CeC@J[jX~ YCwJH5!!0d""JH5"JK'1!!0d!!0dJ6mHJm46m6mHCeJm46{1U"m46"kH"jm2jf"3k"m 46"Jm6mH"I46"#k$$~m"XmYI5454#"lE4G!!0d!!0dU"=~~"jZIjg3"~mHj"jk"6mH5 4m"IXm"jZm"3G5FFmX"Yk6E~m"I46"jZm"6m[k6mX!!0dU"Yk6E~mx""^Zmi"$kjZ"I Xm"WX5jjm4"54jk"jZm"Fk3j54#"jk"$m"m2m[E6m6!!0dU"$i"jZm"Xm[m5|mXx""^ Zmi"IXm"[kYFXm33m6"I46"k$HE3[Ijm6f"$Ej"jZm!!0dU"k$HE3[Ij5k4"53"Yk3j ~i"lE3j"[kYFXm335k4,"E354#"[kYYI46"3iY$k~3!!0dU"~5Gm"Jf"HkX"~k4#mX" [kYYI46"WkX63f"I46"E354#"$E5~jA54"Xm#53jmX3!!0dU"543jmI6"kH"I~~k[Ij 54#"Xm#53jmX3x""'kYm"kH"jZm"I$$Xm|5Ij5k43"I46!!0dU"jZm"[Zk5[m3"kH"X m#53jmX"IXm"YmI4j"jk"$m"[k4HE354#"I46SkX"35~~ix!!0dU":~I54Ajm2j"|mX 35k43"kH"jZm"Yk6E~m3"IXm"#5|m4"ZmXmf"I3"Wm~~"I3!!0dU"I"#~k33IXi"kH" jZm"k$HE3[Ij5k4x!!0dU!!0dU"NmXm"53"jZm"3G5FFmX"Yk6E~mx"")j"53"E3m6" 54"jZm"HkXY,!!0dU"JqEm3j5k4,!!0dU"I"3Fm[5I~"~54m"kH"jm2j!!0dU"I4ijZ 54#"jZIj"53"3G5FFm6"m4j5Xm~if!!0dU"E4j5~"I#I54"3mm54#!!0dU"I"3Fm[5I ~"~54m"kH"jm2j!!0dU!!0dU"J6mHJqEm3j5k4,{J$#XkEF!!0dU"""JIHjmX#XkEFJ m46!!0dU"""JI~~kjZmX!!0dU"""J'G5FFmX1!!0dU!!0dU"J'G5FFmX"3jIXj3"jZm "3G5FF54#"$i"XmI654#"jZm"6m~5Y5jmX"jm2j"I46!!0dU"6mH5454#"jZm"YI[Xk 
"nJ'G5Fr54mg"jk"3G5F"I"~54mf"jm3j54#"HkX"jZm!!0dU"m46"jm2jx""^Zm"jm 3j"53"6k4m"$i"[k43jXE[j54#"I"[kYYI46"4IYm"HXkY!!0dU"jZm"3m4j54m~"jm 2j"I46"HXkY"mI[Z"~54mf"I46"[kYFIX54#"jZmY"BW5jZ!!0dU"J5H2yx!!0dU!!0d U"{J[Ij[k6mnJ\\Mue@"U"kjZmX!!0dU"J#6mHJ'G5FFmXCe\\MC@\\M{U"XmI6"jZ 53"~54m"AP"Cec"4m2j"~54m"AP"C@!!0dU"U""6mH54m"3m4j54m~"YI[Xk,!!0dU" ""Jm2FI46IHjmXJ6mHJ[34IYmC@Jm46[34IYmJSJJSSJS{=+\\Mf8DJk|mX1U!!0dU" U"6mH54m"YI[Xk"jk"XmI6"~54m"I46"[kYFIXm"5j"W5jZ"3m4j54m~,!!0dU"""J6 mHJ'G5Fr54mCCe\\M{Jm2FI46IHjmXU!!0dU"""""J5H2J[34IYmCCeJm2FI46IHjmX Jm46[34IYmJ[34IYmC@Jm46[34IYmU!!0dU"""""""Jm2FI46IHjmX"J*m[k6m=43Wm X"U"H5453Zm6"3G5FF54#!!0dU"""""Jm~3mU!!0dU"""""""Jm2FI46IHjmX"J'G5F r54m"U"GmmF"3G5FF54#!!0dU"""""JH51U!!0dU"1!!0dU!!0dU"J*m[k6m=43WmX" E4ZI3Zm3"jZm"I43WmX"jm2j"I46"WX5jm3"5j"jk"jZm!!0dU"3[Xmm4x"^Zm"E4FX 54jI$~m"[ZIXI[jmX3"XmFXm3m4jm6"I3"!!A4!!A4ZZ"IXm"~mHj!!0dU"I3"jZmi" IXm"B5xmxf"Fk335$~i"E4FX54jI$~m!!A4y"Ok4jXk~AM"B!!A4!!A4.6y"W5~~!!0d U"$XmIG"jZm"jm2j"54jk"~54m3"k4"jZm"3[Xmm4c"jZm"~54m$XmIG3"54"jZm !!0dU"ZI3Zm6"jm2j"IXm"5#4kXm6x""JN'"53"3mj"jk"jZm"3mm6"|I~Em"$mHkXm !!0dU"J*m[k6m=43WmX"53"54|kGm6x!!0dU!!0dU"J6mHJ*m[k6m=43WmX{U"B[kYFIX m"H5X3j"FIXj"W5jZ"J(kXYNI3Zy!!0dU"""JONuJ$kj["U"H5X3j"[ZIXI[jmX"BF~ I54"jm2jy!!0dU"""J~kkF"U"k|mX"ZI3Zm6"[ZIXI[jmX3!!0dU"""""J~[[k6mJN' uJON"U"YIF"[k654#"jk"F~I54"jm2j!!0dU"""""J5H4EYJON>JjkF[!!0dU"""""" "JI6|I4[m"JN'"JNM""U"I66"YE~j5F~5mX"jk"ZI3Z"|I~Emf""E354#xxx!!0dU"" """""JI6|I4[mJON"e"U"jZ53"ZmXm"FXm|m4j3"JN'"HXkY"$m54#"jm3jm6"FXmYI jEXm~i!!0dU"""""""J5H4EY"JN'PJjkF["JI6|I4[mJN'AJXI4#m"JH5"U"Yk6E~k" IX5jZYmj5[!!0dU"""JXmFmIj!!0dU"U"*mH54m"m2[mFj5k43x""OkYFIXm"jZ53"F IXj"W5jZ"Ja2[mFj!!0dU"""J[Ij[k6mnJ\\=u."U"nm3[IFmgf"J!!0dU"U"J[Ij[k 6mnJ\\tue"U"nkFm4gf"{"AA"E44m[m33IXi!!0dU"""J[Ij[k6mnJ\\Ou@"U"n[~k3 mgf"1!!0dU"""J[Ij[k6mnJ!!A4u?"""U"n3EFmX3[X5Fjgf"\"BHkX"Zm2"54FEjy !!0dU"""J~[[k6mnJ"unJ!!20!!0dU"""J~[[k6mnJ!!20unJ!!0dU"U!!0dU"""Jm46 ~54m[ZIXuAe"U"5#4kXm"~54m"$XmIG3"54"[k6m6"jm2j!!0dU"""J4mW~54m[ZIXu 
nJ\\M!!0dU"""J~kWmX[I3mJ$#XkEFJ5YYm65IjmJWX5jmJN'J$#XkEF!!0dU"1!!0d U!!0dU"s~k33IXi"kH"I$$Xm|5Ij5k43"I46"k$HE3[Ij5k43!!0dU"AAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!0dU!!0dU!!0dU"J~mj""""""""""""""""" h!!0dU"JN'"""BJjk~mXI4[my"""Jf"""""BJZI3Z"|I~Emy""""""""Ce!!0dU"JNM """"""""""""""""""Jx"""""BZI3Z"YE~j5F~5mXy""""C@!!0dU"JjkF["""BJYk4 jZy"""""J0"""""B~I3j"ZI3Zm6"[ZIX,"e@d"hy!!0dU"J$kj["""""""""""""""" w-"""""BH5X3j"ZI3Zm6"[ZIX,"w-"Cy!!0dU"JXI4#m"""""""""""""JimIX""""B JjkF[AJ$kj[/e,"Q@y!!0dU"JON"""""BJE[ZiFZy""""Jc"""""BI"[ZIXI[jmX"[k 6my!!0dU"J[Ij[k6m"""""""""""""J,!!0dU"J6mH"""""""""""""""""JC!!0dU" Jm2FI46IHjmX"""""""""J\!!0dU"J[34IYm""""""""""""""J{!!0dU"Jm46[34IY m"""""""""""J1!!0dU"J~[[k6m""""""""""""""J7!!0dU"JI6|I4[m"""""""""" """JA!!0dU"J$#XkEF""""""""""""""JT!!0dU"J5H4EY"""""""""""""""J5H!!0d U"Jm46~54m[ZIX"""""""""Jb!!0dU"J5H2"""""""""""""""J5H\\v!!0dU"J'G5 FFmX"""""""""""""JK!!0dU"J'G5Fr54m""""""""""""J&!!0dU"J*m[k6m=43WmX """"""""J]!!0dU"J~kWmX[I3mJ$#XkEFJ5YYm65IjmJWX5jmJN'J$#XkEF""""""J8 !!0dU!!0dU"^Zm3m"I335#4Ym4j3"IXm"WX5jjm4"jk"jZm"kEjFEj"$i"JKX5jm_Ej FEj"BI[jEI~~i!!0dU"$i"JK'9y"I46"jZm4"jZm"k$HE3[Ijm6"FXk[m3354#"[k6m "53"WX5jjm4"B#5|m4!!0dU"I3"I"FIXIYmjmX"54356m"B"y"IHjmX"JKX5jm_EjFE jyx"JKX5jm^ZmLm3j"53!!0dU"IEjkYIj5[I~~i"54|kGm6"jk"[kFi"jZm"}Em3j5k 4"I46"jZm4"jZm"I43WmXx!!0d!!0dJ6mHJKX5jm_EjFEjCeC@{J$m#54#XkEF!!0d" "J[Ij[k6mnJ\ue@"J4mW~54m[ZIXuew!!0d""Jm6mHJJ{J4km2FI46JK'9{Ce1{C@11 JJ1!!0d!!0dJ6mHJK'9CeC@{J~mjJJJ3jX54#"U"|mX$Ij5Y53Zf"I|k56"3FI[m3"I HjmX"[kYYI463!!0d""J5YYm65IjmJWX5jmJ_<5~m!!0d"""{JJJ~mjJJhJJJ~mjJJh JJJCJJJ6mHJJJCJJJx{C@1JJhJJJfJJJjk~mXI4[mJJJfCe1U!!0d""JI~~kjZmXJ[I j[k6mnBueJ[Ij[k6mnyu@!!0d""JIHjmXI335#4Ym4jJKX5jm^ZmLm3j!!0d""JjkG3 .u1!!0d!!0dJ6mHJKX5jm^ZmLm3j{J5YYm65IjmJWX5jmJ_<5~m{JjZmJjkG3.1!!0d ""J[Ij[k6mnBue@J[Ij[k6mnyue@!!0d""J5YYm65IjmJWX5jmJ_<5~m{J3jX54#JqE m3j5k4,1U!!0d""J5YYm65IjmJWX5jmJ_<5~m{J'mF1U!!0d""JOkFiqEm3j5k4!!0d ""J5YYm65IjmJWX5jmJ_<5~m{J'mF1U!!0d""{J=~~=[jNm2J(kXYNI3ZJa2[mFjJN5 
6m'k~Ej5k41!!0d""Jm46#XkEF1!!0d!!0d!!0dJKX5jm_EjFEj{JjZmJN'1{JjZmJN MU!!0d1BhJ0JYk4jZhJcJE[ZiFZhJ,J[Ij[k6mhJ\Jm2FI46IHjmXhJ{J[34IYm{hJC J26mHhhJ3jX54#!!0dJCJe{h\\=1JCJw{h\\O1JCJR{h\\I11hJ1Jm46[34IYmhJT{h J7J~[[k6mJCJ8{J4mW~54m[ZIX!!20*!!0dJ~kWmX[I3mJTJ5YYm65IjmJWX5jmJfJT 1hJAJI6|I4[mJimIXQ@hJ5HJ5H4EYhJbJm46~54m[ZIX!!0dJ0!!20?aJCJ\\-eEm3\ \R5k\\dm,{Jc."J~kkFJ,Jc!!20OJAJce"J5HJc>@-d"JXmFmIjJb!!20*JK1{J,!!20 \end{comment} \begin{lcode} *!!20O!!0dJ#6mHJKCe\\MC@\\M{J\JCJ{C@J1JSJJSSJS{=+\\Mf8DJk|mX1JCJ&C Ce\\M{J\J5H\\vJ{CCeJ\U!!0dJ1J{C@J1J\J]Jm~3mJ\J&JH51J&11JCJ]{Jcw-J~k kFJ7JfJcJ5HJc>J0JAJfJxJAJceJ5HJfPJ0!!0dJAJfAJimIXJH5JXmFmIjJ,eg.J,w !!20@J,wwg?J7!!20@.n!!20J7n!!20!!20@.JbAeJ81!!0dy!!0d!!0dJm46!!0d !!0d!!03!!03!!A{end!!A} \end{lcode} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %[Addendum 2: TeX encoder for my decoder. (mjd,18-Aug-1994)] \subsection{Addendum 2} TeX encoder for my decoder. (mjd,18-Aug-1994) \begin{lcode} % Source character set: 13,32-126 = 96 % % (Note exclusion of tab. Assumption: Text to be translated will % always be untabified first.) % % Target character set: 33-126. % % Carriage return (13) cannot be included in the target set because % of the constraint to have a maximum line length of 72 in the % encoded text. If 13 (carriage return) were included in the % encoding, then the end of the current line would only occur at % the next instance in the ciphered text of the character that % translates to 13. And depending on what that character is, who % knows how long the encoded line could be? Perhaps as long as the % entire text. % % Space (32) are not included in the target set for a subtler % reason. If spaces in the encoded text happen to fall at the end % of a line, they will be dropped by TeX during the decoding % process, instead of decoded. So we either must exclude them from % the target set, or make sure that they never fall at the end of a % line. 
% % By excluding space from the target set, we make it possible for % the decoder to use a space as its argument delimiter. If we have % only one space, at the end of the encoded text, it is not so hard % to ensure that it does not fall at the end of a line. But note % that the decoder must make sure to change the catcode of space to % something other than 10, so that it will not disappear if it % falls at the *beginning* of a line. \def\colon{:}\def\arrow{->}% \let\isx\message %\def\isx#1{} \iffalse % OK, here is how the encoding works. Start with \mag = random (in % the target range 33-125), first encoding value. Handle two % special cases first: ^^M encodes to \mag, space encodes to \mag % +1. Then start normal encoding at \fam = 35 (char 35 = ! encodes % to \mag +2, and so forth). When \mag reaches 126, we wrap it % around to 33 (don't want to encode any character to space). % Finally, when \fam reaches 126, we must handle the last three % characters (126,33,34: ~!") as digraphs: encode them as ~x~y~z, % where xyz are obtained by continuing to increment \mag. @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_ ! "#$%&'()*+,-./0123456789:;<=>? R S~S~TTUVWXYZ[\]^_`abcdefghijklmnop @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|} ~ qrstuvwxyz{|}!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQ~R \fi % ^^^ \def\setup{% \def\notilde{}% later will be defined to include a tilde \def\encodeone{% \catcode\fam\active\lccode126\fam\lccode 48\mag \lowercase{\edef~{\notilde 0}% \isx{[\string~\colon \notilde 0\space\number\fam\arrow\number\mag]}% }% \advance\mag7 \ifnum\mag>125\advance\mag-93 \fi \advance\fam1 }% \def\do{\encodeone \csname do\ifnum\fam>125 stop\fi\endcsname }% % ASSUMPTION: \mag initialized before the call of \setup % Encode ^^M -> \mag \fam13 \encodeone % Encode space -> next \mag \fam32 \encodeone % Now encode the rest \fam35 \let\dostop\relax \do % Now \fam = 34, \mag = ?. We need to define encoding for % characters 34,33,126 ("!~) as ~z ~y ~x. 
But what are convenient % values for x y z? Why, just the next \mag's in sequence \edef\notilde{\string ~} \encodeone \fam33 \encodeone \encodeone } \def\outwrite{\immediate\write15{\outline}% % If a digraph occurred at the end of the line, carry over the % second character to the beginning of the next line. \expandafter\ifx\csname 73\endcsname\relax \else \expandafter\let\expandafter\1\csname 73\endcsname \expandafter\let\csname 73\endcsname\relax \charnum 1 \fi \checkeof} % For fast looking on screen: %\def\outwrite{\immediate\write16{\outline}\checkeof} \begingroup \let\0\catcode \0`\0 11 \0`\2 11 \0`\3 11 \0`\4 11 \0`\5 11 \0`\6 11 \0`\7 11 \0`\8 11 \0`\9 11 \0`\1 11 \gdef\outline{\1\2\3\4\5\6\7\8\9\10\11\12\13\14\15\16\17\18\19 \20\21\22\23\24\25\26\27\28\29\30\31\32\33\34\35\36\37\38\39 \40\41\42\43\44\45\46\47\48\49\50\51\52\53\54\55\56\57\58\59 \60\61\62\63\64\65\66\67\68\69\70\71\72} \endgroup \newcount\charnum \def\checkeof{\futurelet\next\encodemore} \def\tildecheck#1#2{\if \string~#1% \expandafter\def\csname\number\charnum\endcsname{#1}% \advance\charnum 1 \expandafter\def\csname\number\charnum\endcsname{#2}% \fi} \def\encodemore{\ifx\next\EOF \let\next\outwrite \let\checkeof\relax \global\tracingcommands2\global\tracingmacros2\global\tracingonline0 % At end of file, assume that there was a ^^M at the end, % translated to the digraph ~|. Remove it, to reduce the number of % blank lines that will be produced on screen during decoding. % BUT, if \charnum = 72, leave the ^^M there to avoid having the % space at the end of the line. \ifnum\charnum<72 \expandafter\def\csname\number\charnum\endcsname{ }% \else \def\1{ }% \fi \else \advance\charnum 1 \ifnum\charnum>72 \charnum 0 \let\next\outwrite \else \let\next\getnextchar \fi \fi \next} \def\getnextchar#1{% \edef\0{#1}% \expandafter\let\csname\number\charnum\endcsname\0\relax \expandafter\tildecheck\0\relax\relax \checkeof}% % For this we need just a unique no-op value for \ifx comparison. 
\def\EOF{\relax\relax} \def\writefile#1{\expandafter\checkeof\input#1 \EOF}% \begingroup % Define \0 to read in the text for \writepreamble. \def\0#1XXX#2^^JZZZ^^J{\endgroup \def\writepreamble##1{\begingroup % Convert ##1 into a hex number. \newlinechar=10 \chardef\0=##1\def\1####1"{"}% \immediate\write15{#1\expandafter\1\meaning\0#2}\endgroup}}% % Now change all special catcodes to 12. We don't use \dospecials % because we want to do backslash last, in conjunction with % \afterassignment. \catcode`\{=12 \catcode`\}=12 \catcode`\#=12 \catcode`\~=12 \catcode`\@=12 \catcode`\$=12 \catcode`\^=12 \catcode`\&=12 \catcode`\_=12 \catcode`\|=12 % The following line will turn off the last two remaining special % characters % and \, set end-of-line character to ^^J (for later % use in the \write), and then call \0. ^^M still has category 5 at % this point and the new value of \endlinechar won't get applied % until the *next* line is read, so the catcode assignment for \ % will get terminated properly by the space from ^^M, thus \0 will % get called before TeX attempts to read the % at the beginning of % the subsequent line. \catcode`\%=12 \endlinechar=10 \afterassignment\0 \catcode`\\=12 %%%% Self-decoding answer: run the following text through plain TeX %%%% \let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode \m 13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m 12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi \else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1 6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\fXXX\u\f\m\i\m32\u\f\m\c\m12\i\m35~ ZZZ \def\encodefile#1{% \immediate\openout15=encode.out \relax \begingroup % Get a random number from \time, normalize it to fall in the range % 33--125. First set \mag = \time mod 93, then add 33 to make it % fall in the proper range. 
\fam\time \mag\time \divide\fam93 \multiply\fam 93 \advance\mag-\fam \advance\mag 33 \message{======= Code shift: time \number\time\space --> mag \number\mag\space ============================}% \writepreamble{\number\mag}% % \setup uses \mag. \setup \charnum=0 \immediate\write16{Starting to create file encode.out . . .}% \writefile{#1}% \endgroup \immediate\closeout15 \relax \immediate\write16{The encoded output is in the file encode.out.}% } \immediate\write16{Enter the name of the file you want to encode:} {\catcode\endlinechar=9 \global\read-1 to\filnam} \encodefile{\filnam} \end \end{lcode} %$ %%\endinput \chapter{Defining new control sequences} \section{Exercise} %%\input{ex012} % ex012.tex \begin{comment} Date: 24 Sep 1993 16:11:36 -0400 (EDT) From: Michael Downes Subject: Around the Bend #12 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List ======================================================================== *** Exercise 12: \end{comment} \ed{\oposted{1993/09/24}. \arch{exercise.012}.} How many commands are there in plain TeX that can be used to define a new (i.e., previously undefined) control sequence? \begin{comment} ======================================================================== E-mail answers to my address, below. A summary will be posted circa October 15, 1993. Michael Downes --------------------------------------------------------- mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \section{Answers} %%\input{ans012} % ans012.tex \begin{comment} [The addendum was not included in the original post but added in my archives later ---mjd] Date: 25 Oct 1993 16:36:43 -0400 (EDT) From: Michael Downes Subject: Around the Bend #12, answer To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1993/10/25}. 
\arch{answer.012}.} %Exercise 12 asked `How many commands are there in plain TeX that can %be used to define a new (i.e., previously undefined) control %sequence?'. This exercise has latent ambiguities. The parenthetical remark `(i.e., previously undefined)' was intended as a hint towards the most comprehensive possible answer. There are three main criteria that could be used for `new' status of a control sequence: \begin{enumerate} \item If executed, the control sequence causes an `\texttt{Undefined control sequence}' error. \item The control sequence is \piif{ifx}-equivalent to \cmd{\relax} when constructed with \cmd{\csname} \texttt{\ldots} \cmd{\endcsname}. This is the basis of the LaTeX \cmd{\@ifundefined} test. \item The control sequence has not yet been entered into the hash table. \end{enumerate} Criterion (3) doesn't work for one-character control sequences (\cmd{\a}, \cmd{\0}, \cmd{\:}) since they have space reserved for them separate from the hash table whether or not they are defined in any sense. Criterion (2) obviously gives a spurious true result if applied to \cmd{\relax} or to something like LaTeX's \cmd{\protect} command that spends much of its time being equivalent to \cmd{\relax}. Criterion (1) therefore seems best. Notice that control sequences can enter into the hash table without becoming defined anywhere along the way, so a control sequence can be `old' by criterion (3) but still new by criterion (1). In all of the following examples the control sequence \cmd{\foo} will get added to the hash table but remain undefined. \begin{lcode} \def\x{\foo} \toks0{\foo} \string\foo \noexpand\foo \gobble\foo (assuming \def\gobble#1{}) \uppercase{\iffalse\foo\fi} \show\foo \meaning\foo \end{lcode} Two notable cases where tokenization, but not hash-table-ization, of \cmd{\foo} occurs are in an \piif{ifx} comparison or on the false branch of an \piif{if}: \begin{lcode} \ifx\foo\something... \iffalse\foo\fi \end{lcode} (\emph{TeXbook}, Appendix D, p384). 
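
As a quick check of criterion (1) against criterion (2), here is a minimal
sketch for plain TeX. The helper \cmd{\report} is hypothetical (it is not
part of \pfile{plain.tex}), and the sketch assumes that \cmd{\undefined},
\cmd{\foo} and \cmd{\quux} really are undefined when the file is run.
\begin{lcode}
% \report\cs writes the status of \cs on the terminal (hypothetical helper).
\def\report#1{\message{\string#1:
  \ifx#1\undefined undefined\else
  \ifx#1\relax equal to \string\relax\else defined\fi\fi}}
\def\x{\foo}   % \foo is tokenized here but never defined
\report\foo    % criterion (1): reported as undefined
\expandafter\report\csname quux\endcsname  % \csname makes \quux = \relax
\report\relax  % criterion (2) cannot tell this case from the previous one
\bye
\end{lcode}
The last two calls give the same verdict, which is the spurious result
that makes criterion (2) less attractive than criterion (1).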
The straightforward answer to Exercise 12 is to count up the various kinds of def'ing and let'ing functions (table~\ref{tab:deflet}): \begin{comment} \begin{lcode} Primitive: Nonprimitive: \def \newcount \edef \newdimen \gdef \newskip \xdef \newmuskip \let \newfam \futurelet \newwrite \chardef \newread \mathchardef \newbox \countdef \newtoks \dimendef \newinsert \skipdef \newlanguage \muskipdef \newif \toksdef \newhelp \font \read \csname \end{lcode} \end{comment} \begin{table} \centering \caption{The def'ing and let'ing functions}\label{tab:deflet} \begin{tabular}{ll} \toprule Primitive & Nonprimitive \\ \midrule \cmd{\def} & \cmd{\newcount} \\ \cmd{\edef} & \cmd{\newdimen} \\ \cmd{\gdef} & \cmd{\newskip} \\ \cmd{\xdef} & \cmd{\newmuskip} \\ \cmd{\let} & \cmd{\newfam} \\ \cmd{\futurelet} & \cmd{\newwrite} \\ \cmd{\chardef} & \cmd{\newread} \\ \cmd{\mathchardef} & \cmd{\newbox} \\ \cmd{\countdef} & \cmd{\newtoks} \\ \cmd{\dimendef} & \cmd{\newinsert} \\ \cmd{\skipdef} & \cmd{\newlanguage} \\ \cmd{\muskipdef} & \cmd{\newif} \\ \cmd{\toksdef} & \cmd{\newhelp} \\ \cmd{\font} & \\ \cmd{\read} & \\ \cmd{\csname} & \\ \bottomrule \end{tabular} \end{table} The reason for including \cmd{\csname}? After \begin{lcode} \csname foobar\endcsname \end{lcode} \cmd{\foobar} is no longer undefined; the change in its status is indistinguishable from the change effected by the statement \verb?\let\foobar\relax?. \cmd{\endcsname} is not counted separately because \cmd{\csname} and \cmd{\endcsname} can only be used together. So: 16 primitive, 13 non-primitive make 29 total. But to those should be added two more, since the statement of the Exercise didn't exclude `private' macros: (i) the internal function \cmd{\alloc@} of plain.tex that is shared by all the \cmd{\newxxx} macros (except for \cmd{\newif} and \cmd{\newhelp}), and (ii) the internal function \cmd{\@if} used by \cmd{\newif}. That brings the total to 31. 
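
The effect of \cmd{\csname} described above can be observed directly with
\cmd{\meaning}. This is only a small sketch for plain TeX; it assumes that
\cmd{\foobar} has not been defined or used earlier in the run.
\begin{lcode}
\message{before: \meaning\foobar}  % reports: undefined
\csname foobar\endcsname           % executes like \relax, with no error
\message{after: \meaning\foobar}   % reports: \relax
\bye
\end{lcode}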
Beyond that, another, less obvious, class of commands can be added
if we paraphrase the exercise as follows:
\begin{quote}
Find all commands such that executing command \cmd{\xxx}, with its normal
arguments (if any), causes at least one control sequence to pass from
undefined status to defined status, where undefined status means that
executing the control sequence would generate the error
`Undefined control sequence'.
\end{quote}

For example, the first use of \cmd{\loop} causes \cmd{\body} and
\cmd{\next} to become defined. As it turns out, there are many of these in
plain TeX (tables~\ref{tab:user} and~\ref{tab:internal}, as well as
\verb?'? or \cmd{\rq} in math mode only).

\begin{comment}
User functions:
\begin{lcode}
\loop, \t, \smash, \vfootnote, \settabs, \phantom, \vphantom,
\hphantom, \footnote, \multispan, \longleftarrow, \longrightarrow,
\mathstrut, \longmapsto, \matrix, \pmatrix;
\end{lcode}
\verb?'? or \cmd{\rq} (math mode only)
\end{comment}

\begin{figure}
\freetabcaption{User functions}\label{tab:user}
\autorows{c}{4}{l}{%
\cmd{\footnote}, \cmd{\hphantom}, \cmd{\longleftarrow}, \cmd{\longmapsto},
\cmd{\longrightarrow}, \cmd{\loop}, \cmd{\mathstrut}, \cmd{\matrix},
\cmd{\multispan}, \cmd{\phantom}, \cmd{\pmatrix}, \cmd{\settabs},
\cmd{\smash}, \cmd{\t}, \cmd{\vfootnote}, \cmd{\vphantom}
}
\end{figure}

\begin{comment}
Internal functions:
\begin{lcode}
\iterate, \relbar, \sett@b, \s@tt@b, \prim@s, \ph@nt, \fo@t, \f@@t,
\pr@m@s, \pr@@@s, \s@tcols
\end{lcode}
\end{comment}

\begin{figure}
\freetabcaption{Internal functions}\label{tab:internal}
\autorows{c}{6}{l}{%
\cmd{\f@@t}, \cmd{\fo@t}, \cmd{\iterate}, \cmd{\ph@nt}, \cmd{\pr@@@s},
\cmd{\pr@m@s}, \cmd{\prim@s}, \cmd{\relbar}, \cmd{\s@tcols},
\cmd{\s@tt@b}, \cmd{\sett@b}
}
\end{figure}

Adding these 18 user functions (the 16 of table~\ref{tab:user} plus
\verb?'? and \cmd{\rq}) and 11 internal functions to the previously cited
31 gives a total of 60 functions available in \pfile{plain.tex} that
satisfy a strict interpretation of the exercise statement.
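
The side effect of \cmd{\loop} can be watched with \cmd{\meaning}. The
following is a minimal sketch for plain TeX; it assumes that nothing has
called \cmd{\loop} earlier in the run, and it uses \cmd{\count255} purely
as a scratch register.
\begin{lcode}
\message{before: \meaning\body}  % expected to report: undefined
\count255=0
\loop \ifnum\count255<3 \advance\count255 by 1 \repeat
\message{after: \meaning\body}   % now a macro holding the loop text
\bye
\end{lcode}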
Credit for the best answer goes to Dan Luecking\index{Luecking, Dan}, who
found 29 of the primary 31; he missed the other two (\cmd{\csname},
\cmd{\@if}) not by overlooking them but because he considered them and
concluded that they did not satisfy the requirements. My own score in that
part was 28: I overlooked \cmd{\read}, \cmd{\alloc@}, and \cmd{\@if} until
Luecking and Peter Schmitt\index{Schmitt, Peter} brought them to my
notice. Ian Collier\index{Collier, Ian} also submitted a good answer,
including identification of the secondary class of functions that define
scratch macros as a side effect.

%%========================================================================
Notes:
\begin{itemize}
\item \cmd{\iterate}, \cmd{\settabs}, \cmd{\sett@b}, \cmd{\s@tt@b},
  \cmd{\t}, \cmd{\prim@s}, \cmd{\ph@nt}, \cmd{\smash}, \cmd{\vfootnote},
  \cmd{\fo@t}, \cmd{\f@@t} all define \cmd{\next}.
\item \cmd{\loop} defines \cmd{\body}.
\item \cmd{\pr@m@s} defines \cmd{\nxt}.
\item \cmd{\prim@s} is called by active \verb?'? (mathcode \verb?"8000?)
  and by \cmd{\pr@@@s}.
\item \cmd{\iterate} is called by \cmd{\loop}.
\item \cmd{\sett@b} is called by \cmd{\settabs}.
\item \cmd{\s@tt@b} is \emph{conditionally} called by \cmd{\sett@b}.
\item \cmd{\smash} is called by \cmd{\relbar}.
\item \cmd{\ph@nt} is called by \cmd{\phantom}, \cmd{\vphantom}, and
  \cmd{\hphantom}.
\item \cmd{\vfootnote} is called by \cmd{\footnote}.
\item \cmd{\fo@t} is called by \cmd{\vfootnote}.
\item \cmd{\f@@t} is \emph{conditionally} called by \cmd{\fo@t}.
\item Active \verb?'? is produced by \cmd{\rq} if used in math mode.
\item \cmd{\pr@@@s} is called by \cmd{\pr@m@s}.
\item \cmd{\loop} is called by \cmd{\multispan} and \cmd{\s@tcols}.
\item \cmd{\relbar} is called by \cmd{\longleftarrow} and
  \cmd{\longrightarrow}.
\item \cmd{\vphantom} is called by \cmd{\mathstrut}.
\item \cmd{\pr@m@s} is called by \cmd{\prim@s}.
\item \cmd{\s@tcols} is \emph{conditionally} called by \cmd{\sett@b}.
\item \cmd{\longrightarrow} is called by \cmd{\longmapsto}. \item \cmd{\mathstrut} is called by \cmd{\matrix}. \item \cmd{\matrix} is called by \cmd{\pmatrix}. \item \cmd{\prim@s} won't necessarily define \cmd{\next} because it does a \cmd{\futurelet} which will leave \cmd{\next} undefined if the next thing happens to be an undefined control sequence (rather unlikely, however). \item \cmd{\vfootnote} and \cmd{\settabs} also do a \cmd{\futurelet} but it is followed by another macro that ensures that \cmd{\next} does not end up undefined. \end{itemize} \begin{comment} Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \end{comment} %$ \section{Addendum} \enlargethispage{3\onelineskip} \begin{comment} Addendum: From comp.text.tex =========================================================================== Archive-Date: Wed, 29 Sep 1993 13:21:40 CST From: cet1@cus.cam.ac.uk (Chris Thompson) Subject: Re: Managing Large LaTeX Files. How ?? Date: Wed, 29 Sep 1993 16:36:23 GMT To: tex-news@SHSU.EDU \end{comment} From \texttt{comp.text.tex} \begin{lcode} From: cet1@cus.cam.ac.uk (Chris Thompson) Subject: Re: Managing Large LaTeX Files. How ?? Date: Wed, 29 Sep 1993 16:36:23 GMT To: tex-news@SHSU.EDU In article <93265.121206SPIT@EVALUN11.BITNET>, Werenfried Spit writes: |> In article <1993Sep20.130331.16568@vax.oxford.ac.uk>, kaye@vax.oxford.ac.uk |> (Richard Kaye) says: |> >Has anyone else had save stack overflow when LaTeX read the .aux files? |> > |> >[Will a TeX guru please explain it to me? I thought \global\def's could not |> >cause save stack overflow until I found this problem. If it's a general |> >problem, it seems a bit silly that LaTeX should try to input so much |> >information in this way.] 
|> >
|> >I fixed it so that the data was read {\it outside} the group (as part of one
|>
|> Could someone explain it to me too? I'm even more puzzled after I tried
|> out Richards solution and played a bit with it. When you put in
|> your input file directly after the \documentstyle command the line
|>   \input \jobname.aux
|> LaTeX reads the aux file without its memory getting overflowed; then
|> at \begin{document} it reads the aux file again (as expected), but
|> the memory doesn't overflow this time either. (If you leave out the
|> \input \jobname.aux LaTeX only reads the aux file during \begin{document}
|> and then chokes on an exceedence of the save size.)
\end{lcode}

[Chris Thompson]
This was a hard one to track down. I could claim that it was all my
fault\ldots

The entries on the save stack are not the result of the
\cmd{\global}\cmd{\@namedef}, which as suggested above never needs to use
such a thing. They come from the earlier \cmd{\@ifundefined} call in
\cmd{\newlabel}.

Change \#337 in \pfile{tex82.bug} numbering, applied in TeX 2.9, changed
the implicit setting of an undefined control sequence referenced via
\cmd{\csname}\ldots\cmd{\endcsname} to \cmd{\relax} (\emph{TeXbook},
page 213) from being (sort of) global to being local to the current group.
Don made this change as a direct result of my posting to TeXhax (year
1987, digest 103) pointing out that the \emph{TeXbook} didn't correctly
describe what happened. The change was a potent source of new bugs,
because TeX was not originally designed to cope with token expansion
having the side-effect of modifying the save stack (see in particular
change \#371 in \pfile{tex82.bug}). I have more than once wondered whether
I should have kept quiet about the whole business\ldots

In an ideal world, the problem wouldn't arise because the implicit setting
to \cmd{\relax} wouldn't occur at all (IMNSHO). But everything (especially
LaTeX) relies on it now, so it's (far) too late to change it. Something to
be got right in the next incarnation.
\begin{lcode}
Chris Thompson
Cambridge University Computing Service
\end{lcode}

%%\endinput

\chapter{\cs{endlinechar} and \cs{par}}

\section{Exercise (fast)}
%%\input{ex013}
% ex013.tex
\begin{comment}
Date: 13 Oct 1993 12:31:56 -0400 (EDT)
From: Michael Downes
Subject: Around the Bend #13
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}
\ed{\oposted{1993/10/13}. \arch{exercise.013}.}

\begin{lcode}
%%%% Three lines of overhead for the self-decoding answer; see below %%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\f"20\d~{\c\f9
\a\f1 \ifnum\f>125\f002\d~{\a\f-1 \ifnum\f<1\egroup\fi}\fi~}\c`\^^M="9{~
\end{lcode}

%%========================================================================
%%*** Exercise 13 (fast):
(a) If \cmd{\endlinechar} does not have category 5 do you still get a
\piif{par} from a blank line?

(b) If \cmd{\endlinechar}=-1 do you still get a \piif{par} from a blank
line?

\begin{comment}
========================================================================
Michael Downes =========================================================
mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$

Self-decoding answer given below. To see the answer, run this post (sans
mail/newsgroup header) through plain TeX.
\begin{lcode}
\d~{\u\f\m\c\m12 \a\m1\a\f1 \ifnum\f>125\f33 \fi\ifnum\m>125\+~\1\fi~}\+
\u\uccode\+\p\uppercase\d\0#1{\ifnum`#1>"D \if#1 !\else"\fi\else\string~
\end{lcode}
\ed{There are sixteen lines like this, all of which are in the archived
version if you need them.
The last line is:} \begin{comment} \fi}\u`9"20\p{\d\1#19}{\newlinechar13 \d\3{\immediate\write16}\+~\0\p{\3 {}\3{#1}\batchmode\end}}\f"6C\m"0D\u\f\m\a\f"1\m32\u\f\m\c\m12\a\f1\m35~ /\aeS`amb]m/`]c\RmbVSm0S\Rmn|!(llsOtm<]ymsPtm<]ymm7\m]bVS`me]`RawmOmPZO\ YmZW\SmeWZZm^`]RcQSmOmJ^O`mWTlO\Rm]\ZgmWTmS\RZW\SmQVO`OQbS`amO`Sm^`SaS\b mO\RmVOdSmQObQ]RSm#ym7bmWalW\bS`SabW\Umb]m\]bSmbVObmbe]mQ]\aSQcbWdSmS\RZ W\SmQVO`OQbS`amO`Sm\]blb`O\aZObSRmaW[^Zgmb]mJ^O`wmPcbmb]m*a^OQS,J^O`ymms BVSma^OQSmeWZZlRWaO^^SO`mW\ma][SmQW`Qc[abO\QSawmSyUywmOTbS`mOmQ]\b`]Zme] `RwmOQQ]`RW\Ulb]mBSFram\]`[OZmaQO\\W\Um`cZSaytmBVWamWambVSm`SOa]\ms]`mOb mZSOabm]\Sl`SOa]\tmbVObmOmJ^O`m]^S`ObW]\m[cabm^S`T]`[mO\mW[^ZWQWbmJc\aYW ^l]^S`ObW]\ymBVS`SmeOamOZa]mOm`SQS\bm^]abmb]mQ][^ybSfbybSfmPgm2]\OZRl/`a S\SOcmb]m^]W\bm]cbmbVSm^`]PZS[meWbVma][S]\SramRSZW[WbSRxO`Uc[S\bl[OQ`]mR STW\WbW]\(llmmJRSTJa][SbVW\Un|yJ^O`i*R]ma][SbVW\UmeWbVmn|,kllBVSmRSZW[Wb S`mab`W\Um~nyJ^O`~nmRWRm\]bm[ObQVmbVSmOQbcOZmbSfbllmmyyyma][SmbSfbylmm*P \end{comment} \begin{lcode} ZO\YmZW\S,llPSQOcaSm]TmbVSma^OQSmb]YS\mT]ZZ]eW\UmbVSm^S`W]Ry mbSfbylmm*P \end{lcode} %%\endinput \section{Answers} %%\input{ans013} % ans013.tex \ed{\arch{answer.013}.} [This was included as a self-decoding answer in the posting of Exercise \#13 which is archived as \pfile{exercise.013}.] Answers to Around the Bend \#13: (a) No. (b) No. In other words, a blank line will produce a \piif{par} if and only if endline characters are present and have catcode 5. It is interesting to note that two consecutive endline characters are not translated simply to \piif{par}, but to \meta{space}\piif{par}. (The space will disappear in some circumstances, e.g., after a control word, according to TeX's normal scanning rules.) This is the reason (or at least one reason) that a \piif{par} operation must perform an implicit \cmd{\unskip} operation. 
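
Both answers, and the \meta{space}\piif{par} observation, can be checked
with a short plain TeX file. This is a hedged sketch rather than part of
the original answer: the macro names \cmd{\normal}, \cmd{\spaced} and
\cmd{\joined} are made up for the illustration, and category 10 is just
one possible choice of a category other than 5 for case (a).
\begin{lcode}
% Normal settings: the blank line yields <space>\par.
\edef\normal{first

second}
\message{[\meaning\normal]}  % [macro:->first \par second]

% (a) End-of-line character present but with category 10 instead of 5.
{\catcode`\^^M=10 %
\xdef\spaced{first

second}}
\message{[\meaning\spaced]}  % [macro:->first second]

% (b) No end-of-line character at all.
{\endlinechar=-1 %
\xdef\joined{first

second}}
\message{[\meaning\joined]}  % [macro:->firstsecond]
\bye
\end{lcode}
In each group the new setting only applies from the following line on,
which is why the line that makes the change ends with a comment character.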
There was also a recent post to \pfile{comp.text.tex} by Donald
Arseneau\index{Arseneau, Donald} pointing out the problem with someone's
delimited-argument macro definition:
\begin{lcode}
  \def\something#1.\par{<do something with #1>}

The delimiter string ".\par" did not match the actual text

  ... some text.
  <blank line>

because of the space token following the period.
\end{lcode}

%%\endinput

\chapter{TeX's stomach}

\section{Exercise}
%%\input{ex014}
% ex014.tex
\begin{comment}
Date: 26 Oct 1993 09:29:08 -0400 (EDT)
From: Michael Downes
Subject: Around the Bend #14
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}
\ed{\oposted{1993/10/26}. \arch{exercise.014}.}

\begin{lcode}
%%%%% Two lines of overhead for the self-decoding answer; see below %%%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\c13 9{\c32'16
\end{lcode}

%% =======================================================================
\begin{quote}
*** Exercise 14 [proposed by Jonathan Fine]:
Which character code/category code pairs can actually reach TeX's
`stomach'?
\end{quote}
%% =======================================================================

This is a refinement of the \emph{TeXbook}'s Exercise 7.3. You need to be
a little careful about your answer. I didn't get it right on my first
try \ldots

To make the notion of `reaching TeX's stomach' more precise: A token is
said to `reach TeX's stomach' if it produces a token report when
\cmd{\tracingcommands}=1. And a `token report' is a phrase in braces,
e.g.,
\begin{lcode}
{the letter A}
\end{lcode}
as produced by TeX in the log file when tracing commands.

\begin{comment}
Michael Downes ========================================================
mjd@math.ams.org ASCII 32--55,56--126: !"#$%&'()*+,-./01234567
89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$

Self-decoding answer given below. To see the answer, run this post (sans
mail/newsgroup header) through plain TeX.
\begin{lcode} }\d~{\u\f\m\c\m12\a\m1\a\f1 \ifnum\f>125\f33 \fi\ifnum\m>125\+~\1\fi~}\+ \u\uccode\+\p\uppercase\d\0#1{\ifnum`#1>"D \if#1 !\else"\fi\else\string~ \end{lcode} \ed{In the archived form there are 20 lines like this, the last being:} \begin{comment} \fi}\u`9"20\p{\d\1#19}{\newlinechar13 \d\3{\immediate\write16}\+~\0\p{\3 {}\3{#1}\batchmode\end}}\f"39\m"0D\u\f\m\a\f"1\m32\u\f\m\c\m12\a\f1\m35~ Y).2}-:/*:Y-*0)|:/#}:Z})|:;ILR99::[y/{*|}:[#y-:[*|}.::::[y/{*|}:[#y-:[*| }.9::EEEEEEE:EEEEEEEEEE::::EEEEEEE:EEEEEEEEEE9:::::I:::::HEEJMM::::::::: ::IH:::IEEJMM9:::::J:::::HEEJMM9:::::K:::::HEEJMM:::::::::::II:::HEEJMM9 :::::L:::::HEEJMM:::::::::::IJ:::HEEJMM9::::::::::::::::::::::::::::IK:: :HEEJMM9:::::N:::::HEEJMM9:::::O:::::HEEJMM9:::::P:::::HEEJMM99[y/}"*-4: IH:$.:/#}:}3{}+/$*)y':{y.}F:[y/{*|}EIH:{#y-y{/}-.:2$/#:{#y-y{/}-9{*|}:TV :KJ:{y):*)'4:z}:+-*|0{}|:z4:t0++}-{y.}Gt'*2}-{y.}:/-${&.:@l}pz**&D9Y++}) |$3:\AF:k*:/#}:+y$-:{#y-y{/}-:HD:{y/{*|}:IH:$.:)*/:+*..$z'}R:t0++}-{y.}9 y)|:t'*2}-{y.}:{y))*/:+-*|0{}:y:{#y-y{/}-:H:!-*(:y:)*)EH:{#y-y{/}-F99Y{/ $1}:{#y-y{/}-.:2$'':/}./:/-0}:!*-:{y/}"*-4:IH:2$/#:t$!{y/:$!:/#}4:y-}9t' }/:},0y':/*:y:.+y{}:/*&})F:Z0/:$!:/#}:~9:{#y-y{/}-:@.y4A:#y.:z}}):.*9|}! $)}|D:$/:2$'':)*/:(y/{#:y:.+y{}:$):/#}:|}'$($/}-:/}3/:*!:y:(y{-*:2$/#9|} '$($/}|:y-"0(})/.F:Y)|:y{{*-|$)":/*:t/-y{$)"{*((y)|.:/#}:(}y)$)":*!:y)9y {/$1}:/$'|}:/#y/:#y.:z}}):t'}/:},0y':/*:y:.+y{}:$.:~;z'y)&:.+y{}::~;D92# \end{comment} \begin{lcode} }-}y.:/#}:(}y)$)":*!:y:{y/}"*-4EIH:/$'|}:$.:~;z'y)&:.+y{}:~9~;F ::~;D92# \end{lcode} %%\endinput \section{Answers} %%\input{ans014} % ans014.tex \ed{\arch{answer.014}.} [This was included as a self-decoding answer in the posting of Exercise \#14, which is archived as \pfile{exercise.014}.] 
\begin{lcode}
Answer to Around the Bend #14:

  Catcode  Char Codes    Catcode  Char Codes
  -------  ----------    -------  ----------
     1     0--255          10     1--255
     2     0--255
     3     0--255          11     0--255
     4     0--255          12     0--255
                           13     0--255
     6     0--255
     7     0--255
     8     0--255
\end{lcode}

Category 10 is the exceptional case. Catcode-10 characters with character
code $\neq 32$ can only be produced by \cmd{\uppercase}/\cmd{\lowercase}
tricks (\emph{TeXbook}, Appendix D). So the pair (character 0, catcode 10)
is not possible: \cmd{\uppercase} and \cmd{\lowercase} cannot produce a
character 0 from a non-0 character, because a \cmd{\uccode} or
\cmd{\lccode} value of 0 means `leave this character unchanged'.

Active characters will test true for category 10 with \piif{ifcat} if they
are \cmd{\let} equal to a space token. But if the \verb?~? character (say)
has been so defined, it will not match a space in the delimiter text of a
macro with delimited arguments. And according to \cmd{\tracingcommands}
the meaning of an active tilde that has been \cmd{\let} equal to a space
is \verb?`blank space  '? whereas the meaning of a category-10 tilde is
\verb?`blank space ~'?.

%%\endinput

\chapter{Space removal}

\section{Exercise}
%%\input{ex015}
% ex015.tex
\begin{comment}
Date: 05 Nov 1993 16:34:28 -0500 (EST)
From: Michael Downes
Subject: Around the Bend #15
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
\end{comment}
\ed{\oposted{1993/11/05}. \arch{exercise.015}.}

(a) Write a macro \cmd{\trimspace} that takes another macro as its
argument and removes a trailing space from the replacement text of the
macro, if one is present, and otherwise leaves it unchanged.

(b) Write a macro \cmd{\trimspaces} that removes a leading space, if
present, and then calls \cmd{\trimspace} to remove a trailing space.
%%======================================================================== Motivation: If a user inadvertently includes an extra space in a text argument, such as a section heading: \begin{lcode} \section{Title of the section } \end{lcode} then you must usually take care to remove the space when typesetting the text. The simple way is to perform an \cmd{\unskip} at the end (if the text is immediately followed by \piif{par}, the \cmd{\unskip} operation is built-in) and an \cmd{\ignorespaces} at the beginning, but various complications can arise, so it would be preferable to be able to apply a \cmd{\trimspaces} function when an argument is first read, and then have the information in proper form for all subsequent uses. \begin{comment} Send answers to the address below. A summary will be posted November 23, 1993 or thereabouts. Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \section{Answers} %%\input{ans015} % ans015.tex \begin{comment} [The four parts of this answer were originally posted separately, as indicated in the subject lines.] Date: 16 Dec 1993 16:34:45 -0500 (EST) From: Michael Downes Subject: Around the Bend #15, answers To: info-tex@shsu.edu \end{comment} \ed{\oposted{1993/12/16}. \arch{answer.015}.} Exercise 15 asked for a function \cmd{\trimspace} to trim a trailing space from the replacement text of a macro, and a function \cmd{\trimspaces} to trim both a leading and a trailing space. At the time of posting the exercise I had no prepared solution; as luck would have it the problem was rife with latent complications (including some hard questions about limiting the domain of application), which propagated an unusually diverse crop of approaches among the submitted solutions, and which made the task of preparing a good summary extraordinarily difficult. 
Even after breaking down the `summary' into two or three pieces, to avoid a too formidably large monolith of a posting, I'll have to leave out some material that I would otherwise have included. I'd say Donald Arseneau\index{Arseneau, Donald} deserves credit for the best analysis, including an accurate survey of brace-stripping problems. Nearly everyone, including myself, had missed a lurking flaw of that kind in the first submitted version of their solution. Another good idea of Donald's that caught my fancy was to use TeX's built-in scanning procedures for \meta{optional space} to strip the leading space in \cmd{\trimspaces}. I managed to work that into my own best solution, much to my satisfaction. Peter Schmitt\index{Schmitt, Peter} came up with perhaps the most aerodynamic solution, on his second go-round. A solution by Ian Collier\index{Collier, Ian} differed notably from the others by using \cmd{\meaning} to look for a leading space. Another submission, from Gary McGary\index{McGary, Greg}\index{McGary, Gary|see{McGary, Greg}} \ed{I think this is a typo for Greg McGary}, contained some original syntactic ideas, and explored the more general problem of removing an arbitrary token pattern at the end of a token list. A careless, off-the-cuff remark of mine in the statement of Exercise 15 that after removing a leading space, \cmd{\trimspaces} should call \cmd{\trimspace} to remove a trailing space, was probably a mistake. In most cases, at least, \cmd{\trimspaces} can be more elegantly written by letting the two different space-removal procedures share a few tokens at a lower level. From Donald's\index{Arseneau, Donald} analysis: \begin{quote} When I first read the question, I thought `why isn't there an answer with the question, because that one is easy?' 
As I started to type my answer `cold', I realized that what I had used previously to ignore leading spaces \begin{lcode} \def\something#1#2\weird{#1#2} \end{lcode} had the bad side-effect of stripping braces if the parameter began with `\verb?{?'. \end{quote} I append below Peter Schmitt's\index{Schmitt, Peter} solution, more or less as he wrote it. The commentary refers to earlier correspondence in a place or two but I believe there is sufficient context to make everything intelligible. Test \#5 in the test suite traps the insidious brace-stripping problem that infested most of the solutions in their first incarnation. \begin{comment} More on Exercise 15 to follow, some time in the next few days. Michael Downes, mjd@math.ams.org %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \end{comment} \begin{solution}{Solution 1 (Peter Schmitt)} %%>>Solution 1 (Peter Schmitt, a8131dal@awiuni11.edvz.univie.ac.at) Since I wanted to stay with delimited arguments it was clear that one has to add a token (or tokens) in order to hide braces, which finally have to be removed again. First I came up with using \cmd{\empty}, as you did, but then I switched to a not expandable token because this can more efficiently be used as a parameter delimiter. \cmd{\trimspaces} and \cmd{\trimspace} are just used to expand the argument and add delimiting tokens in front and at the end of it, and set up the delimiting tokens for \cmd{\Trimspace} and \cmd{\Trimspaces}, too. As Donald does, I do not call \cmd{\trimspace} by \cmd{\trimspaces} but rather \cmd{\Trimspace} by \cmd{\trimspaces}. It would be easy to offer \cmd{\TrimLeft} \cmd{\TrimRight} and \cmd{\TrimBoth} and also \cmd{\TrimLeftS} \cmd{\TrimRightS} and \cmd{\TrimBothS} which iterate in the (very unlikely!) case that there are several consecutive space tokens. \cmd{\Trimspaces} and \cmd{\Trimspace} remove leading, respectively trailing, spaces of the argument, but they both leave the delimiting tokens in place. 
These (and outside tokens) are removed by \cmd{\TrimSpace} in the process of redefining the initial controlsequence. \begin{lcode} \catcode`\<=3 \catcode`\>=3 \def\trimspace #1{\expandafter\expandafter\expandafter \Trimspace\expandafter <#1> >\\#1} \def\trimspaces #1{\expandafter\expandafter\expandafter \Trimspaces\expandafter <#1>< <\\#1} %% \Trimspaces < text>< <\\ |< text>| ==> %% -> || + |text> + | <| %% => ||+| <|+|text>| == | | %% %% \Trimspaces < <\\ || ==> %% -> || + || + || %% => ||+||+|| == || %% \Trimspace >\\ || ==> %% -> || %% => |\\ == |\\| %% %% \Trimspace >\\ || ==> %% -> || + || %% => ||+>\\ == |>| \def\Trimspaces #1< #2<#3\\{\Trimspace #1#3#2 >\\} \def\Trimspace #1 >#2\\{\TrimSpace #1>\\} \def\TrimSpace #1>#2\\#3{% \expandafter\expandafter\expandafter\expandafter\expandafter \def \expandafter\expandafter\expandafter #3\expandafter {\Remove#1}} \def\Remove#1{} \catcode`\<12 \catcode`\>=12 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \def\Test#1{\def\test{#1}\immediate\write0{|\test|}% \trimspaces\test \immediate\write0{|\test|}% } \let\trim\trimspace \let\trim\trimspaces %%%%%%%%%%%%%%%%%%%%%%%%% \Test{} \Test{ } \Test{ a } \Test{ {}{} } \Test{{braces}} \Test{ {braces} } \Test{ { braces } } \Test{no space and no space} \Test{no space and a space: } \Test{ :a space and no space} \Test{ :a space and a space: } \def\test{ \ifx/ }\trimspace\test\show\test \def\test{ \ifx }\trimspaces\test\show\test \end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \end{lcode} \end{solution} %%\endinput \begin{comment} Date: 23 Dec 1993 16:21:21 -0500 (EST) From: Michael Downes Subject: Around the Bend #15, answers, 2nd installment To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} Some exposition seems called for here in order to lay out various considerations running through my mind and the minds of the other solution-submitters. 
\subsection{Trimming a trailing space}

There are two possible ways to remove a trailing space. The first one is
to step through the given text one token at a time, and construct a new
token list in parallel by adding the tokens one by one at the end. If the
next token is a space, delay adding it until the subsequent token is
checked, and if it turns out the text is exhausted, discard the space
instead of adding it. The hard part about this approach is dealing with
braces (character tokens with catcode 1 or 2) because a lone brace cannot
be passed as a macro argument. A recent posting by \'Eamonn McManus to
\pfile{comp.text.tex} on a different sort of problem showed that the
braces can indeed be dealt with; it's just not easy.

The second, simpler approach is to use TeX's scanning of delimited macro
arguments to scan for the ending space and discard it. If you merely scan
for a space token, however, you end up scanning through the given text
`word' by `word' (word = sequence of non-space characters or
brace-delimited groups) instead of token by token, which is, if anything,
even more awkward than the first method above, since you still must deal
with brace complications. The key refinement, therefore, is to scan for a
pair of tokens: a space token and some well-chosen bizarre token that
can't possibly occur in the scanned text. If you put the bizarre token at
the end of the text, and if the text has a trailing space, then TeX's
delimiter matching will match at that point and not before, because the
earlier occurrences of space don't have the requisite other member of the
pair.

Next consider the possibility that the trailing space is absent: TeX will
keep on scanning ahead for the pair \meta{space}\meta{bizarre} until
either it finds them or it decides to give up and signal a `Runaway
argument?' error. So you must add a stop pair to catch the runaway
argument possibility: a second instance of the bizarre token, preceded by
a space.
If TeX doesn't find a match at the first bizarre token, it will at the second one. Now all that's left is to test somehow where the hit occurred in order to fork properly. This can be done in various clever ways, as exhibited in the solutions. %%\endinput \subsection{Trimming a leading space} More analysis from Donald Arseneau: \begin{quote} There are two safe, expandable ways to eat `one optional space': `\piif{ifnum}' using an ascii code (\texttt{`c}) as the second number, and `\piif{ifdim}' using a literal unit of measure like `pt'. Oh, yes, it could also be done with parameter syntax too, but more on that later. \end{quote} %%\endinput In other words, one way to remove a leading space would be \begin{lcode} \expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo \fi} \end{lcode} The \cmd{\expandafter}'s would cause the \piif{ifdim} to be executed first. Execution of the \piif{ifdim} will not terminate until the scanning of the second `0pt' is finished; therefore TeX will start expanding \cmd{\foo} as part of the scanning of the `0pt'. Then if a space is the first thing inside the expansion of \cmd{\foo}, it will be removed by TeX as denoting the end of the dimension. Otherwise the first non-space token will terminate the dimension scanning and will be left in place (well, I am glossing over the problem of an expandable token at the beginning of \cmd{\foo}, which can be handled by further refinements). Notice that as written the trailing \piif{fi} will be included in the redefinition of \cmd{\foo}. No problem---just rewrite it with the \piif{fi} after the closing brace: \begin{lcode} \expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo}\fi \end{lcode} [Now for a sharp little question: will that work with \cmd{\edef} instead of \cmd{\def}? \begin{lcode} \edef\foo{\ifdim0pt=0pt\foo}\fi \end{lcode} See if you can guess before testing it.] 
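
For readers who want to try the \cmd{\def} form before tackling the
\cmd{\edef} question, here is a small test harness for plain TeX. It is
only a sketch: the sample replacement text is made up, and the \cmd{\edef}
variant is deliberately left untested here.
\begin{lcode}
\def\foo{ {x} y}   % a leading space, then a brace group
\expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo}\fi
\message{[\meaning\foo]}  % [macro:->{x} y]
\bye
\end{lcode}
Because no macro argument is being grabbed, the leading brace group keeps
its braces; only the space is consumed, as the optional space that may end
a dimension.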
%%\endinput
%%\begin{verbatim}

Other ways of removing a leading space include using \cmd{\futurelet} to
look at the first token in the scanned text, or using TeX's argument
delimiter scanning to scan for a space. The latter method is perhaps most
straightforwardly done as a mirror-image of the method for removing a
trailing space: make the delimiter \meta{bizarre}\meta{space}, and then
call the macro (let's say \cmd{\trimx}) by putting \meta{bizarre} before
the scanned text and a stop pair \meta{bizarre}\meta{space} after it, in
case a leading space is not present:
\begin{lcode}
\trimx<bizarre>#1<bizarre><space>\endtrimx
\end{lcode}
It would be possible to do without the bizarre token and have the
delimiter consist only of a space, but with some ensuing complications,
I think, that would make it scarcely worthwhile.

\subsection{Some remarks about the domain of the problem}

The application I had in mind was, generally speaking, to remove unwanted
spaces at the beginning and end of a piece of text supplied by the user,
such as a section title or other heading. Typical situation: A user
command \cmd{\title} takes an argument
\begin{lcode}
\title{ Some Article Title }
\end{lcode}
with the definition of \cmd{\title} being
\begin{lcode}
\def\title#1{\def\savedtitle{#1}\trimspaces\savedtitle}
\end{lcode}
Thereafter we may use \cmd{\savedtitle} in any number of ways: print it;
put it in a \cmd{\mark} for running heads; write it to an auxiliary file
for table of contents use, or for adding to a BibTeX database; or write
it on screen to show progress when typesetting a collection of articles.
For the last two examples in particular trimming spaces with
\cmd{\ignorespaces} or \cmd{\unskip} is undesirable. Notice also that
\cmd{\unskip} will remove \emph{any} trailing glue, including
\cmd{\leaders} or explicit \cmd{\hskip}'s that might sometimes be added
by users for their own inscrutable purposes and whose unexpected removal
could be (indeed, has been in true life) the cause of much consternation.
If we call \cmd{\trimspaces} in the definition of \cmd{\title}, then
leading and trailing spaces are removed once and for all, and none of the
many functions that later use \cmd{\savedtitle} need to worry about that
task.

With this restricted domain of use in mind for \cmd{\trimspaces}, I
screened the submitted solutions through the following conditions.

\begin{description}
\item[Condition 1] The text has been stored in a macro. The result of
\cmd{\trimspaces} is a redefinition of the macro.

This is not exactly a necessary condition, but removal of this condition
would suggest that constructions like
\begin{lcode}
\def\foo#1{...
  \message{Your argument "\trimspaces{#1}" makes me laugh}%
  ...}
\end{lcode}
should be supported. The full expansion done by \cmd{\message} or other
such commands, however, can't be applied carelessly to arbitrary
user-supplied text. You would need to deactivate problematic elements (by
changing catcodes, adding \cmd{\protect}'s, whatever). So supporting full
expansion for the operand of \cmd{\trimspaces} is of low relevance for
the envisioned normal applications.

\item[Condition 2] It suffices to remove a single space before and after
the text.

In almost any other programming language, a typical space-trimming
function would need to handle the possibility of multiple consecutive
spaces. But in text supplied by an average user through the normal TeX
lexical conventions, consecutive spaces will be reduced to a single space
before our trimming functions are ever called. The next installment of
this `summary' will include a recently arrived solution by Jonathan
Fine\index{Fine, Jonathan} that handles multiple trailing spaces as
easily as a single one, without any extra implementation cost.

\item[Condition 3] For both the trailing space and the leading space, we
don't know whether or not they are present.

If we knew for certain that a given space was present, of course, the
procedure for removing it would be easier.
\end{description} %%======================================================================== %%>>Solution 2 (Ian Collier) [Ian.Collier@prg.oxford.ac.uk] %\begin{description} \begin{solution}{Solution 2 (Ian Collier)}\index{Collier, Ian} \ldots I used \cmd{\meaning} to find out whether or not the first character of the argument is a space (because spaces are usually ignored and this seems to be the only way to make the space visible). I'm fairly sure that `blank space' is the only \cmd{\meaning} beginning with `bl'. I had rather a lot of trouble with braces, because if the first character is a brace then \cmd{\meaning} removes it and leaves an unmatched right brace. However I finally realised that \verb?\iffalse...\fi? could be used to remove it. \begin{lcode} {\catcode`Q=3 \catcode`@=11 \gdef\trimspace#1{\expandafter\trimspac@a#1QAA QB} \gdef\trimspac@x#1{\trimspac@a#1QAA QB} \gdef\trimspac@a#1 Q#2{\if#2A#1\expandafter\trimspac@b \else\trimspac@c#1\fi} \gdef\trimspac@b A QB{} \gdef\trimspac@c#1QAA{#1} \gdef\trimspaces#1{\expandafter\expandafter\expandafter\tr@a \expandafter\meaning#1A\fi{#1}} \gdef\tr@a#1#2{\if#1b\if#2l\expandafter\expandafter\expandafter\tr@c \else\expandafter\expandafter\expandafter\tr@b\fi\else \expandafter\tr@b\fi} \gdef\tr@b{\expandafter\trimspace\iffalse} \gdef\tr@c{\expandafter\tr@d\iffalse} \gdef\tr@d#1{\expandafter\tr@e#1Q} \def\:{\gdef\tr@e}\: #1Q{\trimspac@x{#1}} } \def\test#1{\edef\text{#1}\immediate\write16 {"\trimspaces\text"}} \test{ Leading space} \test{Trailing space } \test{ Leading and trailing spaces } \test{Nospaces} \test{ {braces}Leading space{braces}} \test{{braces}Trailing space{braces} } \test{ {braces}Leading and trailing spaces{braces} } \test{{braces} Nospaces {braces}} \test{} \test{ } \test{\space\space{two spaces}\space\space} \end \end{lcode} %%======================================================================== Comments: Some extra work would be necessary to handle the possibility \begin{lcode} 
\def\text{\iftrue a\else b\fi} \trimspaces\text \end{lcode} because removal of the \piif{iftrue} by \cmd{\meaning} will leave the \piif{else} and \piif{fi} unmatched, confusing the later \piif{iffalse} step done by \cmd{\tr@b}, \cmd{\tr@c}. But such a value for \cmd{\text} is rather unlikely in ordinary user-supplied arguments. %\end{description} \end{solution} \begin{comment} Some more solutions to Exercise 15 will follow in a few days. Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ Date: 30 Dec 1993 17:07:17 -0500 (EST) From: Michael Downes Subject: Around the Bend #15, answers, 3rd installment To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} %$ I have done some slight condensing in the answers, indicated by \verb?[...]?. Solution 3 by Greg McGary contains an interesting idea for an alternative syntax of the \cmd{\trimspaces} function: Instead of writing \begin{lcode} \def\savedtitle{#1}\trimspaces\savedtitle \end{lcode} you would write \begin{lcode} \trimmed\def\savetitle{#1} \end{lcode} %%======================================================================== %%>>Solution 3 (Greg McGary, gkm@tmn.com) %\begin{description} \begin{solution}{Solution 3 (Greg McGary)}\index{McGary, Greg} \begin{lcode} %%% preliminaries: (Mad about those abbreviations!) \catcode`@=11 \let\ea=\expandafter \let\nx=\noexpand \let\ag=\aftergroup \def\agg{\ag\ag\ag} \let\bg=\begingroup \let\eg=\endgroup [...] %%% The underlaying tool I use is \trimmed, which is used as a modifier for %%% macro definitions to trim the trailing space from the body: %%% \trimmed\def\foo{foo } will set \foo to {foo} %%% Notice that any form of \def modifier may be interposed between \trimmed %%% and \def, as in \trimmed\global\long\outer\def\foo{foo } %%% %%% As an aside, TeX has no \expanded modifier. 
Expanded definitions %%% must be accomplished through use of \edef or \xdef (equivalent to %%% \global\edef) This is annoying, as we might like to use \trimmed with %%% expanded definitions and don't want to write a separate \etrimmed. %%% Luckily, we can easily roll our own \expanded modifier, like so: \def\expanded#1\def{#1\edef} %%% Other modifiers may optionally be inserted between \expanded and %%% \def, like so: \def\foo{foo} \outer\expanded\long\def\bar{\foo} %%% Here's the definition of \trimmed: \long\def\trimmed#1\def#2#3{\bg \long\def\!##1##2 \!##3\trimmed@{\eg \ifx\relax##3\relax \trimmed@{##1}##2% \else ##1{##2}% \fi}% \!{#1\def#2}#3\! \!\trimmed@} \long\def\trimmed@#1#2\!{#1{#2}} %%% Notice the use of \begingroup...\endgroup to make the definition of \! %%% temporary so as not to disturb any previous definition, and so that the %%% temporary will disappear once we're done with it. Notice that the %%% \endgroup appears right away in the body of \!, so that the ensuing \def %%% will occur in the proper group. \! was chosen as a name for the temporary %%% macro because it is a non-alphabetic (non-catcode-11) character; any other %%% non-alphabetic would suffice as well. Non-alphabetic macro-names have the %%% desirable property of preserving any trailing space token. %%% %%% If we are really fastidious about keeping clutter out of the global name %%% space, we can also define \trimmed@ as a temporary alongside \!. We would %%% also want to use a name that's already defined, to avoid entering a new %%% name into TeX's hashtable. A non-alphabetic name like \: seems like a %%% good (though cryptic) choice: \long\def\trimmed#1\def#2#3{\bg \long\def\:##1##2\!{\eg##1{##2}} \long\def\!##1##2 \!##3\:{% \ifx\relax##3\relax \:{##1}##2% \else \eg##1{##2}% \fi}% \!{#1\def#2}#3\! \!\:} %%% Notice that we've had to delay the \endgroup until after our new %%% temporary \: has been used. 
%%% %%% Anyway, we may now define \trimspace as follows: \def\trimspace#1{\ea\trimmed\ea\def\ea#1\ea{#1}} %%% Notice that the replacement definition is a normal \def, whereas the %%% macro we started with could have had any number of modifiers attached, %%% such as \long, \outer, or \global. A further exercise might be to fix %%% this problem. %%% %%% A more generalized trim might allow any list of tokens to be trimmed off %%% the tail of another list of tokens. Here, we add an initial argument to %%% \trimmed specifying those tokens. In order to strip off trailing ".\par" %%% for instance, we could write: \trimmed{.\par}\outer\long\def\foo{foo.\par} %%% %%% Here's the general definition of \trimmed: \long\def\trimmed#1#2\def#3#4{\bg \long\def\:##1##2\!{\eg##1{##2}} \long\def\!##1##2#1\!##3\:{% \ifx\relax##3\relax \:{##1}##2% \else \eg##1{##2}% \fi}% \!{#2\def#3}#4\!#1\!\:} %%% The auxiliary \trimmed@ remains unchanged. Notice that we no longer really %%% need a non-alphabetic macro name for the temporary macro, since we don't %%% have to preserve the literal space token following the macro. %%% %%% Unfortunately, the literal space token problem doesn't disappear, it's just %%% pushed up a level. Now we have to give that space as an argument to \trimmed %%% in the definition of \trimspace, and hop over it with \expandafter! \edef\trimspace#1{\nx\ea\nx\trimmed\nx\ea {\nx\ea\space\nx\ea}\nx\ea\def\nx\ea#1\nx\ea{#1}} %%% N.B., The curly braces, "\nx\ea{...\nx\ea}" around the "\nx\ea\space" %%% are necessary. %%% %%% This approach of defining \trimspace in terms of an underlaying \trimmed %%% \def'inition facility has the advantage of reusing code, but the %%% disadvantage of forcing a macro redefintion even if there is no trailing %%% space to remove. We could modify \trimmed to produce a new macro, \trim, %%% that redefines a macro only if it has the trailing pattern of interest. %%% (It also happens to be simpler!) 
\long\def\trim#1#2{\bg \long\def\!##1#1\!##2\:{\eg \ifx\relax##2\relax \else \def#2{##1}% \fi}% \ea\!#2\!#1\!\:} %%% Now, we can define \trimspace in terms of \trim like so: \edef\trimspace#1{\nx\ea\nx\trim\nx\ea{\nx\ea\space\nx\ea}\nx\ea#1} %%% Ok, let's test it: \def\HasTrailingSpace{has trailing space } \def\NoTrailingSpace{no trailing space} \trimspace\HasTrailingSpace \show\HasTrailingSpace \trimspace\NoTrailingSpace \show\NoTrailingSpace %%% While we're at it, let's test another pattern: \def\HasTrailingDotPar{has trailing dot par.\par} \def\NoTrailingDotPar{no trailing dot par} \trim{.\par}\HasTrailingDotPar \show\HasTrailingDotPar \trim{.\par}\NoTrailingDotPar \show\NoTrailingDotPar %%% ### Exercise 15(b) %%% Write a macro \trimspaces that removes a leading space, if %%% present, and then calls \trimspace to remove a trailing space. %%% I'm going to solve this in a quick and dirty way, as it's getting %%% late and I'm running out of gas! Just use \futurelet sequestered %%% in a \vbox to inspect the first token. If it's a \space, gobble %%% the first token and subject the remaining tokens to \trimmed. \def\redefSansSp@ce#1 #2\redefSansSp@ce{\def#1{#2}} \def\redefSansSpace#1{\ea\redefSansSp@ce\ea#1#1\redefSansSp@ce} \def\trimspaces#1{\bg\setbox0=\vbox{% \def\maybeRedefSansSpace{\ea\ifx\space\@\agg\redefSansSpace\agg#1\fi}% \ea\futurelet\ea\@\ea\maybeRedefSansSpace#1}\eg \trimspace#1} %%% \futurelet won't work for the more general case of trimming an %%% arbitrary leading pattern, as it only looks at one token. %%% I'll leave solving the general case as an exercise for the reader ;-) %%% %%% This is also not the most efficient solution, since we redefine the macro %%% twice if there is a leading space. Notice that we put the \setbox0 %%% inside a group, to keep any previous definition of \box0 safe. This %%% is probably overkill, since \box0 is a temporary register and users %%% should be aware that it's fair game, but it doesn't hurt to be %%% courteous... 
Also note the abbreviation \agg, which pushes its argument %%% out two groups. [...] %%% Testing... \def\foo{ foo } \trimspaces\foo \show\foo \end{lcode} \end{solution} %%======================================================================== In the previous posting I discussed the method of removing a trailing space by scanning for a token pair \meta{space}\meta{bizarre}. In Schmitt's solution, for example, the bizarre token was a greater-than character with catcode 3. And in my solution, I used a letter Q with catcode 3. Solution 4 from Jonathan Fine takes the approach of using a second \meta{space} token for the \meta{bizarre} token. In practice this works for typical user-supplied text, as discussed before, since TeX's normal reduction of multiple spaces to single spaces makes the pair \meta{space}\meta{space} sufficiently bizarre. I have to admit I like this idea; those who attempted a solution for this exercise and struggled with various other delimiter possibilities will, I think, appreciate the humor of it as I did. As I mentioned last week, I found some theoretical interest in the fact that if multiple space tokens were present at the end of the text being trimmed, Fine's solution would remove them all, without needing to use recursion. But another correspondent pointed out since then that if multiple spaces were present at the end they might also be presumed possible in the middle of the scanned text, and an occurrence of multiple spaces in the middle would cause \cmd{\trim} to fail. 
\begin{solution}{Solution 4 (Jonathan Fine)}\index{Fine, Jonathan} \begin{lcode} %% NOTE: I have benefited from Michael Downes posting of answers, dated %% 16 December, particularly for stripping the leading space, and the %% discussion of the hazards of grouped arguments \catcode`\@=11 %% The Solution \def\trim #1{\expandafter\trim@\expandafter{#1 }#1} \def\trim@ #1{\trim@@ @#1 @ #1 @ @@} \def\trim@@ #1@ #2@ #3@@{\trim@@@\empty #2 @} \def\unbrace#1{#1} \unbrace{\def\trim@@@ #1 } #2@#3{\expandafter\def \expandafter #3\expandafter {#1}} %% Test Code \def\Test{\afterassignment\Test@ \def\test} \def\Test@{\trim\test \afterassignment\Test@@ \def\test@} \def\Test@@{\message{\ifx\test\test@ Y\else FAIL:|\meaning\test|\fi}} \catcode`\@=12 %% Testing The Solution \Test{}{} \Test{ }{} \Test{ a }{a} \Test{ {}{} }{{}{}} \Test{{braces}}{{braces}} \Test{ {braces} }{{braces}} \Test{ { braces } }{{ braces }} \Test{no space and no space}{no space and no space} \Test{no space and a space: }{no space and a space:} \Test{ :a space and no space}{:a space and no space} \Test{ :a space and a space: }{:a space and a space:} \Test{ \ifx }{\ifx} \Test{ \ifx/ }{\ifx/} \end{lcode} \end{solution} \begin{comment} Since my solution got rather long after I added some commentary I'll post it separately in a couple of days, rather than double the size of this post. Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ Date: 03 Jan 1994 17:14:14 -0500 (EST) From: Michael Downes Subject: Around the Bend #15, answers, 4th (last) installment To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} %$ My solution here is the result of weeks of incremental refinement, ending only last week, and consequently benefits from analysis of the other solutions. 
%%========================================================================
\begin{solution}{Solution 5 (Michael Downes)}
\begin{lcode}
% Here I only solve part (b) of Exercise 15, in an attempt to make
% a solution of utmost compactness (3 control sequences, 45 tokens).
% Also, it seems likely that in actual use \trimspaces can be
% applied without harm whenever \trimspace might be needed.
%
% The method for pausing after each test might be of ancillary
% interest to some readers; unlike the alternative of setting
% \pausing=1, the \test's aren't required to be on separate lines.

\catcode`\Q=3

% \trimspaces\x redefines \x to have the same replacement text sans
% leading and trailing space tokens.
%
\def\trimspaces#1{%
  % Use grouping to emulate a multi-token afterassignment queue.
  \begingroup
  % Put `\toks 0 {' into the afterassignment queue.
  \aftergroup\toks\aftergroup0\aftergroup{%
  % Apply \trimb to the replacement text of #1, adding a leading
  % \noexpand to prevent brace stripping and to serve another purpose
  % later.
  \expandafter\trimb\expandafter\noexpand#1Q Q}%
  % Transfer the trimmed text back into #1.
  \edef#1{\the\toks0}%
}

% \trimb removes a trailing space if present, then calls \trimc to
% clean up any leftover bizarre Qs, and trim a leading space. In
% order for \trimc to work properly we need to put back a Q first.
%
\def\trimb#1 Q{\trimc#1Q}

% Execute \vfuzz assignment to remove leading space; the \noexpand
% will now prevent unwanted expansion of a macro or other expandable
% token at the beginning of the trimmed text. The \endgroup will feed
% in the \aftergroup tokens after the \vfuzz assignment is completed.
%
\def\trimc#1Q#2{\afterassignment\endgroup \vfuzz\the\vfuzz#1}

\catcode`\Q=11

\def\test#1{\errhelp{#1}\message{[\the\errhelp]}%
  \edef\x{\the\errhelp}%
  \global\tracingcommands2\global\tracingmacros2\global\tracingonline0
  \trimspaces\x
  \global\tracingcommands0\global\tracingmacros0\global\tracingonline0
  \errhelp\expandafter{\x}\message{-> [\the\errhelp]}%
  \read16 to\PressReturnToContinue
}

\test{ x }
\test{ xy z }
\test{}
\test{{}}
\test{{}{}}
\test{ {x} }
\test{ }
\test{{ }}
\test{\AA}
\test{\fi}
\test{\space x\space}
\test{ #1 }

\end
\end{lcode}

Commentary

Suppose we have a macro \cmd{\x} with replacement text
\verb?" {xyz} "?. The task of \cmd{\trimspaces} is to construct a
statement of the form
\begin{lcode}
\def\x{{xyz}}
\end{lcode}
i.e., to redefine \cmd{\x} with the same replacement text except for
removal of a leading or trailing space. However, a similar statement
\begin{lcode}
\toks0{{xyz}}\edef\x{\the\toks0}
\end{lcode}
is more robust if the replacement text might contain \# tokens. For
example,
\begin{lcode}
\def\x{\def\y##1{}}
\end{lcode}
works OK but after thus defining \cmd{\x}, the statements
\begin{lcode}
\def\trimx#1{\expandafter\def\expandafter\x\expandafter{#1}}
\trimx\x
\end{lcode}
fail with an error message because the `\#1' in the definition of
\cmd{\y} is misinterpreted as a parameter token for the redefinition of
\cmd{\x}. Although \# tokens seem highly unlikely in average
user-supplied text, I aimed for a statement of the second, robuster kind,
as if I were writing \cmd{\trimspaces} for use in a major macro package
with thousands of prospective users.

The basic structure of \cmd{\trimspaces} is therefore: First remove a
trailing space, then remove a leading space, then put the remaining text
into \cmd{\toks}\texttt{0}, then transfer the text to \cmd{\x} with
\cmd{\edef}.

For removing the trailing space, I apply a macro scan with delimiter
\verb?<space><Q_3>?. Here the notation \verb?<c_n>?
means the character token consisting of character code \texttt{c} with
catcode \texttt{n}.

The leading space is removed by executing the assignment
\verb?\vfuzz=\the\vfuzz? at the beginning of the operand text, in order
to use a side effect of the assignment: removal of a following space.
(Credit to Donald Arseneau for this good idea.) The main reason for using
\verb?\the\vfuzz? instead of 0pt is that it's slightly shorter (one
token), although if we did not have the group structure to localize the
`change' to \cmd{\vfuzz}, then using \verb?\the\vfuzz? would also be a
good idea for the sake of preserving the variable's previous value. The
statement \verb?\vfuzz=\vfuzz? (sans \cmd{\the}), by the way, would not
gobble a following space: when TeX recognizes a suitable variable on the
right-hand side of an assignment, it copies the value directly into the
left-hand side and skips the scanning process entirely.

Here's a step-by-step breakdown of the operation of \cmd{\trimspaces}
through two possibilities, one where both a leading and a trailing space
are present, and one where neither is present.
\begin{lcode}
------------------------------------------------------------------------
Case 1 (spaces present)              Case 2 (no spaces to be removed)
------------------------------------------------------------------------
\def\x{ {xyz} } \trimspaces\x        \def\x{{xyz}} \trimspaces\x

Step 1:                              Step 1:
\begingroup...                       Same as for Case 1.
\expandafter\trimb
\expandafter\noexpand\x Q Q}...

Step 2:              ||              Step 2:              ||
\trimb\noexpand {xyz} Q Q...         \trimb\noexpand{xyz}Q Q...
      ^^^^^^^^^^^^^^^                      ^^^^^^^^^^^^^^^
Here the row of ^^^ indicates the    In this case the first Q is taken
material that is taken as argument   up as part of #1, which is passed
#1 of \trimb, and || indicates the   to \trimc. The second Q added by
tokens that match the macro          \trimb therefore falls after the
delimiter. #1 is now passed to       leftover Q instead of before.
\trimc, with another Q token added;
the leftover Q token pair follows.

Step 3:              |               Step 3:             |
\trimc\noexpand {xyz}Q Q...          \trimc\noexpand{xyz}QQ...
      ^^^^^^^^^^^^^^^ ^^                   ^^^^^^^^^^^^^^ ^
Here we have #1, delimiter token Q,  The situation at the end of the
and #2. The space before the second  trimmed text ends up being the same
Q is skipped by TeX because it's     as in Case 1, except for the
looking for a nondelimited argument  absence of a space between the Qs.
for #2.

Step 4:                              Step 4:
\afterassignment\endgroup            \afterassignment\endgroup
\vfuzz\the\vfuzz\noexpand {xyz}}...  \vfuzz\the\vfuzz\noexpand{xyz}}...
                         ^
Here the ^ marks the leading space
that is to be removed.

Step 5: \endgroup{xyz}}...           Step 5: \endgroup{xyz}}...
\endgroup is from \afterassignment.

Step 6:                              Step 6:
\toks0{{xyz}}                        \toks0{{xyz}}
^^^^^^^---from \aftergroup           ^^^^^^^---from \aftergroup
\edef\x{\the\toks0}                  \edef\x{\the\toks0}
\end{lcode}
\end{solution}

\begin{comment}
========================================================================
That's a wrap on Exercise 15.

Michael Downes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
mjd@math.ams.org (Internet)
ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput

\chapter{Assorted numbers, skips, and modes}

\section{Exercise}
%%\input{ex016}
% ex016.tex
\begin{comment}
Date: 13 Jan 1994 16:42:27 -0500 (EST)
From: Michael Downes
Subject: Around the Bend #16
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List
************************************************************************
*** Exercise 16:
\end{comment}

\ed{\oposted{1994/01/13}. \arch{exercise.016}.}

Predict the messages that will be produced by plain TeX for the following
test file.
\begin{lcode} \catcode`\@=11 \newcount\m \def\msg#1{\advance\m 1 \message{(\number\m): #1}} \def\T{\msg{T}}\def\F{\msg{F}} \mag=1728 \hfuzz=1pt \tabskip=1pt \baselineskip=12pt \topskip=10pt \lineskiplimit=1pt \lineskip=1pt \setbox0\vbox{% \mag=\time \ifnum\mag>1500 \T\else\F\fi % (1) \mag=\number\year \ifnum\mag>1500 \T\else\F\fi % (2) \hfuzz=99pt \ifdim\hfuzz=99pt \T\else \F\fi % (3) \tabskip=\z@ \ifdim\tabskip<\p@\T\else\F\fi % (4) \tabskip=\p@ minus2pt \ifdim\tabskip>\z@\T\else\F\fi % (5) \baselineskip=-\prevdepth \ifdim\baselineskip=12pt \T\else\F\fi % (6) \advance\baselineskip 2\topskip % (7) \ifdim\baselineskip>\@m\p@ \T\else\F\fi % \lineskiplimit=\z@ \ifnum\lineskiplimit>0 \T\else\F\fi % (8) \lineskip=\z@skip \ifdim\lineskip>\lineskiplimit \T\else\F\fi % (9) \kern2pc\ifdim\lastkern=2pc \T \else\F\fi % (10) \hskip1em \ifvmode\T\else\ifdim\lastskip>\z@\msg{FT}\else\msg{FF}\fi\fi % (11) \font\cmrtest=cmr10 \ifx\cmrtest\tenrm \T\else\F\fi % (12) } \end \end{lcode} Where should \cmd{\relax} be inserted? \begin{comment} ************************************************************************ Answers will be posted circa January 27, 1994. Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \section{Answers} %%\input{ans016} % ans016.tex \begin{comment} [There was an error in the first posted version: \twelverm instead of the first \tenrm in the statement \font\tenrm = \fontname\tenrm scaled 1200 The posting containing this correction is appended below.] Date: 27 Jan 1994 11:59:48 -0500 (EST) From: Michael Downes Subject: Around the Bend #16, answers To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1994/01/27}. \arch{answer.016}.} Here is my commentary on Around the Bend \#16. 
\begin{lcode} % \mag=1728 \hfuzz=1pt \tabskip=1pt \baselineskip=12pt % \topskip=10pt \lineskiplimit=1pt \lineskip=1pt % \mag=\time \ifnum\mag>1500 \T\else\F\fi % (1) \end{lcode} (1): F --- At the time of the \piif{ifnum}, \cmd{\mag} is in the range [0,1440) depending on what time it was when you ran TeX. \begin{lcode} % \mag=\number\year \ifnum\mag>1500 \T\else\F\fi % (2) \end{lcode} (2): F --- At the time of the \piif{ifnum}, \cmd{\mag} still has its previous value because TeX is still scanning for digits to add on after `1994'. \begin{lcode} % \hfuzz=99pt \ifdim\hfuzz=99pt \T\else \F\fi % (3) \end{lcode} (3): T --- Everything fine, dimension scanning terminated with the space after `99pt'. \begin{lcode} % \tabskip=\z@ \ifdim\tabskip<\p@\T\else\F\fi % (4) \end{lcode} (4): F --- \cmd{\z@} is a dimension register, therefore it serves only as the first part of the glue value that TeX is looking for. At the time of the \piif{ifdim}, TeX is still looking for `plus' or `minus' and hasn't yet finished the assignment of \cmd{\tabskip}. \begin{lcode} % \tabskip=\p@ minus2pt \ifdim\tabskip>\z@\T\else\F\fi % (5) \end{lcode} (5): T --- Glue value scanning terminated properly. \cmd{\p@} is a dimension register like \cmd{\z@} but the additional clause `minus 2pt' fills out the glue value to the required three parts. TeX assumes `plus 0pt' when it finds a `minus' clause without a preceding `plus' clause. Note that TeX does \emph{not} continue scanning for a possible `plus' after reading a minus component. Unlike the height, depth, and width components of a \cmd{\vrule} or \cmd{\hrule}, the components of a glue value have a required order and each part can only occur once. \begin{lcode} % \baselineskip=-\prevdepth \ifdim\baselineskip=12pt \T\else\F\fi % (6) \end{lcode} (6): T --- At the beginning of a vbox or at the beginning of a TeX run \cmd{\prevdepth} = -1000pt. 
So it would seem that \cmd{\baselineskip} should get set to +1000pt and the test should be False; but \cmd{\prevdepth} is a dimension register, not a glue register, so following stretch or shrink components are still possible, and \cmd{\baselineskip} does not yet have its new value at the time of the test. \begin{lcode} % \advance\baselineskip 2\topskip % (7) % \ifdim\baselineskip>\@m\p@ \T\else\F\fi % \end{lcode} (7): F --- Without the factor 2 in front of \cmd{\topskip}, the test would be True: \cmd{\topskip} is a glue register so TeX would copy each component of \cmd{\topskip} to the corresponding component of \cmd{\baselineskip}; then, having plus and minus components already in hand, TeX would not scan ahead for `plus' or `minus'. However, a preceding factor for a glue register causes TeX to use only the first component of the glue register, multiplied by the given factor, which means that additional scanning is then attempted for possible stretch or shrink components. \begin{lcode} % \lineskiplimit=\z@ \ifnum\lineskiplimit>0 \T\else\F\fi % (8) \end{lcode} (8): F --- Normal termination of dimension scanning. \cmd{\lineskiplimit} is a dimen register, not a glue register, so the dimen constant \cmd{\z@} is sufficient to complete the assignment and TeX scans no further. \begin{lcode} % \lineskip=\z@skip \ifdim\lineskip>\lineskiplimit \T\else\F\fi % (9) \end{lcode} (9): F --- Normal termination of glue scanning. \cmd{\z@skip} is a glue register so it suffices to complete the assignment of \cmd{\lineskip}. Compare to the \cmd{\tabskip} assignments above. \begin{lcode} % \kern2pc\ifdim\lastkern=2pc \T \else\F\fi % (10) \end{lcode} (10): F --- At the time of the \piif{ifdim}, TeX is still looking for an optional final space at the end of the dimension value `2pc'. If it were \verb?2\p@? instead of \verb?2pc?, the test would evaluate to True. 
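
The effect described in (10) can be sketched in isolation. The fragment
below is an illustration for plain TeX and is not part of the original
test file; it uses two separate boxes so that the kern appended by the
first test cannot influence the second, and it calls \cmd{\message}
directly instead of the \cmd{\T} and \cmd{\F} helpers.
\begin{lcode}
% No space after `2pc': \ifdim is expanded while TeX is still looking
% for the optional space, so the kern is not yet on the list.
\setbox0\vbox{%
  \kern2pc\ifdim\lastkern=2pc \message{T}\else\message{F}\fi}% F
% A space after `2pc' ends the dimension; the kern is appended first.
\setbox0\vbox{%
  \kern2pc \ifdim\lastkern=2pc \message{T}\else\message{F}\fi}% T
\end{lcode}
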
\begin{lcode} % \hskip1em % \ifvmode\T\else\ifdim\lastskip>\z@\msg{FT}\else\msg{FF}\fi\fi % (11) \end{lcode} (11) FF --- TeX enters horizontal mode as soon as the \cmd{\hskip} command comes along, before it finishes scanning the skip amount. So the \piif{ifvmode} test is false. The \piif{ifdim} test is also false because scanning is not yet complete (TeX is looking ahead for a plus or minus component) so the glue has not yet been entered into the horizontal list, so it is not accessible to \cmd{\lastskip}. For more on the switch into horizontal mode, see `TeX from \cmd{\indent} to \piif{par}', Marek Ry{\'c}ko and Bogus{\l}aw Jackowski, TUGboat 14/3, October 1993 (1993 Annual Meeting Proceedings), pp. 171--176. \begin{lcode} % \font\cmrtest=cmr10 \ifx\cmrtest\tenrm \T\else\F\fi % (12) \end{lcode} (12) F --- Interestingly, the following versions of the \piif{ifx} test are also false at that point: \begin{lcode} \ifx\cmrtest\undefined, \ifx\cmrtest\relax. \end{lcode} The reason is that after `\verb?\font\cmrtest?' TeX immediately sets \verb?\cmrtest = \nullfont?, before scanning the rest of the font assignment. So the test \verb?\ifx\cmrtest\nullfont? would yield True. According to the \emph{TeXbook}, the reason for this behavior is to allow statements of the form \begin{lcode} \font\cmrtest=cmr10 \cmrtest \end{lcode} for switching to the font \cmd{\cmrtest} immediately after it is defined. 
TeX does a bit of boomeranging in such a case: \begin{lcode} \font\cmrtest % set \cmrtest = \nullfont =cmr10 % space terminates font name, start looking for % "at" or "scaled" \cmrtest % \cmrtest = \nullfont = nonexpandable, not % "a", not "s"; terminate the font assignment % and put back the \cmrtest token to be read % again: \cmrtest % Now \cmrtest selects the given font \end{lcode} Although I sympathize with Knuth's desire to smooth out a potential problem for naive users, I wonder if it only encourages users to pay less attention to the nitty-gritty details of scanning and expansion, and therefore lay themselves open to greater confusion later on when something similar fails (inconsistently!) to work. I'd have thought it better to require, and document, proper termination of font assignment scanning by \cmd{\relax} or whatever. Users would have to be a little more knowledgeable but they would be rewarded with a more consistent language to work with. As it stands TeX unnaturally forbids certain constructions that are perfectly colloquial to anyone who has an ear for the TeX language, such as \begin{lcode} \font\tenrm = \fontname\tenrm\space scaled 1200 \end{lcode} I hold a similar opinion for the way \cmd{\chardef} and \cmd{\mathchardef} set their arguments to \cmd{\relax} before scanning the number on the right-hand-side of the assignment. Occasionally I would \emph{like} to be able to write something like \begin{lcode} \chardef\foo=\ifcase\foo 1\or 2\else 3\fi \end{lcode} but TeX doesn't allow that. One could argue that the \cmd{\chardef} behavior should for consistency be imitated by \cmd{\edef}, \cmd{\xdef} so that if \cmd{\foo} is undefined then \begin{lcode} \edef\foo{a\foo} \end{lcode} should not give an undefined control-sequence error for the \cmd{\foo} in the replacement text, but make it temporarily equivalent to \cmd{\relax} and leave it there. 
(Of course, this means that executing \cmd{\foo} will then start up an infinite loop, but my point was that it's the behavior of \cmd{\chardef} that should be changed to achieve consistency, not the behavior of \cmd{\edef}.)

%%%========================================================================

At the end of Exercise \#16 there was the question `Where should \cmd{\relax} be inserted?'

\cmd{\relax} should be inserted just before the \piif{if}... in statements (2), (6), (7), (11), and (12). In statement (4) \cmd{\z@skip} should be used instead of \cmd{\z@}; then \cmd{\relax} is unnecessary. A space suffices instead of \cmd{\relax} in (10).

I would also tend to put a \cmd{\relax} at the end of the preliminary assignments to \cmd{\baselineskip} and \cmd{\lineskip}, as a matter of principle; I like to make sure that scanning is definitely terminated at the end of a line, so that if any error occurs during the scanning, TeX will show the line containing the assignment statement and not a later line. This is particularly relevant for font assignments: If \pfile{foo10.tfm} does not exist on your system, then the assignment
\begin{lcode}
\font\foo=foo10
\end{lcode}
followed by a blank line will cause TeX to show you the blank line instead of the preceding line in the error context:
\begin{lcode}
! Font \foo=foo10 not loadable: Metric (TFM) file not found.
<to be read again>
                   \par
l.2
\end{lcode}
And if the following material is some complicated macro instead of a blank line, TeX will go into the replacement text of the macro, looking for `at' or `scaled', before giving the error message!
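
As a minimal sketch of the defensive habit just described (assuming plain TeX; the dimensions are arbitrary illustrative values, and \pfile{foo10.tfm} is the same hypothetical missing font as above), each assignment can be closed off with \cmd{\relax} on its own line:
\begin{lcode}
\baselineskip=12pt \relax % stops the search for `plus' or `minus'
\lineskip=1pt \relax
\font\foo=foo10 \relax    % stops the search for `at' or `scaled'
\end{lcode}
With the \cmd{\relax} in place TeX needs to look no further than the end of the assignment's own line, so any error raised during the scanning should be reported against that line rather than a later one.
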
\begin{comment} Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ Date: 28 Jan 1994 08:01:12 -0500 (EST) From: Michael Downes Subject: Around the Bend #16, answers, correction To: info-tex@shsu.edu Instead of \font\twelverm = \fontname\tenrm\space scaled 1200 read \font\tenrm = \fontname\tenrm\space scaled 1200 The latter line is what I originally wrote but I changed it in an obtuse moment a day later, forgetting the very point it was supposed to illustrate. \end{comment} %$ %%\endinput \chapter{Missing \cs{input} file} \section{Exercise} %%\input{ex017} % ex017.tex \begin{comment} Date: 14 Jan 1994 12:44:13 -0500 (EST) From: Michael Downes Subject: Around the Bend #17 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1994/01/15}. \arch{exercise.017}.} %%************************************************************************ %%*** Exercise 17: When TeX cannot find an input file it prompts with `Please enter another input file name:'. On some systems you can enter `nul' in response to this prompt to have TeX input a null file and continue processing. On most systems TeX also allows you to enter a system-dependent end-of-file character (Control-Z (DOS, VMS), Control-D (Unix), ...?), to which it responds with an "Emergency stop" instead of continued processing. An alternative would be to maintain a file called `\pfile{.tex}' containing an error message so that merely pressing RETURN would cause TeX to read `\pfile{.tex}' and issue the error message. Unlike the null file case or EOF-character case, this would allow normal access to the full menu of error recovery options, including e.g., exiting to an editor, inserting or deleting tokens, or changing the interaction mode. 
It would probably be nice to have the file also accessible under various aliases `\pfile{h.tex}', `\pfile{help.tex}', `\pfile{?.tex}', `\pfile{q.tex}', `\pfile{quit.tex}', `\pfile{x.tex}', `\pfile{exit.tex}', or `\verb?@#&@%$.tex?' corresponding to typical responses from stumped users. But making a robust `\pfile{.tex}' file for input error recovery is not so simple a task as might first seem. One needs to take into account, for example, the possibility that an \cmd{\input} might be attempted when normal catcodes or normal \cmd{\endlinechar} are not in effect. Given the programmability of TeX, an all-encompassing solution is probably not possible, so this exercise has two parts: consider what would be a reasonable minimal set of assumptions for an input error recovery file; and write a \pfile{.tex} file containing a suitable error message and satisfying the assumptions. %%************************************************************************ Motivation: From \url{comp.text.tex}: \begin{lcode} > From: wayne@csri.toronto.edu (Wayne Hayes) > Subject: Why does TeX ignore interupts??? > Message-ID: <1993Dec24.000935.2007@jarvis.csri.toronto.edu> > Date: 24 Dec 93 05:09:35 GMT > > If there's ONE thing that annoys me more than anything about a program, > it's when it refuses to die on command, and for no good reason. The > absolute worst case is when it's waiting for input and you don't know > what to tell it, and would like to quit for now. > > Thus my extreme annoyance every time I mistype an \input command to TeX > and it asks me on the terminal "Please input another file name: ", and > I usually just want to exit and re-edit my file to fix the \input > error. But TeX refuses to die when I press ^C at this moment, and will > only die if I send a QUIT (^\), at which point it dumps a > multi-megabyte core file into the current directory. ARGGGHHHH!! Why > does it do this? I can't see any good reason why it ignores interupts > at this point. Is this intended? 
Is it a bug? Does it drive anyone > else as nuts as it drives me?? Can it be changed in the next release??? \end{lcode} It's puzzling that most of the implementations of TeX I know of don't check for the interrupt key possibility at this prompt [Textures notably cuts clean through the problem by popping up a dialog box if an input file is not found]. Seems as if interrupt-key checking at that point would be a desirable addition to the set of system-dependent changes for each system. \begin{comment} A summary will be posted circa February 17, 1994. Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \section{Answers} %%\input{ans017} % ans017.tex \begin{comment} [The TUGboat article mentioned below appeared as [info not yet available--18-Aug-1994]] Date: 17 Mar 1994 13:04:36 -0500 (EST) From: Michael Downes Subject: Around the Bend #17, answers To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List \end{comment} \ed{\oposted{1994/03/13}. \arch{answer.017}.} Exercise 17 (posted January 14) asked for an error recovery file to provide better recovery from file input errors: When TeX cannot find an input file, it prompts for an alternative file name and refuses to continue until a valid file name is entered or the user presses some (system-dependent) abort key. This can be rather unfriendly, especially for novice users. At the request of Barbara Beeton\index{Beeton, Barbara} (TUGboat's editor) I wrote up the results of this exercise as an article for publication in TUGboat, so this posting will be largely redundant with that article. 
%%%-------------------------------------
%%DON'T BOTHER, REDEFINE \cmd{\input} INSTEAD
\subsection{Don't bother, redefine \cs{input} instead}

Interestingly, both of the answers I received (from Victor Eijkhout\index{Eijkhout, Victor} and Donald Arseneau\index{Arseneau, Donald}) recommended redefining \cmd{\input} instead of trying to make an input error recovery file. Donald summed it up thus:
\begin{quotation}
Since verbatim file input is an important mainstream application, the task is hopeless. The right approach is to redefine \cmd{\input} and check for the file's existence at the macro level.
\end{quotation}
I.e., consider the way a typical \cmd{\verbfile} command works: first, start a group; next, deactivate all special characters such as \verb?\ { } # % }? by changing their catcodes; then input the desired file; and finally close the group to restore normal catcodes. If the desired file is not found and an input error recovery file (IERF) is read instead, the IERF will not be able to do anything because of the deactivation of \verb?\ { }? etc.

%%----------------------------------------------
%%DIFFICULTIES ASSOCIATED WITH REDEFINING \cmd{\input}
\subsection{Difficulties associated with redefining \cs{input}}

Generally speaking I am in favor of redefining \cmd{\input} (for instance, to make up for the deficiency in TeX that the current input file name is not accessible like \cmd{\jobname} or \cmd{\inputlineno}), but there are some practical problems:
\begin{itemize}
\item In order to serve all users, the redefinition of \cmd{\input} would have to go into plain TeX, LaTeX, and any other major macro packages that are not layered on top of plain TeX or LaTeX.

\item The most commonly used approach to test for the existence of an input file is
\begin{lcode}
\openin N=file.name
\ifeof N ...
\end{lcode}
but for some TeX implementations \cmd{\openin} will only open a file in the current directory, and not search through the entire `TeX inputs' path.
I believe that this restriction is canonical in \pfile{TeX.web} therefore only overridden by the system-dependent changes of each TeX implementation according to the judgment of the individual implementor. \item The details of how to redefine \cmd{\input} are nontrivial. If you redefine \cmd{\input} to take an argument delimited by a space, for example, there is some risk of bombing on existing files with statements like \begin{lcode} \input x.y\relax \end{lcode} It becomes especially nontrivial if you want to use some method other than simple \verb?\openin ... \ifeof? to test for file existence, so that the method will be reliable across all systems. It is worth noting that in LaTeX2e the \cmd{\input} command has been dramatically overhauled so that it solves, among other things, some of the problems mentioned here. Anyone doubting the claim that the work is nontrivial is invited to look at the LaTeX2e definitions. \item Redefining \cmd{\input} will (generally speaking) not help for the jobname file itself. When the file name is given on the command line, or following a ** prompt, the input operation is done directly by TeX instead of through invoking the control sequence \cmd{\input}. \item When a non-existing file is called for by a verb-file command, TeX will prompt the user for a file name, and then if a \pfile{.tex} recovery file exists, pressing \meta{return} will typeset the contents of that file; but this is at least as good as inputting a null file, in that you are not stuck at the prompt with no obvious way to quit. 
\end{itemize}

%%----------------------------------------------------------
%%SOMEBODY ALREADY PUBLISHED SOME INPUT ERROR RECOVERY FILES
\subsection{Somebody already published some input error recovery files}

Coincidentally, reading through one of my books a few days after posting Around the Bend \#17, I found that someone had already written and published a suite of input error recovery files: Frank Mittelbach\index{Mittelbach, Frank}, \emph{The LaTeX Companion}, section 14-4 \ed{First edition}.

%%------------------------------------------------------
%%BUT WHAT THE HECK, HERE ARE MY SLIGHTLY DIFFERENT ONES
\subsection{But what the heck, here are my slightly different ones}

The basic idea is to create a file named \pfile{h.tex} that will produce an \cmd{\errmessage}\verb?{...}? statement. Copies (or links) of this file will be made under several different names corresponding to the typical user responses to an input file error, to the extent that the operating system permits. So a first attempt would be something like this:
\begin{lcode}
\errmessage{Enter x to exit or ? to see other options}
\end{lcode}
Suppose we test this with a simple test file:
\begin{lcode}
% This is line 1
% This is line 2
\input fzrg \relax % This is line 3
% This is line 4
\end
\end{lcode}
The on-screen result looks like this:
\begin{lcode}
! I can't find file `fzrg.tex'.
l.3 \input fzrg
                \relax % This is line 3
Please type another input file name: h
(h.tex
! Enter x to exit or ? to see other options.
l.1 ... to exit or ? to see other options}
?
\end{lcode}
Then if the user enters \texttt{?} they will see
\begin{lcode}
Type <return> to proceed, S to scroll future error messages,
R to run without stopping, Q to run quietly,
I to insert something, E to edit your file,
1 or ... or 9 to ignore the next 1 to 9 tokens of input,
H for help, X to quit.
? x
\end{lcode}

Now let's examine this solution a little more closely, to ask what are the potential problems, and what assumptions can be done away with?
One problem is the possibility of an unusual catcode for space, question mark, left brace, right brace, backslash, or \cmd{\endlinechar}. For the backslash (and the letters) we don't have much choice; if they don't have normal catcodes, \pfile{h.tex} cannot issue an \cmd{\errmessage} command, or even try to fix up the catcodes. (This is why the problem of verbatim file input is insoluble, if primitive \cmd{\input} is used.) Note that for users of a macro package such as texinfo, which has \verb?@? for the escape character instead of backslash, a different IERF would be required. The \cmd{\endlinechar} problem can be solved by adding a percent sign at the end of the line: \begin{lcode} \errmessage{...}% \end{lcode} but at the cost of a new assumption: percent must have catcode 14. This and some of the other catcode assumptions can be removed with a bit of extra work: \begin{lcode} \begingroup\chardef\%37\catcode\%14\chardef\ 32\catcode\ 10\relax% \catcode123 1\catcode125 2\catcode63 12 % \errmessage{% Enter x to exit or ? to see other options}% \endgroup\endinput% \end{lcode} This enforces the desired catcodes for \verb|space, %, {, }, and ?|; and putting \% at the end of each line makes \cmd{\endlinechar} harmless, no matter what its prevailing value and catcode might happen to be. The \cmd{\begingroup} ... \cmd{\endgroup} pair of course keep the catcode changes local, just in case (though I expect that the user will normally choose to exit anyway). I write \begin{lcode} \chardef\%37\catcode\%14 \end{lcode} in preference to the alternatives \begin{lcode} \catcode37 14 \catcode37=14 \catcode37'16 \catcode37"E \catcode`\%14 \end{lcode} which require assuming a usable catcode for one extra character (space or = or ' or ...). Even using \cmd{\string}, as in \begin{lcode} \catcode37\string"E \end{lcode} would fail if \texttt{"} had catcode 5, 9, 10, 11, 14, or 15. Here now is the screen output produced by the above IERF: \begin{lcode} ! I can't find file `fzrg'. 
l.3 \input fzrg
                \relax % This is line 3
Please type another input file name: h
(h.tex
! Enter x to exit or ? to see other options.
l.5 Enter x to exit or ? to see other options}
                                              %
? x
\end{lcode}

%%------------------
%%BEST FINAL VERSION
\subsection{Best final version}

There is one fairly obvious drawback of the above IERF: the error message is shown twice on screen, once by \cmd{\errmessage} and once in the error context shown for line 5. There is a little trick that can be used to fix that: Use only the error context for showing the message text, by putting it in a comment rather than in the argument of \cmd{\errmessage}! [Cf.\ the comment after \cmd{\patterns} in the original TeX hyphenation patterns file \pfile{hyphen.tex}.]
\begin{lcode}
\begingroup\chardef\%37\catcode\%14\chardef\?63\catcode\?12\relax%
\chardef\{123\catcode\{1\chardef\ 32\catcode\ 2\relax%
\errmessage{Input\string canceled\string ..%
 % Enter x to exit or ? to see other options %
\endgroup\endinput%
\end{lcode}
I have thrown in some extra cleverness with the catcode of space to clean up the screen output a tiny bit more. The result looks like this:
\begin{lcode}
! I can't find file `fzrg'.
l.3 \input fzrg
                \relax % This is line 3
Please type another input file name: h
(h.tex
! Input canceled ...
l.4
     % Enter x to exit or ? to see other options %
? x
\end{lcode}

Frank Mittelbach's IERF solution differs from mine by providing a set of files that attempt to mimic standard TeX error recovery according to their name: The file \pfile{s.tex}, for example, arranges to switch into \cmd{\scrollmode} and continue processing, as would happen if you entered `s' at a normal error message prompt. And there are files named \pfile{e.tex}, \pfile{x.tex}, \pfile{q.tex} that mimic the corresponding error message actions. His IERFs also don't bother to worry about possible odd catcodes for \{, space, \}, etc.---an approach whose simplicity perhaps outweighs the minor added robustness of my version.
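
To make the comparison concrete, a file in the spirit of Frank's \pfile{s.tex} might contain little more than the following. This is a guess at the general shape, not a transcription of the published file, and it assumes that normal catcodes are in effect when the file is read:
\begin{lcode}
% s.tex -- hypothetical sketch of a scroll-mode recovery file
\scrollmode % behave as if `s' had been typed at an error prompt
\endinput   % stop reading this file and resume the interrupted job
\end{lcode}
A file this simple makes no attempt to repair odd catcodes, which is exactly the trade-off between simplicity and robustness described above.
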
%%----------- %%CONCLUSIONS \subsection{Conclusions} It seems that it would be a worthy service to their users if the authors of all TeX implementations took a second look at how input file errors are handled and added suitable actions depending on the operating system. For example, under DOS it is difficult to create a file named \pfile{.tex}, so perhaps emTeX, PCTeX, TurboTeX, etc., should check for the case when the user presses the \meta{return} key at the prompt, and automatically exit instead of trying to input a highly improbable file! Similar arguments would hold for an input file name of \pfile{?} or \pfile{?.tex} for operating systems where \texttt{?} is an OS wild-card character. And another part of improving the input error handling might be to add to their standard distributions a set of IERFs in the TeX inputs area, to help users who are using some macro package \emph{other} than LaTeX2e. (Or, even for LaTeX2e users, to help in the case when it is the jobname file itself that was not input-able.) I recommend of course my IERF given above; my feelings would not be deeply wounded, however, if Frank's version gets used instead. Installing either version would be much better for end users than none at all. \begin{comment} Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \chapter{Page breaking} \section{Exercise} %%\input{ex018} % ex018.tex \begin{comment} Date: 21 Apr 1994 09:48:48 -0400 (EDT) From: Michael Downes Subject: Around the Bend #18 To: info-tex@shsu.edu X-ListName: TeX-Related Network Discussion List ======================================================================== *** Exercise 18: \end{comment} \ed{\oposted{1994/04/21}. 
\arch{exercise.018}.} On page 254 of the \emph{TeXbook} the following output routine is described: \begin{lcode} \output={\unvbox255 \penalty\outputpenalty} \end{lcode} and in the ensuing text Knuth writes `If the \cmd{\vsize} hasn't changed, and if no insertions have been held over, the same page break will be found.' This claim is rather false. Why? How should the output routine be rewritten to work as intended? %%======================================================================== Thanks to William Baxter\index{Baxter, William} %(web@superscript.com) for contributing this question. \begin{comment} Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% mjd@math.ams.org (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456 789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ \end{comment} %$ %%\endinput \section{Answers} %%\input{ans018} % ans018.tex \begin{comment} Date: 27 May 1994 08:19:39 -0400 (EDT) From: Michael Downes Subject: Around the Bend #18, answer To: info-tex@shsu.edu \end{comment} \ed{\oposted{1994/05/27}. \arch{answer.018}.} I intended to post this sooner but in researching the answer it turned out that in order to clear up a couple of nagging questions I had to follow some side trails a long way. %%Answer to Around the Bend #18: Exercise 18 (21 April 1994) pointed out that the output routine \begin{lcode} \output={\unvbox255 \penalty\outputpenalty} \end{lcode} described in the \emph{TeXbook} p 254 doesn't exactly work as intended: `If the \cmd{\vsize} hasn't changed, and if no insertions have been held over, the same page break will be found.' The same pagebreak will be found only if the original page break occurred at a penalty item. Otherwise (\emph{TeXbook}, p 125) TeX sets \cmd{\outputpenalty}\texttt{=10000} before firing up the user's output routine. Consequently, the output routine constructs a vertical list in which the original break point has disappeared. 
By an optimization found in section 890 of \emph{TeX: The Program}, the penalty between two paragraph lines---the sum of all applicable penalties from the set \cmd{\interlinepenalty}, \cmd{\clubpenalty}, \cmd{\widowpenalty}, \cmd{\displaywidowpenalty}, and \cmd{\brokenpenalty}---is not actually added to the vertical list unless it is nonzero. Thus when \cmd{\interlinepenalty} = 0 (default from IniTeX/plain TeX) and hyphenated lines are not too frequent, `most' pairs of lines in a paragraph have no intervening penalty. And there is usually no penalty between ordinary text paragraphs. Thus an \cmd{\outputpenalty} value of 10000 will occur fairly often in practice. W. E. Baxter\index{Baxter, William}\index{Baxter, W E|see{Baxter, William}} (the submitter of this exercise) looked into the possibility of recompiling TeX without the cited optimization, but found that the resulting version fails the trip test. In order for the example to work as intended it would have to be rewritten as \begin{lcode} \output={\unvbox255 \ifnum\outputpenalty=10000 \else \penalty\outputpenalty\fi} \end{lcode} For completeness it should be pointed out that the output routine could come even closer to the goal of `doing nothing' if the parameter \cmd{\holdinginserts}, added in TeX version 3.0 (circa 1990), were set to some value greater than 0, so that the state of floating inserts would be preserved; but that has to be done before the output routine is entered. I would have said that such a do-nothing output routine is useless, but as a matter of fact I wrote something rather close to it as one cycle of a multi-cycle output routine a couple of years ago. The goal was to look at the values of \cmd{\pagetotal}, \cmd{\pagestretch}, etc in order to print a complete survey of the page contents in a marginal note, to help the person dealing with page break decisions when the automatic breaks turned out to be inadequate. 
Unfortunately, the values of \cmd{\pagetotal} etc reported in the output routine are not exactly the values that are needed, because if the page break did not occur at a forcing penalty ($\le -10000$) then the values include material on the recent contributions list, yet only the material up to the chosen page break is relevant. So in order to get accurate values I had to insert a do-almost-nothing cycle that merely inserted a forcing penalty at the break point after dumping the contents of \texttt{box255} back on the main vertical list.

%%------------------------------------------------------------------------
\subsection{Some historical research}

If you have an older copy of the \emph{TeXbook} (pre-1990), as I do, the above-mentioned section on p 125 about \cmd{\outputpenalty} says that it is set to 0 (rather than 10000) if the break did not occur at a penalty item. Thus the output routine example on p 254 seems to be another case of a well-known phenomenon: documentation failing to keep up with changes in the software. Make a note of it in your copy!

Excerpt from the \emph{TeXbook} errata files:
\begin{verbatim}
\bugonpage A125, lines 13--29 (9/23/89)

\ddanger \looseness=-1
When the best page break is finally chosen, \TeX\ removes everything
after the chosen breakpoint from the bottom of the ``current page,''
and puts it all back at the top of the ``recent contributions.''
The chosen breakpoint itself is placed at the very top of the recent
contributions. If it is a penalty item, the value of the penalty is
recorded in ^|\outputpenalty| and the penalty in the contribution
list is changed to $10000$; otherwise |\outputpenalty| is set to 10000.
\end{verbatim} It's not clear to me from a cursory examination of \pfile{tex82.bug}, \pfile{errata-five.tex}, and \pfile{tex.web} when this change occurred in \pfile{tex.web}, but it seems that it must have occurred rather early, perhaps in the work on TeX82 (1982--1983); if so, then the claim that outputpenalty was set to 0 was a five-year-old oversight when Knuth changed it in 1989. In \pfile{tex82.bug} there is no reference to output\_penalty or even inf\_penalty near 9/23/89, and tracing backwards from there didn't turn up anything that seemed relevant to me. Furthermore, a copy of TeX version 2 (circa 1985) that I was able to dig up had outputpenalty 10000 instead of 0, following the erratum, and my 1986 copy of \emph{TeX: The Program} (i.e. the woven version of tex.web) agrees with that. Thanks again to W. E. Baxter\index{Baxter, William} for contributing this exercise and several parts of the answer. %%\endinput \chapter{Author lists} %%\input{bend019} % bend019.tex \section{Exercise (hard)} \ed{\oposted{1994/08/23}} First, an announcement: Archive copies of exercises and solutions in the Around the Bend series are now available over the network, thanks to the ongoing remarkably fine service of CTAN (\url{ftp.shsu.edu}, \url{ftp.dante.de}, \url{ftp.tex.ac.uk},\ldots). Look in the directory \url{tex-archive/info/aro-bend}. %======================================================================== %%*** Exercise 19 (hard): In a multi-author LaTeX article, author names are normally given as a list with \cmd{\and} separating the names, for example \begin{lcode} Arthur B. Clark\and Damian Edlan\and Ferency G. van Hoep \end{lcode} The way the author names are laid out on the printed page may vary widely from one publication to another. The generic `article' documentclass provides a definition for \cmd{\and} to print the author names together with their addresses in an array form. 
But there is no support in basic LaTeX to print such a list of names in standard series form
\begin{lcode}
A             (1 author)
A and B       (2 authors)
A, B, and C   (3+ authors)
\end{lcode}

\begin{enumerate}
\item Write a macro \cmd{\andlist} to convert a list of author names to series form. Assume that the names reside in a macro \cmd{\@author}. Suggested tests:
\begin{lcode}
\def\test#1{\def\@author{#1}%
  % Convert contents of \@author, leave result in \@temp:
  \andlist\@author\@temp
  % Examine the result
  \message{\@temp}}

\test{Arthur B. Clark}
\test{Arthur B. Clark\and Damian Edlan}
\test{Arthur B. Clark \and Damian Edlan \and Ferency G. van Hoep}
\test{Arthur B. Clark \and Damian Edlan \and Ferency G. van Hoep
  \and Irene Jackson}
\end{lcode}
to produce
\begin{lcode}
Arthur B. Clark
Arthur B. Clark and Damian Edlan
Arthur B. Clark, Damian Edlan, and Ferency G. van Hoep
Arthur B. Clark, Damian Edlan, Ferency G. van Hoep, and Irene Jackson
\end{lcode}

Extra credit:

\item Discuss the relative merits of the following alternatives:
  \begin{enumerate}
  \item \verb?\andlist\@authors\@temp? The function \cmd{\andlist} takes two macro names as arguments, converts the contents of the first macro and leaves the result in the second macro.
  \item \verb?\andlist\@authors? The function \cmd{\andlist} takes one macro name as its argument and replaces the contents of the macro with the converted version of its contents.
  \item \verb?\andlist\@authors? The function \cmd{\andlist} takes one macro name as its argument; the converted contents of the macro are executed instead of being put back into the macro.
  \item other?
  \end{enumerate}

\item Extend your definition of \cmd{\andlist} to make it easy to change the material placed between names, for example, to omit the last comma in a list of three or more names, or to use small-caps for the word `and', or to put each name in a box to prevent a line break within a name, or to put a `good break' penalty after each comma.
\item Consider the relative merits of different data structures:
\begin{lcode}
1. A\and B\and C
2. A,B,C
3. \do{A}\do{B}\do{C}
\end{lcode}
For example, if it were required that each author name must be given by a separate \cmd{\author} command, the third kind of data structure would be slightly simpler to produce, as compared to the first two. Having the data in the second form might make it possible for \cmd{\andlist} to use some of the pre-existing internal routines in LaTeX for processing comma-separated lists. And so forth.
\end{enumerate}

%%========================================================================

As usual, creative variations---such as using token registers instead of macros---are encouraged if their aptness is evident or explained.

Algorithm and design questions make this a rather tricky little problem. (Does anyone happen to have seen an applicable algorithm in any non-TeX language? I imagine it may be needed in some SGML applications.)

Solutions will be posted circa September 12, 1994.

%%Michael Downes

\section{Editor's notes}

I have not been able to find where, or even if, any answers were posted, which is unfortunate as I think that it is a useful exercise. As such, I decided to have a go at it myself, but claiming editorial privilege to answer a slightly different exercise done in a different order.

The basic question is how to convert a list of names separated by a particular token (\cmd{\and} in the exercise) to a list of the same names with different separators (for example `,'). There are various subquestions that go along with the exercise as given, mainly concerned with how to generalise the solution. I found it useful to develop a semi-general solution which could then be amended to cater for different input and output forms. Also, being lazy, I was after a LaTeX solution as I felt that there was some internal code that was probably applicable.
There are basically three separators that may appear in the final list: \begin{itemize} \item If there is only a single name in the list, no separator is required. \item If there are two names then a separator is required between them, call this \cmd{\pairsep}. \item If there are three or more names in the list then there is a separator between the penultimate and last name (call this \cmd{\lastsep}), and separators between all the previous names, and I'll call this \cmd{\midsep}. \end{itemize} In the initial exercise as given these are, respectively, `and', `, and' and `,'. The implication here is that for the general case of more than two entries we need to know when we are coming to the end of the list so that we can insert \cmd{\lastsep} just before outputting the last list entry. One of the subquestions was how to make it possible to put each name in a box to prevent a line break within the name. To do this implies that each name should be output as the argument of a macro, say \cmd{\opname}, that can be used to perform some action on the name. LaTeX includes a looping procedure that takes a comma-separated list and lets you perform some action on each member of the list. Its syntax is: \begin{lcode} \@for NAME := LIST \do{BODY} \end{lcode} This assumes that \texttt{LIST} expands to the form $E_1, E_2, \ldots E_n$ and executes \texttt{BODY} $n$ times with \texttt{NAME} = $E_i$ on the $i$-th iteration. This is what I will use as the basis of my solution. Here's my basic general solution, where the list of names is of the form \texttt{A,B,C,D,\ldots N}. I'm assuming that this is in a \pfile{.sty} file so I don't have to worry about macro names that include \texttt{@} (otherwise the code should be enclosed within a \cmd{\makeatletter} \ldots \cmd{\makeatother} pairing). 
\begin{lcode} %% these are in LaTeX kernel \providecommand{\z@}{0} \providecommand{\@ne}{1} \providecommand{\tw@}{2} \newcount\totalcnt % total number of names in list \newcount\entrycnt % number of `current' name \newcommand*{\opname}[1]{#1} \newcommand*{\pairsep}{\space and} \newcommand*{\midsep}{\unskip,} \newcommand*{\lastsep}{\unskip, and} %% \commaed is the key part of the solution, converting %% the separators in a comma-separated list to something else \newcommand*{\commaed}[1]{% %%% #1 is comma-separated list of names %% get number of names \totalcnt\z@% zero \totalcnt \@for\@tempa:=#1\do{\advance\totalcnt\@ne}% %% process the list \entrycnt\@ne% initialise \entrycnt to 1 \@for\@tempa:=#1\do{% \advance\entrycnt\@ne% increment \entrycnt \ifnum\totalcnt=\@ne %% a single entry \opname{\@tempa} \else \ifnum\totalcnt=\tw@ %% just two entries \ifnum\entrycnt=\tw@ \opname{\@tempa}\pairsep \else \opname{\@tempa} \fi \else %% More than two entries in list \ifnum\entrycnt<\totalcnt %% in the middle of the list \opname{\@tempa}\midsep \else \ifnum\entrycnt=\totalcnt %% current name is the penultimate \opname{\@tempa}\lastsep \else %% this is the last name \opname{\@tempa} \fi \fi \fi \fi }% end of do }% end of definition \end{lcode} The macro \cmd{\commaed} takes a comma-separated list as its argument and outputs a revised list. 
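
As a sketch of how the separators might be varied to cover some of the subquestions (the particular redefinitions below are illustrative choices, not part of the exercise as posed):
\begin{lcode}
%% drop the serial comma and set `and' in small caps
\renewcommand*{\lastsep}{\unskip\space\textsc{and}}
%% encourage a line break after each comma
\renewcommand*{\midsep}{\unskip,\penalty-100 }
%% keep each name in one piece
\renewcommand*{\opname}[1]{\mbox{#1}}
\end{lcode}
Note that \cmd{\@for} does not trim spaces around the list entries, so the spacing after each separator relies on the space typed after each comma in the list, and with the \cmd{\mbox} version any space typed before or after a name ends up inside the box; a more careful implementation would strip those spaces first.
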
\newcount\totalcnt % total number of names in list
\newcount\entrycnt % `current' name
\newcommand*{\opname}[1]{#1}
\newcommand*{\pairsep}{\space and}
\newcommand*{\midsep}{\unskip,}
\newcommand*{\lastsep}{\unskip, and}
\makeatletter
\newcommand*{\commaed}[1]{%
  %%% #1 is comma-separated list of names
  %% get number of names
  \totalcnt\z@% zero \totalcnt
  \@for\@tempa:=#1\do{\advance\totalcnt\@ne}%
  %% process the list
  \entrycnt\@ne% initialise \entrycnt to 1
  \@for\@tempa:=#1\do{%
    \advance\entrycnt\@ne% increment \entrycnt
    \ifnum\totalcnt=\@ne %% a single entry
      \opname{\@tempa}
    \else
      \ifnum\totalcnt=\tw@ %% just two entries
        \ifnum\entrycnt=\tw@
          \opname{\@tempa}\pairsep
        \else
          \opname{\@tempa}
        \fi
      \else %% More than two entries in list
        \ifnum\entrycnt<\totalcnt %% in the middle of the list
          \opname{\@tempa}\midsep
        \else
          \ifnum\entrycnt=\totalcnt %% current name is the penultimate
            \opname{\@tempa}\lastsep
          \else %% this is the last name
            \opname{\@tempa}
          \fi
        \fi
      \fi
    \fi
  }% end of do
}% end of definition
\makeatother

The macro \cmd{\testcommaed} can be used to test \cmd{\commaed}. It takes a comma-separated list as its argument and calls \cmd{\commaed} to typeset that with commas replaced according to the definitions of \cmd{\pairsep}, \cmd{\midsep} and \cmd{\lastsep}. The macro \cmd{\opname} is used to typeset the elements. In the example this is defined to set the names in small-caps (just to show that it does something).
\begin{lcode}
\renewcommand*{\opname}[1]{\textsc{#1}}
\newcommand*{\testcommaed}[1]{%
  \def\alist{#1}%
  \commaed{\alist}}
\end{lcode}
\renewcommand*{\opname}[1]{\textsc{#1}}
\newcommand*{\testcommaed}[1]{%
  \def\alist{#1}%
  \commaed{\alist}}
\def\AL#1{\textit{Originally: \alist}}

Some results are shown below.
\begin{itemize}
\item \verb?\testcommaed{Arthur B. Clark} ->? \\
      \testcommaed{Arthur B. Clark}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan} ->? \\
      \testcommaed{Arthur B. Clark, Damian Edlan}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan ,? \\
      \verb?Ferency G. van Hoep} ->? \\
      \testcommaed{Arthur B. Clark, Damian Edlan , Ferency G. van Hoep}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan,? \\
      \verb?Ferency G. van Hoep , Irene Jackson} ->? \\
      \testcommaed{Arthur B. Clark, Damian Edlan, Ferency G. van Hoep , Irene Jackson}
\end{itemize}

The macro \cmd{\anded} is similar to \cmd{\commaed} except that the separator between list elements is \cmd{\and} instead of a comma. It is implemented using \cmd{\commaed}.
\begin{lcode}
\newcommand*{\anded}[1]{%
  \def\and{, }
  \edef\Alist{#1}
  \commaed{\Alist}}
\newcommand{\testanded}[1]{%
  \def\alist{#1}%
  \anded{\alist}}
\end{lcode}
\newcommand*{\anded}[1]{%
  \def\and{, }
  \edef\Alist{#1}
  \commaed{\Alist}}
\newcommand{\testanded}[1]{%
  \def\alist{#1}%
  \anded{\alist}}

The macro \cmd{\testanded} provides a means of testing \cmd{\anded} and some results are given below.
\begin{itemize}
\item \verb?\testanded{Arthur B. Clark} ->? \\
      \testanded{Arthur B. Clark}
\item \verb?\testanded{Arthur B. Clark\and Damian Edlan} ->? \\
      \testanded{Arthur B. Clark\and Damian Edlan}
\item \verb?\testanded{Arthur B. Clark \and Damian Edlan\and? \\
      \verb?Ferency G. van Hoep} ->? \\
      \testanded{Arthur B. Clark \and Damian Edlan\and Ferency G. van Hoep}
\item \verb?\testanded{Arthur B. Clark\and Damian Edlan\and? \\
      \verb?Ferency G. van Hoep \and Irene Jackson} ->? \\
      \testanded{Arthur B. Clark\and Damian Edlan\and Ferency G. van Hoep \and Irene Jackson}
\end{itemize}

Finally, here is an answer to Michael's initial exercise (with a change in the names of macros to avoid the use of \texttt{@}). This is built on the \cmd{\anded} macro. Test results are shown after the code definitions.
\begin{lcode}
\newcommand*{\andlist}[2]{
  \def\intermediate{\anded{#1}}
  \let#2=\intermediate}
\def\test#1#2{%
  \def\alist{#1}
  \andlist{\alist}{\Alist}}
\end{lcode}
\newcommand*{\andlist}[2]{
  \def\intermediate{\anded{#1}}
  \let#2=\intermediate}
\def\test#1#2{%
  \def\alist{#1}
  \andlist{\alist}{\Alist}}

\begin{itemize}
\item \verb?\test{Arthur B. Clark}{\Alist} \Alist ->? \\
      \test{Arthur B. Clark}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark\and Damian Edlan}{\Alist} \Alist ->? \\
      \test{Arthur B. Clark\and Damian Edlan}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark \and Damian Edlan\and? \\
      \verb?Ferency G. van Hoep}{\Alist} \Alist ->? \\
      \test{Arthur B. Clark \and Damian Edlan\and Ferency G. van Hoep}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark\and Damian Edlan\and? \\
      \verb?Ferency G. van Hoep \and Irene Jackson}{\Alist} \Alist ->? \\
      \test{Arthur B. Clark\and Damian Edlan\and Ferency G. van Hoep \and Irene Jackson}{\Alist} \Alist
\end{itemize}

I think that I have shown enough for you to code answers to the `extra credit' questions.

By now, it should be obvious that I find the \verb?A,B,C...? data structure to be advantageous compared with the \verb?A\and B\and C...? structure because of the LaTeX \cmd{\@for} code I used. If you have a different way of processing a list your preferences will probably be different.

%%\endinput

\chapter{Math symbols}
%%\input{bend020}
% bend020.tex

\section{Exercise}
\ed{\oposted{1994/08/30}}

%%%*** Exercise 20:
Why does plain.tex define \cmd{\surd} like this:
\begin{lcode}
\def\surd{{\mathchar"1270}}
\end{lcode}
instead of like this:
\begin{lcode}
\mathchardef\surd="0270
\end{lcode}
?
%========================================================================
% Michael Downes
\begin{lcode}
%%%% Self-decoding answer: run the following text through plain TeX %%%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode\m 13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m 12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi \else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1 6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\f"6F\u\f\m\i\m32\u\f\m\c\m12\i\m35~ 8\">zxv)cv8xc0\sv)2zv?z\sv},{doo;sz$;"0xsZZ;U^)2l2^x~}%,O{hhvjxcs0lz"v^v U^)2cxsv^)cUv>9)2v)2zv"LUecNo7zx)9l^NNLvlz\)zxzsvc\v)2zvU^)2v^E9"mvFN^"" v%fff)2zv$9x")vs9+9)fffU^Gz"o^vU^)2cjv^)cU_v>2c"zvlc\)z\)"v^xzvlz\)zxzsv eLv`z|v9$v)2zLv^xzv\c)29\+oe0)v^v"9\+Nzv$c\)vl2^x^l)zxkv)2zvzE)x^v"z)vc$ vex^lz"v)2z\vl^0"zv`z|v)coj^lGv)2zvlz\)zxzsvl2^x^l)zxv9\)cv^vU^)2cxsv^)c U_vxz"0N)9\+v9\v)2zosz"9xzsvU^)2cxsv"j^l9\+vc\v)2zvNz$)v^\svx9+2)mv=\v)2 zvc)2zxv2^\so;U^)2l2^xsz$;"0xsy~}{,O{_v>29Nzv")9NNvjxcs0l9\+v^vU^)2cxsv^ )cU_v>c0NsoL9zNsv^vxz^NNLv9\)zxz")9\+vjc"9)9c\vc$v)2zv"LUecNvCjxce^eNLv\ c)v>2^)vLc0o>c0Nsv+0z""kv)xLv9)v^\sv"zzJmvF$mvR0Nzv%%v9\v8jjz\s9Evbvc$v` 2zv`z|eccGm >c0Nsv+0z""kv)xLv9)v^\sv"zzJmvF$mvR0Nzv%%v9\v8jjz\s9Evbvc$v`
\end{lcode}

\section{Answer}
\begin{comment}
%%%% the result of TeXing the above
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
entering extended mode
(./codeans20.tex
Answer to Around the Bend #20:
\end{comment}

\ed{I ran the above through pdfTeX and it produced the following (less the formatting that I added to the plain ASCII) as the answer. I suspect, though, that the command \cs{ver} below is a typo and should not be there.}

\begin{lcode}
\def\surd{{\mathchar"1270}}
\end{lcode}
produces a mathord atom with the symbol vertically centered on the math axis.
Class 1---the first digit---makes a mathop atom, whose contents are centered by TeX if they are nothing but a single font character; the extra set of braces then causes TeX to pack the centered character into a mathord atom, resulting in the desired mathord spacing on the left and right. On the other hand
\begin{lcode}
\ver\mathchardef\surd="0270
\end{lcode}
while still producing a mathord atom, would yield a really interesting position of the symbol (probably not what you would guess; try it and see). Cf. Rule 11 in Appendix G of \emph{The TeXbook}.

%%\endinput

\chapter{Variable number of arguments}
%%\input{bend021}
% bend021.tex
\begin{comment}
\documentclass{memoir}
\usepackage{bend}
\usepackage{comment}
\usepackage{url}
\begin{document}
\end{comment}

\section{Remarks}
\ed{\oposted{2002/09/13}}

Back in the days when there existed an INFO-TeX mail list whose postings were automatically piped (by suitable arrangements) into \url{comp.text.tex}, I launched a thing called `Around the Bend' with the following explanation:
\begin{quote}
[Date: Thu 10 Oct 91]

I would like to propose a regular department for INFO-TeX, called `Around the Bend'. It will consist of macro-writing challenges on the level of the dangerous-bend exercises in the \emph{TeXbook}, with interested parties invited to collaborate and/or compete to find the best solution. My motivation for doing this is partly selfish: to get more feedback from other macro writers about some of the interesting macro-writing problems that I run into.
\end{quote}

There was never any attempt to establish a regular schedule for Around the Bend postings; I simply would do another one whenever I ran across an interesting problem, if I was able to spare some time to do so. The series is archived at \url{CTAN:pub/tex/info/aro-bend} for anyone who has an interest in looking at it. I also noticed that the exercises and answers are available in \url{comp.text.tex} archives through \url{groups.google.com}.
In response to a question on July 24, 2002 from Antoine Chambert-Loir\index{Chambert-Loir, Antoine} (with apologies for the delay in answering): \begin{quote} \ldots why did 'Around the Bend' stop? There were nice challenges proposed there. \end{quote} I am tempted to say `Well, actually they didn't stop, there was just an unusually large gap in the aperiodic schedule'. But what I also wanted to say is that there are others quite as capable as I am of devising good Around the Bend exercises---I am thinking of a recent post by David Kastrup\index{Kastrup, David} about a completely expandable string comparison macro---and it occurred to me it might be better to invite interested parties to sign up for an informal `editorial board' to issue further exercises, so that other demands on my time do not have such a dampening effect on the rate of output. I don't have any desire to put restrictions on what goes out in continuation of the series apart from a (fairly crucial) one of striving for high quality and creativity. Send e-mail if you are interested, to the address below. There are only some obvious questions of coordination to address, such as trying (I think) to avoid two different people posting different exercises at the same time. Turning now to the next exercise, prompted by a recent \url{comp.text.tex} question from David Reitter\index{Reitter, David}: %======================================================================== %%*** Exercise 21: \section{Exercise} Define a macro that takes a variable number of arguments. Do it in the best way possible. 
For the sake of concreteness, consider this somewhat contrived example as a test case that your solution should be able to handle, though possibly using a different syntax:
\begin{lcode}
\printdate                         -> today's date in preferred form
\printdate[Tuesday]                -> Tuesday
\printdate[Tuesday][17]            -> Tuesday the 17th
\printdate[Tuesday][17][9]         -> Tuesday, September 17th
\printdate[Tuesday][17][9][2002]   -> and so on
\printdate[Tuesday][17][9][2002][Gregorian calendar] -> and so forth
\end{lcode}
The lines above illustrate six different ways of calling the \cmd{\printdate} macro. The macro should print something appropriate in each case, but the exact form of the output is a matter of taste; it need not follow exactly what I have given here. Part of a good solution will be a good analysis of why one way might be better than another.

The solution that I came up with is based on the question from David Reitter\index{Reitter, David} that originally inspired this exercise, thus it assumes the context is LaTeX and tries to solve the problem in a way that is natural for LaTeX. A straightforward solution based on existing examples of multiple-option commands in the LaTeX kernel would qualify as natural, but definitely not elegant since that would require defining a separate macro to handle each stage of the multiple option scanning. Non-LaTeX solutions are also considered to be of interest.

%========================================================================
I suggest posting your answers directly to comp.text.tex instead of mailing them to me (as was done in the past), though depending on how late you stayed up working on this entertaining exercise instead of writing your thesis or balancing your checkbook as you \emph{ought} to have been doing, you might want to beware of posting in haste and wait until you have had some sleep and a chance to reread what you wrote, to avoid embarrassing oversights [\ldots said he, speaking from experience].
Please e-mail a copy in addition (or instead, if you like) to the Around the Bend Editorial Board ... hmm, that gives me an idea \ldots [pausing to consult the dictionary] make that the Supremely Honorable, Ingenious and, in Special Honor of Knuth, Around the Bend Editorial Board---whose size will not long remain one I dare say, especially after the establishment of this glamorous name---at \url{@pobox.com} %%Regards, Michael Downes \begin{comment} target=_parent>...@ams.org (Michael J Downes) writes:

> ======================================================================== > *** Exercise 21: > Define a macro that takes a variable number of arguments. Do it in the > best way possible. For the sake of concreteness, consider this somewhat > contrived example as a test case that your solution should be able to > handle, though possibly using a different syntax:

>   \printdate                         -> today's date in preferred form
>   \printdate[Tuesday]                -> "Tuesday"
>   \printdate[Tuesday][17]            -> "Tuesday the 17th"
>   \printdate[Tuesday][17][9]         -> "Tuesday, September 17th"
>   \printdate[Tuesday][17][9][2002]   -> and so on
>   \printdate[Tuesday][17][9][2002][Gregorian calendar] -> and so forth
\end{comment}

\section{Answers}

%\textbf{David Kastrup (2002/09/14)}
\begin{solution}{Solution 1 (David Kastrup)}\index{Kastrup, David}
\ed{\oposted{2002/09/14}}

\begin{lcode}
\def\printdate{\count@\z@\toks@{}\printdate@a}
\def\printdate@a{\@ifnextchar[{\printdate@b}{\printdate@c}}
\def\printdate@b[#1]{\toks@\expandafter{\the\toks@{#1}}%
  \advance\count@\@ne\printdate@a}
\def\printdate@c{\csname printdate@@\romannumeral\count@
  \expandafter\endcsname\the\toks@}
\end{lcode}
You can now define the one-argument macro \cmd{\printdate@@i}, the 5-argument macro \cmd{\printdate@@v} and so on.

\cmd{\printdate@c} might also contain other stuff. For testing, we just define it as
\begin{lcode}
\def\printdate@c{\message{\number\count@\space arguments: \the\toks@}}
\end{lcode}
This needs the LaTeX macro \cmd{\@ifnextchar}, of course.

If you want to have various defaults in sequence and just want to call \cmd{\printdate@@v}, you could write something like
\begin{lcode}
\def\printdate@c{\let\gobble@x\relax\expandafter\newcommand
  \expandafter\gobble@x\expandafter[\number\count@]{}%
  \edef\next{{Tuesday}{17}{9}{2002}{Gregorian calendar}%
    \the\toks@}\expandafter\expandafter\expandafter
  \printdate@@v\expandafter\gobble@x\next}
\end{lcode}
Ok, this latter proposal is ugly. Better ideas?
% -- David Kastrup, Kriemhildstr. 15, 44793 Bochum Email:
\end{solution}

\begin{solution}{Solution 2 (mine)}
\ed{\oposted{2002/09/20}}
%\textbf{Michael J Downes (Sep 20, 2002)}

Exercise 21 said:
\begin{quote}
Define a macro that takes a variable number of arguments.
\end{quote}
and gave the following example application:
\begin{lcode}
\printdate                         -> today's date in preferred form
\printdate[Tuesday]                -> Tuesday
\printdate[Tuesday][17]            -> Tuesday the 17th
\printdate[Tuesday][17][9]         -> Tuesday, September 17th
\printdate[Tuesday][17][9][2002]   -> and so on
\end{lcode}
My solution (see below), written with LaTeX in mind, has the following characteristics:
\begin{itemize}
\item The kernel of the solution is not specific to a particular user-level command; for each user-level command, only two command-specific macros are needed: the top-level one invoked by the user, and the internal one that handles all the arguments. By contrast, the standard LaTeX method of handling multiple options requires a separate command-specific macro for each step of the argument scanning.
\item The number of optional arguments is quasi-limited. The number of default values that you give in a command's definition becomes an upper limit on the number of arguments that will be scanned for. And if you supply twenty default values, the code that ends up handling them will have to be more than a simple TeX macro since macro arguments only go up to 9.
\item Commands defined with this method can be nested, because the delimiters for the optional arguments are regular curly braces \verb?{ }?, not square brackets [ ].
\end{itemize}

The choice of square brackets in LaTeX for optional arguments is OK for arguments whose values are suitably restricted, but when used for arguments that may contain arbitrary text---in particular, other commands with optional arguments---it becomes a pitfall that many users have fallen into over the years, generally costing them an amount of lost time in inverse proportion to their understanding of catcodes. (I.e., its worst effects are on the kind of users that LaTeX was intended to serve in the first place.) The most common examples in practice are perhaps \cmd{\twocolumn}\verb?[...]?
and \verb?\begin{thm}[...]?, but it could also happen in the optional arguments of \cmd{\section}, \cmd{\caption}, or \cmd{\cite}. The chief argument against using braces for optional arguments came out coincidentally in another thread only a couple of days ago, as stated by Heiko Oberdiek\index{Oberdiek, Heiko} on \url{comp.text.tex} \begin{comment} (<am6mb5$a1...@n.ruf.uni-freiburg.de> comp.text.tex 17 Sep 2002): \end{comment} %$ \begin{quote} How do you want to distinguish between a parameter and a group, both enclosed in \verb?"{}"? Example: \begin{lcode} \foo{bar}{\bfseries bla} \end{lcode} \end{quote} But in practice it seems to me that this is not a significant drawback. Savvy users would normally use the \verb?\textbf{...}? form anyway (I hope). In fact the \verb?"{\whatever ...}"? form (called a \emph{declaration} in the LaTeX book) is, in a certain sense, quite unnatural for a linear language like TeX where the macro expansion works by simple left-to-right substitution. At least, if used at document level such a syntax makes it unnecessarily difficult to remap the functions involved and therefore is a stumbling block in many special applications. For example, it becomes feasible to add italic corrections automatically only when we use the \cmd{\emph}\verb?{...}? form rather than the \verb?{?\cmd{\em}\verb?...}? form. (There is an \cmd{\aftergroup} trick that would sort of do the job but only by placing some assumptions on the usage that do not hold in the real world.) %%%Regards, Michael Downes %
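
\ed{To make the square-bracket pitfall described above concrete, here is a minimal sketch; the citation key \texttt{key} and the caption texts are invented for illustration. An optional argument ends at the first \texttt{]} that is not hidden inside braces, so in the first call below it ends inside the \cs{cite}; wrapping the contents in an extra pair of braces, as in the second call, is the usual workaround.}
\begin{lcode}
% the ] of \cite ends the optional argument of \caption too early
\caption[Short text \cite[p.~3]{key}]{Long text}
% an extra brace group hides the inner brackets
\caption[{Short text \cite[p.~3]{key}}]{Long text}
\end{lcode}
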

------------------------------------------------------------------------
\begin{lcode}
\documentclass{article}
\usepackage{ifmtarg}
\makeatletter

% If \MyCmd is defined as
%   \VariableArgs{\MyCode ...}{{Default1}{Default2}}
% then
%   \MyCmd        -> \MyCode...{Default1}{Default2}
%   \MyCmd{aaa}   -> \MyCode...{aaa}{Default2}
%   \MyCmd{a}{bc} -> \MyCode...{a}{bc}
% In other words, \VariableArgs takes two arguments,
% and if the invocation via \MyCmd finds $n$ actual arguments, the first
% $n$ default values are replaced by the actual arguments.
%
% In principle the number of optional arguments is "whatever \MyCode is
% able to handle" but if the number of defaults is $d$ then scanning
% will stop as soon as $d$ arguments have been read, if not before.
% In practice things will begin to get unwieldy after a dozen or so
% arguments, because the process of scanning one more
% actual argument involves rescanning the whole list of arguments
% each time (actual arguments read previously plus any remaining defaults).
\newcommand{\VariableArgs}[2]{%
  \toks@{#1}%
  \@ifnextchar\bgroup{\AddArg #2{}@}{#1#2}}

\def\AddArg#1#2@#3{%
  \toks@\expandafter{\the\toks@{#3}}%
  \edef\RunIt{\the\toks@}%
  \@ifnextchar\bgroup{%
    \ifx @#2@%
      \begingroup \def\AddArg{\endgroup \expandafter\RunIt\@gobble}%
    \fi
    \AddArg #2@%
  }{%
    \RunIt #2%
  }%
}

\newcommand{\printdate}{%
  % If zero args, use \today.
  \VariableArgs{\PrintDateFive}{{\today}{}{}{}{}}}

% This example is slightly more complicated than necessary because it
% behaves differently depending on the number of arguments.
\newcommand{\PrintDateFive}[5]{%
  % Always print #1, which might be \today (from the default value).
  #1%
  \@ifnotmtarg{#2#3#4#5}{%
    % If only #1 & #2 are given, use a slightly different form.
    \@ifmtarg{#3#4#5}{ the}{,}%
    % Args 2,3,4,5: Print each one if nonempty, but rearranging the
    % order slightly.
    \@ifnotmtarg{#3}{ \MonthName{#3}}%
    \@ifnotmtarg{#2}{ \OrdinalDay{#2}}%
    \@ifnotmtarg{#4}{, #4}%
    \@ifnotmtarg{#5}{ (#5)}%
  }}

\def\MonthName#1{%
  \ifcase 0#1 \number\month\or January\or February\or March\or April\or
  May\or June\or July\or August\or September\or October\or November\or
  December%
  \else Thirteen's Month\fi}

% If #2 is not a digit, use #1
\def\LastDigit#1#2{%
  \ifodd 0#21 \else #1\expandafter\@gobbletwo\fi\LastDigit #2}

\def\OrdinalDay#1{#1%
  \ifcase\LastDigit #1\space th\or st\or nd\or rd\else th\fi}

\begin{document}
\noindent
Testing:
\begin{enumerate}\setcounter{enumi}{-1}
\item \printdate
\item \printdate{Tuesday}
\item \printdate{Tuesday}{17}
\item \printdate{Tuesday}{17}{9}
\item \printdate{Tuesday}{17}{9}{2002}
\item \printdate{Tuesday}{17}{9}{2002}{Gregorian calendar}
\end{enumerate}
\end{document}
\end{lcode}
\end{solution}

\begin{solution}{Solution 3 (Donald Arseneau)}\index{Arseneau, Donald}
%%\textbf{Donald Arseneau (2002/09/24)}
\ed{\oposted{2002/09/24}}

*** Exercise 21: \\
Define a macro that takes a variable number of arguments.
\begin{lcode}
\printdate[Tuesday][17][9][2002][Gregorian calendar] -> and so forth
\end{lcode}

I did it (actually before MD posed the challenge) using \verb?{ }?, not \verb?[ ]?, and this answer does not match the challenge in other ways. But I haven't got around to working on it in the last week or so. Two features notably missing are: error checking for a bad number when specifying the number of arguments, and provision of default values for omitted arguments (they are all null here). (I also think I could make do with one fewer \cmd{\MultiArgCollect} macro.)

I think \verb?{}? delimiters really are the `best way' in regards to nesting macros. The one problem is confusion with non-explicit \verb?{?, and so I handle the most common case of \cmd{\bgroup}.
\begin{lcode}
\makeatletter
\let\MultiArgBgroup={
\def\MultiArg#1#2{\begingroup
  \let\bgroup\begingroup \let\egroup\endgroup
  \expandafter\MultiArgCollect\romannumeral\number#1001\delimiter{#2}}
\def\MultiArgCollect#1{\csname MultiArgCollect#1\endcsname}
\def\MultiArgCollectm#1\delimiter#2{%
  \@ifnextchar\MultiArgBgroup
    {\MultiArgCollectA#1\delimiter{#2}}%
    {\MultiArgCollect#1\delimiter{#2{}}}}
\def\MultiArgCollectA#1\delimiter#2#3{%
  \MultiArgCollect#1\delimiter{#2{#3}}}}
\def\MultiArgCollecti#1\delimiter#2{\endgroup#2}

\newcommand\DeclareMultiArgCommand[2]{\expandafter
  \Declare@MultiArg@ \csname MA\string_\string#1\endcsname{#1}{#2}}
\def\Declare@MultiArg@#1#2#3{%
  \DeclareRobustCommand{#2}{\MultiArg{#3}{#1}}
  \newcommand{#1}[#3]}

\DeclareMultiArgCommand {\printdate}{6}{...}
\end{lcode}
\end{solution}

%%\endinput

\indexintoc
\printindex

\end{document}