FormatPrinter

Overview

This topic describes the implementation of the Lisp Format function. The function produces formatted output according to the arguments provided. There are 4 types of destinations to which the result of the function can be sent. The control string contains the directives used for formatting. A tilde introduces each of the 26 directive characters. The directives, along with the prefix parameters and modifiers that can be placed in front of them, specify the kind of formatting required.

Syntax

format destination control-string &rest args

References

HyperSpec
    Format Function
    22.3 Formatted Output
CLtL
    22.3.3. Formatted Output to Character Streams

Implementation

The table lists the classes for the Lisp format function.
Class
Format
FormatHelper

FormatHelper (lisp.system.format.FormatHelper)

The FormatHelper class contains the body of the Format function written in Java. Methods of the class are used to format the output. It includes directive methods and directive helper methods that produce the same output as the Lisp Format function. The class uses mostly Java types so that it can use Java facilities such as regular expressions (java.util.regex.*), Formatter (java.util.Formatter), and Number Formatter (java.text.NumberFormat). The regular expressions used are the directives (formatRegexp), the directive with prefixes and modifiers (subRegexp), the case clause (caseRegexp), the conditional clauses (condRegexp), the iteration clauses (iterRegexp), and the justification clauses (justRegexp). The regular expressions are declared as global and are precompiled.

Method(s) - FormatHelper (lisp.system.format.FormatHelper)

Method
Constructor
    FormatHelper
Basic Output
    character
    percent
    ampersand (and)
    page
    tilda
Radix Control
    radix
    decimal
    binary
    octal
    hexadecimal
Floating-Point Printers
    fixed format (floating)
    exponential (exponent)
    general (generalFloat)
    dollar
Printer Operations
    ascii
    s-expression (sDir)
Layout Control
    tabulate
    justification
Control-Flow Operations
    asterisk (star)
    conditional
    iteration
    indirection
Miscellaneous Operations
    case
    plural
Miscellaneous Pseudo-Operations
    escape
    newline
Helper Function
    comma
    padding
    vNumPrefix
    vPrefix

FormatHelper
The method is called with the destination (Object), control string (Lisp String), and arguments (Lisp List). The destination type is determined and stored. The method converts the other parameters into Java types and then uses the regular expression that represents the directives (formatRegexp) to determine whether the control string is valid for the Format function. This is done with the Java matcher and matches functions. If it is valid, the control string is split on the tilde into individual directive strings using the Java split function, and each one is rematched with formatRegexp. Each directive is matched with subRegexp to determine whether there are prefixes and/or modifiers; if they are present, the directive string is split on the comma to store the prefixes and modifiers. Prefixes can be a number, a character, #, or V, depending on the parameters accepted by the directive. They are placed after the tilde and separated by commas. Modifiers are stored as follows: : as 3, @ as 4, and :@ as 3 or 4. The modifiers are placed directly in front of the directive letter. The directive letter is stored and used to call the correct method. The methods called from FormatHelper use the Java formatter function with the %s conversion unless otherwise noted.

If the individual directive string does not rematch with formatRegexp, then it is a case, conditional, iteration, or justification directive. This is because these directives contain tildes themselves, so when the control string is split on the tilde, the resulting pieces no longer match formatRegexp.
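The splitting of a simple directive into prefixes, modifiers, and a directive letter can be sketched as follows. This is only an illustration: the pattern is a hypothetical stand-in, not the project's formatRegexp or subRegexp, and the DirectiveScanner class and scan method are names invented for this example. It does not handle the nested case, conditional, iteration, or justification forms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DirectiveScanner {
    // Hypothetical pattern (an assumption, not the project's regexps):
    //   group 1 = comma-separated prefixes (number, 'char, V, or #)
    //   group 2 = modifiers (:, @, or both)
    //   group 3 = directive character
    private static final Pattern DIRECTIVE = Pattern.compile(
        "~((?:[+-]?\\d+|'.|[Vv#])?(?:,(?:[+-]?\\d+|'.|[Vv#])?)*)([:@]{0,2})(.)",
        Pattern.DOTALL);

    // Returns one {prefixes, modifiers, letter} triple per directive found.
    static List<String[]> scan(String control) {
        List<String[]> found = new ArrayList<>();
        Matcher m = DIRECTIVE.matcher(control);
        while (m.find()) {
            found.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return found;
    }
}
```

For example, scanning "~10,'0D" yields the prefix string 10,'0 with no modifiers and the directive letter D. The pattern is compiled once into a static field, in the same spirit as the precompiled expressions described for FormatHelper.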

Case Directive
The case directive ~(str~) is contained in the FormatHelper method and converts the case of the output. If the directive string matches ~( as represented in caseRegexp, the method enters the case branch and stores the case type. It then exits the branch for the next directive, which is the control string enclosed by the directive. This goes through the FormatHelper method and is output with the case type. After this, the ~) is matched to caseRegexp, the case branch is entered again, and the case type is reset to the default.
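The four case types defined by the HyperSpec (~( lowercase, ~:( capitalize every word, ~@( capitalize the first word only, ~:@( uppercase) can be sketched as below. The CaseConversion class and its apply method are hypothetical names for this example, and treating a word as a run of letters or digits is an assumption modeled on string-capitalize.

```java
import java.util.Locale;

final class CaseConversion {
    // colon and at are the : and @ modifiers of the ~( directive.
    static String apply(String text, boolean colon, boolean at) {
        if (colon && at) {
            return text.toUpperCase(Locale.ROOT);          // ~:@(
        }
        String lower = text.toLowerCase(Locale.ROOT);
        if (!colon && !at) {
            return lower;                                  // ~(
        }
        StringBuilder out = new StringBuilder(lower.length());
        boolean inWord = false;
        boolean capitalized = false;   // set once the first word start is seen
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            boolean wordChar = Character.isLetterOrDigit(c);
            // ~:( capitalizes every word start; ~@( only the first one.
            if (wordChar && !inWord && !(at && capitalized)) {
                out.append(Character.toUpperCase(c));
                capitalized = true;
            } else {
                out.append(c);
            }
            inWord = wordChar;
        }
        return out.toString();
    }
}
```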

Iteration Directive
The iteration directive ~{str~} is contained in the FormatHelper method and iterates through str, which is used as the control string. If the directive string matches ~{ as represented in iterRegexp, the method enters the iteration branch and stores the control string. It calls a new FormatHelper for the number of repetitions specified.

Justification Directive
The justification directive ~<str~> is contained partly in the FormatHelper method, and the rest of the directive is in the justification method. It produces justified text from str. The justification branch is entered by matching justRegexp with ~<. If the directive is present, a boolean is set to keep track of it. The branch stores the str segments contained by the directive until ~> is found. The result is then stored into the original directive control strings, which calls the justification method. The method formats the clauses as specified by the prefixes and modifiers.
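A minimal sketch of the unmodified justification behavior follows: a single segment is right justified, and with several segments the padding is spread over the gaps between them. The JustifySketch class is a hypothetical name, and giving any leftover spaces to the leftmost gaps is an assumption of this example, not something the source specifies.

```java
import java.util.List;

final class JustifySketch {
    // Lays out the segments in a field of at least mincol columns.
    static String justify(List<String> segments, int mincol) {
        int textLength = 0;
        for (String s : segments) {
            textLength += s.length();
        }
        int pad = Math.max(0, mincol - textLength);
        if (segments.size() == 1) {
            return spaces(pad) + segments.get(0);   // special case: right justify
        }
        int gaps = segments.size() - 1;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.size(); i++) {
            out.append(segments.get(i));
            if (i < gaps) {
                // Assumption: leftover spaces go to the leftmost gaps.
                out.append(spaces(pad / gaps + (i < pad % gaps ? 1 : 0)));
            }
        }
        return out.toString();
    }

    private static String spaces(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }
}
```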

Conditional Directive
The conditional directive ~[str0~;str1~;...strn~] is contained in the FormatHelper method. It contains a set of control string clauses, and only one is chosen to be used. If the directive string matches ~[ as represented in condRegexp, the method enters the conditional branch. The branch loops through and stores the clauses until ~] is found. Then the correct clause is selected and stored in the original directive string.
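Clause selection can be sketched as below, assuming the body between ~[ and ~] has already been extracted and contains no nested conditionals. The ConditionalSketch class and choose method are hypothetical names. Following the HyperSpec, an out-of-range index selects no clause unless the last separator is ~:;, in which case the last clause acts as the else clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConditionalSketch {
    // Matches the clause separators ~; and ~:; (group 1 holds the optional colon).
    private static final Pattern SEPARATOR = Pattern.compile("~(:?);");

    static String choose(String body, int index) {
        List<String> clauses = new ArrayList<>();
        boolean lastIsElse = false;
        int start = 0;
        Matcher m = SEPARATOR.matcher(body);
        while (m.find()) {
            clauses.add(body.substring(start, m.start()));
            lastIsElse = !m.group(1).isEmpty();   // true when the separator is ~:;
            start = m.end();
        }
        clauses.add(body.substring(start));
        if (index >= 0 && index < clauses.size()) {
            return clauses.get(index);
        }
        // Out of range: use the else clause if there is one, otherwise nothing.
        return lastIsElse ? clauses.get(clauses.size() - 1) : "";
    }
}
```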

Character Directive
The character directive ~C outputs the Lisp character argument according to the modifier. The @ modifier calls the Lisp character getPreferredName method. CLforJava will not be supporting the control bit output of the : modifier.

Percent Directive
The percent directive ~% outputs a newline character or the number of newlines specified by the prefix, and no argument is used.

Ampersand Directive
The ampersand directive ~& outputs a newline character or the number of newlines specified by the prefix, and no argument is used. It should output the first newline only if the output is not already at the beginning of a line, but this is not supported by CLforJava, so a newline is always output.

Page Directive
The page directive ~| (vertical bar) outputs a page separator character (^L) or the number of page separators specified by the prefix.

Tilde Directive
The tilde directive ~~ outputs a tilde or the number of tildes specified by the prefix.

Radix Directive
The radix directive ~R prints the argument in the radix specified in the prefix. If no prefix is given, it outputs the number contained in the argument in words or Roman numerals according to the modifier. CLforJava does not support the cardinal English number (~R), the ordinal English number (~:R), or the old Roman numeral (~:@R). It will support the Roman numeral (~@R) when the RomanNumeralEngine is complete.
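The prefixed form of the directive (for example ~16R) can be sketched with java.math.BigInteger, which already converts to any radix from 2 to 36. The RadixSketch class is a hypothetical name, and printing digits above 9 in uppercase is an assumption of this example.

```java
import java.math.BigInteger;
import java.util.Locale;

final class RadixSketch {
    // Prints n in the given radix, as ~radixR would for an integer argument.
    static String print(BigInteger n, int radix) {
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("radix must be between 2 and 36");
        }
        // Assumption: letters are printed in uppercase.
        return n.toString(radix).toUpperCase(Locale.ROOT);
    }
}
```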

Decimal Directive
The decimal directive ~D outputs the integer argument in decimal radix. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The method uses the Java formatter function with the %d conversion. The decimal method is shared by all of the integer radix directives (D, B, O, and X). It is passed the directive letter and formats the argument accordingly.

Binary Directive
The binary directive ~B outputs the integer argument in binary radix. The decimal method is passed the directive letter. It sets the print-radix to T and print-base to 2 and calls the PrintObject function to format the argument as a binary number. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed.

Octal Directive
The octal directive ~O outputs the integer argument in octal radix. The decimal method is passed the directive letter. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The argument is converted to a Java BigInteger in order to use the Java formatter function with the %o conversion for formatting.

Hexadecimal Directive
The hexadecimal directive ~X outputs the integer argument in hexadecimal radix. The decimal method is passed the directive letter. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The argument is converted to a Java BigInteger in order to use the Java formatter function with the %x conversion for formatting.
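The dispatch on the directive letter described for the decimal method can be sketched as below. The IntegerFormatSketch class is a hypothetical name. java.util.Formatter accepts a BigInteger for the %d, %o, and %x conversions but has no binary conversion, so this sketch uses BigInteger.toString(2) for B, which is an assumption and differs from the PrintObject route the text describes.

```java
import java.math.BigInteger;
import java.util.Locale;

final class IntegerFormatSketch {
    // Formats n according to the directive letter D, B, O, or X.
    static String format(BigInteger n, char directive) {
        switch (Character.toUpperCase(directive)) {
            case 'D': return String.format(Locale.ROOT, "%d", n);
            case 'O': return String.format(Locale.ROOT, "%o", n);
            case 'X': return String.format(Locale.ROOT, "%x", n);
            case 'B': return n.toString(2);   // assumption: Formatter has no binary conversion
            default:
                throw new IllegalArgumentException("not an integer directive: " + directive);
        }
    }
}
```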

Fixed-Format Floating-Point Directive
The fixed-format floating-point directive ~F outputs the argument as a floating-point number. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The argument is converted to a Java Double in order to use the Java formatter function with the %f conversion for formatting.

Exponential Floating-Point Directive
The exponential floating-point directive ~E outputs the argument in exponential notation. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The argument is converted to a Java Double in order to use the Java formatter function with the %E conversion for formatting.
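The use of the %f and %E conversions can be sketched as follows, with the width and number of decimal digits standing in for the w and d prefixes. The FloatFormatSketch class and its method names are hypothetical, and fixing the locale to Locale.ROOT is an assumption made so that the decimal separator is always a period.

```java
import java.util.Locale;

final class FloatFormatSketch {
    // Roughly ~w,dF: at least w columns wide, d digits after the point.
    static String fixed(double value, int w, int d) {
        return String.format(Locale.ROOT, "%" + w + "." + d + "f", value);
    }

    // Roughly ~,dE: d digits after the point, exponent printed by Java's %E.
    static String exponential(double value, int d) {
        return String.format(Locale.ROOT, "%." + d + "E", value);
    }
}
```

Note that Java prints the exponent with a sign and at least two digits (for example E+03), which is not the same layout the Lisp printer uses, so the output would still need adjusting to match the Lisp Format function.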

General Floating-Point Directive
The general floating-point directive ~G outputs the argument in fixed-format or exponential notation. The size of the argument determines which method is called, either ~F or ~E. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed.

Dollar Floating-Point Directive
The dollar floating-point directive ~$ outputs the argument in fixed-format floating-point notation, like currency but without the $ in front. Prefixes and modifiers can be provided to format the output. The method calls the padding and comma methods if needed. The argument is converted to a Java Double in order to use the Java Number Format currency function. The argument is converted to currency, and the $ is removed from the output.
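The currency approach described above can be sketched with java.text.NumberFormat. The DollarSketch class is a hypothetical name. Using Locale.US and turning off digit grouping are assumptions of this example; the source does not say which locale or grouping setting FormatHelper uses.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class DollarSketch {
    // Formats a non-negative value with two decimal digits, as ~$ does by default.
    static String dollar(double value) {
        NumberFormat currency = NumberFormat.getCurrencyInstance(Locale.US);
        currency.setGroupingUsed(false);          // assumption: no thousands separators
        return currency.format(value).replace("$", "");
    }
}
```

Negative values are left out of this sketch because the way Java's currency format renders them varies between Java versions.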

Ascii Directive
The ascii directive ~A outputs any Lisp Object argument without escape characters. Print-escape and print-readably are bound to nil. The printerOperations method is called to help format the output, and returns the output as a Java String. Prefixes and modifiers can be provided to format the output and are passed to the printerOperations method. If the directive is contained in a justification directive, then the output is not written to the stream and is stored in the justification string to be formatted by the justification method. Otherwise, the output is written to the stream.

S-Expression Directive
The s-expression directive ~S outputs any Lisp Object argument with escape characters. Print-escape is bound to T. The printerOperations method is called to help format the output, and returns the output as a Java String. Prefixes and modifiers can be provided to format the output and are passed to the printerOperations method. If the directive is contained in a justification directive, then the output is not written to the stream and is stored in the justification string to be formatted by the justification method. Otherwise, the output is written to the stream.
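The effect of the mincol prefix on ~A and ~S can be sketched as below: by default the padding goes on the right, and the @ modifier moves it to the left. The PrinterPadSketch class is a hypothetical name, not the printerOperations method, and the printed form of the object is assumed to be already available as a Java String.

```java
final class PrinterPadSketch {
    // Pads the printed representation to at least mincol columns.
    static String pad(String printed, int mincol, boolean at) {
        StringBuilder padding = new StringBuilder();
        for (int i = printed.length(); i < mincol; i++) {
            padding.append(' ');
        }
        // Without @ the spaces are inserted on the right, with @ on the left.
        return at ? padding + printed : printed + padding;
    }
}
```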

Tabulate Directive
The tabulate directive ~T outputs spaces in order to move to a given column. Relative tabulation is performed when the modifier is @. Prefixes can be provided to format the output. No argument is used.

Asterisk Directive
The asterisk directive ~* ignores the next argument or the number of arguments specified by the prefix. The directive can ignore the argument backwards (~:*) or be an absolute goto (~@*).
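The three forms of the directive amount to moving a cursor over the argument list, which can be sketched as follows. The ArgCursor class is a hypothetical name for this example and is not part of FormatHelper.

```java
import java.util.List;

final class ArgCursor {
    private final List<Object> args;
    private int index = 0;

    ArgCursor(List<Object> args) {
        this.args = args;
    }

    Object next()      { return args.get(index++); }      // consume one argument
    int remaining()    { return args.size() - index; }    // value of the # prefix
    void skip(int n)   { move(index + n); }               // ~n*
    void back(int n)   { move(index - n); }               // ~n:*
    void goTo(int n)   { move(n); }                       // ~n@*

    private void move(int target) {
        if (target < 0 || target > args.size()) {
            throw new IndexOutOfBoundsException("no argument at position " + target);
        }
        index = target;
    }
}
```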

Indirection Directive
The indirection directive ~? requires two arguments. The first one must be a string that is used as a control string and the second a list in which the elements are used as the arguments. Both are consumed by the directive. The control string is processed recursively. If the modifier @ is given, then only the string argument is needed and consumed. After the control string and list of arguments have been determined, a new FormatHelper function is called with the parameters.

Plural Directive
The plural directive ~P outputs s if the argument is not 1 and nothing otherwise; with the @ modifier it outputs y if the argument is 1 and ies otherwise. If the modifier : is used, the directive backs up one argument before outputting.
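The suffix choice follows the HyperSpec rules and can be sketched as below. The PluralSketch class is a hypothetical name, and the argument is assumed to be an integer already converted to a Java long.

```java
final class PluralSketch {
    // at is the @ modifier: ~P gives "" or "s", ~@P gives "y" or "ies".
    static String suffix(long argument, boolean at) {
        boolean singular = (argument == 1);
        if (at) {
            return singular ? "y" : "ies";
        }
        return singular ? "" : "s";
    }
}
```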

Escape Directive
The escape directive ~^ terminates the formatting operation. If it is inside ~{ or ~<, then the construct is terminated if there are no more arguments. The entire iteration directive is terminated with ~:^. If it is not enclosed, the entire FormatHelper method is terminated. If a prefix is given, then the operation is terminated when the parameter is zero.

Newline Directive
The newline directive (~ followed by a newline) ignores the newline and/or the following whitespace, depending on the modifier given. No argument is used.

Comma Method
The comma method is used to help format the directive outputs. It inserts commas into the string according to the number and the comma character passed in. It returns the new string.
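A minimal sketch of such a helper is shown below. The CommaSketch class, the insert method, and the interval parameter are hypothetical names for this example; the actual signature of the comma method is not given in the source.

```java
final class CommaSketch {
    // Inserts commaChar between groups of interval digits, counting from the right.
    static String insert(String number, char commaChar, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // Keep a leading sign out of the digit grouping.
        int start = (number.startsWith("-") || number.startsWith("+")) ? 1 : 0;
        StringBuilder out = new StringBuilder(number.substring(0, start));
        int digits = number.length() - start;
        for (int i = 0; i < digits; i++) {
            if (i > 0 && (digits - i) % interval == 0) {
                out.append(commaChar);
            }
            out.append(number.charAt(start + i));
        }
        return out.toString();
    }
}
```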

Padding Method
The padding method is used to help format the directive outputs. It replaces the default padding space character with the character passed in to the method. It returns the new string.
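One way to read this is that the Java formatter first pads with spaces and the helper then swaps those spaces for the requested character. The sketch below makes that assumption and only touches leading spaces; the PaddingSketch class and its method name are hypothetical.

```java
final class PaddingSketch {
    // Replaces the leading run of spaces in formatted with padChar.
    static String replaceLeadingSpaces(String formatted, char padChar) {
        int i = 0;
        while (i < formatted.length() && formatted.charAt(i) == ' ') {
            i++;
        }
        StringBuilder out = new StringBuilder(formatted.length());
        for (int k = 0; k < i; k++) {
            out.append(padChar);
        }
        return out.append(formatted, i, formatted.length()).toString();
    }
}
```

For example, the result of String.format("%5d", 42) would become 00042 with the pad character 0.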

VNumPrefix Method
The vNumPrefix method is used to help the format directives to determine the number prefixes. It is called by the prefixes that can be a number, # or V. If it is a number, then it is parsed from the prefix string. # represents the number of arguments left to be processed. V takes an argument and uses it as the prefix. If the argument is nil, then the prefix is omitted. The number is returned.
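The three cases can be sketched as follows. The NumPrefixSketch class and resolve method are hypothetical names, the remaining arguments are modeled as a java.util.LinkedList, and a Java null stands in for both Lisp nil and an omitted prefix; all of these are assumptions of this example.

```java
import java.util.LinkedList;

final class NumPrefixSketch {
    // Returns the numeric prefix, or null when the prefix is omitted.
    static Integer resolve(String prefix, LinkedList<Object> remainingArgs) {
        if (prefix.isEmpty()) {
            return null;                               // no prefix given
        }
        if (prefix.equals("#")) {
            return remainingArgs.size();               // arguments left to process
        }
        if (prefix.equalsIgnoreCase("V")) {
            Object arg = remainingArgs.removeFirst();  // V consumes one argument
            return (arg == null) ? null : Integer.valueOf(((Number) arg).intValue());
        }
        return Integer.valueOf(prefix);                // literal number
    }
}
```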

VPrefix Method
The vPrefix method is used to help the format directives determine the character prefixes. It is called by the prefixes that can be a character or V. If it is a character, then it is parsed from the prefix string. V takes an argument and uses it as the prefix. The argument must be a character. If the argument is nil, then the prefix is omitted. The character is returned.

Format

The Format class has yet to be implemented. It is to be written in Lisp and will be the wrapper for the FormatHelper class.

Implementation
Using the grammar below, the Format class will be implemented with a recursive descent parser. This class of parsers can handle LL(k) grammars, with the important constraint that the grammar not be left recursive; the grammar below is right recursive. Ambiguity is not a huge roadblock due to the parser's ability to handle precedence. As you may notice, the /* EMPTY */ string is defined twice, but this will not be a problem. The parser will be written in Lisp.

Backus-Naur Form / Grammar

A BNF for the behavior of the Format function from the HyperSpec is as follows. Note: the prefixes and modifiers are omitted for simplicity.

FormatStr ::= /* EMPTY */ | Token FormatStr;
Token ::= SimpleToken | ComplexToken;

ComplexToken ::= Logical | Conditional | CaseConversion | Function | Justify | Iterate;

Logical ::= Logical_Start Logical_Middle Logical_End;
Logical_Middle ::= FormatStr | FormatStr RegClauseSeperator Logical_Middle;

Justify ::= Justify_Start Justify_Middle Justify_End;
Justify_Middle ::= FormatStr | FormatStr ClauseSeperator Justify_Middle;
ClauseSeperator ::= RegClauseSeperator | ModClauseSeperator;

Conditional            ::= Conditional_Start Conditional_Middle Conditional_End;
Conditional_Middle     ::= Conditional_Piece Conditional_Elseclause;
Conditional_Piece      ::= FormatStr | FormatStr RegClauseSeperator Conditional_Piece;
Conditional_Elseclause ::= ModClauseSeperator | /* EMPTY */;

CaseConversion ::= CaseConversion_Start FormatStr CaseConversion_End;

Function ::= Function_Start Function_Call Function_End;

Iterate ::= Iterate_Start FormatStr Iterate_End | Iterate_Sublist;
Iterate_Sublist ::= Iterate_Sublist_Start Iterate_Sublist_Middle Iterate_End;
Iterate_Sublist_Middle ::= FormatStr | FormatStr Iterate_Sublist_Escape Iterate_Sublist_Middle;

SimpleToken ::=  Character | Newline | Freshline | Page | Radix | Binary | 
                 Octal | Hexadecimal | FixedFloat | ExponentialFloat | GeneralFloat | 
                 MonetaryFloat | Aesthetic | Standard | Write | Indent | Plural |
                 Recursive | Tabulate | GoTo | EscapeUpward | Decimal;
Character ::= '~''C';
Newline ::= '~''%';
Freshline ::= '~''&';
Page ::= '~''|';
Radix ::= '~''R';
Binary ::= '~''B';
Decimal ::= '~''D'; 

Octal ::= '~''O';
Hexadecimal ::= '~''X';
FixedFloat ::= '~''F';
ExponentialFloat ::= '~''E';
GeneralFloat ::= '~''G';
MonetaryFloat ::= '~''$';
Aesthetic ::= '~''A';
Standard ::= '~''S';
Write ::= '~''W';
Indent ::= '~''I';
Plural ::= '~''P';
Recursive ::= '~''?';
Tabulate ::= '~''T';
GoTo ::= '~''*';
EscapeUpward ::= '~''^';

Iterate_Start ::= '~''{';
Iterate_End ::= '~''}' | '~'':''}';

Iterate_Sublist_Start ::= '~'':''{';
Iterate_Sublist_Escape ::= '~'':''^';

RegClauseSeperator ::= '~'';';
ModClauseSeperator ::= '~'':'';';

CaseConversion_Start ::= '~''(';
CaseConversion_End ::= '~'')';

Conditional_Start ::= '~''[';
Conditional_End ::= '~'']';

// Function_Call will be handled differently than other non-terminals
Function_Start ::= '~''/';
Function_End ::= '/';
Function_Call ::= 'lispfunction'; // lispfunction is a non-terminal that will be mapped directly to a lisp function.

Logical_or_Justify_Start ::= '~''<' | '~'':''<';
Logical_Start ::= Logical_or_Justify_Start;
Justify_Start ::= Logical_or_Justify_Start;

Logical_End ::= '~'':''>' | '~''@'':''>';
Justify_End ::= '~''>';
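To illustrate how a recursive descent parser follows the grammar above, here is a minimal sketch for a small subset: FormatStr, SimpleToken, Iterate, and CaseConversion, without prefixes or modifiers. It is written in Java only for illustration, since the planned parser is to be written in Lisp. The FormatParser class, its bracketed output notation, and the handling of literal text are all assumptions of this example.

```java
final class FormatParser {
    private final String src;
    private int pos = 0;

    private FormatParser(String src) {
        this.src = src;
    }

    // Parses a control string and returns a printable form of the parse tree.
    static String parse(String control) {
        FormatParser p = new FormatParser(control);
        String tree = p.formatStr();
        if (p.pos != control.length()) {
            throw new IllegalArgumentException("unmatched closer at " + p.pos);
        }
        return tree;
    }

    // FormatStr ::= /* EMPTY */ | Token FormatStr  (right recursion as a loop)
    private String formatStr() {
        StringBuilder out = new StringBuilder();
        while (pos < src.length() && !src.startsWith("~}", pos) && !src.startsWith("~)", pos)) {
            out.append(token());
        }
        return out.toString();
    }

    // Token ::= Iterate | CaseConversion | SimpleToken | literal text
    private String token() {
        if (src.startsWith("~{", pos)) {
            return block("~{", "~}", "iterate");
        }
        if (src.startsWith("~(", pos)) {
            return block("~(", "~)", "case");
        }
        if (src.charAt(pos) == '~') {
            if (pos + 1 >= src.length()) {
                throw new IllegalArgumentException("dangling tilde");
            }
            String simple = "[" + src.charAt(pos + 1) + "]";
            pos += 2;
            return simple;
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) != '~') {
            pos++;
        }
        return "'" + src.substring(start, pos) + "'";
    }

    // X ::= X_Start FormatStr X_End
    private String block(String open, String close, String name) {
        pos += open.length();
        String body = formatStr();
        if (!src.startsWith(close, pos)) {
            throw new IllegalArgumentException("missing " + close);
        }
        pos += close.length();
        return "(" + name + " " + body + ")";
    }
}
```

Each non-terminal becomes one method, and nesting such as ~(~{~A~}~) is handled by the mutual recursion between formatStr and block. A mismatched closer, as in ~{~A~), is rejected with an exception.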
Topic revision: r9 - 2010-04-27 - 23:26:52 - JosephNiehaus
 