AttributeAndSyntaxType

Overview

The CommonLisp system provides an extensible subsystem (TheReader) that maps strings of characters into Lisp objects. The supported extension method is the ReadTable, which allows characters to be redefined as MacroCharacters that trigger the evaluation of a Lisp function by TheReader. Underneath this, however, the CommonLisp specification defines a mechanism that drives the basic state machine in TheReader: the dual concepts of CharacterAttributes and SyntaxType. Although the specification describes these concepts in great detail and in a self-contained way, it provides no mechanism for directly changing the CharacterAttributes or SyntaxType to which a character maps.

Aside from a slight nod to the existence of other character schemes, the existing specification deals mostly with the 128 characters of the ASCII code. CLforJava, however, is designed to encompass the entire Unicode system (see CharacterSystem). The Lisp specification gives no specific definitions of the Attribute or SyntaxType for the vast number of remaining characters. Because simplistic methods such as an array lookup were no longer practical (approximately 15,100 characters are defined for CLforJava), the project required a CharacterSystem that could handle this large set of characters and still have a well-encapsulated design. Having built that, we decided to expose the mechanism to both Java and Lisp programmers as an "official" extension to CommonLisp.
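One way to classify a large, sparse character set without a full array is a lookup keyed on range boundaries. The sketch below is only an illustration of that idea and is not taken from CLforJava: the class name RangeLookupSketch, the method classify, the string labels, and the ranges themselves are all assumptions chosen for the example.

```java
import java.util.TreeMap;

// Sketch only: the ranges and labels below are illustrative assumptions,
// not the actual CLforJava CharacterSystem.
final class RangeLookupSketch {

    // Key = first code point of a range; the range runs up to the next key.
    private static final TreeMap<Integer, String> RANGES = new TreeMap<>();

    static {
        RANGES.put(0x0000, "INVALID");     // control characters below tab
        RANGES.put(0x0009, "WHITESPACE");  // tab through carriage return
        RANGES.put(0x000E, "INVALID");     // remaining control characters
        RANGES.put(0x0020, "WHITESPACE");  // space
        RANGES.put(0x0021, "CONSTITUENT"); // everything else, in this sketch
    }

    public static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        // floorEntry returns the entry with the greatest key at or below the code point.
        return RANGES.floorEntry(codePoint).getValue();
    }

    public static void main(String[] args) {
        System.out.println(classify('\t'));
        System.out.println(classify('A'));
        System.out.println(classify(0x1F600)); // outside the Basic Multilingual Plane
    }
}
```

Five entries cover every valid code point here, so the table grows with the number of distinct ranges rather than with the size of the character set.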

References

HyperSpec CLtL
link 1 Link 2

Implementation

The first implementation of TheReader included nested classes that defined CharacterAttribute and SyntaxType. In the new version, these classes are extracted into extension classes with a public API. Furthermore, they are not declared final, which allows Java programmers to extend them to handle the parsing of radically different languages while still using the underlying reader and ReadTable mechanisms.

Both classes extend the Java 1.4 Character subsetting mechanism (java.lang.Character.Subset), and each provides a standard of method that returns the constant for a given character. Each also provides a set of type-safe enum constants, one per attribute or syntax type.

Signatures

public class lisp.extension.character.Attribute extends java.lang.Character.Subset {
    public static final lisp.extension.character.Attribute CONSTITUENT = new lisp.extension.character.Attribute("CONSTITUENT");
    ... etc for the rest of the attributes ...

    // Returns the attribute constant for the given Lisp character.
    public static lisp.extension.character.Attribute of(lisp.common.type.Character character);
    private lisp.extension.character.Attribute(String name) {}
}

public class lisp.extension.character.SyntaxType extends java.lang.Character.Subset {
    public static final lisp.extension.character.SyntaxType ALPHADIGIT = new lisp.extension.character.SyntaxType("ALPHADIGIT");
    ... etc for the rest of the syntax types ...

    // Returns the syntax type constant for the given Lisp character.
    public static lisp.extension.character.SyntaxType of(lisp.common.type.Character character);
    private lisp.extension.character.SyntaxType(String name) {}
}
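
The signatures above follow the type-safe enum idiom built on java.lang.Character.Subset. The following is a minimal compilable sketch of that idiom, not the CLforJava source. The nested Attribute class is a hypothetical stand-in, the WHITESPACE constant is an assumed second value, the of method takes an int code point instead of lisp.common.type.Character, and the classification rule (Java's own whitespace test) is an assumption made only so the example runs.

```java
// Sketch only: Attribute here is a hypothetical stand-in, not the CLforJava class.
final class AttributeSketch {

    // Type-safe enum idiom: a fixed set of named instances built on Character.Subset.
    public static class Attribute extends Character.Subset {
        public static final Attribute CONSTITUENT = new Attribute("CONSTITUENT");
        // WHITESPACE is an assumed second constant, added for illustration.
        public static final Attribute WHITESPACE = new Attribute("WHITESPACE");

        // Private, mirroring the signatures in this section.
        private Attribute(String name) {
            super(name); // Character.Subset has no no-argument constructor
        }

        // Assumed rule: Java's notion of whitespace, everything else constituent.
        // Takes an int code point in place of lisp.common.type.Character.
        public static Attribute of(int codePoint) {
            return Character.isWhitespace(codePoint) ? WHITESPACE : CONSTITUENT;
        }
    }

    public static void main(String[] args) {
        Attribute a = Attribute.of('a');
        System.out.println(a);                          // toString() yields the subset name
        System.out.println(a == Attribute.CONSTITUENT); // constants compare by identity
        System.out.println(Attribute.of(' '));
    }
}
```

Character.Subset declares equals and hashCode as final and bases them on object identity, so comparing the constants with == is safe.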

Connection to the ReadTable

TBD

Core Java Classes Javadoc Links
Topic revision: r2 - 2009-02-11 - 03:30:45 - MeganLusher
 