This is aspell-dev.info, produced by makeinfo version 5.2 from
aspell-dev.texi.

This is the developer’s manual for Aspell.

Copyright © 2002, 2003, 2004, 2006 Kevin Atkinson.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts and no Back-Cover Texts.  A
copy of the license is included in the section entitled "GNU Free
Documentation License".

INFO-DIR-SECTION GNU Packages
START-INFO-DIR-ENTRY
* Aspell-dev: (aspell-dev).     For Aspell developers
END-INFO-DIR-ENTRY

File: aspell-dev.info, Node: Top, Next: Style Guidelines, Prev: (dir), Up: (dir)

Notes
*****

This manual is designed for those who wish to develop Aspell.  It is
currently very sketchy.  However, it should improve over time.

* Menu:

* Style Guidelines::
* How to Submit a Patch::
* C++ Standard Library::
* Templates::
* Error Handling::
* Source Code Layout::
* Strings::
* Smart Pointers::
* I/O::
* Config Class::
* Filter Interface::
* Filter Modes::
* Data Structures::
* Mk-Src Script::
* How It All Works::
* Copying::

File: aspell-dev.info, Node: Style Guidelines, Next: How to Submit a Patch, Prev: Top, Up: Top

1 Style Guidelines
******************

As far as coding styles go I am really not that picky.  The important
thing is to stay consistent.  However, whatever you do, please do not
indent by more than 4 characters, as I find deeper indentation
extremely difficult to read because most of the code ends up on the
right side of the screen.

File: aspell-dev.info, Node: How to Submit a Patch, Next: C++ Standard Library, Prev: Style Guidelines, Up: Top

2 How to Submit a Patch
***********************

Bug reports and patches should be submitted via GitHub Issues rather
than being posted to any of the mailing lists.  The mailing lists are
good if you need to check something out or need help or feedback from
other readers, but they are not the best place to submit bugs or
patches, because these will likely get forgotten or lost within the
mailing list traffic if not acted upon immediately.  Please make the
effort to use the tracker.

File: aspell-dev.info, Node: C++ Standard Library, Next: Templates, Prev: How to Submit a Patch, Up: Top

3 C++ Standard Library
**********************

The C++ Standard Library is not used directly except under very
specific circumstances.  The string class and the STL are used
indirectly through wrapper classes, and all I/O is done using the
standard C library with lightweight helper classes to make using C I/O
a bit more C++ like.

However, the ‘new’, ‘new[]’, ‘delete’ and ‘delete[]’ operators are used
to allocate memory when appropriate.

File: aspell-dev.info, Node: Templates, Next: Error Handling, Prev: C++ Standard Library, Up: Top

4 Templates
***********

Templates are used in Aspell when there is a clear advantage to doing
so.  Whenever you use templates, please use them carefully and try very
hard not to create code bloat by generating a lot of unnecessary and
duplicate code.

File: aspell-dev.info, Node: Error Handling, Next: Source Code Layout, Prev: Templates, Up: Top

5 Error Handling
****************

Exceptions are not used in Aspell as I find them more trouble than they
are worth.  Instead, an alternate method of error handling is used which
is based around the PosibErr class.  PosibErr is a special error
handling device that will make sure that an error is properly handled.
It is defined in ‘posib_err.hpp’.  PosibErr is expected to be used as
the return type of the function.
It will automatically be converted to the "normal" return type;
however, if the normal return type is accessed while there is an
"unhandled" error condition, it will abort.  It will also abort if the
object is destroyed with an "unhandled" error condition.  This includes
ignoring the return value of a function that returns an error
condition.  An error condition is handled by simply checking for the
presence of an error, calling ignore, or taking ownership of the error.

The PosibErr class is used extensively throughout Aspell.  Please refer
to the Aspell source for examples of using PosibErr until better
documentation is written.

File: aspell-dev.info, Node: Source Code Layout, Next: Strings, Prev: Error Handling, Up: Top

6 Source Code Layout
********************

‘common/’
     Common code used by all parts of Aspell.

‘lib/’
     Library code used only by the actual Aspell library.

‘data/’
     Data files used by Aspell.

‘modules/’
     Aspell modules which are eventually meant to be pluggable.

     ‘speller/’
          ‘default/’
               Main speller module.

     ‘filter/’

     ‘tokenizer/’

‘auto/’
     Scripts and data files to automatically generate code used by
     Aspell.

‘interface/’
     Header files and such that external programs should use in order
     to use the Aspell library.

     ‘cc/’
          The external C interface that programs should be using when
          they wish to use Aspell.

‘prog/’
     Actual programs based on the Aspell library.  The main Aspell
     utility is included here.

‘scripts/’
     Miscellaneous scripts used by Aspell.

‘manual/’

‘examples/’
     Example programs demonstrating the use of the Aspell library.

File: aspell-dev.info, Node: Strings, Next: Smart Pointers, Prev: Source Code Layout, Up: Top

7 Strings
*********

7.1 String
==========

The ‘String’ class provides the same functionality as the C++ string
except that it has fewer constructors.  It also inherits ‘OStream’ so
that you can write to it with the ‘<<’ operator.  It is defined in
‘string.hpp’.

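As a rough illustration of the idea in 7.1, the sketch below shows how
a string class can inherit from an output-stream base so that it
accepts the ‘<<’ operator.  It is not taken from the Aspell source:
‘SketchOStream’, ‘SketchString’, ‘make_label’ and ‘sketch_demo’ are
hypothetical names, and ‘std::string’ is used for storage only to keep
the example self-contained.  The real declarations in ‘ostream.hpp’ and
‘string.hpp’ may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an OStream-like base class: anything that
// can accept a block of characters.
class SketchOStream {
public:
  virtual ~SketchOStream() {}
  // Append `size` characters starting at `data` to the destination.
  virtual void write(const char * data, std::size_t size) = 0;
};

// Stream-style insertion for C strings and single characters.
inline SketchOStream & operator<<(SketchOStream & out, const char * s) {
  out.write(s, std::strlen(s));
  return out;
}
inline SketchOStream & operator<<(SketchOStream & out, char c) {
  out.write(&c, 1);
  return out;
}

// Hypothetical stand-in for a String-like class: it stores characters
// and, because it is also an output stream, can be written to with <<.
class SketchString : public SketchOStream {
public:
  SketchString() {}
  SketchString(const char * s) : data_(s) {}
  void write(const char * data, std::size_t size) { data_.append(data, size); }
  const char * c_str() const { return data_.c_str(); }
  std::size_t size() const { return data_.size(); }
private:
  std::string data_;
};

// Build "word: <word>" by streaming pieces into the string.
inline SketchString make_label(const char * word) {
  SketchString s("word");
  s << ':' << ' ' << word;
  return s;
}

// Small self-check showing the result of the streaming calls.
inline void sketch_demo() {
  SketchString s = make_label("hello");
  assert(std::strcmp(s.c_str(), "word: hello") == 0);
}
```

The design point is that code which only knows about the stream base
can append to a string without caring where the characters end up.
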
7.2 ParmString
==============

ParmString is a special string class that is designed to be used as a
parameter for a function that is expecting a string.  It is defined in
‘parm_string.hpp’.  It will allow either a ‘const char *’ or a ‘String’
to be passed in.  It will automatically convert to a ‘const char *’.
The string can also be accessed via the ‘str’ method.  Usage example:

     void foo(ParmString s1, ParmString s2)
     {
       const char * str0 = s1;          // implicit conversion
       unsigned int size0 = s2.size();
       if (s1 == s2 || s2 == "bar") {
         ...
       }
     }
     ...
     String s1 = "...";
     const char * s2 = "...";
     foo(s1, s2);                       // a String and a const char *

This class should be used when a string is being passed in as a
parameter.  It is faster than using ‘const String &’ (as that will
create an unnecessary temporary when a ‘const char *’ is passed in),
and is less annoying than using ‘const char *’ (as it doesn’t require
the ‘c_str()’ method to be used when a ‘String’ is passed in).

7.3 CharVector
==============

A character vector is basically a ‘Vector’ but it has a few additional
methods for dealing with strings which ‘Vector’ does not provide.  It,
like ‘String’, also inherits ‘OStream’ so that you can write to it with
the ‘<<’ operator.  It is defined in ‘char_vector.hpp’.  Use it whenever
you need a string which is guaranteed to be in a contiguous block of
memory which you can write to.

File: aspell-dev.info, Node: Smart Pointers, Next: I/O, Prev: Strings, Up: Top

8 Smart Pointers
****************

Smart pointers are used extensively in Aspell to simplify memory
management tasks and to avoid memory leaks.

8.1 CopyPtr
===========

The ‘CopyPtr’ class makes a deep copy of an object whenever it is
copied.  The ‘CopyPtr’ class is defined in ‘copy_ptr.hpp’.  This header
should be included wherever ‘CopyPtr’ is used.  The object that
‘CopyPtr’ points to does not need to be completely defined at this
point.  The implementation is defined in ‘copy_ptr-t.hpp’.
The implementation header file should be included at a point in your
code where the class that ‘CopyPtr’ points to is completely defined.

8.2 ClonePtr
============

‘ClonePtr’ is like ‘CopyPtr’ except that the ‘clone()’ method is used
instead of the copy constructor to make copies of an object.  It is
defined in ‘clone_ptr.hpp’ and implemented in ‘clone_ptr-t.hpp’.

8.3 StackPtr
============

A ‘StackPtr’ is designed to be used whenever the only pointer to a new
object allocated with ‘new’ is on the stack.  It is similar to the
standard C++ ‘auto_ptr’ but the semantics are a bit different.  It is
defined in ‘stack_ptr.hpp’.  Unlike ‘CopyPtr’ or ‘ClonePtr’, it is both
defined and implemented in this header file.

8.4 GenericCopyPtr
==================

A generalized version of ‘CopyPtr’ and ‘ClonePtr’ on which the two are
based.  It is defined in ‘generic_copy_ptr.hpp’ and implemented in
‘generic_copy_ptr-t.hpp’.

File: aspell-dev.info, Node: I/O, Next: Config Class, Prev: Smart Pointers, Up: Top

9 I/O
*****

Aspell does not use C++ I/O classes and functions in any way, since
they do not provide a way to get at the underlying file number and can
often be slower than the highly tuned C I/O functions found in the
standard C library.  However, some lightweight wrapper classes are
provided so that standard C I/O can be used in a more C++ like way.

9.1 IStream/OStream
===================

These two base classes mimic some of the functionality of the
corresponding C++ classes.  They are defined in ‘istream.hpp’ and
‘ostream.hpp’ respectively.  They are, however, based on standard C I/O
and are not proper C++ streams.

9.2 FStream
===========

Defined in ‘fstream.hpp’.

9.3 Standard Streams
====================

‘CIN’/‘COUT’/‘CERR’.  Defined in ‘iostream.hpp’.

File: aspell-dev.info, Node: Config Class, Next: Filter Interface, Prev: I/O, Up: Top

10 Config Class
***************

The ‘Config’ class is used to hold configuration information.  It has a
set of keys which it will accept.
Inserting or even trying to look at a key that it does not know will
produce an error.  It is defined in ‘common/config.hpp’.

File: aspell-dev.info, Node: Filter Interface, Next: Filter Modes, Prev: Config Class, Up: Top

11 Filter Interface
*******************

11.1 Overview
=============

In Aspell there are 5 types of filters:

  1. _Decoders_ which take input in some standard format such as
     iso8859-1 or UTF-8 and convert it into a string of ‘FilterChars’.

  2. _Decoding filters_ which manipulate a string of ‘FilterChars’ by
     decoding the text in some way, such as converting an SGML
     character into its Unicode value.

  3. _True filters_ which manipulate a string of ‘FilterChars’ to make
     it more suitable for spell checking.  These filters generally
     blank out text which should not be spell checked.

  4. _Encoding filters_ which manipulate a string of ‘FilterChars’ by
     encoding the text in some way, such as converting certain Unicode
     characters to SGML characters.

  5. _Encoders_ which take a string of ‘FilterChars’ and convert it
     into a standard format such as iso8859-1 or UTF-8.

Which types of filters are used depends on the situation:

  1. When _decoding words_ for spell checking:
        • The _decoder_ to convert from a standard format
        • The _decoding filter_ to perform high level decoding if
          necessary
        • The _encoder_ to convert into an internal format used by the
          speller module

  2. When _checking a document_:
        • The _decoder_ to convert from a standard format
        • The _decoding filter_ to perform high level decoding if
          necessary
        • A _true filter_ to filter out parts of the document which
          should not be spell checked
        • The _encoder_ to convert into an internal format used by the
          speller module

  3. When _encoding words_ such as those returned for suggestions:
        • The _decoder_ to convert from the internal format used by
          the speller module
        • The _encoding filter_ to perform high level encodings if
          necessary
        • The _encoder_ to convert into a standard format

A ‘FilterChar’ is a struct defined in ‘common/filter_char.hpp’ which
contains two members, a character and a width.  Its purpose is to keep
track of the width of the character in the original format.  This is
important because when a misspelled word is found the exact location of
the word needs to be returned to the application so that it can
highlight it for the user.  For example, if the filters translated
this:

     Mr. foo said &quot;I hate my namme&quot;.

to this:

     Mr. foo said "I hate my namme".

without keeping track of the original width of the characters, the
application would likely highlight ‘e my ’ as the misspelling because
the spell checker would return 25 as the offset instead of 30.
However, by keeping track of the width using ‘FilterChar’ the spell
checker will know that the real position is 30, since each quote is
really 6 characters wide.  In particular the text will be annotated
something like the following:

     1111111111111611111111111111161
     Mr. foo said "I hate my namme".

The standard _encoder_ and _decoder_ filters are defined in
‘common/convert.cpp’.  There should generally not be any need to deal
with them so they will not be discussed here.  The other three filters,
the _encoding filter_, the _true filter_, and the _decoding filter_, are
all defined in exactly the same way; they inherit from the
‘IndividualFilter’ class.

11.2 Adding a New Filter
========================

A new filter is added by placing the corresponding loadable object
inside a directory reachable by Aspell via the ‘filter-path’ list.  In
addition, the corresponding filter description file has to be located
in one of the directories listed by the ‘option-path’ list.
The name of the loadable object has to conform to the convention
‘libfiltername-filter.so’, where ‘filtername’ stands for the name of
the filter which is passed to Aspell by the ‘add-filter’ option.  The
same applies to the filter description file, which has to conform to
the naming scheme ‘filtername-filter.opt’.

To add a new loadable filter object, create a new file.  The file
should be a C++ file and end in ‘.cpp’.  It should contain a new filter
class inherited from ‘IndividualFilter’ and a constructor function
called ‘new_filtertype’ (see *note Constructor Function::) returning a
new filter object.  It is also necessary to manually generate the
filter description file.  Finally, the resulting object has to be
turned into a loadable filter object using libtool.

Alternatively, a new filter may extend the functionality of an existing
filter.  In this case the new filter has to be derived from the
corresponding valid filter class instead of the ‘IndividualFilter’
class.

11.3 IndividualFilter class
===========================

All filters are required to inherit from the ‘IndividualFilter’ class
found in ‘indiv_filter.hpp’.  See that file for more details and the
other filter modules for examples of how it is used.

11.4 Constructor Function
=========================

After the class is created, a function must be created which will
return a new filter allocated with ‘new’.  The function must have the
following prototype:

     C_EXPORT IndividualFilter * new_aspell_FILTERNAME_FILTERTYPE

Filters are defined in groups where each group contains an _encoding
filter_, a _true filter_, and a _decoding filter_ (see *note Filter
Overview::).  Only one of them is required to be defined; however, each
one that is defined needs a separate constructor function.

11.5 Filter Description File
============================

This file contains the description of a filter which is loaded by
Aspell immediately when the ‘add-filter’ option is invoked.
If this file is missing Aspell will complain about it.  It consists of
comment lines, which must start with a ‘#’ character, and lines
containing key value pairs describing the filter.  Each file has to
contain at least the following two lines in the given order.

     ASPELL >=0.60
     DESCRIPTION this is short filter description

The first non blank, non comment line has to contain the keyword
‘ASPELL’ followed by the version of Aspell which the filter is usable
with.  To denote multiple Aspell versions the version number may be
prefixed by ‘<’, ‘<=’, ‘=’, ‘>=’ or ‘>’.  If the range prefix is
omitted ‘=’ is assumed.  The ‘DESCRIPTION’ of the filter should be
under 50 characters, begin in lower case, and not include any trailing
punctuation characters.  The keyword ‘DESCRIPTION’ may be abbreviated
by ‘DESC’.

For each filter feature (see *note Filter Overview::) provided by the
corresponding loadable object, the option file has to contain the
following line:

     STATIC filtertype

‘filtertype’ stands for one of ‘decoder’, ‘filter’ or ‘encoder’,
denoting the corresponding filter type.  This line allows the filter to
be statically linked (see *note Link Filters Static::) into Aspell if
requested by the user or by the system Aspell is built for.

     OPTION newoption
     DESCRIPTION this is a short description of newoption
     TYPE bool
     DEFAULT false
     ENDOPTION

An option is added by a line containing the keyword ‘OPTION’ followed
by the name of the option.  If this name is not prefixed by the name of
the filter Aspell will implicitly do that.  For the ‘DESCRIPTION’ of a
filter option the same holds as for the filter description.  The ‘TYPE’
of the option may be one of ‘bool’, ‘int’, ‘string’ or ‘list’.  If the
‘TYPE’ is omitted ‘bool’ is assumed.  The default value(s) for an
option is specified via ‘DEFAULT’ (short ‘DEF’) followed by the desired
‘TYPE’ dependent default value.  The table *note Filter Default
Values:: shows the possible values for each ‘TYPE’.

     Type      Default   Available
     bool      true      true false
     int       0         any number value
     string              any printable string
     list                any comma separated list of strings

Table 1.  Shows the default values Aspell assumes if option
‘description’ lacks a ‘DEFAULT’ or ‘DEF’ line.

The ‘ENDOPTION’ line may be omitted, as it is assumed implicitly when a
line containing ‘OPTION’ or ‘STATIC’ is encountered.

*Note* The keywords in a filter description file are case insensitive.
The above examples use all uppercase to better distinguish them from
values and comments.  A filter description may also contain blank lines
to enhance its readability.

*Note* An option of ‘list’ type may contain multiple consecutive lines
for default values, each starting with ‘DEFAULT’ or ‘DEF’, to specify
numerous default values.

11.6 Retrieve Options by a Filter
=================================

An option always has to be retrieved by a filter using its fully
qualified name, as the following example shows.

     config->retrieve_bool("filter-filtername-newoption");

The prefix ‘filter-’ lets the user relate an option uniquely to a
specific filter when ‘filtername-newoption’ would be ambiguous with an
existing option of Aspell.  The ‘filtername’ stands for the name of the
filter the option belongs to, and ‘-newoption’ is the name of the option
as specified in the corresponding ‘.opt’ file (see *note Filter
Description File::).

11.7 Compiling and Linking
==========================

See a good book on Unix programming on how to turn the filter source
into a loadable object.

11.8 Programmer’s Interface
===========================

A more convenient way to build a new filter, recommended if the filter
is to be added to the Aspell standard distribution, is provided by
Aspell’s programmer’s interface for filters.  It is provided by the
‘loadable-filter-API.hpp’ file.  Including this file gives access to a
collection of macros hiding nasty details about runtime construction of
a filter and about filter debugging.  Table *note Interface Macros::
shows the macros provided by the interface.
For details on the individual macros see ‘loadable-filter-API.hpp’.  An
example of how to use these macros can be found in
‘examples/loadable/ccpp-context.hpp’ and
‘examples/loadable/ccpp-context.cpp’.

Macro             Type  Description                 Notes
ACTIVATE_ENCODER  M     makes the encoding filter   do not call inside a class
                        callable by Aspell          declaration; these macros
                                                    define the new_ function
ACTIVATE_DECODER  M     makes the decoding filter   as above
                        callable by Aspell
ACTIVATE_FILTER   M     makes the filter callable   as above
                        by Aspell
FDEBUGOPEN        D     initialises the macros for  these macros are only
                        debugging a filter and      active if the
                        opens the debug file        ‘FILTER_PROGRESS_CONTROL’
                        stream                      macro is defined; it
                                                    denotes the name of the
                                                    file debug messages
                                                    should be sent to.  To
                                                    send debugging output to
                                                    Aspell’s standard
                                                    debugging output (right
                                                    now stderr), use an empty
                                                    string constant as the
                                                    filename
FDEBUGNOTOPEN     D     same as ‘FDEBUGOPEN’ but    as above
                        only if the debug file
                        stream was not opened yet
FDEBUGCLOSE       D     closes the debugging        as above
                        device opened by
                        ‘FDEBUGOPEN’ and reverts
                        it to ‘stderr’
FDEBUG            D     prints the filename and     as above
                        the line number where it
                        occurs
FDEBUGPRINTF      D     special printf for          as above
                        debugging

Table 2.  Shows the macros provided by ‘loadable-filter-API.hpp’ (*M*
mandatory, *D* debugging)

11.9 Adding a filter to Aspell standard distribution
====================================================

Any filter which one day should be added to Aspell has to be built
using the developer interface described in *note Programmer's
Interface::.  To add the filter the following steps have to be
performed:

  1. Decide whether the filter should be kept loadable if possible, or
     always be statically linked to Aspell.

  2. Place the filter sources inside the appropriate directory of the
     Aspell source tree.  Right now use ‘$top_srcdir/modules/filter’.

  3. Modify the ‘Makefile.am’ file in the topmost directory of the
     Aspell distribution.  Follow the instructions given by the
     ‘#Filter Modules’ section.

  4. Run ‘autoconf’, ‘automake’, …

  5. Reconfigure sources.

  6. Clear away any remains of a previous build and rebuild sources.

  7. Reinstall Aspell.

  8. Test whether the filter has been added properly; otherwise return
     to steps 2–7.

  9. Reconfigure sources with the ‘enable-static’ flag and repeat steps
     2–7 until your filter builds and runs properly when statically
     linked.

  10. Add your source files to cvs, and commit all your changes.  Or,
     in case you are not allowed to commit to cvs, submit a patch (see
     *note How to Submit a Patch::) containing your changes.

File: aspell-dev.info, Node: Filter Modes, Next: Data Structures, Prev: Filter Interface, Up: Top

12 Filter Modes
***************

Filter modes are the preferred way to specify combinations of filters
which are used regularly and thus abbreviate Aspell’s command line
arguments.  A new filter mode is specified by a file named after the
new filter mode and suffixed with ‘.amf’ (Aspell Mode File).  If such a
file is accessible via the path set by the ‘filter-path’ option, Aspell
will try to load the contained mode specification.

12.1 Aspell Mode File
=====================

The first key in the mode file has to be the ‘mode’ key.  It is checked
against the mode name part of the ‘.amf’ file name.  If the ‘mode’ key
is missing the mode file will be rejected.  The same holds for the
‘aspell’ key, which specifies the version(s) of Aspell required by the
filter.  If these two keys are followed by at least one ‘magic’ key,
Aspell will be able to select the corresponding mode implicitly from
the extension and, if required, from the contents of the file being
spell checked.  The last of the required keys is the
‘des[c[ription]]’ key.  It gives a short description of the filter mode
which will be displayed when you type ‘aspell help’.

The rest of the file consists of the keys ‘filter’ and ‘option’, which
load filters and set various options.

12.1.1 Version Line
-------------------

Each version line must start with ‘aspell’ and be followed by a
version, optionally prefixed by a relational operator.
The relational operator can be one of ‘<’, ‘<=’, ‘=’, ‘>=’ or ‘>’,
allowing Aspell versions with a version number lower than, lower than
or equal to, equal to, greater than or equal to, or greater than the
required version number, respectively.

If the relational operator is omitted ‘=’ is assumed.

12.1.2 Magic Line
-----------------

The magic line describes the requirements a file has to fulfill in
order to implicitly activate the corresponding mode; at least one such
line is required.  Each magic line has the following format:

     MAGIC //[/]

The magic key consists of three ‘:’ separated fields.  The first two
are byte counts; the last is a regular expression.  The first byte
count indicates the first byte the regular expression will be applied
to; the second byte count indicates the number of bytes to test against
the regular expression.

If mode selection should only occur on the basis of the listed file
extensions, the magic key should consist of the “” special string.

At least one extension is required per MAGIC line.  An extension may
not be empty and should not contain a leading ‘.’ as this is assumed
implicitly.

Multiple MAGIC lines are allowed.  Modes may be extended limited by additional