2017/11/29

GNU Aspell

I am interested in GNU Aspell, a spell checker licensed under the GNU LGPL version 2.1.
I downloaded the Win32 source package, aspell-0.50.4.1-vc++-src.zip, from the official site to build my own spell checker.
At first, VC7.1 could not compile the original Aspell source code, because the code was written for an older compiler, VC6.
So I modified the project and the source code for VC7.1.
Next, I derived a special Tokenizer with the following parsing rules:
[A-Z]{2,} [A-Za-z][a-z]*
A "word" consists only of alphabetic letters and is delimited by separators.
[^A-Za-z]
Every character other than an alphabetic letter is a separator.
The tokenizer skips multi-byte characters such as Japanese.

[a-z][A-Z]
The boundary between a lowercase letter and the uppercase letter that follows it also separates words.
[A-Z]{2,}
A run of consecutive uppercase letters is an abbreviation, which is treated as one word.

If a lowercase letter follows an abbreviation, the abbreviation ends before the last letter of the uppercase run. In other words, the last uppercase letter is the capital letter of the next word.
ex) SAMLauncher -> "SAM" "Launcher"
These rules are intended for source code in C++ and other programming languages. The tokenizer can check identifiers joined with underscores, as in the C++ Standard style, and identifiers in camel case, as in the Java standard style or Hungarian notation.
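
The rules above can be sketched in a few lines of standalone C++. This is only an illustration of the splitting logic, not the actual Tokenizer class derived from Aspell; the function name split_identifier_words and the helper predicates are hypothetical, and the sketch assumes the input is a plain byte string in which any byte outside A-Z and a-z (including the bytes of multi-byte characters) counts as a separator.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace {
// ASCII-only predicates. Bytes of multi-byte characters (>= 0x80) are
// neither upper nor lower here, so they act as separators.
inline bool is_upper(unsigned char c) { return c >= 'A' && c <= 'Z'; }
inline bool is_lower(unsigned char c) { return c >= 'a' && c <= 'z'; }
}  // namespace

// Hypothetical helper: splits text into words following
//   [A-Z]{2,} | [A-Za-z][a-z]*
// An uppercase run followed by a lowercase letter gives up its last
// letter, which becomes the first letter of the next word.
std::vector<std::string> split_identifier_words(const std::string& text) {
    std::vector<std::string> words;
    const std::size_t n = text.size();
    std::size_t i = 0;
    while (i < n) {
        if (is_lower(text[i])) {
            // Lowercase word: runs until the first non-lowercase byte.
            std::size_t j = i;
            while (j < n && is_lower(text[j])) ++j;
            words.push_back(text.substr(i, j - i));
            i = j;
        } else if (is_upper(text[i])) {
            // Find the end of the uppercase run.
            std::size_t j = i;
            while (j < n && is_upper(text[j])) ++j;
            const bool lower_follows = (j < n) && is_lower(text[j]);
            if (!lower_follows) {
                // Whole run is one word (an abbreviation if length >= 2).
                words.push_back(text.substr(i, j - i));
                i = j;
            } else if (j - i > 1) {
                // Abbreviation ends before the last uppercase letter.
                words.push_back(text.substr(i, j - i - 1));
                i = j - 1;  // the last uppercase letter starts the next word
            } else {
                // Capitalized word: one uppercase letter plus lowercase run.
                std::size_t k = j;
                while (k < n && is_lower(text[k])) ++k;
                words.push_back(text.substr(i, k - i));
                i = k;
            }
        } else {
            ++i;  // separator: digit, underscore, punctuation, multi-byte byte
        }
    }
    return words;
}
```

With this sketch, "SAMLauncher" splits into "SAM" and "Launcher", "max_buffer_size" into "max", "buffer" and "size", and "parseXMLFile" into "parse", "XML" and "File". Each resulting word would then be passed to the spell checker on its own.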
The command-line (CUI) version of the new spell checker is finished, and I am now creating a GUI version. The main purpose is, of course, checking source code. As a second purpose, I want to make an add-in module for Enterprise Architect (EA): EA already includes a spell checker, Wspell, but it does not support multi-byte characters such as Japanese. Therefore, I am writing the GUI in ATL, because it is well suited to COM.
