Learn how to utilise Keyman Desktop’s group functionality when designing keyboard layouts, and how to manage end-of-word rules.
This blog is about writing Keyman Desktop keyboard layouts. It is assumed that you are familiar with the concepts outlined in the keyboard tutorial and language reference.
When designing a keyboard layout, a common technique is to break processing of complex rules down into multiple groups. A typical design is:
begin > use(constraints)
group(constraints) using keys
c match invalid keystrokes
nomatch > use(main)
group(main) using keys
c normal processing rules
match > use(post-process)
group(post-process)
c normalisation, additional text processing
c note this group does not match keystrokes but only processes context
This design helps keep your keyboard layout flexible and usable, and is typically much easier to validate for correctness than a layout with all its rules in a single group.
One common scenario that this model does not necessarily cover, however, is matching end-of-word conditions. The example we will use is word-final sigma in Greek (ς), which differs from the medial form (σ).
There are a number of ways of handling this: leave the choice to the end user, show word-final sigma first, or show medial sigma first. The right choice depends on the language: for some languages the first approach may be disconcerting for the end user, while for others the second may be confusing.
1. Leave the choice to the end user
A naive keyboard layout will just have two keys for sigma. This is a valid design, but it can be annoying for end users. In other languages, there may be too many options for this to be a useful solution.
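As a minimal sketch of this first option, the two forms can simply sit on separate keys. The key assignments below (s and w) are an assumption for illustration and are not taken from any particular keyboard:

```keyman
group(main) using keys
c hypothetical key assignments, for illustration only
+ 's' > 'σ'    c medial sigma
+ 'w' > 'ς'    c word-final sigma, chosen by the user
```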
2. Show word-final sigma until another Greek character typed
This means that we assume a sigma is word-final until another letter is added to the word, and change it at that point. This is best done in post-processing, rather than by adding complexity to the main group:
'ς' any(greek) > 'σ' context(2)
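To show where that rule sits, here is a small sketch of the surrounding groups. The contents of store(greek) and the two key rules are assumptions made for illustration (a real keyboard would also list accented letters and map every key):

```keyman
c abbreviated, hypothetical store of Greek letters (includes both sigmas)
store(greek) 'αβγδεζηθικλμνξοπρσςτυφχψω'

group(main) using keys
+ 's' > 'ς'    c always emit the word-final form first
+ 'a' > 'α'    c stand-in for the rest of the alphabet
match > use(post-process)

group(post-process)
c a final sigma followed by any Greek letter becomes medial
'ς' any(greek) > 'σ' context(2)
```

With these rules, typing s, a, s should give ς, then σα, then σας: the first sigma is rewritten as soon as α follows it, and the last one stays in its word-final form.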
3. Show medial sigma until a word ending character typed
The alternative is to only output word-final sigma if an end-of-word is detected. This has two components: matching end of word punctuation, including punctuation specific to your language; and matching special end of word keys, specifically Space, Enter and Tab.
This could look a bit like the following:
store(word-ending-key) [K_ENTER] [K_TAB] [K_SPACE]
group(main) using keys
c and other normal rules
+ any(punctuation) > index(punctuation, 1)
+ any(word-ending-key) > deadkey(word-ending)
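match > use(post-process)
group(post-process)
'σ' any(punctuation) > 'ς' context(2)
'σ' deadkey(word-ending) > 'ς' use(final)
c any other word ending: drop the flag and still pass the key on
deadkey(word-ending) > use(final)
group(final) using keys
c no rules here, so the unmatched keystroke is sent on to the application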
You'll note that the word-ending-key rule adds a deadkey to the output. This deadkey is really used just as a flag to tell the post-process group that we need to do additional special processing on this. This is a common technique in Keyman Desktop keyboards. We also want to make sure we use the post-process group because there may be other rules that need to fire, e.g. automatic breathing marks may be added in Greek.
Similarly, the punctuation rule may appear unnecessary. However, if you do not include the rule, the match rule will not fire, and therefore the post-process group will not run. You may also have language-specific punctuation that actually needs conversion.
However, where I'd like to focus your attention is on the final group. This is where the magic happens with end-of-word keys. One may be tempted to try and just push out the virtual key as output in a special rule in the main group, e.g.
'σ' + [K_ENTER] > 'ς' [K_ENTER] c don't use this!
But while this virtual key output technique does kinda work, it is not a supported feature of Keyman Desktop, and definitely won't work in KeymanWeb. So how do you get that keystroke to the application?
The answer is that Keyman and KeymanWeb have special processing for unmatched keystrokes in a "using keys" group: they are always sent on to the application. For technical reasons that I won't get into here, this is much easier than synthesising arbitrary keystrokes, as would be required with virtual key output. To take advantage of this behaviour, we just have to make sure the rule processing ends in a "using keys" group that does not match the desired key.
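Pulling the pieces together, a complete but deliberately tiny keyboard might look like the sketch below. The header values, the contents of store(punctuation), the two letter rules and the fallback deadkey rule are assumptions for illustration. The Enter key is written as [K_ENTER].

```keyman
c Hypothetical minimal keyboard: names and key assignments are illustrative only
store(&VERSION) '9.0'
store(&NAME) 'Final sigma sketch'
begin Unicode > use(main)

store(punctuation) '.,;:!?'
store(word-ending-key) [K_ENTER] [K_TAB] [K_SPACE]

group(main) using keys
+ 's' > 'σ'    c medial sigma by default
+ 'a' > 'α'    c stand-in for the rest of the alphabet
+ any(punctuation) > index(punctuation, 1)
c flag the word ending; the key itself is passed on later by group(final)
+ any(word-ending-key) > deadkey(word-ending)
match > use(post-process)

group(post-process)
'σ' any(punctuation) > 'ς' context(2)
'σ' deadkey(word-ending) > 'ς' use(final)
c no sigma before the flag: remove the flag and still reach group(final)
deadkey(word-ending) > use(final)

group(final) using keys
c intentionally no rules: Space, Enter and Tab fall through to the application
```

Typing s, a, s and then Space should leave σασ on screen until the Space arrives. At that point the post-process group swaps the last letter for ς, and because group(final) matches nothing, the Space itself goes on to the application.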
This technique is powerful but has a few pitfalls that it helps to be aware of:
- There is a temptation to do the conversion on navigation keys as well as word-ending keys. This makes some sense, as a user may type a word then navigate to another part of the document, and would then end up with an incorrect medial sigma. However, this is dangerous because it makes editing a medial sigma very difficult! It is best to restrict the conversion to punctuation and the word-ending keys Space, Enter and Tab.
- The nomatch rule is also one to be careful with in "using keys" groups: it only applies if the keystroke would generate a "normal" character. Therefore Space or "a" would trigger a nomatch if no rule matched, but Enter or Tab would not (these keys may generate control characters, but not normal printing characters).
- You should also consider which punctuation should trigger end-of-word processing: for instance, hyphen (-) may not, while double hyphen (--) may! This is fairly easy to handle in the post-process group (please note, this is hypothetical and may actually not apply to Greek!):
c 'σ-' > 'ς-' Don't do this as hyphen may be valid in middle of word
'σ--' > 'ς--'
c or you may opt to do some nice 'autocorrect':
c 'σ--' > 'ς—'
c '--' > '—'
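Returning to the nomatch pitfall in the list above: if Enter and Tab never trigger nomatch, a constraints group that relies only on nomatch > use(main) would not hand those keys on to the main group. One possible way around this is an explicit hand-off rule for the word-ending keys. This is an untested sketch rather than an established recipe, and the store is repeated only so that the fragment reads on its own:

```keyman
store(word-ending-key) [K_ENTER] [K_TAB] [K_SPACE]

group(constraints) using keys
c rules matching invalid keystrokes go here
c explicit hand-off for keys that do not produce a normal printing character
+ any(word-ending-key) > use(main)
c everything else that produces a normal character reaches main this way
nomatch > use(main)
```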
This simple example demonstrates some of the techniques and tricks that you can use to make your keyboard layout work most effectively. The constraints/main/post-process structure has proven to be a powerful and easy to maintain template for many different languages. The final group design pattern allows you to safely handle end-of-word scenarios. And, as always, you need to think about how your keyboard will be used. Your users will be typing away with your keyboard all day: a solid design will be much appreciated by your keyboard users.