This is an advanced topic that involves programming in TypeScript. It is intended for those who wish to build a predictive text dictionary with custom logic.

The standard lexical models supported in Keyman are currently wordlist-based. This works well for many languages, but for polysynthetic languages or those with complex morphologies, it is not practical to list all possible word forms. For these languages, it makes sense to embed grammar knowledge into the lexical model and reduce the wordlist dramatically. The custom-1.0 lexical model type has been provided for this purpose.
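To illustrate why embedding grammar knowledge can shrink the wordlist, here is a minimal sketch that builds word forms on demand from a short list of stems and suffixes instead of storing every form. The stems, suffixes, and the completionsFor function are hypothetical examples made up for this illustration; they are not part of Keyman's API, and a real model for a polysynthetic language would need far richer rules.

```typescript
// Hypothetical data: a few stems and the suffixes that may attach to them.
// Three stems and four suffixes stand in for twelve stored word forms.
const stems: string[] = ['walk', 'talk', 'jump'];
const suffixes: string[] = ['', 's', 'ed', 'ing'];

// Hypothetical helper: returns every generated word form that extends
// the given prefix, without ever storing the full list of forms.
function completionsFor(prefix: string): string[] {
  const results: string[] = [];
  for (const stem of stems) {
    for (const suffix of suffixes) {
      const form = stem + suffix;
      // Skip forms the user has already typed in full.
      if (form.startsWith(prefix) && form !== prefix) {
        results.push(form);
      }
    }
  }
  return results;
}

// 'walked' is the only generated form that begins with 'walke'.
const walkForms = completionsFor('walke');
```

A custom model could use logic along these lines inside its prediction code, so the stored data grows with the number of stems and suffixes rather than with the number of their combinations.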

A custom model definition is fairly small: it just defines the source files and root class for the lexical model. For example, this file may be called example.en.custom.model.ts:

const source: LexicalModelSource = {
  format: 'custom-1.0',
  wordBreaker: 'default',
  rootClass: 'ExampleCustomModel',
  sources: ['ExampleCustomModel.ts'],
};
export default source;

The lexical model itself is implemented in the ExampleCustomModel class, in the source file ExampleCustomModel.ts. It is possible to list multiple source files, which will be concatenated into the compiled model.

The ExampleCustomModel class must implement at least the configure and predict functions, and can optionally implement as much of the rest of the LexicalModel interface as makes sense for your use case.

This example responds to the typo teh by suggesting the, them, or tee hee.

import { LexicalModelTypes } from '@keymanapp/common-types';

export class ExampleCustomModel implements LexicalModelTypes.LexicalModel {
  configure(capabilities: LexicalModelTypes.Capabilities): LexicalModelTypes.Configuration {
    return {
      leftContextCodePoints: 16,
      rightContextCodePoints: 0,
      wordbreaksAfterSuggestions: false,
    };
  }

  languageUsesCasing: boolean = true;

  predict(transform: LexicalModelTypes.Transform, context: LexicalModelTypes.Context): LexicalModelTypes.Distribution<LexicalModelTypes.Suggestion> {
    // Trigger only when 'h' is typed directly after 'te', completing the typo 'teh'.
    if(transform.deleteLeft === 0 && context.left.endsWith('te') && transform.insert === 'h') {
      return [
        { p: 0.3, sample: { displayAs: 'the', transform: { deleteLeft: 2, insert: 'the' } } },
        { p: 0.2, sample: { displayAs: 'them', transform: { deleteLeft: 2, insert: 'them' } } },
        { p: 0.1, sample: { displayAs: 'tee hee', transform: { deleteLeft: 2, insert: 'tee hee' } } },
      ];
    } else {
      return [];
    }
  }
}

The prediction works on an input transform, which may be more complex than a single inserted character. You may want to compare against the context with the transform applied; see the applyTransform() function for how to do this.
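As a rough illustration of comparing against the text that results from a transform, the sketch below computes the text to the left of the cursor after a transform has been applied. It does not use Keyman's applyTransform(); the helper name leftAfterTransform and the local Transform and Context interfaces are simplified stand-ins defined here so the snippet runs on its own. It also assumes that deleteLeft counts Unicode code points, which you should verify against the Keyman type definitions.

```typescript
// Local stand-ins for the Keyman types, reduced to the fields used here.
interface Transform {
  insert: string;
  deleteLeft: number;
}

interface Context {
  left: string;
}

// Hypothetical helper: returns the text left of the cursor once the
// transform has been applied.
// Assumption: deleteLeft is measured in Unicode code points.
function leftAfterTransform(transform: Transform, context: Context): string {
  // Array.from splits by code point, so surrogate pairs stay intact.
  const codePoints = Array.from(context.left);
  const keep = Math.max(0, codePoints.length - transform.deleteLeft);
  return codePoints.slice(0, keep).join('') + transform.insert;
}

// Typing 'h' after 'I saw te' leaves 'I saw teh' to the left of the cursor.
const typedSoFar = leftAfterTransform({ insert: 'h', deleteLeft: 0 }, { left: 'I saw te' });

// Checking the combined text also catches transforms that insert
// several characters at once or delete before inserting.
const triggersCorrection = typedSoFar.endsWith('teh');
```

Checking the combined text rather than the transform and context separately means the same test works whether the typo arrives one keystroke at a time or in a single multi-character transform.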

Note: Earlier versions of Keyman Developer had a bug in the lexical model compiler that affected custom models, because this functionality was exposed but unused. The problem is corrected in Keyman Developer 18.0.249 with bug fix #15778. (At the time of writing, this bug fix is undergoing review and will be deployed shortly.)
