This is an advanced topic which involves programming in TypeScript. It is intended for those who wish to build a predictive text dictionary with custom logic.
The standard lexical models supported in Keyman are currently wordlist-based. This works well for many languages, but for polysynthetic languages or those with complex morphologies, it is not practical to list all possible word forms. For these languages, it makes sense to embed grammar knowledge into the lexical model and reduce the wordlist dramatically. The custom-1.0 lexical model type has been provided for this purpose.
A custom model definition is fairly small: it only defines the source files and root class for the lexical model (for example, this file may be called example.en.custom.model.ts):
const source: LexicalModelSource = {
  format: 'custom-1.0',
  wordBreaker: 'default',
  rootClass: 'ExampleCustomModel',
  sources: ['ExampleCustomModel.ts'],
};
export default source;
The lexical model itself is implemented in the ExampleCustomModel class, which lives in the source file ExampleCustomModel.ts. You can list multiple source files; they are concatenated into the compiled model.
The ExampleCustomModel class must implement at least the configure and predict functions, and can optionally implement as much of the rest of the LexicalModel interface as makes sense for your use case.
This example offers "the", "them", or "tee hee" as corrections for "teh".
import { LexicalModelTypes } from '@keymanapp/common-types';

export class ExampleCustomModel implements LexicalModelTypes.LexicalModel {
  // Tells the engine how much context this model wants on each side of the caret.
  configure(capabilities: LexicalModelTypes.Capabilities): LexicalModelTypes.Configuration {
    return {
      leftContextCodePoints: 16,
      rightContextCodePoints: 0,
      wordbreaksAfterSuggestions: false,
    };
  }

  languageUsesCasing: boolean = true;

  predict(transform: LexicalModelTypes.Transform, context: LexicalModelTypes.Context): LexicalModelTypes.Distribution<LexicalModelTypes.Suggestion> {
    // Offer corrections only when 'h' is typed directly after 'te'.
    if(transform.deleteLeft === 0 && context.left.endsWith('te') && transform.insert === 'h') {
      // deleteLeft: 2 removes the 'te' already in the context before inserting the replacement.
      return [
        { p: 0.3, sample: { displayAs: 'the', transform: { deleteLeft: 2, insert: 'the' } } },
        { p: 0.2, sample: { displayAs: 'them', transform: { deleteLeft: 2, insert: 'them' } } },
        { p: 0.1, sample: { displayAs: 'tee hee', transform: { deleteLeft: 2, insert: 'tee hee' } } },
      ];
    } else {
      // No suggestions for any other input.
      return [];
    }
  }
}
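For a language with rich morphology, the hard-coded list above could be replaced by suggestions generated from rules. The following is a minimal sketch of that idea, assuming a tiny, made-up stem and suffix inventory. The names SketchTransform, SketchSuggestion, SketchWeighted, STEMS, SUFFIXES, and suggestWordForms are hypothetical and not part of the Keyman API; the local types mirror only the fields used in the example above, and deleteLeft follows the same convention as that example (it covers the text already in the context, not the keystroke being processed).

```typescript
// Local stand-ins mirroring only the fields used in the example above.
// They are assumptions for this sketch, not the Keyman type definitions.
interface SketchTransform { insert: string; deleteLeft: number; }
interface SketchSuggestion { displayAs: string; transform: SketchTransform; }
interface SketchWeighted { p: number; sample: SketchSuggestion; }

// Hypothetical grammar data: a few stems and the suffixes they can take.
const STEMS: string[] = ['walk', 'talk', 'want'];
const SUFFIXES: { text: string; weight: number }[] = [
  { text: '', weight: 0.4 },
  { text: 's', weight: 0.3 },
  { text: 'ed', weight: 0.2 },
  { text: 'ing', weight: 0.1 },
];

// Builds suggestions by combining stems with suffixes at prediction time,
// so the full list of word forms never has to be stored.
//   wordInContext: the part of the current word already in the left context
//   inserted:      the text added by the keystroke being processed
function suggestWordForms(wordInContext: string, inserted: string): SketchWeighted[] {
  const prefix = wordInContext + inserted;
  if (prefix.length === 0) {
    return [];
  }
  const results: SketchWeighted[] = [];
  for (const stem of STEMS) {
    for (const suffix of SUFFIXES) {
      const word = stem + suffix.text;
      if (word.startsWith(prefix)) {
        results.push({
          // Split the suffix weight evenly across stems (an arbitrary choice).
          p: suffix.weight / STEMS.length,
          sample: {
            displayAs: word,
            // Remove the part of the word already in the context, counted in
            // code points, then insert the full word form.
            transform: { deleteLeft: Array.from(wordInContext).length, insert: word },
          },
        });
      }
    }
  }
  // Most probable suggestions first.
  return results.sort((a, b) => b.p - a.p);
}
```

Calling suggestWordForms('wal', 'k') yields walk, walks, walked, and walking, each with deleteLeft set to 3. A real model would replace the two arrays with the language's actual grammar rules and a more considered weighting.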
The prediction works on an input transform, which may be more complex than a single character being inserted. You may want to compare against the applied transform; see the applyTransform() function for how to do this.
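To illustrate what comparing against the applied transform means, here is a minimal sketch that applies a transform to the left context by hand. It is a simplified stand-in, not the applyTransform() implementation itself: the SketchTransform and SketchContext types and the applyTransformSketch and endsWithTeh helpers are hypothetical, the sketch only handles text to the left of the caret, and it assumes deleteLeft is counted in code points.

```typescript
// Local stand-ins for this sketch only; not the Keyman type definitions.
interface SketchTransform { insert: string; deleteLeft: number; }
interface SketchContext { left: string; }

// Returns the left context as it would look after the transform:
// drop deleteLeft code points from the end, then append the inserted text.
function applyTransformSketch(transform: SketchTransform, context: SketchContext): SketchContext {
  // Split into code points so a surrogate pair counts as one character.
  const codePoints = Array.from(context.left);
  const keep = Math.max(0, codePoints.length - transform.deleteLeft);
  return { left: codePoints.slice(0, keep).join('') + transform.insert };
}

// Checking the applied context means the 'teh' test no longer depends on
// how the text happens to be split between the context and the transform.
function endsWithTeh(transform: SketchTransform, context: SketchContext): boolean {
  return applyTransformSketch(transform, context).left.endsWith('teh');
}
```

With this helper, a transform that inserts 'h' after 'te' and a transform that deletes one character from 'tx' and inserts 'eh' both produce 'teh', so both would be recognised by the same check.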
Note: The lexical model compiler had a bug with compiling custom models in earlier versions of Keyman Developer, as this functionality was exposed but unused. This problem is corrected in Keyman Developer 18.0.249 with bug fix #15778. (As of writing, this bug fix is undergoing review and will be deployed shortly.)