One of the most important capabilities of the Kodexa Platform is creating and managing models. Models are the core of the platform and are used to extract data from documents. They are written in Python using the Kodexa SDK.
In its simplest form, a model is a small Python script that receives a Document and returns a Document. The model can be as simple as:
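As a minimal sketch of that shape, the snippet below defines a function that takes a document and hands it back. The function name `infer` is an assumption about what the model runtime calls, and `StandInDocument` is a hypothetical placeholder for the SDK's Document class so the example runs without the Kodexa SDK installed.

```python
from dataclasses import dataclass, field


@dataclass
class StandInDocument:
    """Hypothetical stand-in for the Kodexa SDK Document (illustration only)."""
    text: str = ""
    labels: list = field(default_factory=list)


def infer(document):
    """Receive a document, optionally change it, and return it.

    The name `infer` is an assumption, not confirmed by this page.
    """
    # A real model would extract data here; this sketch only records a label.
    document.labels.append("processed")
    return document


if __name__ == "__main__":
    doc = infer(StandInDocument(text="Hello, Kodexa"))
    print(doc.labels)
```

In a real model the stand-in class would be dropped and the function would work on the Document object supplied by the platform.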
You would put this code in a module, i.e.
To deploy the model, we also need to create a model.yaml file. This file describes the model and provides the metadata used to deploy it to the Kodexa Platform.
    # A very simple first model that isn't trainable
    slug: my-model
    version: 1.0.0
    orgSlug: kodexa
    type: store
    storeType: MODEL
    name: My Model
    metadata:
      atomic: true
      state: TRAINED
      modelRuntimeRef: kodexa/base-model-runtime
      type: model
      provider: Amazon Web Services
      providerUrl: https://aws.amazon.com/textract/
      contents:
        - model/*
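The `contents` entry lists which files are packaged with the model. As a rough illustration, and assuming the patterns behave like shell-style globs (this page does not confirm the matching rules), the sketch below shows which files in a hypothetical project the pattern `model/*` would pick up. The file names and the `select_contents` helper are invented for the example.

```python
from fnmatch import fnmatch


def select_contents(paths, patterns):
    """Return the paths that match at least one glob pattern (hypothetical helper)."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


# Hypothetical project layout, for illustration only.
files = ["model/__init__.py", "model/weights.bin", "README.md", "model.yaml"]

# Pattern taken from the `contents` list in model.yaml.
selected = select_contents(files, ["model/*"])

if __name__ == "__main__":
    print(selected)
```

Under that assumption, only the files inside the `model` directory are selected, and `model.yaml` itself is not part of the packaged contents.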