AWS Textract

aws-textract-model

Extracts data from forms and tables using OCR and machine learning

Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables.
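
To illustrate what "data from forms" looks like in practice, the sketch below pairs form keys with their values from a Textract-style response. It relies on the block structure Textract documents for form analysis: `KEY_VALUE_SET` blocks whose `VALUE` relationship points at the value block and whose `CHILD` relationships point at `WORD` blocks. The helper names `block_text` and `form_fields` and the `SAMPLE_RESPONSE` data are hypothetical and hand-written for illustration; this is not a description of how the aws-textract-model component is implemented.

```python
from typing import Dict, List


def block_text(block: dict, blocks_by_id: Dict[str, dict]) -> str:
    """Join the text of all WORD blocks that are CHILD-related to `block`."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(response: dict) -> Dict[str, str]:
    """Map each form key's text to its value's text (hypothetical helper)."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        key = block_text(block, blocks_by_id)
        value_parts: List[str] = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(block_text(blocks_by_id[value_id], blocks_by_id))
        fields[key] = " ".join(value_parts)
    return fields


# Hand-written sample in the shape of a Textract response; not real service output.
SAMPLE_RESPONSE = {
    "Blocks": [
        {
            "Id": "k1",
            "BlockType": "KEY_VALUE_SET",
            "EntityTypes": ["KEY"],
            "Relationships": [
                {"Type": "VALUE", "Ids": ["v1"]},
                {"Type": "CHILD", "Ids": ["w1"]},
            ],
        },
        {
            "Id": "v1",
            "BlockType": "KEY_VALUE_SET",
            "EntityTypes": ["VALUE"],
            "Relationships": [{"Type": "CHILD", "Ids": ["w2", "w3"]}],
        },
        {"Id": "w1", "BlockType": "WORD", "Text": "Name:"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Jane"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Doe"},
    ]
}

if __name__ == "__main__":
    print(form_fields(SAMPLE_RESPONSE))
```

Against a live service, a response of this shape would typically come from the boto3 Textract client's `analyze_document` call with `FeatureTypes=["FORMS", "TABLES"]`, which requires AWS credentials and is therefore left out of the sketch.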

You can list a component in the marketplace and specify whether it can be used as a template.

✅ Available in Marketplace

❌ Cannot be used as a template

What is a Model?

This component is a model: a type of store specialized for AI/ML model storage, holding both the model implementation and the results of training.

Models are a foundational part of Kodexa, and are used in many different ways. For example, a model can be used to classify documents, or to extract data from documents. Models can also be used to train other models.

Metadata

✅ Atomic Deployment (Recommended)

❌ Not trainable

Model Runtime

A model must reference the model runtime that it uses.

✅ kodexa/base-model-runtime