Introduction


Kodexa is a platform for building intelligent document processing pipelines. It provides a set of tools and services for building a pipeline that takes a document, extracts its content, and then processes that content to pull out the information you need.

It is built on a set of core principles:

  • Document Centric - Kodexa is built around the idea of a document. A document is a collection of connected content nodes. This common model allows you to build pipelines that extract content from a wide range of sources.

  • Pipeline Oriented - Kodexa is built around the idea of a pipeline. A pipeline is a series of steps executed on a document, so processing is expressed as a sequence of steps applied one after another.

  • Extensible - Kodexa is built to be extended. Because a pipeline is a series of steps executed on a document, it can be extended with further steps as your processing needs grow.

  • Label Driven - Kodexa focuses on labels. A label identifies a piece of content within a document, and the labeled content is then used to drive how the document is processed.
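
To make the principles above concrete, here is a minimal sketch in plain Python. It does not use the Kodexa API: ContentNode, Document, Pipeline, and label_invoice_numbers are hypothetical names chosen for illustration, and the real platform may model these ideas differently. It shows a document as a tree of connected content nodes, a pipeline as a series of steps run on that document, and a label used to pick out content.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class ContentNode:
    """Hypothetical content node: some content, child nodes, and labels."""
    content: str
    children: List["ContentNode"] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)

    def walk(self) -> Iterator["ContentNode"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


@dataclass
class Document:
    """Hypothetical document: a collection of connected content nodes."""
    root: ContentNode

    def labeled(self, label: str) -> List[ContentNode]:
        """Return every node that carries the given label."""
        return [node for node in self.root.walk() if label in node.labels]


# A step is any function that takes a document and returns a document.
Step = Callable[[Document], Document]


class Pipeline:
    """Hypothetical pipeline: a series of steps executed on a document."""

    def __init__(self, steps: List[Step]):
        self.steps = list(steps)

    def run(self, document: Document) -> Document:
        for step in self.steps:
            document = step(document)
        return document


def label_invoice_numbers(document: Document) -> Document:
    """Example step (an assumption): label nodes that look like invoice numbers."""
    for node in document.root.walk():
        if node.content.startswith("Invoice #"):
            node.labels.append("invoice_number")
    return document


doc = Document(
    ContentNode(
        "page",
        children=[
            ContentNode("Invoice #1042"),
            ContentNode("Thank you for your business"),
        ],
    )
)

pipeline = Pipeline([label_invoice_numbers])
result = pipeline.run(doc)
print([node.content for node in result.labeled("invoice_number")])
# prints ['Invoice #1042']
```

In this sketch, extending the pipeline means writing another function with the same signature and adding it to the list of steps. Later steps can call labeled() to work only on the content that earlier steps identified.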