Getting Started

This guide walks you through using the Kodexa UI (User Interface) to work with unstructured content. The UI is a web application in which you design projects and train assistants and models to extract structured data.

Let's get started!

Logging In

  1. Obtain your Kodexa account credentials from your system administrator.
  2. Using a web browser, go to the URL of your Kodexa instance (e.g. https://[company_name].kodexa.com).
  3. Enter your email address and password in the corresponding text fields and click the Login button.

If you are logging in for the first time, changing your password is highly recommended. If you have forgotten your password, ask your system administrator to reset it.

Creating an Organization

Before creating a Project, select an existing Organization or create a new one to hold it.

  1. Click the Create Organization button. (Alternatively, you can first check the list of existing Organizations accessible to you by clicking the Organization button.)
  2. Fill in your preferred Organization name. The slug is generated automatically as you type the name. Click the Create button.
  3. The Project Directory page for the newly created Organization opens. This page lists the projects created in that Organization.
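
Step 2 mentions that a slug is derived from the Organization name as you type. The exact rule Kodexa applies is not described here, so the following is only a minimal sketch of a common slug convention (lowercase ASCII letters and digits, with every other run of characters collapsed to a single hyphen). The function name `slugify` and the rule itself are assumptions for illustration, not the Kodexa implementation.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Hypothetical slug rule: lowercase ASCII, other character runs become one hyphen."""
    # Decompose accented characters and drop anything that is not plain ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of characters outside a-z / 0-9 with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")


print(slugify("Acme Invoices & Co."))  # prints "acme-invoices-co"
```

The slug shown in the UI may differ from this sketch, so treat the value displayed in the Create Organization form as authoritative.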

Setting up a Project

A project is a collection of components needed to extract data from documents.

  1. Click the Create Project button to open the Project Template Marketplace. Each template contains a set of components that you might need to extract data from your documents.
  2. You may search for a Project Template by typing in the filter field and/or clicking the corresponding group buttons. You can also start with an empty project.
  3. Click a Project Template to see additional information about it. Click the Use Template button to select the Project Template. Otherwise, you may click the Marketplace button to go back.
  4. Once you have decided to use a Project Template, enter a Project Name (required) and a Description (optional), then click the Create button.
  5. The newly created project will be listed on the Project Directory page. Click the project name to open it.
  6. You can still modify, delete and add components to your project on the Project page.

Training a Document for Data Extraction

To extract data from PDF documents, you first need to train Kodexa so that it knows which data to extract and how to structure it.

  1. Click the Training Store and upload a document that you would like to train on. You may close the Upload page or click anywhere outside the pop-up as soon as the upload starts.
  2. Click the name of the training file to start labelling the document. Depending on whether you are extracting data from tables, from forms, or from a combination of both, you may need to define the data structure and label the table markers and form markers in your document.
  3. Optionally, you can train and test to get a preview of the extraction.
  4. Once you are satisfied with the result of the test, assign the document as the training document and save the assistant and labels.

You are now ready to extract data from documents with similar formats.

Extracting the Data

  1. Go to the Processing Store of your project. If you are currently viewing the training document, click the "Return to Project" button to go to the Processing Store.
  2. Upload the documents from which the data should be extracted and wait for the processing to finish.
  3. Go to the extracted data store to view the extracted data.
  4. You may download the extracted data as an Excel file if your data structure is not nested. Alternatively, you can publish the results in different file formats using one of the Kodexa Publishing Assistants.
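
The last step distinguishes nested from non-nested data structures. As a rough illustration of that distinction only, the sketch below treats a record as flat when none of its field values is itself a dictionary or list, which is the shape that maps naturally onto a single spreadsheet sheet. The function name `is_flat` and the sample records are hypothetical and do not reflect how Kodexa represents extracted data internally.

```python
from typing import Any


def is_flat(records: list[dict[str, Any]]) -> bool:
    """Return True when no field value in any record is a dict or a list."""
    return all(
        not isinstance(value, (dict, list))
        for record in records
        for value in record.values()
    )


# Hypothetical flat structure: one row per document, scalar fields only.
flat = [{"invoice_no": "A-1", "total": 120.5}]

# Hypothetical nested structure: each invoice carries a list of line items.
nested = [{"invoice_no": "A-1", "line_items": [{"sku": "X", "qty": 2}]}]

print(is_flat(flat))    # prints True
print(is_flat(nested))  # prints False
```

In this sketch the second structure is the kind that would need a Publishing Assistant rather than the direct Excel download.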
