Overview

To make verification of model-extracted information more efficient, data should be presented in a flex grid view so that several data attributes can be verified on a single screen without scrolling up and down. Page navigation should let the user move from one batch of data to the next.

By default, this screen presents only the user-editable data, and the data fields are in an editable mode, as in Excel or Google Sheets, where you can move from one cell to another and make changes. Pressing Tab exits the current data entry and moves to the next one. Any changes made to a data entry should be autosaved after 5 seconds.
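The autosave rule can be sketched as a small buffer that tracks pending edits. This is only an illustration: the class and method names are hypothetical, and it assumes the 5 seconds are counted from the most recent change (the text does not say whether the timer restarts on each edit). Time is passed in explicitly so the logic can be checked without real timers.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

AUTOSAVE_DELAY_SECONDS = 5.0  # delay stated in the requirement


@dataclass
class AutosaveBuffer:
    """Hypothetical helper: collects cell edits and reports when to save."""
    pending: Dict[str, str] = field(default_factory=dict)
    last_change_at: Optional[float] = None

    def edit(self, cell_id: str, value: str, now: float) -> None:
        # Record the new value and restart the countdown (assumed behavior).
        self.pending[cell_id] = value
        self.last_change_at = now

    def flush_if_due(self, now: float) -> Dict[str, str]:
        # Return the edits to save once the delay has elapsed, else nothing.
        if self.last_change_at is None:
            return {}
        if now - self.last_change_at < AUTOSAVE_DELAY_SECONDS:
            return {}
        saved, self.pending = self.pending, {}
        self.last_change_at = None
        return saved
```

A UI would call `edit` on every cell change and `flush_if_due` on a periodic tick, sending whatever is returned to the backend.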

Making a data attribute user-editable means providing a configuration setting on its taxon, with a default of true. If the data must not be changed because it is provided by the system before processing, the setting must be false.

The taxon must also include a setting that controls whether the data attribute can receive multiple entries or values. Some data attributes cannot take more than one data entry (e.g. barcode). As such, the taxon configuration for multi-value should be set to false by default.
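As a rough illustration of the two taxon settings and their defaults, the sketch below models them in Python. The class and the `can_add_value` helper are hypothetical; in the taxon JSON shown later in this page the corresponding keys are `userEditable` and `multiValue`.

```python
from dataclasses import dataclass


@dataclass
class TaxonSettings:
    """Illustrative subset of a taxon's configuration (not the real model)."""
    name: str
    user_editable: bool = True   # default stated above: editable
    multi_value: bool = False    # default stated above: single value only


def can_add_value(taxon: TaxonSettings, existing_values: int) -> bool:
    """Return True if the user may add another value for this taxon."""
    if not taxon.user_editable:
        return False
    # Single-value taxons accept a value only when none exists yet.
    return taxon.multi_value or existing_values == 0
```

With these defaults, a barcode taxon accepts a first value but rejects a second one, while a system-provided taxon with `user_editable=False` rejects any change.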

Users should also be able to adjust the flex grid view just by changing a YAML file that determines which attributes are presented in one row, and whether that row has any children.

  • Data Forms allow us to define how we want to present a data structure to a user
  • They provide a structure that is configurable in metadata and can be rendered
    • Leverages the metadata from the data structure to determine control types, naming, etc.
    • Integrates with the attribute status and exception management from the platform
    • Supports audit trail, notes and more

High-Level Structure

The concept is that a data form sits on a Data Structure and uses a simple set of metadata to determine how best to present that information to the user.

Data Form Structure

The structure of a Data Form consists of two parts.

Data Form Providers

A data form can have one or more data sources. These are defined by metadata and act as the definition of how we will populate the data form. Keeping the data sources flexible allows us to support different types of data sources for different use cases. The following are examples:

  • A simple Data Object data source that allows us to build a data form to represent a single data object
  • A Data Attribute data source that returns a list of attributes (based on a query) rather than a data object
  • A Document data source that provides methods to allow you to get things like the status, assignee, etc. associated with the document
    • This would allow you to add cards to your view that will update the document, etc.

The data sources will be defined in the metadata. They are then exposed dynamically as API endpoints, which allows us to provide custom data source endpoints that meet complex query criteria (for example, getting all data attributes with a specific status in a specific order).

Moving the data work to the backend lets us build the view and card infrastructure to interact with these endpoints.

We are implementing the following available data sources:

| Name | Description |
| ---- | --- |
| dataObject | This data source will load a data object based on the ID that is provided to the form. This allows this type of data source to handle the form loading from a data object link in a view. |
| documentFamily | This data source will load up a list of data objects based on the ID of a document family that was provided. This allows a form to be built that is focused on loading up data from a document. |
| scriptableDataObject | This data source will allow you to load one or more data objects based upon a script that is provided with the data source metadata. |
| scriptableDataAttribute | This data source will allow you to load one or more data attributes based upon a script that is provided with the data source metadata. |

Interacting with the API for a Data Form

The following section outlines how we interact with a data form and data form sources in order to load the data into the UI to render.

Interacting with the API

The first interaction from the UI to the platform is to go through each data source under the data form (where the data form has been loaded as metadata) and perform a PUT request to the URL:

https://server/api/dataForms/{orgSlug}/{slug}/{version}/sources/{dataSourceId}

The body of this request should contain the parameters you have available. If you have a storeRef and a documentFamilyId or dataObjectId, send them in the body of the PUT above.

{ "storeRef": "org/example-store:1.0.0", "dataObjectId": "123123123123123213" }

The reason for this is that the response indicates whether this source can be used and also includes some metadata about the source.

{
  "sourceType": "DATA_OBJECT",
  "valid": true,
  "methods": [
    {
      "name": "dataAttributes",
      "parameters": [
        { "name": "storeRef", "required": true }
      ]
    }
  ]
}

The response returns enough information to determine whether the source can be used in the data form.
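A minimal client-side sketch of this first call, using only the Python standard library, might look as follows. The helper names are hypothetical, the request is prepared but not sent, and authentication is left out because this page does not describe it.

```python
import json
import urllib.request


def source_url(base_url, org_slug, slug, version, source_id, method=None):
    """Build the data source URL; append the method name when given."""
    url = f"{base_url}/api/dataForms/{org_slug}/{slug}/{version}/sources/{source_id}"
    return f"{url}/{method}" if method else url


def build_put(url, body):
    """Prepare (but do not send) a JSON PUT request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def usable_methods(source_metadata):
    """Return the method names of a source, or [] if it is not valid."""
    if not source_metadata.get("valid"):
        return []
    return [m["name"] for m in source_metadata.get("methods", [])]
```

Passing the prepared request to `urllib.request.urlopen` would send it. `usable_methods` can then be applied to the decoded response to decide which method endpoints the form may call.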

After getting this metadata, the data form in the UI is expected to load its data, based on the source type, from a method endpoint:

https://server/api/dataForms/{orgSlug}/{slug}/{version}/sources/{dataSourceId}/{method}

Remember that these endpoints are PUT methods, not GET methods, since you also need to PUT the parameters to those endpoints to get the data, as below.

The request to the method is sent as an exchange with just parameters:

{
  "parameters": {
    "storeRef": "org/8a8a843c81f0aeb60181f1011d1e02bd-extracted-data",
    "orgSlug": "org",
    "formSlug": "data-validation",
    "version": "1.0.0",
    "id": "b34170ac-f720-4dc2-9a29-c37f5c9de0c2",
    "projectId" : "aaaa"
  }
}

What you will receive back is that same exchange with the response included.

{
  "method": "attributesToValidate",
  "parameters": {
    "storeRef": "org/8a8a843c81f0aeb60181f1011d1e02bd-extracted-data",
    "orgSlug": "org",
    "formSlug": "data-validation",
    "version": "1.0.0",
    "id": "3d1ea52b-44e4-43d7-8ad1-7e43bad81e7c"
  },
  "payload": {
    "attributesToValidate": [
      {
        "id": "8a8a83df82a6b19e0182a6c9e09f150d",
        "uuid": "5253b43b67fa48a49b295f4a99c8f23f",
        "createdOn": "2022-08-16T13:13:44.381Z",
        "updatedOn": "2022-08-18T10:21:20.676Z",
        "dataObject": {
          "id": "8a8a83df82a6b19e0182a6c9e09f1503",
          "uuid": "a03f20cd9eb7434b80130dc8d1d0bbed",
          "createdOn": "2022-08-16T13:13:44.368Z",
          "updatedOn": "2022-08-16T13:13:44.368Z",
          "documentFamily": {
            "id": "8a8a83df82a6b19e0182a6c97de00d7e",
            "storeRef": "org/8a8a843c81f0aeb60181f1011d1e02bd-processing:1.0.0",
            "path": "2366e551-d1e8-476c-bbc2-bc6dea8b4d7b1.pdf"
          },
          "dataExceptions": [],
          "taxonomyRef": "org/96d46bb7-2acf-4a70-806c-78d7f6031398-taxonomy-template:1.0.0",
          "path": "Bill",
          "rowNum": 0,
          "sourceOrdering": "1ed1d654-1363-64ae-8c50-1b492166df3d000000000",
          "dateTime": "2022-08-16T13:13:43.482Z",
          "lineage": {
            "storeRef": "org/8a8a843c81f0aeb60181f1011d1e02bd-processing:1.0.0",
            "documentFamilyId": "8a8a83df82a6b19e0182a6c97de00d7e",
            "executionId": "8a8a83df82a6b19e0182a6c984850d9a",
            "contentObjectId": "8a8a83df82a6b19e0182a6c9dbd21413"
          },
          "storeRef": "org/8a8a843c81f0aeb60181f1011d1e02bd-extracted-data:1.0.0",
          "taxon": {
            "id": "5296aa51427e4e309b7678280a7aeae9",
            "label": "Bill",
            "generateName": false,
            "group": true,
            "name": "Bill",
            "externalName": "_ill",
            "valuePath": "VALUE_OR_ALL_CONTENT",
            "enableFallbackExpression": false,
            "nullable": false,
            "enabled": true,
            "color": "#e8cb07",

            "options": [],
            "relatedTaxons": [],
            "nodeTypes": [],
            "taxonType": "STRING",
            "selectionOptions": [],
            "typeFeatures": {},
            "path": "Bill",
            "multiValue": true,
            "userEditable": true,
            "usePostExpression": false
          }
        },
        "value": "2021-12-20",
        "truncated": false,
        "dataExceptions": [],
        "tagMetadata": [
          {
            "id": "8a8a83df82a6b19e0182a6c9e0a0150e",
            "uuid": "d1f8179dec414324be9a05fae1c4671f",
            "createdOn": "2022-08-16T13:13:44.382Z",
            "updatedOn": "2022-08-16T13:13:44.382Z",
            "metadata": {
              "scaledBoundingBox": [
                1.405085329802862,
                5.793177242967772,
                2.474,
                5.923665645006256
              ],
              "parentSelector": "//page",
              "parentIndex": 0
            }
          }
        ],
        "tag": "DueDate",
        "tagUuid": "2828d3d1-c35a-40d5-bb2b-c99ca45a44a2",
        "dateValue": "2021-12-20T00:00:00.000Z",
        "attributeStatus": {
          "id": "8a8a841182a1d9a60182a20564e50006",
          "uuid": "bb93f065e26049b59347ac911ea8e1af",
          "createdOn": "2022-08-15T15:00:38.803Z",
          "updatedOn": "2022-08-16T14:40:01.367Z",
          "color": "#8BFF00",
          "status": "In Mass Review",
          "statusType": "UNRESOLVED"
        },
        "validationState": "VALID",
        "validationMessages": [],
        "confidence": 1,
        "dataFeatures": {},
        "numberOfNotes": 0
      },
      ...
    ],
    "targetValue": "2021-12-20T00:00:00"
  }
}

Depending on the type of provider, the data for the form is returned in a payload, usually under a key matching the method name. The provider can also return other things in the payload.
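Reading the returned exchange can be sketched as below. The function names are hypothetical, and the sketch assumes the exchange has already been decoded from JSON into a Python dictionary shaped like the response above.

```python
def method_data(exchange, method_name=None):
    """Return the data stored under the method's key in the payload.

    Falls back to the exchange's own "method" field when no name is given.
    """
    name = method_name or exchange.get("method")
    payload = exchange.get("payload") or {}
    return payload.get(name, [])


def extra_payload(exchange):
    """Return payload entries other than the one keyed by the method name."""
    name = exchange.get("method")
    payload = exchange.get("payload") or {}
    return {k: v for k, v in payload.items() if k != name}
```

For the response above, `method_data` would return the list under `attributesToValidate`, and `extra_payload` would return the additional `targetValue` entry.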

When you want to work with the data and then return it, you can define a button that pushes the content back to the method.

- type: button
  properties:
    color: blue
    label: Update & Next
    action: executeProviderMethod
    sourceId: 23894u238o471
    method: attributesToValidate
    payload: dataAttributes

This button picks up the data attributes from the form, places them in the payload value “attributesToValidate”, and sends that to the method again. In essence, it returns the exchange object again, now with the updated data attributes from the form.

Whether or not the provider is sending data for the first time, the same script is always called.
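The push-back step can be sketched as a function that rebuilds the exchange with the edited attributes. The function name is hypothetical and the real UI may assemble the exchange differently. The sketch only illustrates that the same exchange shape is sent back with the payload key replaced.

```python
import copy


def exchange_for_update(exchange, method, updated_attributes):
    """Return a copy of the exchange carrying the edited attributes.

    The original exchange is left untouched so the caller can still compare
    against the values it first received.
    """
    updated = copy.deepcopy(exchange)
    payload = updated.get("payload") or {}
    payload[method] = updated_attributes  # e.g. "attributesToValidate"
    updated["method"] = method
    updated["payload"] = payload
    return updated
```

The resulting dictionary would then be sent with a PUT to the same method endpoint that returned the original exchange.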

In order to see how the data form is rendered, refer to the documentation on Cards.

Where do we get to a Data Form?

The location from which you can access a data form depends on the types of data source available for that data form. You could have more than one source available, allowing the form to be accessed from more than one location.

Data Forms in the UI

When the data source is configured, it receives parameters based on how the data form was opened.

| How Data Form is Opened | Parameters |
| --- | --- |
| Opened from a row in a Data Store | storeRef: the reference for the data store; dataObjectId: the data object ID that was selected |
| Opened from a row in a document store (a document family) | storeRef: the reference for the document store; documentFamilyId: the ID of the document family that was selected |
| Opened from a Data Store | storeRef: the reference for the data store |


This means that when we first start up the data sources and have the properties for them, we pass these properties back to the data form data source to determine if it can be used.
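The validity decision is made by the platform, but a client-side pre-check could be sketched as follows. The required-parameter mapping is an assumption inferred from the source descriptions on this page (a dataObject source needs a data object ID, a documentFamily source needs a document family ID). It is not a documented contract, and the scriptable sources are left without requirements because their needs depend on the script.

```python
# Assumed mapping, inferred from the source descriptions in this page.
REQUIRED_PARAMETERS = {
    "dataObject": {"storeRef", "dataObjectId"},
    "documentFamily": {"storeRef", "documentFamilyId"},
}


def missing_parameters(source_type, parameters):
    """Return the sorted names of required parameters that were not supplied."""
    required = REQUIRED_PARAMETERS.get(source_type, set())
    supplied = {key for key, value in parameters.items() if value}
    return sorted(required - supplied)
```

An empty result means the source is worth asking the platform about. A non-empty result names what is missing, such as `documentFamilyId` when the form was opened from a data store.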

These are new APIs that exist on the platform for data forms.

| Endpoint | Purpose |
| --- | --- |
| /api/dataForms/{orgSlug}/{slug}/{version}/sources/{sourceId} | This URL is a PUT method to which you pass the properties you received when the form was opened (properties that were passed based on the way you opened the data form). It returns data source definition metadata, which tells the UI what is available from the data source. The metadata includes: whether the source returns data objects or data attributes; what methods are available for the source; and whether the data source is valid based upon the provided properties (for example, if you didn’t provide a documentFamilyId, then the documentFamilySource would not be valid). |
| /api/dataForms/{orgSlug}/{slug}/{version}/sources/{sourceId}/{methodName} | This URL is a PUT |


This is also implemented with the entry point on the data form:
orgSlug: org
slug: data-validation
version: 1.0.0
type: dataForm
name: Data Validation
icon: mdi-archive-search
entrypoints:
  - dataStore

Available entry points are:

| Entry Points | Description |
| --- | --- |
| documentFamily | Open the form for a specific document |
| documentStore | Dropdown in the document store (i.e. processing) |
| dataStore | Dropdown in the data store (i.e. extracted data) |
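As an illustration of how a UI might use the `entrypoints` list, the sketch below filters loaded data form metadata by entry point. The function name is hypothetical, and it assumes each form's metadata has been parsed into a dictionary with the keys shown in the YAML above.

```python
def forms_for_entry_point(forms, entry_point):
    """Return the names of data forms that declare the given entry point."""
    return [
        form["name"]
        for form in forms
        if entry_point in form.get("entrypoints", [])
    ]


# Metadata mirroring the data form definition shown above.
data_validation = {
    "orgSlug": "org",
    "slug": "data-validation",
    "version": "1.0.0",
    "type": "dataForm",
    "name": "Data Validation",
    "entrypoints": ["dataStore"],
}
```

Called with `[data_validation]` and `"dataStore"`, the function returns the form's name, so it would be offered in the data store dropdown. For `"documentStore"` it returns an empty list.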