Documentation

Learn about the underlying framework behind Kodexa

Kodexa comprises a core framework and a surrounding set of extensions. These extensions let developers work with documents and unstructured data in a flexible, extensible way, and they can be deployed on a range of platforms. The framework's structure draws on standard approaches from data engineering and data science, and on our experience labeling data for use by machine learning algorithms.

The framework is focused on the following areas:

  • Content Normalization and Classification

  • Labeling (Tagging), Feature Engineering and Content Organization

  • Data Extraction and Document Processing

  • Deployment in Batch or as a Service

Before we jump into the details of how we approach each of these areas in the platform, let's explain the core of Kodexa.
