
Tecton Framework

Tecton makes building operational ML data flows and consuming ML data as easy as possible. Tecton's framework has two APIs: a Declarative API for composing data pipelines and an interactive Read API for consuming features.

Declarative APIs

Feature pipelines are composed using Tecton's declarative framework, which is used to define objects such as Feature Views, Data Sources, and Feature Services.

Read APIs

Tecton's Read APIs are used to access feature values online for model serving or offline for model training.

The end-to-end example illustrated here can be found in our GitHub sample repo.

Declarative Pipeline Composition

Defining Feature Pipelines

Tecton's framework is designed for you to express ML data flows. There are five important Tecton objects:

  1. Data Sources: Data sources define a connection to a batch, stream, or request data source (i.e. request-time parameters) and are used as inputs to feature pipelines, known as "Feature Views" in Tecton.
  2. Feature Views: Feature Views take in data sources as inputs, or in some cases other Feature Views, and define a pipeline of transformations to compute one or more features. Feature Views also provide Tecton with additional information such as metadata and orchestration, serving, and monitoring configurations. There are many types of Feature Views, each designed to support a common data flow pattern.
  3. Transformations: Each Feature View has a single pipeline of transformations that define the computation of one or more features. Transformations can be modularized and stitched together into a pipeline.
  4. Entities: An Entity is an object or concept that can be modeled and that has features associated with it. Examples include User, Ad, Product, and Product Category. In Tecton, every Feature View is associated with one or more entities.
  5. Feature Services: A Feature Service represents a set of features that power a model. Typically there is one Feature Service for each version of a model. Feature Services provide convenient endpoints for fetching training data through the Tecton SDK or fetching real-time feature vectors from Tecton's REST API.

In practice, composing pipelines with Tecton means connecting Data Sources to Feature Views to Feature Services.

Tecton objects are declared in Python. We recommend managing your source files using Git as a source of truth for your feature pipelines.
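As a rough illustration, a minimal pipeline connecting a Data Source to a Feature View to a Feature Service might look like the sketch below. The table, entity, and service names are invented for this example, and exact constructor signatures vary by Tecton SDK version, so treat this as a shape, not a copy-paste recipe:

```python
from datetime import datetime, timedelta

from tecton import (
    BatchSource,
    Entity,
    FeatureService,
    HiveConfig,
    batch_feature_view,
)

# Data Source: a connection to a (hypothetical) batch table of transactions.
transactions = BatchSource(
    name="transactions",
    batch_config=HiveConfig(database="fraud", table="transactions"),
)

# Entity: the object or concept the features describe.
user = Entity(name="user", join_keys=["user_id"])

# Feature View: a pipeline of transformations over the Data Source.
@batch_feature_view(
    sources=[transactions],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2023, 1, 1),
)
def user_transaction_count(transactions):
    return f"""
        SELECT user_id, COUNT(*) AS transaction_count, timestamp
        FROM {transactions}
        GROUP BY user_id, timestamp
    """

# Feature Service: the set of features powering one version of a model.
fraud_model_v1 = FeatureService(
    name="fraud_model_v1",
    features=[user_transaction_count],
)
```

Applying these definitions registers the pipeline with Tecton, which then handles orchestration, serving, and monitoring as described below.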

Operational Feature Pipelines

Once feature data pipelines are defined, Tecton orchestrates the operational tasks required to run these data pipelines and serve features. This includes:

  • Materialization: orchestrating transformations and writing computed feature values to Tecton's online and offline stores
  • Point-in-time correctness: ensuring that future signals do not leak into training datasets, which preserves the accuracy of model training and avoids training/serving skew
  • Monitoring: tracking your data flow pipelines and triggering alerts in case of incidents
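The point-in-time correctness guarantee can be illustrated with a small self-contained sketch (the data and join logic here are purely illustrative, not Tecton's implementation): for each training event, only the latest feature value effective at or before the event's timestamp is joined in, never a later one.

```python
from bisect import bisect_right
from collections import defaultdict

# Feature values as (entity_id, effective_timestamp, value) tuples;
# illustrative data, not Tecton's internal representation.
feature_rows = [
    ("user_1", 10, 0.2),
    ("user_1", 20, 0.5),
    ("user_1", 30, 0.9),
]

# Training events: (entity_id, event_timestamp)
events = [("user_1", 15), ("user_1", 25)]

# Index feature values per entity, sorted by effective timestamp.
by_entity = defaultdict(list)
for entity, ts, value in sorted(feature_rows, key=lambda r: r[1]):
    by_entity[entity].append((ts, value))

def point_in_time_value(entity, event_ts):
    """Return the latest feature value effective at or before event_ts."""
    rows = by_entity[entity]
    timestamps = [ts for ts, _ in rows]
    i = bisect_right(timestamps, event_ts)
    return rows[i - 1][1] if i else None

training_rows = [(e, ts, point_in_time_value(e, ts)) for e, ts in events]
# The event at t=15 sees the value written at t=10 (0.2), never the
# future value written at t=20; using that value would be leakage.
```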

Consuming Features through Feature Serving APIs

Tecton has online and offline APIs for reading feature data.

Online API

The Online API is useful for applications that need up-to-date feature values in real time in production.

  • REST API: serves feature vectors at low latency (e.g. /api/v1/feature-service/get-features)
  • Python SDK: each Feature View and Feature Service has a get_online_features method that wraps the REST API
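As a sketch, fetching a real-time feature vector through the Python SDK looks roughly like the following. The workspace and service names are hypothetical, and the call requires a live Tecton deployment, so this will not run standalone:

```python
import tecton

# Hypothetical workspace and service names; replace with your own.
ws = tecton.get_workspace("prod")
service = ws.get_feature_service("fraud_detection_service")

# Fetch the current feature vector for one entity (wraps the REST API).
vector = service.get_online_features(join_keys={"user_id": "user_123"})
print(vector.to_dict())
```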

More information can be found in the Real-time Features Guide.

Offline Training API

The Offline API is useful for training models using historical data with point-in-time correctness.

  • Python SDK: each Feature View and Feature Service has a get_historical_features method for training data generation.
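As a sketch (again with hypothetical names, and requiring a live Tecton deployment), generating training data typically means passing a "spine" of entity keys and event timestamps to get_historical_features, which joins point-in-time-correct feature values onto each event:

```python
import pandas as pd
import tecton

# Hypothetical workspace and service names; replace with your own.
ws = tecton.get_workspace("prod")
service = ws.get_feature_service("fraud_detection_service")

# A "spine" of training events: entity join keys plus event timestamps.
spine = pd.DataFrame(
    {
        "user_id": ["user_123", "user_456"],
        "timestamp": pd.to_datetime(["2023-01-15", "2023-02-01"]),
    }
)

# Each row gets the feature values effective as of its timestamp.
training_df = service.get_historical_features(spine).to_pandas()
```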

More information can be found in the Training Data Guide.

Next: Learn about Tecton's Platform →