Monitoring Materialization

Overview

If feature processing jobs begin to fail, Tecton can begin to serve stale or inaccurate data. To ensure that feature processing jobs stay healthy, Tecton offers monitoring, alerting and debugging tools.

For a practical example of debugging a materialization alert, see Example: Debugging Materialization Alerts.

Setting Up Alerts

Tecton can automatically generate materialization health alerts and online store feature freshness alerts that are sent to a specified email address. See Types of Alerts for more details.

Note

It is highly recommended that you set an alert email for each FeatureView that is consumed in production.

To configure alerts, set the monitoring parameter when defining a FeatureView in your Feature Repository. A MonitoringConfig object configures alert thresholds and feature freshness expectations.

@batch_feature_view(
  ...
  monitoring = MonitoringConfig(
      monitor_freshness=True,
      expected_feature_freshness="2w",
      alert_email="kanye@tecton.ai"
  )
)
def my_feature_view(inputs):
  ...

  • monitor_freshness: Set this to False to suppress online store freshness-related alerts.
  • expected_feature_freshness: The freshness threshold used for alerting. Set a larger value to decrease the sensitivity of freshness alerts. See Default Expected Feature Freshness for details about the default value if this field is unspecified.
  • alert_email: The email address that receives alerts.

Debugging Tools

Tecton lets you monitor and debug production Feature Views from each of its interfaces: the Web UI, the SDK, and the CLI.

Web UI: Health Overview

The easiest way to check the health of a materialized FeatureView is through the Web UI. Navigate to the FeatureView in question and switch to the “Materialization” tab to see Feature View materialization diagnostics at a glance.

SDK: FeatureView Materialization Status

The Tecton SDK provides the FeatureView.materialization_status() method, which displays details about materialization attempts, including failed ones.

In the SDK and Web UI, Tecton provides a link to the auto-generated job that was used to compute feature values. This job link can be used to view the underlying error that caused a materialization job to fail.

To view this job, click on the Job status in the materialization table in the Web UI. This link is also available in the SDK materialization_status() method, and the tecton materialization-status command in the CLI.

[Figure: Monitoring Materialization 1]

This link will open a page in your Spark processing engine where you will be able to see the job failure. In the example below, we show a spot failure in Databricks:

[Figure: Monitoring Materialization 2]

CLI: Cluster Overview and Status

Tecton provides the ability to view the status of all Feature Views in a cluster using the tecton freshness CLI command.

$ tecton freshness
           Feature View               Stale?   Freshness   Expected Freshness     Created At
=================================================================================================
partner_ctr_performance:14d              Y        2wk 1d      2d                   12/02/20 10:52
ad_group_ctr_performance                 N        1h 1m       2h                   11/28/20 19:50
user_ad_impression_counts                N        1m 35s      2h                   10/01/20 2:16
content_keyword_ctr_performance:v2       N        1m 36s      2h                   09/04/20 22:22
content_keyword_ctr_performance          N        1m 37s      2h                   08/26/20 12:52
user_total_ad_frequency_counts           N        1m 38s      2h                   08/26/20 12:52
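
The Stale? column in the sample above is consistent with a simple comparison of Freshness against Expected Freshness. The following is a minimal sketch of that comparison, not Tecton code: the helper names `parse_duration` and `is_stale` are hypothetical, and the duration format (`wk`, `d`, `h`, `m`, `s`) is assumed from the sample rows only.

```python
import re
from datetime import timedelta

# Unit suffixes seen in the sample output ("wk", "d", "h", "m", "s"), plus "w"
# as used in the MonitoringConfig example. This mapping is an assumption based
# on those samples, not a documented format.
_UNIT_SECONDS = {"wk": 604800, "w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)(wk|w|d|h|m|s)")


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '2wk 1d' or '1m 35s' into a timedelta."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty duration")
    total_seconds = 0
    for token in tokens:
        match = _TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized duration token: {token!r}")
        total_seconds += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    return timedelta(seconds=total_seconds)


def is_stale(freshness: str, expected: str) -> bool:
    """Return True when the observed freshness exceeds the expected freshness."""
    return parse_duration(freshness) > parse_duration(expected)


# (feature view, freshness, expected freshness) copied from the sample output.
rows = [
    ("partner_ctr_performance:14d", "2wk 1d", "2d"),
    ("ad_group_ctr_performance", "1h 1m", "2h"),
    ("user_ad_impression_counts", "1m 35s", "2h"),
]
stale = [name for name, fresh, expected in rows if is_stale(fresh, expected)]
print(stale)  # ['partner_ctr_performance:14d']
```

Only partner_ctr_performance:14d is reported, matching the single Y in the sample output.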

You can also use the tecton materialization-status $FV_NAME command to see the materialization status of a specific FeatureView.

$ tecton materialization-status my_feature_view
All the displayed times are in UTC time zone
TYPE     WINDOW_START_TIME      WINDOW_END_TIME     STATUS    ATTEMPT_NUMBER     JOB_CREATED_AT      JOB_LOGS
================================================================================================================
BATCH   2020-12-15 00:00:00   2020-12-22 00:00:00   SUCCESS         1          2020-12-22 00:00:27   https://...
BATCH   2020-12-14 00:00:00   2020-12-21 00:00:00   SUCCESS         1          2020-12-21 00:00:14   https://...
BATCH   2020-12-13 00:00:00   2020-12-20 00:00:00   SUCCESS         1          2020-12-20 00:00:13   https://...
BATCH   2020-12-12 00:00:00   2020-12-19 00:00:00   SUCCESS         1          2020-12-19 00:00:10   https://...
BATCH   2020-12-11 00:00:00   2020-12-18 00:00:00   SUCCESS         1          2020-12-18 00:00:06   https://...

Default Expected Feature Freshness

By default, a Feature View's freshness is expected to be less than twice its materialization schedule. Alerts fire once this threshold, plus a small grace period, is crossed. For streaming Feature Views, freshness can be configured as low as 30 minutes. The grace period's duration depends on the FeatureView's materialization schedule:

Schedule          Grace Period
<= 10 minutes     30 minutes
<= 30 minutes     90 minutes
<= 1 hour         2 hours
<= 4 hours        4 hours
<= 24 hours       12 hours
> 24 hours        24 hours

The table below has examples of materialization schedules mapped to default alert thresholds:

Schedule          Default Alert Threshold
5 minutes         40 minutes
30 minutes        2.5 hours
1 hour            4 hours
4 hours           12 hours
24 hours          60 hours
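
The rule described in this section (twice the materialization schedule, plus the grace period for that schedule) can be sketched in a few lines of Python. This is an illustration of the arithmetic only. The names `grace_period` and `default_alert_threshold` are hypothetical and are not part of the Tecton SDK.

```python
from datetime import timedelta

# (upper bound on the schedule, grace period), copied from the grace period
# table in this section. Hypothetical helper data, not a Tecton API.
_GRACE_PERIODS = [
    (timedelta(minutes=10), timedelta(minutes=30)),
    (timedelta(minutes=30), timedelta(minutes=90)),
    (timedelta(hours=1), timedelta(hours=2)),
    (timedelta(hours=4), timedelta(hours=4)),
    (timedelta(hours=24), timedelta(hours=12)),
]
_MAX_GRACE = timedelta(hours=24)  # applies to schedules longer than 24 hours


def grace_period(schedule: timedelta) -> timedelta:
    """Look up the grace period for a materialization schedule."""
    for upper_bound, grace in _GRACE_PERIODS:
        if schedule <= upper_bound:
            return grace
    return _MAX_GRACE


def default_alert_threshold(schedule: timedelta) -> timedelta:
    """Twice the schedule (expected freshness) plus the grace period."""
    return 2 * schedule + grace_period(schedule)


for schedule in (
    timedelta(minutes=5),
    timedelta(hours=1),
    timedelta(hours=4),
    timedelta(hours=24),
):
    print(schedule, "->", default_alert_threshold(schedule))
```

For schedules of 5 minutes, 1 hour, 4 hours, and 24 hours, this yields thresholds of 40 minutes, 4 hours, 12 hours, and 60 hours respectively.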