On-Demand Feature View
OnDemandFeatureViews are used for simple transformations that are executed in real time at feature request time. They allow you to:
- Calculate features based on information only available at request time, such as the amount of the current transaction; and
- Calculate combinations of other feature values, such as the amount of the current transaction compared to the 7-day average transaction amount.
OnDemandFeatureView stands in contrast to all other feature views (such as BatchFeatureView), which precompute feature values and store them in the offline and/or online feature store.
Use an OnDemandFeatureView if:
- your use case requires real-time fresh features that need to process data that is only available right at the time of your real-time prediction
- the latency introduced by the complexity of your on-demand transformation is acceptable for your use case (example: if your on-demand transformation executes a sleep("1second") statement, the execution of this transformation won't be any faster than 1 second)
- precomputing your feature values would be a waste of storage or compute resources, because you're not expecting to actually use all precomputed feature values in production, or because precomputing all possible feature combinations would be intractable
Common examples include:
- Turning a user's GPS coordinates into a geohash
- Parsing a user's search string
- Checking if a user's incoming transaction is larger than the user's average transaction amount over the last 30 days
- Picking the maximum transaction of the past 10 transactions of a user (if combined with a
- Computing the cosine similarity between a precomputed user embedding and a query embedding.
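As an illustration of the last item in the list above, the following is a minimal Python-native sketch of a cosine similarity computation. The function name and the zero-vector fallback are assumptions made for this example; it is plain Python and does not use any Tecton API.

```python
import math
from typing import Sequence


def cosine_similarity(user_embedding: Sequence[float],
                      query_embedding: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(user_embedding) != len(query_embedding):
        raise ValueError("embeddings must have the same length")

    dot = sum(u * q for u, q in zip(user_embedding, query_embedding))
    norm_u = math.sqrt(sum(u * u for u in user_embedding))
    norm_q = math.sqrt(sum(q * q for q in query_embedding))

    if norm_u == 0.0 or norm_q == 0.0:
        # Assumption of this sketch: return a defined value instead of
        # dividing by zero when either vector has zero length.
        return 0.0
    return dot / (norm_u * norm_q)
```

In this sketch the user embedding would come from a precomputed feature and the query embedding from the request, which is why the computation can only happen at request time.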
OnDemandFeatureView transformations can be expressed as Python code. Support for other languages (Rust and Java) is on the roadmap. Dependencies such as pip packages can be installed at a global (but not per-FeatureView) level.
For more examples, see Examples.
Feature with no dependencies
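A minimal sketch of the transformation logic for a feature that depends only on request-time data, written as plain Python. The function name, the `amount` request field, and the threshold value are hypothetical; the Tecton declaration that would wrap this logic is not shown.

```python
from typing import Dict

# Hypothetical threshold, chosen only for illustration.
HIGH_AMOUNT_THRESHOLD = 10_000.0


def transaction_amount_is_high(request: Dict[str, float]) -> Dict[str, int]:
    """Derive a feature from request-time data alone."""
    amount = request["amount"]
    # Return 1 or 0 rather than a possibly missing value, so the output
    # is always defined.
    return {"transaction_amount_is_high": int(amount > HIGH_AMOUNT_THRESHOLD)}
```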
Feature with precomputed dependencies
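A minimal sketch of the transformation logic for a feature that combines request-time data with a precomputed feature value, again as plain Python. The names `amount` and `amount_mean_7d` and the fallback value of 0.0 are assumptions made for this example, not part of any Tecton API.

```python
from typing import Dict, Optional


def amount_vs_7d_average(request: Dict[str, float],
                         precomputed: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Compare the current transaction amount to a precomputed 7-day average."""
    amount = request["amount"]
    average = precomputed.get("amount_mean_7d")

    if not average:
        # Assumption of this sketch: when the average is missing or zero,
        # fall back to 0.0 so that the output is never null.
        return {"amount_to_7d_average_ratio": 0.0}
    return {"amount_to_7d_average_ratio": amount / average}
```

The fallback branch matters because a precomputed value may be absent for a new user, while the output of the transformation still has to be a concrete value.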
See the API reference for the full list of parameters.
In your feature repository, the RequestDataSource defines the schema your OnDemandFeatureView will expect for request time data.
To configure a RequestDataSource, you'll need to first create a Spark StructType that defines the type for each input parameter.
An OnDemandFeatureView requires a defined output schema, similar to the RequestDataSource. Tecton uses the schema to display the FeatureView's expected output in the web UI.
Note: Outputs from an OnDemandFeatureView must be non-null, even if the output schema declares
Transformations for an OnDemandFeatureView work the same as in other Feature Views, except they must be written in Python with
See how to use an On Demand Feature View in a notebook here.
How it works
While other features are precomputed and saved in the online store, the OnDemandFeatureView transformation is executed in the Tecton service when you request a feature vector online. Inputs to the pipeline can be a RequestDataSource included in the request, or the output of other features. OnDemandFeatureViews cannot access data from your batch or stream data sources.
Because an OnDemandFeatureView is run at request time, you can only use Python-native or pandas-based transformations. To guarantee online/offline consistency, Tecton will automatically package your transformation as a Spark UDF when you generate historical feature values offline.
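The idea behind online/offline consistency can be sketched in plain Python: one transformation function is the single source of truth, and both the online path and the offline path call it. This is only an illustration of the principle. It does not show how Tecton packages the transformation as a Spark UDF, and all names below are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, float]


def amount_is_large(row: Row) -> Dict[str, int]:
    """The single transformation shared by the online and offline paths."""
    return {"amount_is_large": int(row["amount"] > row["amount_mean_7d"])}


def compute_online(request_row: Row) -> Dict[str, int]:
    # Online: one row, evaluated when a feature vector is requested.
    return amount_is_large(request_row)


def compute_offline(historical_rows: List[Row]) -> List[Dict[str, int]]:
    # Offline: the same function applied to every historical row.
    return [amount_is_large(row) for row in historical_rows]
```

Because both paths delegate to the same function, a given input row produces the same feature value whether it is served online or generated for training data.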