Correlated pattern definitions and Snowflake’s MATCH_RECOGNIZE
MATCH_RECOGNIZE is a powerful SQL clause that can be used to find patterns in data.
We will use MATCH_RECOGNIZE to find the Orders where the Customer assignment changed and was reverted back. The Order can be assigned to several different Customer before being reverted back to the Original Customer. Here is the raw data:
ORDER_NUM | CUSTOMER | LOAD_DATE |
---|---|---|
111 | aaa | 2023-02-09 04:49:41.335 |
111 | bbb | 2023-02-09 04:49:42.338 |
111 | aaa | 2023-02-09 04:49:43.278 |
222 | aaa | 2023-02-09 04:49:44.213 |
222 | bbb | 2023-02-09 04:49:45.254 |
333 | aaa | 2023-02-09 04:49:46.334 |
333 | bbb | 2023-02-09 04:49:47.101 |
333 | ccc | 2023-02-09 04:49:48.196 |
In the above data, you will notice that Order 111
was initially assigned to Customer aaa
, changed to Customer bbb
, and then was reverted back to Customer aaa
.
The following Oracle MATCH_RECOGNIZE query can be used to identify Order 111
select * from order_customer
match_recognize(
partition by order_number
order by load_date
one row per match
pattern (init modified+ reversed)
define
init as customer_id = customer_id,
modified as customer_id <> init.customer_id,
reversed as customer_id = init.customer_id
);
However, when I tried to port it to Snowflake, it turned out that Snowflake currently does not support Correlated pattern definition in MATCH_RECOGNIZE
.
Snowflake will throw the following error message:
ERROR: Unsupported feature 'Correlated pattern definitions'.
This query can be re-written for Snowflake using the FIRST_VALUE()
function as following:
select * from order_customer
match_recognize(
partition by order_number
order by load_date
one row per match
pattern (init modified+ init)
define
init as customer_id = first_value(customer_id),
modified as customer_id <> first_value(customer_id)
);
Generate Raw Data
create or replace table order_customer (order_number number, customer_id varchar(80), load_date timestamp);
insert into order_customer values (111, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (111, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (111, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (222, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (222, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'ccc', CURRENT_TIMESTAMP);
select * from order_customer;
Screenshots
See also:
- A/B Analysis on Streaming Data using MATCH_RECOGNIZE
- Applied overview of MATCH_RECOGNIZE clause
- Correlated pattern definitions and Snowflake’s MATCH_RECOGNIZE
- How to Predict Customer Churn Using SQL Pattern Detection - MATCH_RECOGNIZE
- Association Rule Mining using MATCH_RECOGNIZE
- MATCH_RECOGNIZE
- Stock Analysis using MATCH_RECOGNIZE
- MATCH_RECOGNIZE and Data Vault Effectivity Satellite - Part II
- MATCH_RECOGNIZE and Data Vault Effectivity Satellite
- What’s eating up your Snowflake Virtual Warehouse - Part II featuring MATCH_RECOGNIZE