Comparison Monitor Reference
Detailed reference for building createComparisonMonitorMac tool calls.
When to Use
Use a comparison monitor when the user wants to:
- Compare data between two tables (e.g., source vs target, dev vs prod)
- Validate data consistency after migration or replication
- Check row count parity across environments
- Compare field-level metrics between tables (null counts, sums, distributions)
Pre-Step: Verify Both Tables and Fields
Before constructing alert conditions, you MUST verify that both tables exist and that any referenced fields are real columns. This is the most common source of comparison monitor failures.
- Resolve both MCONs. Use search to find the source and target tables. If the user provided database:schema.table format, search for each to get the MCON.
- Get full schemas. Call getTable with include_fields: true on BOTH the source table and the target table. You need the column lists from both.
- For field-level metrics, verify fields exist on both sides. Confirm that sourceField exists in the source table's column list AND targetField exists in the target table's column list. Field names are case-sensitive on most warehouses.
- Check field type compatibility. The metric must be compatible with the column types on both sides. For example, NUMERIC_MEAN requires numeric columns in both the source and target tables. If the source column is numeric but the target is a string, the comparison will fail.
- If any field does not exist or types are incompatible, stop and ask the user to clarify. Do not guess.
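The field checks above can be sketched in Python. This is a minimal sketch, assuming the column lists returned by getTable have already been reduced to plain name-to-type dictionaries. The helper name check_field_pair, the NUMERIC_TYPES set, and the type strings are hypothetical and not part of the tool's API.

```python
# Hypothetical helper: the column dicts stand in for the field lists
# returned by getTable with include_fields: true (their shape is assumed).
NUMERIC_TYPES = {"int", "bigint", "float", "double", "decimal", "numeric"}


def check_field_pair(source_cols, target_cols, source_field, target_field,
                     needs_numeric=False):
    """Return a list of problems; an empty list means the pair looks usable."""
    problems = []
    # Exact, case-sensitive lookup on each side.
    if source_field not in source_cols:
        problems.append(f"source field {source_field!r} not found")
    if target_field not in target_cols:
        problems.append(f"target field {target_field!r} not found")
    if problems or not needs_numeric:
        return problems
    # For metrics such as NUMERIC_MEAN, both sides must be numeric.
    for side, cols, field in (("source", source_cols, source_field),
                              ("target", target_cols, target_field)):
        if cols[field].lower() not in NUMERIC_TYPES:
            problems.append(f"{side} field {field!r} is {cols[field]}, not numeric")
    return problems


source = {"revenue": "DECIMAL", "email": "VARCHAR"}
target = {"total_revenue": "VARCHAR", "email": "VARCHAR"}
print(check_field_pair(source, target, "revenue", "total_revenue", needs_numeric=True))
```

A non-empty result is the signal to stop and ask the user rather than guess.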
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| name | string | Unique identifier for the monitor. Use a descriptive slug (e.g., orders_dev_prod_compare). |
| description | string | Human-readable description of what the monitor checks. |
| source_table | string | Source table MCON (preferred) or database:schema.table format. If not MCON, also pass source_warehouse. |
| target_table | string | Target table MCON (preferred) or database:schema.table format. If not MCON, also pass target_warehouse. |
| alert_conditions | array | List of comparison conditions (see Alert Conditions below). |
Optional Parameters
| Parameter | Type | Description |
|---|---|---|
| source_warehouse | string | Warehouse name or UUID for the source table. Required if source_table is not an MCON. |
| target_warehouse | string | Warehouse name or UUID for the target table. Required if target_table is not an MCON. |
| segment_fields | array of string | Fields to segment the comparison by. Must exist in BOTH tables with the same name. |
| domain_id | string (uuid) | Domain UUID (use getDomains to list). Only one domain can be assigned per monitor. |
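The segment_fields constraint (each field must exist under the same name in both tables) can be checked before submitting. The sketch below assumes the column names of each table are already available as sets; missing_segment_fields is a hypothetical helper name.

```python
def missing_segment_fields(segment_fields, source_cols, target_cols):
    """Map each segment field to the sides ("source", "target") that lack it."""
    missing = {}
    for field in segment_fields:
        sides = [side
                 for side, cols in (("source", source_cols), ("target", target_cols))
                 if field not in cols]
        if sides:
            missing[field] = sides
    return missing


source_columns = {"country", "email"}
target_columns = {"country", "region", "email"}
print(missing_segment_fields(["country", "region"], source_columns, target_columns))
```

An empty dictionary means every segment field is present on both sides.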
Cross-Warehouse Comparisons
When the source and target tables live in different warehouses (e.g., comparing a Snowflake staging table against a BigQuery production table), you MUST provide both source_warehouse and target_warehouse explicitly. The tool cannot auto-resolve warehouses when tables are in different environments.
Even when both tables are MCONs, if they belong to different warehouses, pass both warehouse parameters to be safe. Omitting them in cross-warehouse scenarios causes silent failures or incorrect results.
Common cross-warehouse patterns:
- Dev vs prod: same warehouse type, different databases or schemas
- Migration validation: source in old warehouse, target in new warehouse
- Replication checks: primary warehouse vs replica or downstream warehouse
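The warehouse rules can be expressed as a small pre-flight check. This is a sketch under two assumptions: an MCON is recognized by the "MCON++" prefix seen in this document's examples, and the caller already knows whether the comparison spans warehouses (passed as the hypothetical cross_warehouse flag). The function name is illustrative only.

```python
def missing_warehouse_params(params, cross_warehouse=False):
    """Return the warehouse parameters that should be present but are not."""
    missing = []
    for side in ("source", "target"):
        # Assumption: MCONs start with "MCON++", as in the examples here.
        is_mcon = params.get(f"{side}_table", "").startswith("MCON++")
        has_warehouse = bool(params.get(f"{side}_warehouse"))
        # A warehouse is needed for non-MCON tables, and on both sides
        # whenever the comparison crosses warehouses.
        if not has_warehouse and (cross_warehouse or not is_mcon):
            missing.append(f"{side}_warehouse")
    return missing


params = {
    "source_table": "snowflake_db:public.users",
    "source_warehouse": "snowflake-prod",
    "target_table": "bigquery_project:public.users",
}
print(missing_warehouse_params(params, cross_warehouse=True))
```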
Alert Conditions
Each condition compares a metric between the source and target tables.
| Field | Type | Required | Description |
|---|---|---|---|
| metric | string | Yes, unless customMetric is used | The metric to compare (see Metrics Reference below). |
| sourceField | string | For field-level metrics | Column in the source table. Required for ALL metrics except ROW_COUNT. |
| targetField | string | For field-level metrics | Column in the target table. Required for ALL metrics except ROW_COUNT. |
| thresholdValue | number | No | Threshold for acceptable difference between source and target. |
| isThresholdRelative | boolean | No | false = absolute difference (default), true = percentage difference. |
| customMetric | object | No | Custom SQL expressions for source and target (see Custom Metrics below). |
ROW_COUNT and Fields: A Critical Rule
NEVER pass sourceField or targetField when using the ROW_COUNT metric.
ROW_COUNT is a table-level metric -- it counts all rows in the table, not values in a column. Passing field names with ROW_COUNT causes the API call to fail or behave unexpectedly.
This is the single most common mistake with comparison monitors. Before submitting any alert condition with ROW_COUNT, verify that sourceField and targetField are both absent from the condition object.
| Metric | Fields needed? | What happens if you pass fields? |
|---|---|---|
| ROW_COUNT | No -- NEVER pass fields | API error or undefined behavior |
| All other metrics | Yes -- always pass both fields | Required for the comparison to work |
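The rule in the table above is easy to check mechanically before submitting a condition. The following is a minimal sketch with a hypothetical function name (field_rule_errors). It covers only conditions that use a standard metric and deliberately skips conditions built on customMetric.

```python
def field_rule_errors(condition):
    """Check one alert condition against the ROW_COUNT / field-level rule."""
    if "customMetric" in condition:
        # Custom metrics follow separate rules and are not checked here.
        return []
    metric = condition.get("metric")
    fields = ("sourceField", "targetField")
    if metric == "ROW_COUNT":
        # Table-level metric: any field key is an error.
        return [f"ROW_COUNT must not include {key}" for key in fields if key in condition]
    # Field-level metric: both field keys are required.
    return [f"{metric} requires {key}" for key in fields if key not in condition]


print(field_rule_errors({"metric": "ROW_COUNT", "sourceField": "id", "thresholdValue": 0}))
print(field_rule_errors({"metric": "SUM", "sourceField": "amount"}))
```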
Metrics Reference
Table-level metric (no fields needed)
| Metric | Description |
|---|---|
| ROW_COUNT | Compare total row counts between source and target. |
Field-level metrics (require sourceField and targetField)
Uniqueness and duplicates
| Metric | Description |
|---|---|
| UNIQUE_COUNT | Count of distinct values. |
| DUPLICATE_COUNT | Count of duplicate (non-unique) values. |
| APPROX_DISTINCT_COUNT | Approximate distinct count (faster on large tables). |
Null and empty checks
| Metric | Description |
|---|---|
| NULL_COUNT | Count of null values. |
| NON_NULL_COUNT | Count of non-null values. |
| EMPTY_STRING_COUNT | Count of empty string values. |
| TEXT_ALL_SPACES_COUNT | Count of values that are all whitespace. |
| NAN_COUNT | Count of NaN values. |
| TEXT_NULL_KEYWORD_COUNT | Count of values containing null-like keywords (e.g., "NULL", "None"). |
Numeric statistics
| Metric | Description |
|---|---|
| NUMERIC_MEAN | Mean of numeric field. |
| NUMERIC_MEDIAN | Median of numeric field. |
| NUMERIC_MIN | Minimum value. |
| NUMERIC_MAX | Maximum value. |
| NUMERIC_STDDEV | Standard deviation. |
| SUM | Sum of numeric field. |
| ZERO_COUNT | Count of zero values. |
| NEGATIVE_COUNT | Count of negative values. |
Percentiles
| Metric | Description |
|---|---|
| PERCENTILE_20 | 20th percentile value. |
| PERCENTILE_40 | 40th percentile value. |
| PERCENTILE_60 | 60th percentile value. |
| PERCENTILE_80 | 80th percentile value. |
Text statistics
| Metric | Description |
|---|---|
| TEXT_MAX_LENGTH | Maximum string length. |
| TEXT_MIN_LENGTH | Minimum string length. |
| TEXT_MEAN_LENGTH | Mean string length. |
| TEXT_STD_LENGTH | Standard deviation of string length. |
Text format checks
| Metric | Description |
|---|---|
| TEXT_NOT_INT_COUNT | Count of values not parseable as integers. |
| TEXT_NOT_NUMBER_COUNT | Count of values not parseable as numbers. |
| TEXT_NOT_UUID_COUNT | Count of values not matching UUID format. |
| TEXT_NOT_SSN_COUNT | Count of values not matching SSN format. |
| TEXT_NOT_US_PHONE_COUNT | Count of values not matching US phone format. |
| TEXT_NOT_US_STATE_CODE_COUNT | Count of values not matching US state codes. |
| TEXT_NOT_US_ZIP_CODE_COUNT | Count of values not matching US zip codes. |
| TEXT_NOT_EMAIL_ADDRESS_COUNT | Count of values not matching email format. |
| TEXT_NOT_TIMESTAMP_COUNT | Count of values not parseable as timestamps. |
Boolean
| Metric | Description |
|---|---|
| TRUE_COUNT | Count of true values. |
| FALSE_COUNT | Count of false values. |
Timestamp
| Metric | Description |
|---|---|
| FUTURE_TIMESTAMP_COUNT | Count of timestamps in the future. |
| PAST_TIMESTAMP_COUNT | Count of timestamps unreasonably far in the past. |
| UNIX_ZERO_COUNT | Count of timestamps equal to Unix epoch zero (1970-01-01). |
Choosing the Right Metric
| User intent | Correct metric | Fields needed? |
|---|---|---|
| Row count parity | ROW_COUNT | No -- never pass fields |
| Distinct values in a column | UNIQUE_COUNT | Yes |
| Null values in a column | NULL_COUNT | Yes |
| Sum, average, min, max | SUM, NUMERIC_MEAN, NUMERIC_MIN, NUMERIC_MAX | Yes |
| Data completeness | NON_NULL_COUNT | Yes |
| String format validation | TEXT_NOT_EMAIL_ADDRESS_COUNT, TEXT_NOT_UUID_COUNT, etc. | Yes |
| Custom computed expressions | Use customMetric instead of metric | No (SQL handles it) |
Custom Metrics
Use custom metrics when:
- Column names differ between source and target and you need a computed expression (not just a direct field comparison).
- You need a derived calculation like SUM(quantity * unit_price) rather than a simple column metric.
- Standard metrics do not cover the comparison (e.g., comparing a ratio, a conditional aggregate, or a windowed calculation).
If the columns simply have different names but you want a standard metric (e.g., compare SUM of revenue in source vs total_revenue in target), you do NOT need a custom metric -- just use the standard metric with different sourceField and targetField values.
Custom metric structure:
{
  "customMetric": {
    "displayName": "Revenue Sum",
    "sourceSqlExpression": "SUM(revenue)",
    "targetSqlExpression": "SUM(total_revenue)"
  }
}
| Field | Type | Required | Description |
|---|---|---|---|
| displayName | string | Yes | Human-readable name for the metric in alerts and dashboards. |
| sourceSqlExpression | string | Yes | SQL expression evaluated against the source table. |
| targetSqlExpression | string | Yes | SQL expression evaluated against the target table. |
When using customMetric, do NOT also pass metric, sourceField, or targetField in the same alert condition. The custom metric replaces all of those.
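One way to avoid mixing keys is to build custom-metric conditions through a single helper. The sketch below is illustrative: custom_condition and conflicting_keys are hypothetical names, and only the JSON keys themselves come from this reference.

```python
def custom_condition(display_name, source_sql, target_sql,
                     threshold=None, relative=False):
    """Build an alert condition that uses customMetric and nothing that conflicts."""
    condition = {
        "customMetric": {
            "displayName": display_name,
            "sourceSqlExpression": source_sql,
            "targetSqlExpression": target_sql,
        }
    }
    # thresholdValue is optional; only emit threshold keys when one is given.
    if threshold is not None:
        condition["thresholdValue"] = threshold
        condition["isThresholdRelative"] = relative
    return condition


def conflicting_keys(condition):
    """List keys that must not appear alongside customMetric."""
    if "customMetric" not in condition:
        return []
    return [key for key in ("metric", "sourceField", "targetField") if key in condition]


condition = custom_condition("Total Revenue", "SUM(quantity * unit_price)",
                             "SUM(total_amount)", threshold=0.01, relative=True)
print(condition)
print(conflicting_keys({**condition, "metric": "SUM"}))
```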
Threshold Guidance
Absolute thresholds (isThresholdRelative: false or omitted)
The thresholdValue is the maximum acceptable absolute difference between the source and target metric values.
- thresholdValue: 0 -- source and target must match exactly.
- thresholdValue: 100 -- up to 100 units of difference is acceptable.
Relative (percentage) thresholds (isThresholdRelative: true)
The thresholdValue is the maximum acceptable percentage difference.
- thresholdValue: 5 -- up to 5% difference is acceptable.
- thresholdValue: 0.1 -- up to 0.1% difference is acceptable.
When to use each
| Scenario | Recommended threshold type |
|---|---|
| Exact replication (row counts must match) | Absolute, thresholdValue: 0 |
| Near-real-time sync with small lag | Absolute, small value (e.g., 10-100) |
| Tables at different scales | Relative, percentage-based |
| Aggregated metrics (sums, means) | Relative, to handle floating-point differences |
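To make the two threshold modes concrete, here is a small sketch of how a breach could be evaluated. It rests on assumptions this reference does not confirm: that the percentage difference is measured relative to the source value, and that a difference exactly equal to the threshold does not alert. The function name breaches_threshold is hypothetical.

```python
def breaches_threshold(source_value, target_value, threshold, relative=False):
    """Return True when the source/target difference exceeds the threshold."""
    diff = abs(source_value - target_value)
    if not relative:
        # Absolute mode: compare the raw difference.
        return diff > threshold
    if source_value == 0:
        # Assumption: with a zero baseline no percentage is defined,
        # so any difference at all counts as a breach.
        return diff > 0
    # Assumption: percentage is taken relative to the source value.
    return diff / abs(source_value) * 100 > threshold


print(breaches_threshold(1000, 1101, 100))                # absolute, 101 rows apart
print(breaches_threshold(1000, 1040, 5, relative=True))   # relative, 4% apart
```

The same 100-row gap is negligible on a table with millions of rows and large on a table with a thousand, which is why relative thresholds suit tables at different scales.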
Examples
Row count parity with absolute threshold
Compare row counts between dev and prod, alerting if they differ by more than 100 rows.
{
  "name": "orders_dev_prod_row_count",
  "description": "Verify dev and prod orders tables have similar row counts",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++dev_warehouse:core.orders",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++prod_warehouse:core.orders",
  "alert_conditions": [
    {
      "metric": "ROW_COUNT",
      "thresholdValue": 100,
      "isThresholdRelative": false
    }
  ]
}
Note: no sourceField or targetField -- ROW_COUNT is table-level.
Row count parity with percentage threshold
Alert if row counts differ by more than 5%.
{
  "name": "orders_replication_check",
  "description": "Verify replicated orders table is within 5% of source row count",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++primary:sales.orders",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++replica:sales.orders",
  "alert_conditions": [
    {
      "metric": "ROW_COUNT",
      "thresholdValue": 5,
      "isThresholdRelative": true
    }
  ]
}
Field-level comparison (different column names)
Compare the sum of revenue in the source table against total_revenue in the target table.
{
  "name": "revenue_source_target_sum",
  "description": "Verify revenue sums match between staging and production",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++staging:finance.transactions",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++production:finance.transactions",
  "alert_conditions": [
    {
      "metric": "SUM",
      "sourceField": "revenue",
      "targetField": "total_revenue",
      "thresholdValue": 1,
      "isThresholdRelative": true
    }
  ]
}
Segmented comparison
Compare null counts on email between source and target, segmented by country. The country field must exist in both tables.
{
  "name": "email_nulls_by_country",
  "description": "Compare email null counts by country between ETL source and target",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++raw:crm.contacts",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++analytics:crm.contacts",
  "segment_fields": ["country"],
  "alert_conditions": [
    {
      "metric": "NULL_COUNT",
      "sourceField": "email",
      "targetField": "email",
      "thresholdValue": 0,
      "isThresholdRelative": false
    }
  ]
}
Cross-warehouse comparison with explicit warehouses
When source and target are in different warehouses, both warehouse parameters must be provided.
{
  "name": "migration_users_row_count",
  "description": "Validate user row counts match after Snowflake to BigQuery migration",
  "source_table": "snowflake_db:public.users",
  "source_warehouse": "snowflake-prod",
  "target_table": "bigquery_project:public.users",
  "target_warehouse": "bigquery-prod",
  "alert_conditions": [
    {
      "metric": "ROW_COUNT",
      "thresholdValue": 0,
      "isThresholdRelative": false
    }
  ]
}
Custom metric comparison
Compare a computed revenue expression when the SQL differs between source and target.
{
  "name": "computed_revenue_compare",
  "description": "Compare total revenue computation between legacy and new schema",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++warehouse:legacy.orders",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++warehouse:v2.orders",
  "alert_conditions": [
    {
      "customMetric": {
        "displayName": "Total Revenue",
        "sourceSqlExpression": "SUM(quantity * unit_price)",
        "targetSqlExpression": "SUM(total_amount)"
      },
      "thresholdValue": 0.01,
      "isThresholdRelative": true
    }
  ]
}
Multiple alert conditions
Compare both row counts and field-level metrics in a single monitor.
{
  "name": "orders_full_comparison",
  "description": "Full comparison of orders between staging and production",
  "source_table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++staging:core.orders",
  "target_table": "MCON++b2c3d4e5-f6a7-8901-bcde-f12345678901++1++1++production:core.orders",
  "domain_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "alert_conditions": [
    {
      "metric": "ROW_COUNT",
      "thresholdValue": 0,
      "isThresholdRelative": false
    },
    {
      "metric": "NULL_COUNT",
      "sourceField": "customer_id",
      "targetField": "customer_id",
      "thresholdValue": 0,
      "isThresholdRelative": false
    },
    {
      "metric": "SUM",
      "sourceField": "amount",
      "targetField": "amount",
      "thresholdValue": 0.1,
      "isThresholdRelative": true
    }
  ]
}
Note: the ROW_COUNT condition has no fields, while the field-level conditions each specify both sourceField and targetField.