15 KiB
Validation Monitor Reference
Detailed reference for building createValidationMonitorMac tool calls.
When to Use
Use a validation monitor when the user wants to:
- Check that specific fields are never null
- Validate that values are within an allowed set (e.g., status in 'active', 'pending', 'inactive')
- Enforce referential integrity (field values exist in another table)
- Apply row-level business rules (e.g., "amount must be positive")
- Combine multiple conditions with AND/OR logic
Getting the Logic Right: Conditions Match INVALID Data
This is the single most confusing aspect of validation monitors and the number one source of mistakes. Conditions describe what INVALID data looks like -- the data you want to be alerted about. They do NOT describe what valid data looks like.
Think of it this way: the monitor scans rows and fires an alert when it finds rows matching the condition. So the condition must match the BAD rows.
| User wants | Condition should match | Common mistake |
|---|---|---|
| "id should never be null" | id IS NULL (alert when null found) | id IS NOT NULL (would alert on every valid row) |
| "status must be in [active, pending]" | status NOT IN [active, pending] (alert on unexpected values) | status IN [active, pending] (would alert on valid rows) |
| "amount must be positive" | amount IS NEGATIVE (alert on bad values) | amount > 0 (would alert on valid rows) |
| "email must not be empty" | email IS NULL OR email = '' (alert on missing) | email IS NOT NULL (would alert on valid rows) |
Before building any condition, ask yourself: "If a row matches this condition, is the row INVALID?" If the answer is no, the logic is backwards.
Pre-Step: Verify Field Existence
Before constructing the alert_condition, verify that every field name you plan to reference exists in the table's column list. This is the number two source of validation monitor failures -- referencing columns that do not exist or are misspelled.
- You should already have the column list from
getTablewithinclude_fields: true(done in Step 2 of the main skill). - For every field name in your planned conditions, confirm it appears in the column list exactly as spelled (field names are case-sensitive on most warehouses).
- If a field does not exist, stop and ask the user to clarify the correct column name. Do not guess.
Required Parameters
| Parameter | Type | Description |
|---|---|---|
name |
string | Unique identifier for the monitor. Use a descriptive slug (e.g., orders_not_null_check). |
description |
string | Human-readable description of what the monitor checks. |
table |
string | Table MCON (preferred) or database:schema.table format. If not MCON, also pass warehouse. |
alert_condition |
object | Condition tree defining when to alert (see Alert Condition Structure below). |
Optional Parameters
| Parameter | Type | Description |
|---|---|---|
warehouse |
string | Warehouse name or UUID. Required if table is not an MCON. |
domain_id |
string (uuid) | Domain UUID (use getDomains to list). |
Alert Condition Structure
The top level of alert_condition must always be a GROUP node. This GROUP contains one or more conditions combined with AND or OR logic.
{
"type": "GROUP",
"operator": "AND",
"conditions": [...]
}
Condition Types
There are four condition types: UNARY, BINARY, SQL, and GROUP.
UNARY (single-value checks)
Used for predicates that operate on a single field with no comparison value.
{
"type": "UNARY",
"predicate": {"name": "null", "negated": false},
"value": [{"type": "FIELD", "field": "column_name"}]
}
predicate.name-- the predicate to apply (see Predicates Reference below).predicate.negated-- set totrueto invert the predicate (e.g.,nullwithnegated: truemeans "is NOT null").value-- an array with a single value descriptor (usually a FIELD reference).
BINARY (comparison checks)
Used for predicates that compare a field against a value.
{
"type": "BINARY",
"predicate": {"name": "greater_than", "negated": false},
"left": [{"type": "FIELD", "field": "column_name"}],
"right": [{"type": "LITERAL", "literal": "0"}]
}
left-- the left-hand side of the comparison (typically a FIELD reference).right-- the right-hand side (typically a LITERAL value, SQL expression, or FIELD reference).- Both
leftandrightare arrays of value descriptors.
SQL (custom SQL expression)
Used for complex conditions that are difficult to express with UNARY/BINARY nodes. The SQL expression should evaluate to true for INVALID rows.
{
"type": "SQL",
"sql": "amount > 0 AND amount < 1000000"
}
GROUP (nested conditions)
Used to combine multiple conditions with AND or OR logic. Groups can be nested.
{
"type": "GROUP",
"operator": "OR",
"conditions": [
{"type": "UNARY", "...": "..."},
{"type": "BINARY", "...": "..."}
]
}
Value Types
Value descriptors appear in the value, left, and right arrays of UNARY and BINARY conditions.
| Type | Field | Description | Example |
|---|---|---|---|
FIELD |
"field": "column_name" |
References a column in the table. | {"type": "FIELD", "field": "user_id"} |
LITERAL |
"literal": "value" |
A static value (always a string, even for numbers). | {"type": "LITERAL", "literal": "100"} |
SQL |
"sql": "SELECT ..." |
A SQL expression or subquery. | {"type": "SQL", "sql": "SELECT MAX(id) FROM ref_table"} |
Predicates Reference
Before building conditions, call getValidationPredicates to get the full list of supported predicates for the connected warehouse. The list below covers common predicates but may not be exhaustive.
Unary Predicates
These predicates take no comparison value -- they check a property of the field itself.
| Predicate | Description | Example use |
|---|---|---|
null |
Field value is null. | Alert on null ids. |
is_negative |
Field value is negative. | Alert on negative amounts. |
is_between_0_and_1 |
Field value is between 0 and 1 (inclusive). | Alert on rates that should be percentages (0-100). |
is_future_date |
Field value is a date/timestamp in the future. | Alert on future-dated records. |
is_uuid |
Field value matches UUID format. | Alert on non-UUID values in a UUID field (use with negated: true). |
Binary Predicates
These predicates compare a field against a value.
| Predicate | Right-hand side | Description | Example use |
|---|---|---|---|
equal |
Single LITERAL | Field equals the given value. | Alert when status equals 'deleted'. |
greater_than |
Single LITERAL | Field is greater than the given value. | Alert when discount_pct exceeds 100. |
less_than |
Single LITERAL | Field is less than the given value. | Alert when quantity is below 0. |
in_set |
Multiple LITERALs | Field value is in the given set. | Alert when status is in an invalid set (see example below). |
contains |
Single LITERAL | Field value contains the given substring. | Alert when email contains 'test@'. |
starts_with |
Single LITERAL | Field value starts with the given prefix. | Alert when phone starts with '000'. |
between |
Two LITERALs | Field value is between the two given values (inclusive). | Alert when score is between 0 and 10 (if that range is invalid). |
Using negated to Invert Predicates
Any predicate can be inverted by setting "negated": true in the predicate object. This is essential for "must be in set" validations:
- "status must be in [active, pending]" becomes
in_setwith values["active", "pending"]andnegated: true-- meaning "alert when status is NOT in [active, pending]". - "id must not be null" becomes
nullwithnegated: false-- meaning "alert when id IS null" (no inversion needed since the condition already matches invalid data).
Examples
Alert when id is null
Verify that id exists in the table schema from getTable before proceeding.
{
"name": "orders_id_not_null",
"description": "Alert when order id is null",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
"alert_condition": {
"type": "GROUP",
"operator": "AND",
"conditions": [
{
"type": "UNARY",
"predicate": {"name": "null", "negated": false},
"value": [{"type": "FIELD", "field": "id"}]
}
]
}
}
The condition matches rows where id IS NULL -- these are the invalid rows we want to be alerted about.
Alert when status is not in allowed set
Verify that status exists in the table schema from getTable before proceeding.
{
"name": "orders_status_allowed_values",
"description": "Alert when order status is outside the allowed set",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
"alert_condition": {
"type": "GROUP",
"operator": "AND",
"conditions": [
{
"type": "BINARY",
"predicate": {"name": "in_set", "negated": true},
"left": [{"type": "FIELD", "field": "status"}],
"right": [
{"type": "LITERAL", "literal": "active"},
{"type": "LITERAL", "literal": "pending"},
{"type": "LITERAL", "literal": "inactive"}
]
}
]
}
}
Note negated: true -- the predicate is in_set, but we want to alert when the value is NOT in the set. This catches any unexpected status values.
Alert when amount is negative
Verify that amount exists in the table schema from getTable before proceeding.
{
"name": "orders_positive_amount",
"description": "Alert when order amount is negative",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
"alert_condition": {
"type": "GROUP",
"operator": "AND",
"conditions": [
{
"type": "UNARY",
"predicate": {"name": "is_negative", "negated": false},
"value": [{"type": "FIELD", "field": "amount"}]
}
]
}
}
The condition matches rows where amount is negative -- these are the invalid rows.
Combined conditions: null OR negative
Verify that both amount and quantity exist in the table schema from getTable before proceeding.
{
"name": "orders_amount_quality",
"description": "Alert when amount is null or quantity is negative",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
"alert_condition": {
"type": "GROUP",
"operator": "OR",
"conditions": [
{
"type": "UNARY",
"predicate": {"name": "null", "negated": false},
"value": [{"type": "FIELD", "field": "amount"}]
},
{
"type": "UNARY",
"predicate": {"name": "is_negative", "negated": false},
"value": [{"type": "FIELD", "field": "quantity"}]
}
]
}
}
The OR operator means an alert fires if either condition matches -- the row has a null amount OR a negative quantity.
Between check with nested AND/OR
Verify that score and status exist in the table schema from getTable before proceeding.
{
"name": "records_score_validation",
"description": "Alert when score is outside 0-100 range for active records",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++warehouse:metrics.records",
"alert_condition": {
"type": "GROUP",
"operator": "AND",
"conditions": [
{
"type": "BINARY",
"predicate": {"name": "equal", "negated": false},
"left": [{"type": "FIELD", "field": "status"}],
"right": [{"type": "LITERAL", "literal": "active"}]
},
{
"type": "BINARY",
"predicate": {"name": "between", "negated": true},
"left": [{"type": "FIELD", "field": "score"}],
"right": [
{"type": "LITERAL", "literal": "0"},
{"type": "LITERAL", "literal": "100"}
]
}
]
}
}
This uses between with negated: true to alert when score is outside the 0-100 range, but only for active records (the AND operator requires both conditions to match).
Referential integrity with SQL subquery
Verify that customer_id exists in the table schema from getTable before proceeding.
{
"name": "orders_valid_customer",
"description": "Alert when customer_id does not exist in customers table",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
"alert_condition": {
"type": "GROUP",
"operator": "AND",
"conditions": [
{
"type": "SQL",
"sql": "customer_id IS NOT NULL AND customer_id NOT IN (SELECT id FROM analytics.core.customers)"
}
]
}
}
The SQL condition type is useful for referential integrity checks that require subqueries. The customer_id IS NOT NULL guard avoids alerting on null values (which should be caught by a separate null check if needed).
Contains and starts_with checks
Verify that email and phone exist in the table schema from getTable before proceeding.
{
"name": "contacts_format_validation",
"description": "Alert when email contains test data or phone has invalid prefix",
"table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++warehouse:crm.contacts",
"alert_condition": {
"type": "GROUP",
"operator": "OR",
"conditions": [
{
"type": "BINARY",
"predicate": {"name": "contains", "negated": false},
"left": [{"type": "FIELD", "field": "email"}],
"right": [{"type": "LITERAL", "literal": "@test.example.com"}]
},
{
"type": "BINARY",
"predicate": {"name": "starts_with", "negated": false},
"left": [{"type": "FIELD", "field": "phone"}],
"right": [{"type": "LITERAL", "literal": "000"}]
}
]
}
}
Fallback: Custom SQL Monitor
If createValidationMonitorMac fails -- for example because a referenced column does not exist yet in the live table, or the warehouse does not support a particular predicate -- fall back to createCustomSqlMonitorMac with an explicit SQL query instead.
A custom SQL monitor lets you express any validation logic as a SQL query that returns rows or a count. This is always available as a backup when the structured validation condition tree cannot express what you need or encounters an API error.
When falling back:
- Translate the intended validation logic into a SQL query.
- The SQL should select rows that violate the rule (matching the same "conditions match INVALID data" principle).
- Use
createCustomSqlMonitorMacwith the translated query. - Inform the user that you used a custom SQL monitor as a fallback and explain why.