  parameters:
    status: "{{ var('order_status') }}"

with

  parameters:
    # made famous by GitHub Actions
    status: ${{ var('order_status') }}
    # or the ASP.Net flavor:
    status2: <%= var('order_status2') %>
    # or the PHP flavor:
    status3: <?= var('order_status3') ?>
and, just like Ansible, it's going to get insaneo when your inner expression has a quote character, too, since you'll need to escape it from the YAML parser, leading to leaning toothpick syndrome, e.g.

  parameters:
    status: "{{ eval('echo \"hello\"') }}"
---

If you find my "but what about the DX?" argument compelling, also gravely consider why in the world `data_expression:` seems to get a pass, in that it is implicitly wrapped in the mustaches
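To make that asymmetry concrete, here's a rough sketch of the two code paths as I understand them (this is not Sequor's actual implementation; render_field and the sample context are invented for illustration):

  from jinja2 import StrictUndefined, Template

  def render_field(name, raw_value, context):
      # *_expression fields are "implicitly wrapped": the whole value is one Python expression
      if name.endswith("_expression"):
          return eval(raw_value, {}, context)  # illustration only
      # every other field is a Jinja template and needs explicit {{ }} spans
      return Template(raw_value, undefined=StrictUndefined).render(**context)

  ctx = {"orders": [{"id": 1}, {"id": 2}]}
  print(render_field("data_expression", "orders[:1]", ctx))           # [{'id': 1}]
  print(render_field("status", "{{ orders | length }} orders", ctx))  # 2 orders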
---
edit: ah, that's why https://github.com/paloaltodatabases/sequor/blob/v1.2.0/src/... but https://github.com/paloaltodatabases/sequor/blob/v1.2.0/src/... is what I would suggest changing before you accumulate a bunch of tech debt and have to introduce a breaking change. From

  str_rendered = Template(template_str, undefined=StrictUndefined).render(jinja_context)

to

  str_rendered = Template(
      template_str,
      undefined=StrictUndefined,
      variable_start_string="${{",
      variable_end_string="}}",
  ).render(jinja_context)
  # et al, if you want to fix the {# and {%, too
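For what it's worth, the change is self-contained - a quick sketch outside of Sequor (invented variable names) showing that ${{ }} gets rendered while bare {{ }} now passes through as literal text:

  from jinja2 import StrictUndefined, Template

  template_str = "status: ${{ order_status }} (and {{ this }} is now just literal text)"

  print(
      Template(
          template_str,
          undefined=StrictUndefined,
          variable_start_string="${{",
          variable_end_string="}}",
      ).render(order_status="any")
  )
  # status: any (and {{ this }} is now just literal text)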
per the Template constructor docs: https://jinja.palletsprojects.com/en/stable/api/#jinja2.Temp...

Quick clarification on _expression: we intentionally use two templating systems - Jinja {{ }} for simple variable injection, and Python *_expression for complex logic that Jinja can't handle.
Actually, since we only use Jinja for variable substitution, should I just drop it entirely? We have another version implemented in Java/JavaScript that uses simple ${var-name} syntax, and we already have Python expressions for advanced scenarios. Might be cleaner to unify on ${var-name} + Python expressions.
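(For context, the ${var-name} flavor needs almost nothing on the Python side either - a rough sketch using the stdlib's string.Template, not our actual Java/JS implementation:)

  import string

  class VarTemplate(string.Template):
      # allow hyphens inside ${...} so names like ${order-status} work
      braceidpattern = r"[A-Za-z][A-Za-z0-9_-]*"

  print(VarTemplate("status: ${order-status}").substitute({"order-status": "any"}))
  # status: any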
Given how deeply you've looked into our system, would you consider using Sequor? I can promise full support including fundamental changes like these - your technical insight would be invaluable for getting the design right early on.
As for "complex logic that jinja can't handle," I am not able to readily identify what that would mean given that jinja has executable blocks but I do agree with you that its mental model can make writing imperative code inside those blocks painful (e.g. {% set _ = my_dict.update({"something":"else}) %} type silliness)
it ultimately depends on whether those _expression: stanzas are always going to produce a Python result or whether they could produce arbitrary output. If the former, then I agree with you that jinja2 would be terrible for that since it's a templating language[1]. If the latter, then using jinja2 would be a harmonizing choice so the author didn't have to keep two different invocation styles in their head at once
1: one can see that in ansible via this convolution:
  body: >-
    {%- set foo = {} -%}
    {%- for i in ... -%}
    {%- endfor -%}
    {# now emit the dict as json #}
    {{ foo | to_json }}

Target audience:
1) Enterprise IT teams who already know SQL/YAML - they can build complex integrations after ~1 hour of training using our examples, no prior Python needed
2) Modern data teams using dbt - Sequor complements it perfectly for data ingestion and activation
What they gain:
Full flexibility with structure. Enterprise IT folks go from zero to building end-to-end solutions in an hour without needing developer support. Think "dbt but for API integrations."
Competitors & differentiation:
1) Zapier/n8n: GUI looks easy but gets complex fast, poor database integration, can't handle bulk data
2) Fivetran/Airbyte: Pre-built connectors only, zero customization, ingestion-only
3) Us: the only code-first solution using an open tech stack (SQL+YAML+Python) - gives you flexibility with Fivetran-level reliability
Business model:
1) Core engine: Open source, free forever
2) Revenue: On-premise server with enterprise features (RBAC, observability and execution monitoring with notifications, audit logs) - flat fee per installation, no per-row costs like competitors
3) Services: Custom connector development and app-to-app integration flows (we love this work!)
4) Cloud version maybe later - everyone wants on-premise now
The key difference:
we're the only tool that's both easy to learn AND highly customizable for all major API integration patterns: data ingestion, reverse ETL, and multi-step iPaaS workflows - all in one platform.
steps:
  # Step 1: Pull only NEW orders since last run
  - op: http_request
    request:
      source: "shopify"
      url: "https://{{ var('store_name') }}.myshopify.com/admin/api/{{ var('api_version') }}/orders.json"
      method: GET
      parameters:
        status: any
        updated_at_min_expression: "{{ last_run_timestamp() or '2024-01-01' }}"
      headers:
        "Accept": "application/json"
    response:
      success_status: [200]
      tables:
        - source: "snowflake"
          table: "shopify_orders_incremental"
          columns: { ... }
          data_expression: response.json()['orders']

  # Step 2: Update metrics ONLY for customers with new/changed orders
  - op: transform
    source: "snowflake"
    query: |
      MERGE INTO customer_metrics cm
      USING (
        SELECT
          customer_id,
          SUM(total_price::FLOAT) as total_spend,
          COUNT(*) as order_count
        FROM shopify_orders
        WHERE customer_id IN (
          SELECT DISTINCT customer_id
          FROM shopify_orders_incremental
        )
        GROUP BY customer_id
      ) new_metrics
      ON cm.customer_id = new_metrics.customer_id
      WHEN MATCHED THEN
        UPDATE SET
          total_spend = new_metrics.total_spend,
          order_count = new_metrics.order_count,
          updated_at = CURRENT_TIMESTAMP()
      WHEN NOT MATCHED THEN
        INSERT (customer_id, total_spend, order_count, updated_at)
        VALUES (new_metrics.customer_id, new_metrics.total_spend, new_metrics.order_count, CURRENT_TIMESTAMP())

  # Step 3: Sync only customers whose metrics were just updated
  - op: http_request
    input:
      source: "snowflake"
      query: |
        SELECT customer_id, email, total_spend, order_count
        FROM customer_metrics
        WHERE updated_at >= '{{ run_start_timestamp() }}'
    request:
      source: "mailchimp"
      url_expression: |
        f"https://us1.api.mailchimp.com/3.0/lists/{var('list_id')}/members/{hashlib.md5(record['email'].encode()).hexdigest()}"
      method: PATCH
      body_expression: |
        {
          "merge_fields": {
            "TOTALSPEND": record['total_spend'],
            "ORDERCOUNT": record['order_count']
          }
        }
This scales much better: if you have 100K customers but only 50 new orders, you're recalculating metrics for ~50 customers instead of all 100K. Same simple workflow pattern, just production-ready efficiency.

Does this address your concern or did you mean something else? Would you suggest I use a slightly more complex but optimized example for the main demo? Your feedback is welcome and appreciated!
/s