2 points by grafikui 4 hours ago, 1 comment
  • grafikui 4 hours ago
    Earlier this week I launched Transactional AI v0.1 to solve a problem I kept hitting: AI agents that half-executed and left systems in broken states.

    The core idea: apply the Saga pattern (from distributed systems) to AI workflows. Every step declares an undo action that runs automatically on failure. If the OpenAI call succeeds but the Stripe charge fails, the library deletes the AI-generated content; if a later step fails after the charge went through, it also refunds the payment. No manual cleanup.
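
    For readers new to the pattern, here is a minimal sketch of a saga runner. It is not the transactional-ai source: `SagaStep` and `runSaga` are hypothetical names, and the sketch assumes steps run strictly in sequence and that compensations themselves do not fail.

```typescript
// Minimal saga runner: a sketch of the pattern, not the library's code.
// `SagaStep` and `runSaga` are hypothetical names used for illustration.
interface SagaStep<T = unknown> {
  name: string;
  do: () => Promise<T>;
  undo: (result: T) => Promise<void>;
}

async function runSaga(steps: SagaStep<any>[]): Promise<void> {
  // Compensations for steps that have completed, in completion order.
  const completed: Array<() => Promise<void>> = [];
  for (const step of steps) {
    try {
      const result = await step.do();
      // Capture the result so undo can reference what was created.
      completed.push(() => step.undo(result));
    } catch (err) {
      // Roll back in reverse order, then surface the original error.
      // Assumption: a throwing compensation aborts the rollback here.
      for (const compensate of completed.reverse()) {
        await compensate();
      }
      throw err;
    }
  }
}
```

    The step that failed is not compensated, only the steps that completed before it. A production version also has to persist progress so a crashed process can resume the rollback, which is where the storage adapters come in.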

    v0.2 adds production features based on feedback:

    Distributed Execution (v0.2.0):

    - Redis-based distributed locking (prevents race conditions with multiple workers)
    - PostgreSQL storage adapter (ACID compliance for regulated industries)
    - Retry policies with exponential backoff (handles flaky LLM APIs)

    Observability & Reliability (v0.2.1):

    - Event hooks for monitoring (12 lifecycle events: step start/complete/fail/timeout/retry, compensation events, transaction lifecycle)
    - Per-step timeouts (kill hung OpenAI calls after 30s)
    - Testing utilities (in-memory storage/locks, no Redis/Postgres needed for tests)

    Example:

      const tx = new Transaction('workflow-123', storage, {
        lock: new RedisLock('redis://localhost'),
        events: {
          onStepTimeout: (step, ms) => alerting.sendAlert(`${step} hung after ${ms}ms`),
          onStepFailed: (step, err, attempt) => logger.error(`${step} failed`, { err, attempt })
        }
      });

      await tx.run(async (t) => {
        const report = await t.step('generate-ai-report', {
          do: async () => await openai.createCompletion({...}),
          undo: async (result) => await db.reports.delete(result.id),
          retry: { attempts: 3, backoffMs: 2000 },
          timeout: 30000
        });

        await t.step('charge-customer', {
          do: async () => await stripe.charges.create({...}),
          undo: async (charge) => await stripe.refunds.create({ charge: charge.id }),
          timeout: 10000
        });
      });

    If anything fails: automatic rollback in reverse order. Report deleted, payment refunded.
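
    To make the `retry` and `timeout` options above concrete, here is one way such a policy could work. This is a sketch under assumptions, not the library's internals: `withTimeout`, `withRetry`, and `RetryPolicy` are hypothetical names, and the doubling schedule (backoffMs, then 2x, then 4x) is assumed rather than taken from the source.

```typescript
// Hypothetical helpers: per-attempt timeout plus retry with exponential backoff.
// Names and the doubling schedule are assumptions, not the library's code.
interface RetryPolicy {
  attempts: number;  // total tries, including the first
  backoffMs: number; // delay before the first retry; doubles after that
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Reject if `work` has not settled within `ms`.
// Note: the underlying call keeps running; it is abandoned, not cancelled.
function withTimeout<T>(work: () => Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms,
    );
    work().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function withRetry<T>(
  work: () => Promise<T>,
  policy: RetryPolicy,
  timeoutMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= policy.attempts; attempt++) {
    try {
      return await withTimeout(work, timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < policy.attempts) {
        // Wait backoffMs, then 2x, then 4x, ... before the next try.
        await sleep(policy.backoffMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

    With `{ attempts: 3, backoffMs: 2000 }` this schedule would wait 2s and then 4s between tries. Actually cancelling a hung HTTP request, rather than just abandoning it, would additionally need something like an `AbortController` signal passed to the client.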

    Architecture:

    - TypeScript, 21 passing tests, strict mode
    - Storage adapters: File (dev), Redis (performance), Postgres (ACID), Memory (tests)
    - Lock adapters: NoOp (single process), Redis (distributed), Mock (tests)
    - CLI inspector: tai-inspect for debugging transaction state
    - No heavyweight orchestration engines (Temporal, AWS Step Functions). Just a 450-line TypeScript library.
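
    As an illustration of how pluggable lock adapters can fit together, here is a rough sketch. The `LockAdapter` interface and both classes are hypothetical and the library's real interface may differ; the in-memory variant only shows the kind of behavior a test lock needs.

```typescript
// Hypothetical adapter shape; the library's actual interface may differ.
interface LockAdapter {
  acquire(key: string, ttlMs: number): Promise<boolean>;
  release(key: string): Promise<void>;
}

// Single-process use: every acquire succeeds, release does nothing.
class NoOpLock implements LockAdapter {
  async acquire(): Promise<boolean> { return true; }
  async release(): Promise<void> {}
}

// In-memory lock for tests: one holder per key until release or expiry.
class MemoryLock implements LockAdapter {
  private expiresAt = new Map<string, number>();

  async acquire(key: string, ttlMs: number): Promise<boolean> {
    const now = Date.now();
    const expiry = this.expiresAt.get(key);
    if (expiry !== undefined && expiry > now) return false; // still held
    this.expiresAt.set(key, now + ttlMs);
    return true;
  }

  // Simplification: no ownership check, so any caller can release.
  async release(key: string): Promise<void> {
    this.expiresAt.delete(key);
  }
}
```

    A Redis-backed version would typically build `acquire` on `SET key token NX PX ttl` and check the token before releasing, so that one worker cannot free a lock held by another.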

    Production readiness: 8.0/10 (up from 6.5 in v0.1)

    Considering for v0.3.0: compensation retry policies, parallel steps, OpenTelemetry integration, MongoDB/DynamoDB adapters.

    GitHub: https://github.com/Grafikui/Transactional-ai
    NPM: npm install transactional-ai

    Happy to answer questions about the implementation, saga patterns, or production experiences!