3 points by tornikeo 7 hours ago | 2 comments
  • jackfranklyn 7 hours ago
    Playwright + something like Claude with computer use is probably closest to what you're describing. Though I'd push back a bit on the approach.

    The flaky test problem usually comes from either race conditions (waiting for the wrong things) or environmental differences. Adding AI vision on top often adds another layer of flakiness - now you're debugging "why did the model misread this button" on top of "why did the test time out."
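
To make the "waiting for the wrong things" point concrete, here is a minimal sketch of polling for an explicit condition instead of sleeping for a fixed time. It is plain Python rather than Playwright's own auto-waiting, and `wait_until` and `FakeJob` are hypothetical names made up for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeJob:
    """Hypothetical stand-in for an async operation that completes
    after being polled a given number of times."""

    def __init__(self, polls_needed: int) -> None:
        self.polls_left = polls_needed

    def is_done(self) -> bool:
        self.polls_left -= 1
        return self.polls_left <= 0


if __name__ == "__main__":
    job = FakeJob(polls_needed=3)
    # Waits only as long as needed, instead of a fixed sleep(5).
    print(wait_until(job.is_done, timeout=1.0, interval=0.01))
```

The test then depends on the state it actually cares about, so it neither fails when the app is slow nor wastes time when it is fast.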

    For mocking external services specifically - tools like MailHog (email) or mock OAuth providers tend to be more reliable than screenshot-based approaches. The determinism matters.
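
As a rough sketch of the MailHog route: the test reads captured mail over HTTP and asserts on it directly, with no screenshots involved. This assumes MailHog's HTTP API is on its default port 8025 and that `/api/v2/messages` returns an `items` list with `Content.Headers` and `Content.Body` fields; check that shape against your MailHog version. `fetch_messages` and `find_message` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def fetch_messages(base_url: str = "http://localhost:8025") -> List[Dict[str, Any]]:
    """Fetch captured messages from a running MailHog instance.

    Assumes the v2 API response is a JSON object with an "items" list.
    """
    with urllib.request.urlopen(f"{base_url}/api/v2/messages") as resp:
        return json.load(resp).get("items", [])


def find_message(items: List[Dict[str, Any]],
                 recipient: str,
                 subject_contains: str) -> Optional[Dict[str, Any]]:
    """Return the first message addressed to `recipient` whose subject
    contains `subject_contains`, or None if there is no such message."""
    for item in items:
        headers = item.get("Content", {}).get("Headers", {})
        to_values = headers.get("To", [])
        subjects = headers.get("Subject", [])
        if (any(recipient in t for t in to_values)
                and any(subject_contains in s for s in subjects)):
            return item
    return None


if __name__ == "__main__":
    # Hand-written sample in the assumed MailHog shape (no server needed).
    sample = [{
        "Content": {
            "Headers": {"To": ["user@example.com"],
                        "Subject": ["Reset your password"]},
            "Body": "token=abc",
        }
    }]
    msg = find_message(sample, "user@example.com", "Reset")
    print(msg["Content"]["Body"] if msg else "no match")
```

In a real test you would pass `fetch_messages()` into `find_message`, ideally inside a polling wait, since delivery is asynchronous.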

    That said, if you genuinely need to test against production-like visual state - Playwright's screenshot comparison (toHaveScreenshot) combined with proper wait strategies has gotten pretty solid. The visual regression approach catches layout bugs that functional tests miss.
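
For intuition about what a screenshot comparison threshold does, here is a toy sketch of a pixel-difference ratio check. It is not Playwright's implementation (toHaveScreenshot has its own comparison and options such as maxDiffPixelRatio); `diff_ratio` and `matches` are hypothetical names, and images are modeled as nested lists of RGB tuples to keep the example dependency-free.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def diff_ratio(baseline: Image, actual: Image) -> float:
    """Return the fraction of pixels that differ between two images
    of identical dimensions."""
    if len(baseline) != len(actual) or any(
        len(rb) != len(ra) for rb, ra in zip(baseline, actual)
    ):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for rb, ra in zip(baseline, actual)
        for pb, pa in zip(rb, ra)
        if pb != pa
    )
    return changed / total


def matches(baseline: Image, actual: Image, max_diff_ratio: float = 0.01) -> bool:
    """True when the share of changed pixels is within the tolerance."""
    return diff_ratio(baseline, actual) <= max_diff_ratio


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    baseline = [[white, white], [white, white]]
    actual = [[white, white], [white, black]]  # one of four pixels changed
    print(diff_ratio(baseline, actual))
    print(matches(baseline, actual, max_diff_ratio=0.01))
```

A small tolerance absorbs anti-aliasing noise, while a real layout shift changes far more pixels and fails the check.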

  • solusmukess 6 hours ago
    I think you are looking for something like seekbytes.com. Please let me know if you're interested, as I am the owner of this work.