50 points by pranshuchittora 3 hours ago | 4 comments
  • pranshuchittora 3 hours ago
    Hey, I am the creator of agent-qa.

    Coding agents have accelerated software development, letting folks ship features at lightning speed, but whether a feature works in production without breaking existing behavior is still an open question.

    Conventionally, either a software engineer or a QA engineer converts user stories / feature PRDs into composable end-to-end tests, allowing teams to catch regressions.

    But with AI writing code, tests become the bottleneck. You can ask the coding agent to write tests, and it does so with reasonable correctness, but AI greedily chases passing tests and sometimes bends the rules. Having access to the code also lets it take shortcuts in tests that do not mimic real user behavior.

    With agent-qa, you can write tests in plain English (natural language). It is built on battle-tested testing frameworks (Playwright for web and Appium for mobile). Playwright and Appium act as the kernel that executes the planned actions, while the AI runs in the harness, doing observation -> planning -> executing planned actions (via the kernel) -> self-healing (in case a planned action fails) -> verification.
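
    A rough sketch of that loop, for illustration only. This is not agent-qa's actual code: `Kernel`, `FakeKernel`, `run_step`, and the `plan` / `heal` / `verify` callbacks are hypothetical names, and the in-memory kernel stands in for Playwright or Appium so the example runs without a browser. In a real harness the three callbacks would be LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol


class Kernel(Protocol):
    """Execution layer (the role Playwright or Appium plays). Hypothetical interface."""

    def observe(self) -> str: ...
    def execute(self, action: str) -> bool: ...


@dataclass
class RunResult:
    passed: bool
    log: List[str] = field(default_factory=list)


def run_step(
    kernel: Kernel,
    plan: Callable[[str], str],          # observation -> action
    heal: Callable[[str, str], str],     # (failed action, observation) -> new action
    verify: Callable[[str], bool],       # observation -> did the step succeed?
    max_heals: int = 2,
) -> RunResult:
    """Observe -> plan -> execute -> self-heal on failure -> verify."""
    log: List[str] = []
    action = plan(kernel.observe())
    log.append(f"plan: {action}")

    heals = 0
    while not kernel.execute(action):
        if heals >= max_heals:
            log.append("gave up")
            return RunResult(False, log)
        heals += 1
        # Re-observe and ask for a replacement action instead of failing outright.
        action = heal(action, kernel.observe())
        log.append(f"heal: {action}")

    passed = verify(kernel.observe())
    log.append(f"verify: {passed}")
    return RunResult(passed, log)


class FakeKernel:
    """In-memory stand-in for a browser: only 'click #submit' succeeds."""

    def __init__(self) -> None:
        self.state = "login page"

    def observe(self) -> str:
        return self.state

    def execute(self, action: str) -> bool:
        if action == "click #submit":
            self.state = "dashboard"
            return True
        return False


result = run_step(
    FakeKernel(),
    plan=lambda obs: "click #login",            # first guess uses a stale selector
    heal=lambda failed, obs: "click #submit",   # healed action after the failure
    verify=lambda obs: obs == "dashboard",
)
print(result.passed, result.log)
```

    The point of the split is that the planner and verifier only ever see observations from the kernel, not the application source, so they cannot pass a test by leaning on implementation details.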

    The agent also evolves with every test run. It generates learnings and product memories from each run, improving over time.

    This is in an early stage, and I’m looking forward to your feedback.

    Thanks!

    Live Demo - https://vostride.com/demo/agent-qa
    GitHub - https://github.com/vostride/agent-qa (consider giving it a star)

    Good day!

  • willowwd9 2 hours ago
    What's the need for this? I run Codex in a loop and it writes and runs the Playwright tests without any intervention.
    • pranshuchittora 2 hours ago
      That is what teams are doing today. But LLMs tend to write tests greedily, which leads to hacky tricks that make the test pass.

      agent-qa is a harness where Playwright works as the execution kernel and the LLM works as an observer, planner, and verifier.

  • mkdsf01 2 hours ago
    That looks interesting