Instead of choosing a single "best" tool, I break down where each package fits and how much manual work is needed for production-style experiment reporting.
Includes code examples and a feature matrix covering power analysis, ratio metrics, relative effect CIs, CUPED, multiple testing correction, and working with aggregated statistics for efficiency.
Disclosure: I am also the author of tea-tasting.