Personally, I think Claude Code played it a little too safe here, which is why we didn't put more emphasis on its precision.
Note that 100% precision is also easy to achieve in this case: only match a trial with a paper when a regex finds that trial explicitly mentioned in the paper. That trades recall for precision, so clearly we have to pay attention to both. We went with F1 as the more or less canonical measure that takes both into account, but I agree that, depending on your use case, you may be interested in other measures of accuracy.
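
To make the trade-off concrete, here is a minimal sketch of how precision, recall, and F-beta behave for a very cautious matcher versus a broader one. The (trial, paper) pairs are made up purely for illustration and are not our data; the function name `precision_recall_fbeta` and the variables are hypothetical too. Setting `beta=1` gives F1, while `beta < 1` weights precision more heavily, which is one of the "other measures" you might prefer.

```python
def precision_recall_fbeta(predicted, actual, beta=1.0):
    """Return (precision, recall, F-beta) for two sets of (trial, paper) pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, fbeta


# Made-up ground truth: four true (trial, paper) matches.
actual = {("T1", "P1"), ("T2", "P2"), ("T3", "P3"), ("T4", "P4")}

# A cautious matcher that only reports the one explicitly mentioned trial.
cautious = {("T1", "P1")}

# A broader matcher that finds three true matches and makes one mistake.
broader = {("T1", "P1"), ("T2", "P2"), ("T3", "P3"), ("T4", "P9")}

for name, predicted in [("cautious", cautious), ("broader", broader)]:
    p, r, f1 = precision_recall_fbeta(predicted, actual)
    _, _, f_half = precision_recall_fbeta(predicted, actual, beta=0.5)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f} F0.5={f_half:.2f}")
```

In this toy setup the cautious matcher has perfect precision but a much lower F1 than the broader one, and the gap narrows once precision is weighted more heavily with `beta=0.5`.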