30 points by tjgreen 4 hours ago | 6 comments
  • simonw 13 minutes ago
    This is really cool. I've built things on PostgreSQL tsvector FTS in the past, which works well but doesn't have whole-index ranking algorithms, so it can't do BM25.

    It's a bit surprising to me that this doesn't appear to have a mechanism to say "filter for just documents matching terms X and Y, then sort by BM25 relevance" - it looks like this extension currently handles just the BM25 ranking but not the FTS filtering. Are you planning to address that in the future?

    I found this example in the README quite confusing:

      SELECT * FROM documents
      WHERE content <@> to_bm25query('search terms', 'docs_idx') < -5.0
      ORDER BY content <@> 'search terms'
      LIMIT 10;
    
    That -5.0 is a magic number which, based on my understanding of BM25, is difficult to predict in advance since the threshold you would want to pick varies for different datasets.
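To make the threshold concern concrete, here is a minimal sketch of textbook Okapi BM25 in Python. It is not the extension's implementation: the function name `bm25_scores`, the whitespace tokenizer, the Lucene-style IDF, the `k1`/`b` defaults, and both toy corpora are assumptions chosen for illustration. The sketch uses conventional positive scores, whereas the negative threshold in the README example suggests the operator returns negated scores.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.2, b=0.75):
    """Score every document in `corpus` against `query` with Okapi BM25.

    Tokenization is a plain lowercase whitespace split (an assumption made
    for brevity; real engines stem and drop stop words).
    """
    docs = [doc.lower().split() for doc in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Lucene-style IDF, which stays positive for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# The same document, scored against the same query, in two toy corpora.
doc = "postgres full text search"
rare_corpus = [doc, "cooking pasta at home", "bike repair guide", "garden soil tips"]
common_corpus = [doc, "postgres search tuning", "text search basics", "full text search index"]

score_rare = bm25_scores("text search", rare_corpus)[0]
score_common = bm25_scores("text search", common_corpus)[0]
print(f"query terms rare in corpus:   {score_rare:.3f}")
print(f"query terms common in corpus: {score_common:.3f}")
```

The identical document gets a noticeably higher score when the query terms are rare in the rest of the corpus, because IDF and average document length are corpus-wide statistics. A cutoff tuned on one dataset therefore does not carry over to another.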
  • jascha_eng 40 minutes ago
    FWIW TJ is not your average vibe coder imo: https://www.linkedin.com/in/todd-j-green/

    In September he burned through $3000 in API credits, though I think that was before we finally bought Max plans for everyone who wanted one.

  • gplprotects 12 minutes ago
    > ParadeDB, is guarded behind AGPL

    What a wonderful ad for ParadeDB, and clear signal that "TigerData" is a pernicious entity.

  • benjiro3000 10 minutes ago
    [dead]