2 points by geooff_ 17 hours ago | 1 comment
  • badmonster 16 hours ago
    Curious: how does the app handle different lighting, poses, or background distractions in fit pix when recognizing clothing items? Does it need clean photos, or can it handle everyday shots?
    • geooff_ 15 hours ago
      The app can handle everyday shots, though as you'd expect, poor inputs produce poor outputs. There are really two components to this question:

      1. Can the app differentiate one article of clothing from the background and from other articles?
      2. Can the app group together identical articles of clothing?

      To answer 1: the app has decent performance, with a test-set pixel-level mean accuracy of 0.80 and an mIoU of 0.69. The test set is all real-world fit pix from myself and friends. The 0.80 is a bit misleading, though, as the errors often occur at clothing boundaries, so in poor lighting there can be some border gore.
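
      For readers unfamiliar with the two metrics, here is a minimal sketch of how per-class pixel accuracy and mIoU can be computed from flattened label maps. The function name and the toy labels are made up for illustration, and the app's actual evaluation code may differ (for example in how it averages across images or handles a background class).

```python
from collections import Counter

def segmentation_metrics(pred, truth, num_classes):
    """Return (mean per-class pixel accuracy, mIoU) for two flat label lists."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of pixels")
    # confusion[(t, p)] counts pixels whose true class is t and predicted class is p
    confusion = Counter(zip(truth, pred))
    accuracies, ious = [], []
    for c in range(num_classes):
        true_pos = confusion[(c, c)]
        truth_total = sum(confusion[(c, p)] for p in range(num_classes))
        pred_total = sum(confusion[(t, c)] for t in range(num_classes))
        union = truth_total + pred_total - true_pos
        if truth_total:  # skip classes absent from the ground truth
            accuracies.append(true_pos / truth_total)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(true_pos / union)
    mean_acc = sum(accuracies) / len(accuracies) if accuracies else 0.0
    miou = sum(ious) / len(ious) if ious else 0.0
    return mean_acc, miou

# Toy example: one boundary pixel of class 0 is mislabeled as class 1.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 1]
mean_acc, miou = segmentation_metrics(pred, truth, num_classes=2)
print(f"mean accuracy={mean_acc:.3f}, mIoU={miou:.3f}")
```

      In the toy example a single mislabeled boundary pixel costs more in IoU than in accuracy, since IoU penalizes the error in both the class that lost the pixel and the class that gained it.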

      As for 2, this remains to be seen. Currently, clothing aggregation (grouping together two segmentations of the same shirt) is manual. I'm doing some studies on tuning cosine-similarity thresholds, but I think long term there may need to be a more robust approach.
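
      As a rough illustration of what threshold-based aggregation could look like, the sketch below links any two segment embeddings whose cosine similarity clears a threshold and reports the connected groups. The function names, the toy 2-D embeddings, and the 0.95 threshold are all hypothetical, not values from the app.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def group_by_threshold(embeddings, threshold):
    """Group indices whose embeddings are linked by similarity >= threshold."""
    n = len(embeddings)
    parent = list(range(n))  # union-find forest, one node per segment

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy embeddings: segments 0 and 1 point one way, 2 and 3 another.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
print(group_by_threshold(embeddings, threshold=0.95))
```

      Because links are transitive here, a chain of near-matches can merge garments that are not directly similar, which is one reason a single global threshold may not be robust enough long term.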

      • badmonster 15 hours ago
        How are you representing clothing segments for cosine similarity: are you embedding the full segmentation masks, using extracted features from a vision model (e.g., CLIP), or using texture/color histograms?
        • geooff_ 15 hours ago
          Extracted features from a vision model. I haven't experimented with CLIP yet, but I would like to, as I think adding clothing search would be interesting.
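
          The thread does not say how the features are pooled per segment. One common option, shown here purely as an assumption, is masked average pooling: average the vision model's feature vectors over the pixels inside the segmentation mask. The function name and the tiny 2x2 feature map are illustrative only.

```python
def masked_average_pool(feature_map, mask):
    """Average C-dim features over mask pixels.

    feature_map: H x W x C nested lists (hypothetical vision-model output).
    mask: H x W nested lists of 0/1 marking one clothing segment.
    """
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(feature_map, mask):
        for feat, keep in zip(feat_row, mask_row):
            if keep:
                count += 1
                for c, value in enumerate(feat):
                    totals[c] += value
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [total / count for total in totals]

# Toy 2x2 feature map with 2 channels; the mask keeps the diagonal pixels.
feature_map = [[[1.0, 0.0], [3.0, 2.0]],
               [[5.0, 4.0], [7.0, 6.0]]]
mask = [[1, 0],
        [0, 1]]
print(masked_average_pool(feature_map, mask))
```

          The resulting vector is what a cosine-similarity comparison between two segments would operate on.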