2 points by geooff_ 9 months ago | 1 comment
  • badmonster 9 months ago
    Curious: how does the app handle different lighting, poses, or background distractions in fit pix when recognizing clothing items? Does it need clean photos, or can it handle everyday shots?
    • geooff_ 9 months ago
      The app can handle everyday shots, though as you'd expect, poor inputs produce poor outputs. There are really two components to this question:

      1. Can the app differentiate one article of clothing from background / other articles?
      2. Can the app group together identical articles of clothing?

      To answer 1: the app has decent performance, with a test-set pixel-level mean accuracy of 0.80 and mIoU of 0.69. The test set is all real-world fit pix from myself and friends. The 0.80 is a bit misleading, though, as the errors often occur at clothing boundaries, so in poor lighting there can be some border gore.
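
      For anyone unfamiliar with the two metrics, here is a minimal sketch of how mean per-class pixel accuracy and mIoU are commonly computed from a confusion matrix. It is not the app's evaluation code: the function names, the three class labels, and the toy 4x4 masks are all made up for illustration, and "mean accuracy" is assumed to mean accuracy averaged over classes.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_row, pred_row in zip(truth, pred):
        for t, p in zip(true_row, pred_row):
            matrix[t][p] += 1
    return matrix


def mean_accuracy_and_miou(matrix):
    """Return (mean per-class pixel accuracy, mean IoU), skipping absent classes."""
    accuracies, ious = [], []
    n = len(matrix)
    for c in range(n):
        tp = matrix[c][c]
        true_total = sum(matrix[c])                        # pixels truly of class c
        pred_total = sum(matrix[r][c] for r in range(n))   # pixels predicted as class c
        union = true_total + pred_total - tp
        if true_total:
            accuracies.append(tp / true_total)
        if union:
            ious.append(tp / union)
    return sum(accuracies) / len(accuracies), sum(ious) / len(ious)


# Toy masks with hypothetical labels: 0 = background, 1 = shirt, 2 = trousers.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
]
pred = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],  # one background pixel predicted as shirt at the boundary
    [0, 2, 2, 0],
    [0, 0, 2, 0],  # one trouser pixel predicted as background
]

mean_acc, miou = mean_accuracy_and_miou(confusion_matrix(truth, pred, 3))
print(f"mean accuracy = {mean_acc:.3f}, mIoU = {miou:.3f}")
```

      In this toy case the two mislabeled boundary pixels cost more in mIoU than in accuracy, because IoU also counts false positives against each class.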

      As for 2, this remains to be seen. Currently, clothing aggregation (grouping together two segmentations of the same shirt) is manual. I'm doing some studies on tuning cosine-similarity thresholds, but I think long term there may need to be a more robust approach.
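
      To make the thresholding idea concrete, here is a rough sketch of grouping segment embeddings by cosine similarity. The greedy "compare against the first member of each group" strategy, the 0.9 threshold, and the 3-dimensional vectors are assumptions chosen for illustration, not the app's actual approach; real embeddings would come from a vision model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_by_threshold(embeddings, threshold):
    """Greedy grouping: each item joins the first group whose representative
    (the group's first member) is at least `threshold` similar to it."""
    groups = []  # each group is a list of indices into `embeddings`
    for i, emb in enumerate(embeddings):
        for group in groups:
            if cosine_similarity(embeddings[group[0]], emb) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # nothing similar enough: start a new group
    return groups


# Made-up embeddings standing in for vision-model features of four segments.
segments = [
    [0.9, 0.1, 0.0],  # shirt, photo A
    [0.8, 0.2, 0.1],  # same shirt, photo B
    [0.0, 0.1, 0.9],  # jeans, photo A
    [0.1, 0.0, 1.0],  # same jeans, photo C
]

groups = group_by_threshold(segments, threshold=0.9)
print(groups)
```

      The result depends on the threshold and on the order in which segments arrive, which is one reason a fixed cutoff may not hold up as a long-term solution.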

      • badmonster 9 months ago
        How are you representing clothing segments for cosine similarity: are you embedding the full segmentation masks, extracted features from a vision model (e.g., CLIP), or using texture/color histograms?
        • geooff_ 9 months ago
          Extracted features from a vision model. I haven't experimented with CLIP yet, but I would like to, as I think adding clothing search would be interesting.