Right now we're processing data from various government sources and other free datasets to get salary information across all of the major US metros. We're also ingesting data from some (unnamed) paid job-posting APIs, within their ToS, and extracting reported salary information where it exists. That's why the landing page is marked "US-only" for now; I wanted a tight scope for an MVP launch. I have a number of tech friends in Canada, so that'll probably be the next country I support.
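To give a rough sense of what the "extract reported salary if it exists" step involves, here's a minimal sketch. The posting schema (salary_min, salary_max, salary_period) and every function name are hypothetical, not any particular API's fields; it just normalizes an optional reported range to annual figures.

    from dataclasses import dataclass
    from typing import Optional

    # Rough annualization factors; a real pipeline would be more careful
    # about hourly/contract postings.
    PERIOD_TO_ANNUAL = {"year": 1, "month": 12, "week": 52, "hour": 2080}

    @dataclass
    class PostingSalary:
        annual_min: float
        annual_max: float

    def extract_salary(posting: dict) -> Optional[PostingSalary]:
        """Pull reported salary out of a job-posting record, if present,
        normalized to an annual range. Field names are hypothetical."""
        lo, hi = posting.get("salary_min"), posting.get("salary_max")
        if lo is None and hi is None:
            return None  # no reported salary on this posting
        factor = PERIOD_TO_ANNUAL.get(posting.get("salary_period", "year"))
        if factor is None:
            return None  # unrecognized period; skip rather than guess
        lo = lo if lo is not None else hi
        hi = hi if hi is not None else lo
        return PostingSalary(annual_min=lo * factor, annual_max=hi * factor)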
Right now I'm basing the user's salary comparison on the metro where they currently live. That's not perfect and not the long-term solution, but solving it properly got cut to get an MVP out the door. It was a difficult cut, but I'd given myself a mid-May deadline to ship something.
Re: Uneven sample distribution. Sample size is a first-class concept in scoring. For each user's metro+role+level slice, n is computed over a trailing 60-day window. Below a threshold (currently n = 30), I aggregate up to a broader peer group and explicitly flag the score as lower confidence. Bayesian priors derived from the nationwide distribution for that role help fill in thin slices, so a senior Rust dev in Boise still gets a number, but they also see "this is computed from a small local sample plus regional inference" rather than a false-precision point estimate. It's a lot, and it's still being fine-tuned.
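To make the fallback concrete, here's a minimal sketch of how that logic could fit together. The 60-day window, the n = 30 threshold, and the nationwide prior come straight from the description above; the normal-normal-style shrinkage, the pseudo-count weighting, and all names are my own illustration, not the actual scoring code.

    from datetime import datetime, timedelta, timezone
    from statistics import mean
    from typing import NamedTuple

    THRESHOLD = 30           # minimum n before the local slice stands alone
    WINDOW = timedelta(days=60)

    class Estimate(NamedTuple):
        salary: float
        n: int
        low_confidence: bool

    def trailing_window(samples, now):
        """Keep only observations from the trailing 60-day window."""
        return [s for s in samples if now - s["observed_at"] <= WINDOW]

    def score_slice(local, national_mean, now):
        """Estimate a salary for one metro+role+level slice, shrinking the
        local mean toward the nationwide mean for the role. The prior
        dominates as the local sample thins out (illustrative only)."""
        recent = trailing_window(local, now)
        n = len(recent)
        if n == 0:
            # Nothing local: fall back to the nationwide distribution outright.
            return Estimate(national_mean, 0, True)
        local_mean = mean(s["salary"] for s in recent)
        # Weight local evidence by n against a pseudo-count on the prior.
        prior_strength = THRESHOLD  # acts like 30 "virtual" national samples
        blended = (n * local_mean + prior_strength * national_mean) / (n + prior_strength)
        return Estimate(blended, n, low_confidence=(n < THRESHOLD))

    now = datetime.now(timezone.utc)
    samples = [{"salary": 180_000, "observed_at": now - timedelta(days=10)},
               {"salary": 172_000, "observed_at": now - timedelta(days=45)}]
    est = score_slice(samples, national_mean=165_000, now=now)
    # n=2 is well under the threshold, so the estimate leans on the
    # nationwide prior and carries the low-confidence flag.

The real system widens to a broader peer group before falling all the way back to the nationwide prior; this sketch collapses that into a single shrinkage step.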
Pay transparency laws in CA/NYC/CO/WA are helping, but coverage is still patchy.
Right now I'm not using any user-provided data in the calculations: the user base is too small, and the re-identification risk is too high. Eventually I want to add opt-in data submission so we can run real-time, metro-aware compensation surveys based on consented, anonymized peer data.
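When that opt-in path lands, the identification concern above likely translates into a minimum-cohort rule before any user-submitted number can influence a published statistic. A k-anonymity-style gate is one common way to do that; this is a sketch of the idea, not a committed design, and every name in it is hypothetical.

    from collections import defaultdict

    K_MIN = 30  # one option: mirror the n >= 30 threshold from scoring

    def publishable_slices(submissions):
        """Group opt-in submissions by metro+role+level and return only
        the cohorts large enough that no individual is identifiable
        from the aggregate (a k-anonymity-style minimum cohort size)."""
        cohorts = defaultdict(list)
        for sub in submissions:
            key = (sub["metro"], sub["role"], sub["level"])
            cohorts[key].append(sub["salary"])
        # Suppress any cohort below the threshold entirely, rather than
        # publishing a statistic a coworker could reverse-engineer.
        return {key: salaries for key, salaries in cohorts.items()
                if len(salaries) >= K_MIN}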