1 point by Gyanangshu 4 hours ago | 1 comment
  • smcin 4 hours ago
    Because "ID columns" and "value columns" (or "pivot") are dry statistician jargon, unlike terms such as "wide-to-long", and most of your users won't have an MS in statistics or economics. So, as you already do, just visually separate the columns into "columns used to identify the data" (e.g. Year-WkOfYear-Store-Department) and "value columns". (Posting a few screenshots for illustration would be really good.)
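
    For anyone unsure what "wide-to-long" means in practice, here is a minimal sketch in plain Python. The function name `wide_to_long`, the output column names `variable`/`value`, and the toy rows are all made up for illustration; this is not your tool's implementation.

```python
def wide_to_long(rows, id_cols, value_cols, var_name="variable", value_name="value"):
    """Turn each wide row into one long row per value column."""
    long_rows = []
    for row in rows:
        # The identifying columns are copied unchanged onto every output row.
        ids = {col: row[col] for col in id_cols}
        for col in value_cols:
            long_rows.append({**ids, var_name: col, value_name: row[col]})
    return long_rows


# Toy data (invented): one wide row with two value columns.
wide = [{"Year": 2023, "Store": "S01", "month_1": 120.5, "month_2": 98.0}]
long_rows = wide_to_long(wide, id_cols=["Year", "Store"], value_cols=["month_1", "month_2"])
```

    Each wide row becomes one long row per value column, with the identifying columns repeated on each.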

    > The harder part has been defaults. If the dataset has clear patterns, like columns named month_1 to month_26, it’s easy to guess what should be treated as values. But when naming is inconsistent, the guesses are often off.

    Well, you'll probably need to iterate with user assistance via your UI, but you can often eliminate a lot of candidates by filtering on the inferred data type (float/integer/date/categorical/string/etc.), the permitted range of values (e.g. 'Sales' is probably a positive float or integer), units, and formatting. You can also assume that related columns tend to be (fairly) contiguous.
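
    A rough sketch of how a couple of those filters might combine for a first-pass default, in plain Python. Everything here is an assumption for illustration: the function name `guess_value_columns`, the "shared name stem plus trailing number" rule, the `min_run` threshold, and the toy data. It only covers the type and contiguity checks, and the user would still confirm or correct the guess in the UI.

```python
import re


def is_number(text):
    """True if the cell parses as a float (a crude stand-in for type inference)."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def guess_value_columns(header, rows, min_run=3):
    """Guess value columns as contiguous runs of numeric columns that share
    a name stem and end in a number, e.g. month_1, month_2, month_3."""
    # Stem = column name with its trailing digits removed ("month_12" -> "month_").
    stems = [re.sub(r"\d+$", "", name) for name in header]
    has_suffix = [re.search(r"\d+$", name) is not None for name in header]
    numeric = [all(is_number(row[i]) for row in rows) for i in range(len(header))]

    value_cols, run = [], []
    for i in range(len(header)):
        candidate = numeric[i] and has_suffix[i]
        if candidate and run and stems[run[-1]] == stems[i]:
            run.append(i)  # extends the current contiguous run
        else:
            if len(run) >= min_run:
                value_cols.extend(header[j] for j in run)
            run = [i] if candidate else []
    if len(run) >= min_run:
        value_cols.extend(header[j] for j in run)

    id_cols = [name for name in header if name not in value_cols]
    return id_cols, value_cols


# Toy data (invented). Year and WkOfYear are numeric but are not treated as values.
header = ["Year", "WkOfYear", "Store", "Department", "month_1", "month_2", "month_3"]
rows = [
    ["2023", "14", "S01", "Toys", "120.5", "98", "101.25"],
    ["2023", "15", "S02", "Garden", "87", "91.5", "110"],
]
id_cols, value_cols = guess_value_columns(header, rows)
```

    Note that data type alone would wrongly flag Year and WkOfYear as values, which is why the sketch also requires a shared stem and contiguity. Inconsistent naming will still defeat it, so treat the result as a suggestion to show the user.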

    Post us a corner case or two. It helps if you tell us what domains your data typically comes from.