Ask HN: Algorithm to find the numbers in a set where the numbers equal x?

  • Are you looking for a subset that adds up to X, or all subsets that add up to X?

    Actually, thinking about it more, I'm not sure that matters. In either case, I think you could construct the power set[1] of the original set, and then just iterate over the results. For each subset with more than 1 element, sum the elements and if it equals X you have a candidate answer. You can stop at the first "hit" if you only want one answer, or just keeping going until you've exhausted the power set if you need them all. Implementing a power set generating algorithm is a solved problem[2], although you may have to stare at it a bit before you truly "get it". I did anyway.

    There are, of course, pre-existing implementations of the power set algorithm. One I'm familiar with is in Guava[3].

    [1]: https://en.wikipedia.org/wiki/Power_set

    [2]: https://stackoverflow.com/questions/24365954/how-to-generate...

    [3]: https://google.github.io/guava/releases/21.0/api/docs/com/go...

    Edit: I say "look at the subsets with more than one element" because otherwise, the one element set that sums to X, is just X and I'm assuming that's a trivial answer that you don't care about.

  • Though I don't have a proof this smells like NP to me. For a set of size |n| the search space is |2^n| and there may be no combination that sums up to the target so naively the all |2^n| combinations have to be checked.

    If my intuition that it is in NP is correct, then the most efficient searches will require analysis of the Set and Target to understand the ways in which the are not random, because there is no generally fast algorithm for problems in NP with random data. But again, I could be wrong and the problem could be polynomial.

    Once there is actual data, heuristics might make it obvious that there is no solution:

      Target == 0
      Domain == {1}
    
    Or allow removal of some values. Assuming a O(n log n) sort of the set into an array. O(n log n) is rounding error on O(2^n) or virtually free. Particularly for problems of interesting size.

      Target == 3
      Domain == {2, 4, 5, 1}
    
    The more we know about the data, the more we can prune the search space:

      Target == 118
      Domain == {3,6,9,12...}

  • You gotta get out more! :)