21 points by jivaprime a month ago | 3 comments

mkl a month ago
TSP = Travelling Salesman Problem (https://en.wikipedia.org/wiki/Travelling_salesman_problem)
PPO = Proximal Policy Optimisation, a reinforcement learning algorithm (https://en.wikipedia.org/wiki/Proximal_Policy_Optimization)
- n8henrie a month ago
  Thanks. Was wondering if this was about my federal thrift savings plan.
miga a month ago
Also compare with LKH3 which seems much faster and closer to optimal.
whatever1 a month ago
Sorry if I am harsh, but a 1200-node TSP is a toy problem. We can find proven optimal solutions to these in a fraction of the time you spent.
RL is probably best suited to instances affected by uncertainty.
- whatever1 a month ago
  Out of curiosity, I solved it with the Concorde solver on the NEOS Server.
  In 58 s its heuristic found a solution 0.037% away from optimal, and in 943 s it found and proved the optimal solution.
  (This is with 3 GB of RAM and 4 threads of an Intel Xeon E5-2698 @ 2.3 GHz, aka a 30-year-old algorithm on a 10-year-old machine.)
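For anyone unfamiliar with the "% away from optimal" figure: it is the relative gap between the length of the tour found and the length of the optimal tour. A minimal Python sketch of that arithmetic follows. The function names and the 4-city unit-square instance are made up for illustration; this is not the 1200-node instance from the thread and has nothing to do with Concorde's own code.

```python
import math


def tour_length(points, tour):
    """Euclidean length of a closed tour that visits `points` in `tour` order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]])
        for i in range(n)
    )


def optimality_gap_percent(found_length, optimal_length):
    """Relative gap in percent: how much longer the found tour is than the optimum."""
    return 100.0 * (found_length - optimal_length) / optimal_length


# Hypothetical toy instance: the four corners of a unit square.
points = [(0, 0), (1, 0), (1, 1), (0, 1)]

optimal = tour_length(points, [0, 1, 2, 3])   # walks the perimeter
crossing = tour_length(points, [0, 2, 1, 3])  # uses both diagonals

gap = optimality_gap_percent(crossing, optimal)
print(f"optimal={optimal:.3f} crossing={crossing:.3f} gap={gap:.2f}%")
```

A gap of 0.037% means the heuristic tour is longer than the optimal one by that fraction of the optimal length.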