Suits, money laundering, and linear programming

(andrewpwheeler.com)

75 points | by apwheele 9 months ago

30 comments

  • seanhunter 8 months ago

    BTW before anyone gets any bright ideas about trying this in real life:

    This is called "layering" and is often a criminal offense [1], and banks have effective means to detect it (as they are required to by law). In particular they are highly attuned to flows of money that seem to split up large bunches of cash into smaller amounts that the would-be money launderer thinks will be below some static threshold. The detection systems that catch this have been in place at every bank I've worked with (since around 2010, when I first touched any anti-money-laundering work, and they were very mature and well-established even then)[2].

    You are highly unlikely to get away with it unless you have already been doing it for some time or know someone who has and is willing to share their techniques.

    [1] Obviously or not, it's only a criminal offense if you're doing it to evade detection (e.g. for tax fraud or to avoid paying, say, a court settlement) or are otherwise laundering the proceeds of crime - that sort of thing. If you're just doing it for your own amusement it will probably get you added to a watchlist of suspicious accounts at your bank that they will monitor extra closely, and may get you investigated by various financial crime enforcement organisations. Which, even if you haven't done anything wrong, I imagine would be stressful and time-consuming.

    [2] To give people who are not part of this world the idea: at least in the UK and US, all employees of any kind of deposit-taking institution are required by regulators to have annual anti-money-laundering training which covers layering and how to detect it. So literally every bank employee should know what this pattern looks like and should be on the lookout for it.

    • Shatnerz 8 months ago

      > This is called "layering"

      I believe it is actually "structuring".

      Layering is a step in money laundering where layers of legitimacy are added as money moves around. Structuring is breaking larger transactions into smaller transactions in order to avoid detection. Smurfing is also similar to structuring and honestly I don't understand the difference well enough to explain, but they are often used interchangeably in my experience.

      edit: Here is a short educational video on the topic of money laundering: https://www.youtube.com/watch?v=RhsUHDJ0BFM&t=90s

      • seanhunter 8 months ago

        Ooh yes. Thanks for the correction.

  • JohnKemeny 8 months ago

    This is Subset Sum and is famously NP-complete so, no, an LP doesn't work. However, the problem is only weakly NP-hard and there is a pseudo-polynomial time algorithm running in time O(M • n) where M is the sum and n is the number of accounts.

    The problem, perhaps most of all, is that it takes the same amount of space.

    Ps, it's not a permutations problem but a subsets problem. While the number of permutations is n!, the number of subsets is "only" 2^n.
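
    To make that 2^n baseline concrete, here is a rough brute-force sketch (the deposit amounts and target are made up for illustration):

        # Rough sketch: try every subset until one hits the target.
        # Fine for a handful of accounts, hopeless as n grows (2^n subsets).
        from itertools import combinations

        accounts = [1200, 560, 3300, 75, 980, 2240, 415]  # made-up deposits
        target = 4355                                      # made-up total

        match = next(
            (combo for r in range(1, len(accounts) + 1)
             for combo in combinations(accounts, r)
             if sum(combo) == target),
            None,
        )
        print(match)  # e.g. (3300, 75, 980)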

    • whatever1 8 months ago

      This is a MIP. CBC is an open source solver with a very basic implementation of branch & bound, cuts and search heuristics.

      Any modern commercial solver (Gurobi, Xpress, IBM-CPLEX, COPT) can solve problems of this size pretty fast.
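
      For a concrete sense of what that looks like, here is a rough sketch (not the article's code) posing the subset-sum selection as a feasibility MIP in PuLP with its bundled CBC solver; the deposits and target are made up:

          # Rough sketch: subset sum as a feasibility MIP, solved with CBC via PuLP.
          import pulp

          accounts = [1200, 560, 3300, 75, 980, 2240, 415]  # made-up deposits
          target = 4355                                      # made-up total to match

          prob = pulp.LpProblem("subset_sum", pulp.LpMinimize)

          # One binary variable per deposit: 1 if it is part of the chosen subset.
          x = [pulp.LpVariable(f"x_{i}", cat="Binary") for i in range(len(accounts))]

          prob += 0  # pure feasibility problem, so any constant objective works
          prob += pulp.lpSum(a * xi for a, xi in zip(accounts, x)) == target

          prob.solve(pulp.PULP_CBC_CMD(msg=False))
          if pulp.LpStatus[prob.status] == "Optimal":
              print([a for a, xi in zip(accounts, x) if xi.value() > 0.5])
          else:
              print("no subset matches the target")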

      • JohnKemeny 8 months ago

        I agree that you can solve it using MILP, and commercial solvers will solve many instances of the problem pretty fast. But even 3SUM is conjectured to need roughly Ω(n²) time, so given large enough n, some instances will take a long time, regardless of solver.

        • whatever1 8 months ago

          This is assuming exhaustive search.

          This is not what math programming does. Theoretically, if you derive tight enough cuts for the problem and the solution lies on a vertex of the relaxed simplex, you can get the optimal solution in polynomial time.

          • almostgotcaught 8 months ago

            > Theoretically, if you derive tight enough cuts for the problem and the solution lies on a vertex of the relaxed simplex, you can get the optimal solution in polynomial time.

            I hate when people cloak very obvious things in jargon to make those things seem more insightful and disguise how obvious they are. What you're basically saying is if you're lucky and you win the lottery then you'll be rich.

            I.e., yes, we all know that if you luck out then interior point will give you the optimum in polynomial time. But the whole reason MIP is NP-hard is exactly that deriving those cuts in general isn't possible (at least currently).

            > This is not what math programming does

            So what gurobi/cplex/etc actually do is first try a bazillion symmetry transformations/warm-start heuristics/etc, but ultimately it boils down to search.

            • whatever1 8 months ago

              No and No.

              This way of thinking describes the plateau that MIP solvers hit in the 90's.

              Then in the 2000's they started implementing actual math (valid cuts, mainly in the root node but also during the search), which is what gave us the exponential speedups we enjoy today.

              So you would be right 30 years ago. You are wrong in 2024.

              • almostgotcaught 8 months ago

                > No and No.

                > which what gave us the exponential speed ups we enjoy today

                You're just patently wrong because

                1. P=?NP is still open

                2. You can consult the implementation of any solver

                There's really nothing more to be said about it because both of these bullets are formal proofs that you are wrong.

                • whatever1 8 months ago

                  Simplex algorithm should never work given its exponential worst case performance. Yet it solves the vast majority of LP problems, including ones with billions of variables. It is the default algorithm for every solver despite the fact that we have a polynomial interior point method at our disposal. Go figure.

                  Worst-case performance or NP-completeness is not a predictor of how fast we can solve an optimization problem.

                  • SkiFire13 8 months ago

                    > Simplex algorithm should never work given its exponential worst case performance.

                    Note, however, that it has been proven that if you slightly perturb the inputs then the expected complexity becomes polynomial, meaning the inputs for which it's exponential are just pathological cases and it is polynomial in practice.

                  • JohnKemeny 8 months ago

                    > should never work given its exponential worst case performance

                    This is again not true. We know why it works well. See the paper

                    "Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time"

                    by Spielman and Teng.

                    • whatever1 8 months ago

                      We only started understanding why it works so well some 50 years after its first implementation (in the 1950's)? The paper you mention was written in the 2000s, while the simplex method had dominated the field since at least the 70's.

                      Meanwhile the GP would have discarded the simplex algorithm due to its worst case performance.

                • kxyvr 8 months ago

                  Honestly, the parent is pretty accurate. No one is claiming that P = NP. However, the technology to solve mixed integer programs has improved dramatically over the last 30 years, and that improvement in performance has outpaced gains in raw computational speed by multiple orders of magnitude. It's the algorithms.

                  I just went to pull up some numbers. The following comes from a talk that Bob Bixby gave at ISMP in 2012. He is the original author of CPLEX and one of the current owners of Gurobi. Between 1991 and 2007, CPLEX achieved a 29530x speedup in their solvers. Their 1997/98 breakthrough year attributes the following speedups: cutting planes 33.3x, presolve 7.7x, variable selection 2.7x, node presolve 1.3x, heuristics 1.1x, dive probing 1.1x. I don't have a paper reference for these numbers and I don't think he has published them, but I was at the talk.

                  The point is that integer programming solvers perform unreasonably well. There is theory as to why. Yes, there is still a lot of searching. However, search in and of itself is not sufficient to solve the problems that we regularly solve now. Further, that increase in performance is not just heuristics.

                  • whatever1 8 months ago

                    Here is the paper

                    https://www.emis.de/journals/DMJDMV/vol-ismp/25_bixby-robert...

                    FYI we’ve probably crossed paths :)

                    • kxyvr 8 months ago

                      Laugh. Probably! I gave a talk at that conference titled, "Software Abstractions for Matrix-Free PDE Optimization with Cone Constraints." I still work in the field, so if you want to talk algorithms sometime, feel free to send me an email. I keep my email off of HN to limit spam, but if you search for the lead author on that presentation, it should list my website.

    • w1 8 months ago

      This is also a problem for accountants, trying to do “account reconciliation” to determine which transactions contribute to a given balance.

      I made a simple tool that does this client-side for an accountant friend a while ago: https://reconciliation.spcdx.com/

      (Warning: runtimes scale quickly, due to the time complexity described above)

    • TacticalCoder 8 months ago

      Can't subset sum like in TFA be solved by dynamic programming using a modified 0-1 knapsack? (modified to keep track of selected items)

      • JohnKemeny 8 months ago

        Yup, in time (and space) O(M • n), so with M = 1 million and, let's say, n = 1 million, the table has a trillion entries. At a byte per entry, that's a TB of memory, but it is otherwise doable (rough sketch below).

        Ps, you don't want to risk a running time of the form O(Mn²).
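
        Such a modified 0-1 knapsack could look roughly like this in Python (a sketch, assuming integer amounts, e.g. values in cents; the numbers are made up):

            # Rough sketch of the O(M * n) dynamic program, with enough
            # bookkeeping to recover which deposits were chosen.
            def subset_sum(values, target):
                # parent[s] = (previous_sum, index_used) for one way to reach sum s
                parent = {0: None}
                for i, v in enumerate(values):
                    # Iterate over a snapshot so each value is used at most once (0-1).
                    for s in list(parent):
                        t = s + v
                        if t <= target and t not in parent:
                            parent[t] = (s, i)
                if target not in parent:
                    return None
                # Walk back from the target to recover the chosen indices.
                chosen, s = [], target
                while parent[s] is not None:
                    s, i = parent[s]
                    chosen.append(i)
                return sorted(chosen)

            print(subset_sum([1200, 560, 3300, 75, 980, 2240, 415], 4355))  # [2, 3, 4]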

    • JadeNB 8 months ago

      > weakly NP-hard

      What does this mean?

      • JohnKemeny 8 months ago

        Weakly NP-complete problems are problems that are numeric in nature and that become polynomial-time solvable if the numbers are encoded in unary.

        https://en.wikipedia.org/wiki/Weak_NP-completeness

        • JadeNB 8 months ago

          Ah, thanks. Then, since "NP-hard" and "NP-complete" are different, I would expect "weakly NP-hard" and "weakly NP-complete" to be different; but, at least according to Wikipedia, they are synonymous.

  • taeric 8 months ago

    I used to walk people through the "2sum" and "3sum" variants of this for an interview question. I was not looking for anything complicated; it was easy enough to talk through solutions that we could play with. Had a few people who would take us down some rather complicated approaches. (Even gave offers to some that got through some complicated solutions that wouldn't work.)

    I never tried doing an "optimal" solution. Thinking about the problem, I'm curious if a BDD approach wouldn't work? Will try and write one up to see. Fun times.

    • morkalork 8 months ago

      Please tell me the position entailed more than just CRUD application work

      • taeric 8 months ago

        I said I gave offers even for folks that didn't get it working. :D

        I was more using this as a way to discuss code. And "2sum" is a pretty trivial problem to solve linearly (see the sketch below), especially if you allow high level language use.

        (To directly answer the question, there was more than CRUD. Probably still more CRUD than folks would expect.)
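
        For reference, the usual linear-time "2sum" approach with a hash set looks roughly like this (a minimal sketch; the numbers are made up):

            # Rough sketch of linear-time 2sum: for each value, check whether
            # its complement has already been seen.
            def two_sum(values, target):
                seen = set()
                for v in values:
                    if target - v in seen:
                        return (target - v, v)
                    seen.add(v)
                return None

            print(two_sum([3, 9, 14, 5, 11], 20))  # (9, 11)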

  • wakawaka28 8 months ago

    In the real world this would be more complicated, because there could be more mixing and withdrawals. Banks also have a lot of visibility into account transactions, at least for accounts whose balances they know.

  • rustcleaner 8 months ago

    No, money needs to be like Monero and the state can f- right off with their panopticon financial surveillance. Sometimes it's better to be armed and tolerate or handle an occasional little bad-guy, than it is to succumb to a supermassive Goku-monster bending you over a barrel and freeing you into life-long [legal] disability with his gargantuan **** of [In]Justice™ for failure to report and pay that last $40 in income tax!

    Ben Franklin on security and liberty, or something.

  • phinnino1 8 months ago

    amazing