Traceroute Isn't Real

(gekk.info)

98 points | by radeeyate 9 hours ago

39 comments

  • echoangle an hour ago

    > Traceroute, as far as the industry is concerned, does not exist.

    > Look it up. There is no RFC. There are no ports for traceroute, no rules in firewalls to accommodate it, no best practices for network operators. Why is that?

    I know this is getting into semantics but this argument is ridiculous. Everything that isn’t explicitly specified in an RFC and has its own protocol doesn’t exist to the industry? Who thinks like that? It’s using behavior of the system to get a result, how does that mean it doesn’t exist? If I do a Speedtest by sending traffic over the internet, does my program not exist because there is no SpeedTest Protocol with its own port, and no RFC has ever been written about it?

    • jojobas 27 minutes ago

      >That's the situation where you, a gay leftist, go to Thanksgiving dinner with the family, and a shitty uncle sits across from you and begins telling lies about society, about people of color, about gay marriage, and so on.

      Kinda checks out?

      Anyone who's ever used traceroute knows that quite a few routers won't reply. It's still the best available tool to figure out a bunch of problems.

  • yuliyp 4 hours ago

    I think the "The Worst Diagnostics In The World" section is a bit simplistic about what traceroute does tell you. It can tell you lots of things beyond "you can reach all the way". Specifically, it can tell you at least some of the networks and locations your packet went through and it can tell you how far it definitely got. These are extremely powerful tools as they rule out lots of problems. It's useful to be able to hand an ISP a "look, I can reach X location in your network and then the traceroute died" and they can't wonder "are you sure your firewall isn't blocking it?"

    It's still a super-common tool for communicating issues between networking teams at various ASes. That the author's ISP thought they were too small to provide reasonable support to is not a strike against traceroute. Rather, it's a strike against that ISP.

    • SkyPuncher 2 hours ago

      Traceroute was immensely helpful for me figuring out why my favorite video game tended to have large latency and massive lag spikes.

      Geographically, where I lived, my connection should have been about 220 miles directly to Chicago. Instead, my connection traveled about 180 miles west to Minneapolis then 350 miles down to Chicago. Because this involved a bunch of extra network switches, my packets would often get buffered and sometimes delivered out of order (this was obvious by how the game worked).

      A fiber provider came to town and solved all of my connection issues. Not only was the connection inherently faster, but it had fewer hops and was routed more directly to Chicago (where this game had a datacenter).

      I think I went from nearly 100ms ping to 10ms ping.

      • notjoemama 26 minutes ago

        Nice! Back in the day I provided remote support and occasionally ran trace to diagnose data transfer issues. The few times there was a problem with a hop, there wasn’t anything that could be done. But each time I was able to find either a news article on inclement weather or a post on the ISP website about a network issue. Worked with a guy that spotted a car wreck from trace. Well, it was down the street and so was the trunk cnx.

    • steve_adams_86 3 hours ago

      Traceroute has helped me solve a lot of problems quickly and easily.

      A few weeks ago I was volunteering for a local political party and they had several services down. They had no idea where they were hosted, how they were hosted, why, etc.

      I ran traceroute on all of them and within minutes I was able to tell which ones were hosted together and approximately where, and when I brought that to the team it was enough information to jog memories and map IPs and WHOIS data to various services, data from email searches, etc.

      Without that it would have been a lot of guesswork, possibly for days.

      It turned out most of them were hosted by a service which moved their accounts to a new IP. One other was hosted elsewhere and turned out to be broken for longer than they realized.

      Absolute chaos.

  • FxChiP 27 minutes ago

    No, but it's actually worse than that.

    Not only is it contingent on your intermediaries actually responding to your packet with the diagnostic information you want, it assumes that the diagnostic response will also be able to get back to you. If, for instance, your links are failing over super frequently or you have something hilarious happen like the response packet ALSO having a too-low TTL, you may not get a response as you expect.

    But wait, there's more! Precisely because of that stepping-increase of TTL, by necessity, it must send as many TTLs as necessary to reach the endpoint. That means one packet per TTL. Remember what I said about links flapping? There is no guarantee that any two packets will or even should go the same route, for any number of reasons, some potentially even legitimate. In some situations you may see different hops between hosts that aren't actually even physically connected!
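
    To make the one-probe-per-TTL point concrete, here is a minimal Python simulation, not a real traceroute (no packets are sent). The router names, the two paths, and the `alternate` picker are all made up for illustration; it assumes both paths end at the same destination and that each probe can be routed independently.

```python
def simulated_trace(paths, pick, max_ttl=30):
    """Send one simulated probe per TTL and record which hop 'answers'.

    paths: list of hop lists; the last entry of each list is the destination.
    pick:  function (ttl, paths) -> the path this particular probe takes.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        path = pick(ttl, paths)      # routing can differ for every probe
        if ttl >= len(path):
            hops.append(path[-1])    # probe reached the destination
            break
        hops.append(path[ttl - 1])   # TTL expired at this router
    return hops


# Hypothetical topology: two equal-length paths that share only the ends.
PATH_A = ["r1", "a2", "a3", "dst"]
PATH_B = ["r1", "b2", "b3", "dst"]


def alternate(ttl, paths):
    """Worst-case picker: consecutive probes take different paths."""
    return paths[ttl % len(paths)]


result = simulated_trace([PATH_A, PATH_B], alternate)
print(" -> ".join(result))
```

    The printed path is r1 -> a2 -> b3 -> dst, even though a2 and b3 sit on different paths and share no link in this toy topology.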

    And I love MTR, but it can handle some of these issues really... interestingly. I seem to semi-regularly see it in a state where it's showing a bunch of packet drops, but really I just have to refresh the display because some state or another got desynchronized.

    That said, on simple paths that don't change a whole lot, it's great. A very clever way to expose information you might not otherwise ordinarily have that might even be key to resolving any given issue. You just have to remember just how surprisingly much of networking is made up.

  • Chihuahua0633 8 hours ago

    I wish MTR (My Traceroute) was standard in all operating systems. It offers a number of benefits over Traceroute. MTR essentially combines the functionalities of traceroute and ping, providing a more comprehensive and dynamic view of network paths and performance.

    MTR runs continuously, gathering real-time stats that reveal both packet loss and latency trends over time. MTR provides minimum, average, and maximum response times, plus the standard deviation. This is especially useful for troubleshooting intermittent issues or spotting latency spikes.
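
    For anyone wondering what those per-hop columns amount to, here is a rough Python sketch of the arithmetic over a series of probe results. It is not MTR's actual code; the `summarize` name, the sample values, and the choice of population standard deviation are assumptions made for illustration.

```python
import statistics


def summarize(rtts_ms):
    """Summarize probe results for one hop.

    rtts_ms: list of round-trip times in milliseconds, with None for a
    probe that got no reply.
    """
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    summary = {"sent": sent, "loss_pct": loss_pct}
    if replies:
        summary.update(
            best=min(replies),
            avg=statistics.fmean(replies),
            worst=max(replies),
            # Population stdev is an assumption; a tool may use another estimator.
            stdev=statistics.pstdev(replies),
        )
    return summary


# Hypothetical samples: four probes, one of which went unanswered.
stats = summarize([10.0, 12.0, None, 14.0])
print(stats)
```

    Note that "loss" here only means "no reply came back to the sender", which is exactly the ambiguity the replies below argue about.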

    Of course, MTR isn’t perfect and still faces some of the same challenges as traceroute, like dealing with ICMP rate-limiting, load-balanced paths, or certain network setups that obscure hops. But overall, it provides a richer, more nuanced view, making it a preferred tool for network diagnostics and troubleshooting.

    • 1xdevnet 4 hours ago

      As long as you're the only one doing the troubleshooting, sure. FTA:

        > The example given is that a handful of users running MTR (do not get me started on this bastard program) can actually hit this rate limit. This is an outstanding example because I have seen something similar in practice.
      
        > Consider what that would look like, and how common it would be: If you have a NOC full of people who think they know what they're doing, but don't, that only enhances the probability that everyone is trying to troubleshoot on their own instead of doing a screenshare and coordinating their efforts - thus, you have six guys running MTR to the same IP.

    • sneak 4 hours ago

      MTR does none of the things you claim it does, which is directly addressed in the article.

      It certainly claims to, and displays figures as if it could, but it cannot.

      Even pinging the router IPs directly does not tell you your latency or packet loss to the router, for the reasons explained in the article.

      mtr is built on false pretenses.

    • belfalas 6 hours ago

      Plus one for MTR, it’s a great tool. Not perfect, but great.

  • theginger 5 hours ago

    Traceroute isn't real, I like this, sounds like something I'd say. But it is real, it's definitely real, it's installed on my laptop, it's probably installed on yours. I know fairly well how it works and understand many of its shortcomings, but I still love a traceroute, as does absolutely everyone, even most people who truly understand it, or it really wouldn't exist. I am a visual understander, and a picture paints a thousand words: even a distorted, unreliable picture with a handful of sometimes-useful details hiding amongst a great deal of irrelevant nonsense is sometimes better than nothing at all.

    It's not an ugly hack; it's a beautiful, elegant solution to the problem of not knowing how your traffic is most probably being routed.

  • toast0 3 hours ago

    This is a nice rant, but telling people not to use this tool because it has many flaws isn't very helpful. AFAIK, there's not a better tool out there to use; the alternative is hoping things will get better by magic and the passage of time (which, to be fair, is somewhat effective; but not ideal if you've got things to do)

    Yes, traceroute doesn't address the fact that it's hard to get in touch with someone who can help. Sure, anything to do with ICMP probably has to deal with rate limiting (and the "two people are tracing, so the packet loss is 50%" effect is real, and frustrating). But when I've had network problems and a contact who is willing to help, they really want a traceroute or mtr to help narrow down where the problem is.
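
    The shared rate limit effect is easy to reproduce on paper. Below is a toy Python model, assuming a router that generates at most a fixed number of ICMP replies per second regardless of who asked; the function name, the one-reply-per-second budget, and the round-robin arrival order are all illustrative assumptions, and real limits vary by vendor and configuration.

```python
def observed_loss(num_tracers, probes_per_sec, replies_per_sec, seconds=60):
    """Return the loss percentage each tracer would see at one rate-limited hop."""
    sent = [0] * num_tracers
    answered = [0] * num_tracers
    turn = 0  # rotates so no single tracer always wins the reply budget
    for _ in range(seconds):
        budget = replies_per_sec  # the router's reply allowance for this second
        for _ in range(probes_per_sec):
            for i in range(num_tracers):
                who = (turn + i) % num_tracers
                sent[who] += 1
                if budget > 0:
                    answered[who] += 1
                    budget -= 1
            turn += 1
    return [100.0 * (s - a) / s for s, a in zip(sent, answered)]


alone = observed_loss(num_tracers=1, probes_per_sec=1, replies_per_sec=1)
pair = observed_loss(num_tracers=2, probes_per_sec=1, replies_per_sec=1)
noc = observed_loss(num_tracers=6, probes_per_sec=1, replies_per_sec=1)
print(alone, pair, noc)
```

    In this model one tracer sees no loss, two see 50% each, and six see roughly 83% each, while the router forwards real traffic perfectly the whole time.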

    The trick is finding the right settings to get a mtr that shows what you need to show. My big problem that I needed mtrs for was server a talking to server b over several hops with 2 or 4 way aggregation on each hop. Most of the paths are clean and I can see 0% loss, but there's one link in there with say 10% loss. Default settings will not get you anything useful; you've got to test many 5-tuples (dst host, src host, protocol, dst port, src port) to find one that shows loss and one that doesn't, and then send an mtr from those. You may want to run mtrs in the reverse direction too. You'll need to have a slow probe rate for the mtrs you share, to avoid/reduce the rate limiting issues.
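
    A sketch of the "test many 5-tuples" idea, in Python. It assumes each hop picks one of its aggregated links by hashing the flow's 5-tuple; CRC32 here is a stand-in, since real routers use vendor-specific hash functions you generally cannot see. The addresses, port range, and function names are made up for illustration.

```python
import zlib


def pick_link(five_tuple, num_links):
    """Map a flow's 5-tuple to one of num_links parallel links (stand-in hash)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return zlib.crc32(key) % num_links


def ports_per_link(src, dst, proto, dst_port, num_links,
                   src_ports=range(33434, 33534)):
    """Sweep source ports and keep the first one that lands on each link."""
    found = {}
    for src_port in src_ports:
        link = pick_link((dst, src, proto, dst_port, src_port), num_links)
        found.setdefault(link, src_port)
        if len(found) == num_links:
            break
    return found


# Hypothetical endpoints from the documentation address ranges.
flows = ports_per_link("192.0.2.10", "198.51.100.20", "udp", 33434, num_links=4)
for link, src_port in sorted(flows.items()):
    print(f"link {link}: probe with source port {src_port}")
```

    In practice you cannot compute the hash yourself, so the sweep is empirical: hold the other four fields fixed, vary the source port, and see which flows show loss and which stay clean.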

    If you can't count on the far side destination definitively responding to pings, your mtrs are going to be too messy to share, unfortunately.

    If there's MPLS in the loop, there's an extension to get data from that too, and sometimes it works.

  • ay 5 hours ago

    The challenge with any tool that uses the probe traffic other than the traffic of interest is that the results may be specific to the probe traffic and completely different from the one you care about.

    In theory, iOAM (https://datatracker.ietf.org/doc/rfc9326/) is a much more robust mechanism.

    In practice, internet works on the least common denominator, which means that Traceroute (which is a clever hack on top of the ICMP TTL exceeded behavior, a required internet standard) is often the best one can have, if at all. (And if not, then one has to resort to uglier hacks)

    That said - one should not underestimate how much info one can dig out by varying TTL/hop count, changing the 5-tuple (source and destination address and ports + protocol), and tweaking the packet rate.

    And the dismissive attitude about “absolutely impossible to do anything with this info unless you are Fortune 500” is wrong. For a counter example of cooperation between the “people of the internet”, here’s a nice presentation:

    https://youtu.be/G_Ir_gRlst0?feature=shared

    As one can derive from the above - it’s absolutely possible, just that the level of SNR required to be reacted to is rather high, well above “my Traceroute is not showing what I think it should be showing”. Which, given the population of the internet, isn’t entirely unreasonable.

  • o11c 5 hours ago

    For all that it mentions common misconceptions, the article is still wrong in at least 3 ways:

    First, traceroutes can, if you control both endpoints, place bounds on where a network error is.

    Second, traceroute is useful if there are three endpoints and you control at least 2 of them.

    Third, you do in fact know something about other people's networks by the mere fact that you've traversed the network before at different times.

  • 1xdevnet 4 hours ago

    I appreciate this, especially for Richard Steenbergen's traceroute presentation which I need to dig into. There's a 2020 version of the presentation if anyone is interested, since the Scribd/Slideshare version in the article is from 2014-ish.

    video: https://www.youtube.com/watch?v=L0RUI5kHzEQ

    slides: https://storage.googleapis.com/site-media-prod/meetings/NANO...

    Edit: yes, I fully agree that traceroute is flawed, it's only ever going to give you an incomplete or even misleading piece of the picture and you shouldn't take what you see as gospel. That said, it has its uses especially for networks that you control and to let you know where to maybe start digging - which is all that any tool does.

  • pjsg 4 hours ago

    traceroute is a useful tool for (amongst other things) determining where a system is physically. It doesn't always give you enough information.

    Traceroute to news.ycombinator.com:

      3   96.108.68.141 (po-200-xar01.maynard.ma.boston.comcast.net)  12.105ms  11.724ms  11.931ms 
    
      4   96.108.68.141 (po-200-xar01.maynard.ma.boston.comcast.net)  13.011ms  10.727ms  19.861ms 
    
      5   162.151.52.34 (be-501-ar01.needham.ma.boston.comcast.net)  13.988ms  14.721ms  12.921ms 
    
      6   162.151.52.34 (be-501-ar01.needham.ma.boston.comcast.net)  14.999ms  16.688ms  12.997ms 
    
      7   4.69.146.65 (ae0.11.bar1.SanDiego1.net.lumen.tech)  76.044ms  79.624ms  78.017ms 
    
      8   4.69.146.65 (ae0.11.bar1.SanDiego1.net.lumen.tech)  83.962ms  108.675ms  78.987ms 
    
      9   *  *  *
    
    
    I can conclude that the server is probably on the west coast -- maybe San Diego.
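
    A rough sanity check on that kind of inference: round-trip time puts an upper bound on distance, since light in fiber covers roughly 200 km per millisecond (about two thirds of c). Here is a small Python sketch using the lowest RTTs for hops 6 and 7 above; the constant is an approximation and the function name is made up. It only bounds how far away a hop can be, because queuing and indirect routes mean the real distance is usually much shorter.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber, about 2/3 of c


def max_distance_km(rtt_ms):
    """Upper bound on one-way distance implied by a round-trip time."""
    one_way_ms = rtt_ms / 2.0
    return one_way_ms * FIBER_KM_PER_MS


# Lowest RTT observed at each hop in the trace above.
needham_ms = 12.997    # hop 6, be-501-ar01.needham.ma.boston.comcast.net
san_diego_ms = 76.044  # hop 7, ae0.11.bar1.SanDiego1.net.lumen.tech

print(f"hop 6 is at most {max_distance_km(needham_ms):.0f} km away")
print(f"hop 7 is at most {max_distance_km(san_diego_ms):.0f} km away")
```

    That comes out to about 1,300 km and 7,600 km, so the timings are consistent with a cross-country jump between hops 6 and 7, but they do not prove one on their own; the hostname is doing most of the work.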

    I recall using this during a sales pitch by some IP Geolocation company that were very proud of their technology. The example that they used, they claimed was in Morristown, NJ. A quick traceroute (from Massachusetts) revealed that the IP was somewhere in the UK as the last hop was close to Heathrow Airport. We did not purchase their solution!

    • sneak 4 hours ago

      This fails in a lot of different ways, not the least of which is anycast. The most common way is probably dns-based geographic steering.

      Can you tell me where 8.8.8.8 is?

      • toast0 3 hours ago

        8.8.8.8 is in many places. It's hard to get a list of exactly where[1], but the more places you can traceroute from, the more places you can add to your list. For 8.8.8.8 in particular, if you've got probes in major PoPs, you can probably get response times of 1 ms or less and be pretty sure that Google has a presence there. Of course, if you get a longer response time, they may have a presence there but your network may not have their local advertisement and your query routes elsewhere; or maybe the response from Google is being routed through another location. Traceroute might help debug a longer route to Google, but it'll be hard to debug the return.

        For dns steering to unicast ips, you can get a reasonable idea of where the ips you can see are, although you'll need to make dns requests from different locations to see more of the available dns answers.

        [1] Unless you're an insider, or maybe Google publishes a list somewhere. Their peeringdb listing of locations is probably a good start, though.

      • kingforaday 3 hours ago

        I know your question is rhetorical, but for those that might find this information useful...

        https://ipinfo.io/8.8.8.8

  • benlivengood 5 hours ago

    Traceroute is often useful because it almost always diagnoses LAN vs. WAN issues. Since most home routers/modems will send TTL Exceeded and so will most edge ISP routers, it is often the quickest way to see where the problem is when there is little or no connectivity.

  • paleotrope 5 hours ago

    Good article. Reminded me of a time when I was told that a server was taking over 2 minutes to respond and they proved it with traceroute.

    They were running "time traceroute host"

  • dsjoerg 3 hours ago

    It can be used wrong! It's not perfect! It could be better!

    • markhahn 3 hours ago

      "not perfect so it's worthless!"

  • markhahn 3 hours ago

    weird - everything claimed as wrong or bad is, IMO, utterly sensible and robust in the real, mean old world. by the author's logic, TCP's entire logic of congestion control is terrible because it's not eg, a deterministic credit-based mechanism that depends on everyone implementing it perfectly.

  • G_o_D 5 hours ago

    It had its use. Back then it was mostly intranets, so it was useful; there weren't many concepts like spoofing, VPNs, Tor-like layering and routing, encapsulation, etc. Yes, in the modern world, on the public internet, it might not be very useful.

    But it still gives a basic overview, a starting point for any forensic investigation or for debugging network problems.

  • zamadatix 5 hours ago

    I always thought it was a bit humorous Ethernet has a more standardized and defined traceroute functionality than IP as part of https://en.wikipedia.org/wiki/IEEE_802.1ag (typically only found in carrier Ethernet solutions in practice though).

  • TZubiri 5 hours ago

    In the journey of learning, it is certainly a good point when our understanding grows so great, that we can point to things that are unknown and claim with certainty that they are bullshit.

    It is no easy feat, as to denounce a concept as bullshit, we need to effectively prove a negative, which depends on knowing either all of the things adjacent to that subject, or most of them, and coming to the conclusion that it's not like the others.

    It's like learning that Astrology and Phrenology are not like Astronomy, or Psychology.

    My current suspect for non-real status is Software Architecture; as a subject of study, I don't think it holds any merit. And to the extent that it denotes something real, that something is already covered in other disciplines with clearer peripheries in classical academic curricula and folklore domains.

  • satisfice 5 hours ago

    This guy is almost comically emphatic. He makes his point so firmly that I almost want to wait for him to finish and ask “So, are you saying Traceroute is not always the best utility for network diagnostics?” then watch his hair catch fire.

  • motohagiography an hour ago

    funny and smart.

  • philipwhiuk 5 hours ago

    I strongly disagree the TTL Exceeded message isn't a feature

    > Features are things that enable functionality. It doesn't do that.

    It does. It enables traceroute. Your entire argument is defeated by the fact it gives you this functionality.

    And yeah it's not perfect. Stop making perfect the enemy of good.

    • LoganDark 3 hours ago

      > It does. It enables traceroute. Your entire argument is defeated by the fact it gives you this functionality.

      The entire argument? You're discounting the entire post based on this one semantics disagreement?

      • iczero 2 hours ago

        > And yeah it's not perfect. Stop making perfect the enemy of good.

  • derefr 4 hours ago

    > No, AT&T is not going to push your complaint up the line to XO. Haha. No.

    Maybe not my complaint... but I'm sure there's somebody that could do it.

    Who would you have to be, to be able to convince AT&T to bother their fiber vendors for you?

    Could, say, the NSA get a routing loop fixed within an hour just by shouting into a phone?

    • sneak 4 hours ago

      You might be able to trade social capital for attention latency by posting to the nanog list.

  • nemo44x 5 hours ago

    Doesn’t matter as it still looks cool when you run it and step away for a while so all passers-by assume serious business is going on.

    • petesergeant 5 hours ago

      This is yet another task mtr is better at

  • Circlecrypto2 9 hours ago

    Cloud infrastructure and firewalls alone are a good reason to not use traceroute.

  • fracus 5 hours ago

    > One of the "chapters" in my presentation was about traceroute, and it more or less said "Don't use it, because you don't know how, and almost nobody you'll talk to does either, so try your best to ignore them." This is not just my opinion, it's backed up by people much more experienced than me. For a good summary I highly recommend this presentation.

    I'm being pedantic but this paragraph was bizarre to read. You are basically telling us we or anyone we know won't know enough about traceroute not to use it but you and many people you know do know enough. It is presumptuous but also inconsistent. Are there people who know, or not?