North American Network Operators Group
Re: Global BGP - 2001-06-23
- From: lucifer
- Date: Sun Jun 24 18:19:14 2001
Brett Frankenberger wrote:
> > Out of curiosity - did anyone see a duration of significant instability
> > in the global routing tables on Saturday afternoon? Without violating NDA,
> > all I can say is that it resembled a historic event involving a bad route,
> > Ciscos, and Bay routers (only this time, it was a bad route, Ciscos, and
> > <X> vendor whom I cannot name but is being soundly beaten with wet noodles
> > to resolve the issue). The bad route, and instability, were seen across
> > all of our transit vendors (all "household" names of transit service).
> Hmm ... why is <X> being beaten? Was the problem reversed this time?
> The only historic event I can recall involving a bad route, Cisco, and
> Bay (actually, events would be better, since it happened at least
> twice) was a case of (a) someone injecting a bad route, (b) the cisco
> at the other end accepting it in violation of the RFC, (c) ciscos
> passing that bad route all around the internet, all in violation of the
> RFC, (d) that route eventually hitting a cisco<->bay peering
> connection, and (e) the Bay (although the problem wasn't limited to
> Bay, as gated, and possibly other implementations as well, behaved the
> same way) properly sending a NOTIFY and taking down the BGP session, as
> required by the RFC.
A) Ciscos flap sessions, according to the only reports I've heard.
B) <X> routers were crashing, either due to the bug, or the session resets.
Thus, <X> is being flogged. I have reports of at least one <Y> having
problems, as well.
C) I would post the BugID, but the only source I have is under NDA. However,
having now heard this much in a public forum (i.e., not covered), I can say
"Invalid AS path data bug".
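For readers who want the RFC-mandated behaviour from the quoted text in concrete form, here is a minimal sketch. It is NOT the bug in question (that remains under NDA) and is not any vendor's code; it only illustrates the general idea of checking an AS_PATH attribute and answering a malformed one with a NOTIFICATION (error code 3, UPDATE Message Error; subcode 11, Malformed AS_PATH). The function names parse_as_path and handle_as_path are made up for illustration, and 2-octet AS numbers are assumed.

```python
import struct

# AS_PATH segment types defined by the BGP-4 specification.
AS_SET = 1
AS_SEQUENCE = 2

# NOTIFICATION error code and subcode defined by the BGP-4 specification.
UPDATE_MESSAGE_ERROR = 3
MALFORMED_AS_PATH = 11


def parse_as_path(data):
    """Parse an AS_PATH attribute value (hypothetical helper).

    Assumes 2-octet AS numbers. Each segment is encoded as:
    type (1 octet), AS count (1 octet), then count * 2 octets.
    Raises ValueError if the encoding is malformed.
    """
    segments = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated segment header")
        seg_type, count = data[offset], data[offset + 1]
        offset += 2
        if seg_type not in (AS_SET, AS_SEQUENCE):
            raise ValueError("unknown segment type %d" % seg_type)
        end = offset + 2 * count
        if end > len(data):
            raise ValueError("segment overruns attribute")
        as_numbers = list(struct.unpack("!%dH" % count, data[offset:end]))
        segments.append((seg_type, as_numbers))
        offset = end
    return segments


def handle_as_path(data):
    """Decide what to do with a received AS_PATH (hypothetical helper).

    Returns ("accept", segments) for a well-formed path, or
    ("notify", code, subcode) when the session should be torn down
    with a NOTIFICATION instead of propagating the route.
    """
    try:
        return ("accept", parse_as_path(data))
    except ValueError:
        return ("notify", UPDATE_MESSAGE_ERROR, MALFORMED_AS_PATH)
```

The point of the sketch is the asymmetry described in the quoted text: a speaker that skips this check passes the bad route along, and the first peer that does perform it drops the session.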
> It only took two major outages before Cisco fixed the problem. (The
> BGP advertisement was posted to NANOG both times, as was the BugID the
> second time.)
I have the guilty announcement, but again, it's under NDA. However, I can
say that we are now seeing this announcement from all of our upstreams,
non-blocked, so it appears that they fixed the originating point.
> So if this is the same issue, Cisco would be the vendor to flog,
> although assuming they didn't re-introduce it, the flogging might more
> correctly be directed at providers still running code old enough to
> have this particular problem.
I would flog Cisco as well, but A) they have a bug on it already, and B)
we're not using Ciscos for our core (note: this is my personal email, and
I am not speaking for my employer; however, this is publicly documented
on my employer's website, so it's not NDAed).
> Both my transits (Bay on my end, Cisco on the other end) made it
> through just fine, though. (This time. The last two times it
> happened, the ciscos on the other end happily passed the invalid route
> to me and the Bay on my end happily dropped the BGP session, and this
> was repeated ad infinitum until the bogus route was removed from the
> other end.)
I have no data on Bay; my apologies if this wasn't clear. Bay was *only*
being referenced as a historical point of note. No attempt at FUD, and my
apologies if anyone read it that way.
Joel Baker System Administrator - lightbearer.com