North American Network Operators Group
Re: Level3 routing issues?
- From: Hank Nussbacher
- Date: Tue Jan 28 11:24:39 2003
At 09:47 AM 28-01-03 -0600, Jack Bates wrote:
> On the other hand, we also know (from private communications and from
> other mailing lists.. ahem) that high rate and high src/dst diversity
> of scans causes some network devices to fail (devices that cache flows, or
> devices that suffer from cpu overload under such conditions).
> Some BGP-speaking routers (not all, by any means, but some subpopulation)
> found themselves pegged at 100% CPU on Saturday. Just one example:
Was it not known that under certain conditions the router would flatline?
Yes. And so does Cisco.
What precautionary measures were put into place in such an event to limit
A very reactive NOC. -Hank
> Whether you believe "anthropogenic" explanations for the instability
> depends on how fast you believe NEs can look, think, and type, compared
> to the speed with which the BGP announcement and withdrawal rates are
> observed to take off. For my part, I'd bet that the long slow exponential
> decay (with superimposed spiky noise) is people at work. But the initial
> blast is not.
When the crisis is on you, it's too late. You are either prepared and know
exactly what to do at that critical moment or you don't. You either had a <5
minute response time to the crisis or you didn't. We also know (from private
communications and from other mailing lists.. yes, I'm a thief :) that many
NEs were caught with their pants down, a mistake they aren't apt to make
again. It comes down to one's outlook. Do you just configure and maintain or
do you strive to push the envelope? Do you truly know your network?
Remember, it's a living, breathing thing. The complexity of variables makes
complete predictability impossible, and so we must learn to understand it
and how it reacts.
Then again, perhaps I'm a lunatic. :)