Merit Network

Discussion Communities: Merit Network Email List Archives

North American Network Operators Group

Re: Cascading Failures Could Crash the Global Internet

  • From: Marshall Eubanks
  • Date: Sun Feb 09 12:43:35 2003


A packet switched network can be engineered against cascading failures
in a way that's hard for a circuit switched network. Every time you see a
random wait in a protocol, it's a good bet that the protocol writers were trying to
protect against the tight coupling that leads to cascading failures.
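As an illustration of the "random wait" idea, here is a minimal Python sketch of randomized exponential backoff. It is not taken from any particular protocol; the function name and the parameters base and cap are made up for this example. The point is only that two senders that fail at the same instant draw different wait times, so they do not retry in lockstep.

```python
import random


def backoff_delays(attempts, base=1.0, cap=60.0, rng=None):
    """Return one randomized wait (in seconds) per retry attempt.

    Hypothetical helper for illustration: each wait is drawn uniformly
    from [0, min(cap, base * 2**n)], so the upper bound doubles with
    every attempt until it reaches the cap.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        ceiling = min(cap, base * (2 ** n))
        # The random draw is what decouples senders from one another.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    # Two senders that failed at the same moment get different schedules.
    print(backoff_delays(5, rng=random.Random(1)))
    print(backoff_delays(5, rng=random.Random(2)))
```

Without the random draw, every sender would wait exactly base * 2**n and all retries would arrive together, which is the kind of tight coupling described above.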

Marshall Eubanks

On Sunday, February 9, 2003, at 10:07 AM, Jack Bates wrote:

From: "Stewart, William C (Bill), SALES"

I think the key is that the failures described in the paper
are caused by overload rather than other things -
too much demand for power blows out the generator,
and without it, the grid tries to get the power from the next
nearest generators, which overload and fail, and try to pull an
even larger amount from the _next_ nearest, etc.
So the bit about heterogeneity is probably referring to
the fact that some nodes are bigger or better-connected than others,
and are more likely to blow out a bunch of their neighbors when
they fail and shed a big load.

That's not really how Internet systems usually fail.

A prime example of this theory was the large network I was using back when
IE5 first came out. They had one circuit bad which overloaded an ATM circuit
at another NAP causing it to generate bit errors. Shutting down the second
circuit overloaded both MAE circuits effectively shutting down the network.
However, it required manual intervention to create full failure, otherwise
TCP would pull back to being useless, effectively killing all connections
going that path, but not causing an issue with other paths until the manual
intervention of shutting down the circuit.

While in theory it was still a cascade failure, it was also poor
planning/policy on the part of the network to not be able to compensate in
case of failure. The information provided may be partially inaccurate and is
only hearsay concerning actual outages and effects when various
interventions were tried; no hard fact. Thus it could be taken as solely my
conjecture and not actual fact.

