DIVERSITY AND RELIABILITY
Flash! Satellite Outage:
" . . . not only were medical
professionals, repair people
and stock watchers left hung
out to dry, but CBS, Reuters,
NPR, UPI and other news
organizations were left looking
for backup. By all
accounts, CBS had seamless backup,
but NPR was in the
middle of broadcasting 'All Things
Considered' and had to
switch to alternate satellites,
ISDN, phone bridges and --
surprise -- RealAudio to get
its feed to local stations."
(Industry Standard Media Grok,
May 21, 1998.)
Two recent network failures
serve as "emblematic events"
for the new telecommunications:
(1) the PanAmSat Galaxy IV
satellite outage, which struck
at 6PM EDT on May 19, and
(2) AT&T's Frame Relay Network
shutdown of several weeks
ago. Both drew attention
to the pervasive, underlying
dependence of civilization-as-we-know-it
upon single
specific communications links.
There are lessons here. "Learn
From History, or . . . "
kind of lessons. For example:
1. Systems that are carefully
crafted and managed by the
"reliability, reliability,
reliability" crowd to be 24-by-7
and fault-free, aren't.
2. If you have not second-sourced
your data transport yet,
do it now.
3. The Year 2000 is coming;
better make that
triple-sourced.
Humanity has an amazing inability
to plan. Not too many
generations ago, when our relatives
lived by hunting and
gathering, the inability to plan
for the next season meant
death. Planners survived. The
clueless died. But today,
Homo sapiens eats at McDonald's
- for the moment, planning
and survival are not strongly
linked.
Inability to plan made news
during the big United Parcel
Service strike of 1997.
Anybody who cared to look could
have seen the strike coming weeks
before it occurred. When
it hit, a school supply company
made news. Its entire
revenue stream hinged on a single
end-of-summer shipment.
UPS was its only shipper. Yet,
in several poignant
interviews with the owner, no
happy-talk reporter ever
asked, "Before the strike,
did you ever think what would
happen if it occurred during
your critical week?"
PHYSICAL DIVERSITY
In the old days, before telecom
competition, Ma Bell was
into Physical Diversity.
Physical Diversity meant that
there were several alternate
routes, each with different
geography, different technology,
and different physical
infrastructure. For example,
a phone call from New York to
San Francisco, using the modern
technologies of the 80s,
might have traveled underground
by cable, or hopped line-
of-sight from microwave tower
to microwave tower, or made
one long leap to a satellite
and another to Earth again.
Today, in the era of competition,
telcos want to be "Low
Cost Providers." Physical
Diversity is an "unnecessary
expense." They want to skinny
down, remove redundancy,
reduce inventory, refine processes
- find the best solution
and stick with it. They
don't need no steenkin' diversity.
Fiber *has* become very reliable,
and SONET makes it even
more so. This raises the
reliability coefficient, but does
not make it 1.0. Too bad
the whole system isn't just rings
of passive glass. There
are all kinds of hardware and
software components to light
the fiber, set up calls, parse
headers, route data, relay frames,
and monitor the system.
There are systems for connecting
customers, and systems for
interconnecting with other networks.
There are systems to
check on the other systems.
The result is that a complex,
linearly engineered system -
designed to be 99.9999 percent
reliable - ISN'T. It is
still a chaotic, adaptive system,
even though it wasn't
designed to be.
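
A back-of-the-envelope sketch of why the chain never reaches 1.0, even before any chaotic behavior enters the picture. It assumes (hypothetically) twenty components in series, each independently available 99.9999 percent of the time; the component count and the independence assumption are illustrative, not figures from any carrier.

```python
from math import prod

def series_availability(component_availabilities):
    """Availability of a chain in which every component must work.

    Assumes failures are independent, which real networks rarely honor.
    """
    return prod(component_availabilities)

# Hypothetical chain: 20 components, each "six nines" on its own.
chain = [0.999999] * 20
total = series_availability(chain)

minutes_per_year = 365 * 24 * 60
downtime = (1 - total) * minutes_per_year  # about 10.5 minutes per year

print(f"end-to-end availability: {total:.8f}")
print(f"expected downtime: {downtime:.1f} minutes/year")
```

Twenty six-nines parts make a system that is no longer six nines, and that is the optimistic case: shared bugs and cascading failures only make the real number worse.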
ONUS ON THE CUSTOMER
The onus for Physical Diversity
is on the customer. That's
why, in the era of telecom competition,
business-critical
operations demand a couple of
interexchange carriers, an
office phone and a cell phone,
a desktop and a laptop, a
cable modem and a dial-up, and
at least two Internet
Service Providers (ISPs).
It is a good thing that
infrastructure is getting cheaper.
Physical Diversity, along
as many dimensions as possible,
is the most reliable route to
reliability. Note that
Galaxy IV's backup system failed
too! Any time that
parallel systems share components,
the event that takes one
system out is also likely to
bring the second system down.
If it's a software bug, and both
primary and backup systems
are running the same faulty code,
too bad. If radio
interference is the problem,
and both systems use the same
frequencies or modulations, sorry.
If you rely on
satellites and there are solar
storms or meteor showers,
look out. If you get primary
and backup from the same
company, and the company fails
(or goes on strike, or . . .),
remember you read it here
first.
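
The shared-component trap can be put in numbers. The following is a minimal sketch, assuming independent failures and made-up availability figures of 99.9 percent; the function names are hypothetical. It compares a truly independent primary/backup pair with a pair that both depend on one common element (the same code, the same frequency, the same company).

```python
def parallel_availability(a_primary, a_backup):
    """Availability of two independent paths where either one suffices."""
    return 1 - (1 - a_primary) * (1 - a_backup)

def with_shared_component(a_primary, a_backup, a_shared):
    """Both paths depend on one shared component, so it sits in series."""
    return a_shared * parallel_availability(a_primary, a_backup)

# Hypothetical figures: each path, and the shared element, is 99.9% available.
independent = parallel_availability(0.999, 0.999)
shared = with_shared_component(0.999, 0.999, 0.999)

print(f"independent pair:      {independent:.6f}")
print(f"pair with shared part: {shared:.6f}")
```

Under these assumptions the pair with a shared part can never be more available than the shared part itself, so the backup buys almost nothing.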
When the onus for Physical
Diversity is on the customer, the
customer needs alternatives.
That's a problem when 90% of
computers run one company's operating
system, no matter how
innovative that company might
be. And it's a problem when
a single telco controls local
telephone service, no matter
how big the telco's territory.
EXODUS TO THE PROMISED BANDS
Exodus Communications is a
Stupid company - they are into
over-provisioning and Physical
Diversity. They call
themselves an "Internet
Data Center." Actually Exodus has
about 8 Internet Data Centers
around the world. They'll
buy data feeds - DS-3, OC-3,
or more - from any carrier
that'll sell them. UUNet,
GTE, Sprint - Exodus buys it
all. Their customers are
ISPs. An ISP gets a cage on the
Exodus floor, data feeds to order,
and a Chinese menu of
add-on services.
In one Exodus customer configuration,
the ISP has two
redundant racks. Rack #1
gets a primary 100BaseT feed
from, say, UUNet and a secondary,
totally redundant feed
from, say, Sprint. Rack
#2 gets its primary from Sprint,
and its secondary from UUNet.
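
The cross-fed layout can be sketched as a small data structure plus a check that losing any one carrier leaves every rack with a working feed. This is an illustrative model only; the dictionary layout and function names are assumptions, not anything Exodus publishes.

```python
# Hypothetical model of the two-rack layout; carrier names come from the text.
RACKS = {
    "rack1": {"primary": "UUNet", "secondary": "Sprint"},
    "rack2": {"primary": "Sprint", "secondary": "UUNet"},
}

def active_feed(rack, failed_carriers):
    """Return the carrier a rack would use, or None if both feeds are down."""
    for role in ("primary", "secondary"):
        carrier = rack[role]
        if carrier not in failed_carriers:
            return carrier
    return None

def survives(racks, failed_carriers):
    """True if every rack still has at least one working feed."""
    return all(
        active_feed(rack, failed_carriers) is not None
        for rack in racks.values()
    )

for failed in ({"UUNet"}, {"Sprint"}, {"UUNet", "Sprint"}):
    print(sorted(failed), "->", survives(RACKS, failed))
```

Swapping primary and secondary between the racks also means that, on a normal day, the load is split across both carriers instead of riding on one.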
Exodus maintains a 200% headroom
policy. It attempts to
have twice as much bandwidth
as it needs in its busy hour.
Its 200% and Physical Diversity
policies extend to electric
power and heating-cooling too.
It has contracts with two
different power companies, and
it has a back-up generator
on the roof and another in the
basement. The rooftop
generator has a different fuel
tank than the basement one.
There are four air conditioners,
one in each corner of the
data center. Each of the
four electrical feeds supplies
one AC. And so on.
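
The headroom rule at the start of this paragraph reduces to simple arithmetic. Here is a rough sketch, assuming two equal feeds and invented traffic numbers; the function names and the Mbps figures are hypothetical.

```python
def headroom_ok(provisioned_mbps, busy_hour_mbps, factor=2.0):
    """True if provisioned bandwidth is at least `factor` times busy-hour need."""
    return provisioned_mbps >= factor * busy_hour_mbps

def survives_feed_loss(feeds_mbps, busy_hour_mbps):
    """True if busy-hour traffic still fits after losing the largest feed."""
    return sum(feeds_mbps) - max(feeds_mbps) >= busy_hour_mbps

# Hypothetical site: two 45 Mbps feeds, 40 Mbps of busy-hour traffic.
feeds = [45, 45]
busy_hour = 40

print(headroom_ok(sum(feeds), busy_hour))    # twice the need is provisioned
print(survives_feed_loss(feeds, busy_hour))  # one feed alone still carries it
```

With two equal feeds, the two checks say nearly the same thing: buying twice what you need is what lets either feed carry the busy hour by itself.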
A facilities-based telco can't
do this. (Imagine AT&T
advertising that its redundancy
is due to secondary
facilities by MCI!)
Exodus can, because it buys facilities
from all comers. Reliability
emerges at a different point
in the value space.
Exodus is an excellent example
of what SMART Person Paul
Saffo calls "disinterREmediation."
Once upon a time, telcos
mediated Physical Diversity for
their customers, but competition
and the resulting drive to lower
costs made it prudent for
them to stop. Customers
can still buy Physical Diversity in
the age of telecom competition,
but they have to do it
piecemeal . . . one from GTE,
one from MCI, etc.
Exodus REmediates by providing
one-stop shopping for
Physical Diversity. The
whole process is called
"disinterREmediation."
Really.
Gentle Reader, if you are
still asking what is the
relationship of the Y2K Problem
to The Stupid Network, I
don't think I can help further.
To unsubscribe, send me a
brief message to that effect.
For the rest of us, let's
take the lessons of Physical
Diversity home. We could
ask ourselves now what might
happen if our communications,
our food, our electricity,
our heat, our transportation,
our money, our employment are
disrupted. Physical Diversity
is part of the solution
space. It can protect individuals
and defined groups from
potential technological failures.
I wonder whether Exodus
will rent cages for living spaces
next year :-)
Physical Diversity offers
much less protection against the
kinds of sociological phenomena
that could plausibly occur
when physical systems are disrupted.
I have no idea where
this discussion will lead, but
it is time to begin talking . . . .
David I
-------
<<to unsubscribe from the
SMART List, send a brief
unsubscribe message to isen@isen.com>>
<<for past SMART Letters,
see
http://www.isen.com/archives/index.html>>
*--------------------isen.com----------------------*
David S. Isenberg
d/b/a isen.com
18 South Wickom Drive
Westfield NJ 07090 USA
isen@isen.com
http://www.isen.com/
888-isen-com (anytime)
908-875-0772 (direct line)
908-654-0772 (home)
*--------------------isen.com----------------------*
-- Technology Analysis and Strategy --
Rethinking the value of networks
in an era of abundant infrastructure.
*--------------------isen.com----------------------*
Date last modified: 27 May 1998