!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()
------------------------------------------------------------
SMART Letter #15 - January 3, 1999
For Friends and Enemies of the Stupid Network
Copyright 1998 by David S. Isenberg
isen@isen.com -- http://www.isen.com/ -- 1-888-isen-com
------------------------------------------------------------
!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()!@#$%^&*()

CONTENTS:
+ Lead essay: Seven Ways Year 2000 Could Hose Public Networks
+ Business 2.0 -- Will Public Networks be Ready for Year 2000?
+ Conferences on My Calendar, Copyright Notice, Administrivia

-------

SEVEN WAYS Y2K COULD HOSE PUBLIC NETWORKS:
Everybody needs a Plan B for Year 2000 network surprises.
David S. Isenberg

The Titanic's maiden voyage occurred on a 99% iceberg-free
ocean. So it is not good news when Judy List, head of
Bellcore's Year 2000 program, says that large US companies
will be -- at best -- 92% Y2K compliant by January 1, 2000.
And, she adds, there's no reason for telcos to be an
exception.

List says that 100% of network management systems and 75% of
voice networking devices are date sensitive. If 92% of these
systems are fixed on January 1, 2000, then 8% won't be.

Not that some telcos aren't taking Y2K very seriously -- they
are. They're attacking internal systems head-on. AT&T, for
example, has spent $500 million to date. John Pasqua, AT&T's
Y2K leader, is type-A serious about it.

But I wonder about some of the other telcos. Bell Atlantic,
for example, seems disturbingly naive. Bell Atlantic claims
that all its mission critical network elements will be fixed
by June 30, 1999, "*in*sufficient* *time*to*allow*for*testing*"
(emphasis added). And if BA's customers are concerned about
interoperability, they are invited to test, "simply by
transmitting your data . . . after [the Bell Atlantic
network] is fully Year 2000 compliant."

The Y2K-sober telco must face the likelihood of network
failure even as it works to prevent it.
Some kinds of failures can be addressed directly by telcos,
but others can't. Let's consider seven categories:

1. Intrinsic failure -- The typical telco head-on attack is
focused on failure of intrinsic network components. But
despite heroic efforts, systems could still contain Y2K flaws
that could bring down the network, or big pieces of it. Flaws
surface even without Y2K; last spring, the Galaxy 4 satellite
failed and the AT&T frame relay network went down. Y2K won't
make intrinsic failure less probable.

2. Interconnection failure -- One telephone company's network
might be working fine, but the networks of other telcos, or
customer premises equipment, could send bad SS7 messages --
or other kinds of Y2K contaminated data associated with
operations or maintenance -- that might cause network
failures.

3. Overload failure -- If uncertainties in non-telecom
sectors, or outright failures, cause a dramatic increase in
call attempts, this could overload local switches or other
network resources. If call centers fail (e.g., for airlines
or banks) during times of high anxiety, this could cause even
more overload.

4. Infrastructure failure -- There could be failures in the
non-telco-related infrastructure. For example, if the
electrical grid fails, it could bring telecom systems down
with it. Furthermore, it is plausible that Y2K electrical
failures could last longer than network back-up facilities
can operate. Or if transportation systems fail, key network
operations people might not get to work.

5. Failures due to Y2K fixes -- Some of the biggest recent
network failures have occurred during upgrades -- witness the
SS7 debacle of 1991. In the pre-Y2K months, people will be
pressured to rush Y2K upgrades into service.

6. Security breach failures -- Many strange hands touch
software during Year 2000 remediation. Some of these hands
could plant malicious code. Others could plant well
intentioned (e.g., maintenance) access points that provide
entry for later security breaches.

7. Emergent failures -- In complex, tightly coupled systems,
an unexpected conjunction of improbable events can be
disastrous. Because so many elements can interact in so many
ways, an emergent failure is virtually certain to be a
surprise. And when subsystems are tightly coupled, effects
can cascade rapidly. A telecommunications network is a
transmission system controlled by information systems that
are, in turn, joined by transmission systems. They depend on
electrical power systems that, in turn, critically depend on
telecommunications systems. In situations like this,
improbable interactions cascade.

PLAN B

Many kinds of failure originate outside of telco systems, yet
telco contingency planning could avert or lessen damages in
virtually every case. Furthermore, when time is critical,
existing contingency plans -- plans that have already been
made -- can save time.

Has your company developed triage rules? Will the motivation
for good software practice hold up in the face of time
pressure? Are there plans for shutting off operations
interaction, signaling and/or traffic with other telcos? Are
there plans for non-regional, widely distributed network
overloads? How will network operations people get to work if
transport systems fail? Are there alerting systems that do
not depend on the network, to respond to surprise emergent
failures?

The Titanic was unsinkable, so it sailed without lifeboats
for everybody. Our networks are almost that good. We should
have plans in place in case we hit one of the Y2K icebergs
out there.

(The above article appeared in America's Network, January 1,
1999, as Isenberg's fifth monthly "Intelligence at the Edge"
column. Copyright 1999 Advanstar Communications, Inc.)

-------

WILL PUBLIC NETWORKS BE READY FOR YEAR 2000?
by David S. Isenberg

Public communications networks hold the economy together like
never before in history. As FCC Chairman William Kennard
says, "Virtually all sectors of the global economy depend
upon reliable communications networks."
Yesterday, supplies for several months were stored locally in
warehouses. But in today's "just-in-time economy", networks
of information systems dispatch our daily bread. Network
failures, even spotty ones, could rattle the entire supply
chain.

The transition to the next millennium could be a non-event
for the world's public communications networks. Or it could
be a disaster. Because the network plays such a critical role
in the economy, we'd like to know . . . but we don't.

FCC Commissioner Michael Powell talks about "the mathematical
difficulty of testing the entire network." He cites GTE's
calculation that it would need 10 to the 27th power (10^27)
tests to test all the interactions in the entire network. If
we started now, we would have to do more than 30 billion
billion tests a second to complete testing in the 300-some
remaining days until December 31, 1999.

The Gartner Group, perhaps glossing over these daunting
figures, predicts that the public network will be "mostly
uninterrupted" in the transition to Year 2000. And Year 2000
web sites of the major telcos of the U.S. speak in terms of
massive mobilizations of effort -- and of "goals" and
"targets" and "the feasibility of contingency plans."

There has been much progress to date, but it is not
comforting. John Pasqua, head of AT&T's Year 2000 efforts,
says, "I get up with a nervous stomach most mornings. And
that is exactly the way I want to be. I don't want to be
lulled into a false sense of security prematurely." He
explains that AT&T Y2K efforts are on schedule, and also that
he will have teams of network specialists -- he calls them
SWAT teams -- on duty during critical dates.

Here is what we know: The network is complicated. Information
systems that run communications systems are connected to
communications systems that run information systems. In
complicated systems we can expect the unexpected.
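
The testing arithmetic attributed to GTE above can be checked with a short back-of-envelope sketch. The 10^27 figure is from the article; the day count is an assumption (it counts from this letter's date, January 3, 1999, to December 31, 1999), and the variable names are illustrative only.

```python
from datetime import date

# Number of interaction tests, per GTE's calculation as cited by Powell.
TOTAL_TESTS = 10**27

# Assumption: testing starts on the date of this letter and runs
# around the clock until the last day of 1999.
start = date(1999, 1, 3)
end = date(1999, 12, 31)

days_left = (end - start).days
seconds_left = days_left * 24 * 60 * 60

# Sustained rate needed to finish every test before the deadline.
tests_per_second = TOTAL_TESTS / seconds_left

print(f"{days_left} days = {seconds_left:,} seconds")
print(f"required rate: {tests_per_second:.2e} tests per second")
```

Under these assumptions the required rate is on the order of 10^19 tests per second. Shifting the start date by a few weeks changes the result only slightly, which is the point: no plausible schedule makes exhaustive testing feasible.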
There are two recent examples -- AT&T's massive data network
outage last April, and last May's PanAmSat communications
satellite failure. These illustrate that even networks that
are engineered to be 99.999 percent reliable aren't. The
satellite failure was caused by three extremely unlikely
events that happened together. The data outage, in AT&T's
frame relay network, occurred during a network upgrade.
Network upgrade time is a particularly vulnerable period --
and urgent fixes of systems to make them Year 2000 compliant
are likely to spur rushed upgrades.

Complicated systems must interoperate. There are some 1400
telephone companies in the United States, and these
interconnect with systems in 280 other countries. Year 2000
remediation has not begun in many of them.

Then there are external factors. If the electric grid fails,
for example, how long can the telecommunications network stay
up? Telephone companies have back-up generators, but how long
can they run? Will there be fuel for them if refineries, also
highly complex, information-dependent, accident-prone
systems, fail? Computer guru Ed Yourdon, in his book Time
Bomb 2000, points out that oil refineries often store only
4-5 days of crude, and thus are dependent on constant tanker
deliveries. And, he continues, many systems on tankers are
non-Year 2000 compliant.

If the public part of the network stays up, trouble could
come from customer equipment. The Gartner Group says that
call centers are at risk. These are the systems that tell us,
"Please wait, your call will be answered in the order it was
received."

In January 2000, if your airline might not fly, you'll call a
call center. If your bank account is "temporarily
inaccessible" you'll call a call center. If you are being
billed for 99 years of electricity, you'll call a call
center. If you don't get through, you'll call again. The
extra call volume could exceed the network's capacity to
process calls, leading to regional failures -- or worse.
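
One way to get a feel for the overload scenario above is the Erlang B formula, a standard teletraffic model for the fraction of call attempts blocked when a given load is offered to a fixed number of circuits. The sketch below is illustrative only: the trunk count and load figures are hypothetical, not taken from the article or from any telco. Erlang B also assumes that blocked callers give up, so it understates the effect of the redialing the article describes.

```python
def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Blocking probability for offered_erlangs of traffic on trunks circuits.

    Uses the standard recursion B(n) = A*B(n-1) / (n + A*B(n-1)),
    starting from B(0) = 1.
    """
    blocking = 1.0  # with zero circuits, every call attempt is blocked
    for n in range(1, trunks + 1):
        blocking = (offered_erlangs * blocking) / (n + offered_erlangs * blocking)
    return blocking


TRUNKS = 100        # hypothetical trunk group size
NORMAL_LOAD = 80.0  # hypothetical busy-hour load, in erlangs

# Show how blocking grows as anxious callers multiply the offered load.
for surge in (1.0, 1.5, 2.0, 4.0):
    load = NORMAL_LOAD * surge
    print(f"{surge:.1f}x normal load: {erlang_b(load, TRUNKS):6.1%} of attempts blocked")
```

The shape of the result matters more than the exact numbers: blocking stays small while the load is below the circuit count, then climbs steeply once a surge pushes past it. Each blocked caller who redials adds to the offered load, which pushes blocking higher still.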
No wonder John Pasqua's stomach is growling.

(The above article was published as "Networks: The Domino
Effect" in Business 2.0, January, 1999, p. 66. Copyright 1999
by Business 2.0.)

-------

CONFERENCES ON MY CALENDAR

+ Global Carrier Network Reliability, Jan. 27, 1999,
  Washington DC, 8:00 - 9:30 AM, Marriott Metro Ctr.
  Sponsored by America's Network. I'll be the (im)moderator.
  Admission free, but you must register. 800-854-3112 x446 or
  http://www.americasnetwork.com/nr_live/global_carrier.htm

+ Solutions 99! -- Feb 9, 1999, Denton TX: Sponsored by
  University of North Texas. See
  http://www.cas.unt.edu/solutions99 or contact Mitch Land.

+ CLEC Reliability -- February 10, 1999, Atlanta GA,
  7:45 - 9:00 AM. Westin Peachtree Hotel, Atlanta GA.
  Sponsored by America's Network. Your Im-Moderator will,
  once again, attempt to PRO-be and PRO-voke. Free if you
  register: 800-854-3112 x446 or
  http://www.americasnetwork.com/nr_live/register.cfm

-------

COPYRIGHT NOTICE:
Redistribution of this document, or any part of it, is
permitted for non-commercial purposes, provided that the two
lines below are reproduced with it:
Copyright 1998 by David S. Isenberg
isen@isen.com -- http://www.isen.com/ -- 1-888-isen-com

-------

[ to subscribe to the SMART list, please send a brief,
PERSONAL statement to isen@isen.com (put "SMART" in the
Subject field) saying who you are, what you do, maybe who you
work for, maybe how you see your work connecting to mine, and
why you are interested in joining the SMART List. ]

[ to unsubscribe from the SMART List, send a brief
unsubscribe message to isen@isen.com ]

[ for past SMART Letters, see
http://www.isen.com/archives/index.html ]

-------

*--------------------isen.com----------------------*
David S. Isenberg          isen@isen.com
d/b/a isen.com             http://www.isen.com/
18 South Wickom Drive      888-isen-com (anytime)
Westfield NJ 07090 USA     908-875-0772 (direct line)
                           908-654-0772 (home)
*--------------------isen.com----------------------*
-- Technology Analysis and Strategy --
Rethinking the value of networks
in an era of abundant infrastructure.
*--------------------isen.com----------------------*
Date last modified: 4 Jan 99