I'm going to use this issue to republish some of the letters I've sent into Risks Digest. You should look at the
Risks Archive to see ensuing discussions.
Note that some of the submissions (such as my tirade on the stupid leap-second) have
already been subsumed by other essays. This is not a complete set of my entries and some have
been slightly edited. I've also added comments in italics where appropriate.
From Risks 19.93
This seems to have gotten wide circulation in both Risks Digest and David Farber's
Interesting People List. My point is more about the powerlessness that people feel in
dealing with these kinds of problems than to pick on ATT. But it is a lesson in the value
of specific rather than theoretical essays.
Date: Fri, 21 Aug 1998 10:21 -0400
To: Risks Submissions <risks@csl.sri.com>
From: Sample@Frankston.com
Subject: AT&T and snails
Using Quicken I sent a payment to my ATT wireless account. A few weeks later they
started dunning me though the payment was clearly listed and processed. But it hadn't
cleared. After a while I looked at the payee record and noticed it was queued for
electronic payment. But that confuses ATT wireless which claims to not accept electronic
payments. So I try again but notice that my paper payment is coerced into an electronic
payment automatically. I finally figure out that if, instead of paying "ATT Wireless
Services", I add a {} comment to the end, it remains a paper payment. At least on my
side.
I figured this out when one of the ATT billing people called me on the phone. She said
she would note that the payment is on the way. I just got another call from someone at ATT
wireless demanding payment. Of course, there was nothing in my record, and once again I was told that
ATT doesn't accept electronic payments and that everyone places a check on the back
of a snail (OK, snailmail might not be a fair term but it seems most appropriate or, at
least, colorful here). Of course, this is nonsense considering the demographics of the PCS
early adopters.
Maybe I shouldn't be surprised since this is the same company that has been sending me
a monthly bill for a $.15 credit on an old home office line for over a year.
My real puzzle is why ATT doesn't seem to have a clue that it is their fault that the
payment is coerced to an electronic payment and that someone should attempt to solve it.
The larger issue is that whether a problem is caused by new technologies or more
traditional problems, I'm struck by the lack of an attitude that problems are there to be
solved instead of simply suffered. It is a reaction consistent with dealing with any
bureaucracy but for those of (some of) us reading this list they are teething problems
which need attention.
From Risks 19.85
In my ongoing role of being the naive* contrarian ....
I'm concerned about all the Y2K discussion that focuses on prevention and little, if
any, discussions of contingency plans. This represents a basic misunderstanding of how to
deal effectively with technology. I use the term "ballistic automation" for the
clockwork-like model of automation in which one sets up all the rules and the system just
runs without ongoing intervention and tweaking.
In any system, there will be surprises and failures. While prevention is great, it is
never complete. Instead one must prepare for failures. We must assume that there will be
pervasive Y2K failures. The question is how do we survive and recover from them. Such
planning has a higher value than Y2K prevention in that it is the basis for resilience that can
deal with failures in general and, as a side benefit, provides better security since
security breaches are simply failures.
And Y2K is only one of many problems. There are many limited-size fields including
other clocks (like the 32-bit Unix one due to expire in 2038).
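The Unix clock is a concrete case of a limited-size field. As a minimal sketch, assuming the common convention of a signed 32-bit count of seconds since 1 January 1970 UTC, the following computes the last moment such a field can hold and what a naive wraparound would produce one second later. The function names are illustrative, not taken from any particular system.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time is stored as a signed 32-bit count of seconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_INT32 = 2**31 - 1


def to_datetime(seconds):
    """Convert a count of seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value):
    """Simulate storing an integer in a signed 32-bit field (two's complement wrap)."""
    return (value + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    # Last representable second, then the value one tick later after wrapping.
    print(to_datetime(MAX_INT32).isoformat())
    print(to_datetime(wrap_int32(MAX_INT32 + 1)).isoformat())
```

The point is less the exact date than the pattern: any fixed-width field has a horizon, and the failure shows up as a plausible-looking but wrong value rather than as an obvious error.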
Systems do not deal with events that are unanticipated and have difficulty with those
anticipated but not experienced.
One simple example is the response to zip code changes. Read "Post
office delivers new codes" for more on the zip code changes in the Boston area.
It took years for phone systems to learn to deal with area code changes and generalized
area codes. But no one has heard of a zip code change. When I provide my new zip code on
e-forms, it gets rejected by systems that do checking. Even mail from the Post Office
itself uses the old zip code. Not only is the zip code changing but it will be recycled
within a year or so! Hopefully, unlike the phone network, I'll still be able to get mail
in the future since the Post Office does have some resilience in that it tries to handle
failures with manual intervention (for now). But the more general principles of systems
design need to percolate from what we've learned in designing systems into the more
general awareness of design issues. The zip code system, for example, was designed without
leaving extra zip codes for future growth!
While on the topic of the Post Office, there was another article, "Errant
mail delivery brings bagful of woes", about the consequences of unreliable delivery. The
concept of end-to-end vs link-level reliability is something we've learned in the design
of computer systems (see "End-To-End Arguments In System Design"). Again, this experience
needs to feed back into low-tech systems.
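To make the end-to-end idea concrete, here is a small sketch in which the sender and receiver check the message themselves instead of trusting each intermediate hop. Everything in it is an illustrative assumption: the hops are plain functions, `make_flaky_hop` is a hypothetical stand-in for an unreliable link, and the checksum is assumed to reach the receiver intact, which a real system would also have to arrange.

```python
import hashlib


def digest_of(payload):
    """Checksum computed by the sender before the message leaves its hands."""
    return hashlib.sha256(payload).hexdigest()


def carry(payload, hops):
    """Pass the payload through each intermediate hop in turn."""
    for hop in hops:
        payload = hop(payload)
    return payload


def send_end_to_end(payload, hops, max_attempts=3):
    """Resend until the receiving end confirms the checksum, or give up."""
    expected = digest_of(payload)
    for attempt in range(1, max_attempts + 1):
        received = carry(payload, hops)
        if digest_of(received) == expected:
            return received, attempt
    raise RuntimeError("undeliverable after %d attempts" % max_attempts)


def make_flaky_hop(failures):
    """Hypothetical hop that garbles the first `failures` payloads it handles."""
    remaining = [failures]

    def hop(payload):
        if remaining[0] > 0:
            remaining[0] -= 1
            return payload[::-1] + b"?"
        return payload

    return hop


if __name__ == "__main__":
    hops = [lambda p: p, make_flaky_hop(1), lambda p: p]
    print(send_end_to_end(b"a letter", hops))
```

No hop needs to be perfect for the delivery to be reliable. The check at the ends catches whatever went wrong in the middle, which is the property the mail example lacks.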
There are indeed risks of technology. But there are also risks of nontechnology. We
must understand the risks but shouldn't be so naive as to assume that we can choose a risk-free
path. And we must learn that we only anticipate some changes and need to "shake
out" systems periodically. Just like we've learned the value of forest fires, Y2K
might help in clearing out the underbrush.
* I'm not really that naive, but a nonnaive discussion that goes into all the issues
would be too long and boring for this forum.
From Risks 19.73
This one got a strong reaction from many who defended the current system. I tried to
explain that the real point is that the current system emphasizes procedural correctness
over safety and that it is necessary to have an approach that allows one to enhance safety
with the addition of GPS rather than to ban any change on the assumption that any change
anywhere in the plane will make the existing systems less reliable. Alas, aviation is a
field that selects against those who will innovate. Safety in the airline industry is also
more about marketing safety. Thus even a modest number of crashes must be avoided by
making the cost of each seat very high. It is like when ATT could charge any price they
deemed necessary to assure you that your phone will last 100 years.
Caveat: I'm not an expert on avionics. My interest is in creating resilient distributed
systems....
I just walked off a DC-10 that had mechanical problems and was delayed. The 757 I'm on is
racing it to Interop at the moment.
The DC-10 was already an hour late getting from the hangar to the gate due to either
traffic problems (within O'Hare) or a cargo door problem.
But the new problem is (was) a bad compass. The third compass on the plane had to be
replaced due to FAA rules. After all, we can't take any risks, can we? I asked the crew
whether they could travel without it and rely on a GPS. Of course, a DC-10 has no GPS! Not
surprising given the age of the plane. But what is of concern is that they couldn't just
go out to the store, buy a GPS, and place it in the cockpit. As a passenger, when I bring
my GPS and PC, I've got technology far, far ahead of the technology on the plane.
Technology to which two hundred people (whatever a full DC-10 holds) trust their lives! On the
other hand, if both of the other two compasses did fail, there are still lots of ground
systems that can find the plane and bring it to a nearby beacon (it is cloudy, so they
can't just get out their road maps).
I was already thinking about these issues after talking to the crew (while waiting for
the plane to appear out of the mists at the gate) about the 727 which has even more
primitive avionics. The reason that the systems can't be upgraded is that the whole plane
would have to be recertified as a new aircraft.
There is something very wrong here. The engineering practices that are supposed to
assure our safety seem to work to assure our lack of safety.
I can understand the historic necessity of treating the airplane as a single tightly
interconnected system. There wasn't the luxury of giving the electronic systems enough
capability to act autonomously. I presume, though, that the mechanical systems try to be
independent-enough to reduce the propagation of failures.
But, if we think about the simple example of just placing a GPS in the cockpit and
allowing the airplane's computer to use the data, we have a very different model. Of course,
the navigation system shouldn't fully trust the GPS and must do some reasonableness checks as
well as cross-check with other sources. If the GPS fails, then it would compensate.
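As a rough sketch of what such checking might look like (not a description of any real avionics), the following accepts a GPS fix only when it is plausible and agrees with at least one other source, and otherwise falls back to the other sources. The flat (x, y) grid in kilometers, the 10 km tolerance, and all the names are assumptions made for illustration.

```python
import math


def plausible(fix):
    """A fix is usable if it exists and every coordinate is a finite number."""
    return fix is not None and all(math.isfinite(c) for c in fix)


def choose_position(gps_fix, other_fixes, tolerance_km=10.0):
    """Return (position, source) with the GPS treated as suspect until confirmed.

    Positions are (x, y) tuples in km on a hypothetical local flat grid.
    """
    others = [f for f in other_fixes if plausible(f)]
    if plausible(gps_fix):
        if not others:
            # Nothing to cross-check against; use it but flag it.
            return gps_fix, "gps-unverified"
        if any(math.dist(gps_fix, f) <= tolerance_km for f in others):
            return gps_fix, "gps"
    if others:
        # GPS is missing or disagrees with everything else: compensate.
        n = len(others)
        average = (sum(f[0] for f in others) / n, sum(f[1] for f in others) / n)
        return average, "fallback"
    return None, "no-fix"


if __name__ == "__main__":
    print(choose_position((100.0, 50.0), [(101.0, 50.5)]))
    print(choose_position((500.0, 50.0), [(100.0, 50.0), (102.0, 50.0)]))
```

The design choice worth noting is that the GPS is an added source with an arms-length interface. Its failure degrades the answer to what the other instruments already provided and does not take the whole system down.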
Yes, there can be strange systemic interactions. But, instead, we have a situation that
assures lousy navigation rather than permitting improvements when available.
Understanding how to build such resilient distributed systems is still in the challenge
category. But the Web is a very good example. I see the technology growing more due to
hacking than design. Effective hackers work against the constraints of others and are thus
forced into being tolerant of others' mistakes. Most will get it wrong, but I'd rather a
pilot just put a GPS in the cockpit even if not interconnected, than having to get out the
sextant for each flight.
From Risks 19.77
I wrote this in response to comments on the previous entry.
I shouldn't be surprised that the general response has been to tell me (personally) why
things have to be the way they are. I've even been told both why there are three compasses
and also that there are only two.
Of course I know there are very good reasons for the current approaches. But where is
the outrage and dissatisfaction with such a cumbersome and limited approach to building
and, more important, evolving systems? Implicit in many of the responses is a naive notion
that system boundaries are well-defined.
It's as if I was back listening to ATT in the 70's explaining why civilization would
end if I were allowed to plug my telephone into the phone network! (Yes, really!)
There are those of us who, in the 70's, took toys such as the Apple ][ and made
them the tools of choice for trillion dollar calculations such as the national budget. From
the thread about sextants, the Navy is discovering that the retail marketplace has become
the driver. (Are there sextants in the cockpits?)
As to the complaints about the limitations of GPS (of which I and the pilots are well
aware), why is there no incentive to address them? Perhaps adding level indicators and
reasonableness checking? They already have batteries. One can evolve "toys" much
more quickly than "commercial" equipment as long as the linkage with the other
systems is arms-length and there is sufficient mutual suspicion.
It would be great to have the position data available on the in-plane IP network. Not
only would one be able to add equipment (such as terrain maps) without recertifying the
plane, it would allow passengers to use their PCs to enrich the view from the window.
I'm not sure how to respond to the safety issue. While I do wear my seat belts during
the entire flight, it's a non sequitur. Of course I understand the difference between
safety and reliability but it is more than a simple matter of retreating into semantics
and formalisms. Safety is not absolute "freedom from accidents or losses".
So I'll fan the flames by asking why flying is safer than driving? The reason is that
the marketplace does demand it. Plane crashes are much worse PR per capita than car
crashes. So we spare no expense to make planes not crash. Those who can't afford it risk
their lives driving (see the 27 May 1998 NY Times business section). Have we simply
shifted the risk?
Only respond if you are dissatisfied with business as usual. Post no rationalizations.
Bob Frankston http://www.mit.edu/~bobf
I don't think I submitted this one but it is relevant. This experience of arbitrary
rules was reinforced on Delta Airlines which bans the use of cell phones while inside the
plane on the ground but allows me to use it near the cockpit at the open door as long as
I'm theoretically outside the circumference of the plane.
Once more I'm focusing on my experiences in flying. Perhaps I'm doing it too much but
it represents a strong contrast with marketplace thinking. In the consumer marketplace,
one must accommodate human foibles but when flying, I have the responsibility to follow
many rules. If they are violated by me or any of the other passengers the consequences are
dire. If this were true, the risk would be intolerable.
To attend INet 98 (http://www.isoc.org/inet98/) I flew Lufthansa (LH) which has
different rules from its partner United (UA). LH doesn't want anyone using CD players
connected to their laptops or, perhaps, even within. And DL doesn't want you using a
cellular phone on the ground in the back of the plane but you can use it inches from the
cockpit as long as it is on the curve that would be occupied by the door if it were
closed. Of course, the airlines casually ask people to turn off their phones and other
transmitters but I suspect that many, if not a majority, of the people carrying
two-way pagers, small phones and other devices are not even aware that they are carrying
transmitters. Those who have tried to look for eyeglasses only to realize that they are
already in place can understand not even being aware of carrying such a device. A PDA is
just a note pad; it's not a computer any more than one's hearing aid is a computer or one's
watch is an electronic device.
Given the reality, what is being done to make planes safe in spite of humans acting
like, well, humans?
Of course, I do have my own bias against sensory deprivation due to not being allowed
to use my electronic pen and paper. And against the silly need to lug the weight (and the
space) of wood pulp (thus increasing the weight of the plane) when all I want to do is
read the contents of a book. And I also fear flying on an aircraft that is overly vulnerable
to electronic noise.
BTW, similar reasoning applies to banning the use of cellular phones while driving. The
ban should be extended to carrying passengers since I find the presence of interesting
passengers is even more likely to put me into "automatic" driving mode due to
the rich social matrix. A phone conversation tends to be less engaging and thus less
distracting. I can even conjure up the necessary statistics to "prove" this. Or,
at least, an anecdote or two or one.
PS: In addition to INet 98, I also attended the 2nd World Skeptics Congress in
Heidelberg. The emphasis there is on critical thinking
(http://www.gwup.org/konferenz_e.html). Very appropriate.
From Risks 19.89
This is a short comment following on to an earlier discussion in issues 19.80, 19.81
and 19.83.
On the plane back from Europe (Lufthansa appropriately enough), I sat next to a German
engineer (actually, chemical and running a company, but that's beside the point) and asked
him about the train crash. He said that the wheel had been off the tracks for 5km and the
magnitude of the problem was due to having a switch track just before a bridge. Similar
accidents had occurred in France but in less damaging locations. The solution is to track
the behavior of the wheel (or the proximity to the track) with sensors to discover the
problem early. This sounds even better than periodically checking the wheel since it will
catch the actual problem as soon as it occurs even if there were a separate cause for a
wheel problem.
I may have submitted a comment on this topic to Risks in the past but I don't know if
it got published.
I just encountered another web form, this one asking me to type in my
registration number without hyphens or spaces. I encounter this often on entering credit
card numbers. All I can think of is that a complete and utter idiot set up the site since:
- The punctuation is there to make it easier for humans to parse the number and thus
reduce errors. This is in the interest of those asking for this information.
- It is trivial for anyone with a nonnegative IQ to parse the field and simply ignore the
punctuation.
I don't understand why there are so many forms of this sort. It represents the worst in
systems design, forcing insane systems decisions on users instead of fixing them.
Actually, it's not just the user who suffers but also the provider, who gets bad information.
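To show how little work it takes to accept the punctuation, here is a minimal sketch that strips spaces and hyphens before validating. The function names are illustrative. The Luhn checksum is the standard check digit scheme for card numbers and is included only to show that accepting friendly input does not mean giving up validation.

```python
import re


def normalize_number(raw):
    """Drop the spaces and hyphens a human typed for readability."""
    return re.sub(r"[\s-]", "", raw)


def luhn_valid(digits):
    """Return True if a string of ASCII digits passes the Luhn checksum."""
    if not re.fullmatch(r"[0-9]+", digits):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def accept_card_number(raw):
    """Accept input with or without punctuation; return clean digits or None."""
    digits = normalize_number(raw)
    return digits if luhn_valid(digits) else None


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real account.
    print(accept_card_number("4111-1111 1111-1111"))
```

The user keeps the grouping that makes errors visible, and the site still stores a clean string of digits.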
Will the madness never end? I don't really expect it to since cranial computrons seem
to be in short supply.
The advantage of publishing here instead of in Risks is that I don't have to
pretend to be well-behaved and diplomatic.
I'm not a Latin scholar, but I did make a mistake on my site which caused me to send
delivery failure messages to a number of authors in a recent Risks Digest issue. I'll
replace this entry with the apology after it appears in Risks itself. But it's the kind of
error that reminds one to be understanding of the technical foibles of others. The issue
is less how one avoids all problems as much as it is an issue of how one deals with them
after the fact.
"Peter G. Neumann" neumann@csl.sri.com:
[Failure] User risks-digest not listed in public Name & Address Book (Subject was:
Galaxy IV revisited)]
That said, those who tell me not to type hyphens are still idiots.