Simple multi-level redundancy

From Wise Nano


This page began as a CRN science essay; used by permission. Please //make comments obvious// or use the discussion page.

Coping with Nanoscale Errors

There is ample evidence that MIT’s Center for Bits and Atoms is directed by a genius. Neil Gershenfeld has pulled together twenty research groups from across campus. He has inspired them to produce impressive results in fields as diverse as biomolecule motors and cheap networked light switches. Neil teaches a wildly popular course called "How to make (almost) anything", showing techies and non-techies alike how to use rapid prototyping equipment to make projects that they themselves are interested in. And even that is just the start. He has designed and built "Fab Labs"—rooms with only $20,000 worth of rapid-prototyping equipment, located in remote areas of remote countries, that are being used to make crucial products. Occasionally he talks to rooms full of military generals about how installing networked computers can defuse a war zone by giving people better things to do than fight.

So when Neil Gershenfeld says that there is no way to build large complex nanosystems using traditional engineering, I listen very carefully. I have been thinking that large-scale nano-based products can be designed and built entirely with traditional engineering. But he probably knows my field better than I do. Is it possible that we are both right? I've read his statements very carefully several times, and I think that in fact we don't disagree. He is talking about large complex nanosystems, while I am talking about large simple nanosystems.

The key question is errors. Here's what Neil says about errors: "That, in turn, leads to what I'd say is the most challenging thing of all that we're doing. If you take the last things I've mentioned—printing logic, molecular logic, and eventually growing, living logic—it means that we will be able to engineer on Avogadro scales, with complexity on the scale of thermodynamics. Avogadro's number, 10^23, is the number of atoms in a macroscopic object, and we'll eventually create systems with that many programmable components. The only thing you can say with certainty about this possibility is that such systems will fail if they're designed in any way we understand right now."

In other words, errors accumulate rapidly, and when working at the nanoscale, they can and do creep in right from the beginning. A kilogram-scale system composed of nanometer-scale parts will have on the order of 100,000,000,000,000,000,000,000 parts. And even if by some miracle it is manufactured perfectly, at least one of those parts will be damaged by background radiation within seconds of manufacture.

Of course, errors plague the large crude systems we build today. When an airplane requires a computer to stay in the air, we don't use one computer—we use three, and if one disagrees with the other two, we take it offline and replace it immediately. But can we play the same trick when engineering with Avogadro numbers of parts? Here's Neil again: "Engineers still use the math of a few things. That might do for a little piece of the system, like asking how much power it needs, but if you ask about how to make a huge chip compute or a huge network communicate, there isn't yet an Avogadro design theory."

Neil is completely right: there is not yet an Avogadro design theory. Neil is working to invent one, but that will be a very difficult and probably lengthy task. If anyone builds a nanofactory in the next five or ten years, it will have to be done with "the math of a few things." But how can this math be applied to Avogadro numbers of parts?

Consider this: Every second, 100,000,000 transistors in your computer do 2,000,000,000 operations; there are 7,200 seconds in a two-hour movie; so to play a DVD, about 10^21 signal-processing operations have to take place flawlessly. That's pretty close to Avogadro territory. And playing DVDs is not simple. Those transistors are not doing the same thing over and over; they are firing in very complicated patterns, orchestrated by the software. And the software, of course, was written by a human.
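The DVD estimate is a plain product of three numbers, and a few lines of Python can check it. The variable names are illustrative only. The sketch also assumes the reading that each transistor performs 2,000,000,000 operations per second, since that is what the quoted total implies.

```python
# Rough check of the DVD estimate, using the figures quoted in the text.
# Assumption: each transistor performs ops_per_second operations.
transistors = 100_000_000          # transistors in the computer
ops_per_second = 2_000_000_000     # operations per transistor per second
seconds = 2 * 60 * 60              # length of a two-hour movie in seconds

total_ops = transistors * ops_per_second * seconds
print(f"{total_ops:.2e}")          # 1.44e+21
```

The product is about 1.4 x 10^21, roughly two orders of magnitude short of Avogadro's number.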

How is this possible, and why doesn't it contradict Neil? The answer is that computer engineering has had decades of practice in using the "math of a few things." The people who design computer chips don't plan where every one of those hundred million transistors goes. They design at a much higher level, using abstractions to handle transistors in huge organized collections of collections. Remember that Neil talked about "complexity on the scale of thermodynamics." But there is nothing complex about the collections of transistors. Instead, they are merely complicated.

The difference between complication and complexity is important. Roughly speaking, a system is complex if the whole is greater than the sum of its parts: if you can't predict the behavior that will emerge just from knowing the individual behavior of separated components. If a system is not complex, then the whole is equal to the sum of the parts. A straightforward list of features will capture the system's behavior. In a complicated system, the list gets longer, but no less accurate. Non-complex systems, no matter how complicated, can in principle be handled with the math of a few things. The complications just have to be organized into patterns that are simple to specify. The entire behavior of a chip with a hundred million transistors can be described in a single book. This is true even though the detailed design of the chip—the road map of the wires—would take thousands of books to describe.

Neil talked about one other very important concept. In signaling, and in computation, it is possible to erase errors by spending energy. A computer could be designed to run for a thousand years, or a million, without a single error. There is a threshold of error rates below which the errors can be reliably corrected. Now we have the clues we need to see how to use the math of a few things to build complicated non-complex systems out of Avogadro numbers of parts.

When I was writing my paper on "Design of a primitive nanofactory", I did calculations of failure rates. In order for quadrillions of sub-micron mechanisms to all work properly, they would have to have failure rates of about 10^-19. This is pretty close to (the inverse of) Avogadro's number, and is essentially impossible to achieve. The failure rate from background radiation is as high as 10^-4. However, a little redundancy goes a long way. If you build one spare mechanism for every eight, the system will last somewhat longer. This still isn't good enough; it turns out you need seven spares for every eight. And things are still small enough that you have to worry about radiation in the levels above, where you don't have redundancy. But adding spare parts is in the realm of the math of a few things. And it can be extended into a workable system.
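To see why a design with no redundancy is hopeless at these numbers, here is a minimal sketch. It is not the calculation from the paper. It assumes that failures are independent, that the system works only if every mechanism works, and that "quadrillions" means 10^15 parts; the function name is hypothetical.

```python
import math

def survival_probability(n_parts: float, p_fail: float) -> float:
    """Probability that all n_parts work, assuming independent failures."""
    # (1 - p)^n computed as exp(n * log(1 - p)) to stay accurate for tiny p.
    return math.exp(n_parts * math.log1p(-p_fail))

n_parts = 1e15  # assumed: "quadrillions" taken as 10^15

# With a per-part failure rate of 10^-19 the system almost certainly works.
print(survival_probability(n_parts, 1e-19))

# With the radiation-level rate of 10^-4 the result underflows to zero.
print(survival_probability(n_parts, 1e-4))

# Expected number of failed parts at that rate (about 10^11).
print(n_parts * 1e-4)
```

At a failure rate of 10^-4, about a hundred billion parts are expected to fail, so a design that needs every part to work cannot survive without spares.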

The system is built out of levels of levels of levels: each level is composed of several similar but smaller levels. This quasi-fractal hierarchical design is not very difficult, especially since each level takes only half the space of the next higher level. With many similar levels, is it possible to add a little bit of redundancy at each level? Yes, it is, and it works very well. If you add one spare part for every eight at each level, you can keep the failure rate as low as you like, with one condition: the initial failure rate at the smallest stage has to be below 3.2%. Above that number, one-in-eight redundancy won't help sufficiently, and the errors will continue to grow. But if the failure rate starts below 3.2%, it will decrease at each higher redundant stage.
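The threshold behavior can be illustrated with a simple model. This is a sketch under stated assumptions, not necessarily the original calculation: each unit is assumed to contain nine sub-units (eight plus one spare), to work if at least eight of them work, and to have sub-units that fail independently. The function names are hypothetical.

```python
from math import comb

def group_failure(p: float, needed: int = 8, spares: int = 1) -> float:
    """Probability that fewer than `needed` of needed + spares parts work,
    assuming each part fails independently with probability p."""
    total = needed + spares
    # The group still works if at most `spares` parts have failed.
    p_ok = sum(comb(total, k) * p**k * (1 - p)**(total - k)
               for k in range(spares + 1))
    return 1.0 - p_ok

def failure_by_level(p0: float, levels: int) -> list:
    """Failure rate at each level when every level adds one spare per eight."""
    rates = [p0]
    for _ in range(levels):
        # A unit at the next level fails with the group failure probability.
        rates.append(group_failure(rates[-1]))
    return rates

# Compare a starting rate just below the threshold with one just above it.
for p0 in (0.03, 0.035):
    print(p0, [f"{r:.3g}" for r in failure_by_level(p0, 6)])
```

Under this model the break-even point falls between 3.2% and 3.3%, in line with the figure quoted above. A starting rate of 3% shrinks at every level, while a starting rate of 3.5% grows.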

This analysis can be applied to any system where inputs can be redundantly combined. For example, suppose you are combining the output of trillions of small motors to one big shaft. You might build a tree of shafts and gears. And you might make each shaft breakable, so that if one motor or collection of motors jams, the other motors will break its shaft and keep working. This system can be extremely reliable.

There is a limitation here: complex products can't be built this way. In effect, this just allows more efficient products to be built in today's design space. But that is good enough for a start: good enough to rebuild our infrastructure, powerful enough to build horrific weapons in great quantity, high-performance enough—even with the redundancy—to give us access to space; and generally capable of producing the mechanical systems that molecular manufacturing promises.
