Living with failure: Lessons from nature?

Citation formats

Standard

Living with failure: Lessons from nature? / Furber, Steve.

Proceedings - Eleventh IEEE European Test Symposium, ETS 2006. Vol. 2006, IEEE Computer Society, 2006. p. 4-5.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Harvard

Furber, S 2006, Living with failure: Lessons from nature? in Proceedings - Eleventh IEEE European Test Symposium, ETS 2006. vol. 2006, IEEE Computer Society, pp. 4-5, 11th IEEE European Test Symposium, ETS 2006, Southampton, 1/07/06. https://doi.org/10.1109/ETS.2006.28

APA

Furber, S. (2006). Living with failure: Lessons from nature? In Proceedings - Eleventh IEEE European Test Symposium, ETS 2006 (Vol. 2006, pp. 4-5). IEEE Computer Society. https://doi.org/10.1109/ETS.2006.28

Vancouver

Furber S. Living with failure: Lessons from nature? In Proceedings - Eleventh IEEE European Test Symposium, ETS 2006. Vol. 2006. IEEE Computer Society. 2006. p. 4-5. https://doi.org/10.1109/ETS.2006.28

Author

Furber, Steve. / Living with failure: Lessons from nature?. Proceedings - Eleventh IEEE European Test Symposium, ETS 2006. Vol. 2006, IEEE Computer Society, 2006. pp. 4-5

Bibtex

@inproceedings{ad5aceb236cc4c17b97770b0e5ad5233,
title = "Living with failure: Lessons from nature?",
abstract = "The resources available on a chip continue to grow, following Moore's Law. However, the major process by which the benefits of Moore's Law accrue, which is the continuing reduction in feature size, is predicted to bring with it disadvantages in terms of device reliability and parameter variability. The problems that this will bring are underlined by the predictions from an Intel commentator: within a decade we will see 100 billion transistor chips. That is the good news. The bad news is that 20 billion of those transistors will fail in manufacture and a further 10 billion will fail in the first year of operation. What does a 20-30% device failure rate mean for designers and what does it mean for production test? As a designer, I have some idea how to design for very low device failure rates. Redundancy, fault-tolerance and ECC are all approaches that can cope with very low failure rates. The basic assumption is that faults are infrequent so we only have to cope with one at a time. But a 20-30% failure rate will clearly violate this assumption, and the bottom line is that I have no idea how even to begin to design useful circuits that can cope with this level of failure. For an example of a functional device that can cope with this level of failure, we have to look to nature. Brains can cope with very high levels of neuron failure. But we have no idea how they work, let alone how they keep working after these failures. What might we be able to learn from biology about building systems that continue to function as components change and fail? Will manufacturing test change from being primarily about checking that every device on the chip works to checking that enough devices are working to ensure that the chip functions correctly and is likely to continue to do so even after many more devices have changed or failed over the early operational life of the chip? In this talk, I will describe a proposed chip multiprocessor system that is being developed primarily to help understand how the brain works, but which will also present the sorts of challenges that will increasingly dominate the future of production test. The chip does not need to be fully functional to be useful, so how can the production test establish that enough works for the chip to be useful, even after further early-life failure? {\textcopyright} 2006 IEEE.",
author = "Steve Furber",
year = "2006",
doi = "10.1109/ETS.2006.28",
language = "English",
isbn = "0769525660",
volume = "2006",
pages = "4--5",
booktitle = "Proceedings - Eleventh IEEE European Test Symposium, ETS 2006|Proc. Eleventh IEEE Eur. Test Symp.",
publisher = "IEEE Computer Society ",
address = "United States",
note = "11th IEEE European Test Symposium, ETS 2006 ; Conference date: 01-07-2006",
url = "http://dblp.uni-trier.de/db/conf/ets/ets2006.html#Furber06http://dblp.uni-trier.de/rec/bibtex/conf/ets/Furber06.xmlhttp://dblp.uni-trier.de/rec/bibtex/conf/ets/Furber06",

}

RIS

TY - GEN

T1 - Living with failure: Lessons from nature?

AU - Furber, Steve

PY - 2006

Y1 - 2006

N2 - The resources available on a chip continue to grow, following Moore's Law. However, the major process by which the benefits of Moore's Law accrue, which is the continuing reduction in feature size, is predicted to bring with it disadvantages in terms of device reliability and parameter variability. The problems that this will bring are underlined by the predictions from an Intel commentator: within a decade we will see 100 billion transistor chips. That is the good news. The bad news is that 20 billion of those transistors will fail in manufacture and a further 10 billion will fail in the first year of operation. What does a 20-30% device failure rate mean for designers and what does it mean for production test? As a designer, I have some idea how to design for very low device failure rates. Redundancy, fault-tolerance and ECC are all approaches that can cope with very low failure rates. The basic assumption is that faults are infrequent so we only have to cope with one at a time. But a 20-30% failure rate will clearly violate this assumption, and the bottom line is that I have no idea how even to begin to design useful circuits that can cope with this level of failure. For an example of a functional device that can cope with this level of failure, we have to look to nature. Brains can cope with very high levels of neuron failure. But we have no idea how they work, let alone how they keep working after these failures. What might we be able to learn from biology about building systems that continue to function as components change and fail? Will manufacturing test change from being primarily about checking that every device on the chip works to checking that enough devices are working to ensure that the chip functions correctly and is likely to continue to do so even after many more devices have changed or failed over the early operational life of the chip? In this talk, I will describe a proposed chip multiprocessor system that is being developed primarily to help understand how the brain works, but which will also present the sorts of challenges that will increasingly dominate the future of production test. The chip does not need to be fully functional to be useful, so how can the production test establish that enough works for the chip to be useful, even after further early-life failure? © 2006 IEEE.

AB - The resources available on a chip continue to grow, following Moore's Law. However, the major process by which the benefits of Moore's Law accrue, which is the continuing reduction in feature size, is predicted to bring with it disadvantages in terms of device reliability and parameter variability. The problems that this will bring are underlined by the predictions from an Intel commentator: within a decade we will see 100 billion transistor chips. That is the good news. The bad news is that 20 billion of those transistors will fail in manufacture and a further 10 billion will fail in the first year of operation. What does a 20-30% device failure rate mean for designers and what does it mean for production test? As a designer, I have some idea how to design for very low device failure rates. Redundancy, fault-tolerance and ECC are all approaches that can cope with very low failure rates. The basic assumption is that faults are infrequent so we only have to cope with one at a time. But a 20-30% failure rate will clearly violate this assumption, and the bottom line is that I have no idea how even to begin to design useful circuits that can cope with this level of failure. For an example of a functional device that can cope with this level of failure, we have to look to nature. Brains can cope with very high levels of neuron failure. But we have no idea how they work, let alone how they keep working after these failures. What might we be able to learn from biology about building systems that continue to function as components change and fail? Will manufacturing test change from being primarily about checking that every device on the chip works to checking that enough devices are working to ensure that the chip functions correctly and is likely to continue to do so even after many more devices have changed or failed over the early operational life of the chip? In this talk, I will describe a proposed chip multiprocessor system that is being developed primarily to help understand how the brain works, but which will also present the sorts of challenges that will increasingly dominate the future of production test. The chip does not need to be fully functional to be useful, so how can the production test establish that enough works for the chip to be useful, even after further early-life failure? © 2006 IEEE.

U2 - 10.1109/ETS.2006.28

DO - 10.1109/ETS.2006.28

M3 - Conference contribution

SN - 0769525660

SN - 9780769525662

VL - 2006

SP - 4

EP - 5

BT - Proceedings - Eleventh IEEE European Test Symposium, ETS 2006

PB - IEEE Computer Society

T2 - 11th IEEE European Test Symposium, ETS 2006

Y2 - 1 July 2006

ER -