Understanding the Impact of Silicon Errors on Functional Safety Standards Compliance

This material is a good summary of what needs to be done at the hardware level for functional safety.

I think it can serve as a good starting point.


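The link is broken, so the material is no longer available. That said, I am not comfortable uploading someone else's material.
If you need the material, please leave a comment …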


Faults in a functional safety system can be broadly classified into two categories:
systematic faults and random faults.

Safety Faults and their Causes
Systematic Faults
– Result from a fault in design or manufacturing
– Often a result of failure to follow best practices
– Rate of systematic faults can be reduced through continual and
rigorous process improvement
Random Faults
– Result from random defects inherent to process or usage condition
– Rate of random faults cannot generally be reduced; focus must be on
the detection and handling in the application.
Systematic or Random Fault?
• Use of IC outside of datasheet specification? Systematic
• SEU corruption of a memory element? Random
• Logic fault upon exercising a specific path? Either
• Data corruption in presence of EMI? Either
• ESD damage to an I/O? Either

• Both systematic and random faults must be addressed
• Systematic faults are addressed by:
– Application of robust development processes
– Verification and validation activities at all levels of development
– Continuous monitoring of operations once in production
• Random faults are addressed by:
– Architectural analysis to understand impact of faults on the system
– Application of diagnostics to detect critical faults
– Transition of system into a safe state upon fault detection
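
The detect-then-safe-state pattern for random faults can be sketched in a few lines. This is only an illustration, not something from the original material: `ProtectedRegister`, `control_step`, and the single parity bit are hypothetical names and choices, and a real design would pick its diagnostics from the architectural analysis.

```python
class ParityError(Exception):
    """Raised when the stored parity no longer matches the data (hypothetical name)."""


def parity(word):
    # Even/odd parity of the set bits in an integer word.
    return bin(word).count("1") & 1


class ProtectedRegister:
    """Toy memory element with one parity bit as its diagnostic."""

    def __init__(self, value):
        self.value = value
        self.parity = parity(value)

    def read(self):
        # Diagnostic: recompute parity on every read.
        if parity(self.value) != self.parity:
            raise ParityError("parity mismatch")
        return self.value


def control_step(reg):
    # On fault detection, stop using the data and go to the safe state.
    try:
        return ("RUN", reg.read())
    except ParityError:
        return ("SAFE_STATE", None)


reg = ProtectedRegister(0b10110010)
before = control_step(reg)   # no fault present
reg.value ^= 1 << 3          # simulate an SEU flipping one bit
after = control_step(reg)    # diagnostic fires, system goes safe
```

Note that a single parity bit only catches an odd number of flipped bits, which is one reason the coverage of a diagnostic has to be demonstrated rather than assumed.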

ISO 26262-5:2010 notes four fault models:
– “Stuck-at” fault model: “… a fault category that can be described with
continuous “0” or “1” or “on” at the pins of an element.”
– “D.C.” fault model: “includes the following failure modes: stuck-at faults,
stuck-open, open or high impedance outputs, as well as short circuits
between signal lines.”
– “A.C.” fault model: “transition faults … and path delays”
– “Soft error” fault model: “…These transient faults are also referred to as
Single Event Upset (SEU) and Single Event Transient (SET).”
• Such statements are the result of committee compromises made in an attempt
to find common definitions for all types of electrical and electronic …
• Any relevant failure modes known to the hardware developer or in the
state of the art should be considered in the functional safety analysis.

JEDEC Publications
– Failure Mechanisms and Models for Semiconductor Devices (JEP122G)
– Measurement and Reporting of Alpha Particle and Terrestrial Cosmic Ray-Induced Soft Errors
in Semiconductor Devices (JESD89A)

Establishing Base Failure Rates
– IEC TR 62380
– SIEMENS SN 29500-2
– FIDES Guide 2009 Reliability Methodology for Electronic Systems

In a perfect world, we could establish a base
failure rate for every failure mode of an IC.
• From a practical standpoint, we estimate using models, field data, and published …
• Partition of failure rates between elements on an IC is typically done via circuit type,
die area, and/or transistor count
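
As a rough sketch of the partitioning step, a die-level base failure rate can be split across blocks in proportion to some size measure. The function name `partition_failure_rate`, the block names, the areas, and the 50 FIT total are all made-up values for illustration; a real analysis would also weight by circuit type.

```python
def partition_failure_rate(total_fit, block_sizes):
    """Split a die-level base failure rate (FIT) across blocks by relative size.

    block_sizes maps block name -> die area or transistor count (any
    consistent unit). Purely proportional; circuit type is ignored here.
    """
    total_size = sum(block_sizes.values())
    if total_size <= 0:
        raise ValueError("block sizes must sum to a positive value")
    return {name: total_fit * size / total_size
            for name, size in block_sizes.items()}


# Hypothetical die: areas in mm^2, 50 FIT for the whole IC.
blocks = {"cpu": 2.0, "sram": 5.0, "peripherals": 3.0}
fit_by_block = partition_failure_rate(50.0, blocks)
```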

Safe vs. Dangerous Faults

A safe fault is one which does not result in propagation to a safety-related
failure at the system level.
• Determination of safe vs. dangerous faults is primarily based on end application usage of the product.
• An “architectural safeness” factor can be established via fault injection and engineering analysis, providing a baseline for minimum
percentage of safe faults.
– Typical architectural safe faults include faults in debug and DFT logic which is not activated during normal system operation.
• Fault injection techniques applied during IC simulation or on functional models can be used to establish the ratio of safe vs. dangerous faults.
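
To make the fault injection idea concrete, here is a minimal sketch on a toy functional model. The circuit (one AND gate plus a debug-only observation node), the node names, and the rule "dangerous means the functional output differs from the fault-free run for at least one input" are all assumptions for illustration. In practice the safe vs. dangerous call depends on the end application.

```python
from itertools import product

FUNCTIONAL_NODES = ["a", "b", "and_out", "out"]
DEBUG_NODES = ["dbg_obs"]  # drives a debug pin only, unused in normal operation


def evaluate(a, b, fault=None):
    """Evaluate the toy circuit; fault is (node, stuck_value) or None."""
    def drive(node, value):
        # Override the node with the stuck-at value if it is the faulty one.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a = drive("a", a)
    b = drive("b", b)
    and_out = drive("and_out", a & b)
    _dbg_obs = drive("dbg_obs", a ^ b)  # never reaches the functional output
    return drive("out", and_out)


def classify_faults():
    """Inject stuck-at-0/1 on every node and compare against the golden run."""
    results = {}
    for node in FUNCTIONAL_NODES + DEBUG_NODES:
        for stuck in (0, 1):
            fault = (node, stuck)
            propagates = any(
                evaluate(a, b, fault) != evaluate(a, b)
                for a, b in product((0, 1), repeat=2)
            )
            results[fault] = "dangerous" if propagates else "safe"
    return results


results = classify_faults()
safe_fraction = sum(v == "safe" for v in results.values()) / len(results)
```

In this toy model only the two faults on the debug node come out safe, which matches the remark above about debug and DFT logic.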

Considering Timing Aspects in ISO 26262

As illustrated in ISO 26262-1:2010, Figure 4, a system must be able to
detect a fault and transition to a safe state before the fault can become a
system-level hazard. The time available for this is the fault tolerant time interval (FTTI).
• To impact SPFM (single-point fault metric), any diagnostics must execute within the FTTI
• To impact LPFM (latent fault metric), any diagnostics must execute once per drive cycle
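
The timing constraint can be written as a simple budget check. This is a sketch under an assumption that is mine, not the source's: the worst-case detection latency is taken to be the diagnostic test interval, and detection plus reaction must fit inside the FTTI. The function name and all the millisecond values are hypothetical.

```python
def fits_within_ftti(diagnostic_interval_ms, reaction_time_ms, ftti_ms):
    """True if worst-case detection plus reaction completes inside the FTTI.

    Assumes a fault can appear right after a diagnostic run, so detection
    can take up to one full diagnostic interval.
    """
    return diagnostic_interval_ms + reaction_time_ms <= ftti_ms


# Hypothetical numbers for a 10 ms FTTI safety goal.
hw_compare_ok = fits_within_ftti(0.001, 5.0, 10.0)  # near-continuous hardware comparison
sw_test_ok = fits_within_ftti(50.0, 5.0, 10.0)      # software self-test every 50 ms
```

With these made-up numbers the hardware comparison fits the budget and the periodic software test does not, so the software test could at best count toward latent fault detection.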

Diagnostic Timing Implications

• Shorter FTTI safety goals result in strong demand for diagnostics based on parallel redundancy.
– Airbag systems (…)
– Steering and braking (10ms – 100ms FTTI) converging on lockstep CPUs
• Longer FTTI safety goals allow more flexibility in diagnostic selection
– RADAR and vision ADAS systems (500ms+ FTTI) tend to rely on multiple
samples of input sensors and software based diagnostics
• Though it is attractive to repurpose DFT logic for functional safety diagnostics, its execution time may prevent effective use as a run-time diagnostic.

Selection of Diagnostics

• ISO 26262 does not mandate specific diagnostics to be implemented.
• It does recommend specific diagnostics for several different types of basic design elements, as illustrated in ISO 26262-5; Table D.6
• The IC developer must demonstrate the performance of implemented diagnostics. Typically this is done with fault injection testing.
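
One simple way to summarize a fault injection campaign is the fraction of dangerous faults that the diagnostic caught. The sketch below assumes each injection result is a dict with hypothetical `dangerous` and `detected` flags. The campaign data is invented, and a real report would normally weight faults by failure rate rather than just counting them.

```python
def diagnostic_coverage(injection_results):
    """Fraction of dangerous injected faults that the diagnostic detected."""
    dangerous = [r for r in injection_results if r["dangerous"]]
    if not dangerous:
        raise ValueError("campaign contains no dangerous faults")
    detected = sum(1 for r in dangerous if r["detected"])
    return detected / len(dangerous)


# Invented campaign: four dangerous faults (three detected) and one safe fault.
campaign = [
    {"fault": "f1", "dangerous": True, "detected": True},
    {"fault": "f2", "dangerous": True, "detected": True},
    {"fault": "f3", "dangerous": True, "detected": False},
    {"fault": "f4", "dangerous": True, "detected": True},
    {"fault": "f5", "dangerous": False, "detected": False},
]
coverage = diagnostic_coverage(campaign)
```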

Safety Manuals and Analysis Reports

• Safety manuals are created to provide instructions to the system integrator of how to use
the safety features of the IC.
• Analysis reports allow the system integrator to derive safety related metrics which
can be applied to their system level analysis.

ISO/AWI 19451 “Application of ISO 26262 to Semiconductors” currently has over 20 semiconductor companies participating.

– If you would like to participate, please contact me offline

