Automotive TI Tech Days 2020
Automotive TI Tech Days 2020 - Embedded processing sessions Enabling Functional Safety for mmWave Sensors
3.2 Enabling Functional Safety for mmWave Sensors
Welcome again, everybody, to Automotive TI Tech Days. Hopefully your first sessions were good, not just OK. This presentation is on enabling functional safety in TI millimeter wave sensors, and the presenter is Mr. Kamath.
My name's Scott. I'll be moderating the session. All the participants are going to be muted, so if you have any questions or problems hearing anything, please bring them up using the chat function. All questions go to the chat function, and we'll address them as the presenter would like. So with that, I'll hand it off to Mr. Kamath, and we'll go ahead and get started.
OK. Thanks, Scott, for the introduction. Hello, everyone, and welcome to this session on enabling functional safety in TI millimeter wave sensors. I'm Raghunandan Kamath. I'm the [INAUDIBLE] software development manager for the millimeter wave devices.
Let's get into the agenda of today's session. We'll briefly cover the mmWave device introduction and also the TI solutions for the [INAUDIBLE] market. Then we'll talk about the functional safety concept that this device was built on.
And we'll spend some time on the various safety features in our device. The most important of these are the analog and RF modules, which have several built-in monitors, and also the digital modules. I'd also like to discuss the customer safety enablers that TI provides to make the safety goals easier for our customers to achieve. We'll also spend some time on Q&A.
OK, so this is one of our high-performance front ends, which is available today. It has an integrated transceiver with four RX and three TX, a sampling frequency of 45 Msps, and an IF bandwidth of 20 megahertz. It has enhanced RF performance in comparison to the gen-one 1243 device: better phase noise, lower noise figure, and superior bumper handling. The package is an FCBGA at 10.4 millimeters by 10.4 millimeters. It's an ASIL-B capable device, mostly targeted toward MRR, the mid-range and long-range radar, with a single 2243 plus an external MCU, and toward imaging radar, where you can cascade multiple front ends with an external MCU.
So let's go into the functional safety concept that we have assumed for our millimeter wave device. As these devices are targeted at a broad market rather than any specific customer, we cannot assume one specific implementation or configuration here. The applications it can be used in include safety guards for industrial applications. For automotive applications, they include BSD, or Blind Spot Detection, lane changes [INAUDIBLE], cross traffic alerts, emergency braking, and also adaptive cruise control.
So these are what we have identified as the safety goals for our sensors. First, do not miss any object when an object is present, and identify the object in a timely manner when it is present. Second, do not detect or misidentify an object if it is not present. In other words, don't have any kind of false detections. The TI mmWave [INAUDIBLE] safety concept and requirements are driven to address the safety goals that we have listed here.
So this is again a representation of the hand equipment [INAUDIBLE]. The 1643, or the integrated radar sensor, gives out the object data or the point cloud to a sensor fusion [INAUDIBLE] MCU. That ECU is then connected to another ECU, which controls the brake or steering or [INAUDIBLE]. Similarly, the imaging radar or front-end-only device interfaces with a safety MCU, which gives out the point cloud to the ECU that controls the [INAUDIBLE].
OK, so the architectural management of random faults is listed here. TI's mmWave products are built on a safety island philosophy that is mostly borrowed from the Hercules safety MCU family, which is one of the most popular safety MCU families in the market today. So anybody who is familiar with the Hercules safety MCUs will see a lot of correlation between the mmWave device and the Hercules MCU family.
The critical modules are protected with continuous hardware monitoring, which ensures a high level of confidence in safe operation. Once these modules are made safe, they can be used to provide comprehensive software diagnostics on other design elements. The key aspects of the functional safety concept that we use are these. First is fault avoidance. Then there is fault detection and control of faults, where the fault is the result of any malfunctioning behavior.
Then there is transitioning to the safe state. The safe states are represented here as AR1 and AR2. And then there is fault tolerance, which means degraded performance with a driver warning. So what are the different operating states? First is the powered-off state.
This is the initial operating state of the device. No power is applied in this state, either to the core or to the I/Os, and the device is non-functional at this point.
Then we have the safe state AR1. Here the device is powered but non-operational. The nRESET signal is asserted by the system and is not released until the power supplies have stabilized. When the product reaches the AR1 state, the output drivers are tri-stated, and the I/O pins are kept in input mode with their default pull-up or pull-down [INAUDIBLE] values.
Then we have the cold boot state. In the cold boot state, key components such as the oscillators, the digital logic that makes up the state machines, and the debug logic are functional. Once the cold boot state is completed, the warm reset signal is released, leading to the warm boot state. The warm reset signal transition can be monitored externally on this I/O, the [INAUDIBLE] I/O pin.
Now, in the operational state, the device is capable of supporting any [INAUDIBLE] critical functionality. And then there are the two safe states that we are calling out here, AR1 and AR2. AR1 is when the reset is held by the device internally and all the I/Os move to a tri-state with their default pull-ups and pull-downs. AR2 is when the error signal pin is asserted, with the error signaling module inside the device configured to generate a NERROR.
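The operating states described above can be summarized as a small state machine. The following is a minimal sketch in Python, not TI code: the state names follow the talk, but the event names and the exact transition table are assumptions made for illustration, and the device safety manual remains the authority on the real transitions.

```python
from enum import Enum, auto


class State(Enum):
    POWERED_OFF = auto()   # no power on core or I/Os
    SAFE_AR1 = auto()      # powered, reset held, I/Os tri-stated
    COLD_BOOT = auto()
    WARM_BOOT = auto()
    OPERATIONAL = auto()
    SAFE_AR2 = auto()      # NERROR asserted


# Hypothetical event names; the table paraphrases the talk only.
TRANSITIONS = {
    (State.POWERED_OFF, "power_applied"): State.SAFE_AR1,
    (State.SAFE_AR1, "nreset_released"): State.COLD_BOOT,
    (State.COLD_BOOT, "warm_reset_released"): State.WARM_BOOT,
    (State.WARM_BOOT, "boot_complete"): State.OPERATIONAL,
    (State.OPERATIONAL, "nerror_asserted"): State.SAFE_AR2,
    (State.OPERATIONAL, "nreset_asserted"): State.SAFE_AR1,
    (State.SAFE_AR2, "nreset_asserted"): State.SAFE_AR1,
}


def step(state, event):
    """Return the next state; an event that is not valid here is ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.POWERED_OFF):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, feeding the normal power-up sequence into `run` ends in `State.OPERATIONAL`, and a later `"nerror_asserted"` event moves the model to `State.SAFE_AR2`.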
OK, so the error signaling module in the mmWave device is a centralized block to which more than 120 different identified errors are routed, with different priority levels. The different errors could be CPU-related errors, such as a CPU abort or CPU interrupts.
During an abort, the program sequence transfers the context to the abort handler, and the software has control to manage the abort. Similarly, for the CPU interrupts, the interrupt allows events external to the CPU to generate a program sequence context transfer to the interrupt handler, and then the software can take control and manage that call. Then we have the NERROR pin, which corresponds to the safe state AR2, the generation of WARM RESET, which we already saw, and also the generation of NRESET, which is the safe state AR1.
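As a rough illustration of a centralized block that routes many error sources to a handful of responses, here is a minimal Python sketch. The error names, the routing table, and the severity ordering are all hypothetical and chosen only for illustration; the real routing is defined by the device and its safety manual.

```python
from enum import IntEnum


class Response(IntEnum):
    # Ordering by severity is an assumption of this sketch.
    CPU_INTERRUPT = 1
    CPU_ABORT = 2
    NERROR_PIN = 3    # corresponds to safe state AR2 in the talk
    WARM_RESET = 4
    NRESET = 5        # corresponds to safe state AR1 in the talk


# Hypothetical error sources mapped to hypothetical responses.
ROUTING = {
    "config_readback_mismatch": Response.CPU_INTERRUPT,
    "illegal_memory_access": Response.CPU_ABORT,
    "ram_ecc_double_bit": Response.NERROR_PIN,
    "watchdog_timeout": Response.WARM_RESET,
}


def strongest_response(active_errors):
    """Return the most severe response among the active errors, or None."""
    responses = [ROUTING[e] for e in active_errors if e in ROUTING]
    return max(responses, default=None)
```

The point of the sketch is only the shape of the design: one table, owned by one module, decides what each error leads to, so software does not need to poll every source separately.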
So this is a representation of all the safety-critical functions that are part of the safety island diagnostics. If you look at the RF and analog subsystem, the modules that are marked in red are all safety-critical functions, and there will be some kind of monitoring for each of these components on the device.
So we have the RF/analog subsystem, the [INAUDIBLE] subsystem, the DSP, and then the master subsystem. What are the safety-critical functions that we categorize? The power supplies, the clock generation, reset, the RF and analog modules, digital front-end filters, ramp generation, enhanced DMA [INAUDIBLE] buffers, and then all the digital microcontroller elements: the RTI, or real-time interrupt, for the general purpose timers, the DMA modules, the inter-processor communication like the mailbox, the SPI master interface, the I/Os, and also the error signaling module, the ESM.
So let's take them one by one, starting with the RF and analog subsystem. The power monitoring is composed of internal voltage monitors and the temperature sensors. For the clock, there is the DCC, which is the dual clock comparator, the internal and external watchdogs, and APLL lock detection. Then there are the RF and analog modules. [INAUDIBLE] lots of RF and analog modules are monitored, and we'll discuss each of them in the later slides.
The RF and analog module monitoring includes internal analog signal monitoring, external analog signal monitoring, the temperature sensor, TX power, TX ball break, the RX loopback test, the RX IF loopback test, synthesizer chirp frequency monitoring, BSS clock monitoring, saturation detection, TX loopback, and analog fault detection.
Now the safety-critical elements in the radar subsystem and the DSP subsystem. Typically, all the memories have ECC enabled. There is boot-time PBIST and LBIST for the [INAUDIBLE] generation and the DFE filters, lockstep for the sequencer logic, and periodic software [INAUDIBLE] for configuration registers. For the BSS microcontroller, there is boot-time LBIST and PBIST for the RTCM memory and also the CPUs, again ECC for all the TCM and [INAUDIBLE] memories, the MPUs and PMUs (Memory Protection Units), and a periodic software readback of configuration registers. And for the ADC buffers, it's ECC on the RAM and boot-time PBIST.
For the MSS, the master subsystem microcontroller elements, the critical functions are again a repetition of what we saw in the radar subsystem and the DSP subsystem. In addition, there are some IPs like the IPC, that is the mailbox, with CRC and ECC on SRAM, then the error signaling module, and then the SPI, which has the error signaling module and end-to-end [INAUDIBLE] using the CRC. OK, so that was a brief on the safety concept.
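To make the idea of end-to-end protection with a CRC concrete, here is a minimal Python sketch of appending a checksum to a message and verifying it on receipt. It uses the standard-library CRC-32 and a 4-byte little-endian trailer purely as an assumption for illustration; the talk does not state which CRC polynomial or framing the device uses on the mailbox or SPI.

```python
import struct
import zlib


def frame_message(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte little-endian trailer."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def check_message(frame: bytes) -> bytes:
    """Verify the trailer and return the payload, or raise ValueError."""
    if len(frame) < 4:
        raise ValueError("frame too short to contain a CRC")
    payload = frame[:-4]
    (received_crc,) = struct.unpack("<I", frame[-4:])
    if zlib.crc32(payload) != received_crc:
        raise ValueError("CRC mismatch")
    return payload
```

The sender computes the CRC over the data it intends to send, and the receiver recomputes it over what actually arrived, so corruption anywhere along the path shows up as a mismatch.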
Now we can look in more detail at the safety features of the device. The most important block is the RF, analog, and digital monitoring, which is the most important aspect to consider in the context of safety. The mmWave device includes hardware and firmware elements to enable this monitoring of its analog and digital functions.
All these built-in features are exposed to users through firmware APIs, and the customer can configure each of these monitors and use them to achieve their safety goals. The APIs have a high degree of programmability. So the user can configure which built-in monitors they want to execute in the device.
You can also configure the periodicity of execution and the verbosity of the monitoring report. The verbosity could be, for example, give me a monitoring report every few FTTIs, or you can define the FTTI at which you want the monitoring report to be sent out from the device. You can also configure the measurement comparison thresholds the device should use in reporting. The APIs give you the option to program the thresholds, and the device firmware compares the measured value against the threshold that you have set and gives the report accordingly.
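The three knobs just described (which monitor, how often, and against which thresholds) can be pictured with a small Python sketch. The field names, the two report modes, and the threshold values are hypothetical; they are not the actual API parameters.

```python
from dataclasses import dataclass


@dataclass
class MonitorConfig:
    name: str
    period_frames: int   # run the monitor once every N frames (assumed unit)
    report_mode: str     # "every_run" or "on_failure" (hypothetical modes)
    low: float           # lower comparison threshold
    high: float          # upper comparison threshold


def is_due(cfg: MonitorConfig, frame_index: int) -> bool:
    """True when the monitor should execute on this frame."""
    return frame_index % cfg.period_frames == 0


def evaluate(cfg: MonitorConfig, measured: float) -> dict:
    """Compare a measurement against the thresholds and build a report."""
    passed = cfg.low <= measured <= cfg.high
    send = cfg.report_mode == "every_run" or not passed
    return {"monitor": cfg.name, "value": measured,
            "pass": passed, "send_report": send}
```

With `MonitorConfig("tx_power", 4, "on_failure", 10.0, 14.0)`, a reading of 12.0 passes silently, while a reading of 8.5 fails and triggers a report.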
So what are the different RF, analog, and digital monitors supported? RX gain/phase monitors, the IF stage monitors, TX power, TX ball break, the synthesizer chirp frequency error monitor, clock monitors, temperature sensors, and internal signal monitors. [INAUDIBLE] are a few of them. Before going into those, I want to discuss how we can configure or program these monitors. For that, it's very important to understand how the device schedules these monitors.
The device consumes a certain amount of time to execute each of these monitors. If the user has programmed, say, two or three monitors, then these monitors get executed in the inter-burst or inter-frame times of the user-programmed waveform, the [INAUDIBLE] waveform. Typically, it's the user who has to calculate the time consumed by the execution of these monitors and then program the monitor periodicity in their frame configuration such that [INAUDIBLE] have sufficient time to complete these monitors in the available inter-frame time.
So [INAUDIBLE] two examples here. In example one, all the monitors that are configured get executed in one single inter-frame or inter-burst time. In example two, there is not enough inter-frame or inter-burst time to complete the entire monitoring activity. So what the firmware does is split the monitoring in such a way that the monitors get distributed across different inter-burst times. Based on that, the monitoring gets executed, and the reports are sent out.
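The two examples above can be sketched as a simple packing problem. The Python below is a minimal illustration that packs monitors, in order, into successive idle windows of equal length. The monitor names and durations are made-up numbers, and the first-fit-in-order policy is an assumption of this sketch, not a description of how the firmware actually splits the work.

```python
def schedule(monitors, idle_time_us):
    """Pack (name, duration_us) monitors, in order, into idle windows.

    Returns a list of windows, each a list of monitor names.
    """
    windows = [[]]
    remaining = idle_time_us
    for name, duration in monitors:
        if duration > idle_time_us:
            raise ValueError(f"{name} cannot fit in any idle window")
        if duration > remaining:
            # Not enough time left: continue in the next idle window.
            windows.append([])
            remaining = idle_time_us
        windows[-1].append(name)
        remaining -= duration
    return windows


# Hypothetical monitor durations in microseconds.
MONITORS = [("rx_gain_phase", 300), ("tx_power", 200), ("synth_freq", 150)]
```

With a 1000 µs idle window all three monitors fit in one window, as in example one. With a 400 µs window the same list is spread over two windows, as in example two.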
There are some common configuration messages that have to be programmed. As I mentioned, you can program the periodicity of this monitoring. So the calibration and monitoring time unit is something that needs to be programmed. It sets the periodicity for calibration and monitoring.
The AWR_CALIB_MON_TIME_UNIT configuration is the API configuration that can be used to program this time unit. Then there is the analog monitoring configuration: which analog monitors would you like to execute? This particular API can be used to enable the required analog monitors.
Similarly, for the digital monitors: which digital monitors would you like to execute? You can configure that using this particular API. Next, the API that you can use to configure the RX gain and phase monitor is the RX monitor gain phase configuration API.
You can program the thresholds, that is, the tolerable thresholds for any deviation of gain from the programmed values, and also for the imbalances in gain and phase. The firmware sends a monitoring report with this particular asynchronous event once the monitoring is completed. The report consists of the measured RX gain and phase values of the receivers across the RF frequencies.
So you can program different RF frequencies, and the measured gain and phase values are reported for the receivers across these RF frequencies. Next, the API that we can use to configure the IF stage monitor is the AWR_MONITOR_RX_IFSTAGE configuration. It can program the tolerable gain error and the cutoff frequency thresholds.
The report is given in the form of an asynchronous event. The report contains the measured RX amplifier gain and also the errors in the cutoff frequency. For TX power, these are the APIs that can be used.
We have three TX, and each TX has a different API that needs to be configured to get the TX power. The thresholds that you can configure are the absolute power and the flatness. The monitoring report is also in the form of an asynchronous event for the TX that is configured. The report consists of the measured TX power for each channel and each enabled RF frequency.
[INAUDIBLE] the TX ball break. So [INAUDIBLE] device enables detection of any kind of TX ball break using its internal power detectors. The device can measure the incident and reflected powers using power detectors on the TX output.
The magnitude of the reflection coefficient for each of these output ports can be determined from this monitor. These are the APIs that are used for configuring the TX ball break monitor. Again, it's independent per TX: each TX has to be configured separately for ball break monitoring.
The threshold that you can configure is the maximum allowable TX reflection coefficient. The report gives you the status of the ball break. If there is a TX ball break, it shows up as a large degradation in the reflection coefficient, so that degradation is the indication of a TX ball break.
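The relationship between the two power readings and the reflection coefficient is standard RF arithmetic: with powers expressed in dBm, the reflection coefficient in dB is the reflected power minus the incident power, and the linear magnitude is 10 raised to that value over 20. The Python sketch below applies this to a ball-break style check. The function names, the sample power readings, and the threshold value are assumptions for illustration, not the device's API or recommended settings.

```python
def reflection_coeff_db(incident_dbm: float, reflected_dbm: float) -> float:
    """Reflection coefficient in dB: 20*log10(|Gamma|) = P_refl - P_inc."""
    return reflected_dbm - incident_dbm


def reflection_magnitude(incident_dbm: float, reflected_dbm: float) -> float:
    """Linear magnitude |Gamma| of the reflection coefficient."""
    return 10 ** (reflection_coeff_db(incident_dbm, reflected_dbm) / 20)


def ball_break_suspected(incident_dbm: float, reflected_dbm: float,
                         max_refl_db: float) -> bool:
    """Flag a fault when the reflection exceeds the allowed maximum."""
    return reflection_coeff_db(incident_dbm, reflected_dbm) > max_refl_db
```

A well-matched output reflects little power, so the coefficient is strongly negative in dB. A broken ball leaves the output badly mismatched, most of the power comes back, and the coefficient rises toward 0 dB, crossing the configured threshold.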
For the internal analog signal monitoring, we have an API as well. For the thresholds, we can program the 20 gigahertz signal maximum and minimum thresholds. The 20 gigahertz signal is used in cascaded configurations to synchronize between the master and the slave. The monitoring report [INAUDIBLE] provides you with the supply voltage and DC bias of the PM [INAUDIBLE] subsystems or the distribution circuits.
For the TX and RX analog signal monitoring, there are APIs for each TX and for the RX internal analog signal monitoring. The thresholds that you will typically configure are the [INAUDIBLE] monitor delta thresholds. For the monitoring reports, the TX and RX each have a different monitoring report.
The RX are all combined in one single monitoring report, whereas each TX gives a separate monitoring report. The status flag indicates the supply voltage failures [INAUDIBLE] there are any more failures. Similarly, for the PLL control voltage, you have an API to monitor the PLL control voltage.
The monitoring report provides the measured control voltage values and also any failure flags. Next, synthesizer frequency monitoring allows you to configure the threshold for tolerable frequency errors. The monitoring report provides the number of threshold violations in the monitoring duration and also the maximum frequency error during the monitoring duration. This is also supported during active chirping.
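The two quantities in the synthesizer frequency report, the number of threshold violations and the maximum error, can be illustrated with a few lines of Python. The sample frequencies and the threshold below are invented numbers, and the report field names are hypothetical.

```python
def synth_freq_report(measured_hz, ideal_hz, threshold_hz):
    """Summarize frequency error of a measured ramp against the ideal ramp."""
    errors = [abs(m - i) for m, i in zip(measured_hz, ideal_hz)]
    return {
        "violations": sum(1 for e in errors if e > threshold_hz),
        "max_error_hz": max(errors, default=0),
    }


# Hypothetical ideal chirp: 77 GHz start, 1 MHz step per sample.
IDEAL = [77_000_000_000 + k * 1_000_000 for k in range(5)]
# Hypothetical measured offsets from the ideal ramp, in Hz.
OFFSETS = [0, 20_000, -150_000, 40_000, 300_000]
MEASURED = [i + o for i, o in zip(IDEAL, OFFSETS)]
```

With a 100 kHz threshold, `synth_freq_report(MEASURED, IDEAL, 100_000)` counts two violations and reports a maximum error of 300 kHz.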
Next is DCC-based clock frequency monitoring. The device supports autonomous monitoring of the relative frequencies between various clock pairs using the DCC, that is, the dual clock comparator. The monitoring report provides the measured clock frequency for each of the enabled clock pairs.
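The principle of comparing two clocks is to count cycles of both over the same window and check that the ratio of the counts matches the expected frequency ratio. The Python sketch below shows that arithmetic. The reference frequency, the nominal test frequency, and the tolerance are example values chosen for illustration, not device specifications.

```python
def dcc_check(count_ref, count_test, f_ref_hz, f_test_nominal_hz, tol_ppm):
    """Estimate the test clock from a known reference and check tolerance.

    Both counters are assumed to run over the same time window.
    Returns (measured_hz, within_tolerance).
    """
    measured_hz = f_ref_hz * count_test / count_ref
    error_ppm = (measured_hz - f_test_nominal_hz) / f_test_nominal_hz * 1e6
    return measured_hz, abs(error_ppm) <= tol_ppm
```

For example, if a 40 MHz reference counts 40,000 cycles while the clock under test counts 200,000, the test clock is running at 200 MHz. A count of 200,100 instead corresponds to a 500 ppm error, which would fail a 100 ppm tolerance.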
Now, monitoring in cascaded applications. The 2243 can be used for cascaded or imaging applications. In cascaded applications, the monitors executing on one device may interfere with those executing on another device. So this monitoring scheme targets simultaneous execution of only the non-interfering monitors.
So we do either time-division or frequency-division-based separation of mutually interfering monitors on these cascaded state-of-the-art devices. The monitors are categorized as type 1, type 2, and type 3. Type 1 monitors are predominantly the TX-off monitors, which do not transmit.
Type 2 are the monitors that transmit but don't receive any of the test signals through the RX LNA. Type 3 are the ones where the transmit and receive signals [INAUDIBLE] are active, and they're susceptible to interference. To support this, a host-based monitoring scheduling scheme is introduced for cascaded applications, where the host can take control of when to execute monitors and which type of monitors to execute. It gives the host more control over scheduling the monitors so that they don't interfere with each other.
So all the type 1 monitors can be executed on all the devices at the same time. Similarly, all the type 2 monitors can be executed at the same time after the completion of the type 1 monitors. Once type 1 and type 2 are completed, the type 3 monitors, which are susceptible to interference, can be scheduled by the host for different devices at different times. This flexibility, and also the PID configuration, is provided by the monitoring APIs.
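The ordering just described can be written down as a simple time-division schedule. The Python sketch below is only an illustration of that ordering under stated assumptions: the device names are hypothetical, each tuple stands for one time slot, and the real host-side scheduling uses the monitoring APIs rather than this function.

```python
def cascade_schedule(devices):
    """Build a time-division schedule for monitors on cascaded devices.

    Type 1 and type 2 monitors run on every device in a shared slot.
    Type 3 monitors, which can interfere, get one slot per device.
    """
    slots = [
        ("type1", list(devices)),
        ("type2", list(devices)),
    ]
    slots.extend(("type3", [device]) for device in devices)
    return slots
```

For a two-device cascade, `cascade_schedule(["master", "slave1"])` yields four slots: type 1 on both, type 2 on both, then type 3 on the master alone and type 3 on the slave alone.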
Now, the customer safety enablers. There are different components that TI provides to enable customers to achieve their safety goals. One of the components is the safety diagnostic library. The safety diagnostic library is a collection of functions that provide access to safety functions and response handlers for various safety mechanisms in the [INAUDIBLE] devices.
In this library, the diagnostics are run in the context of the caller's protection environment, and all the responses are handled in the context of [INAUDIBLE] exception. It implements all the diagnostic mechanisms that are indicated in the device safety manual. The SDL is provided to customers in source code form, .c and .h, along with all the quality artifacts, that is, the compliance support package that customers need as part of their certification process. This is released via mySecure and under special [INAUDIBLE] the safety [INAUDIBLE].
Next is the FMEDA, which is the failure modes, effects, and diagnostic analysis. It's a detailed analysis of the different failure modes and the diagnostic capability for a piece of equipment. It is an effective method for determining failures, failure modes, and failure rates, [INAUDIBLE] the requirement, and also for certification against ISO 26262. For the AWR, it covers the modules used and the diagnostic features associated with each of these used modules.
So for all the modules that are used, what we need to do is evaluate whether the failure rate meets the ASIL-B hardware metric requirements for the safety functions. That is what is determined by this [INAUDIBLE]. It can be used to estimate the failure rate for any safety function and the effect of risk reduction from applying the diagnostics.
So this is the worksheet that TI provides for the FMEDA, and there are steps to follow: tailoring of the AWR use case conditions, selection of the AWR modules used for safety functions, selection of the safety mechanisms for diagnostics, and the failure rate analysis. This is just an indication of what is in the FMEDA worksheet. This is also provided under [INAUDIBLE] to customers, and the data is available only under NDA.
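To give a feel for the kind of arithmetic an FMEDA worksheet performs, here is a heavily simplified Python sketch of a single-point fault metric (SPFM) calculation. It assumes every listed failure could violate a safety goal unless caught by a diagnostic, so the residual rate of each module is its failure rate times one minus its diagnostic coverage. The FIT rates and coverage values are invented for illustration and are not TI data. The 90% figure is the ISO 26262 SPFM target for ASIL B.

```python
ASIL_B_SPFM_TARGET = 0.90  # ISO 26262 single-point fault metric target for ASIL B


def spfm(modules):
    """Simplified SPFM from (failure_rate_fit, diagnostic_coverage) pairs.

    Assumes no safe-fault fraction: residual = rate * (1 - coverage).
    """
    total = sum(rate for rate, _ in modules)
    residual = sum(rate * (1.0 - coverage) for rate, coverage in modules)
    return 1.0 - residual / total


def meets_asil_b(modules):
    return spfm(modules) >= ASIL_B_SPFM_TARGET


# Hypothetical modules: (failure rate in FIT, diagnostic coverage).
EXAMPLE = [(100.0, 0.99), (50.0, 0.90), (50.0, 0.60)]
```

In this made-up example the residual rate is about 26 FIT out of 200, giving an SPFM of about 87%, which misses the target. Raising the coverage of the third module from 60% to 90% brings the metric to about 94.5%, which shows how selecting additional safety mechanisms changes the result.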
Next is the safety manual for the mmWave device. In the safety manual, we provide an overview of the safety architecture for management of random failures. There is a comprehensive list of diagnostic mechanisms provided in the safety manual. The failure modes and also the failure rates are all listed. The diagnostic mechanism table provides the details about the safety island region, the RF and analog side, and [INAUDIBLE] API-level details.
Along with this, we also provide a monitoring application note. The aim of this monitoring application note is to help customers build software that programs the monitoring APIs to achieve their safety goals. It gives a description of all the monitoring mechanisms and also the programming options that are offered by these APIs.
It also illustrates an example of post-processing of the monitoring reports produced by the APIs, and it illustrates the reports under the programming conditions of TI's internal labs. So it has all the details about certain volume data that is available as part of TI's internal lab system validation. This is also shared under the Safety NDA and is available through the mySecure Software link.
So these are the SafeTI certified development processes. TI follows the SafeTI hardware development process, which is certified by TÜV SÜD as meeting ISO 26262 up to ASIL D and also the SIL 3 requirements of IEC 61508. Similarly, we also have a software development process that is certified by TÜV SÜD for ISO 26262 and also IEC 61508. So the [INAUDIBLE] AP00261 is for the software, and [INAUDIBLE] AP00210 is for the hardware.
So again, a brief on the safety element out of context. The technical safety requirements are derived from the safety goals, which then flow into the next level of hardware and software requirements that are designed by the respective hardware and software teams. So [INAUDIBLE] 216 is a certified software development process for developing functional safety capable devices. All the firmware and software fall under this category.
And then for the hardware, [INAUDIBLE] AP210 is the certified hardware development process for developing functional safety capable devices. So this brings us to the end of our agenda. We covered the device introduction, the functional safety concept, the safety features of the device, and then the customer safety enablers: the SDL, the FMEDA, the monitoring application note, the safety manual, and then the certificates.
August 21, 2020
mmWave devices are being chosen for many sensing applications in the Industrial and Automotive markets. Many of these applications require safety standards compliance. The mmWave device architecture enables implementation of Industrial (SIL 2) and Automotive (ASIL B) safety at the system level, which can greatly speed up your time to market. In this presentation we will address popular queries from a customer implementation standpoint and demonstrate how you can begin designing with TI mmWave sensors today.