What is metastability in an FPGA?

Figure 3 shows an example synchronizer chain of length 2, assuming the output signal feeds multiple register targets. Timing slack on each additional register-to-register connection is added to tMET. Therefore, designers cannot predict the sequence of signal transfers or the number of target clock edges until the data transfer is complete. Any voltage levels that fall in an intermediate range may not be interpretable or processed by the system and, as a result, cause the logic gates of the circuit to misbehave and persist in an unstable equilibrium state for a given period of time. And at such places, precisely because the tools cannot know, they are guaranteed to tell you there is a clock timing violation. Altera provides App-Note 42 for some basic details, which also links to specifics of implementing synchronizers using the Quartus II software. An external clock is used to increment a counter, the value of that counter being read by a system operating from a different clock domain. This article describes metastability in FPGAs, explains why this phenomenon occurs, and discusses how it can lead to design failure.

The counter block places new data onto the bus, and keeps that data constant until it has been acknowledged by the other side. If you know a lot about FPGAs already, you might still pick up some interesting tidbits in the post. The timing slack available in the synchronizer register-to-register path is the time available for metastable signals to settle, known as the available metastable settling time. Therefore, it is said that the higher the speed, the more prone the design is to problems. You can't see the actual metastability behavior because the signals are internal. And it's not bad design to do so; it's actually understanding what's going on rather than sticking to rules of thumb. The first is that the output of the metastable flip-flop will take longer to propagate than expected, since there is some delay while the output is resolving. They also provide mechanisms allowing the designer to specify that they shouldn't worry about specific cases at clock-domain boundaries. What will the flip-flop do? This usually happens when a signal is being transferred from one clock domain to another in a system using multiple, asynchronous clock domains. The same output, when again sampled by a second flip-flop, will give you a stable condition, thus resolving the issue of metastability and successfully preserving the integrity of your system and circuitry. Faster process technology and faster transistors enable faster resolution of metastable signals.
As a result, the received value of the bus data may be incorrect. The difficulty with this characterization is that the MTBF of a typical FPGA design is usually measured in years, so it is impractical to measure the time between metastability events using a real design under real operating conditions. They typically go by the name clock domain crossing tools, and they cost quite a bit of money; a piddling amount by ASIC design standards, but still a shocking amount to people not used to chip design. And if there's no relationship, there's no way for the tools to know. Altera provides the DCFIFO megafunction for this operation, which includes various delays and metastability protection for control signals. We'll let you read about the solutions yourself.

References: Altera, App-Note 42: ftp://ftp.altera.com/pub/lit_req/document/an/an042.pdf; Xilinx, App-Note 94: http://www.xilinx.com/support/documentation/application_notes/xapp094.pdf; Lecture 07 of MIT Class 6.004: http://6004.csail.mit.edu/currentsemester/; Ran Ginosar, "Metastability and Synchronizers: A Tutorial": http://webee.technion.ac.il/~ran/papers/MetastabilitySynchronizersTutorialIEEEDT2011.pdf; December 2014 post on ProgrammableLogicInPractice.com [no longer available].
The counters are all large enough that they won't overflow during the experiment, which would also be flagged as an error. However, adding a register adds an extra delay stage to the synchronization logic, so the designer must evaluate whether this is acceptable. Or even the *mention* that MTBF is *exponentially* related to the amount of time you wait for the metastability to settle. These results are shown in Table 1. The FIFO is set up to write data whenever there is space, and read data whenever it is present. In a synchronous system, the input signal must always meet the register timing requirements so that metastability does not occur. Don't let the word metastability scare you. The designer needs to figure it out. This article is mainly translated from Altera's white paper "Understanding Metastability in FPGAs", which mainly describes the problems related to metastability in FPGA design. A designer or tool supplier can have a very significant impact on the design MTBF by improving the tMET of the synchronizer chain with the worst MTBF. Unfortunately even crossing a single line can have trouble, as will be explored when we discuss metastability, so we aren't quite out of the woods yet. The destination register captures the output of the synchronizer one clock cycle and one and a half clock cycles later. FPGA designers can improve system reliability and increase metastability MTBF by increasing tMET with design techniques that increase timing slack in synchronization registers. If the system was operating correctly, one would expect that the new value of the counter is the same as the previous value (no count occurred yet), or the previous value plus one (a count occurred).

FPGA vendors provide First In First Out (FIFO) blocks which have different clocks on the read and write ports. The data input to the synchronizer toggles every clock cycle (high fDATA). Tests were performed at various clock frequencies, and MTBF versus tMET results were plotted on a logarithmic scale.

Also an example of why this is so much of a pain in the neck. I have not seen an actual case of metastability in the whole video. Designers must use circuits such as dual-clock FIFO (DCFIFO) logic to store signal values or handshake logic to accommodate this behavior. Also, the chain with the worst MTBF has a large impact on the design MTBF. Increasing the metastable MTBF reduces the likelihood that signal transmission will cause any metastability issues in the device. Think of the Byzantine Generals and Dining Philosophers problems, why there are so many relaxed ACID transaction mechanisms in databases, why two-phase commits are necessary, and why they fail, making three-phase commits necessary (repeat for 3rd, 4th). If the clock's being buffered by, like, a phase-locked loop internal to the chip, and you're using the positive and negative edges of that clock, then yeah, it's completely deterministic because the duty cycle's known. In the case of a simple OR of 2 outputs of a flipflop, you can hit the window when the clock edges are within the clock-to-out plus propagation time of the first flipflop, or the second flipflop. The phenomenon of metastability is unavoidable; even in synchronous circuits there is a probability of occurrence, so what we can do as designers is reduce the probability of metastability. When the metastable signal is not resolved within the allotted time, a logic failure can result if the target logic observes inconsistent logic states. He would have shown true metastability if (AND ONLY IF!) The mean time between metastable failures is a function of device process technology, design specifications, and timing slack in the synchronization logic.
To minimize glitches due to metastability in asynchronous signaling, circuit designers typically use a series of registers in the target clock domain (a synchronization register chain or synchronizer) to resynchronize the signal to the new clock domain. But going back to a real-life case, let's look at when the core voltage is set to 1.2V. The general rule is to avoid this at all costs, but there are situations where it's unavoidable. I remember covering this in an intro digital logic class a few years ago. One of the most critical aspects of any FPGA design is where two clock domains meet. FPGA vendors can determine the constant parameters in the MTBF equation by characterizing the metastability behavior of the FPGA. A raw signal might be a problem if it switches, or might be OK if it is configured right after reset and left as a constant signal after that. The first is that we might attempt to sample the data in the bus clock domain right when the data from the ADC clock domain is changing, and we read some incorrect intermediate value. Some details are published in app notes by Xilinx and Altera, although data for more recent families requires contact with a Field Application Engineer (FAE). You just need to make sure that the chance it propagates is very low. changing the output), sometimes the flip-flop will latch the old value (i.e. There are 3 flip-flops in two different clock domains. No it's not; all the edges are synchronous to each other with a fixed time relationship (assuming a fixed mark-space ratio). This version differs slightly from the print version; this is my own author copy, from before the Circuit Cellar editing. If no measures are taken to eliminate the unstable state caused by metastability, it will spread to the subsequent circuit and affect its output, thus causing system failure. But rather than just rehash this material, I wanted to present to you some real experimental results about failures.
All the video shows is that his FPGA is really good at avoiding metastability. While each individual bit of the bus may be free of metastability issues, the synthesis tools can't constrain that path correctly (one might have set things like a false path on the timing constraint), so one cannot guarantee that all the bits of that signal will arrive at the same time. This is why encoding multi-bit control signals as 1-hot or Gray code is safer. For most reasonable clock speeds and recent devices this will be true, although there is always still a possibility of the second flip-flop entering a metastable state. One design has 10 chains, each with the same MTBF of 10,000 years; another design has 9 chains with an MTBF of 1 million years, but 1 chain with an MTBF of 100 years. Kinda wish, considering how detailed things are, that there was an appendix of caveats. You can also see the video below about real-life occurrences. There are actually two separate issues that pop up here, and I want to give you a more intuitive feeling for both of them. The timing diagram shows how Din is sampled by CLK2 exactly on its transition from zero to one, and the flip-flop therefore enters a metastable state.
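
To make the point about Gray-encoded buses concrete, here is a minimal Python sketch (the function names are illustrative, not from any vendor tool). It shows that consecutive Gray code values differ in exactly one bit, so a sample taken mid-transition can only yield the old or the new count, whereas a plain binary count can change several bits at once.

```python
def binary_to_gray(n: int) -> int:
    """Convert a non-negative binary count to its Gray code equivalent."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Convert a Gray code value back to a plain binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Going from 7 to 8 flips four bits in binary (0111 -> 1000),
    # but only one bit in Gray code (0100 -> 1100).
    print("binary 7->8:", bits_changed(7, 8), "bits change")
    print("gray   7->8:", bits_changed(binary_to_gray(7), binary_to_gray(8)), "bit changes")
```

In a real design the conversion would be done in HDL on the pointer or counter before it crosses the clock domain; the Python above only illustrates the encoding property.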

It doesn't increase the window; it increases the chance for a signal to fall in the window.

This is where the 2-phase approach can accelerate things a bit more, as you can pass an event when transitioning back to 0 as well as on the transition to 1. Remember that the control signal should be double-registered on entry to a new clock domain, so the round-trip time for the 4-phase system is, in the worst case, 8 clock cycles (consider a situation where the two domains are almost the same clock frequency but not quite), or, if one domain is significantly faster than the other, around 4 cycles of the slowest clock. As expected, this change results in no errors using the same test as before. FPGA tools *do* guarantee that *within* a clock domain, it will never happen; that's the entire point of checking setup/hold times. One day I missed clock edges because of it. So sometimes it makes sense to be careful at constraining the positive->negative edge transitions a little better to allow for a range of duty cycles. The results in Fig. The tMET of a synchronization chain is the sum of the output timing slack for each register in the chain. Unless there's a derived relationship between the two clocks (e.g., the second clock is a multiple or a division of the first), this is totally impossible. Different FPGA vendors and families have different best practices about how to implement this. As mentioned before, metastability is introduced when the input signal hovers between 1 and 0 while the clock is sampling, meaning the system is unable to derive an output signal for a particular time period. The time it takes for the flip-flop to reach its final state may vary slightly, and is given by a probabilistic measure. References to previous articles are for Circuit Cellar issues, as this was originally written for the print publication.
If you've got 2 nominally 50 MHz domains, but different real clock sources, you'll hit the metastability window pretty damn often, so you better have a synchronizer between them. I've posted it here for your reading pleasure as well. A pointer to a decent reference would be more than sufficient. In addition you may need to constrain the path between the flip-flops, since the tools don't know that you actually want the shortest possible path between those flip-flops.

Therefore, unless the vendor designs the FPGA circuit to improve the metastable robustness, the metastable MTBF will usually get worse. Faster clock frequencies and faster data switching reduce (that is, worsen) MTBF. In particular this circuit looks like a shift register, thus the synthesis tool may insert a shift-register primitive which doesn't have the desired metastability characteristics.
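
Logically, a synchronizer chain is just a short shift register clocked in the destination domain, which is why a synthesis tool can mistake one for the other. The following Python sketch is a purely behavioral model under that assumption (the class name is hypothetical, and Python cannot model metastability itself). It only shows the latency cost: the output follows the input after as many destination clock edges as there are stages.

```python
class SynchronizerChain:
    """Behavioral model of an N-stage synchronizer (a shift register).

    This models latency only. Real metastability is an analog effect
    that a logic-level model like this one cannot reproduce.
    """

    def __init__(self, stages: int = 2):
        if stages < 1:
            raise ValueError("a synchronizer needs at least one stage")
        self.regs = [0] * stages

    def clock(self, async_in: int) -> int:
        """Apply one destination-clock edge and return the chain output."""
        # Every register takes the value of the one before it;
        # the first register samples the asynchronous input.
        self.regs = [async_in] + self.regs[:-1]
        return self.regs[-1]


if __name__ == "__main__":
    two_stage = SynchronizerChain(stages=2)
    three_stage = SynchronizerChain(stages=3)
    # Drive a constant 1 and watch when it appears at each output.
    print([two_stage.clock(1) for _ in range(4)])
    print([three_stage.clock(1) for _ in range(4)])
```

Each extra stage gives a metastable value one more clock period to settle, at the price of one more cycle of delay.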

In most cases, the registers quickly revert to a stable, defined state. If our data is coming from one clock domain, and the internal clock from another, it's likely we will eventually violate the setup or hold times, and enter this state. Therefore, metastability has not been a major concern for FPGA designers. The FPGA tools should provide mechanisms for indicating where they can't prove it can't happen. The second is that the invalid logic level representing the metastable state might be read as a 1 in some portions of the circuit and a 0 in other portions, due to small process variations. Altera's Quartus II MegaWizard Plug-In Manager provides an option to select enhanced metastability protection with three or more synchronization stages. Urgh, what a terrible blog post (theirs, not yours, Al): they threw in a lot of the right words and some of the right explanations but mangled some of the results.

Dan is happy to receive feedback; perhaps you could post this to his email? The circuit is replicated throughout the device to reduce the effects of any local variations, and each instance is continuously tested to remove any noise coupling. The flag synchronizer in there is the one I use all the time in my designs. Everything I've read from that guy seems excellent (to me as a novice). But you can actually see the behavior if you're really clever, like forcing a signal into the metastability window using the fine phase shifts on a PLL and sampling with another fine-phase-shifted clock to probe things at literally picosecond scales. The first register in the new clock domain is used as the synchronization register. About as much fun as asynchronous logic. Just adding two flip-flops as in Fig. If we wanted to modify our design to simply insert a FIFO with separate clocks, it would now look as in Fig. Designers using Altera FPGAs can take advantage of Quartus II software capabilities to report the metastable MTBF of their designs and optimize design placement to increase MTBF. If you've got 2 flipflops -> OR -> 1 flipflop, now there are 2 times that it could happen, thanks to the different propagation times from the 2 flipflops through logic to the new domain. If you want to learn more basics about flip-flops, you might want to start with the first post in that series.

There are a number of online resources, or if you prefer something printed I recommend Chapter 6 of the book Advanced FPGA Design by Steve Kilts. In Fig. That's why you need a mechanism for preventing the metastability from propagating. An example of a data output signal starts from a low state and enters a metastable state, alternating between high and low states. Metastability can occur when signals travel between circuits in uncorrelated or asynchronous clock domains. Any control signal coming from a different clock domain (where they are totally asynchronous) should be *at least* double-registered for metastability issues, plus that path between the two clock domains should have no combinatorial logic (i.e. However, if a metastable signal does not resolve to a low or high state before reaching the next design register, it can cause the system to fail. Now, we know the data register is safe (stable) because the control signal (an encoded 1-hot or Gray-encoded count) said so, but this assumes that the mux structure is not influenced by the other registers (it may not always synthesise as a perfect OR-of-AND-like structure). What's even worse is that a design can work most of the time and only hit a setup or hold violation occasionally. Huh? Another way, instead of trying to eradicate metastability altogether, which is quite a difficult thing to do, is to increase the Mean Time Between Failures, or MTBF.
8A, which uses a persistence mode in the oscilloscope to plot hundreds of traces on top of each other. Yes and no. To improve metastability MTBF, designers can increase tMET by adding additional register stages in the synchronization register chain. I don't think that conflicts with what I wrote. Thus if you enter the metastable state it will most likely be resolved in some short period of, say, 1 ns, but it's possible (although unlikely) for it to take, say, 5 ns. A naive implementation is shown in Fig. Even if you only have one clock, any inputs from the outside world that don't reference your clock (or, perhaps, any clock at all) introduce the possibility of metastability. What you do is allow your first flip-flop to sample the asynchronous input signal coming from the other clock domain, which may put it into a metastable state. To give you a more obvious figure, I've dropped the core voltage of the FPGA to 0.7V (again, this is BELOW the normal minimum voltage of 1.14V for VCCINT). In order to give you a feel for a metastable circuit, I've implemented the block diagram from Fig. Note that, exactly as expected, the number of failures scales with external frequency, as a faster external frequency simply means more data changes (clock edges), and thus more chances for data to be read at the wrong time. The FIFO logic uses synchronizers to transfer control signals between the two clock domains, and then uses dual-port memory to write and read data. For example, consider two different designs with ten synchronizer chains. Unfortunately many softies ignore/deny them in the same way that hardies ignored/denied metastability back in the 70s.
Otherwise, if the asynchronous signal acts as part of the handshake logic between the two clock domains, the control signal indicates when data can be transferred between the clock domains. If you have combinational logic this increases the window quite a lot. It can also be seen from the formula of MTBF that as the system clock frequency increases, it also brings challenges to the robustness of the system.
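
The MTBF formula referred to here is commonly published (for example in Altera's white paper) as MTBF = e^(tMET / C2) / (C1 × fCLK × fDATA), where C1 and C2 are device-dependent constants. The Python sketch below evaluates that expression. The constant values are made-up placeholders for illustration only, since real values come from the vendor's characterization, and the function name is hypothetical.

```python
import math


def synchronizer_mtbf(t_met_s, c1_s, c2_s, f_clk_hz, f_data_hz):
    """MTBF in seconds: exp(tMET / C2) / (C1 * fCLK * fDATA)."""
    return math.exp(t_met_s / c2_s) / (c1_s * f_clk_hz * f_data_hz)


# Placeholder constants, NOT taken from any real device data sheet.
C1 = 1e-9     # seconds (assumed)
C2 = 50e-12   # seconds (assumed)

base = synchronizer_mtbf(1.0e-9, C1, C2, f_clk_hz=100e6, f_data_hz=10e6)
more_slack = synchronizer_mtbf(1.5e-9, C1, C2, f_clk_hz=100e6, f_data_hz=10e6)
faster_clock = synchronizer_mtbf(1.0e-9, C1, C2, f_clk_hz=200e6, f_data_hz=10e6)

if __name__ == "__main__":
    print(f"extra 0.5 ns of tMET multiplies MTBF by {more_slack / base:.3g}")
    print(f"doubling fCLK multiplies MTBF by {faster_clock / base:.3g}")
```

With these assumed constants, extra settling time helps exponentially while a faster clock hurts only linearly in this term. In practice a faster clock also shortens the clock period and therefore the available tMET, so the real penalty is larger.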

That information is useless, and the designer should tell the tool to suppress such warnings. Metastability is a phenomenon that can cause system failure in digital devices, including FPGAs, when signals travel between circuits in uncorrelated or asynchronous clock domains. 6 note the metastable state isn't even a valid logic level. The failure rate of the design is the sum of the failure rates of each chain, where the failure rate is 1/MTBF. The farther the ball is from the top, the faster it will reach steady state at the bottom. Which actually leads you into *another* problem: in those cases you usually have to add a constraint for signals propagating between the two clocks. (I'm presuming this isn't simply a case of accidentally creating a very poorly-defined flip-flop. If the data signal transfer occurs after the clock edge and the minimum value tH, it is similar to dropping the ball on the "old data value" side of the mountain, and the output signal remains at the original value of that clock transfer. In some cases the FPGA does not switch at all, because of timing violations. If you are trying to learn FPGAs, you'll want to read it. When a signal travels between circuits in unrelated or asynchronous clock domains, the signal must be synchronized to the new clock domain before it can be used. FPGA tools certainly can tell when a signal crosses from one domain to another, but they cannot tell if the crossing is OK or might be problematic without a lot more logic. It's worse if you have combinatorial logic before the first flipflop in the new domain, because now you've added more possible asynchronous delays on top of things. Thus crossing clock domains is not just something to consider with high-speed designs, but every time you cross clock domains.
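
Because the design failure rate is the sum of the per-chain rates (each 1/MTBF), the overall MTBF is dominated by the worst chain. The short Python sketch below works this through for two ten-chain designs: one with ten chains of 10,000 years each, the other with nine chains of 1 million years and one chain of 100 years. The function name is illustrative.

```python
def design_mtbf(chain_mtbfs_years):
    """Overall MTBF of a design from the MTBF of each synchronizer chain.

    Failure rates (1/MTBF) add, so the design MTBF is the reciprocal
    of the summed rates.
    """
    total_failure_rate = sum(1.0 / mtbf for mtbf in chain_mtbfs_years)
    return 1.0 / total_failure_rate


design_a = design_mtbf([10_000] * 10)            # ten identical chains
design_b = design_mtbf([1_000_000] * 9 + [100])  # nine good chains, one poor

if __name__ == "__main__":
    print(f"design A: about {design_a:.0f} years")
    print(f"design B: about {design_b:.1f} years")
```

Design A works out to roughly 1,000 years, while design B lands just under 100 years despite nine of its chains being far better, which is why improving the worst chain has the largest effect.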
It's possible to do a 2-phase system whereby an edge from the old domain is used as an event, but it looks identical to the 4-phase; you just skip the clear steps. Anything else is flagged as an error, and then the total number of errors is counted, along with total comparisons. You can observe some of this in the following video. The most classic solution to this problem is shown in Fig. Normally, if the system does get stuck in such an unstable equilibrium state, it tends to resolve itself by shifting to a valid state, either the old or the new value. Convergent paths: where you have a multi-bit control signal being safely passed between domains.
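
The error check used in the counter experiment can be sketched in a few lines of Python. This is only a software analogue of the idea, with a hypothetical function name; the experiment itself runs in FPGA logic. Each sampled counter value is compared with the previous sample and must either be unchanged or exactly one higher.

```python
def count_errors(samples):
    """Return (errors, comparisons) for a sequence of sampled counter values.

    A sample is valid if it equals the previous sample (no count yet) or
    the previous sample plus one (one count occurred). Wrap-around is not
    handled, matching a counter wide enough never to overflow.
    """
    errors = 0
    comparisons = 0
    for previous, current in zip(samples, samples[1:]):
        comparisons += 1
        if current not in (previous, previous + 1):
            errors += 1
    return errors, comparisons


if __name__ == "__main__":
    # 15 (binary 1111) is the kind of value a bad read could produce while
    # a binary counter moves from 7 (0111) to 8 (1000).
    samples = [7, 7, 8, 8, 15, 9]
    print(count_errors(samples))
```

Note that a single corrupted read shows up as two flagged comparisons in this sketch (into the bad value and out of it), so the raw error count slightly overstates the number of bad samples.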

This is a mistake a lot of people make. With 1 flipflop -> 1 flipflop, the only time it happens is when the two clock edges are within the metastability window (less the clock-to-out and propagation time between the two flipflops). The calculated mean time between failures (MTBF) due to metastability indicates whether the designer should take steps to reduce the chance of such failures. 9 isn't enough; you must ensure you are implementing the circuit you intend, not what the synthesis tools guess you are doing.

The timing diagram below shows the timing relationship between CLK1 and CLK2. Of course, this is also a must for a robust system. Metastability in FPGAs is a state that digital electronics systems can find themselves stuck in for a period of time. The flags are then relative to the proper clock input, i.e. 8C aren't so dramatic looking, but you can see the additional time required for the metastable state to resolve. The internal system clock is operating at 40MHz, and the results are calculated for various external frequencies of 100kHz, 1MHz, and 10MHz.

By chance it's very likely that we're going to attempt to read from the ADC while the data is changing. If that is the case, then the gods themselves contend against bad designs!).