

# BIST-BASED GROUP TESTING FOR DIAGNOSIS OF EMBEDDED FPGA CORES

N.C.Sendhil Kumar

Assistant Professor Department of ECE Ranipettai Engineering college Vellore, India.

# Abstract

A group testing-based BIST technique to identify faulty hard cores in FPGA devices is presented. The method isolates faults in embedded cores, as demonstrated by experiments on the Virtex-5 family of Xilinx FPGAs. High-level HDL code is developed to instantiate a Finite State Machine (FSM) which generates the test inputs for the Blocks Under Test (BUTs). The BUTs are divided into groups of four and, at the end of a single stage of testing, up to two faulty BUTs are isolated in each group of four. The experiments show efficient fault isolation with a maximum of 30% area overhead under testing conditions. Isolation of faulty DSP cores is achieved rapidly and without any permanent area cost. The approach can be readily extended to other embedded cores such as Block RAMs and multipliers, thus providing a fast, efficient technique for testing prior to System On a Programmable Chip (SOPC) implementation on state-of-the-art SRAM FPGAs.

Keywords: Embedded Cores, SOPC, FPGA Testing, Group Testing, DSP Hard Cores.

# 1. Introduction

The current generation of 65 nm FPGAs by Xilinx, such as the Virtex-5 platform FPGAs, introduces space-efficient hard IP cores implemented using the column-based Application Specific Modular Block (ASMBL) architecture. The Virtex-5 platform provides anywhere from 32 to 640 embedded DSP48E cores across a range of devices [1]. These cores are designed, placed and routed into the fabric of the FPGA, and have been characterized and verified to optimize performance. Unlike soft IP cores, they allow designers to reserve the Configurable Logic Blocks (CLBs) as general-purpose logic resources and to minimize the space and power required to implement diverse applications, including DSP applications, on FPGAs. The embedded IP cores are characterized by their predictable timing and are optimized to work efficiently, independent of the rest of the design. These cores are highly customizable to the designer's requirements and provide a range of built-in structures for efficient arithmetic calculation and signal processing. All these characteristics lend themselves to a more efficient implementation of an entire system on an FPGA, commonly known as a System On a Programmable Chip (SOPC). The development of FPGAs with an increasing number of embedded hard IP cores drives the need for faster methods of testing the cores for failures.



Since the embedded cores are numerous and distributed throughout the FPGA fabric as an integral part of the computational resources, they require extensive post-manufacturing testing and verification. Hence it is important to develop efficient testing methods that identify hardware faults with minimal latency and resource overheads. This article presents a Built-In Self Test (BIST) based approach that improves on previous techniques by using a single-stage, non-adaptive group testing algorithm to isolate defective embedded cores. Section 2 describes previous work in testing FPGAs using BIST and BIST-inspired methods. Section 3 describes the group testing-based algorithm for fast isolation of faulty embedded cores, and Section 4 provides details from experiments conducted on the Virtex-5 family of devices. Finally, Section 5 completes the article by identifying areas for improvement and providing concluding remarks.

# 2. Related Works

High device component density, along with the trend toward high clock frequency and low power consumption, presents challenges for conventional methods of testing FPGAs. Advances in FPGA production technologies have improved capabilities to the point where FPGAs have dedicated embedded cores, including DSP cores and Block RAMs [1]. The most widely used approach to detecting faults at the chip level in VLSI is to apply BIST at the component level [3,4,5]. The built-in nature of BIST also allows the chip to be tested in a variety of working environments. In BIST, both the Test Pattern Generator (TPG) and the Output Response Analyzer (ORA) are incorporated inside the device. Assuming that all levels of the hierarchy use BIST, each element can test itself and transmit the result to the succeeding level in the hierarchy. BIST also increases controllability and observability by providing access to the internal nodes, since the tester logic is located on the chip. BIST also allows tests to be run at system speed, eliminating the gap between test and operating speed.

BIST has been the conventional choice for testing embedded memory [3, 4]. Conventional ASIC BIST techniques typically incur between 10% and 30% area overhead, along with delay penalties [5]. Therefore, it is essential that the FPGA core test method leverages the reprogrammability inherent in FPGAs. An additional advantage of using the programmable nature of an FPGA to test itself is that the BIST logic can be removed when the circuit is reconfigured for another use, so testability is achieved without permanent area overhead or performance degradation.

There has been considerable research on developing BIST techniques for programmable logic resources in an FPGA, including CLBs [6,7,8] and the interconnect matrix of routing resources [9,10,11]. Abramovici and Stroud [6] presented a BIST architecture to test the CLBs in an FPGA. In their scheme, a column or row of CLBs is configured to generate pseudo-exhaustive test patterns for alternating columns of identically configured CLBs under test. Two identical TPGs are used so that any fault in the CLBs used to construct the TPGs is detected. Comparator-based ORAs monitor the outputs of the BUTs and latch mismatches due to faults. The BUTs are configured and tested in their different modes of operation. Each testing session covers only half of the CLBs, and another session is required to test the other half.

Faulty blocks are indicated by an "x" mark, whereas operational, fault-free blocks are indicated by a "✓" mark. As shown in Figure 1(a) and Figure 1(b), two sets of partitions are created and the outputs of the cores bordered by the dashed lines are compared for discrepancies. Post-processing the results from the scan chain generates the list of faulty embedded cores. The technique keeps resource utilization low and simplifies the post-processing procedure, but it configures the device twice in order to complete fault isolation. It also fails to isolate faulty blocks when there is a defective block in each of the compared pairs.

This paper extends the technique to an automated diagnostic methodology that is applicable to different cores, including DSP cores, and that takes into account their different modes of operation. The method is scalable to different FPGA families, including the Xilinx XtremeDSP products and the Virtex-5 family of FPGAs. Further, because the method is core-independent, it can be readily adapted to provide testing coverage for new families of embedded cores on FPGAs. By generating, comparing, and encoding the outputs produced by the cores in response to the test pattern, complete fault resolution is achieved in a single test.

# 3. Group Testing-Based BIST for Isolating Faulty Cores

The embedded IP cores in the Xilinx Virtex-5 family of devices are distributed throughout the fabric to ensure optimal timing. The proposed BIST technique uses the CLBs adjacent to the embedded cores to realize the TPG and the ORA. Each embedded core constitutes a BUT. The TPG is realized using an FSM that generates the states required for testing the embedded cores. To test the DSP48E cores, the FSM generates approximately 400 states and a 14-bit wide control word for each state. Each state defines a valid combination of control signals.
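The behaviour of such an FSM-based TPG can be illustrated with a small software model. The following Python sketch is only a behavioural illustration, not the HDL used in the experiments: the state count (400) and control width (14 bits) are taken from the text, while the function `control_word` and its state-to-control mapping are hypothetical placeholders, since the actual valid DSP48E control combinations are not listed here.

```python
from typing import Iterator, Tuple

NUM_STATES = 400      # approximate state count quoted in the text
CONTROL_WIDTH = 14    # width of the control word driven in each state


def control_word(state: int) -> int:
    """Hypothetical state-to-control mapping (placeholder, not the real table).

    A real TPG would look up a valid DSP48E control combination here.
    """
    return (state * 41) & ((1 << CONTROL_WIDTH) - 1)


def tpg() -> Iterator[Tuple[int, int, bool]]:
    """Yield (state, control_word, done) once per modelled clock cycle."""
    for state in range(NUM_STATES):
        done = state == NUM_STATES - 1  # DONE is raised in the last state
        yield state, control_word(state), done


vectors = list(tpg())
```

Every BUT in a group receives the same sequence of control words, which is what allows their outputs to be compared against one another rather than against stored expected responses.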

| Device    | DSP48E Cores | Available Slices | Available LUTs | Available FFs | LUTs Used under Test (%) | Flip-Flops Used under Test (%) |
|-----------|--------------|------------------|----------------|---------------|--------------------------|--------------------------------|
| XC5VLX30  | 32           | 4800             | 19200          | 19200         | 1418 (7%)                | 384 (2%)                       |
| XC5VLX50  | 48           | 7200             | 28800          | 28800         | 1862 (6%)                | 408 (1%)                       |
| XC5VLX85  | 48           | 12960            | 51840          | 51840         | 1862 (6%)                | 408 (1%)                       |
| XC5VLX110 | 64           | 17280            | 69120          | 69120         | 2300 (3%)                | 432 (1%)                       |
| XC5VLX155 | 128          | 24320            | 97280          | 97280         | 4058 (4%)                | 528 (1%)                       |
| XC5VLX220 | 128          | 34560            | 138240         | 138240        | 4058 (2%)                | 528 (1%)                       |
| XC5VLX330 | 192          | 51840            | 207360         | 207360        | 5822 (2%)                | 624 (1%)                       |
| XC5VSX35T | 192          | 5440             | 21760          | 21760         | 5822 (26%)               | 624 (2%)                       |
| XC5VSX50T | 288          | 8160             | 32640          | 32640         | 8462 (25%)               | 768 (2%)                       |
| XC5VSX95T | 640          | 14720            | 58880          | 58880         | 18139 (30%)              | 1296 (2%)                      |

Table 1: Resource Utilization Results from Experiments Conducted on the Xilinx Virtex-5 Family of FPGAs





Figure 2. BIST Structure for Testing a Group of Four Blocks Under Test

Each pair of BUTs requires one 48-bit comparator and four 1-bit comparators so that their outputs can be compared for discrepancies. Each comparator holds the value 0xFF to register any mismatch. In addition, for each pair of BUTs, a $2\times1$ multiplexer is used to serialize the results of the comparators. A group of four BUTs contains six distinct pairs, so a total of six $2\times1$ multiplexers are required for every group. This circuitry is further optimized as described in the following section. Figure 2 shows the six comparators k1(i,j) that compare the outputs of the four BUTs in group n = 1. In addition to the TPG and the ORA, the technique uses a test controller, which activates the test routine by asserting the START signal. The test terminates when the DONE signal is asserted, after which the test results are propagated.
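The way six pairwise comparisons can isolate up to two faulty BUTs in a group of four can be sketched in software. The decoding rule below is an assumption made for illustration and is not spelled out in the text above: a BUT is flagged as faulty when it disagrees with all three other BUTs in its group. The sketch further assumes that at most two BUTs per group are faulty and that two faulty BUTs never produce identical responses. The names `compare_group` and `isolate_faulty` are hypothetical, and a single output word stands in for the full response sequence latched by the ORA.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

# The six distinct pairs (i, j) within a group of four BUTs.
PAIRS: List[Tuple[int, int]] = list(combinations(range(4), 2))


def compare_group(outputs: Sequence[int]) -> Dict[Tuple[int, int], bool]:
    """Model the six comparators: True means the pair's outputs mismatched."""
    return {(i, j): outputs[i] != outputs[j] for i, j in PAIRS}


def isolate_faulty(mismatch: Dict[Tuple[int, int], bool]) -> List[int]:
    """Flag a BUT as faulty when it disagrees with all three other BUTs."""
    faulty = []
    for but in range(4):
        involved = [mismatch[pair] for pair in PAIRS if but in pair]
        if all(involved):
            faulty.append(but)
    return faulty


golden = 0x0000_1234_5678  # arbitrary 48-bit fault-free response
outputs = [golden, golden ^ 0x1, golden, golden ^ 0x80]  # BUTs 1 and 3 faulty
print(isolate_faulty(compare_group(outputs)))  # prints [1, 3]
```

With two faulty BUTs, the two fault-free BUTs still agree with each other, so each of them has at least one matching comparison and is cleared. If three or more BUTs in a group were faulty, every comparison could mismatch and this rule could no longer tell them apart.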

# 4. Fault Isolation Experiments on Virtex-5 FPGAs

As a particular example of the BIST technique, experiments were conducted on the Virtex-5 family of Xilinx FPGAs. The testing of an XC5VLX30 device provides the following case study which further elaborates the procedure. The XC5VLX30 device consists of 32 DSP48E embedded cores, with 4800 slices that provide 19200 LUTs. The m = 32 embedded cores on the XC5VLX30 device are divided into n = 8 groups. Since six 2-to-1 multiplexers are required for each group, a total of 48 such multiplexers are required. However, the synthesized design optimally uses six 8-to-1 multiplexers.
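The counts quoted for the XC5VLX30 follow directly from the group size. The short Python sketch below reproduces that arithmetic; the function name `bist_structure` and the dictionary keys are hypothetical, and the only inputs taken from the text are the group size of four and the core count of 32.

```python
from math import comb

GROUP_SIZE = 4  # BUTs per group, as used throughout the paper


def bist_structure(num_cores: int) -> dict:
    """Count the groups, compared pairs and 2-to-1 multiplexers for a device."""
    if num_cores % GROUP_SIZE:
        raise ValueError("core count must be a multiple of the group size")
    groups = num_cores // GROUP_SIZE
    pairs_per_group = comb(GROUP_SIZE, 2)  # 4 choose 2 = 6 compared pairs
    return {
        "groups": groups,
        "pairs_per_group": pairs_per_group,
        "two_to_one_muxes": groups * pairs_per_group,
    }


print(bist_structure(32))
# prints {'groups': 8, 'pairs_per_group': 6, 'two_to_one_muxes': 48}
```

The 48 two-input multiplexers correspond to six result streams of eight groups each, which is consistent with the synthesized design collapsing them into six 8-to-1 multiplexers.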





Figure 3. BIST Structure used for Testing the XC5VLX30 Device

Further, the solution was implemented on various devices of the Virtex-5 family with the Xilinx ISE synthesis tool. Table 1 summarizes the resource usage for each of these devices. As listed in Table 1, for the XC5VSX95T device, which contains 640 DSP48E embedded cores, the LUT utilization during testing is approximately 30%. In Table 1, all utilization percentages below 1% have been rounded up to 1%. Each slice in the Virtex-5 family of FPGAs contains four 6-input LUTs and four flip-flops.
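As a quick check of how the utilization figures relate to the slice counts, the following Python sketch recomputes the available LUTs and the LUT utilization for two rows of Table 1. The function name `lut_utilization` is hypothetical, and the input values are copied from the table.

```python
from typing import Tuple

LUTS_PER_SLICE = 4  # four LUTs per Virtex-5 slice


def lut_utilization(slices: int, luts_used: int) -> Tuple[int, float]:
    """Return (available LUTs, percentage of LUTs used under test)."""
    available = slices * LUTS_PER_SLICE
    return available, 100.0 * luts_used / available


# XC5VSX95T: 14720 slices, 18139 LUTs used under test (Table 1)
available, pct = lut_utilization(14720, 18139)
print(available, round(pct, 1))  # prints 58880 30.8

# XC5VLX30: 4800 slices, 1418 LUTs used under test (Table 1)
available, pct = lut_utilization(4800, 1418)
print(available, round(pct, 1))  # prints 19200 7.4
```

The computed values agree with the 30% and 7% entries in Table 1 once the fractional part is dropped.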

# 5. Conclusion

Embedded cores within FPGAs provide improved performance by optimizing area and power consumption. With improvements in the process technology, the smaller geometries will drive the inclusion of an increasing number of diverse hard IP blocks in FPGAs. As shown in this article, the XC5VSX95T device in the Virtex-5 family contains 640 DSP cores and 488 Block RAM cores. This shows the need for efficient fault isolation techniques to diagnose these devices to improve yields and facilitate faster debugging. The demonstrated technique achieves the goal of fast detection and isolation of faults by leveraging a group testing technique that isolates faulty embedded cores in a single-step procedure that precludes the need for device reconfiguration. The approach is scalable at the cost of temporary area overhead. However, no permanent area cost or performance overheads are incurred as a result of testing. The presented technique can be used in conjunction with other existing methods for isolating faults in interconnect and CLBs to provide complete post-manufacturing testing for FPGAs with embedded cores. Additionally, the method is also suitable for periodic offline testing of embedded cores in operational devices.

# References

1. Xilinx, Inc., Virtex-5 Family Overview – LX, LXT, and SXT Platforms, December 2007.



2. Y. Zorian, E.J. Marinissen, S. Dey, "Testing Embedded-Core Based System Chips," Proceedings International Test Conference, pp. 130-141, 1998.

3. K. Saluja, S. Eng, and K. Kinoshita, "Built-In Self Testing of Embedded Memories," IEEE Design & Test of Computers, Oct. 1986, pp.27-37.

4. P. Camurati, P. Prinetto, M.S. Reorda, S. Barbagallo, A. Burri, D. Medina, "Industrial BIST of Embedded RAMs," IEEE Design & Test of Computers, Autumn/Fall 1995, Vol. 12 Issue 3, pp. 86.

5. C. Stroud, P. Chen, S. Konala, and M. Abramovici, "Built-in self-test for programmable logic blocks in FPGAs (finally, a free lunch: BIST without overhead!)," Proc. IEEE VLSI Test Symp., pp. 387-392, 1996.

6. M. Abramovici, C. Stroud, "BIST-Based Test and Diagnosis of FPGA Logic Blocks," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol.9, No.1, pp. 159-172, 2001.

7. E. Atoofian and Z. Nabavi, "A BIST Architecture for FPGA Look-Up Table Testing Reduces Reconfiguration," Proceedings of the 12th Asian Test Symposium, pp. 84-89, 2003.

8. M. Niamat and P. Mohan, "Logic BIST architecture for FPGAs," Proceedings of the 44th IEEE Symposium on Circuits and Systems, Vol. 1, pp. 442 – 445, 2001.

9. J. Liu and S. Simmons, "BIST-Diagnosis of Interconnect Fault Locations in FPGAs," CCECE 2003.

10. A. Doumar and H. Ito, "Detecting, Diagnosing, and Tolerating Faults in SRAM-Based Field Programmable Gate Arrays: A Survey," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol.11, No.3, pp. 386-405, June 2003.

11. M. Renovell, J. M. Portal, J. Figueras, Y. Zorian, "Testing the Local Interconnect Resources of SRAM-Based FPGAs," Journal of Electronic Testing: Theory and Applications, Vol. 16, pp. 513-520, 2000.

12. C. Stroud, S. Garimella, "Built-In Self-Test and Diagnosis of Multiple Embedded Cores in SoCs," Proceedings of The 2005 International Conference on Embedded Systems and Applications, ESA 2005, pp. 130-136, Las Vegas, Nevada, USA, June 2005.

13. S. Garimella, C. Stroud, "A system for Automated Built-In Self-Test of Embedded Memory Cores in System-on-Chip," Proceedings of the Thirty-Seventh Southeastern Symposium on System Theory, pp. 50-54, 2005.