

# QREAL FPGA Report

Finn Moore  
fdm@umich.edu

University of Michigan

## Introduction

Quantum information science (QIS) has revolutionized communication, sensing, and computing capabilities by leveraging the unique properties of quantum states. Quantum receivers play a pivotal role in improving performance by enabling enhanced information processing tasks such as quantum-state discrimination and parameter estimation. Traditional analytical approaches to Quantum Receivers struggle to adapt to diverse operational conditions and are quickly overwhelmed by noise in practical scenarios. To address this challenge, we utilize the Quantum Receiver Enhanced by Adaptive Learning (QREAL) architecture. QREAL leverages reinforcement learning to adaptively design quantum receivers capable of tackling a wide range of QIP problems and environmental conditions. Our results showcase the ability of QREAL to adapt to noise and imperfections, thereby outperforming conventional quantum receivers and classical approaches.

## Methodology

### 1 Receiver Architecture



Figure 1: Receiver Architecture

The receiver was developed in three parts (1). The quantum hardware system was developed using a phase-locking interferometer and a superconducting nanowire single-photon detector (SNSPD). The FPGA control unit was developed in Verilog using Xilinx Vivado and deployed onto a Red Pitaya board. The adaptive learning model was developed in Python using the TensorFlow library.

### 2 FPGA Controller

#### 2.1 Introduction

A field programmable gate array (FPGA) is an integrated circuit composed of many logic elements formed by look-up tables (LUTs). Due to the D-Flip-Flop in each of these logic elements as shown in 3,



Figure 2: Optical Setup [1]



Figure 3: Basic Logic Element [2]

each LUT can either produce their output sequentially or combinationaly, based on the programming of the FPGA. These basic logic elements are connected to form configurable logic blocks (4), and an array of configurable logic blocks connected to inputs and outputs forms an FPGA (5). An FPGA is programmed in a hardware description language (HDL) like Verilog or VHDL, where the programmer directly defines the digital logic that can be synthesized and translated into hardware. Once an HDL module is synthesized, the SRAM cells of the basic logic elements in the FPGA are populated with the values required to implement the programmed logic function.

## 2.2 Red Pitaya

The FPGA board selected for this project is the Red Pitaya STEMlab 125-14 as seen in 6. This board includes an Xilinx Zynq 7010 FPGA and a Dual-Core ARM Cortex-A9 MPCore processor for running Linux. The board also includes two SMA inputs and outputs. The inputs are used for synchronizing the locking circuit and for the SNSPD. The outputs control the phase and amplitude modulators in the local oscillator (LO) arm of the fiber optic circuit (2). The Red Pitaya is also equipped with a 125 MS/s 14-bit analog-to-digital converter (ADC). For this experiment, the Red Pitaya was programmed using Xilinx Vivado incorporating open-source and custom Verilog cores. The Linux processor was also used to run custom C programs for setting and reading BRAM values to communicate with the machine learning model.



Figure 4: Configurable Logic Block [2]



Figure 5: FPGA [2]



Figure 6: Red Pitaya STEMlab 125-14

## 2.3 Block Design

The block design (7) for the FPGA was developed in Xilinx Vivado and consists of four main components. Each of these modules works together to implement the data processing and communication between the formulator and the optical setup. The four main components are as follows:

### 2.3.1 DataAquisition (8)

This module uses the open-source AXI4-Stream core `axis_red_pitaya_adc_0` developed by Anton Potochnik [3] and a signal splitter to read the synchronization and SNSPD values from the ADC.

### 2.3.2 Xilinx Processing System (9)

Next is the processing system PS7 consisting of a ZYNQ IP core (`processing_system7_0`) allowing the FPGA to interface with the Linux processor.

### 2.3.3 Control (10)

The most important component, `control`, consists of many different sub-modules. The first of which (`axi_gpi_0`) is another open-source module enabling the use of a GPIO address on the Red Pitaya in conjunction with a C script to control whether block memory (BRAM) is being read or written to. This controls the bit enabling the `axis_bram_writer_0` module to write data to BRAM. In the case where the C program tells the FPGA to write BRAM, it takes the detection data from our custom Verilog module, `QREAL_42_BPSK_0` (A). This module consists of a state machine that implements the QREAL logic: for every cycle, for every bin, count the detections from the SNSPD and adjust the displacement for the signal generator. Finally, store the detections in BRAM to be sent to the machine learning model.

### 2.3.4 Signal Generator (11)

The fourth important module is the Singal Generation Module. This module actuates the displacement adjustment specified by the QREAL logic and creates signals to adjust the phase and amplitude modulators of the optical setup. It does this by reading BRAM from an address specified by the `control` unit and inputting that data to an open source DAC module [3] `axis_bram_reader_0` to be sent on the SMA outputs.



Figure 7: Overall Block Design



Figure 8: DataAcquisition Module



Figure 9: Processing System Module



Figure 10: Control Module



Figure 11: Signal Generation Module

## 2.4 C Programs

For communicating with the formulator, we need a way to take inputs and outputs from a regular operating system. To do this, we leverage the Linux chip present on the Red Pitaya board and its connection to the FPGA via the open-source AXI cores present in our block diagram allowing the reading and writing of GPIO and BRAM. To access the BRAM of the Red Pitaya board, the C scripts must utilize a pointer to the BRAM address specified in the block design for memory-mapped IO. The values in this range can be accessed through pointer arithmetic and bit shifting as seen in B.

## Simulation Results

Our model shows significant performance benefits over heterodyne in simulation.



Figure 12: Velocimetry Mean Square Error Performance



Figure 13: Velocimetry Mean Absolute Percentage Error Performance

## Outlook

The outlook for this project is promising, marked by advancements in quantum receiver technology facilitated by the QREAL architecture. Leveraging reinforcement learning, QREAL has demonstrated adaptability to diverse operational conditions and noise environments, showcasing improved performance compared to traditional analytical approaches. Moving forward, key areas of focus for improvement include addressing noise sources in the optical setup and refining the code used to facilitate the experiment. By continuing to innovate and refine these technologies, this project is positioned well to contribute significantly to the advancement of QIS and its applications in communication, sensing, and computing domains.

## References

- [1] Chaohan Cui, William Horrocks, Shuhong Hao, Saikat Guha, Nasser Peyghambarian, Quntao Zhuang, and Zheshen Zhang. Quantum receiver enhanced by adaptive learning. *Light: Science & Applications*, 11(1):344, December 2022.
- [2] Umer Farooq, Zied Marrakchi, and Habib Mehrez. FPGA Architectures: An Overview. In *Tree-based Heterogeneous FPGA Architectures*, pages 7–48. Springer New York, New York, NY, 2012.
- [3] Anton Potocnik. redpitaya\_guide. [https://github.com/apotocnik/redpitaya\\_guide](https://github.com/apotocnik/redpitaya_guide), 2019.

## A QREAL\_42\_BPSK.v

```
1 module QREAL_42_BPSK(
2   input  clk,
3   input  synch,
4   input  signed [15:0] snsdp,
5   output [31:0] result,
6   output  synch_out,
7   output [9:0] add_result,
8   output [9:0] add_disp,
9   output  wren
10  //output done
11  );
12  //parameters
13  reg signed [15:0] THRESHOLD = 16'd1600;
14  parameter [9:0] ADD_LOCKING = 1022;
15  //regs for values
16  reg [31:0] reg_result = 0;
17  reg signed [15:0] reg snsdp;
18  reg reg_synch = 0;
19  reg [9:0] reg_add_result = 0;
20  reg [9:0] reg_add_disp = 0;
21  reg [9:0] reg_add_disp_real = 0;
22  reg reg_wren = 0;
23  reg reg_photoncount = 0;
24  reg [9:0] reg_Q = 10'b1011010010;
25  //regs counters
26  reg [10:0] reg_counter_framedelay = 0;
27  reg [9:0] reg_counter_bin = 0;
28  //regs states
29  reg [1:0] reg_state = 0;
30  reg [1:0] reg_biniidx = 0;
31  reg [1:0] reg_cycleidx = 0;
32
33  reg [31:0] reg_test = 100;
34
35  always @(posedge clk) begin
36    reg snsdp <= snsdp;
37    reg synch <= synch;
38    case(reg_state)
39      2'b00: begin // locking & modulation
40        if(!reg_synch && synch) begin // rising edge of sync
41          reg_state <= 2'b01;
42          reg_add_result <= reg_add_result + 1;
43          reg_result <= reg_Q[9];
44        end
45      end
46
47      2'b01: begin // waiting shutter goes off
48        if(reg_counter_framedelay == 11'd2000) begin // when delay is over
49          reg_counter_framedelay <= 0;
```

```

50         reg_state <= 2'b10;
51         reg_add_disp <= 0;
52         reg_add_disp_real <= reg_Q[9]*100;
53     end
54     else begin
55         reg_counter_framedelay <= reg_counter_framedelay + 1;
56     end
57 end
58
59 2'b10: begin // qreal logic
60     if(reg_counter_bin == 10'd625) begin // if the bin is over
61         reg_counter_bin <= 0;
62         reg_result <= (reg_result << 1) + reg_photoncount;
63         reg_photoncount <= 0;
64         if(reg_binididx == 2'b11) begin // if the cycle is over
65             reg_binididx = 2'b00;
66             reg_add_disp <= 0;
67             reg_add_disp_real <= reg_Q[9]*100;
68             if(reg_cycleidx == 2'b10) begin // if the last cycle is over
69                 reg_state <= 2'b11;
70                 reg_cycleidx <= 2'b00;
71                 reg_add_disp <= ADD_LOCKING;
72                 reg_add_disp_real <= ADD_LOCKING;
73             end
74         else
75             reg_cycleidx <= reg_cycleidx + 1;
76     end
77     else begin // if the cycle is not over
78         reg_binididx <= reg_binididx + 1;
79         reg_add_disp <= (reg_add_disp << 1) + 1 + reg_photoncount;
80         reg_add_disp_real <= (reg_add_disp << 1) + 1 + reg_photoncount + reg_Q[9]*100;
81     end
82 end
83 else begin // within a bin
84     if((reg_snspd < THRESHOLD) && (snspd > THRESHOLD) && (reg_counter_bin > 125))
85         reg_photoncount <= 1;
86     reg_counter_bin <= reg_counter_bin + 1;
87 end
88 end
89
90 2'b11: begin // storing result
91     if(reg_synth && !synch) begin // falling edge of sync
92         reg_state <= 2'b00;
93         reg_wren <= 0;
94         reg_Q <= (reg_Q << 1) + reg_Q[9]^reg_Q[6];
95     end
96     else begin
97         reg_wren <= 1;
98     end
99 end
100 endcase
101 end
102
103 assign result = reg_result;
104 assign add_result = reg_add_result;
105 assign add_disp = reg_add_disp_real;
106 assign wren = reg_wren;
107 endmodule

```

## B QREAL\_42\_BPSK.c

```

1 #include <stdio.h>
2 #include <stdint.h>
3 #include <unistd.h>
4 #include <sys/mman.h>
5 #include <fcntl.h>
6 #include <string.h>
7 #include <stdlib.h>
8
9 int main(int argc, char **argv)
10 {
11     int fd;
12     FILE* fp;
13     void *rd;
14     void *ctrl;
15     void *wr;
16     char *name = "/dev/mem";
17     const int freq = 124998750; // Hz

```

```

18     uint32_t buffer[1024];
19     uint32_t j;
20     uint32_t b;
21     uint32_t i;
22     uint32_t c;
23     uint32_t result[17] = {0};
24     uint32_t a[32];
25     int32_t dispa[15] =
26         {2981, 3003, 3542, 3575, 3487, 3685, 3542, 3872, 3366, 3685, 3685, 3597, 3036, 3685, 3630};
27     int32_t dispp1[15] = {4664, 4664, 0, 4664, 0, 0, 4664, 4664, 0, 0, 4664, 0, 4664, 4664, 0};
28     int32_t dispp2[15] = {0, 0, 4664, 0, 4664, 4664, 0, 0, 4664, 4664, 0, 4664, 0, 0, 4664};
29
30     printf("One\n");
31     if((fd = open(name, O_RDWR)) < 0)
32     {
33         perror("open");
34         return 1;
35     }
36     if (!(fp = fopen("result.txt", "w")))
37     {
38         printf("Error opening file\n");
39     }
40     printf("Two\n");
41     rd = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x40000000
42         );
43     wr = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x40010000
44         );
45     ctrl = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0
46         x41200000);
47     printf("Three\n");
48
49 // -1~1V -8192~8191
50     for(j = 0; j < 15; ++j)
51     {
52         (*(int32_t *) (wr + 4*j))) = (dispa[j]<<16) + dispp1[j] ;
53     }
54     for(j = 0; j < 15; ++j)
55     {
56         (*(int32_t *) (wr + 4*(100+j)))) = (dispa[j]<<16) + dispp2[j] ;
57     }
58     (*((int32_t *) (wr + 4*1022))) = 0 + (341<<16);
59     (*((uint32_t *) (ctrl))) = 1<<0; // wren high
60     sleep(2);
61     (*((uint32_t *) (ctrl))) = 0<<0; // wren low
62
63     for(j = 0; j < 1024; ++j)
64     {
65         buffer[j] = (*((uint32_t *) (rd + 4*j)));
66     }
67
68     printf("Five\n");
69     for(j = 0; j < 1024; ++j)
70     {
71         b = buffer[j];
72         fprintf(fp, "%d\n", buffer[j]);
73         for(i = 0; i < 17; ++i)
74         {
75             result[i] = b%2;
76             b = b/2;
77         }
78         printf("%u\t", result[16]);
79         for(i = 0; i < 4; ++i)
80             printf("%u", result[15-i]);
81         printf("\t");
82         for(i = 0; i < 4; ++i)
83             printf("%u", result[11-i]);
84         printf("\t");
85         for(i = 0; i < 4; ++i)
86             printf("%u", result[7-i]);
87         printf("\t");
88         for(i = 0; i < 4; ++i)
89             printf("%u", result[3-i]);
90         printf("\n");
91     }
92
93     munmap(rd, sysconf(_SC_PAGESIZE));
94     munmap(ctrl, sysconf(_SC_PAGESIZE));

```

```
86     munmap(wr, sysconf(_SC_PAGESIZE));
87     return 0;
88 }
```