A Conformance Testing Methodology and System for Cognitive Radios

The fifth generation (5G) of mobile networks has started its operation in some countries and is aimed at meeting demands beyond the current system capabilities, such as the huge number of connected devices from IoT applications (e.g., smart cities), the explosive growth of high-speed mobile data traffic (e.g., ultrahigh definition video streaming), and ultrareliable and low latency communication (e.g., autonomous vehicles). To meet these needs, the electromagnetic spectrum must be made available, but the static spectrum allocation policy has caused a spectrum shortage and impaired the deployment and expansion of wireless systems. To overcome this issue, dynamic spectrum access (DSA), which is enabled by the cognitive radio (CR) technology, has been promoted in 5G/6G networks. Although diverse mechanisms have been developed to tackle the challenges that emerge in different CR layers/functionalities, a standardized testing methodology and system for CR is still immature. Existing standards, methodologies, and systems for CR focus only on the definition of network technologies (e.g., IEEE 802.22 and IEEE 802.11af), the performance evaluation of CR algorithms/mechanisms, or the definition of the device cognition level via performance results or psychometric approaches. They do not cover systems/methodologies to verify whether a device meets the CR capabilities and regulatory policies, and thus neglect conformance testing. In this respect, this paper proposes a flexible methodology and system for CR conformance testing under two perspectives: functionalities and limits. We instantiate it by using the Universal Software Radio Peripheral (USRP) software-defined radio platform and present a proof-of-concept with a conformance metric. The results show the feasibility of our proposal.


Introduction
The cellular systems have evolved from a voice-centric analog network with only voice service in the first generation (1G) of mobile communication systems to an all-IP digital Long Term Evolution (LTE)-based network that offers a plethora of services such as voice, data, high-definition multimedia, and smooth global roaming with lower cost in the fourth generation (4G). The fifth generation (5G) has started its operation in some countries and is aimed at meeting demands beyond the current system capabilities, such as the huge number of connected devices from IoT applications (e.g., smart cities) and device-to-device communication (e.g., factory automation), the explosive growth of high-speed mobile data traffic (e.g., ultrahigh definition video streaming and virtual reality applications), and ultrareliable and low latency communication (e.g., telesurgery and autonomous vehicles) [1]. To address these needs, the electromagnetic spectrum must be made available [2].
A solution covered in Release 15 of the New Radio (NR) access technology is to adopt the higher spectrum bands (above 6 GHz) via millimeter wave (mmWave) communications [3] or new ultrahigh-frequency bands (THz and visible light), which are expected in 6G networks [4]. However, due to the high path loss at these frequencies, signals can be severely attenuated when facing obstructions in non-line-of-sight scenarios [5], which limits the supported applications or may increase the capital expenditure (CAPEX) needed to achieve reasonable signal coverage.
Another way is to take advantage of underutilized sub-6 GHz bands [4], i.e., those that are not being used all the time, mainly in rural areas that lack infrastructure or economic interest from the operators. However, the static spectrum allocation policy, which assigns spectrum to the primary users (PUs) (e.g., cellular and TV broadcasting operators) for exclusive use, has caused a spectrum shortage and impaired the expansion of wireless systems. To overcome this issue, dynamic spectrum access (DSA) and cognitive radio (CR) have been promoted in 5G/6G networks because they allow wireless systems, called secondary users (SUs), to access the licensed bands opportunistically, i.e., when the PUs are not using them, without causing interference to the primary users [6]. In this direction, companies such as Ericsson and Nokia have launched infrastructure products that allow 4G bands to be shared dynamically with 5G systems, accelerating their deployment [7,8].
To do so, CR requires two main capabilities: cognition and reconfigurability. The former addresses the ability of sensing the spectrum (e.g., available bands detection), analyzing the collected information (e.g., band capacity estimation) and the user's demand to decide on the spectrum band, protocols, and transmission parameters to be adopted in the communication. The latter refers to the capacity of adjusting the transmission parameters (e.g., transmission power, modulation scheme, and carrier frequency) and protocols via software, with no hardware modification [9].
Although diverse mechanisms have addressed the challenges that emerge in different CR layers and functionalities such as spectrum sensing [10], spectrum mobility [11], packet routing [12], media access control, and security [13], there is a lack of standardized testing methodology and system for CR. Existing standards for CR only focus on the definition of network technologies (e.g., IEEE 802.11af and IEEE 802.22), not covering system/methodology to assert if a given device meets the CR capabilities and regulatory policies, which is essential to launch CR devices in the market.
Moreover, current testing methodologies and systems just target the performance evaluation of algorithms/mechanisms for CR (e.g., in terms of usual metrics such as throughput, packet loss, spectrum utilization, and interference) or the definition of the device cognition level by mapping the performance results into cognition levels [14] or via psychometric approach, such as the Cattell-Horn-Carroll (CHC) intelligence model [15], neglecting the conformance testing of CR devices.
In this respect, we propose a flexible methodology and system for CR conformance testing that analyzes the device conformity under two perspectives. First, by checking if the device under test (DUT) is able to perform a given functionality (e.g., spectrum sensing and spectrum mobility) or desired action. Second, by verifying if the DUT operates (does the target functionality/action) within the defined limits. We instantiate it by using the Universal Software Radio Peripheral (USRP) software-defined radio (SDR) platform and present a proof-of-concept with a conformance metric. The results show the feasibility of our proposal. This paper is organized as follows. Section 2 presents works on methodologies and test systems for CR. Section 3 describes the proposed conformance testing system and methodology, how it was instantiated, and two test cases used as proof-of-concept to demonstrate its feasibility. Section 4 presents the metrics and results obtained in the tests. Section 5 concludes this paper and presents future directions.

Related Work
The evaluation of cognitive radios comprises different aspects (e.g., metrics, test environment type, and test purpose) and has received attention from the academia. For instance, in [16], the authors present testing cases for CR, metrics, utility functions, cognitive engines (CEs), and their performance. They classify the metrics into three levels: node, network, and application and propose the radio environment map-based scenario-driven testing (REM-SDT) for evaluating the CR. The REM is a database with multidomain information, such as available services, spectral regulations, past experience, locations, and radio device activities. The paper focuses on the performance evaluation of mechanisms for CR via simulation.
A USRP-based CR platform that supports multiple test cases and allows different CR characteristics to be measured (e.g., channel movement time, channel closing transmission time, and interference detection threshold) is presented in [17]. The authors study the radar test signal detection via 802.11h off-the-shelf devices and point out the problems found in the devices.
By combining emulation and over-the-air testing in a shielded box, [18] proposes a virtual electromagnetic environment that allows testing devices in multidimensional scenarios (e.g., simultaneous use of multiple frequencies, multiple users, MIMO systems, and different radio channel characteristics). The authors deal with the environment complexity by adopting a multilevel design. Some examples of scenarios are presented, but a proof-of-concept is not addressed.
A Smart Grid testbed that adopts a real-time digital simulator and software-defined radio to support both power system and CR-based communication system is proposed in [19]. To show its feasibility, a bus power system with one wind farm and CR-based communication that uses a machine learning algorithm for spectrum sensing is instantiated and evaluated in terms of communication latency and voltage stability. By focusing on performance evaluation, the proposal is able to address other CR functionalities and smart grid features.
In [20,21], two low-cost SDR-based testbeds are proposed. The former focuses on CR tests for LTE and LTE-A networks and deals with the spectrum management problem, evaluating the link throughput of the cognitive radio network when the proposed spectrum band allocation algorithm is employed. The latter addresses the potential of adopting SDR in multimedia communication. To do so, video and audio file transmissions are performed by using GNU Radio and USRP kits. Although the authors claim that the proposed testbed is designed for CR networks, no CR functionalities were considered in the communication, and the evaluation of performance, conformance, or cognition is neglected.
Testing and evaluation methodologies for CR are studied in [22]. The authors propose the Cognitive Radio Testing System (CRTS), which evaluates the CR performance under different scenarios, and adopt the Cognitive Radio Network Testbed (CORNET) [23] to instantiate the CRTS.
By addressing the lack of a common approach for evaluating signal detection methods in spectrum sensing, [24] proposes a seven-step methodology (from detection method identification to result analysis) to evaluate these methods quantitatively in simulation or practical experiments. The methodology is applied to an experimental performance evaluation with nine signal detection methods, and metrics such as complexity, noise level sensitivity, and minimal detectable signal are analyzed. Its applicability is centered on spectrum sensing and performance analysis, not covering other functionalities such as spectrum mobility and power control or conformance testing.
To develop CR prototypes faster, cheaper, and easier, [25] proposes a visual programming tool that encompasses protocols, security mechanisms, and individual modules for CR functionalities. The tool generates software code for simulation and emulation environments automatically. Similar to the previous works, the authors focus on the performance evaluation of CR and deal with the performance versus overhead (complexity) trade-off.
A Cognitive Radio Test Methodology (CRATM) that infers the device cognition based on the PU and SU performances is presented in [26]. The cognition is defined according to the SU capacity of improving its transmission rate and reducing the interference to the PU (inferred from the PU throughput). To do so, the users (SU and PU) are implemented using the Wireless Open-Access Research Platform.
By considering the CR and human cognitions as analogous, [14,15] measure the CR device intelligence via psychometric approaches. The former uses item response models to evaluate the CR performance and investigates the cognition properties of each cognitive engine (CE) item. The latter evaluates a CE based on the Cattell-Horn-Carroll intelligence model [17]. Through the performance analysis, the model identifies and quantifies the intelligence factors and cognitive abilities of the device under test and thus points out the aspects that are in accordance with the CR nature. Although interesting, the current psychometric proposals just evaluate subsets of the CE and do not cover the whole integrated system.
CR tests may be classified into three categories [14]: research and development (R&D), regulation (compliance), and consumer (end user). The first category comprises the tests performed by academia in the early stages and is aimed at evaluating the CR designs and optimizing their architectures, parameters, and algorithms. The second encompasses the conformance tests that verify whether the CR presents the required functionalities/behavior and does not violate the regulatory policies and standards defined by the agencies (e.g., the Federal Communications Commission in the USA), industries, or even scientific research. The last category involves tests that allow the end user to decide on the product use. Although these categories include the system performance evaluation, they present some differences. The R&D category uses performance tests to optimize the cognitive engine's structure, algorithms, and parameters. The second carries out evaluations to address additional features of the final product, such as the energy consumption and interactions with other components or systems. Finally, in the third category, the performance tests are user-oriented and must be fast and accurate, focusing on the worst cases.
While the R&D category has received great attention from academia and industry, the others have been neglected. Our paper differs from previous studies since it addresses this gap in the second category by proposing a system and methodology for CR conformance testing.
As noted, several works have addressed CR testing in the literature. Table 1 summarizes the characteristics of the presented works. In general, they differ in terms of purpose (e.g., performance evaluation of mechanisms for CR [16] or cognition level determination [15,27]), technique adopted to represent the radio environment (e.g., simulation/emulation [16], experimentation [17], or hybrid approach [18]), and analyzed metrics, for example. However, conformance testing for CR is still little explored, with no mature study available, although it is fundamental to assert whether CR devices meet the regulatory policies and have the functionalities/abilities needed to operate without causing harmful interference to the PU. In this respect, the next section presents the proposed system and methodology for CR conformance testing.

Proposed System and Methodology for CR Conformance Testing
Cognitive radio provides flexibility and intelligence to the devices via radio softwarization. Introducing it into products requires systematically checking conformity to the defined standards and verifying the impact in real environments, which makes the testing process essential. Software testing may be divided into structural and functional. In the former, the internal structure of the product is known, which allows specific pieces of a component to be checked. It aims to test the software by taking into account all the knowledge about the product (e.g., running each instruction at least once, including all branches and loops). The latter analyzes the externally observed functionality based on the product specification. It is also called black-box or conformance testing and adopts input and output values to determine whether the built product is right and satisfies the defined standards and regulatory policies [28].
In order to standardize the conformance testing of open systems, the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) developed the norm ISO IS-9646: "OSI Conformance Testing Methodology and Framework," which provides a structure for the specification of conformance tests and the procedures to be followed during their execution, leading to comparability and wide acceptance of the results produced by different laboratories [29]. Our methodology/system follows the norm ISO IS-9646 by structuring the conformance testing into three stages. In the first stage, a set of abstract tests (implementation independent) for CR is defined. Information such as definition and applicability, minimum requirements, test purpose, initial conditions, procedure, and test output requirements make up each test description. The second stage comprises the test implementation, i.e., the transformation of the abstract tests into executable ones that may run in real devices or in a testing system. Details of this stage are presented in Section 3.2. The last stage is the test execution, in which the behavior of the system under test (SUT) is observed and its conformity is checked. The results are registered in the Protocol Conformance Test Report (PCTR) [28].
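The fields that make up an abstract test description can be captured in a small data structure. The sketch below is only an illustration: the class name, the field names, the choice of which fields may stay empty, and the sample values are assumptions, not part of ISO IS-9646 or of the proposed system.

```python
from dataclasses import dataclass, field, fields
from typing import List

# Assumption: these fields may be empty for basic tests.
OPTIONAL_FIELDS = ("minimum_requirements", "initial_conditions")


@dataclass
class AbstractTestCase:
    """Implementation-independent description of one CR conformance test."""

    name: str
    definition_and_applicability: str
    test_purpose: str
    procedure: List[str]
    output_requirements: List[str]
    minimum_requirements: List[str] = field(default_factory=list)
    initial_conditions: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of mandatory description fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name not in OPTIONAL_FIELDS and not getattr(self, f.name)
        ]


# Hypothetical description of a sensing/transmission adaptation test.
sensing_test = AbstractTestCase(
    name="adaptive-sensing-transmission",
    definition_and_applicability="DUT adapts its sensing and transmission intervals via software.",
    test_purpose="Check that the DUT applies the configured sensing and transmission times.",
    procedure=["Attach DUT and reference", "Run both with the same times", "Collect logs"],
    output_requirements=["Event log with one timestamp per stage change"],
)
print(sensing_test.missing_fields())  # prints []
```

A description that reports no missing fields is ready for the second stage, in which it is turned into an executable test.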
The architecture of the proposed conformance testing system is shown in Figure 1. The radio environment (primary and/or secondary communications) required for the tests is created by using software-defined radio (SDR). To perform the conformance testing under the functionality perspective, two devices are considered: the device under test (DUT) and the reference one. The former refers to the device being tested, and the latter is the device adopted as reference, which is calibrated and has the target/desired CR behavior. In order to be approved, the DUT has to follow the reference device behavior, i.e., perform the desired functionality/action properly. Under the operation limit view, the DUT passes the test if it operates within the defined limits. This perspective is commonly adopted in the conformance testing of mobile devices. Differently from other systems that focus on the device performance evaluation, our proposal inspects the DUT consistency by comparing it to a reference device (that meets the market standards) and its operation values to the defined limits (e.g., given by regulation and business policies). It is worth mentioning that even if no reference device (cognitive radio) is available on the market, our system is feasible because it admits that a logical device (comprising desired behavior and operation limits) may be considered.
The architecture presents three databases (DBs). The first ("Test Cases") contains the test cases, which may be organized in macro areas such as spectrum sensing, spectrum mobility, location awareness, and power control. Their input parameters may be set by the user, taking into account regulation policies and local transmission parameters (e.g., bandwidth, carrier frequency, transmission power, and type of primary signal). The test cases address specific aspects of the CR functionalities and complement each other, providing a fine-grained way to check in which aspects of a given functionality the DUT passed/failed.
The second database ("Result Evaluation Scripts") stores the scripts used to evaluate the test case results. Different tests may require different evaluation scripts. Thresholds and regulatory policies that need to be satisfied by the device under test to check its conformity are found in the last database ("Conformance Thresholds"). These operation values reflect the regulation (e.g., defined by government bodies) and operator business policies.
The Cognitive Radio Testing Controllers (CRTs) manage all the stages of the testing process (from creation to result analysis). Our architecture comprises three controllers denoted as central CRT, DUT CRT, and Reference CRT. The first provides the user interface and manages the synchronization and message exchanges among the architecture's elements. It provides inputs to the other CRTs, defining their local actions; accesses the databases to get the test case selected by the user, the proper result evaluation script, and the conformance thresholds to be used in the target test; and analyzes the test results sent by the local controllers and checks the DUT conformity, summarizing them for the user. The last two controllers perform local control (over the DUT and reference device), reconfigure the transmission parameters (e.g., transmission power, carrier frequency, and sensing time) according to the test case, and register the events that happen in the devices (DUT and Reference), sending the log to the central CRT. In addition to the previous components, our architecture may use auxiliary devices to create the radio environment of the test cases.
Our proposed system also follows the 3rd Generation Partnership Project (3GPP) and the European Telecommunications Standards Institute (ETSI) testing standards, which present two macroelements: SUT and conformance test system (CTS). The former is our DUT, and the latter is represented by the CRTs. The CTS defines the limits and open interfaces for testing and which tests may be performed. In addition, it has total control over the SUT and exchanged messages.
Figure 2 shows the interactions between the architecture components when a test is performed. First, the user requests the list of test cases from the central CRT (1), which queries the test cases database to get this list (2). Once the list is ready (3), the central CRT shows it to the user (4). After the selection of the test case (5), the central CRT accesses the test cases database again (6) and returns the selected test case (7). After that, the central CRT asks the user for the input parameters and thresholds to run the test (8). Once they are received (9), the central CRT sends the test execution command to the local CRTs (DUT and Reference) (10) (11). At this point, the interactions among the components (e.g., message sequence and involved players) may be different, reflecting the dynamics of each test case. When the test execution ends, the local CRTs send the results to the central CRT (12) (13). The central CRT gets the evaluation script by accessing the evaluation scripts database (14) (15), gets the conformance thresholds from the conformance thresholds database (16) (17), and performs the result evaluation (18). Finally, it presents the conformity results to the user (19).
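The interaction sequence of Figure 2 can be summarized as a single orchestration routine in the central CRT. The following sketch is a simplification under stated assumptions: the three databases are plain dictionaries, the local CRTs are ordinary callables, and all names (`run_conformance_test`, `evaluation_script`, `stub_crt`, the test identifier) are hypothetical rather than taken from the implemented system.

```python
def run_conformance_test(test_id, user_inputs, test_cases_db, scripts_db,
                         thresholds_db, local_crts):
    """Central-CRT workflow, from test selection to the conformity verdict."""
    test_case = test_cases_db[test_id]                      # steps (5)-(7)
    # Steps (10)-(13): each local CRT runs the test and returns its log.
    logs = {name: run(test_case, user_inputs) for name, run in local_crts.items()}
    evaluate = scripts_db[test_case["evaluation_script"]]   # steps (14)-(15)
    thresholds = thresholds_db[test_id]                     # steps (16)-(17)
    return evaluate(logs, thresholds)                       # step (18)


def stub_crt(test_case, inputs):
    """Stand-in for a local CRT: returns a fake event log."""
    return ["sensing", "transmission"] * inputs["cycles"]


def exact_match(logs, thresholds):
    """Stand-in evaluation script: DUT log must equal the reference log."""
    return {"conformant": logs["DUT"] == logs["Reference"]}


test_cases_db = {"sensing-1": {"evaluation_script": "exact_match"}}
scripts_db = {"exact_match": exact_match}
thresholds_db = {"sensing-1": {}}
local_crts = {"DUT": stub_crt, "Reference": stub_crt}

report = run_conformance_test("sensing-1", {"cycles": 2}, test_cases_db,
                              scripts_db, thresholds_db, local_crts)
print(report)  # prints {'conformant': True}
```

Keeping the evaluation script and the thresholds as lookups mirrors the separation between the three databases: a new test case only needs new entries, not changes to the workflow.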
3.2. Implementation and Proof of Concept. Our system contains a set of predefined tests, just requiring the input parameters and thresholds from the user. Given these inputs, the system creates the scenario and performs the testing and conformance analysis automatically, reducing the time spent in the test case configuration and not demanding deep programming knowledge from the user.
To instantiate the proposed system, we adopted the Universal Software Radio Peripheral SDR platform [30]. It allows PHY/MAC layers prototyping, dynamic spectrum access, and cognitive radio functionalities. We used two N210 USRP kits with LP0410 antennas, which provide transmission and reception of signals in the range from 400 to 1000 MHz, and SBX USRP daughterboards that cover frequencies from 400 MHz to 4.4 GHz [31]. The antennas were placed in shield boxes, enabling tests in both isolated and open scenarios. Each USRP was connected via an Ethernet interface to a computer with an Intel Core i5-4460 3.20 GHz processor and 8 GB of memory running the 64-bit Ubuntu operating system. GNU Radio [32] and the Python [33] programming language were used to code the cognitive radio functionalities and the test cases. The result analysis scripts were coded by using the R language [34].
Although both DUT and reference device are instantiated by using the same USRP kits in our system, the reference device is set to have the desired behavior, operation limits, and abilities. The set of features, abilities, and operation limits that compose a reference device may be seen as a logical reference device. In this respect, we aim at showing the feasibility of our proposal, even when no reference CR device is already available (e.g., by using a logical device and operation limits).
The CR functionalities were designed in GNU Radio by using GNU Radio Companion (GRC). GNU Radio contains signal processing features to build software-defined radio and signal processing systems. To make the test execution automatic (parameter inputs, test type selection, repetitions, etc.), a manager program written in Python handles the Python codes generated by GNU Radio. During the test execution, the events are logged in files, which are inputs for R scripts. System calls are used by the manager to integrate all these pieces (R scripts, GNU Radio codes, and Python programs).
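The way a manager program can chain a generated flowgraph and an R analysis script through system calls might look like the sketch below. It is an illustration only: the file names (`sensing_flowgraph.py`, `evaluate.R`, `dut.log`) and the command-line flags are hypothetical, and the actual options accepted by a generated flowgraph depend on how its parameters were declared.

```python
import subprocess
import sys


def build_commands(flowgraph, sensing_time, transmission_time, log_file, r_script):
    """Build the two system calls: run the flowgraph, then analyze its log."""
    run_radio = [
        sys.executable, flowgraph,
        "--sensing-time", str(sensing_time),            # hypothetical flag
        "--transmission-time", str(transmission_time),  # hypothetical flag
        "--log-file", log_file,                         # hypothetical flag
    ]
    analyze = ["Rscript", r_script, log_file]
    return [run_radio, analyze]


def run_pipeline(commands):
    """Execute the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_commands("sensing_flowgraph.py", 0.5, 1, "dut.log", "evaluate.R")
for command in commands:
    print(" ".join(command[1:] if command[0] == sys.executable else command))
```

Building the argument lists separately from running them keeps the manager easy to test without radio hardware; `run_pipeline(commands)` would be called only on a machine where the flowgraph and R are installed.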
The synchronization and message exchange between the architecture components were managed by the ZeroMQ middleware [35]. It is a high-performance asynchronous messaging library used in distributed or concurrent applications. We adopted the publish-subscribe communication mode, in which a data distribution tree is defined and the events flow from publishers to subscribers, indirectly addressed via the event's content. Our architecture takes advantage of ZeroMQ to provide the following functions:
(i) Attaching Devices. In the test initialization stage, the central CRT starts a publisher process that waits for a given number of subscribers to be attached. The number of subscriptions is defined according to the test case.
(ii) Process Control. After the devices are attached, the publisher defines the tasks (e.g., test starting and ending) to be performed by the devices.
(iii) File Exchange. During the test, the devices log the events locally. ZeroMQ gets the log files from the devices and stores them in the central CRT to be analyzed.
(iv) Synchronization. ZeroMQ synchronizes the devices via message exchange, ensuring that the events take place in the right order, as defined in the test case.
Figure 3 illustrates the implementation of the proposed architecture by using two N210 USRP kits. More devices may be easily added to allow test cases with more users. Table 2 summarizes the hardware and software description.
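The attach-then-control pattern of functions (i) and (ii) can be illustrated without radio hardware. The sketch below deliberately uses Python's standard `queue` and `threading` modules as a stand-in for ZeroMQ sockets, so it shows only the control flow, not the ZeroMQ API; the class name, the command strings, and the device names are assumptions.

```python
import queue
import threading


class Publisher:
    """Central-CRT stand-in: waits for subscribers, then broadcasts commands."""

    def __init__(self, expected_subscribers):
        self.expected = expected_subscribers
        self.subscribers = {}                    # device name -> command queue
        self._lock = threading.Lock()
        self._all_attached = threading.Event()

    def attach(self, name):
        """Register a subscriber and return the queue it should read from."""
        inbox = queue.Queue()
        with self._lock:
            self.subscribers[name] = inbox
            if len(self.subscribers) == self.expected:
                self._all_attached.set()
        return inbox

    def wait_for_subscribers(self, timeout=5.0):
        """Block until the expected number of devices is attached."""
        return self._all_attached.wait(timeout)

    def publish(self, command):
        """Deliver the same command to every attached device."""
        with self._lock:
            for inbox in self.subscribers.values():
                inbox.put(command)


def local_crt(name, publisher, log):
    """Local-CRT stand-in: attaches, then obeys commands until END."""
    inbox = publisher.attach(name)
    while True:
        command = inbox.get()
        log.put((name, command))
        if command == "END":
            break


publisher = Publisher(expected_subscribers=2)
log = queue.Queue()
workers = [
    threading.Thread(target=local_crt, args=(name, publisher, log), daemon=True)
    for name in ("DUT", "Reference")
]
for worker in workers:
    worker.start()

attached = publisher.wait_for_subscribers()
for command in ("START", "END"):
    publisher.publish(command)
for worker in workers:
    worker.join(timeout=5.0)

events = []
while not log.empty():
    events.append(log.get())
```

Because nothing is published before every expected device is attached, each device sees the full command sequence in order, which is the property the attaching stage is meant to guarantee.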
As proof of concept of the architecture, we defined two test cases that follow the three main stages of conformance testing described in the norm ISO IS-9646 (Section 3). The first test case is related to the spectrum sensing functionality, and the second checks if the CR operates within the defined limits, involving spectrum sensing and handoff. Spectrum sensing is an essential capability for CR operation using the spectrum overlay approach in DSA because the CR user must discover available channels for its transmission, detect PUs, and release the channel (handoff) when they reappear [36]. To do so, the CR continuously alternates between transmission and sensing periods, in which the time defined for each stage may depend on factors such as the primary usage pattern, SU type, and regulatory policies. This means that different channels may demand heterogeneous transmission (sensing) times, and the same channel may require different values for the sensing (transmission) periods over time in order to satisfy the current SU application. To meet these demands, the CR device has to be able to sense the spectrum (perform its transmission) by using different times defined via software.
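The alternation between sensing and transmission periods with software-defined durations can be sketched as a schedule generator. This is a minimal illustration, assuming that each cycle starts with sensing and that stages follow each other with no gap; the function name and the tuple layout are hypothetical.

```python
def build_schedule(sensing_time, transmission_time, cycles, start=0.0):
    """Return (stage, start, end) tuples for alternating sensing/transmission."""
    schedule = []
    t = start
    for _ in range(cycles):
        # Assumption: every cycle senses first and then transmits.
        for stage, duration in (("sensing", sensing_time),
                                ("transmission", transmission_time)):
            schedule.append((stage, t, t + duration))
            t += duration
    return schedule


for stage, begin, end in build_schedule(0.5, 1.0, cycles=2):
    print(f"{stage:12s} {begin:.1f} s -> {end:.1f} s")
```

Changing only the two time arguments yields a different schedule, which is the kind of reconfiguration via software that the first test case exercises.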
In this respect, the first designed test case verifies the CR's capacity of adapting its transmission and sensing intervals and checks if the CR device is working properly and has knowledge of its state (transmission/sensing times). The more transmission (sensing) intervals the CR supports, the better flexibility it may provide to the application. This basic test case does not have minimum requirements and comprises the following steps: CRTs send the subscription requests (2) (3) to the central CRT, which replies to them accepting their attachments (4) (5). These first five interactions comprise the attaching process provided by ZeroMQ. Next, the central CRT sends the test case and the command for starting the test execution (6) (7) to the local CRTs. They reply confirming the start of execution (8) (9). After that, both Reference and DUT CRTs run the test case (10) (11) and send the result logs to the central CRT (12) (13) (this refers to the file exchange function managed via ZeroMQ). When the central CRT receives the results, it sends a message/command to terminate the processes that are running in the local CRTs (14) (15). The local CRTs do so and send a confirmation message to the central CRT (16) (17). These request/reply commands/messages exemplify the process control and synchronization achieved by using ZeroMQ.
It is also important to note that the interactions presented in Figure 4 take place between the events (10) and (13) shown in Figure 2 and are specific to this test case. Figure 5 shows the GNU Radio block diagram for the spectrum sensing functionality, and we highlight four blocks: the two "Parameter" blocks, which receive the input values for the transmission and sensing times; the block that provides the interface between GNU Radio and the USRP device, named "UHD: USRP Sink"; and the "Transmission-Sensing Controller 1.1," which manages the sensing and transmission stages according to the values defined in the "Parameter" blocks.
The previous test focuses on the DUT's ability to perform a given action (e.g., transmission and sensing with defined
(iii) Set DUT to start its transmission in the test beginning
(iv) Define the channel detection time for each scenario (e.g., 2 s)
For the sake of readability, we omitted the GNU Radio block diagram and message signaling of this test case, which are similar to those of the first one.

Results
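We defined a metric, the percentage of matching (PoM), to be used in the DUT conformity verification regarding the first presented test. It is aimed at quantifying how closely the DUT view matches the Reference one and is given in Eq. (1):

$$\mathrm{PoM} = \frac{100}{N} \sum_{i=1}^{N} m_i \qquad (1)$$

where $m_i = 1$ if $|t_i - t_{i+1}| = |t_i' - t_{i+1}'|$ and $m_i = 0$ otherwise; $t_i$ and $t_{i+1}$ ($t_i'$ and $t_{i+1}'$) are the instants of two consecutive events in the view of one device (of the other device); and $N$ is the number of compared intervals.

A variation of PoM is given in Eq. (2) and named PoM$_\alpha$. It is similar to the PoM, but adopts an error margin to define the match. It admits that the DUT view may be slightly different from the Reference one. The parameter $\alpha$ defines the maximum admissible difference between $|t_i - t_{i+1}|$ and $|t_i' - t_{i+1}'|$ to indicate a match. The $\alpha$ value may be defined to compensate for possible measurement errors (e.g., device calibration or time synchronization) or to admit a slight operational difference that does not imply a violation of regulatory and operator policies (e.g., level of admissible interference to the primary signal).

$$\mathrm{PoM}_\alpha = \frac{100}{N} \sum_{i=1}^{N} m_i^{\alpha} \qquad (2)$$

where $m_i^{\alpha} = 1$ if $\big|\, |t_i - t_{i+1}| - |t_i' - t_{i+1}'| \,\big| \le \alpha$ and $m_i^{\alpha} = 0$ otherwise.

Similarly, we can also use the PoM and its variant in the second test case analysis, but replacing the reference view by the defined limits/thresholds. In this way, we may check if a device under test passed or failed the second test or how far/close it is from the defined limits.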
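The matching metric can be sketched in a few lines of code. The example assumes that PoM is the percentage of consecutive-event intervals whose durations coincide in the Reference and DUT logs (within α for the PoM_α variant), a reading that agrees with the 5-of-24 intervals yielding 20.83% reported for the first test case. The function names, the choice of counting intervals missing from the DUT log as mismatches, and the timestamps are illustrative assumptions, not measured data.

```python
def stage_durations(timestamps):
    """Durations |t_i - t_{i+1}| between consecutive logged events."""
    return [abs(a - b) for a, b in zip(timestamps, timestamps[1:])]


def pom(reference_ts, dut_ts, alpha=0.0):
    """Percentage of matching; alpha=0 gives PoM, alpha>0 gives PoM_alpha."""
    reference = stage_durations(reference_ts)
    dut = stage_durations(dut_ts)
    if not reference:
        raise ValueError("the reference log needs at least two events")
    # Assumption: intervals missing from the DUT log count as mismatches,
    # because zip() stops at the shorter list but we divide by len(reference).
    matches = sum(1 for r, d in zip(reference, dut) if abs(r - d) <= alpha)
    return 100.0 * matches / len(reference)


# Made-up logs: the DUT switches stage 62.5 ms late once.
reference_log = [0.0, 0.5, 1.5, 2.0, 3.0]
dut_log = [0.0, 0.5, 1.5625, 2.0, 3.0]

print(pom(reference_log, dut_log))             # prints 50.0
print(pom(reference_log, dut_log, alpha=0.1))  # prints 100.0
```

A single late switch shifts two adjacent durations, so exact matching drops to 50% while a 0.1 s margin still reports full conformity, which is the contrast between PoM and PoM_α discussed in the text.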
We defined 10 instances (sensing and transmission time values) to be considered in the DUT and Reference device in the first test case, which are presented in Table 3. They aim at verifying the DUT's ability to perform sensing and transmission under different parameter values. For each instance, 10 runs were performed. Figure 7 presents the percentage of matching between the DUT and the Reference device for each instance in the first test case. We can observe that, on average, the PoM was 20-30%, achieving its highest value (about 45%) in instance #5. This shows that the DUT behavior is far from the Reference one with regard to this test: the DUT would not be approved if the required grade were over 80%, for instance, and, as a consequence, would demand improvements.
Putting the first test case instance in perspective (0.5 and 1) and considering an error margin (α) equals to 0.1 s, Figure 8 (Figure 9) presents the difference (in dashed line with square marker) between the duration of each sensing (transmission) instance performed by the DUT and Reference device in one test execution. It is noted that the sensing (communication) took place 12 times, and no one presented a difference value higher than the error margin (solid line). Regarding this error margin, the DUT achieves 100% of Po M 0:1 . So, if this margin is admissible by the operators and regulation bodies (e.g., in those scenarios in which the primary user usage pattern is not so dynamic, such as TV bands/signals), the DUT would be approved in the test. But, when no error margin (α = 0) is considered, only 5 instances of sensing or transmission carried out by the DUT have the same corresponding duration to those achieved by the refer-ence device. It leads to a PoM value equals to 20.83%, indicating that the DUT was not able to pass in the test. Figure 10 shows the results in terms of PoM for 10 test executions with the sensing and transmission times equal to 0.5 s and 1 s, respectively. We may observe that, in general, the PoM score did not exceed 40%, and its highest value was about 60% in instance #8. It implies that (considering the PoM metric) the DUT behavior does not have a great similarity to the reference one, given that the target level is 100% of matching, for example.
When PoM_α (with α equal to 0.1 s) is taken into account, Figure 11 shows that the DUT operates within the error margin in all the execution instances, achieving a PoM_0.1 of 100%.
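The relation between PoM and PoM_α discussed above can be illustrated with a minimal sketch. It assumes that PoM_α is the share of instances whose DUT duration differs from the reference duration by at most α, which is consistent with the reported 5 matches out of 24 instances (20.83%). The function name `percentage_of_matching`, the tolerance `eps`, and the duration values are hypothetical and chosen only to reproduce that ratio; they are not the measured data.

```python
from typing import Sequence


def percentage_of_matching(
    dut: Sequence[float],
    reference: Sequence[float],
    alpha: float = 0.0,
    eps: float = 1e-9,
) -> float:
    """Percentage of instances with |dut - reference| <= alpha.

    eps absorbs floating-point rounding so that alpha = 0 behaves
    as an exact-match test on the recorded durations.
    """
    if len(dut) != len(reference) or len(dut) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(
        1 for d, r in zip(dut, reference) if abs(d - r) <= alpha + eps
    )
    return 100.0 * matches / len(dut)


# Hypothetical durations (seconds): 12 sensing and 12 transmission instances.
reference = [0.5] * 12 + [1.0] * 12
# Invented DUT values: 5 exact matches, 19 instances off by 0.05 s.
dut = [0.5] * 2 + [0.55] * 10 + [1.0] * 3 + [1.05] * 9

pom_exact = percentage_of_matching(dut, reference)            # alpha = 0
pom_margin = percentage_of_matching(dut, reference, alpha=0.1)
print(f"PoM = {pom_exact:.2f}%, PoM_0.1 = {pom_margin:.2f}%")
```

With these invented values, the exact-match score is low while the margin-based score reaches 100%, mirroring the contrast between Figures 10 and 11.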
In the second test case, we analyzed the conformity of the DUT channel detection time under different bandwidths and limits. Values such as 1.4, 3, 6, 10, 15, and 20 MHz (used in LTE networks and by TV operators) and 0.2, 0.3, 0.4, 0.5, 0.6, and 2 s were adopted for bandwidth and channel detection time (CDT), respectively. Moreover, we considered a DUT equipped with two antennas, so that spectrum sensing can take place even while the DUT is transmitting. Our system admits other approaches (e.g., disjoint transmission and sensing with fixed or variable periods) that may be seen as devices with different capabilities (e.g., processing and hardware configurations) inside a device family or from different vendors.
Figure 12 presents the results (PoM) obtained by the DUT in the second test, considering different channel detection times and a bandwidth of 1.4 MHz. When the CDT is 0.2 s, the DUT is not able to operate within the limit, i.e., the PoM is 0%. This limit (0.2 s) could be considered in licensed bands where the primary usage pattern is highly dynamic, such as the cellular bands in urban areas. When higher CDTs are defined, the DUT obtains a higher PoM, achieving 100% at 0.6 s. In bands where the primary use is less frequent (e.g., TV broadcasting in rural areas), a CDT of 0.6 s could meet the protection requirements. In this scope, the DUT would be in conformity.
Figure 13 presents the results for the second test under different channel bandwidths and a channel detection time of 2 s, which is the limit defined in the IEEE 802.22 standard for Cognitive Radio Wireless Regional Area Networks (WRAN) operating in TV White Spaces (TVWS) [37]. The results show that the DUT is able to operate within the limit (CDT) for all tested bandwidths, i.e., it is in conformity (PoM = 100%) with the defined limit.
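The limit-based evaluation of the second test case can be sketched as follows. The sketch assumes that, for a limit test, the PoM is the share of detection events whose measured detection time does not exceed the configured CDT limit. The helper names `pom_within_limit` and `sweep_limits` and the measured values are hypothetical; the values were invented so that the trend resembles the one described for Figure 12 (0% at 0.2 s, 100% at 0.6 s) and are not the testbed data.

```python
from typing import Dict, Sequence


def pom_within_limit(detection_times: Sequence[float], limit: float) -> float:
    """Percentage of detection events completed within the CDT limit."""
    if len(detection_times) == 0:
        raise ValueError("at least one detection time is required")
    within = sum(1 for t in detection_times if t <= limit)
    return 100.0 * within / len(detection_times)


def sweep_limits(
    detection_times: Sequence[float], limits: Sequence[float]
) -> Dict[float, float]:
    """Evaluate the same measurements against several CDT limits."""
    return {limit: pom_within_limit(detection_times, limit) for limit in limits}


# Hypothetical detection times (seconds) for one bandwidth setting.
measured = [0.35, 0.42, 0.48, 0.55, 0.58, 0.31, 0.45, 0.52, 0.39, 0.59]
# CDT limits taken from the second test case.
cdt_limits = [0.2, 0.3, 0.4, 0.5, 0.6, 2.0]

results = sweep_limits(measured, cdt_limits)
for limit, pom in results.items():
    print(f"CDT limit {limit:.1f} s -> PoM {pom:.0f}%")
```

Keeping the limit as a parameter reflects the point made above: the same DUT measurements may fail a strict limit suited to highly dynamic bands and still conform to a looser one such as the 2 s limit of IEEE 802.22.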

Wireless Communications and Mobile Computing
It is worth mentioning that the presented results aim at showing the feasibility of our system for CR conformance testing. Thus, the DUT is handled as a black box, i.e., the mechanisms or technologies embedded in the DUT that cause it to pass or fail the test are not the focus. The main point is how the proposed system works and how it may provide outcomes for DUT conformance analysis.
In addition to the previous metrics, our proposed system is flexible enough to support others that may be defined and stored in the Result Evaluation Script Database. For instance, if it is important to know how far (in absolute values) the DUT behavior is from that of the Reference device regarding the sensing and transmission times in the first test case, the total time difference could be defined to express the accumulated difference between each sensing (transmission) time of the DUT and the corresponding time of the Reference device.
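A minimal sketch of such an additional metric is given below. It assumes that the total time difference is the sum of the absolute differences between corresponding DUT and reference durations; the function name `total_time_difference` and the sample values are hypothetical, and the actual script interface of the Result Evaluation Script Database is not reproduced here.

```python
from typing import Sequence


def total_time_difference(
    dut: Sequence[float], reference: Sequence[float]
) -> float:
    """Accumulated absolute difference (seconds) between paired durations."""
    if len(dut) != len(reference):
        raise ValueError("DUT and reference must have the same number of instances")
    return sum(abs(d - r) for d, r in zip(dut, reference))


# Hypothetical sensing durations (seconds) for four instances.
reference_sensing = [0.5, 0.5, 0.5, 0.5]
dut_sensing = [0.5, 0.55, 0.45, 0.5]

ttd = total_time_difference(dut_sensing, reference_sensing)
print(f"Total sensing time difference: {ttd:.2f} s")
```

Unlike the PoM, which only counts matches, this value indicates how large the accumulated deviation is, so the two metrics could be reported together.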

Conclusion
The cognitive radio technology provides intelligent and efficient spectrum usage, allowing new wireless systems and 5G applications to be supported. Although much research has been conducted on CR, conformance testing methodologies and systems are still unexplored. In this respect, we proposed a conformance testing methodology/system for cognitive radios that allows verifying whether a device meets the regulatory policies and CR functionalities, which is essential to launch it in the market. We adopted a USRP-based testbed to instantiate our system/methodology and showed its feasibility through a proof of concept with two test cases and a proposed metric, analyzing the device conformity with regard to the sensing, transmission, and channel detection times. Results showed that the device under test was far from the reference in both perspectives (functionality-behavior and operation limits) and needed to be improved to reach acceptable levels. However, when an error margin was tolerated, it operated within the acceptable range/behavior of the reference device, reaching 100% of matching.
In addition, we have pointed out the modularity and flexibility of our system to support other test cases, metrics, and thresholds. Our testbed also performs conformance testing and analysis automatically, without requiring deep programming knowledge from the user.
Future directions include the design of new test cases that encompass the spectrum sensing and other CR functionalities (e.g., spectrum mobility and power control) and related metrics, as well as their addition into the testbed. In addition, 5G/6G scenarios and features are envisioned to be addressed in our system.

Data Availability
Data may be available upon request.

Conflicts of Interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.