Color Image Retrieval Method Using Low Dimensional Salient Visual Feature Descriptors for IoT Applications

Digital data are growing rapidly as Internet technology advances, fed by many sources such as smartphones, social networking sites, IoT devices, and other communication channels. Successfully storing, searching, and retrieving desired images from such large-scale databases is therefore critical. Low-dimensional feature descriptors play an essential role in speeding up retrieval on large-scale datasets. This paper proposes a feature extraction approach that integrates color and texture contents to construct a low-dimensional feature descriptor. Color contents are quantified from a preprocessed, quantized HSV color image, and texture contents are extracted from a Sobel edge detection-based preprocessed V-plane of the HSV color image using a block-level DCT (discrete cosine transformation) and the gray level co-occurrence matrix. The suggested image retrieval scheme is validated on a benchmark image dataset. The experimental outcomes were compared with ten cutting-edge image retrieval algorithms, and the proposed scheme outperformed them in the vast majority of cases.


Introduction
Due to the advent of diverse devices such as smartphones, tablets, drones, CCTV cameras, and other image-capturing tools, as well as high-speed Internet, technology and its range of applications are rapidly expanding in the contemporary era. Huge amounts of unstructured image data have been generated in a variety of disciplines, including medical and health insurance, forensics, cars, archaeology, crime prevention, architecture, and defence [1][2][3][4]. As the Covid effect fades across the world, it is anticipated that roughly 1.4 trillion photos will be taken in 2021. In addition, with the apparent rise in relevance of the IoT, edge devices such as smartphones generate a great quantity of photos. Such huge amounts of data call for an appropriate approach to organise, manage, and retrieve photographs from a vast database, which has proved to be a difficult undertaking. Text-based image retrieval (TBIR) techniques were initially used to retrieve images from digital datasets using text-based queries matched against their annotations, which could include a range of descriptions or tags. Because human labour is an important part of the annotation/tagging process, it introduces numerous errors. Therefore, traditional procedures are ineffective, and their accuracy is questionable. Furthermore, annotation is a time-consuming, costly, and repetitive operation. As a result, alternative approaches known as content-based image retrieval (CBIR) have been developed to overcome the shortcomings of traditional methodologies, providing a fresh opportunity to solve the problem of image retrieval [5]. CBIR is the technique of automatically indexing and retrieving images from large databases based on the contents of the images, known as features, such as color, texture, and shape. A novel content-based image retrieval strategy is proposed in the presented work.
Unlike previous systems, which rely on a single feature for retrieval, our research uses both color and texture content for image retrieval; color and texture are the most important visual features for humans [6]. In this research, the color and texture contents are extracted from preprocessed images, and the Laplacian filter is used to sharpen the color images and remove unnecessary information. The HSV color space quantization approach is used to extract the image's color information. The texture contents are obtained with the discrete cosine transformation (DCT) and the gray level co-occurrence matrix (GLCCM) after the image has been processed with the Sobel edge detection method. To calculate these contents, the spatial and interblock relationships are determined by applying the GLCCM to the DC and AC coefficients. Finally, by merging the color and texture contents, a single low-dimensional feature descriptor is generated, which speeds up retrieval. The Euclidean distance is used to compute the similarity between the feature descriptors of the database images and that of the query image. The accuracy of the proposed scheme was tested on two datasets, Corel 1K and UC Merced, in terms of precision, recall, and F-score.
1.1. Motivation. Retrieving the desired images from a digital repository with semantically varied categories is a very tedious task for many researchers, especially in cloud-assisted IoT [7]. Image retrieval is a search technique used in several real-world applications, such as medical imaging [8], searching individual video frames [9], object retrieval [10], image classification [11], digital libraries [12], and multimedia event detection [13]. In this regard, several CBIR schemes have been proposed that use transformation tools such as the discrete cosine transformation (DCT) and/or spatial domain techniques such as color quantization and color histograms to extract effective and significant features for image retrieval applications [14][15][16]. Most existing CBIR applications apply either a transformation tool or color information for feature extraction. The combination of DCT-based and color-based approaches, however, provides significant image feature descriptors for image retrieval. The proposed research work is motivated by a simple nonuniform histogram quantization process and by block-level DCT-based interrelated information obtained through statistical analysis. To extract image information, the statistical color moments of the quantized image are computed, while an image plane is divided into fixed-size 8 × 8 blocks and each block is transformed by the DCT to obtain interrelated information. The main advantage of applying the DCT is that it has powerful image analysis and discriminative properties. To improve performance, the color information-based features are fused with the DCT-based features for effective and efficient image retrieval.

Literature Survey
Numerous CBIR approaches have been presented that are based on the extraction of low-level image features in the transform or spatial domain. Aamer et al. [17] developed a scheme for extracting DCT features from images that improves retrieval speed and reduces the storage required during image retrieval. They divided the input image into 8 × 8 nonoverlapping blocks and applied the DCT to each one. Image features are extracted from the histograms of the quantized AC and DC coefficients of each transformed block, the Euclidean distance between the features of the query image and those of the database images is calculated, and the closest database images are retrieved according to the smallest distances. Yun et al. [18] suggested a CBIR approach based on the image's color and texture attributes. Texture features are taken from distinct normalized GLCMs of the grayscale image, while color features are extracted from both color and block color histograms. For superior retrieval results, they combined both features using a simple fusion method. Kavitha et al. [19] presented another block-based image retrieval approach, in which an image is first segmented into equal-sized sub-blocks for feature extraction. The color information of each block is then obtained by dividing the HSV color space into nonuniform intervals and representing the color features with a cumulative histogram. The texture feature is obtained using the GLCM and combined with the color feature to form the final feature. Priyanka et al. [20] conducted a comparison of CBIR systems employing various feature extraction approaches. The texture feature was computed using wavelet and Gabor filters, while the color feature was obtained from the color moments of the HSV color space. The similarity distance is calculated using the chi-square and Euclidean distances, and the top images with similar features are retrieved.
They found that combining color moments with Gabor texture features under the Euclidean distance gave the highest precision rates among the methods compared. Jiquan Ma et al. [21] suggested a CBIR scheme for image feature extraction based on the HSV color space and the discrete wavelet transform (DWT). They used the wavelet transform to break the signal down into a number of basis functions, and then used the Daubechies-4 wavelet transform to decompose the image. To create an eight-dimensional texture feature, the mean and standard deviation of the four bands are computed. According to the test results, the texture feature based on the wavelet transform provides better performance and stability. In 2015, Wang et al. [22] proposed image retrieval based on DCT and DWT feature extraction using grading algorithms. The color moments, the color histogram, and a novel dynamic color space quantization based on color distribution were modified to generate a color feature in the DCT domain, while the texture feature was computed in the DWT domain. In terms of retrieval accuracy, the experimental findings show that the two grading image retrieval methods operate efficiently and effectively. Kaipravan et al. [5] proposed another CBIR approach based on color and texture features. The color feature is computed by partitioning an image into three equal horizontal regions and then computing two color moments from each subimage plane, using each color channel separately. Gabor wavelets capture energy at a given frequency and orientation to extract the texture information.
Weights are assigned to each feature vector, and the Manhattan distance is used to calculate the similarity measure. They concluded that a single color or texture feature is insufficient to characterise an image effectively; therefore, color and texture features are combined for improved retrieval efficiency. In 2020, Chen et al. [23] constructed a CBIR technique that extracts color-texture features using the HSV color space. For feature representation, they first divided the image into a 4 × 4 grid of 16 sub-blocks. To extract significant features, the method then forms nine overlapping sub-block regions from the sixteen sub-blocks. This overlapping method has advantages such as reducing the storage space and the amount of computation needed for the image similarity measure. Because of the sub-blocks, it also preserves the information connection within the image, thus ensuring better retrieval accuracy. Our presented work is also compared with some state-of-the-art schemes, which are described one after another below. In 2015, Shrivastava et al. [24] proposed an image retrieval technique that retrieves similar images from an image dataset in three stages using primitive low-level image features such as color, texture, and shape. In their scheme, a fixed number of images is first retrieved based on color feature similarity. The color feature was extracted using quantized color histograms in the HSV color space, and the number of pixels in each bin of the histogram was used to form a color feature vector. To improve the relevance of the retrieved images, their texture and shape features are then matched in turn. The Gabor wavelet transform was used to compute the texture information of the image, while the shape feature vector was constructed by computing the Fourier descriptor based on centroid distance.
This method reduced the computation time and increased the overall accuracy by retrieving images in three different stages. Later, in 2016, Dubey et al. [25] proposed a method for image description with multichannel decoded local binary patterns and introduced adder- and decoder-based schemes for combining the local binary patterns (LBPs) from multiple channels of an image. Two multichannel decoded local binary patterns are introduced: the multichannel adder local binary pattern (maLBP) and the multichannel decoder local binary pattern (mdLBP). Both maLBP and mdLBP utilize the local information of multiple channels based on the adder and decoder concepts. The feature descriptor has high dimensionality because it combines the multichannel-based LBPs. In 2018, Mistry et al. [26] proposed a hybrid feature-based efficient CBIR scheme using various distance measures. Spatial domain features, including the color auto-correlogram, color moments, and HSV histogram, and frequency domain features, such as moments computed using the SWT and the Gabor wavelet transform, were used. Further, binarized statistical image features and color and edge directivity descriptor features were used to enhance precision. They claim that their results are better than those of all existing models. Similarly, in 2018, Irtaza et al. [27] proposed an approach that resolves the classification disagreement amongst different classifiers and the class imbalance problem in CBIR. They used a genetic algorithm (GA)-based classifier comity learning (GCCL) method to generate stable classifiers by combining ANNs with SVMs through asymmetric and symmetric bagging. Once the stable classifiers were generated, the query image was presented to the trained model to determine its underlying semantic content and associate it with the precise semantic class. They then computed the feature similarity within the obtained class to generate the semantic response of the system. Later, in 2019, Ahmed et al. [28] proposed a technique that fuses spatial color information with extracted shape features and object recognition, which increases the strength of the image features for information fusion in the retrieval process. They extracted the color features from RGB images and used the gray level image for the pixel intensity-based local features. They combined the local image features and spatial information in a BoW architecture and evaluated the results on popular image collection databases. In 2020, Vimina et al. [29] proposed another CBIR scheme using a texture-color descriptor that integrates multichannel features. For the texture feature, they used a fixed-size local intensity-based descriptor, MMLBP, which integrates the multichannel local intensity information of the image at the pixel level. The dimensionality of the descriptor is fixed irrespective of the number of channels in the image. The resulting histogram of the patterns is used to represent the image texture. The color feature is extracted by quantizing the RGB color space and is represented with a histogram. The MMLBP and the quantized color descriptor are then fused to characterise the images for retrieval. In 2021, Garg et al. [30] proposed a CBIR scheme that obtains a feature descriptor from a multilevel image decomposition. They achieved this by extracting approximation and detail coefficients through a discrete wavelet transformation of the RGB channels. Both the approximation and detail coefficients are then passed to the dominant rotated local binary pattern, a texture descriptor that is computationally effective and rotationally invariant. The local descriptors are extracted from the entire image, for which they used methods such as SIFT, SURF, HoD, and LBP. A rotation-invariant image for a local neighbor patch is obtained by measuring the descriptor relative to the reference.
The approach retains the complete structural information extracted directly from the local binary patterns, together with additional information such as magnitude, which in turn provides extra discriminating power. Then, GLCM descriptors are extracted from the dominant rotated local binary pattern image. The scheme is trained and tested on three classifiers: support vector machine, K-nearest neighbor, and decision tree. Varish et al. [31] demonstrate an image retrieval scheme in which a fused low-dimensional feature descriptor is obtained by fusing probability histogram-based HSV color moments and multiresolution-based shape moments. The color moments and shape moments were extracted from the Laplacian filter-based preprocessed image. In addition to the color-shape feature, Sumit et al. [32] also include the texture feature for feature representation; the YCbCr color space is used for feature extraction, and the Y, Cb, and Cr color planes are minimally overlapped. A mid-rise quantization scheme preprocesses the images. The texture and shape features are extracted from the Y plane using the BDIP and BVLC techniques. Subsequently, they applied the adaptive tetrolet transform to the output of BDIP and BVLC to extract local textural and geometrical features. At the same time, they selected the Cb and Cr components and applied the adaptive tetrolet transform to analyze the regional local color variations of the image. Finally, they combined the nonoverlapping extracted shape, texture, and color features to form the final feature vector for the retrieval process. To reduce the search space and computational complexity of image retrieval, Joseph et al. [33] developed a CBIR scheme that investigates various search space reduction techniques and classifies the image collection into subsets of related images.
They proposed image clustering using a hybrid K-means moth flame optimization algorithm (KMFO), which enhances the performance of the K-means algorithm by assigning the optimum number of clusters and cluster centroids based on the number of flames and flame values. The HSV color histogram, color correlogram, wavelet transform, GLCM, color moments, dominant color, and region-based descriptors are used as feature descriptors. Motivated by the works discussed above, the authors propose a novel feature extraction technique using color and texture features based on the spatial domain and the transform domain, respectively, where the features are computed from an HSV color image.

Major Contribution.
The main contributions of the proposed image retrieval scheme are the following:
(i) A low-dimensional feature descriptor is constructed for image retrieval applications by fusing spatial domain-based color information with transform domain-based texture information. It reduces the computational overhead of retrieving images from large-scale datasets and so speeds up retrieval.
(ii) To extract the color information, a nonuniform quantized color histogram is used, and the color moments of the preprocessed image are then computed to form the color feature descriptor.
(iii) The interrelated information between blocks is extracted using a block-level DCT, where associativity is determined from the AC and DC coefficients of the image blocks.
(iv) The spatial relationship between the preprocessed AC and DC coefficients in the DCT domain is established using the GLCCM, and statistical parameters are then calculated from the corresponding GLCCMs to construct the texture feature descriptors.
(v) The color and texture information is integrated, and the proposed image retrieval scheme is validated on two benchmark databases, Corel 1K and UC Merced, on which diversified results have been achieved.

Organization of Paper.
The remainder of the paper is laid out as follows. The suggested CBIR approach, together with its feature extraction and retrieval procedure, is discussed in detail in Section 3. The experimental results and comments are detailed in Section 4, where a comparative study with existing retrieval methods is presented to quantify retrieval efficiency. Finally, Section 5 concludes the proposed work.

Proposed Image Retrieval Methodology
The proposed scheme comprises preprocessing, color and texture feature representation, and similarity distance-based image retrieval. Each is described in the following subsections.

Preprocessing.
The Laplacian filter [31] is used in the proposed scheme to sharpen color images. Because it is based on second-order derivatives, this filter produces a considerably enhanced version of the image, whereas other kernels such as Prewitt and Sobel are based on first-order derivatives. It also brings out fine thin lines and isolated points. The 3 × 3 mask (L1) with centre (−8), shown in equation (1), is used for filtering in the presented work, and it is applied at every position of the image by convolution. This mask is not limited to grayscale images; it may also be used on color images.

The HSV color image is decomposed into its three color components: H, S, and V. Color visual features are extracted from the H and S components of the HSV color image, while texture visual features are computed from the V component. The entire process of preparing an RGB image with multiple kernels is detailed in [34].
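The sharpening step described above can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the text only states that the mask is 3 × 3 with centre −8, so the surrounding entries of 1, the border replication, the "image minus Laplacian" sharpening rule, and the helper names `convolve3x3`, `sharpen`, and `sharpen_color` are all assumptions.

```python
def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to a 2-D list of numbers, replicating border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp row to the image
                    cc = min(max(c + dc, 0), w - 1)  # clamp column to the image
                    acc += kernel[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc
    return out


# Centre -8 is stated in the text; the neighbouring 1s are an assumption.
LAPLACIAN_8 = [[1, 1, 1],
               [1, -8, 1],
               [1, 1, 1]]


def sharpen(channel, lo=0.0, hi=255.0):
    """Sharpen one plane by subtracting its Laplacian response, then clip."""
    lap = convolve3x3(channel, LAPLACIAN_8)
    return [[min(max(p - l, lo), hi) for p, l in zip(prow, lrow)]
            for prow, lrow in zip(channel, lap)]


def sharpen_color(planes):
    """Apply the same mask independently to each plane of a color image."""
    return [sharpen(plane) for plane in planes]
```

Because the mask sums to zero, flat regions pass through unchanged, while isolated points and thin lines are amplified.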

Color Feature Representation.
An HSV color image is more intuitive and closer to people's subjective perception of color than images in other color spaces [35]. In this paper, an RGB color image is first converted into the HSV color model, and then some preprocessing steps are performed. This conversion is important because the RGB color space specifies the image in primary colors, which is less effective than the HSV color space for describing objects. The HSV color space defines an image in much the same way as the human eye interprets it, that is, in terms of color, vibrancy, and brightness. For easier computation, the values of H, S, and V must be transformed to a specified range based on the human perception system. The hue component is an angle ranging from 0 to 360 degrees, while the saturation and value components range from 0 to 1. The image in HSV color space is quantized according to the perception characteristics of the human eye, as described in [36]. The resulting $H'S'V'$ color image contains only 81 colors, far fewer than the original HSV color image. The statistical moments-based color information of an image captures important properties of the intensity level distribution, such as smoothness, uniformity, flatness, contrast, and brightness [37], which improves the retrieval efficiency of the system. The proposed color feature representation method computes the mean, standard deviation, skewness, and kurtosis from the $H'S'V'$ color image to form the color feature descriptor [38]. Let $\mu_{CC}$ be the mean, $\sigma_{CC}$ the standard deviation, $\gamma_{CC}$ the skewness, and $\kappa_{CC}$ the kurtosis of each color component $CC$, where $CC \in \{H', S', V'\}$. The statistical moments of the quantized color image are computed using equation (2), where $X_i$ is the $i$th pixel value in the $CC$ color component, $P(X_i)$ is the corresponding probability, and $T$ is the total number of pixels in the corresponding component.
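The four moments can be sketched as below. This is one plausible probability-histogram form, not necessarily the exact expression of equation (2): it assumes that $P(X_i)$ is the fraction of the $T$ pixels taking level $X_i$ and that skewness and kurtosis are standardized by $\sigma^3$ and $\sigma^4$. The function names `color_moments` and `color_descriptor` are hypothetical.

```python
from collections import Counter
from math import sqrt


def color_moments(values):
    """Return (mean, std, skewness, kurtosis) of one quantized component.

    `values` is a flat list of quantized pixel levels. P(x) is taken as the
    fraction of the T pixels that have level x (an assumption).
    """
    total = len(values)  # T, the number of pixels in the component
    prob = {x: n / total for x, n in Counter(values).items()}
    mean = sum(x * p for x, p in prob.items())
    std = sqrt(sum((x - mean) ** 2 * p for x, p in prob.items()))
    if std == 0:
        # Constant component: higher moments are undefined, report zeros.
        return mean, 0.0, 0.0, 0.0
    skew = sum((x - mean) ** 3 * p for x, p in prob.items()) / std ** 3
    kurt = sum((x - mean) ** 4 * p for x, p in prob.items()) / std ** 4
    return mean, std, skew, kurt


def color_descriptor(h_q, s_q, v_q):
    """Concatenate the moments of H', S', V' (3 components x 4 moments)."""
    fv = []
    for component in (h_q, s_q, v_q):
        fv.extend(color_moments(component))
    return fv
```

For example, `color_moments([0, 0, 2, 2])` describes a symmetric two-level distribution, so its skewness is zero.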
The mean $\mu$ is the average of the intensity values and describes the brightness of an image, while $\sigma$ measures the spread of the intensity values about the mean and defines the contrast of an image. The skewness $\gamma$ measures the asymmetry of the distribution about its mean. The kurtosis $\kappa$ measures the peakedness of the distribution of intensity values about $\mu$ and also indicates the outliers present in the distribution. Integrating the statistical moments from all three color components gives the color feature descriptor, which represents the color information of the image effectively. Hence, the color feature descriptor is defined as

$$FV_{Color} = \{\mu_{CC}, \sigma_{CC}, \gamma_{CC}, \kappa_{CC}\},$$

where $CC \in \{H', S', V'\}$. Since four moments are computed from each color component, the dimension of the color feature descriptor is 12. The steps for color feature representation are given in Algorithm 1.

3.3. Texture Feature Representation.
Sobel edge detection [39], the discrete cosine transformation [40], and the gray level co-occurrence matrix (GLCCM) [41] are used to extract texture information. The Sobel operator is a derivative-based approximation operator that performs a 2D gradient measurement on an image and accentuates regions of high spatial frequency that correspond to edges. As shown in Figure 1, the operator is made up of a pair of 3 × 3 convolution kernels.
One kernel is simply the other rotated by 90°. The fundamental goal of edge detection is to reduce the quantity of data in an image while keeping the structural properties that can be used for further image processing. There are several edge detection techniques; this study concentrates on the Sobel edge detection method.
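The Sobel kernel pair and the resulting gradient magnitude can be sketched as follows. Figure 1 is not reproduced here, so the kernel entries are the commonly used Sobel pair and are an assumption, as are the border replication and the helper names `correlate3x3` and `sobel_magnitude`. The `exact=False` branch uses the sum of absolute values, a cheaper approximation of the magnitude.

```python
from math import sqrt

# Commonly used Sobel pair (assumed); SOBEL_B is SOBEL_A rotated by 90 degrees.
SOBEL_A = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]    # responds to vertical edges
SOBEL_B = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]  # responds to horizontal edges


def correlate3x3(img, kernel):
    """Slide a 3x3 kernel over a 2-D list, replicating border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += kernel[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc
    return out


def sobel_magnitude(img, exact=True):
    """Gradient magnitude per pixel: sqrt(fa^2 + fb^2), or |fa| + |fb|."""
    fa = correlate3x3(img, SOBEL_A)
    fb = correlate3x3(img, SOBEL_B)
    if exact:
        return [[sqrt(a * a + b * b) for a, b in zip(ra, rb)]
                for ra, rb in zip(fa, fb)]
    return [[abs(a) + abs(b) for a, b in zip(ra, rb)]
            for ra, rb in zip(fa, fb)]
```

On a vertical step edge only the first kernel responds, so the exact and approximate magnitudes coincide.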
In the proposed method, the Sobel operator is applied to the Laplacian-based preprocessed V-component. The image is convolved with the given templates, and the gradient values in the horizontal and vertical directions are computed. These kernels are designed to respond efficiently to edges running vertically and horizontally relative to the pixel grid, with one kernel for each of the two perpendicular orientations. The kernels can be applied individually to the input image to yield separate measurements of the gradient component in each orientation, $f_a(a, b) = f_a$ and $f_b(a, b) = f_b$. Combining these two gives the absolute magnitude of the gradient at each position as well as the gradient's orientation. The magnitude of the gradient is calculated as

$$|\nabla f| = \sqrt{f_a^2 + f_b^2}.$$

The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by

$$\theta = \arctan\left(\frac{f_b}{f_a}\right).$$

When the orientation is 0, the largest contrast from black to white runs from left to right on the image, and all other angles are measured clockwise from the horizontal direction. For easy and quick computation, the magnitude is approximated as

$$|\nabla f| \approx |f_a| + |f_b|.$$

The discrete cosine transform converts an image from the spatial to the frequency domain and is commonly used in data reduction, feature extraction, and watermarking. An image block of size $Y \times Y$ is represented by $f(p, q)$. The 2-D DCT of a block can be defined as

$$F(s, t) = \alpha(s)\,\alpha(t) \sum_{p=0}^{Y-1} \sum_{q=0}^{Y-1} f(p, q) \cos\left[\frac{(2p+1)s\pi}{2Y}\right] \cos\left[\frac{(2q+1)t\pi}{2Y}\right],$$

where

$$\alpha(s) = \begin{cases} \sqrt{1/Y}, & s = 0 \\ \sqrt{2/Y}, & s \neq 0 \end{cases}$$

and $\alpha(t)$ is defined in the same way. The function $F(s, t)$ represents the DCT-transformed image corresponding to the given image block $f(p, q)$ with respect to the $(p, q)$ coordinates. The transformed image consists of DCT coefficients, which represent the image information.
The most important image information is concentrated in the upper left corner of the transformed image, known as the low-frequency band, while the lower right corner holds the insignificant information, known as the high-frequency band, which reflects the contour and other unnecessary information of the image. Some coefficients in the low-frequency band are selected, and the high-frequency band information is discarded completely. The value $F(0, 0)$ represents the DC coefficient, or the average energy of the image, and the remaining DCT coefficients are known as AC coefficients. To find the appropriate information, the transformed block is quantized using the standard quantization table [42]. Thereafter, selecting the DCT coefficients in zigzag scanning order gives the most appropriate image information; this order is shown in Figure 2.
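The block transform and zigzag selection described above can be sketched as follows. This sketch assumes the orthonormal DCT-II normalization and the JPEG-style zigzag order, and it omits the quantization step because the quantization table [42] is not reproduced in the text. The names `dct2`, `zigzag_indices`, and `dc_and_first_ac` are hypothetical.

```python
from math import cos, pi, sqrt


def dct2(block):
    """Orthonormal 2-D DCT-II of a square Y x Y block (list of lists)."""
    n = len(block)

    def alpha(k):
        return sqrt(1.0 / n) if k == 0 else sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for t in range(n):
            acc = 0.0
            for p in range(n):
                for q in range(n):
                    acc += (block[p][q]
                            * cos((2 * p + 1) * s * pi / (2 * n))
                            * cos((2 * q + 1) * t * pi / (2 * n)))
            out[s][t] = alpha(s) * alpha(t) * acc
    return out


def zigzag_indices(n):
    """(row, col) pairs of an n x n block in JPEG-style zigzag order."""
    order = []
    for d in range(2 * n - 1):  # d = row + col indexes one anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        cells = [(r, d - r) for r in rows]
        # Even diagonals are walked upwards, odd diagonals downwards.
        order.extend(reversed(cells) if d % 2 == 0 else cells)
    return order


def dc_and_first_ac(block, n_ac=8):
    """Return the DC coefficient and the first n_ac AC coefficients."""
    coeffs = dct2(block)
    order = zigzag_indices(len(block))
    dc = coeffs[0][0]
    ac = [coeffs[r][c] for r, c in order[1:1 + n_ac]]
    return dc, ac
```

A constant 8 × 8 block has all of its energy in the DC coefficient, which is a quick sanity check for the normalization.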
Since the image blocks contain overlapping or related information, DC and AC matrices are established for feature extraction in order to determine the associativity between blocks based on their DC and AC coefficients. The selection of DCT coefficients and the formation of the DC and AC matrices proceed as follows. For two adjacent image blocks,

$$DC_{df_i} = \left| DC_{b(i+1)} - DC_{b(i)} \right|,$$

where $b$ denotes a block, $DC_{b(i)}$ is the energy of the $i$th block, and $DC_{df_i}$ is the absolute difference between the next, $(i+1)$th, block and the current, $i$th, block. This value is computed throughout the image. For example, if the image size is $M \times N$ and the block size is 8 × 8, then the total number of blocks is $bn = (M \times N)/(8 \times 8)$. All the difference values are collected into a single array, given by (11). Similarly, eight AC coefficients are selected from each block in zigzag order, and a special kind of coding is performed on the selected coefficients between two adjacent blocks, where $j$ indexes the selected AC coefficients of each block. For one block, the values $AC_j$, $j = 1, 2, \ldots, 8$, are, for example, $(AC_1, AC_2, AC_3, AC_4, AC_5, AC_6, AC_7, AC_8) = (1, 1, 1, 1, 0, 1, 1, 0)$, that is, the binary code 11110110, whose decimal representation is 246; this decimal value is stored in the array $AC_{df_i}$. The values collected over the whole image are given by (13). Finally, the DC and AC matrices are constructed using (11) and (13), respectively. These matrices are used to form the texture feature descriptor.
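The block count, the DC differences, and the binary-to-decimal packing of the AC code can be sketched as below. The rule that produces each 0/1 flag from two adjacent blocks is not recoverable from the text, so `ac_code` uses a purely hypothetical comparison rule for illustration; only the packing in `bits_to_decimal` is confirmed by the worked example (11110110 gives 246). All function names are assumptions.

```python
def num_blocks(m, n, size=8):
    """Number of non-overlapping size x size blocks in an M x N image."""
    return (m // size) * (n // size)


def dc_differences(dc_values):
    """Absolute DC difference between each block and the next one."""
    return [abs(nxt - cur) for cur, nxt in zip(dc_values, dc_values[1:])]


def bits_to_decimal(bits):
    """Pack 0/1 flags (most significant first) into one integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def ac_code(ac_cur, ac_next):
    """Code the selected AC coefficients of two adjacent blocks.

    HYPOTHETICAL rule: a flag is 1 when the next block's coefficient is at
    least as large as the current block's. The paper's rule may differ.
    """
    bits = [1 if nxt >= cur else 0 for cur, nxt in zip(ac_cur, ac_next)]
    return bits_to_decimal(bits)
```

With eight flags per block pair the code always lies between 0 and 255, so the AC matrix can be treated as a 256-level image when the co-occurrence matrix is built.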
Require: an RGB color image
Ensure: color feature descriptor
(1) Take an RGB color image $I$ as input and preprocess it with the Laplacian operator to obtain $I'$. Transform the preprocessed image $I'$ into an HSV color image: $I_{HSV} = HSV(I')$.
(2) Decompose the HSV color image into its H, S, and V components, then quantize each one separately to get the $H'$, $S'$, and $V'$ components.
(3) For color feature representation, construct the probability-based histograms of the $H'$, $S'$, and $V'$ components, and compute the statistical color moments from them using equation (2).
(4) Finally, obtain the color feature descriptor by integrating all the computed statistical color moments using equation (3), which can be rewritten as $FV_{Color} = \{\mu_{CC}, \sigma_{CC}, \gamma_{CC}, \kappa_{CC}\}$.
ALGORITHM 1: Color information extraction.

The gray level co-occurrence matrix (GLCCM) is one of the important methods for examining the texture properties of an image. To obtain the texture feature descriptor, the GLCCM is applied to the DC and AC matrices separately, and feature vectors are created from selected GLCCM parameters. For a given pair of pixels in a specific direction ($\theta$) and at a particular pixel distance $d$, the GLCCM is defined as the frequency with which the elements $i$ and $j$ of the matrix occur together. The value of the GLCCM is denoted by $P(i, j \mid d, \theta)$; it is computed symmetrically throughout the matrix, and the size of the GLCCM depends on the number of gray levels. In general, the GLCCM method computes four matrices for the four directions 0°, 45°, 90°, and 135° at a constant pixel distance $d$. For texture feature representation, the GLCCM is first normalized to avoid variation among the elements of the matrix. The normalized element $P(i, j)$ of the GLCCM is defined as

$$P(i, j) = \frac{P(i, j \mid d, \theta)}{\sum_{i=1}^{G} \sum_{j=1}^{G} P(i, j \mid d, \theta)},$$

where $G$ is the total number of gray levels. A number of statistical parameters can be obtained from the normalized GLCCM to extract texture features; however, the computational requirements for considering all of them are too high to be practical. Therefore, only four parameters are considered for the construction of the texture feature descriptor in this work: contrast ($F_{con}$), angular second moment or energy ($F_{asm}$), entropy ($F_{ent}$), and correlation ($F_{cor}$). In the proposed method, these texture parameters are computed from the gray level co-occurrence matrices $GLCCM_{DC}$ and $GLCCM_{AC}$ of the DC- and AC-based components, respectively; four matrices are constructed for each using the four directions 0°, 45°, 90°, and 135°, and the corresponding energy, contrast, entropy, and correlation are computed. The collective values of these four parameters represent the texture feature descriptor of an image.
$$FV = \{F_{con}, F_{asm}, F_{ent}, F_{cor}\}$$

Let $FV_{DC}$ and $FV_{AC}$ represent the texture properties of the DC- and AC-based matrices, calculated using (16); the final texture feature descriptor is then obtained by combining them, as given in (17). The speed of retrieval is improved considerably by the low dimensionality of the texture feature descriptor. The whole process of computing the texture feature descriptor is presented in Algorithm 2.
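A co-occurrence matrix and the four parameters can be sketched as follows. The sketch uses the widely used Haralick-style definitions of contrast, angular second moment, entropy, and correlation, which are an assumption because the paper's own expressions are not reproduced in the text. The direction offsets, the natural logarithm in the entropy, the zero returned for an undefined correlation, and the names `glcm` and `texture_parameters` are likewise assumptions. The input must contain at least one valid pixel pair for the chosen direction.

```python
from math import log, sqrt

# (row, col) step for each direction at unit distance (assumed convention).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(matrix, levels, angle=0, d=1, symmetric=True):
    """Normalized co-occurrence matrix of an integer-valued 2-D list."""
    dr, dc = OFFSETS[angle]
    dr, dc = dr * d, dc * d
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(matrix), len(matrix[0])
    for r in range(h):
        for c in range(w):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                i, j = matrix[r][c], matrix[rr][cc]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both orders
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_parameters(p):
    """Return (contrast, ASM, entropy, correlation) of a normalized GLCM."""
    g = range(len(p))
    con = sum((i - j) ** 2 * p[i][j] for i in g for j in g)
    asm = sum(p[i][j] ** 2 for i in g for j in g)
    ent = -sum(p[i][j] * log(p[i][j]) for i in g for j in g if p[i][j] > 0)
    mu_i = sum(i * p[i][j] for i in g for j in g)
    mu_j = sum(j * p[i][j] for i in g for j in g)
    sd_i = sqrt(sum((i - mu_i) ** 2 * p[i][j] for i in g for j in g))
    sd_j = sqrt(sum((j - mu_j) ** 2 * p[i][j] for i in g for j in g))
    if sd_i == 0 or sd_j == 0:
        cor = 0.0  # assumption: report 0 when correlation is undefined
    else:
        cor = sum((i - mu_i) * (j - mu_j) * p[i][j]
                  for i in g for j in g) / (sd_i * sd_j)
    return con, asm, ent, cor
```

Calling `texture_parameters(glcm(m, levels, angle))` for each of the four angles, once on the DC matrix and once on the AC matrix, yields the values that would make up the two texture vectors.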
The block diagram of the proposed method is shown in Figure 3, where color and texture features are integrated to form the final feature descriptor. The final feature descriptor is obtained by concatenating the color descriptor (3) and the texture descriptor (17). This feature descriptor is not normalized, because its components are already in a suitable range and leaving it unnormalized does not degrade the retrieval accuracy. The final feature descriptors of all images in the digital repository and of the query image are constructed. The collection of final feature descriptors is stored in the feature database, and similarity distances between the query and the images in the digital repository are calculated from the feature descriptors. The images corresponding to the smallest distances are returned to the user as the top-ranked results for the query.

Similarity Measure.
To offer an appropriate response to an image query, a large image database requires both rapid comparison and rapid feature extraction. It can take a long time to look through every image in a huge image library. To measure the similarity between the query image and the database images, a distance is computed between the query feature descriptor and each database image feature descriptor. Because of its efficiency and effectiveness, the Euclidean distance is one of the most widely used measures for retrieval. Therefore, in the proposed system, the Euclidean distance (ED) is used as the similarity measure. It is the square root of the sum of the squared differences between the components of two image feature descriptors, and it is defined as $ED(fd_q, fd_t) = \sqrt{\sum_{i=1}^{d} \left(fd_q(i) - fd_t(i)\right)^2}$, where d is the dimension of the feature descriptor, and $fd_q$ and $fd_t$ are the feature descriptors of the query image and of a target image in the digital repository, respectively.
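The distance computation and ranking step can be sketched in a few lines. The function names and the array layout (one descriptor per row) are illustrative assumptions, not part of the original method description.

```python
import numpy as np


def euclidean_distance(fd_q, fd_t):
    """Square root of the sum of squared component differences."""
    fd_q = np.asarray(fd_q, dtype=float)
    fd_t = np.asarray(fd_t, dtype=float)
    return float(np.sqrt(np.sum((fd_q - fd_t) ** 2)))


def retrieve_top_k(fd_query, feature_db, k=10):
    """Return indices and distances of the k nearest database descriptors."""
    db = np.asarray(feature_db, dtype=float)       # shape: (n_images, d)
    query = np.asarray(fd_query, dtype=float)      # shape: (d,)
    dists = np.sqrt(np.sum((db - query) ** 2, axis=1))
    order = np.argsort(dists, kind="stable")[:k]   # smallest distance first
    return order, dists[order]
```

With 20-D descriptors the whole database can be compared in a single vectorized pass, which is where a low-dimensional descriptor helps retrieval speed.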

In general, any CBIR approach is divided into two phases. In phase I, all of the images in the dataset are read one by one for the extraction of color and texture information. The extracted combined features are stored as feature descriptors to establish a feature database. In phase II, the user provides a query image as input to retrieve similar images from the digital repository. The combined color and texture feature vector of the query is created in the same way and compared with the feature descriptors of the database images by computing the similarities using ED. The user is then shown the images that are most similar to the query image.

Evaluation Measurements.
This section describes the performance assessment measures, which not only evaluate the effectiveness of image retrieval but also check the stability of the findings, in order to demonstrate the success of the proposed image retrieval system. In the proposed CBIR approach, three measurements are employed to evaluate retrieval performance: precision, recall, and F-score. Precision and recall are defined as $P = A/B$ and $R = A/C$, where A denotes the number of relevant retrieved images, B denotes the total number of images retrieved from the dataset, and C is the total number of relevant images available in the dataset per category. As an illustration, suppose a CBIR technique retrieves a total of 10 images for a query image, of which 7 are relevant, from a total of 30 similar images in one database category. The precision is 7/10 = 70%, while the recall is 7/30 ≈ 23.33%. This shows that the recall rate alone is insufficient to establish the success of a CBIR method; precision must also be calculated. Precision and recall assess the accuracy of image retrieval with respect to the query and the database images. These two measurements, however, do not individually represent the overall retrieval accuracy, so they are combined into a single value, known as the F-score or F-measure, $F = 2PR/(P + R)$. When retrieving images from the Corel-1000 and UCML-2100 datasets based on a query image, the value of B is set to 10 or 20 and the value of C is set to 100. Since each image in a category is used as a query image, the accuracy must be presented in terms of averages.
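The average precision, recall, and F-score can be calculated as $@P_{avg}(M) = \frac{1}{nc}\sum_{i=1}^{nc} P(i)$, $@R_{avg}(M) = \frac{1}{nc}\sum_{i=1}^{nc} R(i)$, and $@F_{avg}(M) = \frac{1}{nc}\sum_{i=1}^{nc} F(i)$, where $@P_{avg}(M)$, $@R_{avg}(M)$, and $@F_{avg}(M)$ are the averages of precision, recall, and F-score, respectively, for image category M, P(i), R(i), and F(i) are the values obtained for the i-th query image of that category, and nc is the total number of images in each category. In this paper, the Corel and radar image datasets are used, so nc = 100 because each image class has 100 images.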
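The measures above can be checked with a small sketch that reproduces the worked example (7 relevant images among 10 retrieved, 30 relevant images in the category). The function names are hypothetical, and averaging the per-query F-scores directly, rather than deriving F from the averaged precision and recall, is an assumption.

```python
def precision_recall_f(relevant_retrieved, retrieved, relevant_total):
    """Precision A/B, recall A/C and their harmonic mean (F-score)."""
    p = relevant_retrieved / retrieved
    r = relevant_retrieved / relevant_total
    f = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f


def average_scores(per_query):
    """Mean precision, recall and F-score over all queries of one category."""
    n = len(per_query)
    return tuple(sum(scores[k] for scores in per_query) / n for k in range(3))


# Worked example from the text: A = 7, B = 10, C = 30.
p, r, f = precision_recall_f(7, 10, 30)
print(f"precision={p:.2%} recall={r:.2%} f_score={f:.2%}")
```

In the evaluation setting described above, `per_query` would hold one tuple for each of the nc = 100 query images of a category.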

Retrieval Results and Discussion.
The proposed method is evaluated with and without image preprocessing, using a simple fusion of color and texture descriptors. For the color descriptor, the HSV color model is quantized into nonuniform bins to obtain a new H′S′V′ color model consisting of 81 colors. Since four statistical color moments are computed from each quantized color component, the dimension of the color descriptor is 4 × 3 = 12-D. For the texture descriptor, the DC and AC matrices of the V component are computed (as discussed in detail in the texture feature representation section), and for each matrix the GLCM is computed for four directions, i.e., 0°, 45°, 90°, and 135°, at a constant pixel distance (d = 1). Four values are computed for the DC matrix and four for the AC matrix, so the dimension of the texture feature descriptor is 8-D. The dimension of the fused feature descriptor is therefore 12 + 8 = 20-D. The fused feature descriptors of all database images are constructed for the retrieval process. Table 1 shows the results for the top 10 retrieved images from the UC Merced land-use dataset with and without preprocessing, while Table 2 shows the same for the top 20 retrieved images. In both cases, the method with preprocessing gives better results than the method without preprocessing. For the top 10 images, preprocessing provides a large improvement in some categories. For example, for the best category, chaparral, precision increases from 90.52% to 99.50%, a gain of almost 9 percentage points. Similarly, for the worst category, roads, precision increases from 78.20% to 90.40%, a gain of about 12 percentage points.
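The dimension count (4 moments × 3 channels = 12-D, fused with 8-D texture to give 20-D) can be illustrated with a short sketch. The choice of the four moments below (mean, standard deviation, cube root of the third central moment, fourth root of the fourth central moment) is an assumption based on common CBIR practice, since this section does not list them; the function names and the random stand-in image are likewise hypothetical, and the nonuniform quantization step itself is not reproduced here.

```python
import numpy as np


def color_moments(channel):
    """Four assumed statistical moments of one quantized color channel."""
    x = np.asarray(channel, dtype=float).ravel()
    mean = x.mean()
    centered = x - mean
    std = np.sqrt(np.mean(centered ** 2))
    skew = np.cbrt(np.mean(centered ** 3))    # keeps the sign of the moment
    kurt = np.mean(centered ** 4) ** 0.25
    return np.array([mean, std, skew, kurt])


def color_descriptor(hsv_quantized):
    """Concatenate the moments of the three channels into a 12-D vector."""
    return np.concatenate(
        [color_moments(hsv_quantized[..., c]) for c in range(3)]
    )


def fuse(color_fd, texture_fd):
    """Simple fusion by concatenation: 12-D color + 8-D texture = 20-D."""
    return np.concatenate([color_fd, texture_fd])


# Stand-in for a quantized H'S'V' image: height x width x 3 integer bins.
rng = np.random.default_rng(0)
image = rng.integers(0, 9, size=(16, 16, 3))
descriptor = fuse(color_descriptor(image), np.zeros(8))
print(descriptor.shape)
```

The zero vector passed to `fuse` is only a placeholder for an 8-D texture descriptor.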
Overall, the mean precision, recall, and F-score of the proposed CBIR method are satisfactory for the top 10 and top 20 retrieved images from the digital repository. The tables also show that the accuracy increases markedly from the top 10 to the top 20 retrieved images. The F-score results indicate that combining statistical texture features with the different color quantization gives the highest retrieval performance for the chaparral class.
Similarly, results for the Corel-1K dataset are reported for the top 10 retrieved images with and without preprocessing. Table 3 shows the retrieval results for the top 10 images using the preprocessed technique on the Corel-1K image dataset. In this table, the proposed CBIR method produces the highest results for dinosaur images, while the lowest retrieval results are obtained for the mountain category. For the top 20 images, Table 4 shows the retrieval results, where most categories have good retrieval results but beach and mountain images have the lowest results because the images have complex structures and their contents are mixed with each other. The overall means of recall, precision, and F-score for the top 10 retrieved images without preprocessing are 8.18%, 81.53%, and 14.87%, while they become 9.21%, 92.11%, and 16.75% with preprocessing. Hence, preprocessing yields a large improvement. The table also shows that the accuracy decreases slightly from the top 10 to the top 20 images. The retrieval results for the top T − i (i = 10, 20, 40, 60, 80, and 100) images are shown in Figure 6 for the radar remote sensing image dataset in terms of average precision, recall, and F-score, and in Figure 7 for the Corel-1K image dataset, where T − i represents the number of images retrieved from the dataset. For visualisation purposes, we have selected the two best and two worst categories from both the radar remote sensing image dataset and the Corel-1K image dataset. Figure 8 shows the best retrieval results for the top 20 images from the radar remote sensing image dataset for the chaparral and forest categories; these results hold only for the given query images.
For the chaparral query image, the values of precision (P), recall (R), and F-score (F) are 100.00%, 20.00%, and 33.34%, respectively, while the technique gives P = 80.00%, R = 18.00%, and F = 29.38% for the forest image category. For the worst-case results, Figure 9 depicts the retrieved images for the beach and grounds categories, where the first image in the top left corner of each set is the query. Figure 10 depicts the top 20 images for the dinosaur and horse categories from the Corel-1K image dataset, where the images in the top left corner are the queries. The results may change if different query images are selected, but across all categories, dinosaur and horse images give the best results. For the lowest results, the beach and mountain image categories are shown in Figure 11, where the blue cross symbol × marks the nonrelevant images.

Comparative Study and Analysis.
To check the relative efficacy of the proposed CBIR method, the authors have compared the results with ten related state-of-the-art CBIR schemes [24][25][26][27][28][29][30][31][32][33]. The discussion and analysis in relation to our proposed scheme are as follows. Most of the above-mentioned papers report very good retrieval results, but they have limitations in terms of high-dimensional feature descriptors, retrieval speed, and accuracy. In contrast, our proposed method has a high retrieval speed and low-dimensional feature descriptors, with satisfactory results in terms of average precision, recall, and F-score. The proposed method is compared with these methods in terms of average precision (@P avg), recall (@R avg), and F-score (@F avg) in Tables 5-7, respectively; the bold values in the tables indicate the means. In the CBIR schemes [24][25][26][27]29], mountain category images have the minimum retrieval accuracy, while the beach image category has the minimum accuracy in schemes [28,32] in terms of @P avg, @R avg, and @F avg. The elephant images have the minimum accuracy in CBIR scheme [30], while scheme [31] has the lowest accuracy for the monuments image category. Lastly, the foods image category has the lowest accuracy in CBIR scheme [33]. Most of the images in the mountain, beach, elephant, and monuments categories have blue color contents that are mixed with each other, while the food category images have very complex structures, so it is very difficult to distinguish the actual image contents and classify such categories. Strong feature extraction methods are therefore required. In our proposed method, mountain images have the lowest accuracy, with 72.70% precision, which is still acceptable. The overall means of @P avg, @R avg, and @F avg are 86.89%, 17.38%, and 28.96%, respectively, for the proposed scheme. Compared with the other state-of-the-art methods, the proposed scheme produces good retrieval results in most of the categories. These results indicate that our proposed scheme achieves better retrieval results than the existing CBIR schemes.

Conclusion
A novel content-based image retrieval strategy focusing on key color and texture feature descriptors is presented in this research. Color information is extracted from a quantized image using color moments. Texture information based on the DCT is calculated using the Sobel edge detection algorithm and the GLCCM approach. The single fused feature descriptor has a relatively small dimension, allowing it to represent an original image in a compact format without sacrificing retrieval performance while increasing the system's retrieval speed. The experimental results are compared with several state-of-the-art algorithms using two benchmark datasets, and it is concluded that the average precision, recall, and F-score rates of the proposed method are competitive. In the future, deep learning-based features could be combined with the presented feature extraction technique, which could be useful in a variety of real-world applications.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.