Feature Signature Discovery for Autism Detection: An Automated Machine Learning Based Feature Ranking Framework

Autism spectrum disorder is the most widely used umbrella term for a myriad of neuro-degenerative/developmental conditions typified by inappropriate social behavior, lack of communication/comprehension skills, and restricted mental and emotional maturity. An intriguing aspect of this disorder is that it can be detected only by close monitoring of developmental milestones after childbirth. Moreover, the exact causes of this neurodevelopmental condition are still unknown. Besides, autism is prevalent across individuals irrespective of ethnicity, genetic/familial history, and economic/educational background. Although research suggests that autism is genetic in nature and that early detection can greatly enhance the independence and societal adaptability of affected individuals, there is still a great dearth of facts and figures to support these statements. This research work places emphasis on the application of automated machine learning combined with feature ranking techniques to generate significant feature signatures for the early detection of autism. Publicly available datasets based on the Q-chat scores of individuals across diverse age groups (toddlers, children, adolescents, and adults) have been employed in this study. A machine learning framework based on automated hyperparameter optimization is proposed in this work to rank the potential nonclinical markers for autism. Moreover, this study aimed at ranking the AutoML models based on the Matthews correlation coefficient (MCC) and balanced accuracy, via which nonclinical markers were identified from these datasets. Besides, the feature signatures and their significance in distinguishing between classes are being reported for the first time in autism detection. The proposed framework yielded ∼90% MCC and ∼95% balanced accuracy across all four age groups of autism datasets.
Deep learning approaches have yielded a maximum of 92.7% accuracy on the same datasets but are limited in their ability to extract significant markers, have not been reported with MCC for unbalanced data, and cannot adapt automatically to new data entries. In contrast, AutoML approaches are more flexible, easier to implement, and provide automated optimization, thereby yielding the highest accuracy with minimal user intervention.


Introduction
Developmental disorders are chronic disabilities, and in recent years autism has gained importance across the globe owing to the increasing number of families facing a dilemma while raising affected children [1]. This has caused anxiety about the future of autistic children, their acceptance in society, personal and professional competence, and the need for individual attention. Research in the sphere of autism disorders suggests that autism, when diagnosed early, can be treated effectively, although a complete cure at present is considered impossible. A research team at the University of Missouri reported that autistic children and healthy controls share quite a few typical facial attributes. This study was based on the images distributed by the Kaggle database [2,3].
Detecting autism without delay is pivotal to choosing the most appropriate course of therapy, deciding on the level of attention and personal care for the child, managing patient expenses, and prioritizing the nature of schooling and education. The societal stigma associated with Autism Spectrum Disorders (ASD) has escalated the mental trauma faced by families. Early detection of this neurological condition is expected to assist in timely therapy and nurturing for the affected child and family [4]. Complete dependence on physicians and scientists to research and analyze the causes, symptoms, nature, and therapeutic measures to treat ASD is a labor-intensive task that would consume substantial resources in terms of money, time, and expertise [5,6]. This paper proposes a hybrid computational framework based on automated machine learning techniques to unveil the most crucial factors that can enable early and timely detection of the disorder.
Artificial intelligence (AI) solutions are influencing every sphere of life, ranging from finance and education to the fields of medicine and defense. Machine learning, one of the major building blocks of AI, has been studied, and diverse techniques have been proposed in the recent past for computer-aided diagnosis of autism [4][5][6][7][8]. Interactive mobile and web applications have been deployed as computerized personal assistants to support the convalescence and treatment of autistic patients [9][10][11]. Several scientists have proposed machine learning algorithms for the classification of neurodegenerative ailments, such as schizophrenia [12][13][14], dementia, depression, and other psychiatric disorders [15][16][17] from MRI data. Most of the previous work is related to classifying autism from the brain or facial images of individuals.
Machine learning techniques have been proposed in the past for autism classification from clinical datasets obtained by collating real-time questionnaires from parents and healthcare workers. These have been made publicly available as four different nonclinical ASD datasets. Recent research has now delved into the use of deep learning techniques for image classification and text analysis. Many recent reports in the literature have compared the performance of conventional and deep learning approaches for autism classification [12][13][14][15][16][17][18]. Presently, state-of-the-art methods have placed emphasis on automated machine learning (AutoML) [19,20], a new domain of machine learning that automates hyperparameter selection, feature ranking, and evaluation protocols. These methods aim at obtaining the highest accuracy in distinguishing between healthy and affected patients while also disclosing the most important feature combinations that contribute to the accurate categorization of affected individuals.
Autism needs to be initially identified by the caretakers of the infant/child. Hence, nonclinical marker detection plays a crucial role in enabling parents/family members to easily identify the level of developmental delay in the child [1][2][3]. The development of a machine learning-based computational framework that reveals the potential nonclinical markers for autism would enable even a person without medical training to identify the possibility of autism in their ward and seek early medical advice. This research work focuses on achieving the following main objectives: (i) Propose an AutoML-based computational framework that combines the best feature ranking and classification approach to generate high classification accuracy. (ii) Identify the role of potential nonclinical markers in the order of increasing importance (feature signatures) that can detect autism with minimal, yet significant information. (iii) Compare the use of traditional, deep learning, and AutoML techniques in classifying autism from nonclinical data.
The organization of the research article is planned as follows: this section is followed by the state-of-the-art on the recent research findings in the sphere of computational classification of ASD. Later, the article describes the materials and proposed methods carried out in this work, which is followed by a detailed description of the experimental results and analysis. Discussions on the findings of this research are presented after the results, followed by the conclusions from the work.

Literature Survey
Autism Spectrum Disorders have been studied by researchers for ages, and the findings indicate that ASD diagnosis has adopted two main approaches: categorical and dimensional. Volkmar et al. [21] stated that for clinical practice in real-time settings, the diagnosis of autism was inclined towards adopting an idiographic approach that placed emphasis on the characteristics and symptoms of the specific individual. The authors observed that in recent years, categorical approaches had resorted to the use of Research Diagnostic Criteria (RDC) and had great value for maintaining clinical records for statistical analysis. However, the downsides included handling some critical challenges like defining thresholds for certain conditions, linking symptoms with comorbidities, recognizing developmental changes, etc. Hence, this motivated the authors to focus on detecting the potential features from data obtained by observation and questionnaires that could enable caretakers of children to identify autism early.
Ahmed et al. [22] proposed a classification prototype that was a fusion of a restricted Boltzmann machine (RBM) and a support vector machine (SVM), the former being used for feature selection from fMRI images while the latter was utilized for binary classification between healthy and affected individuals. A myriad of data processing steps that included slice time correction and normalization were also performed before generating the machine learning models. Their dataset comprised 105 typical control (TC) and 79 ASD subjects from the authentic ABIDE data repository. Their findings suggested that the amalgamation of RBM and SVM methods may be applied as a future tool to diagnose ASD. The authors' report does not include a comparison with other traditional/deep learning models, and neither does the work report on the ranking of facial features and how they are distinguished between the target classes.
Mohanty et al. [23] reported their classification results on publicly accessible and authenticated ASD datasets from the UCI machine learning (ML) repository and Kaggle. The authors proposed a deep learning-based classifier on the child dataset gathered from the UCI repository. They analyzed two types of data: complete data and data with missing values. The diffusion mapping feature selection method was utilized, and classification was done by implementing a deep neural network classifier. Their proposed method yielded an accuracy of 94% on complete data and 92% on missing data, which was higher than the traditional machine learning models. However, their work did not report on the importance of specific markers for autism detection but rather focussed only on the classification performance. Moreover, deep learning approaches require machine learning expertise for accurate implementation and proper hyperparameter optimization to generate higher accuracy.
Alsaade et al. [1] proposed a novel deep learning-based system designed to detect autism from facial images. The authors identified the need for a unique technology that could extract significant facial features/patterns to distinguish between autistic and nonautistic facial images. They proposed a simple web application using a deep learning approach based on a convolutional neural network with transfer learning and the Flask framework. Xception, Visual Geometry Group Network-19 (VGG19), and NASNetMobile were chosen to be the models that were pretrained and utilized for categorization. The publicly available facial image dataset from Kaggle was utilized for this purpose. Their results recorded that Xception was the most accurate with 91% correctly classified samples, followed by VGG19 (80%) and NASNetMobile (78%). This work is limited by the fact that the facial images available in the Kaggle repository belong to children aged 5 and above. Moreover, this system will not be interpretable to amateur child caretakers who will remain unaware of their ward's neuro-developmental state.
Several studies were undertaken on classifying ASD from images through traditional machine learning models, hybrid models that fuse feature selection with classification, deep learning models, and AutoML models [20][23][24][25][26][27]. A concise review of the recent research on ASD classification is depicted in Table 1.
However, thus far in research, there have been only comparisons of different ML models based on their performance in classifying autism using the UCI ML and Kaggle toddler datasets [28][29][30][31]. Both deep learning and traditional methods have been applied, and their results have generated reasonable accuracy. However, the methods portrayed latent inadequacy since (i) the models required human intelligence and utilized trial-and-error experiments for hyperparameter optimization, (ii) research on AutoML models for classifying autism across all age groups has not been reported thus far, and (iii) most of the previous work in the literature has relied on heavy data preprocessing, which is perceived to be the reason that their models could not yield a high Matthews correlation coefficient on unbalanced datasets such as the toddler dataset from Kaggle.
This research work contributes to the current state of autism classification through machine learning models by aiming at the following three main objectives: (i) Design of a hybrid AutoML-based machine learning framework that interprets the role of potential attributes (feature signatures) for detecting autism. (ii) Automation of algorithmic parameters during execution such that in the event of new data being added to the training set, the use of AutoML approaches would optimize the model parameters and tune the functions to yield the highest performance. (iii) Comparison of the performance of traditional, deep, and AutoML approaches in autism classification on all 4 datasets employed in this study by utilizing metrics suitable for both balanced and unbalanced datasets.

Materials.
This study concentrates on four autism datasets: (i) the child autism dataset (UCI), (ii) the adolescent autism dataset (UCI), (iii) the adult dataset (UCI), and (iv) the toddler dataset (Kaggle). Thabtah [29,30] provided the data to screen autism across all age groups. Table 2 summarizes the description of the attributes of toddlers.
The authors noted that most of the predictor attributes were similar in all four datasets. The type of data is diverse across the datasets. All datasets have missing values. The UCI dataset incorporates one additional feature that records whether individuals have already used the screening app. One of the objectives of this research is to find the most crucial questions/observations that could lead to early and accurate detection of autism in a noninvasive and less stigmatic manner.
The number of instances and attributes for all the datasets included in this study is graphically represented in Figure 1.
The software platforms and machine learning models based on automated machine learning are described in the following subsection.

Methods.
The software suite utilized in this research is the JADBio Automated Machine Learning Platform [32]. AutoML tools have been utilized in recent years to generate robust predictive and diagnostic models that do not carry the traits of the black box approach. The most recently unveiled Just Add Data Bio (JADBio) platform (https://www.jadbio.com) is an AutoML technology that is very flexible and user-friendly, with a highly interpretable interface that generates exhaustive reports on every machine learning task [33]. Such tools are readily applicable to data of diverse natures and have inbuilt automated data preprocessing techniques that lead to the generation of highly accurate predictive models and feature signatures.
To the best of our knowledge, no other AutoML platform can identify small-size feature signatures or generate such accurate interpretable results in less time.
The AutoML framework proposed in this article is portrayed in Figures 2 and 3. The framework comprised two phases: a training phase and a validation phase. Initially, the data was uploaded, following which the suite provided an option to transform the data if necessary. In all datasets, the sample ID and the sum of the Q-chat score results were removed as part of the transformation. This was done to identify which questions in the questionnaire held the most significance in the order of increasing importance according to the best-performing AutoML model.
The transformed data were then loaded onto the AutoML software suite, and the analysis commenced after the software acquired certain user inputs such as the option to choose between interpretable models or the best-performing models. The users can also choose to perform feature selection or aggressive feature selection. The choice of evaluation metric to rate the performance of the models and the number of CPU cores for large data were also available. In this work, the authors chose to analyze using the best-performing model with feature selection and aggressive feature selection. Aggressive feature selection differs from feature selection as it places focus on generating the most crucial and small-sized feature signature even at the loss of predictor performance.
Computational Intelligence and Neuroscience
The proposed framework commenced with the data preprocessing phase, and as is evident from Table 1, the data comprised both numerical and categorical values. Hence, to replace the missing values, the mean of the available numerical values and the mode (most frequently occurring) of the available categorical values were substituted [34]. This was followed by standardization and normalization to bring the features of the input dataset onto a comparable range, so that large values in the dataset did not impact the training process adversely [35].
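The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the JADBio implementation, which is not described in this article; the function names and the toy `ages` and `ethnicity` columns are hypothetical and serve only to show mean/mode imputation followed by scaling.

```python
from collections import Counter
from statistics import mean, pstdev


def impute_numeric(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace missing entries (None) with the most frequent category (mode)."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Rescale a numeric column to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy columns with one missing value each.
ages = impute_numeric([24, None, 36, 30])
ethnicity = impute_categorical(["asian", None, "asian", "hispanic"])
z_ages = standardize(ages)
scaled_ages = min_max_scale(ages)
```

Imputing before scaling matters here: the fill value becomes part of the column whose mean and spread are then used for standardization.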
The analysis of the model began with feature selection. AutoML in JADBio implemented the SES (statistically equivalent signature) and LASSO (least absolute shrinkage and selection operator) methods. The SES algorithm works on the principle of Bayesian networks' constraint-based learning [36].
The SES method manages to unearth multiple predictive feature subsets whose performances are statistically equivalent [36].
LASSO is a linear model that uses the following cost function:

$$\frac{1}{2N}\sum_{i=1}^{N}\Bigl(y_i - \sum_{j} a_j x_{ij}\Bigr)^2 + \alpha \sum_{j} \lvert a_j \rvert,$$

where $N$ is the number of samples, $y_i$ is the target value of the $i$th sample, $x_{ij}$ is the value of the $j$th feature for the $i$th sample, and $a_j$ is the coefficient of the $j$th feature. The final term is called the $L_1$ penalty, and α is a hyperparameter that tunes the intensity of this penalty term. The larger the magnitude of a feature's coefficient, the higher the value of the cost function. LASSO feature selection aims to optimize the cost function by minimizing the absolute values of the coefficients [37]. For optimum results, this feature selection method should be applied to data that have been scaled using standardization. Since this machine learning technique follows an automated approach, the α hyperparameter value is automatically selected by a cross-validation approach. When the coefficient of any feature is 0, that feature is discarded. In short, the model fits a lasso regression on a scaled version of the data and retains only those features whose coefficient is different from 0.
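To make the selection rule concrete, the following sketch minimizes an L1-penalized least-squares cost by cyclic coordinate descent and keeps the features with nonzero coefficients. It is an illustration under stated assumptions, not the JADBio procedure: the solver, the fixed `alpha` (the article selects α by cross-validation), the `1/(2N)` scaling of the squared error, and the toy data are all choices made here for illustration.

```python
def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2N)) * sum_i (y_i - sum_j a_j x_ij)^2 + alpha * sum_j |a_j|.

    Assumes the columns of X and the target y are already centered.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(coef[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            coef[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return coef


# Hypothetical standardized data: feature 0 drives y, feature 1 is unrelated.
X = [[-1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]
y = [-2.0, -2.0, 2.0, 2.0]

coef = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, a in enumerate(coef) if a != 0.0]
```

On this toy data the unrelated feature receives a coefficient of exactly zero and is dropped, while the informative one is kept with a shrunken coefficient. Raising `alpha` far enough removes every feature, which is why the choice of α is left to cross-validation in practice.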
Feature selection is followed by classification, and the best-performing model was an SVM of type C-SVC using a linear/polynomial/radial basis function kernel with the cost and gamma hyperparameters set to 1 [38]. The detailed results are discussed in the next section. SVM has been very widely used in the medical literature to solve diverse classification problems. An exhaustive description of the algorithmic principles and hyperparameters is available in [39][40][41].
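For readers unfamiliar with the three kernel families named above, the sketch below writes them out in their commonly used forms with gamma set to 1, as reported. The polynomial `degree` and `coef0` values are not stated in this article and are assumptions here, as are the two toy answer vectors.

```python
import math


def linear_kernel(u, v):
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, gamma=1.0, coef0=0.0, degree=3):
    """(gamma * <u, v> + coef0) ** degree; degree and coef0 are assumed values."""
    return (gamma * linear_kernel(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    """exp(-gamma * ||u - v||^2); equals 1 when the two vectors coincide."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# Two hypothetical respondents described by three binary questionnaire answers.
subject_a = [1, 0, 1]
subject_b = [1, 1, 1]

k_linear = linear_kernel(subject_a, subject_b)
k_poly = polynomial_kernel(subject_a, subject_b, degree=2)
k_rbf = rbf_kernel(subject_a, subject_b)
```

A C-SVC classifier evaluates one of these functions between a new sample and each support vector; the cost parameter controls how heavily training errors are penalized.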

Experimental Results and Performance Analysis
The results of this work are reviewed in three sections. The first section elaborates on the performance metrics and the evaluation methods that were adopted in this study. The second section depicts the feature selection and classification results based on the selected metrics. The third section describes the comparative performance of AutoML classifiers with previously reported records on the same autism datasets.

Evaluation Metrics and Methods.
The standard metrics for unbalanced data were adopted to rate the AutoML models in this study. It can be noted from the graph in Figure 4 that the toddler and adult datasets have a high class imbalance between autistic and control subjects. Hence, the authors proposed to use MCC [42] and balanced accuracy, which have been reported in the literature to be the most reliable for evaluating predictions on datasets with class imbalance as well as on balanced data. This is the first time in the literature that these metrics have been adopted to classify ASD datasets. Both these metrics are computed from the confusion matrix, which records the number of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). In this study, if the subject is nonautistic, it is considered a positive case. Based on these terms, the MCC and balanced accuracy were calculated as follows [43,44]:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$

$$\text{Balanced accuracy} = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right).$$

The performance evaluation method adopted in this work was a repeated 10-fold cross-validation technique, wherein the training dataset was divided into 10 groups (folds) and, in each iteration, 9 folds were used for training the model and one fold was used to test the trained model. This was repeated 10 times, and the average MCC and balanced accuracy have been reported. This is a standard statistical method used in data mining and machine learning for evaluating classifier performance. The standard confidence interval was set to 95% while performing the experiments.
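As a worked illustration of why these two metrics are preferred over plain accuracy on unbalanced data, the sketch below computes both from confusion-matrix counts using their standard definitions. The helper names and the ten toy labels are hypothetical; following the convention stated above, the nonautistic class ("No") is treated as the positive class.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, TN, FN) for the given positive class label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (TP rate) and specificity (TN rate)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (sensitivity + specificity)


# Hypothetical unbalanced sample: 8 control subjects ("No") and 2 autistic ("Yes").
y_true = ["No"] * 8 + ["Yes"] * 2
y_pred = ["No"] * 7 + ["Yes"] + ["Yes", "No"]

counts = confusion_counts(y_true, y_pred, positive="No")
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mcc_value = mcc(*counts)
bal_acc = balanced_accuracy(*counts)
```

In this toy case plain accuracy is 0.8, while balanced accuracy is 0.6875 and MCC is 0.375, because only one of the two minority-class subjects is recognized. This gap is the behavior that motivates reporting MCC and balanced accuracy on the unbalanced datasets.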

Feature Signature and Classification.
A feature signature is the combination of features from the original dataset that are highly predictive of the target class. The cumulative results of the proposed AutoML framework are tabulated in Table 3.
The feature signature is generated by AutoML predictors in the order of increasing importance. It can be noted from the performance metrics tabulated in Table 3 that the feature signature generated by the feature selection method on all four datasets has yielded high MCC and balanced accuracy, almost comparable to the performance of the full feature selector. However, in the case of the adolescent dataset, the MCC shows considerable variation, especially while implementing aggressive feature selection.
To ascertain the contribution of each individual component to the feature signature, the authors attempted to identify the role of each feature in enhancing predictive performance. Since the aggressive feature selection methods try to minimize the feature signature size at the cost of the model's predictive performance, as witnessed from the results in Table 1, the authors focussed their attention on the best-performing AutoML models with feature selection alone. Figures 5 and 6 portray the reduction in loss of predictive performance as each attribute is added to the feature signature in the decreasing order of importance on the toddler and child datasets, respectively.
The feature signature on the toddler and child datasets denotes that there is very little space for improving the performance, as both graphs indicate a maximum reduction of ∼0.2% in the predictor performance.
The country of residence is a new attribute added to the feature signature in the age groups over 11, as noted from Figures 7 and 8, and it contributes to almost maximum MCC, while almost all Q-chat attributes seem to be playing a very prominent role in both these feature signatures.
It is evident from the feature signatures generated across all age groups that the questions in the Q-chat alone will suffice to detect autism in young children based on observation of behavior with more than 90% accuracy. Moreover, the use of the autism screening apps, ethnicity, or the existence of autism in the family has not surfaced as a ranked feature in any of the four feature signatures that generated high predictive performance.
Recent work in the literature that has reported on the use of the JADBio AutoML framework on other biological datasets has recorded only the training estimates. The authors in this work proceed to validate the performance metrics generated by JADBio using validation data. Four metrics to test the validity of the generated AutoML models across all four datasets have been utilized, as seen in Table 4. The out-of-sample testing (OOST) metrics have been recorded for the first time on autism data. This is a measure of how well the model predicts a sample that it has not been trained on, that is, totally new test data. The fewer the false predictions (FP and FN) in OOST, the more robust the model.
The tabulated results clearly reveal that the OOST metrics have very few misclassifications across all age groups. Even aggressive feature selection models have depicted a classification performance comparable to feature selection-based models on the validation data.

Comparison to Previous Work.
Since no previous literature is available on AutoML models for autism classification from nonclinical data and most of the existing literature has reported only on the accuracy using traditional and deep learning models, the authors make a comparison based on the accuracy of the AutoML models on the ASD datasets.
The comparative study of results in Table 5 clearly reveals that the AutoML methods are most suitable for improving predictor performance on balanced and unbalanced data. The accuracy of the proposed methods was obtained by using repeated 10-fold cross-validation. However, since earlier results have reported a higher accuracy of ∼100% on the child dataset, the authors attempted to validate the AutoML model against a test dataset. The prediction accuracy was close to 100% using the AutoML model on the child dataset when all the features were included for classification.
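The repeated 10-fold cross-validation protocol behind these accuracy figures can be sketched as an index generator. This is a plain, unstratified illustration with hypothetical names; the splitting strategy used internally by the AutoML platform is not described in this article.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV.

    Each repeat reshuffles the sample order, then every fold serves once as
    the test set while the remaining folds form the training set.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_folds):
            test = order[fold::n_folds]
            test_set = set(test)
            train = [i for i in order if i not in test_set]
            yield repeat, fold, train, test


# With 50 hypothetical samples this produces 10 x 10 = 100 train/test splits.
splits = list(repeated_kfold_indices(50))
n_splits = len(splits)
```

A model would be fitted on each `train` split and scored on the matching `test` split, and the reported value is the average of the scores over all splits.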

Discussion
Autism-affected individuals, irrespective of their age, gender, social status, and educational backgrounds, are subjected to intense social stigma, and this affects not only the individual but also their families and society at large. This research work was undertaken to identify the exact combination of features that could enable early detection of autism, based on questionnaires to caretakers of infants/toddlers who are at high risk. Research has suggested possible links towards certain high-risk factors like premature gestation delivery, adopting improper delivery mechanisms, unprescribed medical dosages during pregnancy, increased maternal age, and genetic factors [45,46].
The feature signatures generated for autism detection across all age groups clearly reveal that questions on social interaction and general behavior would suffice to identify the possibility of autism development. This would surely enable caretakers and physicians to ensure that every child is monitored during regular clinic visits and general health checks for behavioral and social interaction abilities.
This discussion is restricted to identifying the feature signatures for toddler and child datasets since adolescents and adults, by general behavior, reveal their neurodegenerative status. However, those datasets were included in this study for validating the AutoML classifier performance. It is also to be noted that although multiple signatures could be generated for a single dataset, the proposed AutoML models generated only a single distinct signature for each of the four datasets.
The feature signature for toddlers: A9, A7, A2, A6, A5, A4, A8, A1, and A3.
The feature signature for children: A10, A4, A9, A5, A8, A1, A3, A6, A7, and A2.
The feature signatures indicate that all questions of the Q-chat as described in the attribute description section are required for detecting autism. Besides, the authors also tried including the Q-chat score result as a predictor in the data. In that case, the AutoML model selected only that 1 predictor as a feature signature. The result is depicted in Figure 9.
The graph and histogram are shown for the class probability "Yes", which indicates the probability of autism based on the Q-chat score. When the score value is above a certain threshold (∼30%), all subjects are classified as autistic. This is valid proof that the AutoML model is robust in its performance and automatically updates itself as per the changes in the training dataset. The graph showing the class distribution based on the Q-chat score is portrayed in Figure 10.
The authors also elaborate on the difference between feature selection, aggressive feature selection, and the basic classification model. Feature selection works by repeated epochs where the focus is placed by the AutoML model only on achieving a high performance metric.
Aggressive feature selection concentrates more on the minimal signature size and reveals the most reduced signature with a reasonable performance above a certain threshold as decided by the AutoML model itself. Feature selection is crucial to this study as the proposed work has identified that ethnicity/jaundice/exposure to app/country of residence are not potential nonclinical markers for autism, especially during the infancy stages and early years of growth. Besides, the authors confirm the effectiveness of the signature markers identified by the AutoML hybrid technique for feature selection and classification by visualizing the projection of the data using principal component analysis [32,47] and recording the planes that retained most of the original data distribution. This information is available in the Supplementary files S1-S4 and S5-S8 for the child and toddler datasets, respectively. The PCA projects the data using the information contained in the feature signatures and can project ∼95% of the data on the lower dimensional plane. The principal component values and the corresponding density plots that are projected for the respective target classes are available as Supplementary files for both the toddler and child datasets.
Probability values should be close to 1 for the positive class and close to 0 for the negative class when plotted according to the predictors with high performance [47]. It is inferred from the present study that the model selected by AutoML (SES + SVM) can distinguish with precise probability the correct class of data samples for the child dataset. The model, however, reveals minor overlaps for the toddler data. This is attributed to the relative class imbalance in the dataset, which may be resolved by augmenting the data with more samples to balance the class ratio. The exact probability values for each of the samples using the feature signature generated are provided as supplementary files for both the toddler and child datasets.
The authors also explored the performance of traditional learning models on all four datasets to further establish the superior performance of AutoML models in terms of time, performance, and minimal expert inputs. The comparative performance using the ORANGE data mining tool [48] is depicted in Table 5. The results are shown as a table and a graph. The graph portrayed in Figure 11 displays the autism classification accuracy of K-nearest neighbor (K-NN), naïve Bayes classifier (NB), random forest (RF), and AdaBoost algorithms [49] on the four datasets with all the features.
It is to be noted here that the previous work has reported the accuracy using different performance evaluation methods like train-test ratio, cross-validation, and real-time test datasets. This result is reported using the standard 10-fold cross-validation method and clearly reveals the higher performance of AutoML models in classification. Furthermore, the authors also highlight the role of feature signatures in classification using conventional learning methods. The FCBF (fast correlation-based filter) algorithm uses the idea of identifying features that are highly indicative of the target and highly independent of the other variables as well. The algorithm relies on a metric called Symmetrical Uncertainty (SU) that uses information theory [50]. The difference here is that the authors need to choose the number of features to be ranked, and these features are to be given to the classifier. The FCBF algorithm generated slightly different feature signatures than the ones chosen by the AutoML models. The results are tabulated in Table 6.
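The Symmetrical Uncertainty score on which FCBF relies can be illustrated as follows. The sketch uses the usual information-theoretic definition, SU(X, Y) = 2 * [H(X) + H(Y) - H(X, Y)] / [H(X) + H(Y)], and covers only the scoring step, not the full FCBF redundancy filter. The function names and toy columns are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Normalized mutual information in [0, 1]; 1 means x fully determines y."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    return 2.0 * (hx + hy - hxy) / (hx + hy)


# Hypothetical binary answers to two questions and the class label.
label = ["No", "No", "Yes", "Yes"]
informative_answer = [0, 0, 1, 1]    # mirrors the label exactly
uninformative_answer = [0, 1, 0, 1]  # independent of the label

su_informative = symmetrical_uncertainty(informative_answer, label)
su_uninformative = symmetrical_uncertainty(uninformative_answer, label)
```

Ranking features by their SU with the class label gives the "highly indicative of the target" part of the filter. Computing SU between pairs of features is what lets the algorithm discard features that are redundant with ones already selected.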
It is evident from the results in Table 6 that the feature signatures generated by AutoML models aid in improving classification accuracy. The authors aim to extend this work to biosignature detection for ASD and computationally identify genetic variants/mutants that could aid in autism therapy and study the efficiency of AutoML on a multiclass categorization of neuro-developmental disorders. This work will also be expanded to evaluate the role of feature construction methods on genetic/image data for autism classification and assess the performance of AutoML models on autism classification with the new features generated.

Figure 9: Histogram of Q-chat score for child data.

Accuracy (%) reported in previous work: [24], 99; C4.5 [24], 96; deep neural network [23], 92; SGD (stochastic gradient descent) [7], 99.6; random forest [9], 97.2; soft voting classifier [8], 94.

Conclusion
Research in the field of autism and its causes, symptomatic variations, and therapy has been one of the extensively researched spheres in the recent past. Owing to the adverse effects caused by this neurodevelopmental condition, parents and caretakers are keen to ascertain the predictive features of this disorder. Machine learning and computational intelligence have influenced several healthcare areas, providing solutions in the form of predictive and diagnostic models, computer-aided assistantships to physicians, and recommender systems for therapeutic guidance. This research work fuses the competence of automated machine learning and computational intelligence to discover highly predictive features for autism that would enable possible early detection of the disorder. As the globe soars towards automated solutions for everyday living, exploring the possibility of automating the diagnostic procedure for diseases through Artificial Intelligence solutions is a great step forward in the field of science and technology. The authors believe the work undertaken in this research provides an informative insight into autism prevalence irrespective of the subject's ethnic, social, or family status. The authors in the future would focus on biosignature detection for autism based on genomic data and the possibility of automatically extracting image-based feature signatures for autism diagnosis.

Figure 10: Class distribution based on Q-chat score. The higher the score, the higher the probability of "yes."

Data Availability
All the datasets utilized in this research work are publicly available.

Conflicts of Interest
The authors declare that they have no conflicts of interest.