Design of an Effective Archive Management System with a Compression Approach for Network Information Technology

Due to recent advances in Internet and information technologies, massive quantities of archive data are generated, which are difficult to handle using conventional techniques. Archive management is the field of management concerned with the maintenance and utilization of archives once they have been sent from the client to the repository. The drastic increase in the size of archive data necessitates effective storage schemes, which can be accomplished by the use of data compression approaches. Generally, data compression techniques reduce the amount of data being saved from a system or network without compromising data quality. With this motivation, this study designs an effective archive storage system with a compression approach for network management (EASS-CANM). The major intention of the EASS-CANM technique is to archive textual and image data effectively in compact form in order to reduce the storage area. The proposed EASS-CANM technique involves a two-stage process: textual data compression and image compression. At the initial stage, the neighborhood indexing sequence (NIS) with the Prediction by Partial Matching (PPM) technique is applied for textual data compression. Secondly, fruit fly optimization (FFO) with a modified Haar wavelet (MHW) is used for effective image compression, where the optimal threshold selection process takes place using the FFO technique. The Haar wavelet filtering process is modified (MHW) in order to preserve higher image quality and clarity; MHW introduces a new transformation that allows for improved compression outcomes as well as improved PSNR and CR values. In order to demonstrate the improved outcomes of the EASS-CANM approach, a series of simulations is performed using a benchmark dataset.
The experimental results report the supremacy of the EASS-CANM technique over existing approaches. A wide range of studies on the benchmark dataset confirms that the EASS-CANM approach improves archival efficacy: according to the full comparative result analysis, the EASS-CANM strategy is more effective than existing approaches in terms of numerous evaluation criteria. Therefore, the EASS-CANM technique can be used effectively in the administration of archives.


Introduction
File management and organization activities are manually done by archivists. When a massive number of files appears simultaneously, operational efficiency decreases considerably [1]. Generally, China's archive industry is still in a preliminary phase, and there is a gap in comparison to the archive management systems of developed nations. Whether or not the conventional dual-track mode or dual-system mode is replaced, a massive amount of electronic records will be generated [2]. The original document created by the computer during office processes is generally distinct from the document formed after digitization of an archive. Such original electronic documents carry additional data compared with digitized documents and are more suitable for processing: the former is a manifestation of the archive, while the latter is just a copy of it. If a document reflects original data that can be efficiently utilized, it creates value [3]. As the last destination of these electronic documents, the archive department accumulates a massive amount of original data, and organizing these complicated archives systematically becomes a major challenge in practice [4]. Data mining, which is frequently employed in business, relies on the application of complex mathematical algorithms; an associated notion is Knowledge Discovery in Databases (KDD). In the business world, big data mining is a technique for identifying and retrieving specific pieces of information from vast databases. Fortunately, data mining techniques are capable of processing this heterogeneous data, making file management activities more intelligent, decreasing the burden on archivists, resolving problems that might be encountered, and simultaneously allowing the public to receive better-quality service. Figure 1 illustrates the process of the archive system [5].
With the tremendous growth of the Internet and the advancement of science and technology [6], the amount of archive data is too large to process with conventional data analysis techniques and tools. Ruddy et al. consider that when a dataset is comparatively small but the information has heterogeneous features, it cannot be processed by conventional models [7]. The data mining technique comes into existence under this scenario: it is based on the integration of big data processing algorithms and traditional data analysis, which provides the opportunity to examine the potential value contained in massive amounts of information. Many of these archives are utilized as "vouchers," and the ways of utilizing them are singular and relatively traditional [8]. Manual system utilization and retrieval still occupy the conventional position, but manual handling introduces errors that lead to wrong outcomes. To handle large amounts of digitized and born-digital content, computational methods and tools should consider the temporalities and interdependencies of distinct archival processes over the changing value and nature of open accession and the archival administrative system, as well as the dynamically developing archival collection [8]. Such applications also need to fulfill overarching archival processes that are imperative to meet legal admissibility requirements, ensure transparency, and support accountability for material. Compression techniques reduce data size by exploiting the data structure [9]. Data compression algorithms fall into lossy and lossless methods: a lossy algorithm incurs a loss of data but usually ensures a high compression ratio, whereas a lossless algorithm guarantees data integrity through the compression/decompression process. This paper studies an effective file storage model and compression method for network management (EASS-CANM).
The presented EASS-CANM approach contains a two-stage procedure: textual data compression and image compression. In the primary step, the neighborhood indexing sequence (NIS) with the Prediction by Partial Matching (PPM) approach is applied to textual data compression. Secondly, fruit fly optimization (FFO) with the modified Haar wavelet (MHW) is utilized for effectual image compression, where the optimum threshold selection procedure is carried out using the FFO technique. This modification supports compressing images while maintaining their purity and preserving fine details, yet keeping the compression ratio (CR) high. For examining the enhanced archival effectiveness of the presented EASS-CANM technique, a wide range of experiments were carried out against the benchmark dataset.

Literature Review
Lv and Shi [10] examine the development of university archive management in the data age. They employ the experimental method, literature review, object-oriented method, and survey research to analyze the present scenario of university archive management systems, enhance the archive management work of universities and colleges, summarize the problems present in archive management systems, and carry the work out using a data management system. Rong [11] presents the theoretical concept of archive data resources and a sharing-based network cloud framework, then analyzes the significance of data resource sharing in archive data management, and lastly provides strategies for creating modes of data resource sharing in archive management.
Israel [12] examined the attitudes and perceptions of registry staff members toward archive management at FUTA. A random sampling method was utilized for selecting fifty registry staff in different sections within the university. Odhiambo [13] evaluated the readiness of USIU-A for managing digital archives in order to propose a strategy for enhancing digital archive management at the organization. The research shows that the organization's readiness for digital archive management was not up to standard.
Park et al. [14] gathered articles on archive management published from 1997 to 2016 in four journals associated with library and information science in Korea and two journals associated with archive management. They also identify the direction of archive management in Korea via a comprehensive review of research on record management in the country. The articles gathered from the first and second halves of the two decades were subjected to five-year cycle analysis. Consequently, research on record information services, electronic records, and archival methods for different records has increased gradually.
Wang [15] developed a Hadoop cloud framework for functional modules and electronic archive data management. Tsvuura and Ngulube [16] investigated the digitization of archives and records at two state universities in Zimbabwe, which embarked on the digitization of their archive and record resources in line with the technological trend of doing business online. They adopted a qualitative multicase study to offer a deeper understanding of the digitization of archives and records at the state universities; information was gathered through purposive sampling via interviews.

The Proposed Model
In this study, a novel EASS-CANM technique has been developed for the compact storage of files in archives. The EASS-CANM technique is mainly intended to archive textual and image data effectively in compact form in order to reduce the storage area. The proposed EASS-CANM technique initially utilizes NIS with the PPM technique for the compression of textual data. Besides, the FFO-MHW technique is exploited for the compression of images, in which the optimal threshold value selection using the FFO algorithm helps improve the compression efficiency.

Textual Data Compression Using NIS with the PPM Technique.
At the initial stage, the NIS technique is applied to generate an optimal, shortest codeword (CW) for the input textual data. The presented NIS approach is a character encoding system that works on the principle of traversing data based on ones and zeros. Among the two short CWs created by one-based and zero-based traversal, the optimum CW is selected according to the minimal number of bits needed for storing the CW of the corresponding character. For an input sequence of length N, the NIS approach needs C_bits bits for storing the compressed data, given by

C_bits = sum over all characters of NIS_opt + 8, (1)

where NIS_opt denotes the number of bits in a CW; the additional eight controller bits are required on top of the optimal number of bits of the reduced data. Next, the mean number of bits needed for storing a single character with the NIS approach is estimated as

NIS_ch_av = C_bits / N. (2)
The lower the values of C_bits and NIS_ch_av, the higher the compression performance. Interestingly, equation (2) indicates that the NIS approach needs a maximum of four bits for storing a character. At first, the presented method loads the input text, which might contain special symbols and alphanumeric characters [17]. Next, each ASCII value is converted to its corresponding binary form, and the process of traversing the data on the basis of ones and zeros is performed. The algorithms for zero-based traversal and one-based traversal are equivalent, except that the zero-based traversal searches for zeros in the binary digits whereas the one-based traversal finds ones. Once the two CWs are created, the presented method selects the one requiring fewer bits. To further enhance the compression efficiency, the CWs are then compressed using the PPM technique. PPM is a sophisticated, statistically based data compression approach and is among the more effective lossless compression methods. The PPM method generates a PPM tree that represents a variable-order Markov model, where the last n characters define an n-order Markov model. Text compression, string sequence indexing, and prediction are all made easier with PPM, although constructing a PPM tree from a large volume of data requires lengthy sequential processing in a real-world setting. Prediction by partial matching (PPM) is an adaptive statistical data compression technique that uses context modelling and prediction to reduce the size of datasets: uncompressed symbol streams are fed into PPM models, which then use the past characters in the stream to forecast the next symbol. When performing cluster analysis, PPM methods can also be used to arrange data into expected groupings.
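The traversal idea can be illustrated with a toy sketch. The encoding below is our own illustrative assumption, not the paper's exact NIS codeword format: it records the bit indices of whichever bit value (zero or one) occurs less often in the character's 8-bit ASCII pattern, and the cost of 3 bits per recorded index is likewise assumed for illustration.

```python
def bit_positions(byte, bit):
    # indices (0-7, MSB first) where the 8-bit pattern equals `bit`
    return [i for i in range(8) if (byte >> (7 - i)) & 1 == bit]

def nis_codeword(ch):
    """Pick the cheaper of zero-based and one-based traversal for one
    character, assuming each recorded bit index costs 3 bits."""
    code = ord(ch)
    zeros, ones = bit_positions(code, 0), bit_positions(code, 1)
    if len(zeros) <= len(ones):
        return "zero", zeros, 3 * len(zeros)
    return "one", ones, 3 * len(ones)

mode, indices, bits = nis_codeword("A")  # 'A' = 0b01000001, ones at 1 and 7
```

Under this toy scheme, 'A' is described by one-based traversal with two indices, costing fewer bits than its plain 8-bit ASCII form.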
The PPM tree is utilized for the representation of routes. In this regard, every road segment has an individual identifier, i.e., an integer value [18].

(i) UM represents the uncompressed message, that is, the plain message that includes the symbols representing the road segments
(ii) CM denotes the compressed message, that is, the compressed message related to the plain message after applying the compression given the provided PPM tree

When a tested route comprises many symbols matching a specific PPM tree, the compression procedure activates the esc character [19] only rarely, producing a higher CR. Conversely, when a tested route is significantly different from the PPM tree, the esc character is activated frequently, which produces a lower CR.
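The context-modelling idea behind PPM can be sketched as follows. This is a minimal illustrative predictor, not the authors' implementation: it counts which symbol follows each context of length 1..order, and falling back to shorter contexts on a miss plays the role of PPM's escape (esc) mechanism. A real PPM compressor would pair these counts with arithmetic coding, which is omitted here.

```python
from collections import Counter, defaultdict

class ContextModel:
    """Toy order-n context model in the spirit of PPM (illustration
    only; a real PPM coder feeds these counts to an arithmetic coder)."""
    def __init__(self, order=2):
        self.order = order
        self.contexts = defaultdict(Counter)

    def train(self, text):
        # count which symbol follows each context of length 1..order
        for i in range(1, len(text)):
            for n in range(1, self.order + 1):
                if i >= n:
                    self.contexts[text[i - n:i]][text[i]] += 1

    def predict(self, history):
        # try the longest context first; falling back to shorter
        # contexts mimics PPM's escape (esc) mechanism
        for n in range(self.order, 0, -1):
            ctx = history[-n:]
            if len(ctx) == n and ctx in self.contexts:
                return self.contexts[ctx].most_common(1)[0][0]
        return None

model = ContextModel(order=2)
model.train("abracadabra")
model.predict("ab")  # the symbol seen after context "ab" is 'r'
```

A context with no match at any order returns None, which is where a real coder would emit an escape and fall back to a uniform model.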

Image Compression Using the FFO-MHW Technique.
During the image compression process, the FFO-MHW technique is executed to effectually compress the images. The presented method is an orthogonal wavelet transform, computed by averaging and differencing the even and odd pixels of digital images. The MHW is applied in different forms to compress images by decomposing the image matrix into a sparser one [20]. MHW is used to preserve more image clarity and quality: it presents a new transformation that attains good compression results compared to the traditional one and achieves greater CR and PSNR values. The data compression ratio, also known as compression power, is a measurement of how much a data representation shrinks when compressed using a data compression technique; it is usually expressed as the ratio of uncompressed to compressed size. Thus, for a representation that reduces a file's storage requirement from 10 MB to 2 MB, the compression ratio of 10/2 = 5 is often written as an explicit ratio, 5 : 1 (read "five to one"), or as an implicit ratio, 5/1. MHW divides the original image of W × H dimension into 2 × 2 blocks of pixels (a, b; c, d), from which the values A, B, C, and D are estimated as [21]

A = (a + b + c + d)/4, B = (a + b − c − d)/4, C = (a − b + c − d)/4, D = (a − b − c + d)/4.

For reconstruction, we regain a, b, c, and d as

a = A + B + C + D, b = A + B − C − D, c = A − B + C − D, d = A − B − C + D.

For the optimal selection of threshold values in the MHW technique, the FFO algorithm is employed. The FFO algorithm was proposed on the basis of the foraging behavior of Drosophila, which is better than other species in visual sense and olfactory ability and can therefore fully utilize its instincts to discover food [22]. In particular, even at a distance of 40 km from the food source, the nose of the fruit fly (FF) can pick up the various food scents distributed through the air.
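Before detailing the FFO steps, the 2 × 2 averaging/difference transform described above can be sketched in Python. This is a minimal illustration assuming the standard normalized Haar filter bank on one block; the authors' exact MHW modification is not fully specified here, and the threshold `t` stands in for the value the FFO step would tune.

```python
def haar2x2(a, b, c, d):
    # forward transform of one 2x2 block: average A and details B, C, D
    A = (a + b + c + d) / 4.0
    B = (a + b - c - d) / 4.0
    C = (a - b + c - d) / 4.0
    D = (a - b - c + d) / 4.0
    return A, B, C, D

def inverse_haar2x2(A, B, C, D):
    # exact inverse of haar2x2 when no thresholding is applied
    return (A + B + C + D, A + B - C - D, A - B + C - D, A - B - C + D)

def hard_threshold(coeffs, t):
    # zero out small detail coefficients (this sparsification is what
    # enables compression); t is the parameter FFO would optimize
    A, B, C, D = coeffs
    return (A,) + tuple(0.0 if abs(x) < t else x for x in (B, C, D))
```

Round-tripping a block with t = 0 reproduces the original pixels exactly; larger t zeroes more detail coefficients, raising CR at the cost of PSNR.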
Upon getting close to the food source, the FF locates the food and the flocking position of its companions using its sensitive visual organs, and then flies in that direction. The optimal FF's data is shared with the entire swarm in each iteration, and the following iteration depends only on the data of the preceding optimal FF. Figure 2 demonstrates the graphical representation of FFO. Based on the food search features of the FF, the FOA proceeds in the following steps:

Step 1. Parameter initialization.
Initialize the parameters of the FOA, namely, the maximal number of iterations, the population size, the random flight distance range, and the initial FF swarm position (X_axis, Y_axis).
X_axis = rands(1, 2), Y_axis = rands(1, 2).

Step 2. Population initialization. Assign an arbitrary position (X_i, Y_i) and distance for the food searching of each individual FF, where i indexes the population.
Step 3. Population assessment. First, estimate the distance of the food position to the origin (D). Next, calculate the smell concentration (SC) judgment value (S), i.e., the reciprocal of the distance of the food position to the origin.
Step 4. Replacement. Substitute the SC judgment value (S) with the SC judgment function (known as the fitness function) for finding the SC of the individual position of the FF [23].
Step 5. Find the maximum SC. Identify the FF with the maximum SC and its respective position among the FF swarm.
Step 6. Retain the maximum SC.
Keep the maximal SC value and the corresponding x and y coordinates; the swarm then flies to the position with the maximum SC.

Smell_best = bestSmell.
Step 7. Iterative optimization. Steps 2-5 are executed iteratively, and the loop halts once the SC is no longer better than that of the preceding iteration.

Figure 6: Compressed packet size analysis of the EASS-CANM technique.
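The steps above can be sketched as a minimal FFO loop. This is illustrative only: the parameter values, the fitness function, and the fixed iteration count (instead of the no-improvement stopping rule of Step 7) are our assumptions, and a toy fitness stands in for the image-quality objective that would drive threshold selection.

```python
import math
import random

def fruit_fly_optimize(fitness, iters=200, pop=20, seed=1):
    """Minimal FFO loop: each fly's position gives a distance to the
    origin D, the smell-concentration judgment value is S = 1/D, and
    the swarm relocates to the fly with the best smell (fitness)."""
    rng = random.Random(seed)
    # Step 1: initialize the swarm position (X_axis, Y_axis)
    x_axis, y_axis = rng.uniform(-1, 1), rng.uniform(-1, 1)
    best_smell, best_xy = float("-inf"), (x_axis, y_axis)
    for _ in range(iters):
        for _ in range(pop):
            # Step 2: random flight around the current swarm position
            x = x_axis + rng.uniform(-1, 1)
            y = y_axis + rng.uniform(-1, 1)
            # Step 3: distance to origin and judgment value S = 1/D
            s = 1.0 / (math.hypot(x, y) or 1e-12)
            # Step 4: smell concentration via the fitness function
            smell = fitness(s)
            # Steps 5-6: retain the best fly's smell and position
            if smell > best_smell:
                best_smell, best_xy = smell, (x, y)
        # Step 7: the swarm flies toward the best position found
        x_axis, y_axis = best_xy
    return best_smell, best_xy

# toy fitness peaking at S = 0.5 (a stand-in for compression quality)
best, _ = fruit_fly_optimize(lambda s: -(s - 0.5) ** 2)
```

In the EASS-CANM setting, S would be mapped to a candidate MHW threshold and the fitness would score the resulting CR/PSNR trade-off.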

Performance Validation
The performance validation of the EASS-CANM technique is carried out on both textual and image datasets. To assess the compression efficacy on textual data, the performance validation takes place on a benchmark dataset with different deployment models, namely, LUCE, HES-SO FishNet, and Le Gènèpi [24]. Similarly, the image compression effectiveness of the EASS-CANM model is validated on a benchmark image dataset [25]; a few sample images are shown in Figure 3. Figure 4 shows the CR analysis of the EASS-CANM technique under different deployment models. The EASS-CANM technique has shown effective compression outcomes with optimal values of CR. For instance, the EASS-CANM technique obtained CR of 0.1165, 0.1523, 0.1996, 0.1792, 0.1990, and 0.2350 under the LU_84 Temp, FN_101 Temp, LG_20 Temp, LU_84 RH, FN_101 RH, and LG_20 RH deployment models, respectively. Figure 5 displays the CF analysis of the EASS-CANM method under the distinct deployment systems. The EASS-CANM approach revealed efficient compression results with optimum values of CF: for instance, it attained CF of 8.5803, 6.5655, 5.0098, 5.5790, 5.0250, and 4.2546 under the LU_84 Temp, FN_101 Temp, LG_20 Temp, LU_84 RH, FN_101 RH, and LG_20 RH deployment systems, respectively. Figure 6 demonstrates the CPS analysis of the EASS-CANM system under the distinct deployment systems; the EASS-CANM approach exhibited efficient compression outcomes with optimum values of CPS. The MSE analysis of the EASS-CANM system against recent algorithms is offered in Figure 10. The figure reports the superior outcomes of the EASS-CANM system, with the minimum values of MSE on all test images. For example, on image 1, the EASS-CANM technique achieved a lower MSE of 2.756, whereas the FFA-LBG and GWO-ECC methods obtained higher MSE of 4.567 and 6.408, respectively.
In addition, on image 5, the EASS-CANM system gained a minimum MSE of 4.519, whereas the FFA-LBG and GWO-ECC approaches yielded higher MSE of 7.081 and 9.197, respectively. Table 3 and Figure 11 show the comparative SS analysis of the EASS-CANM approach with other models [26]. The results show that the EASS-CANM approach achieves improved SS values. For instance, on image 1, the EASS-CANM method gained a maximum SS of 86.54%, while the FFA-LBG and GWO-ECC models gained lower SS of 74.32% and 64.30%, respectively. Likewise, on image 5, the EASS-CANM algorithm attained a maximum SS of 81.19%, while the FFA-LBG and GWO-ECC models attained lower SS of 76.40% and 60.47%, respectively.
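The evaluation metrics used above can be computed as follows. Note that the reported CF values are approximately the reciprocals of the corresponding CR values (e.g., 1/0.1165 ≈ 8.58), so this sketch assumes CR = compressed/original and CF = original/compressed; the MSE/PSNR helpers operate on flat pixel sequences for simplicity.

```python
import math

def compression_ratio(compressed_size, original_size):
    # CR as reported here: compressed size / original size (lower is better)
    return compressed_size / original_size

def compression_factor(compressed_size, original_size):
    # CF = original / compressed, i.e., the reciprocal of CR (higher is better)
    return original_size / compressed_size

def mse(img_a, img_b):
    # mean squared error between two equal-length pixel sequences
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)

def psnr(img_a, img_b, peak=255.0):
    # peak signal-to-noise ratio in dB; infinite for identical images
    e = mse(img_a, img_b)
    return float("inf") if e == 0 else 10.0 * math.log10(peak ** 2 / e)

compression_factor(2, 10)  # a 10 MB file compressed to 2 MB: CF = 5.0
```

Lower CR, higher CF, lower MSE, and higher PSNR all indicate better compression performance, matching the direction of the comparisons above.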

Conclusion
In this study, a novel EASS-CANM technique was developed for the compact storage of files in archives. The EASS-CANM technique is mainly intended to archive textual and image data effectively in compact form in order to reduce the storage area. The proposed EASS-CANM technique initially utilized NIS with the PPM technique for the compression of textual data. Besides, the FFO-MHW technique was exploited for the compression of images, in which the optimal threshold value selection using the FFO algorithm helps improve the compression efficiency. For examining the enhanced archival effectiveness of the presented EASS-CANM approach, a wide range of experiments was implemented against the benchmark dataset. The comprehensive comparative result analysis highlighted the improved efficacy of the EASS-CANM approach over existing approaches with respect to various evaluation metrics. Therefore, the EASS-CANM technique can be treated as an effective tool for archive management. In the future, the EASS-CANM algorithm will be extended with the design of lightweight cryptographic techniques to accomplish security. A lightweight cryptographic technique is helpful for encrypting data sent to cloud storage, and combining symmetric and asymmetric encryption enables users to benefit from the strong security of asymmetric encryption and the speedy performance of symmetric encryption while preserving users' rights to access data in a secure and permitted manner.

Data Availability
No data were used to support this study.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this article.