Monitoring Big Data Streams Using Data Stream Management Systems: Industrial Needs, Challenges, and Improvements

Real-time monitoring systems are important for industry since they allow for avoiding unplanned system stops and keeping system availability high. The technical requirements for such systems include being both scalable and online, as the amount of generated data is increasing with time. Therefore, monitoring systems must integrate tools that can manage and analyze the data streams. The data stream management system is a stream processing tool that has the ability to manage and support operations on data streams in real time. Several researchers have proposed and tested real-time monitoring systems which have the ability to search big data streams. In this paper, the research works that discuss the analysis of online data streams for fault detection in industry are reviewed. Based on the literature analysis, the industrial needs and challenges of monitoring big data streams are presented. Furthermore, feasible suggestions for improving the real-time monitoring system are proposed.


Introduction
In terms of availability, condition monitoring and fault diagnostics play important roles in increasing the quality of industrial products and systems. Early failure detection and diagnostics (FDD) have the potential to reduce failures of machinery and equipment, prevent major product breakdowns, reduce downtime, and allow maintenance to be scheduled, thus increasing product availability and reducing associated costs [1-7]. The work process of an FDD system generally involves fault detection, diagnostics, evaluation, and response action [8]. Fault detection methods can be divided into three categories: analytical methods, knowledge-based methods, and data-driven methods [9]. If one integrates several fault detection methods, the resulting system is called a hybrid system [10].
Analyzing data produced during the time of operation (and during maintenance) is the most common way to detect, predict, and avoid failures [11,12]. According to the popular magazine The Economist [13], the rate of growth of the generated data is increasing, and the available storage capacity may not be able to cope with all the generated data. This is normally referred to as big data, which is simply characterized by the three Vs (volume, velocity, and variety). Furthermore, advanced technologies such as the Internet of Things (IoT) and cyber-physical systems (CPS) are significant sources of big data. As the number of connected devices and sensors continues to increase, so does the amount of data that is generated. An overview of the Internet of Things and its challenges and applications in various fields can be found in [14,15]. Therefore, monitoring systems have to integrate tools which can analyze the data streams without the need to store them. A data stream is a continuous and ordered sequence of data that arrives in real time [16]. It addresses the velocity dimension of big data. Several researchers have discussed the issue of data streams in FDD, including [11,17-26] (a more detailed review of related work is presented in Section 4).
In this paper, a summary of a literature review regarding processing data streams in real time for industrial fault detection is presented. The scope of the paper is the industrial application of the data stream management system (DSMS), industrial needs, and consequent requirements on the design of the future DSMS. Naturally, the author is aware of various technical developments regarding DSMS design, including works in [27-31]. However, such developments are not within the scope of this paper. The author has summarized and deduced the industrial needs and challenges that are presented when deploying the DSMS for monitoring big data streams. Furthermore, the author has had a longstanding collaboration with industrial partner companies regarding data stream management and analysis, which is further discussed below.
Based on the analysis of the literature review and the author's previous work regarding industrial equipment monitoring with industrial partners, a scalable monitoring system is proposed. The system integrates various aspects of the reviewed monitoring systems as well as functionalities previously developed by the author and is based on industrial needs. The design rationales, i.e., the reasons for including certain functionalities to meet specific industrial needs, are summarized. The system (see Section 5), representing part of the paper results, benefits from using historical data, online data streams, and forecasted data streams to answer queries from multiple engineers (examples may include queries regarding component wear, trends indicating possible future failures, increased energy usage, and increased temperature). In addition, the proposed monitoring system suggests applying and integrating several fault detection methods, i.e., analytical methods, knowledge-based methods, and data-driven methods, thus enabling the most appropriate query formulation depending on the datasets and the industrial applications where they originate.
The author has a longstanding history in database research and experience from several international and national research projects (SmartVortex (http://www.smartvortex.eu), iStream (https://www.ltu.se/research/subjects/maskinkonstruktion/Forskningsprojekt/iSTREAMS-Storskalig-sokning-av-datastrommar-1.66982?l=en), and SSPI (https://www.ltu.se/research/subjects/maskinkonstruktion/Forskningsprojekt/iSTREAMS-Storskaligsokning-av-datastrommar-1.66982?l=en)) related to equipment monitoring using data stream management system (DSMS) technology and data stream mining (DSM) in collaboration with mainly three Swedish companies. Section 2 discusses these applications in detail. Section 3 presents the research method. Section 4 comprises a literature review and an analysis of the reviewed work. Section 5 presents and discusses the proposed monitoring system, and finally conclusions are presented in Section 6.

The Industrial Applications
To achieve products of high quality, industrial companies are interested in searching the data produced during the product's lifecycle. The author had the opportunity to collaborate with three Swedish companies, namely Bosch Rexroth Mellansel AB (BRMAB, formerly Hägglunds Drives AB, http://www.boschrexroth.com), Volvo Construction Equipment (VCE, http://www.volvoce.com), and AB Sandvik Coromant (SC, http://www.sandvik.coromant.com).
Firstly, BRMAB manufactures low-speed, high-torque hydraulic drive systems. They are interested in continuously analyzing their log files, future streams of raw data, and in some cases, in saving a summary of their data streams. In addition, they are interested in applying various fault detection methods to improve their equipment operation.
Secondly, Volvo CE develops, manufactures, and markets equipment for the construction and related industries. They are interested in monitoring sensor data streams including CANBUS data from, for example, wheel loaders or other products. If a deviation is detected, a predetermined procedure is applied.
Thirdly, Sandvik Coromant is an engineering group involved in tooling, materials technology, mining, and construction. Sandvik Coromant is interested in analyzing streams of sensor readings from a mill. For example, they need to compare the expected power consumption of the mill, computed using a mathematical model, with the measured power consumption in order to detect any abnormal behavior. The next section discusses the research method of this work.

Research Method
This research work aims at reviewing and analyzing online data stream fault detection systems to identify industrial needs and challenges of monitoring big data streams and also to provide feasible improvements to the reviewed real-time monitoring systems.
Concerning the literature review, previous monitoring systems which have the ability to analyze data streams in real time were reviewed. The literature review was conducted using search terms such as "data stream management system," "monitoring system," "fault detection," and "data stream mining". Then, the industrial needs and challenges that researchers previously identified were deduced and summarized (see Section 4.1). Furthermore, the basis for the research presented in this paper is a longstanding collaboration (including collaborative development and workshops) with the industrial partner companies through several projects. The collaboration with the industrial partner companies has given the researcher good insights into the challenges within the different industries (more elaboration is provided in Section 4).
Based on the analysis of the literature review, industrial needs and challenges were identified. Also, for each reviewed work, the fault detection method (i.e., analytical, knowledge-based, data-driven, or hybrid) and the data source (i.e., real-time data stream or forecasted data stream) were pointed out and used in a matrix. A matrix row represents one or more reviewed works, while a column represents the fault detection method or the data source (see Table 3 in Section 4.2). Based on the analysis of the matrix and the identified challenges, feasible suggestions to improve the DSMS-based real-time industrial fault detection were identified (further discussed in the last part of Section 4), and a DSMS-based monitoring system was proposed (see Section 5).

Literature Review and Analysis
Several researchers discuss the issue of searching data streams for fault detection and/or prediction. [11,17-26,32] developed a monitoring system based on data stream mining and DSMS technology and tested it on data collected from hydraulic motors. [32] developed a vehicle data stream (VEDAS) mining system for real-time vehicle health monitoring and driver characterization. [32] used a data stream management system to control the raw data stream generated by the monitored vehicle. The monitoring system proposed by [32] involved two modes: monitoring mode and training mode (algorithm training). [20] used a data stream mining method based on principal component analysis (PCA) and a binary support vector machine for cutting tool condition monitoring. [18] built and tested a monitoring system which is based on forecasted data streams for system fault prediction. [19] proposed a framework that combines fault detection, isolation, and correction (FDIC). The proposed FDIC used both model-based and classification-based methods for fault detection. A comparison between a knowledge-based and a data-driven method in analyzing data streams for system fault detection was made by [33]. A general approach for defining the correct behavior of the monitored equipment either analytically or statistically using a stream validator (SVALI) was validated in [25]. SVALI supports the use of data-driven and analytical-based fault detection methods through the learn-and-validate and the model-and-validate functions, respectively. [34] proposed and tested a general method to manage the problem of concept drift when using one-class, data-driven models for condition monitoring. Gu et al. [35] proposed an online failure detection system using the IBM System S stream processing. The authors used the stream-based decision tree classifier as a fault detection method.
Wheel loader slippage detection for Volvo Construction Equipment was tested and validated based on both knowledge-based and data-driven models [36]. A distributed framework for streaming anomaly detection in embedded systems was proposed in [37]. The authors examined the effectiveness of their method using data from two sources: an autonomous vehicle and an advanced driver-assistance system (ADAS) platform. It was shown that the proposed method was able to detect anomalies with low latency. The authors in [36] used a DSMS-based framework and on-board sensor technology for clutch slippage detection and diagnosis. The Gaussian mixture model (GMM) and the logistic regression classifier were used for online anomaly detection, while the diagnosis was done using case-based reasoning [38]. A visualization component was integrated into a DSMS in [39]. The authors implemented an operator that supports industrial analytics applications by executing query-based visualization methods over the data streams. The authors in [40] proposed an algorithm that can handle multiple concurrent data streams and can be used for detecting contextual outliers. The algorithm was tested on real-world and synthetic datasets and showed good performance. A hybrid deep learning classifier was used in [41] to detect concept drifts in streaming data. The proposed approach is able to handle time and memory constraints. A data-driven model was utilized in [42] to detect faults based on large-scale data sets from Metro do Porto subsystems.
The reviewed papers relate to a variety of industrial applications and have resolved several industrial needs and challenges. The industrial needs and challenges, identified through collaboration with industrial partners and a literature review, are discussed in Section 4.1. Further, the analysis of the applied fault detection methods is presented in Section 4.2.

Industrial Needs and Challenges.
A summary of the arguments, challenges, and applications is presented in Table 1. Table 1 shows that real-time monitoring systems are needed for many applications such as monitoring heavy diesel engines, missile defense systems, vehicles, tooling machines (e.g., cutting), spacecraft, the steel industry, the metal industry, hydraulic systems, and milling processes. Furthermore, real-time monitoring systems were found to be required for applications other than industrial processes or machine monitoring, such as software, driver characterization, and drinking water networks.
Several industrial needs were presented in the reviewed papers. Most of the papers agreed on the need for real-time monitoring systems to increase the availability of industrial systems and decrease the consequences of system failures. Based on Table 1, the industrial needs can be summarized in List 1 as follows:

List 1
(i) Use resources efficiently
(ii) Manufacture products of high quality
(iii) Increase production efficiency
(iv) Increase product and process availability
(v) Save on costs and time

Table 1 (recovered rows; columns: reference, argument, challenges, application)
Application entry of a preceding row: Autonomous vehicle/Advanced driver-assistance systems (ADAS)
[40] Argument: Detecting outliers in multiple concurrent data streams. Challenges: (i) Parallel processing for outlier detection in data streams. Application: Detecting contextual outliers.
[39] Argument: Analyzing data streams in industrial processes and industrial cyber-physical systems. Challenges: (i) Provide scalable capability to visualize the results from the analysis of data streams to support industrial needs. Application: Industrial analytics applications.
[41] Argument: A method to handle nonstationary and dynamic data streams where the distributions change over time. Challenges: (i) Real-time applications with time and memory constraints. Application: Applied on standard datasets from literature.
[42] Argument: Utilizing data-driven models for anomaly detection in the industrial area. Challenges: (i) Large-scale data sets. Application: Metro do Porto subsystems.

Advances in Operations Research
(xiii) Processing data streams from controllers and sensors is critical for monitoring the functional product in use

List 1 shows that there is a need for real-time monitoring systems in many applications. However, Table 1 shows that industrial companies face numerous challenges in implementing the required monitoring systems. In Table 2, the challenges are listed according to which party, industry or technical development, is concluded to be most responsible for addressing each challenge.
In order to resolve the challenges of large dimensionality, high sampling rates, handling huge amounts of equipment, and real-time monitoring, a developed monitoring system must be scalable. The use of a stream processing system such as a DSMS can resolve the issue of scalability.
The ability to implement multiple fault detection methods overcomes the challenges of complex systems, the limitations of expert knowledge, and fault patterns that are not predefined or that cannot be simulated. For example, these challenges can be resolved by using an appropriate data-driven method. The concept drift challenge can be addressed by using an incremental algorithm or updating the fault detection model based on certain cases; see, for example, [34].
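To illustrate the idea of an incremental algorithm mentioned above, the following is a minimal sketch and not the method of [34]. It assumes a single numeric signal, uses Welford's running mean and variance as the incrementally updated model, and flags samples that lie more than k standard deviations from the running mean. The class name, the parameters k and warmup, and the choice to update the model only with unflagged samples are all assumptions made for illustration.

```python
import math


class IncrementalDetector:
    """Hypothetical k-sigma detector whose model is updated one sample at a time."""

    def __init__(self, k=3.0, warmup=10):
        self.k = k            # assumed threshold in standard deviations
        self.warmup = warmup  # samples to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # running sum of squared deviations (Welford)

    def _update(self, x):
        # Welford's algorithm: update mean and m2 without storing the stream.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def process(self, x):
        """Return True if x is flagged; unflagged samples update the model."""
        if self.n >= self.warmup:
            s = self.std()
            if s > 0 and abs(x - self.mean) > self.k * s:
                return True
        self._update(x)
        return False
```

Because the model follows every accepted sample, slow changes in the normal operating level are absorbed gradually. A forgetting factor or a windowed estimate would be one way to adapt faster, at the cost of a further parameter.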
Reference [32] discussed how to search data streams in a resource-constrained environment. The authors proposed a framework which uses on-board PDA-like devices to run data stream mining and DSMS. They suggested using approximate algorithms to handle the limited computing power and memory. A central control station is also used to support the PDA in certain cases (i.e., not all the time, to avoid the high communication cost and save energy).
Next, the applied fault detection methods found in the literature are reviewed and analyzed in Section 4.2.

The Applied Fault Detection Methods.
Researchers have used a variety of fault detection methods based on the available resources for their applications. Stream processing systems have also been used to manage and control data streams online. Table 3 shows the fault detection methods and types of data that were used in the reviewed papers. Table 3 shows that most of the monitoring systems developed to analyze data streams are based on data-driven methods. The data-driven methods are cheap, easy, and fast to implement compared to other fault detection methods [33]. Furthermore, fault detection functions that are based on data-driven methods can be updated automatically. The matrix presented in Table 3 shows that several gaps or research opportunities exist for which solutions need to be developed and tested. Few papers discuss the use of knowledge-based, analytical-based, or hybrid models. Furthermore, only one identified paper discussed the analysis of forecasted data streams for system fault prediction. The forecasted data stream was analyzed using a data-driven model. However, at the time of writing, no work was identified that explored the analysis of forecasted data streams using knowledge-based, analytical-based, or hybrid models. Also, no work proposed or discussed how to combine or utilize all the characteristics presented in Table 3. Thus, in this paper, a monitoring system which combines the characteristics presented in Table 3 is proposed. The next section presents and discusses the proposed monitoring system.

The Proposed Monitoring System
Based on the challenges presented in Section 4.1, the gaps identified in Section 4.2, and the insights gained from the collaborative development and workshops with several industrial partner companies, a monitoring system is proposed. The proposed monitoring system is capable of overcoming the challenges in Section 4.1, covering the gaps identified in Section 4.2, and providing the components that are important for engineers. The proposed monitoring system, its features, and the connection between them are presented in Figure 1. The data stream management system (DSMS) holds most of the components due to its ability to manage data streams, implement multiple functions and queries, and provide interfaces and connections between multiple components.
The following subsections describe and discuss the functionality and usefulness of the proposed monitoring system in more detail.

Data Source and Data Interface. The terms data source and data interface refer to Boxes 1-3 and Box 4 in Figure 1, respectively. The lifecycle data of a product is important and useful for product monitoring. Metadata, historical data, and data collected in the design phase (e.g., the specification of the product's material such as max stress and temperature) can be saved in the DSMS's local memory if needed. The DSMS can control and manage the data generated during the operation and maintenance phases of the product. The data generated by products in use can have differing formats based on specific applications. For example, in the case of BRMAB, the data stream may arrive in the form of compressed log event files, produced from a variety of measurement points with multiple clocks. Therefore, an interface was needed to make the BRMAB data readable by the DSMS. Furthermore, metadata is an important issue when monitoring several machines. Metadata provides extra information about the arriving data.
In the case of BRMAB, the metadata provided information about the name and place of the collecting unit, the number and names of the measured parameters, and the sampling rate of every parameter. The active mediators object system (AMOS) [47,48] is an example of "mediator" software that can be used to combine data from many different data sources, support the integration with other systems, and act as an intermediate level between data sources and their use in applications and by users.

DSMS: Data Stream Management Systems. According to [49], data stream management systems (Box 17 in Figure 1) represent an extension to the database management systems (DBMSs) that have the ability to manage and support operations on data streams. The structure of a general data stream management system is presented in Figure 2. Data stream management systems have the ability to handle data generated at high frequency in a fleet. Several papers, such as [11,17,18,32,50], showed the ability and usefulness of DSMS technology when integrated with the monitoring system. Every DSMS has a query language which can be used to handle data streams and define queries. The DSMS's query language can be used to implement the data stream forecasting function (i.e., Box 5 in Figure 1), the fault detection functions (Box 6), the integrator (Box 10), the interpreter (Box 13), queries from engineers and companies (Box 15), and interfaces (Boxes 4, 14, and 16). The implemented queries are executed by the DSMS's query processor, which applies continuous queries over the input data stream and streams the output to the user or to a temporary buffer. The input streams in Figure 2 refer to the data streams generated from the data sources, i.e., Boxes 1-3 in Figure 1. The output streams in Figure 2 refer to the output at the postprocessing interface in Figure 1 (i.e., Box 14). A DSMS commonly has a local data storage which can be used for several purposes, such as temporary working storage, stream synopses, and metadata [16].
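The notion of a continuous query applied by the query processor over an input stream can be sketched as follows. This is an illustration in plain Python rather than the query language of any particular DSMS; the function name continuous_query, the count-based sliding window, and the example query are assumptions introduced here.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def continuous_query(
    stream: Iterable[T],
    window_size: int,
    query: Callable[[Tuple[T, ...]], R],
) -> Iterator[R]:
    """Apply `query` to each full sliding window and stream out the results.

    Only the last `window_size` items are kept, so the stream itself is
    never stored in full.
    """
    window = deque(maxlen=window_size)
    for item in stream:
        window.append(item)
        if len(window) == window_size:
            yield query(tuple(window))


# Hypothetical query: moving average over the last three readings.
readings = [1, 2, 3, 4, 5]
averages = list(continuous_query(readings, 3, lambda w: sum(w) / len(w)))
# averages == [2.0, 3.0, 4.0]
```

The generator form mirrors the description above: results are produced as data arrive and can be passed on to a user or to a temporary buffer.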

Data Stream Forecasting. Data stream forecasting (Box 5 in Figure 1) uses historical data and/or one or more current data streams to forecast the future data stream. Data stream forecasting can be used for long- or short-term prediction. [17] proposed a fault prediction system based on data stream forecasting. The authors used a variety of data stream prediction approaches based on linear regression for short- and long-term prediction. System fault prediction was achieved by applying the forecasted data to a fault detection function. The results of [17] showed good performance in short-term prediction.
A query can be implemented to apply the data stream forecasting method. The query will then be applied continuously over the arriving data stream through the DSMS's query processor.
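As a rough illustration of forecasting by linear regression and then applying a fault detection function to the forecast, consider the sketch below. It is not the implementation of [17]; it assumes equally spaced samples, fits an ordinary least-squares line to a recent window, and uses a simple upper limit as a stand-in fault detection function. All function names and the limit value are hypothetical.

```python
from typing import List, Optional, Sequence


def fit_line(ys: Sequence[float]):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; needs len(ys) >= 2."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(ys: Sequence[float], horizon: int) -> List[float]:
    """Extrapolate the fitted line `horizon` steps past the last sample."""
    slope, intercept = fit_line(ys)
    last = len(ys) - 1
    return [intercept + slope * (last + h) for h in range(1, horizon + 1)]


def first_violation(forecasted: Sequence[float], limit: float) -> Optional[int]:
    """Stand-in fault detection function: first step (1-based) exceeding `limit`."""
    for step, value in enumerate(forecasted, start=1):
        if value > limit:
            return step
    return None


# Hypothetical temperature window rising by 2 units per sample.
window = [70.0, 72.0, 74.0, 76.0]
predicted = forecast(window, horizon=5)        # [78.0, 80.0, 82.0, 84.0, 86.0]
steps_to_fault = first_violation(predicted, limit=80.0)  # 3
```

In a DSMS setting the window would be refreshed continuously, so the predicted number of steps until a violation would be re-evaluated as new data arrive.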
Data stream forecasting is useful as it can help in detecting failures early and increases the time available to respond, which is important, especially in cases of failures which occur suddenly or in a short time (such as seizures in the case of BRMAB). Using data stream forecasting for fault prediction allows the support system to gain a longer reaction time to handle early warnings of impending failures and thus increase system availability. In addition, the communication consumption in distributed data stream processing can be reduced through the use of data prediction [51].

Fault Detection Functions. The fault detection function (Box 6 in Figure 1) concerns the identification of fault occurrence, referring specifically in this paper to industrial equipment faults or failures. According to [9], fault detection methods can be classified into three categories: analytical-based methods, knowledge-based methods, and data-driven methods. The three categories of fault detection methods have differing advantages and disadvantages and may not be applicable in all cases [9]. For example, data-driven methods are applicable for both small and complex systems, whereas analytical-based methods are applicable for small systems [9]. On the other hand, analytical-based methods achieve higher accuracy than data-driven methods in detecting failures [9]. Knowledge-based methods are good at detecting preknown failures but not at detecting new types of failures, while data-driven methods are good at detecting new types of failures [52]. A comprehensive study and comparison between knowledge-based and data-driven methods can be found in [33].
In data-driven methods, data stream mining algorithms are useful in monitoring systems which concern data streams. These algorithms have the ability to extract patterns from continuous and fast-arriving data streams. A review of data stream mining algorithms and their application in monitoring systems was conducted by [11].
With regard to the three industrial companies and applications developed by the author, the methods used were of varying types, due to reasons which are discussed below. An analytically based method was used with data from AB Sandvik Coromant [42], whereas data-driven methods were used with data from both Volvo CE and BRMAB [11]. The author was also able to develop a knowledge-based fault detection model in the case of BRMAB [33]. A comparison between the results of applying the knowledge-based and data-driven methods for the BRMAB case can be found in [33]. According to [9], analytical-based methods achieve higher accuracy than other methods. In general, one might conclude that it is wise to consider using an analytically based method first if possible. In the case of AB Sandvik Coromant it was possible to develop such a model. In the case of BRMAB, it was not possible to use an analytically based method due to the complexity of hydraulic motors; therefore, both a data-driven method and a knowledge-based method were used. In the case of Volvo CE a data-driven method was used.
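For the analytically based case, the comparison between the power consumption expected from a mathematical model and the measured power consumption can be sketched as a residual check. The model used below (power proportional to a feed rate) and the relative tolerance are purely hypothetical placeholders; the actual model used with the Sandvik Coromant data is not given in this paper.

```python
from typing import Callable, Iterable, Iterator, Tuple


def residual_alarms(
    samples: Iterable[Tuple[float, float]],
    model: Callable[[float], float],
    tolerance: float,
) -> Iterator[Tuple[bool, float]]:
    """Yield (is_abnormal, residual) for each (model_input, measured) sample.

    A sample is abnormal when the measured value deviates from the model's
    expected value by more than `tolerance` as a fraction of the expected value.
    """
    for model_input, measured in samples:
        expected = model(model_input)
        residual = measured - expected
        yield abs(residual) > tolerance * abs(expected), residual


def toy_power_model(feed_rate: float) -> float:
    # Hypothetical analytical model: power grows linearly with feed rate.
    return 2.0 * feed_rate


stream = [(1.0, 2.05), (2.0, 5.0)]  # (feed rate, measured power), made-up values
results = list(residual_alarms(stream, toy_power_model, tolerance=0.1))
alarms = [abnormal for abnormal, _ in results]  # [False, True]
```

Keeping the model as a function argument means the same check can be reused with a different analytical model without changing the stream-handling code.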
In monitoring systems, it is common to use each fault detection method separately. However, a number of researchers, as discussed in Section 5.5, have integrated individual fault detection methods. Section 5.5 discusses the significant role of the integrator and how it could help improve the performance of the monitoring system.

Integrator. The integrator (Box 10 in Figure 1) concerns the way in which the various fault detection methods will be incorporated. The integrator can be seen as a predefined function that produces an output based on the applied fault detection methods. For example, [52] used Bayesian ranking inference to integrate a knowledge-based and a data-driven method. The authors used a knowledge-based method to detect preknown fault types, while the data-driven method was used to detect preknown and unknown fault types. Using expert system technology, Leung [...] These examples showed the importance of integrating multiple fault detection and diagnosis methods. The integrator (Box 10 in Figure 1) is intended to be formulated as a continuous query (CQ) which will be applied to the arriving data streams. Note that in the proposed monitoring system of Figure 1, there are two separate sources of data for the fault detection function component. The two input sources are the online real-time data stream and the forecasted data stream. The CQs of the fault detection function (including the integrator) will be applied to both sources of data in parallel, producing two types of outputs: output based on real data and output based on forecasted data.
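One simple way to picture the integrator as a predefined function over the outputs of several fault detection methods is a weighted vote, applied separately to the real and the forecasted sample. This is only an illustrative assumption; it is not the Bayesian ranking inference of [52], and the detector thresholds, weights, and function names below are hypothetical.

```python
from typing import Callable, Dict


def integrate(verdicts: Dict[str, bool], weights: Dict[str, float],
              threshold: float = 0.5) -> bool:
    """Weighted vote: report a fault when the weighted share of methods
    reporting a fault reaches `threshold`."""
    total = sum(weights[name] for name in verdicts)
    score = sum(weights[name] for name, fault in verdicts.items() if fault)
    return score / total >= threshold


def evaluate(real_sample: float, forecast_sample: float,
             detectors: Dict[str, Callable[[float], bool]],
             weights: Dict[str, float]) -> Dict[str, bool]:
    """Apply every detector and the integrator to both data sources."""
    return {
        "real": integrate({n: d(real_sample) for n, d in detectors.items()}, weights),
        "forecast": integrate({n: d(forecast_sample) for n, d in detectors.items()}, weights),
    }


# Hypothetical detectors standing in for the three method categories.
detectors = {
    "analytical": lambda x: x > 100.0,
    "knowledge_based": lambda x: x > 90.0,
    "data_driven": lambda x: x > 120.0,
}
weights = {"analytical": 0.5, "knowledge_based": 0.3, "data_driven": 0.2}

outputs = evaluate(real_sample=95.0, forecast_sample=105.0,
                   detectors=detectors, weights=weights)
# outputs == {"real": False, "forecast": True}
```

The two entries of the result correspond to the output based on real data and the output based on forecasted data described above.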

Queries from Engineers.
The proposed monitoring system can be used to answer other queries ("questions") from engineers (Box 15 in Figure 1), provide information needed for the industrial company, and provide several services on request. [18] discussed how the services can be obtained using data stream mining and DSMS technology. The authors proposed an approach that has the potential to significantly support continuous availability awareness in industrial systems.
The engineers' queries can be written using the DSMS's query language. The interpreter, through the DSMS's query processor, can then apply the engineers' queries continuously over the input data stream. The query output can be visualized through a GUI or dashboard, or used in the interpreter component of the proposed monitoring system; see Figure 1. Examples of such queries and how to visualize them can be found in [49]. The authors determined key queries (supporting industry partner engineers), such as monitoring and predicting the customer/operator usage and monitoring system/product performance, and showed how to visualize such tasks.

Interpreter. The interpreter component (Box 13 in Figure 1) is basically a set of queries which are executed by the DSMS's query processor. It receives several data stream inputs, such as the outputs from the fault detection system (output based on real data and output based on forecasted data) and the monitored data stream, and reads queries from engineers. In addition, the interpreter may use information saved in the local memory of the DSMS.
The interpreter has several tasks. It has to interpret the output from the fault detection function and issue the corresponding response action. For example, if there is an uncritical problem, such as the cooler not working efficiently, then the interpreter has to send an alert message to the operator. If there is a critical fault which may damage the machine, then the interpreter has to take a response action such as switching off the machine and also send a corresponding alert message. The interpreter may also answer engineers' queries, as in [49], which require the outputs from the fault detection function, the monitored data, and/or information saved in the local memory. The outputs of the interpreter are then passed to a postprocessing interface.
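The mapping from a fault detection output to a response action described above can be sketched as a small lookup. The severity levels and action names are hypothetical labels chosen for illustration; how severity is derived from the fault detection output is left open here.

```python
from enum import Enum
from typing import List


class Severity(Enum):
    NONE = 0
    UNCRITICAL = 1  # e.g., the cooler is not working efficiently
    CRITICAL = 2    # a fault that may damage the machine


def interpret(severity: Severity) -> List[str]:
    """Return the response actions for a given severity (illustrative only)."""
    if severity is Severity.CRITICAL:
        # Take a protective action and also notify the operator.
        return ["switch_off_machine", "send_alert"]
    if severity is Severity.UNCRITICAL:
        return ["send_alert"]
    return []
```

The returned actions would then be handed to the postprocessing interface, which is responsible for sending messages and triggering the actions.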
The various kinds of information that the interpreter receives can be utilized to gain several advantages, such as the following:
(i) Allows using one or more fault detection methods to monitor the different system components
(ii) Increases the accuracy and robustness of the prediction by comparing the output of different fault detection methods
(iii) Allows short- or long-term fault prediction by utilizing the forecasted data
(iv) The forecasted data might be utilized to detect concept drift
(v) Increases the ability to answer the various engineers' queries

5.8. Interfaces. Interfaces (Boxes 4, 14, and 16 in Figure 1) are used to facilitate the interactions between the different components. For example, the interface between the data source and the DSMS is used to make the data format readable by the DSMS. On the other hand, a graphical user interface makes the interaction between the engineers or operators and the monitoring system more user-friendly (Boxes 14 and 16 in Figure 1). [18] discussed the importance of developing a flexible GUI which is able to visualize the engineer queries and the corresponding results. The postprocessing interface (Box 14, Figure 1) can be used to visualize the answers to the queries and the real and expected values of various parameters, to send alert messages and take necessary actions, and to create a customized dashboard. A proof-of-concept of such a component was implemented and evaluated in [39]. The authors showed how to integrate query-based visualization methods into a DSMS to visualize the query results of industrial data streams.

Conclusions
In this paper, a summary of a literature review regarding processing data streams in real time for industrial fault detection is presented. The scope of the paper is the industrial application of data stream analysis, industrial needs, and consequent requirements on the design of the future DSMS. For this paper, the author has summarized and deduced the industrial needs and challenges that other researchers have previously identified. Most importantly, the author has had a longstanding collaboration with industrial partner companies regarding data stream management and analysis.
Furthermore, suggestions for improving monitoring systems were discussed. Then a DSMS-based monitoring system was proposed to show how the suggestions can be implemented. The proposed monitoring system benefits from integrating multiple fault detection methods, i.e., analytical methods, knowledge-based methods, and data-driven methods, and using historical data, online real-time data streams, and forecasted data streams.

Data Availability
No underlying data were collected or produced in this study.

Disclosure
This work is based on conclusions and tests from the author's previous research (which was funded by the SSPI (SSF) and the iStreams (VINNOVA) projects).