Interface Data Modeling to Detect and Diagnose Intersystem Faults for Designing and Integrating System of Systems

In system of systems engineering, system integrators are responsible for compatible and reliable interfaces between subsystems. This study presents a systematic solution to identify and diagnose interface faults during the design and integration of systems of systems. Because the systems targeted in this study are real underwater vessels, we first anatomized 188 message types transferred between 22 of their subsystems. Based on this analysis, two interface data models are proposed, each of which includes data sets for messages and their inner fields as well as transition and decision functions over them. Specifically, a structure model at the message level evaluates how inner fields belong to a message, and a logic model at the field level assesses how each field is interpreted and whether the interpreted value is understandable. The software that supports the modeling is implemented using the following concepts: (1) a model-view-viewmodel pattern for overall software design and (2) a computer network for representing sequential properties of field interpretations. The proposed modeling and software facilitate diagnostic decisions by checking the consistency between interface protocols and the real data obtained. As a practical use, the proposed work was applied to an underwater shipbuilding project. Within 10 interfaces, 14 fault cases were identified and diagnosed. They were gradually resolved during the system design and integration phases, which formed the basis of successful submarine construction.


Introduction
A complex system such as an automotive, marine, or aerospace system of systems (SoS) contains diverse subsystems that must be designed and integrated to work together [1][2][3]. In an underwater vessel, for example, an inertial navigation system (INS) receives speed over water and locational information from an electromagnetic log (EM log) and the Global Positioning System (GPS), respectively [4]. These inputs enable the INS to enhance the computational accuracy of its orientation and velocity. In this context, the input data of the INS are used to compute its outputs precisely; in turn, those outputs are a basis for estimating the geographic location of the vessel in water. Thus, understandable and reliable interfaces between the subsystems are the main prerequisite for organizing the subsystems as an integrated system at the corporate level [5][6][7].
When designing and integrating a complex SoS, a system integrator has difficulty identifying interface faults for the following reasons. First, the two subsystems of an interface are commonly developed by different manufacturers, which gives rise to disparate implementations of the same interface protocols [8]. Furthermore, for easy modifiability and scalability, the manufacturers still prefer customized protocols to standardized ones [9,10]. As a result, the protocols are occasionally revised during the system design phase as well as the integration phase.
This study suggests a systematic solution for how a subsystem successfully interacts with its counterpart when they are designed and integrated into the whole system. Specifically, we have focused on resolving the interface faults (i.e., anomalies) during node-to-node delivery over the digital network. Our goal in this study is to ensure compatible and reliable interfaces by checking the consistency between the interface protocols and the obtained interface data.
In order to transfer sensitive information, a sending system encodes interface data (i.e., messages) so that only authorized receivers can understand them [11]. The encoding rules specify which payloads (i.e., fields in this study) are structured in a message and how each field is logically converted; these rules are described in the interface protocols [12]. Because the protocols, including the structural and logical rules, are diverse and complicated, they should be explored first in order to overcome the interface faults. To this end, we have analyzed interface protocols used in domestic ship systems that are already in operation or under construction. Messages transmitted via radar, navigation, acoustic, and optical sensor systems as well as several control systems were investigated in this study. We anatomized 188 message types exchanged between 22 subsystems for message structures and field logics.
Based on the preanalysis, we propose two modeling formalisms, which contain data sets and functions [13]. A structure model at the message level formalizes how many fields belong to a message and how the message is structured, and a logic model at the field level assesses how each field is interpreted and whether the interpreted value of the field is understandable for receiving systems. The models receive interface data as input. Then, they detect and diagnose the faults via the transition functions and output the results through the decision functions. In the proposed modeling, we classified five structure types at the message level and modularized several transition functions at the field level. With the proposed formalism, a modeler can specify the interface protocols and diagnose the faults in messages and inner fields in a systematic rather than an ad hoc manner [14][15][16].
Over the last decade, several fault detection and diagnosis methods have been developed for various systems and applications. Some researchers have developed system models for representing real systems and compared model-predicted outputs with obtained system outputs [17][18][19], and others have centered on output signals of the systems to analyze their features or patterns for fault detection and diagnosis [20][21][22]. This study combined these two methods. We focused on input/output (I/O) signals within digital interfaces; at the same time, the signals are explicitly formalized in the two-level models to detect and diagnose interface faults of an arbitrary interface. To the best of our knowledge, no work has been reported that focuses on fault detection and diagnosis during the system design and integration phases.
To realize the proposed models effectively, we have used the model-view-viewmodel (MVVM) design pattern in Windows Presentation Foundation (WPF) technology [23]. In the developed software, block libraries for modeling elements are provided to offer the benefits of a graphical modeling environment. In addition, a computer network concept has been applied, in which nodes and connections are used to build the overall logic model. Thus, the software facilitates intuitive modeling via libraries of structural delimiters and logical operations and allows flexible modeling through their creation and revision.
As a practical use, the proposed work was applied to an underwater shipbuilding project, namely, a submarine renovation project [24]. Ten digital interfaces connected to improved subsystems were examined to resolve the interface faults. Seven tests were performed at sea to find the faults in various operational situations, and two tests that allowed preparation for the sea-trial tests were conducted in a harbor. The empirical results showed that 14 fault cases, which are either structural or logical, were detected and diagnosed during the design and integration of the renovated submarine. These incorrect patterns in the interfaces were successfully resolved during this project.
The study is organized as follows. Section 2 describes the fault scope of our focus, and Section 3 analyzes previous works. Section 4 proposes modeling methods and the realization of the modeling as a software tool. Section 5 explains and discusses an application to the shipbuilding project. Finally, Section 6 presents our conclusions.

Background
A fault is defined as an unpermitted deviation of at least one characteristic property or parameter of a system from the acceptable, usual, or standard condition [25]. Because various kinds of faults can occur when a system is under development as well as in operation, the fault scope of interest in this study needs to be clarified here.

Interface Faults in Complex System Development.
Figure 1 shows a simplified illustration of how a fault is identified and diagnosed in complex shipbuilding engineering. As explained in the introduction, the basic concept for resolving the fault is to evaluate consistency by comparing the interface data with the interface protocol, which includes structural and logical rules.
The ship system as an SoS is generally composed of diverse sensing equipment and dynamic systems, which are incorporated into an integrated system [26,27]. It has been noted that the majority of end systems such as sensors or actuators do not plug directly into the central network [28]. Instead, each connects to a local proxy through a point-to-point link, and the proxies in turn are distributed across a central bus network. The signal-processing unit, data integration system (DIS), and integrated management system in Figure 1 act as such proxies.
When designing and integrating the subsystems for the overall system, faults can be found in the central network as well as outside the network, specifically in the point-to-point links between the local subsystems [29]. In this study, we focused on the local faults rather than the central faults for the following reasons. First, the local faults occur more frequently than the central faults due to disparate implementations of the same protocol. This problem accords with the current industrial tendency in which interoperability testing for communication between connected systems is becoming more important [30,31].
Next, the local faults need to be identified first, because these failures directly influence the central part [32]. For example, if an EM log sends speed information with the wrong unit or resolution, other subsystems using the speed (e.g., navigation sensor or echo sounder) behave abnormally in sequence. Thus, the local faults have often been ascribed to uncontrolled, unanticipated, and unwanted interactions between the subsystems [33,34].
In this respect, this study focuses on the local faults during node-to-node delivery over digital interfaces when designing and integrating the SoS. The local faults were classified into structural and logical levels, which are explained in the following subsection.
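The EM log example can be made concrete with a small numerical sketch. The raw count, the two resolutions, and the function names below are hypothetical and are not taken from any protocol analyzed in this study; the sketch only illustrates how a resolution mismatch on one point-to-point link distorts a quantity computed downstream.

```python
# Hypothetical illustration: sender and receiver disagree on the
# resolution of a speed field, and the error propagates downstream.

def decode_speed(raw_count: int, counts_per_knot: int) -> float:
    """Convert a raw integer count into knots for a given resolution."""
    return raw_count / counts_per_knot


def distance_nm(speed_knots: float, hours: float) -> float:
    """Dead-reckoning distance in nautical miles."""
    return speed_knots * hours


RAW = 125                       # assumed raw value on the wire
SENDER_COUNTS_PER_KNOT = 10     # sender encodes with 0.1-knot resolution
RECEIVER_COUNTS_PER_KNOT = 100  # receiver wrongly assumes 0.01-knot resolution

intended = decode_speed(RAW, SENDER_COUNTS_PER_KNOT)
received = decode_speed(RAW, RECEIVER_COUNTS_PER_KNOT)
position_error = distance_nm(intended - received, hours=1.0)

print(f"speed intended by the sender : {intended} kn")
print(f"speed decoded by the receiver: {received} kn")
print(f"dead-reckoning error after 1 h: {position_error} nm")
```

The message itself is structurally valid in this case, which is why such a fault has to be caught by comparing the interpreted value against the protocol rather than by inspecting the message format alone.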

Classification of Local Faults.
Figure 2 illustrates the exchange of digital data between two subsystems. In digital communication, the data are a sequential stream of bytes at the physical layer [35]. In this study, we assumed that the byte stream was already transformed into manageable data units (i.e., messages).
An individual message has a common format that distinguishes it from other types of messages [36]. For example, a message may have a fixed-length structure or may include several fields for transmitted information as well as supplemental delimiters such as a header and a footer. Since the message is usually encoded for information security, it eventually needs a conversion process that restores the original sequence of information [36]. These structural and logical rules are contained in the communication protocol, which is an agreement between the manufacturers on both sides and the system integrator.
As shown in Figure 2, the faults in an individual interface data item are hierarchically classified into two levels: structural and logical. A structural fault is an incorrectness in the exterior of a message. A wrong message length or different delimiters in the message are structural faults. In contrast, a logical fault occurs when a field conveyed in the message is semantically incorrect. Uninterpretable data, unmeasurable values, or unspecified status information in the field corresponds to this level.
Because the structural and logical rules are diverse and complicated, they should be explored first to detect and diagnose their faults. To this end, we analyzed interface protocols for real messages transferred in domestic naval vessels. Table 1 shows the preinvestigation regarding two types of real submarine systems. The first-generation submarines have been in operation domestically, and the third-generation submarines are under construction. In total, 188 message types in 22 subsystems were anatomized for message formats and field logics. We generalized how the messages were constructed to be distinguished from others and which rules were required to interpret the fields as meaningful information.
In summary, this study introduces a practical concept for identifying and diagnosing the structural and the logical faults during the design and integration of the complex SoS. Based on the preanalysis, we formalize a structure model at the message level and a logic model at the field level. The proposed models are implemented in a software tool, which facilitates intuitive and flexible modeling to detect and diagnose the interface faults.

Literature Review
Over the last decade, several fault detection and diagnosis studies have been carried out for various systems and applications. In this section, we have classified them into three approaches, which are summarized in Table 2.
In model-based approaches, system models are developed to describe the relationships among main system variables [39][40][41]. Based on the models, fault diagnosis algorithms have been developed to monitor the consistency between the measured outputs of the practical systems and the model-predicted outputs [42]. For example, Cai et al. [17] used object-oriented Bayesian networks (OOBN) to model complex systems. The OOBN-based modeling is divided into structure modeling and parameter modeling, which are built with sensor historical data and expert knowledge. Lamperti and Zhao [18] focused on the diagnosis of active systems, and the diagnosis rules in the proposed finite state machine (FSM) were specified based on associations between context-sensitive faults and regular expressions. Poon et al. [19] used a model-based state estimator to generate an error residual that captures the difference between the measured and estimated outputs. These model-based approaches require explicit models, whose accuracy determines the diagnosis performance.
In contrast, signal-based approaches make diagnostic decisions based on features or patterns of the extracted signals rather than on system models [43,44]. For example, Loza et al. [20] proposed a nonhomogeneous high-order sliding mode observer to estimate sensor signal faults in finite time and in the presence of bounded disturbances. In Do and Chong's work [21], the vibration signal was translated into an image; the local features were then extracted from the image using the scale-invariant feature transform for fault detection and isolation under a pattern classification framework. Pan et al. [22] proposed an acoustic fault detection method for the gearbox based on an improved frequency-domain blind deconvolution flow. The signal-based approaches generally extract the major features of the output signals for fault diagnosis, but they pay less attention to system inputs [45].
This study has combined the two approaches. To detect and diagnose faults in interface digital signals, signal patterns under a normal status, which were known from the interface protocols, were generalized. Input signals as well as output signals were targeted because the interfaces of interest in this study generally carry signals in both directions. Then, the patterns were formalized with mathematical models (i.e., interface data models). The proposed models had explicit sets and functions, and the fault diagnosis within the modeling was carried out by checking the consistency between the structural and logical patterns and the measured signals. In short, we focused on the interface signals and developed explicit data models with deterministic criteria [46,47].

Table 1 (note and recoverable entries): Subsystems to be analyzed are (1) radar, navigation, acoustic, and optical sensor systems and (2) control systems such as plotting boards and weapons control systems. Entries: data integration system (DIS); depth sonar transmitter-receiver.

Table 2 (recoverable rows):
Approach and study not recovered: the complex-valued fixed point algorithm was used for frequency domain signal; power transfer system.
Interoperability testing, Vijayaraghavan et al. [49]: to provide a common means for communication between devices; a data exchange standard was proposed; manufacturing system.
Interoperability testing, Shin et al. [38]: to analyze the operating situations of the systems at the system-integration phase; a message-description language was used to convert the raw interface data into the interpreted data format; ship system.
Most importantly, the above studies have been applied to complex systems in operation. For the system design and integration phases, interoperability testing is a concept similar to our approach [37,48]. For example, Vijayaraghavan et al. [49] proposed an open communication standard for data interoperability. The proposed open protocol provides a mechanism for process and system monitoring and for optimization with respect to resources. The work of Shin et al. [38], which is similar to the present study, involved the development of an analysis tool to confirm the integrated performance of a complex system. To analyze the data, a message-description language was used to convert the raw interface data into the interpreted data format. Despite their practical contributions, however, these studies cannot diagnose the faults within the interfaces. To the best of our knowledge, no work has been reported that focuses on fault detection and diagnosis during the system design and integration phases.

Proposed Work
4.1. Software Architecture.
Having introduced the concept of interface data and their faults, we introduce the overall design of the developed software in this subsection. The focus of the software is to (1) represent the hierarchies of the interface data (i.e., project, interface, message, and field) and (2) realize structure and logic modeling with a graphical user interface (GUI).
To provide a flexible GUI for modelers, a specific software design pattern was used. The MVVM design pattern in WPF technology helps decouple the GUI from model logic and data [50,51]. The model in the MVVM pattern is an implementation of the application's domain model that includes a data model along with business logic, and the view is responsible for defining the layout and appearance of what the user sees on the screen. The view model acts as an intermediary between the view and the model and is responsible for handling the view logic. Because the view model retrieves data from the model and then makes the data available to the view, in this subsection, we focus on the view models that realize the proposed modeling.
A class diagram for the major view models of the developed software is described in Figure 3. Two ViewModelBase classes serve as base classes for the other view model classes. The left part of Figure 3 indicates the hierarchies for interface data modeling from a project to a field, and the right part shows the specific elements used to model a field logically.
The ProjectViewModel class takes charge of resolving interface faults for a particular project. After a project is determined, TagViewModel manages multiple tests in the project. In the following application section, nine tests for the submarine renovation project were managed by this class. The InterfaceViewModel handles the operations for evaluating an individual interface (e.g., loading interface data, analyzing them with interface protocols, and visualizing fault results). Therefore, it is mainly composed of the following properties: Messages for the interface data, MessageSpecs for the interface protocols, and MessageReports for the fault reports. Finally, the MessageModelingViewModel facilitates structure modeling at the message level, and the FieldModelingViewModel enables logic modeling at the field level. In this way, the architecture supports hierarchical modeling: an interface provides the means for an arbitrary number of messages, and a message in turn comprises multiple fields.
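The hierarchy described above (project, test, interface, message, field) can be sketched as plain data containers. This is only an illustration in Python rather than the WPF implementation of the developed software; all class and attribute names are hypothetical stand-ins for the view models named in the text, and the protocol and report entries are reduced to strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldData:
    name: str
    raw: bytes                      # raw bytes carried by this field


@dataclass
class MessageData:
    name: str
    fields: List[FieldData] = field(default_factory=list)


@dataclass
class InterfaceData:
    name: str
    messages: List[MessageData] = field(default_factory=list)  # obtained interface data
    message_specs: List[str] = field(default_factory=list)     # protocol descriptions
    message_reports: List[str] = field(default_factory=list)   # fault reports


@dataclass
class TrialTag:
    tag: str                        # one test within the project
    interfaces: List[InterfaceData] = field(default_factory=list)


@dataclass
class Project:
    name: str
    trials: List[TrialTag] = field(default_factory=list)


def count_fields(project: Project) -> int:
    """Walk the hierarchy and count every field in every message."""
    return sum(
        len(message.fields)
        for trial in project.trials
        for interface in trial.interfaces
        for message in interface.messages
    )


# Hypothetical content: one test, one interface, one message with two fields.
message = MessageData("position", [FieldData("lat", b"\x00\x01"), FieldData("lon", b"\x00\x02")])
project = Project("renovation", [TrialTag("test-1", [InterfaceData("GPS", [message])])])
print(count_fields(project))
```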

In particular, the logic model at the field level recognizes logical rules and diagnoses faults by comparing received field data to the rules. The logical rules generally contain multiple steps to decode the raw data into interpretable information. Therefore, to model a field in stages, the developed software provides eight elements for the rules, which are illustrated in the right part of Figure 3 (from ConversionFunctionViewModel to FlowJunctionViewModel). The remaining view models (i.e., InputFieldViewModel, NormalOutputViewModel, and FaultOutputViewModel) are for the inputs and outputs of the logic model. Because these elements have their own views in the MVVM pattern, the developed software offers a graphical modeling approach that helps modelers visualize every element used to model a field. In the following subsections, we explain the methodological aspects of structure and logic modeling at the message and the field levels, respectively.

4.2. Structure Modeling at Message Level.
Figure 4 illustrates the elements of the proposed model, which is either a structure or a logic model. The model receives interface data (X in Figure 4) as input and sends fault results (Y) as output. Inside the model are one total state and three functions, which are depicted as a circle and squares in Figure 4. The model state S, which contains the received data, the conditions for the rules, and the functional results, is updated by two transition functions.
If an input occurs, $\delta_{ext}$ interprets the input data with the rules and updates S (① and ② in Figure 4). As explained in the previous subsection, the logical rules need multiple interpretations. In this instance, after $\delta_{ext}$, $\delta_{int}$ updates S without any input. Note that $\delta_{int}$ is carried out sequentially until the interpretations are completed (③ and ④). This is a general situation for logic modeling at the field level, which will be explained in the following subsection. Finally, when all $\delta_{ext}$ and $\delta_{int}$ are fulfilled, $\omega$ decides fault results based on the updated state and outputs the decision (⑤ and ⑥).
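The execution order described above can be sketched as a small driver: one external transition, a sequence of internal transitions, then the decision function. This is a minimal sketch under stated assumptions, not the implementation of the developed software. The `State` container, the early stop on a detected fault, and the two-byte speed field with its limit of 500 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class State:
    value: object                  # current interpretation of the input
    fault: Optional[str] = None    # set once a rule is violated (assumption)


def run_model(
    x: bytes,
    delta_ext: Callable[[bytes], State],
    delta_ints: List[Callable[[State], State]],
    omega: Callable[[State], str],
) -> str:
    s = delta_ext(x)               # steps 1-2: input arrives, S is updated
    for delta_int in delta_ints:   # steps 3-4: internal transitions, no new input
        if s.fault is not None:    # assumption: stop interpreting after a fault
            break
        s = delta_int(s)
    return omega(s)                # steps 5-6: decide and output the result


# Hypothetical rules for a two-byte, big-endian speed field.
def load_raw(x: bytes) -> State:
    return State(x) if len(x) == 2 else State(x, "wrong field length")


def to_uint(s: State) -> State:
    return State(int.from_bytes(s.value, "big"))


def check_max(s: State) -> State:
    return s if s.value <= 500 else State(s.value, "value out of range")


def decide(s: State) -> str:
    return "normal" if s.fault is None else f"fault: {s.fault}"


for raw in (b"\x00\x7d", b"\xff\xff", b"\x01"):
    print(raw, "->", run_model(raw, load_raw, [to_uint, check_max], decide))
```

Because the driver has no notion of time, the result depends only on the input and the configured rules.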
The proposed models are derived from the discrete event system specification (DEVS), which is a general mathematical representation for discrete event systems [52][53][54]. The main difference between the DEVS and our models is the target to be modeled. The DEVS focuses on the system itself. Thus, it should represent the behaviors of the system as time passes. In contrast, because our models aim at the interface data rather than the system, they have no concept of time. In other words, our models are static, in that the output depends only on the input given at that moment.
According to Figure 4, the proposed structure model is a 5-tuple:

$SM_{Message} = \langle X, Y, S, \delta_{ext}, \omega \rangle$

where X is an input set of n fields comprising a message, in which $f_i$ is the value of the i-th field within the message and $l_i$ is the length of $f_i$; Y is an output set of structural faults; S is a state set; $\delta_{ext}$ is the external transition function; and $\omega$ is the fault decision function. Components of $SM_{Message}$ are contained within $\langle\ \rangle$. Every notation in $SM_{Message}$ is based on set theory. For example, $\{\ \}$ means a set; $\times$ indicates the Cartesian product (i.e., all possible ordered pairs); $2^X$ means the power set of X; and $\rightarrow$ means the function mapping.
The structure model evaluates how many fields belong to a message and what distinguishes the message. Specifically, the external transition function, $\delta_{ext}$, receives all the fields comprising a message and the appropriate delimiters, and it updates the transition result, indicating whether the message has a correct format. Because one or more fields are mapped into one delimiter, $\delta_{ext}$ needs X in the form of the power set. Note that the internal state transition function is not required in the structure model. Finally, the fault decision function, $\omega$, checks the current state and produces a fault result.
Based on the preinvestigation in Table 1, the message structures were divided into five types, which are shown in Table 3. The structure types are classified depending on how the following delimiters are used: a header, a footer, and a message length. For the types with a header and a footer, $\omega$ checks whether the first and the last fields match the header and the footer, respectively. The length is known in two ways: (1) it is computed by adding the lengths of all the fields, or (2) it can be found within a specific field, for example, $(f_i, l_i)$ in Table 3. The structure model is relatively simple to design because it decides only the correctness of the message exterior. As explained in the previous subsection, the MessageModelingViewModel in Figure 3 realizes the structure model.
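The delimiter checks described above can be sketched as follows. The five structure types of Table 3 are not reproduced here; instead, a single hypothetical `StructureSpec` makes each delimiter optional, so that a given type is expressed by which delimiters are set. The field layout, the byte order of the length field, and the fault labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StructureSpec:
    header: Optional[bytes] = None      # expected first field, if any
    footer: Optional[bytes] = None      # expected last field, if any
    fixed_length: Optional[int] = None  # expected total length in bytes, if fixed
    length_field: Optional[int] = None  # index of a field that declares the total length


def check_structure(fields: List[bytes], spec: StructureSpec) -> List[str]:
    """Return the structural faults found in one message (empty list if none)."""
    faults: List[str] = []
    if spec.header is not None and (not fields or fields[0] != spec.header):
        faults.append("header")
    if spec.footer is not None and (not fields or fields[-1] != spec.footer):
        faults.append("footer")

    total = sum(len(f) for f in fields)  # way (1): add the lengths of all fields
    if spec.fixed_length is not None and total != spec.fixed_length:
        faults.append("length")
    if spec.length_field is not None:    # way (2): length declared in a field
        if spec.length_field >= len(fields):
            faults.append("length")
        elif int.from_bytes(fields[spec.length_field], "big") != total:
            faults.append("length")
    return faults


# Hypothetical message: header, one-byte declared length, payload, footer.
spec = StructureSpec(header=b"$", footer=b"\n", length_field=1)
good = [b"$", b"\x06", b"123", b"\n"]
bad_header = [b"#", b"\x06", b"123", b"\n"]
bad_length = [b"$", b"\x07", b"123", b"\n"]

for message in (good, bad_header, bad_length):
    print(message, "->", check_structure(message, spec))
```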
A message in the GPS, which will be explained in Section 5.3, serves as a modeling example of the structure model.

The logic model at the field level, $LM_{Field}$, is specified with the following sets: X is an input set of fields, where $v_{tgt}$ is the initial value of the target field and $v_{ref}$ is the initial value of the reference field; Y is an output set of logical faults, where r is the fault result of the target field; $S = F \times P \times R$ is a total state set, where F is the set of values for the target field, in which $f_{int}$ is the initial value of the target field and $f_{dec}$ is the decoded value of $f_{int}$ after the previous transition function; P is the set of parameters for the operations; and R is the set of results, in which $r_t$ is the transient result after the interim transition function and $r_f$ is the final result after the last transition function.
A key element of $LM_{Field}$ is the field transition functions (i.e., $\delta_{ext}$ and $\delta_{int}$); thus, we have summarized their practical types in Table 4. In common with Table 3, the functions were also derived from Table 1.
The five categories in Table 4 show that three types of operations are provided to (1) convert the data type of the field to another, (2) compute it arithmetically, and (3) compare it with a criterion. To express the complicated patterns in the fields, we identified three special functions: user-defined equation, regular expression [55], and ASCII expression ($\delta_{12}$ to $\delta_{14}$ in Table 4). For example, we assume that a bearing field has three ASCII characters to represent a three-digit number. The field additionally guarantees that the seventh bit of the third character is always set if the bearing is not newly updated. Otherwise, under normal states, the field is interpreted as a three-digit number from zero to 359. In this case, we simplified logic modeling by using two regular expressions to compare the field with patterns, that is, [0-3][0-5][0-9] for a normal state and [p-s][0-5][0-9] for an abnormal state. Finally, these operations are used together with a flow control operation to make a conditional statement. Parameters for the operations are specified in P in $LM_{Field}$.
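The two patterns can be tried out with Python's `re` module. This is a minimal sketch: the patterns are the ones stated in the text, while the function name and the result labels are hypothetical. The last lines show a plain ASCII fact that connects the two patterns: setting bit 0x40 (the seventh bit when counting from 1 at the least significant bit, which is an assumption about the counting convention) on the digits '0' to '3' yields the letters 'p' to 's'.

```python
import re

# Patterns as stated in the text; fullmatch requires exactly three characters.
NORMAL = re.compile(r"[0-3][0-5][0-9]")
NOT_UPDATED = re.compile(r"[p-s][0-5][0-9]")


def classify_bearing(text: str) -> str:
    """Label a three-character bearing field (labels are hypothetical)."""
    if NORMAL.fullmatch(text):
        return "normal"
    if NOT_UPDATED.fullmatch(text):
        return "not updated"
    return "fault"


for sample in ("123", "q23", "9x1"):
    print(sample, "->", classify_bearing(sample))

# ASCII relation between the two character classes.
flagged = "".join(chr(ord(c) | 0x40) for c in "0123")
print(flagged)
```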
Let us explain how the transition functions in Table 4 are used in a real case. As explained previously, an Electronic Support Measure (ESM) sends bearing information encoded with three ASCII characters. The following steps show the overall procedure by which the receiving system decodes the bearing information:
(1) The receiving system first identifies the availability of the bearing field by checking another reference field within the same message ($\delta^{10}_{ext}$, $\delta^{15}_{ext}$).
(2) If the bearing field is not available, the system confirms that its hexadecimal values are all 0x20 ($\delta^{14}_{int}$).
(3) If the bearing field is available, the system next identifies whether the field is newly updated by comparing its values with predefined patterns ($\delta^{13_1}_{int}$).
(4) If the bearing field is proved to be newly updated, the system checks whether the numeric value is within the valid range from zero to 360 ($\delta^{13_2}_{int}$).
(5) The system finally converts its current data type, that is, ASCII characters, into an unsigned integer ($\delta^{1}_{int}$).
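The five steps above can be sketched as one decoding function. This is a sketch under stated assumptions rather than the receiving system's actual logic: the encoding of the reference field (a single byte `b"A"` meaning "available") and the diagnosis labels are hypothetical, and the range check in step (4) reuses the regular expression for the normal state given earlier in the text.

```python
import re
from typing import Optional, Tuple

NOT_UPDATED = re.compile(rb"[p-s][0-5][0-9]")  # pattern for the abnormal state
IN_RANGE = re.compile(rb"[0-3][0-5][0-9]")     # pattern for the normal state
AVAILABLE_FLAG = b"A"                          # hypothetical reference value


def decode_bearing(target: bytes, reference: bytes) -> Tuple[Optional[int], str]:
    """Return (decoded bearing or None, diagnosis label)."""
    # (1) Check availability through the reference field.
    if reference != AVAILABLE_FLAG:
        # (2) An unavailable field must consist of 0x20 bytes only.
        if all(byte == 0x20 for byte in target):
            return None, "unavailable"
        return None, "fault: unavailable field is not blank"
    # (3) Check whether the field is newly updated.
    if NOT_UPDATED.fullmatch(target):
        return None, "not updated"
    # (4) Check that the digits describe a valid bearing.
    if not IN_RANGE.fullmatch(target):
        return None, "fault: bearing out of range"
    # (5) Convert the ASCII characters into an unsigned integer.
    return int(target.decode("ascii")), "normal"


samples = [(b"123", b"A"), (b"   ", b"V"), (b"123", b"V"), (b"q23", b"A"), (b"400", b"A")]
for target, reference in samples:
    print(target, reference, "->", decode_bearing(target, reference))
```

Each early return corresponds to one branch of the conditional flow, so a fault is diagnosed at the step where the field first deviates from the protocol.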
These steps are performed sequentially according to the results of the previous step. In this case, for logic modeling, six transition functions were used, including one conditional flow function. The overall specifications are as follows (the bold fonts in $\delta$ mean updated parts):

To realize the sequential characteristics of the logic modeling as software, a computer network concept is applied. Figure 5 shows a schematic illustration of a network configuration, which is a collection of nodes and connections. The nodes are linked to each other by connections, and the connectors in the nodes are anchor points to attach connections between the nodes. For example, Node 1 corresponds to X of $LM_{Field}$, Node 6 and Node 7 correspond to Y, and the others represent the two transition functions, $\delta_{ext}$ or $\delta_{int}$. A major difference from a typical computer network is that the network in Figure 5 is a one-way communication and not a two-way interaction; that is, all the connections have directions to pass the data to the node on the right.
Figure 6 shows a class diagram for logic modeling based on the network configuration. The DiagramViewModel visualizes and edits the overall modeling of a field. Nodes and Connections, as properties of this class, specify the collections of nodes and connections to be displayed in the logic modeling. In NodeViewModel, InputConnectors and OutputConnectors are the collections of connectors that specify the node's connection anchor points, and AttachedConnections retrieves the collection of connections that are attached to the node. The Element determines the type of the node. The ConnectionViewModel describes a connection between two nodes, specifically between one connector in each node (i.e., the SourceConnector and the DestConnector). This connection continuously monitors its source and destination connectors. Finally, the ConnectorViewModel indicates an anchor point on a node for attaching a connection. The ParentNode in this class references the node that owns the connector.
Figure 7 shows the modeling execution of the bearing field previously described, that is, $LM_{bearing}$. The developed software provides two views: a list view in the form of the ribbon command bar and a model view for building the model. The list view provides block libraries of modeling elements, in particular the transition functions in Table 4 (the red box in Figure 7). Using the libraries, a modeler can, for example, choose an appropriate element from the list view based on the model design and drag it to the model view. By connecting pairs of elements with lines, the modeler can easily build a sequence of transition and decision functions. In Figure 7, the yellow boxes are realizations of $LM_{bearing}$. In this manner, the developed software facilitates intuitive modeling via the block libraries and allows flexible modeling through the addition and deletion of the modeling elements.
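The one-way network of nodes, connectors, and connections can be sketched in a few lines. This is an illustration of the concept only and not a port of the view models in Figure 6: the `Diagram` and `Connection` classes, the convention that each node returns the name of the output connector to follow, and the example nodes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A node element maps a value to (name of the output connector to follow, new value).
Element = Callable[[object], Tuple[str, object]]


@dataclass(frozen=True)
class Connection:
    source_node: str
    source_connector: str
    dest_node: str


@dataclass
class Diagram:
    nodes: Dict[str, Element]
    connections: List[Connection]

    def run(self, start: str, value: object) -> Tuple[str, object]:
        """Pass the value along directed connections until an output node is reached."""
        current = start
        for _ in range(len(self.nodes) + 1):  # guard: a one-way diagram has no cycles
            connector, value = self.nodes[current](value)
            targets = [
                c.dest_node
                for c in self.connections
                if c.source_node == current and c.source_connector == connector
            ]
            if not targets:                   # no outgoing connection: output node
                return current, value
            current = targets[0]
        raise RuntimeError("diagram contains a cycle")


diagram = Diagram(
    nodes={
        "input": lambda v: ("out", v),
        "is_digits": lambda v: ("yes" if v.isdigit() else "no", v),  # flow junction
        "to_int": lambda v: ("out", int(v.decode("ascii"))),         # conversion
        "normal_output": lambda v: ("", v),
        "fault_output": lambda v: ("", v),
    },
    connections=[
        Connection("input", "out", "is_digits"),
        Connection("is_digits", "yes", "to_int"),
        Connection("is_digits", "no", "fault_output"),
        Connection("to_int", "out", "normal_output"),
    ],
)

print(diagram.run("input", b"123"))
print(diagram.run("input", b"12x"))
```

Adding or removing a rule amounts to editing the node and connection collections, which mirrors the flexibility that the graphical modeling environment aims for.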

Application
The objective of this application is to demonstrate how we can detect and diagnose interface faults when designing and integrating a complex SoS. The targeted system is an underwater vessel (i.e., a submarine system). The faults were (1) incorrect interface protocols during the design phase and (2) abnormal values in interface data during the integration phase.
5.1. Shipbuilding Project Overview.
Due to budget constraints, the Navy can no longer afford to build new ships beyond its existing military force [28]. In this context, the product improvement program (PIP) is a good alternative.
The PIP incorporates improvements of partial systems to enhance overall system performance. Because it reduces the procurement time and lowers the maintenance costs compared to the development of an entirely new system, PIP has become a trend in several industrial fields [56][57][58][59].
Since late 2014, the South Korean shipbuilder Daewoo Shipbuilding and Marine Engineering (DSME) has been carrying out a PIP for three submarine systems that correspond to the first-generation class in Table 1 [24]. The submarines' onboard subsystems, including navigation, acoustic, optical, and radar sensors as well as combat systems, have been renovated. The PIP will be finalized in 2019.
For this PIP, the compatibility of the improved subsystems with existing ones at the I/O level is a key consideration. Thus, two phases in the system development life cycle (SDLC) require interface fault-handling activities. To be specific, the validity of the interface protocols needs to be assured so that they accurately represent the interface data between the linked subsystems. Then, the developed subsystems should be verified at the I/O level by comparing the interface data with the valid protocols. In this respect, the DSME has carried out several tests to resolve interface faults using the proposed method and software at the design phase as well as the integration phase. More detailed descriptions of this PIP were given in our previous work [12].

5.2. Design of Tests. As shown in Table 5, the proposed modeling and software have been utilized for nine shipboard tests over the last three years. Until the first half of 2017, the preliminary and critical design phases proceeded for the first renovated submarine. During this period, eight tests were conducted. Thereafter, all the subsystems were completely developed; thus, they have been integrated into the submarine system as of late 2017. During this time, we carried out one fault test. Most tests were conducted at sea to resolve the faults for various operational situations. Among the 10 subsystems to be tested, eight were sensor systems, including acoustic and optical sensors (e.g., echo sounder, depth sonar, periscope, CTD device, EM log, and ESM) and the navigation suite of sensors (e.g., INS and GPS). The two control systems were a plotting board system and a weapon control system. The weapon control system was additionally renovated during the PIP. This means that test 3 in Table 5 was unplanned and was decided on just two months before the test.

Note to Table 5: yy and mm in (yy.mm) mean year and month, respectively. Subsystems in the fourth column are connected to local proxies such as the DIS, the integrated management system, or the signal processing system. These tests are an extension of our previous study [12].

5.3. Interface Data Modeling. For detection and diagnosis of the interface faults, we first modeled message structures and field logics in each interface. Figure 8 shows some modeling results using the developed software. Ten interfaces between the 10 subsystems and local proxies were targeted, which are based on serial communications such as RS-232 and RS-422 (Figure 8(a)). As an example, in the interface between the GPS and the DIS, 15 groups of messages were transferred (Figure 8(b)). One message, Data Block 2, was modeled with nine fields including $d_{hdr}$ and $d_{ftr}$ (Figure 8(c)). These figures are the realization of $SM_{GPS-1}$ described in Section 4.2.
Figure 8(d) shows logic modeling of the velocity field in the EM log message. In this field, two main interpretations are required: (1) to convert a hexadecimal number to a floating-point number and (2) to check the valid range of the value. Specifically, the field data is initially a hexadecimal number; thus, it first needs to be converted to a decimal number. Then the decimal number is multiplied by 0.01 to represent two decimal places. Finally, the value is checked to determine whether it is within the valid range from −90 to +90. If the result is out of range, the field is diagnosed with a logical fault. The logic modeling for this process can be expressed as a combination of various transition functions. In Figure 8(d), four transition functions (i.e., $\delta_{ext}^{1}$, $\delta_{int}^{4}$, $\delta_{int}^{12}$, and $\delta_{int}^{15}$) were used to model the logic of the velocity field.

Table 6 summarizes key results of the overall structure and logic modeling. Thirty-two messages in the interfaces were modeled, whose lengths are 3 to 284 bytes. The number of fields in each message, that is, $N_{X \in SM}$, increases if the message is longer or if the fields are separated by bit units. For example, Msg G-2, the longest message, has 85 fields in 284 bytes, whereas Msg J-1 has 90 fields in only 18 bytes because it is divided by bit units, as is typical of customized communication protocols.

Now, let us examine Msg B-2 and Msg E-1 to explain specific types of message delimiters. First, since Msg B-2 has a fixed length without any header or footer, it should be classified by its message length (i.e., $d_{length}$). On the other hand, the length of Msg E-1 is variable because its depth field has a floating-point number with 3 to 6 bytes. The variable length was not actually recognized before test 5, which will be explained in the following subsection. After test 5, it was accepted that Msg E-1 cannot be classified by the message length; instead, it should be classified by the header and the footer. Except for these cases, all the messages are generally modeled with $d_{hdr}$ and $d_{ftr}$. To sum up, the messages in this study used two structural types: (1) $d_{hdr}$ and $d_{ftr}$ and (2) $d_{length}$.
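The two modeling levels of this subsection can be illustrated with a minimal sketch: a structure check on the delimiters followed by the velocity-field logic (hexadecimal to number, scaling by 0.01, range check). The header and footer values are the example values 0x3A58 and 0x0D0A quoted for the structure model; the message layout (header, one velocity field, footer), the signed big-endian encoding, and the function names are assumptions made only for illustration, not the actual EM log protocol.

```python
HEADER = bytes.fromhex("3A58")  # example d_hdr value from the structure-model example
FOOTER = bytes.fromhex("0D0A")  # example d_ftr value from the structure-model example


def check_structure(message: bytes) -> dict:
    """Structure level: one True/False result per delimiter."""
    return {
        "d_hdr": message.startswith(HEADER),
        "d_ftr": message.endswith(FOOTER),
    }


def decode_velocity(field: bytes) -> tuple:
    """Logic level: raw bytes -> integer -> scaled value -> range decision."""
    # Assumption: the field is a signed big-endian integer.
    raw = int.from_bytes(field, byteorder="big", signed=True)
    # Dividing by 100 is the multiplication by 0.01 (two decimal places).
    value = raw / 100
    return value, -90.0 <= value <= 90.0


def diagnose(message: bytes) -> str:
    """Hypothetical layout: header, velocity field, footer."""
    if not all(check_structure(message).values()):
        return "structural fault"
    field = message[len(HEADER):len(message) - len(FOOTER)]
    _, in_range = decode_velocity(field)
    return "no fault" if in_range else "logical fault"


print(diagnose(bytes.fromhex("3A5804D20D0A")))  # 0x04D2 = 1234 -> 12.34
print(diagnose(bytes.fromhex("3A5827100D0A")))  # 0x2710 = 10000 -> 100.0
print(diagnose(bytes.fromhex("FFFF04D20D0A")))  # wrong header
```

Keeping the structure decision separate from the field decision mirrors the distinction between structural and logical faults: a message whose delimiters do not match is never interpreted at the field level.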
For logic modeling, the ninth column of Table 6 shows the average number of transition functions (i.e., $\delta_{ext}$ and $\delta_{int}$) used for modeling the fields in each message. For example, 5.29 in Msg D-1 means that, on average, more than five transition functions were used to model each field. In this column, most messages have numbers larger than 2, which means that a group of transition functions was used to interpret the field and diagnose faults. Finally, the decoded values could be either numbers such as velocity, yaw, pitch, and depth or characters (e.g., textual message or behavioral mode).

Note to Table 6: Interface names consist of the subsystems connected to local proxies and their identifiers, in the form "subsystem identifier". "O" in the columns for the delimiter and interpreted value means applicable.

Figure 9 shows some results of test 5 and test 6 using the developed software. All the figures except Figure 9(e) are relevant to the GPS messages.
Figure 9(a) shows the messages transmitted between the GPS and the DIS. During the 72 hours, 26,157 messages were monitored, which are chronologically arranged in the main table. Because the messages had not been evaluated yet, two columns (Message and Analysis Result) are empty, and the number of faulted messages at the bottom of the table is also zero. By pushing the Analyze All icon in the ribbon bar, the messages were analyzed. In the main table of Figure 9(b), during 15 seconds, two message types (Data Block 05 and Data Block 11) were identified just once, and two types (Data Block 30 and Data Block 31) were identified continuously. After the analysis, all the messages regarding Data Block 31 were diagnosed with logical faults, and their rows in Analysis Result are shaded in red. Of the 26,157 messages, 9509 messages were faulted, as indicated at the bottom of the table.

To examine where and why the faults occurred in each message, the message can be expanded so that every field is displayed. In Figure 9(c), the expanded message has a logical fault at the fourth field due to an unexpected value. To be specific, the fourth field was modeled not to send STS. However, during test 5, the relevant system actually sent that value, which leads to a contradiction between the modeling and the real data. Finally, the overall results were shown by pushing the View Report icon. In Figure 9(d), more than 80% of messages in Data Block 32 were faulted during test 5.
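The field-level diagnosis of Figure 9(c) can be sketched as a set-membership decision: the model lists the values a field may carry, and an observed value outside that list is a logical fault. In the sketch below only the value STS comes from the text; the allowed codes, the other field values, and the function names are hypothetical placeholders.

```python
# Hypothetical allowed codes for the fourth field; the model excludes "STS".
ALLOWED_STATUS = {"ACQ", "TRK"}


def always_ok(value: str) -> bool:
    """Placeholder decision for fields that are not checked in this sketch."""
    return True


def check_status(value: str) -> bool:
    """Decision function: True if the value is one the model expects."""
    return value in ALLOWED_STATUS


def faulted_fields(fields, checks):
    """Return the 1-based positions of fields whose decision is False."""
    return [
        position
        for position, (value, check) in enumerate(zip(fields, checks), start=1)
        if not check(value)
    ]


checks = [always_ok, always_ok, always_ok, check_status]
observed = ["GPS", "32", "A", "STS"]  # hypothetical decoded field values
print(faulted_fields(observed, checks))  # the fourth field is reported
```

When such a contradiction appears, either the model or the subsystem has to change; during the design-phase tests described here it was the modeling that was revised.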
The software also provides a time series chart for the interpreted value of each field. Figure 9(e) shows a numeric chart for the velocity value in the EM log message, and Figure 9(f) shows a chart for the tracking status in the GPS message. These charts show the trend of the values over time at a glance. For example, test 5 was for acquiring navigation data at various velocities, and the chart visualizes exactly when the velocity changed. The INS needs the GPS data to calibrate the navigation data, and the GPS status can be checked on the chart regularly.
Table 7 summarizes (1) how many messages were acquired from the interfaces in all the tests and (2) how many faults were detected among them. The numbers of messages obtained from each interface (i.e., $NI_{total}$ in Table 7) are all different for the following two main reasons. First, each interface has different message types, and the types have different transmission cycles. For example, Control G has two message types (i.e., Msg G-1 and Msg G-2), and they are transferred every eight seconds. On the other hand, Sensor I has four message types with 0.125-second cycles. In this case, $NI_{total}$ of Sensor I is arithmetically 128 times that of Control G if they operate for the same amount of time ($128 = (4/2) \times (8/0.125)$). Next, because the scenarios of the tests are all different, the subsystems are operated according to the situation. The periscope system for Sensor A is normally operated while the vessel moves above a specific depth (i.e., periscope depth). This means that no messages in Sensor A will be transferred when the vessel dives below that depth. In this context, we can assume that test 2 was carried out below the periscope depth over a longer period than test 1.
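The factor of 128 can be checked with a few lines of arithmetic. The sketch assumes that every message type of an interface is sent exactly once per transmission cycle; the function name and the one-hour operating time are illustrative only, and the operating time cancels out in the ratio.

```python
def expected_messages(duration_s: float, n_types: int, cycle_s: float) -> float:
    """Expected message count if each type is sent once per cycle."""
    return n_types * duration_s / cycle_s


duration = 3600.0  # any common operating time; it cancels in the ratio

control_g = expected_messages(duration, n_types=2, cycle_s=8.0)
sensor_i = expected_messages(duration, n_types=4, cycle_s=0.125)

# (4 / 2) * (8 / 0.125) = 2 * 64 = 128
print(control_g, sensor_i, sensor_i / control_g)
```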

In Table 7, the interfaces where local faults were detected are marked in italics. A message was counted in the number of local faults (i.e., $NI_{fault}$) if $r_j \in SM_Y$ or $r \in LM_Y$ is False at least once in that individual message. For example, in Sensor C of test 5, 9509 of the 26,157 messages were detected to be structurally or logically faulty (this is the case of Figure 9(d)). Overall, six interfaces, that is, all except Sensor A, Sensor D, Sensor I, and Control J, were faulted. Note that Sensor C and Control G had local faults in more than two tests. This implies that the causes of the faults differ according to the test, which will be explained in Table 8.
To evaluate the relative magnitude of the detected faults in each interface, Figure 10 illustrates fault ratios. In Sensor E, Sensor F, and Control J, more than half of the messages were faulted. In Control J of test 9, the 100-percent ratio means that all the messages in this interface failed to be interpretable. Although Sensor C in test 6 has more faulted messages than Sensor F in test 1, the fault ratio of Sensor F is twice as high as that of Sensor C.
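The counting rule and the fault ratio can be sketched as follows. A message counts as faulted as soon as one structural or logical result is False, and the ratio divides the faulted count by the total. The three sample messages are made up for illustration; the figures 9509 and 26,157 are those reported for Sensor C in test 5.

```python
def is_faulted(structure_results, logic_results) -> bool:
    """True if any structural or logical decision of the message is False."""
    return not (all(structure_results) and all(logic_results))


def fault_ratio(n_fault: int, n_total: int) -> float:
    """Share of faulted messages; 0.0 for an interface with no messages."""
    return n_fault / n_total if n_total else 0.0


# Made-up decision results: (structural results, logical results) per message.
messages = [
    ([True, True], [True, True, True]),   # no fault
    ([True, False], [True, True, True]),  # structural fault (footer)
    ([True, True], [True, False, True]),  # logical fault (second field)
]
n_fault = sum(is_faulted(s, l) for s, l in messages)
print(n_fault, len(messages))

# Sensor C in test 5: 9509 faulted messages out of 26,157.
sensor_c_ratio = fault_ratio(9509, 26157)
print(f"{sensor_c_ratio:.1%}")
```

Because the ratio is normalized by the number of messages obtained, an interface with fewer faulted messages can still show a higher ratio than one with many, which is the comparison Figure 10 is meant to support.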
Table 8 summarizes the diagnostic results of the faults in the overall tests. In total, fourteen fault cases were diagnosed within seven message types: three cases are structural faults and 11 are logical faults. The structural faults, for which $r_j \in SM_Y$ is False, came from incorrect headers and lengths. The logical faults have three diagnoses: (1) wrong field interpretations, (2) missed status information, and (3) incorrect relations between neighboring fields. Specifically, Msg E-1 had structural and logical faults simultaneously.

Entries of Table 8 (partial):
Diagnosis: The modeling should be revised so that all the fields regarding a specific section are filled with "0x20" if no targets are detected in the section.
Diagnosis: The modeling should be revised so that the interpreted value of the sign field can be "+" even though the precondition field is unavailable.
Raw data: 03-01-01-00-00-00-00-…-1B-C3-79-58-84-8A-00-00-00-3F. Fault type: logical fault. Diagnosis: Sensor B needs to be refined to send "0x00" or "0x54" for the test field even while the corresponding system is being initialized.
Raw data: 35-03. Fault type: structural fault. Diagnosis: Sensor H should be refined to send the message with an accurate header.
Before test 2, Msg E-1 was modeled with a header and a length as delimiters. However, real messages could not be classified with these delimiters due to their variable lengths; thus, the modeling was revised to use a header and a footer. Then, we looked over the logic modeling, focusing on which field caused the variability. It turned out that the pressure field could be represented with 4 to 7 bytes, including sign characters.

Figure 11 shows how the fault cases affected messages of the same type. Because $NM_{total}$ in Figure 11 concerns a single message type, it is a subset of $NI_{total}$ in Table 7. From Figure 11, we summarized the following findings. First, eight fault cases cause more than half of the faulted messages. For example, fault 5 brought about more than 14,000 faulted messages during 72 hours. Next, fault 2 and fault 3 cause the same number of faulted messages, which means that they were complementary and occurred at the same time. Additionally, fault 7 and fault 12 were relatively difficult to detect and diagnose because they were scenario-dependent faults. If the scenarios are different, the results will be different. This means that test scenarios that cover all the cases are also important. Finally, the number of faulted messages increased as time passed after a fault case occurred.

5.5. Discussion. For synthesized analysis, Figure 12 summarizes how many fault cases were diagnosed and resolved in each test. Note that the numbers in this graph are not the total numbers of faulted messages.
Invalid interface protocols led to unforeseen incompatibilities between subsystems that could not be revealed until they were integrated. The first eight tests were carried out to validate the interface protocols at the system design phase. During the tests, the structure and logic models were gradually revised by fixing the current faults before the next test. For example, a field of Msg G-2 in Table 7 that had "0" as its interpreted value before test 5 had its interpreted value modified after the test. Consequently, the number of fault cases decreased as the tests progressed. Because the seventh and eighth tests had no interface faults, the interface protocols were considered almost fully validated.
Let us examine test 3 as a special case. As explained in Section 5.2, it was not initially considered in our tests. Nevertheless, the interface data for the weapon control system could be evaluated because we had analyzed various types of interface data in Table 1, generalized their properties, and formalized them in a mathematical form. Fortunately, no faults were detected in this interface. Indeed, formal representations of interface data and flexible modeling using the software are particularly beneficial in arbitrary system developments.
Finally, the faults in test 9 made the corresponding subsystems resolve their unexpected behaviors at the system integration phase. For example, Sensor B needed transient time for initialization, and it had to be revised to send an appropriate value in the test field during this period. As system development progresses, integration problems become harder and more expensive to solve, so it is paramount to figure out potential faults as early as possible [21]. In this application, only two faults were found in the ninth test, which means that the previous eight tests significantly reduced the integration problems.
To sum up, the faults identified up to test 8 induced revisions of the structural and logical modeling. In other words, the DSME, as a shipbuilding integrator, continuously revised and validated the communication protocols based on the results of the eight tests. After that, the DSME verified the developed systems by resolving the faults in test 9. These fault-resolving activities were conducted during the system design and integration phases, which is a clear difference from previous fault-resolving studies. The proposed work played a vital role in the overall submarine renovation project.

Conclusion
In this study, we are mainly concerned with intersystem faults whose results are observable outside the systems. Our goal was to find patterns in the interface data that do not conform to expected behaviors.
The main contributions of this study are theoretical and practical. From the theoretical viewpoint, we categorized the interface faults into structural and logical levels and evaluated them based on a mathematical modeling formalism. The core concept of the formalism is to support explicit functions for transitions and fault decisions. Thus, the proposed formalism is applicable to customized protocols as well as standardized ones, which makes it suitable for arbitrary system development. From the practical perspective, the developed software facilitates graphical modeling via creation, arrangement, and revision of the modeling elements. The system integrator can constantly evaluate and supplement the interface protocols at the design phase and the interacting subsystems at the integration phase. The software has been successfully utilized for a submarine renovation project.
All the work in this study was based on real data acquired from submarine systems. The interface faults caused by incorrect design and abnormal implementation can be resolved during the design and integration of complex systems. The proposed work has helped reduce system development time and avoid dangerous situations during a shipbuilding project. The faults of interest in this study are relevant to individual interface data; thus, detection and diagnosis of a sequence of multiple interface data remain for future work.

Figure 1: Local faults in SoS-based ship system.

Figure 2: Classification of local faults in point-to-point link: structural and logical faults.

Figure 3: Simplified class diagram of developed software for fault detection and diagnosis.
$r_j$ is the fault result for $d_j$; $S = D \times R$ is the total state set, where $D$ is a set of delimiters for structural rules and $R = \{true, false\}$ is a set of transition results; $\delta_{ext}: 2^X \times D \to S$ is the message transition function; $\omega: S \to Y$ is the fault decision function.

Figure 4: Elements of proposed interface data model.
$r_{hdr}$, $d_{ftr}$, $r_{ftr}$; $S = \{d_{hdr}, d_{ftr}\} \times \{true, false\}$, where $d_{hdr}$ = 0x3A58 as a hexadecimal number and $d_{ftr}$ = 0x0D0A as a hexadecimal number; where $v_{brg}$ is the initial value of the targeted bearing field and $v_{ref}$ is the initial value of the reference field for checking the availability of $v_{brg}$; $Y = \{(v_{brg}, true), (v_{brg}, false)\}$;

Figure 6: Class diagram for logic modeling using network configuration.

Figure 8: Interface data modeling using developed software. (a) Modeling of 9 groups of interfaces in shipbuilding system. (b) Modeling of 15 groups of messages transferred in GPS. (c) Modeling of 9 fields in GPS position message. (d) Modeling of velocity field in EM log message.

Figure 9: Results of fault detection and diagnosis using developed software.

Figure 11: Number of faulted messages and fault ratios at message level.

Table 1: Preinvestigated interface protocols in 2 types of submarines.

Table 2: Summary of related works.
tr; $\omega$: $(d_{hdr}, true) \to (d_{hdr}, true)$; $(d_{hdr}, false) \to (d_{hdr}, false)$; $(d_{ftr}, true) \to (d_{ftr}, true)$; $(d_{ftr}, false) \to (d_{ftr}, false)$.

4.3. Logic Modeling at Field Level. The logic model assesses whether each field is understandable for receiving systems. Because the interpretation process is complicated, the initial value of the field data needs a sequence of transition functions to be decoded into understandable information. The proposed logic model is formalized as follows:

Table 3: Examples of $X$ and $D$ depending on type of message structure. $f_1$ and $f_n$ mean the first and last fields, respectively. $f_i, l_i$ means a specific field containing the length information of the message.

Table 5: Overall design of tests during system design and integration phases.

Table 6: Main results of interface data modeling.

Table 6 column labels: $X$, decoded value, $d_{hdr}$, $d_{ftr}$, $d_{length}$.

Table 7: Overall test results: fault-detection results. $NI_{total}$ is the number of messages obtained from each interface. $NI_{fault}$ is the number of faulted messages from each interface. Some interfaces are not applicable for a specific test, which is represented by N/A.

Table 8: Overall test results: fault-diagnosis results.