A Framework for Visualizing Heterogeneous Construction Data Using Semantic Web Standards

3D visualization provides a means of communicating different construction activities to diverse audiences. The scope, level of detail, and time resolution of the 3D visualization process are determined by the targeted audience. Developing the 3D visualization requires obtaining and merging heterogeneous data from different sources (such as a BIM model and a CPM schedule). The data merging process is usually carried out on an ad hoc basis for a specific visualization case, which limits the reusability of the process. This paper discusses a framework for automatically merging heterogeneous data to create a visualization. The paper describes the development of an ontology that captures concepts related to the visualization process. Heterogeneous data sources that are commonly used in construction are then fed into the ontology, which can be queried to produce different visualization scenarios. The potential of this approach is demonstrated through multiple visualization scenarios that cover different audiences, levels of detail, and time resolutions.


Introduction
A typical visualization paradigm in construction utilizes Building Information Modeling (BIM) data along with the project schedule to depict construction progress. This approach, known as 4D visualization, focuses on high-level details only: each element simply appears in its final position when the corresponding task in the schedule is complete. This type of visualization provides an appealing interface to illustrate the sequence of activities in the schedule and helps align objectives between different stakeholders [1], understand the schedule, and identify potential problems [2]. However, it fails to give more details about site conditions (e.g., congestion due to temporary facilities) and the interaction between personnel and equipment on-site [3].
A more detailed visualization scenario is operational visualization, which focuses on intricate details rather than the high-level progress.
This type of visualization depicts details such as material movement, interaction with cranes, etc. [4]. A successful implementation of operational visualization requires more data than the typical 4D visualization approach. For example, it requires scaffold logs along with the 3D model and schedule to visualize scaffold operations.
Bearing in mind that the construction industry is characterized by large volumes of data that come from heterogeneous data sources [5], creating an operational visualization is a tedious task that requires mapping and merging these data, which is usually performed on an ad hoc basis for each specific case study.
Heesom and Mahdjoubi [6] have stated that the flow of data is one of the most critical issues in the development of visualization tools.
They argued that most visualization applications require manual input from different data sources, which is a potential reason that they are not widely used in the construction industry.
This paper discusses a framework for generating visualizations of different construction activities with different levels of detail. The proposed approach is data-centric: it focuses on merging and processing data rather than on the visualization application.
To that end, the authors developed an ontology [7] that conceptualizes information related to the visualization process. This ontology formalizes the flow of data from the different data sources that need to be visualized.
Afterwards, SPARQL queries [8] are used to retrieve and manipulate the stored data and automatically generate input files for the visualization engine. Separating the data from the visualization engine allows changing the visualization application without having to edit the ontology or the input data. Additionally, it makes it possible to add new data sources to an existing visualization process without breaking compatibility with the visualization application, as will be shown later.
The remainder of the paper is structured as follows: first, we discuss previous work related to visualization in the construction domain; then, we outline the research objectives and methodology; finally, we discuss our proposed framework in detail and demonstrate it with real case scenarios that cover different applications and different ranges of activities commonly encountered in the construction domain.
1.1. Visualization in Construction.
3D visualization of construction projects is challenging due to their complexity and unpredictable nature [9], which leads to many customized applications that are applicable only to a certain type of construction project or even to a specific project.
There are many examples of using visualization in the construction domain. For example, it has been stated that a typical construction simulation consists of eight federates [10], an essential one of which is a visualization federate that shows the simulation progress and results to the end user.
Many examples of using visualization in simulation can be found in the literature. These include tunneling [11], training [12,13], and crane operations [14,15].
In the following sections (1.2-1.4), three previous projects that are closely related to the visualization process and were developed by the authors are discussed. These projects were built upon High-Level Architecture (HLA) distributed simulation [22,23].
In HLA simulation, the simulated problem is broken into smaller components, known as federates, that interact with each other during the simulation. Each of the projects mentioned here has a visualizer federate that shows the progress made by the other federates. Here, we focus on the visualization federate in each case, discussing the challenges and lessons learned to show the necessity of a generic visualization process that can accept data from different sources, whether from simulation components or standalone applications.

Pipe Manufacturing Visualizer.
The first project simulates the construction of oil refineries and petrochemical plants, which follow modular construction paradigms.
Different modules are fabricated by assembling components from different trades (e.g., pipes, structural steel, and equipment) in an off-site module yard; they are then shipped to the project site and installed using heavy-lift cranes. This is a complex process that involves multiple parties and requires careful planning and coordination.
An HLA-based distributed simulation has been developed to model this operation [24,25]. The simulation focuses on the piping manufacturing process; it simulates constructing piping spools and assembling them into modules in a module yard, transportation to the construction site, and final installation. In addition, it tracks the associated schedule and ensures that predecessors have been fulfilled before installation.
This distributed simulation contains five federates, one of which is a visualization federate. The visualization federate enhances the readability of the results, gives more insight into the piping manufacturing process, and provides an easy way to validate the simulation. The visualizer has been used to (i) visualize the logical sequence of the schedule by displaying the installation sequence according to the provided schedule; (ii) display the utilization of the module yard, which is divided into bays where different pipe modules can be assembled in parallel, and thereby help determine whether the bays are over- or underutilized; and (iii) combine the schedule logic with the spatial data to show the congestion on the construction site.

Earthmoving Visualizer.
The second case tackles the earthmoving process. Due to its repetitive nature, an earthmoving operation is a good candidate for simulation. Many simulation models have been built to study the effect of different factors on earthmoving operations. These include fleet optimization [26][27][28], decision support [29][30][31], and utilizing real data [32][33][34].
Most of these simulation models considered an earthmoving operation as one model, which limits scalability and extensibility (i.e., adding new functionality in subsequent development cycles). In this project, distributed simulation was used to overcome these limitations by breaking the earthmoving operation into six federates: controller, loader, mover, breakdown, weather, and visualizer [35]. The controller federate is responsible for creating a federation, defining a scenario (e.g., fleet composition, road length, and hauling material), and displaying statistical results (e.g., production rate and utilization). The loader and mover deal with equipment movements in the mine and on the road, respectively, and this separation allows different teams to focus on different conditions; for example, while the mover focused on tire wear due to rolling resistance, the loader considered trucks queuing in the mine pit.
The breakdown federate simulates the effect of breakdowns on the production rate by breaking down trucks and excavators based on distributions drawn from historical data. The weather federate studies the effect of weather conditions (e.g., precipitation, wind speed, and snow depth) on the earthmoving process; based on the earthmoving location, it sends weather conditions to all other federates.
At the simulation inception, the visualizer federate loads the terrain and roads from a "3ds" file. Afterwards, it loads trucks and excavators when they are registered by other federates; the visualizer is equipped with a 3D asset library of many truck and excavator models, and it loads the required model based on the other federates' requests. During the simulation, each truck's position and state (i.e., loaded vs. empty, working vs. broken down) are interpreted and displayed based on the other federates' updates.
The visualizer federate provides insight into the earthmoving operation; it shows the truck movement and state as seen in Figure 1. The user can manipulate the model (i.e., zoom, pan, and rotate) and can also hover over each piece of equipment to show its properties.
This project shows how important a visualizer is for presenting simulation results to a nonexpert in a more intuitive way. Nonetheless, the project has several drawbacks in its visualization structure. For example, it was built upon a custom visualization interface rather than a commercial off-the-shelf (COTS) visualization engine, which leads to a relatively slow visualizer that cannot handle sophisticated 3D assets. It was also built for one visualization scenario, which limits its expansion to other scenarios; this can be seen in its data model, which describes specific classes such as trucks and excavators. Additionally, it shows simulation results, but they cannot be filtered or queried to produce different visualization scenarios.

Distributed Observer Network (DON).
The Simulation Exploration Experience (SEE) [36] is an annual event organized by the National Aeronautics and Space Administration (NASA) and the Society for Modeling & Simulation International. SEE invites students from different universities around the world, along with industry and professional associates, to develop a distributed simulation for a space mission. The SEE challenge, with a time frame of around six months, gives students an inspiring way to learn and apply HLA standards while collaborating with students from other universities around the world. The teams developed 18 federates that simulated different tasks on the moon surface. The tasks included mining, an asteroid warning system, and transportation using rovers.
The authors' team developed a federate that simulates the erection of a facility on the moon surface. At the construction site, two cranes controlled from the Earth were used to assemble the facility's modules based on a provided schedule and spatial data, as seen in Figure 2 [37].
The DON federate, developed by a NASA team [38], provided 3D visualization capabilities for the distributed simulation. Unlike the previous projects (described in Sections 1.2 and 1.3), this project utilizes a COTS visualization engine, which overcomes many of the drawbacks noted in the previous projects. However, the visualization paradigm was coupled to the HLA simulation, and it cannot be utilized outside this scope. Additionally, it does not provide querying capabilities, which means that each visualization scenario has to be prepared individually.

Lessons Learned and Research Motivation.
The variety of applications shows how important visualization is in the construction domain. However, most of the visualization applications described earlier focus on a specific case study, which limits their usage in different contexts; using them interchangeably is impossible, as each one was customized for a specific scenario. For example, in the earthmoving case, to empty, fill, or move a truck, the visualizer assumed a specific format that might not be used in other simulation models.
Additionally, the visualizer shows all data generated by these scenarios, which means that changing the time resolution or level of detail requires rerunning the scenarios with a new configuration. Table 1 shows a summary of the characteristics and limitations of these visualizers.
Accordingly, this research tries to overcome these limitations by providing a visualization framework that can be used with different applications and case studies. It uses semantic web technology to develop a general data store that merges data from different sources; SPARQL queries then retrieve the data related to a visualization scenario and format it according to the corresponding visualizer input schema.
More specifically, the proposed approach tries to address the following questions: (i) What are the main concepts to be included in the ontology? The key concepts and taxonomy in visualization, such as time, position, and orientation, must be carefully investigated. (ii) Will the semantic web be able to merge different data sources and reformat them for visualization? The construction domain uses different applications, including schedule engines, relational databases, BIM, spreadsheets and, to a lesser extent, simulation engines. Specialized connectors for some common applications will be developed to retrieve the data and convert it to the proposed ontology schema.
(iii) Can the stored Resource Description Framework (RDF) data be formatted to the visualizer format? At this stage, the stream of data coming from different sources has already been converted to the RDF format.
Another connector is required to process this data and feed it to the visualizer to be shown. (iv) Can the RDF format and its query engine enhance the visualization process by providing a way to query and display objects based on the required level of detail? Different audiences require different visualization levels of detail. For example, a project owner or an engineering firm might be interested in 4D or 5D visualization, while a contractor might be more interested in a more detailed visualization that shows operational activities such as crane movement and scaffold erection. These different scenarios require different time resolutions and objects. Our ontology should be able to store and show all these scenarios using a querying engine.

Research Methodology
The research methodology requires three main processes, as shown in Figure 3: (1) combine the heterogeneous construction data into one data storage, (2) provide a query engine for the data storage, and (3) link the query results to a visualization engine.
This procedure allows obtaining different visualization scenarios by changing the query statement. Additionally, the query is independent of the heterogeneous data sources, as it runs on the merged data storage.
The following steps are followed to pursue the proposed methodology: (i) Evaluate previous work related to construction visualization. (ii) List the capabilities required for the framework and how they can be achieved. (iii) Propose the framework. (iv) Implement the framework components. (v) Test the framework with real-case scenarios.

Proposed Framework.
Figure 4 shows the proposed framework for visualizing heterogeneous data coming from different data sources.
The scope of work includes developing the ontology (backed by a triple store), preparing connectors that take raw data from different applications widely used in the construction domain and convert it to an RDF format, and developing a hub that retrieves data from the triple store and converts it to the visualizer format, as shown in Figure 4. Although one visualization application has been utilized, the same ontology can be used with other visualization applications with minimal effort to create a new connector.
The framework consists of two main parts: (1) existing components that are widely used in the construction domain and (2) developed components that stream data between the different sources and the visualization engine. The following subsections discuss the developed components and their relationships with the existing components.

Ontology.
The ontology is the key component in this framework, as it formulates the relationship between the data sources and the visualization. It should be generic enough to capture various data sources, but at the same time structured enough to export temporal and positional information.
Researchers have suggested defining ontology requirements in the form of questions. These questions, known as "competency questions," determine the scope of the ontology [39]. Our competency questions are: (i) Will the ontology be generic enough to receive data from different sources?
(ii) Will it be able to convert the data to the visualization format? (iii) Will it be able to filter by item type? (iv) Will it be able to filter by levels? (v) Will it be able to filter by different time resolutions? (vi) Will it be able to filter by time intervals? (vii) Will it be able to show temporary structures (such as a scaffold)?
First, to ensure interoperability, the ontology was built upon existing ontologies by importing ifcOWL [40] and the W3C Time Ontology [41]. ifcOWL contains concepts and definitions related to BIM objects (e.g., walls and doors), while the Time Ontology defines temporal concepts such as time positions, before, and after.
Next, the ontology defines a new class, named "Model Object," for any object that has to be shown in the visualizer.
This class encompasses definitions for position coordinates and units, 3D orientation in a quaternion format, object scale, and the file path of the 3D asset.
Any object from any data source has to inherit from this class to be shown in the visualizer. For example, we asserted that "IFC4:IfcElement" (from ifcOWL) is a subclass of "Model Object." This means that if any instance in ifcOWL provides positional input and a 3D asset's file path, the instance can be displayed in the visualizer.
The previous step yields a static 3D scene; to add animation, different positional properties should be provided at key time frames. Hence, the ontology contains another class, "Object Time Stamp." This class captures the relationship between the model object and the time instant as defined in the W3C Time Ontology. In addition, this class contains information about the object's orientation and position at a specific time instant.
The ontology is backed by a triple store to handle the expected large number of triples. Fuseki version 2.4.1 [42] has been used, which can handle millions of triples and also provides a web-based interface and SPARQL endpoints that can be used to process the data.
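As a rough illustration of the two classes described above, the following sketch models "Model Object" and "Object Time Stamp" as plain Python dataclasses and flattens them into subject-predicate-object triples. It is not the published ontology: the `ex:` prefix, every property name (`ex:assetPath`, `ex:ofObject`, `ex:atInstant`, and so on), and the field names are assumptions introduced only for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]


@dataclass
class ModelObject:
    """Anything that can be shown in the visualizer (hypothetical field names)."""
    object_id: str
    position: Tuple[float, float, float]            # x, y, z expressed in `units`
    orientation: Tuple[float, float, float, float]  # quaternion (x, y, z, w)
    scale: float
    asset_path: str                                 # path to the 3D asset file
    units: str = "m"


@dataclass
class ObjectTimeStamp:
    """Pose of one model object at one time instant."""
    object_id: str
    instant: datetime
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float, float]


def _vec(values: Sequence[float]) -> str:
    """Serialize a numeric tuple as a space-separated literal."""
    return " ".join(str(v) for v in values)


def to_triples(obj: ModelObject, stamps: List[ObjectTimeStamp]) -> List[Triple]:
    """Flatten one object and its key frames into triples (illustrative vocabulary)."""
    subject = f"ex:{obj.object_id}"
    triples: List[Triple] = [
        (subject, "rdf:type", "ex:ModelObject"),
        (subject, "ex:assetPath", obj.asset_path),
        (subject, "ex:position", _vec(obj.position)),
        (subject, "ex:orientation", _vec(obj.orientation)),
        (subject, "ex:scale", str(obj.scale)),
        (subject, "ex:units", obj.units),
    ]
    # Keep only the key frames of this object, in chronological order.
    own = sorted((s for s in stamps if s.object_id == obj.object_id),
                 key=lambda s: s.instant)
    for index, stamp in enumerate(own):
        node = f"ex:{obj.object_id}_t{index}"
        triples += [
            (node, "rdf:type", "ex:ObjectTimeStamp"),
            (node, "ex:ofObject", subject),
            (node, "ex:atInstant", stamp.instant.isoformat()),
            (node, "ex:position", _vec(stamp.position)),
            (node, "ex:orientation", _vec(stamp.orientation)),
        ]
    return triples
```

An object with no time stamps produces only the six static triples, which corresponds to the static scene; each additional key frame adds five triples that an animation can be driven from.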

Data Mapping (Data Sources to RDF).
This section describes how data from different data sources can be exported to an RDF format based on the proposed ontology; this process is known as data integration [43]. It presents the integration of some of the common applications in the construction domain. However, it is worth mentioning that this list is not exhaustive, and additional data sources can be easily added.
(1) BIM Models. BIM models are widely used by engineering firms and in the construction domain to improve project management and collaboration [44]. In our context, BIM models provide rich information for visualization, including 3D assets and positional data. To export 3D assets from the 3D model, a customized plug-in that creates a separate OBJ file for each object in the model has been used. Afterwards, we used the visual programming tool Dynamo [45][46][47] to export (1) the location, (2) the orientation, and (3) the OBJ file path of each object in the model. The data are exported in a spreadsheet format, which can be converted to RDF triples as will be shown later. The BIM models' connector has been tested with a steel structure frame, shown in Figure 5, which will be animated later based on an associated construction schedule.
(2) Schedules.Another key component of a construction project is the schedule.
The schedule contains temporal information that can be used to animate objects from other sources. A customized plug-in for Microsoft Project® (Figure 6) has been used to convert the tasks' finish/actual finish dates to the RDF format; this plug-in can write the RDF triples to a local file or to a triple store through an HTTP connection. Clearly, an ID that links each task with items from other sources is required.
(3) Simulation Models. Simulation is a powerful tool to capture and model dynamic systems with a large number of variables that are hard to model using mathematical models. It provides an experimental frame for testing a real-world system effectively and cheaply [48]. Visualization is arguably one of the most suitable ways not only to interpret results but also to validate and accredit the simulation model [49].
In our case, simulation models were used as another data source to give more details about a process modelled using RDF triples. For example, merging a BIM model with its associated schedule creates a stand-alone 4D visualization in which objects appear in their final location when the associated tasks are complete. A simulation model, however, allows for the addition of more details about hauling objects from the storage area and about crane lifting and swinging.

This template exports the results as RDF triples.
(4) Spreadsheets. Many data sources come in a spreadsheet format or at least can be converted to this format, such as the data from BIM models shown earlier. This section describes how data in this format have been mapped to RDF triples.
RDF123 [52,53] is an open-source tool that exports tabular data to an RDF format through a mapping graph.
The mapping graph should be structured based on the spreadsheet structure and the ontology.
As an example of this conversion, a scaffold request log for an oil and gas project in Alberta was obtained. The log is in a tabular format with the following relevant columns: Request ID, Required Elevation, Erection Date, and Dismantle Date.
These data were converted to RDF triples, which were added to an existing 4D visualization to show scaffolds during the visualization process, as will be shown later.
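The row-to-triple mapping can be pictured with a small sketch. It does not use RDF123 or its mapping-graph format; it only mimics the idea with the Python standard library. The column names come from the log described above, whereas the `ex:` property names, the `ex:Scaffold` class, and the two sample rows are invented for illustration.

```python
import csv
import io
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping from spreadsheet columns to ontology properties.
COLUMN_TO_PROPERTY = {
    "Required Elevation": "ex:requiredElevation",
    "Erection Date": "ex:erectionDate",
    "Dismantle Date": "ex:dismantleDate",
}


def scaffold_log_to_triples(csv_text: str) -> List[Triple]:
    """Turn each row of a scaffold request log into triples about one scaffold."""
    triples: List[Triple] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"ex:scaffold_{row['Request ID'].strip()}"
        triples.append((subject, "rdf:type", "ex:Scaffold"))
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = (row.get(column) or "").strip()
            if value:  # empty cells produce no triple
                triples.append((subject, prop, value))
    return triples


# Invented sample rows; the real log is not reproduced here.
SAMPLE_LOG = """Request ID,Required Elevation,Erection Date,Dismantle Date
S-001,12.5,2015-03-02,2015-03-20
S-002,6.0,2015-03-05,
"""

if __name__ == "__main__":
    for triple in scaffold_log_to_triples(SAMPLE_LOG):
        print(triple)
```

Keeping the column-to-property mapping in one dictionary plays the role that the mapping graph plays in the tool: a log with a different layout would only need a different mapping, not different conversion code.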

RDF to Visualizer Connector.
The previous section discussed our work on converting different data sources to the RDF format. The generated RDF triples now have to be converted to the data format of the visualizer application, as shown in Figure 4. This requires a customized connector that converts from RDF to the visualizer format. This section describes the conversion of RDF triples for visualization in DON.
DON accepts XML files as input [54]. In general, two XML files are required to visualize a process in DON. The first file is a "Mission File," which constructs the visualization scene by providing information about the environment, cameras, lights, and object hierarchy, together with a reference to the second XML file [54]. The second XML file, known as the "Data File," contains two main sections: (1) an initialization section and (2) a time section.
The initialization section contains metadata definitions and a list of objects that will be referenced in the time section. The time section captures the time steps in chronological order. Each time step might specify a new position or orientation for any object defined in the initialization section.
The visualizer interpolates the animation between each two consecutive time steps. The current XML schema of the input file is "MPC3," and it is documented in [54].
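The two-section layout of the Data File can be sketched as follows. This is only a structural illustration: the element and attribute names (`DataFile`, `Initialization`, `Object`, `TimeSection`, `TimeStep`, `Update`) are placeholders chosen for this example and are not the MPC3 schema, which is documented in [54].

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


def build_data_file(objects: List[str],
                    steps: Dict[float, Dict[str, Position]]) -> str:
    """Build an XML string with an initialization section and a time section.

    `objects` lists every object name that may appear later; `steps` maps a
    time value to the new positions of the objects that move at that time.
    All tag names are hypothetical placeholders, not the real DON schema.
    """
    root = ET.Element("DataFile")

    # Initialization section: declare every object referenced later.
    init = ET.SubElement(root, "Initialization")
    for name in objects:
        ET.SubElement(init, "Object", name=name)

    # Time section: one entry per time step, in chronological order.
    time_section = ET.SubElement(root, "TimeSection")
    for t in sorted(steps):
        step = ET.SubElement(time_section, "TimeStep", time=str(t))
        for name, (x, y, z) in steps[t].items():
            if name not in objects:
                raise ValueError(f"object {name!r} was not declared in the initialization section")
            ET.SubElement(step, "Update", object=name, x=str(x), y=str(y), z=str(z))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_data_file(
        ["crane", "module"],
        {10.0: {"module": (1.0, 2.0, 3.0)}, 0.0: {"crane": (0.0, 0.0, 0.0)}},
    ))
```

Rejecting an update for an undeclared object mirrors the rule that time steps may only reference objects listed in the initialization section, and sorting the keys enforces the chronological order of the time section regardless of how the input was assembled.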
The authors developed a connector that takes RDF triples and converts them to XML files according to the DON schema. The connector utilizes dotNetRDF to execute remote SPARQL queries and create the two XML files based on the queries' results.
This structure allows us to filter data based on customized queries, as shown in the following section.
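To make the idea of scenario-specific queries concrete, the sketch below assembles a SPARQL SELECT query that optionally restricts the results by item type and by time interval. It only builds the query text and does not reproduce the authors' connector or any dotNetRDF call. The `ex:` namespace URI and the property names `ex:ofObject`, `ex:atInstant`, and `ex:position` are hypothetical stand-ins for the ontology's actual vocabulary.

```python
from datetime import datetime
from typing import List, Optional

# Hypothetical namespace; the real ontology IRI is not given in the text.
PREFIXES = (
    "PREFIX ex: <http://example.org/vis#>\n"
    "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
)


def build_scenario_query(item_type: Optional[str] = None,
                         start: Optional[datetime] = None,
                         end: Optional[datetime] = None) -> str:
    """Return a SPARQL query for one visualization scenario.

    `item_type` must be a trusted prefixed name such as "ex:Scaffold";
    it is inserted into the query text without escaping.
    """
    patterns: List[str] = [
        "?stamp ex:ofObject ?obj .",
        "?stamp ex:atInstant ?t .",
        "?stamp ex:position ?pos .",
    ]
    if item_type:
        patterns.append(f"?obj a {item_type} .")  # filter by item type

    filters: List[str] = []
    if start:
        filters.append(f'?t >= "{start.isoformat()}"^^xsd:dateTime')
    if end:
        filters.append(f'?t <= "{end.isoformat()}"^^xsd:dateTime')
    if filters:
        patterns.append("FILTER (" + " && ".join(filters) + ")")  # filter by time interval

    body = "\n  ".join(patterns)
    return f"{PREFIXES}SELECT ?obj ?t ?pos WHERE {{\n  {body}\n}}\nORDER BY ?t"


if __name__ == "__main__":
    print(build_scenario_query("ex:Scaffold", datetime(2015, 3, 1), datetime(2015, 3, 31)))
```

Because the scenario is expressed entirely in the query, switching from, say, all objects to scaffolds in a given month changes only the arguments; the stored triples and the downstream XML generation stay the same.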

Results
After describing the different data sources that have been used, this section illustrates how to merge data from different sources to get an animation of the whole process with different levels of detail. The following sections describe three scenarios. The first is a merger of a schedule with a BIM model to display a 4D animation. The second adds scaffold erection and dismantling times and locations, while the third focuses on a finer time resolution and visualizes the handling of a module using a crane.
3.1. 4D.
4D animation is a high-level visualization that focuses on the big picture by showing the erection of a facility based on the actual or planned progress. Figure 7 shows a steel frame erection. Figure 8 shows module installations for an oil and gas project.
In both cases, the following steps have been executed: first, export the 3D assets from Revit in the first case and Blender in the second, and then, convert spatial information into an RDF format.Afterwards, map each item to the corresponding task in the schedule and convert the schedule to an RDF format.Finally, use the developed connector to create the XML files which are visualized in DON.
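The mapping step in this workflow amounts to a join between model elements and schedule tasks on a shared ID. The sketch below shows that join in isolation, assuming simplified `Element` and `Task` records whose fields are invented for this example; in the framework itself the same link is expressed through RDF triples rather than Python objects.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Element:
    """A 3D model element exported from the BIM model (hypothetical fields)."""
    element_id: str
    asset_path: str


@dataclass
class Task:
    """A schedule task linked to one element through a shared ID."""
    task_id: str
    element_id: str
    finish: date


def build_4d_sequence(elements: List[Element],
                      tasks: List[Task]) -> List[Tuple[date, str]]:
    """Return (finish date, element ID) pairs in the order elements appear.

    Tasks that reference no known element are ignored, and elements that
    have no task never appear in the sequence.
    """
    known = {element.element_id for element in elements}
    sequence = [(task.finish, task.element_id)
                for task in tasks if task.element_id in known]
    return sorted(sequence)


if __name__ == "__main__":
    elements = [Element("col_1", "col_1.obj"), Element("beam_1", "beam_1.obj")]
    tasks = [Task("T2", "beam_1", date(2015, 3, 5)), Task("T1", "col_1", date(2015, 3, 2))]
    for finish, element_id in build_4d_sequence(elements, tasks):
        print(finish.isoformat(), element_id)
```

Each pair in the result corresponds to one moment at which an element becomes visible in its final position, which is all a 4D animation needs from the schedule.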

4D with Scaffold.
This scenario demonstrates the advantages of the proposed framework, as it automatically adds another data source (a spreadsheet) to an existing visualization process without any required editing.
The authors obtained the scaffold log for the same oil and gas project. The log is maintained by general foremen on site.
These are not the same personnel who maintained the 3D model. Nonetheless, merging these heterogeneous data sources went smoothly. Figure 9 shows that the scaffolds were in the right place at the right time. Because 3D assets for the scaffolds were not available, each scaffold has been visualized as a transparent 3D box.

Crane Movement.
The same framework can be used to show a more detailed visualization using simulation data, as shown in this scenario. This scenario captures the module lifecycle starting from the storage area. A trailer moved the module to the pickup point. Afterwards, the crane lifted the module, swung it, moved to the drop point, and then dropped the module in its final location, as shown in Figure 10.

Limitations and Future Work
The previous section demonstrates that the proposed framework is able to visualize different scenarios by merging heterogeneous data. Nevertheless, this approach has some limitations, including the necessity of mapping different data to the RDF format. This mapping carries two challenges. First, some data, such as temporal data, pose challenges in mapping to RDF [55][56][57]. Second, the generated RDF triples number in the millions, which lengthens processing and querying time.
Additionally, using the same triple store for different levels of detail requires writing complex SPARQL queries to select the triples related to a specific level of detail. Future work will investigate adding more semantics to the generated triples to streamline filtering by the required level of detail.

Conclusion
Visualization of construction activities is a complex process that requires significant effort to select the visualizer, prepare 3D assets, retrieve data, and transform it according to the visualizer specification. This ad hoc methodology has led to the creation of many visualization models that are only suitable for one or two application cases. This paper presents a new framework for visualizing construction projects and activities. The framework focuses on the data rather than the visualization engine. By using an RDF data format as a data hub, data from different sources and formats can be merged into one triple store. This reduces the problem of visualization to the selection of a visualization engine and the development of data bridges between the triple store and the visualizer and between the data sources and the triple store. The methodology was tested with different scenarios that demonstrated the ability to visualize different ranges of activities with different levels of detail.
The SEE 2015 challenge (SEE 2015) required developing a distributed simulation of a lunar mission. Teams from eight universities (University of Alberta, University of Bordeaux, University of Brunel, University of Calabria, University of Genoa, University of Liverpool, University of Munich, and University of Nebraska) participated.

Figure 1: Earthmoving operation as displayed in the visualizer.

Figure 2: Erecting water treatment facility on the moon surface as rendered by DON.

Figure 3: An overview of the proposed framework.

Figure 5: A sample BIM model for a steel structure frame.

Figure 6: The interface of the schedule connectors.

Figure 7: A sequence of screenshots that shows steel frame erection.

Figure 9: This is the same project shown in Figure 8, but with the scaffold added (the transparent objects in the second and third frames).

Figure 10: Handling of a module using a crane that shows lifting, swinging, hauling, and dropping.

Table 1: A summary of the characteristics of three developed visualization engines.