A Data-Driven and Knowledge-Driven Method towards the IRP of Modern Logistics

The Inventory Routing Problem (IRP) is a typical optimization problem in logistics. To reduce the total cost, which contains the product transportation cost, the inventory holding cost, the customer satisfaction cost, etc., a wide range of impact factors have to be taken into consideration. Since more and more intelligent devices have been adopted in the management of modern logistics, the amount of the collected data (relevant to those impact factors) increases exponentially. However, the quality of the collected data suffers from a number of uncertainties, such as device status and the transmission network environment. Considering the volume and quality of the collected data, the traditional data-driven distribution optimization methods encounter a bottleneck. In this paper, we propose a hybrid optimization method which combines data-driven and knowledge-driven techniques. In our method, a domain ontology, which has better scalability and generality, is built as an extension of data-driven optimization algorithms. Knowledge reasoning techniques are also combined to handle data quality issues and uncertainties. To evaluate the performance of our method, we carried out a case study provided by the French company "Pierre Fabre Dermo-Cosmetics" (PFDC). This case study is a simplified scenario of the practical business process of PFDC.


Introduction
A supply chain is a network of organizations involved in collaborative processes that generate value, as products and/or services, for the ultimate consumers [1]. Supply chain management [2] refers to the optimal operation of the supply chain, i.e., carrying out all activities from procurement to serving the end customer at the lowest cost. Supply chain management contains several segments, such as supply chain strategy, supply chain planning, procurement, product life cycle management, and logistics. It is aimed at integrating and coordinating the network [3].
Logistics, as one segment of supply chain management, refers specifically to planning and implementing the flow of goods (or services). Traditionally, logistics is triggered by inventory procurement. However, this kind of triggering mode has three main disadvantages [4]: (i) it increases the burden of enterprise investment, (ii) it bears the risk of losing market opportunities, and (iii) it forces enterprises to engage in business activities which they are not good at. This results in a simple buy-to-sell relationship between suppliers and demand companies that does not solve supply chain problems involving global strategy.
Today is an era of data explosion, in which every aspect of society is overwhelmed by the sheer volume of data generated [5]. Modern logistics, which can be regarded as a scenario of the Internet of Things (IoT), generates and consumes a huge amount of data. Along a logistics chain, data is generated mainly from sensors, RFID, and production equipment. Furthermore, data from other sources such as social media, newspapers, and weather forecast reports may also affect logistics. To help make efficient decisions and forecasts about logistics, leveraging these data with big data analytic techniques has become a common practice for enterprises.
Towards the optimization of logistics, many data-driven methods (algorithms) such as "Vendor Managed Inventory (VMI)" and "Collaborative Planning Forecasting and Replenishment (CPFR)" have been proposed. As a typical issue of logistical optimization, the "Inventory Routing Problem (IRP)" attracts researchers' attention.
However, data-driven methods have inherent disadvantages. One of the typical disadvantages lies in handling uncertainties. For instance, a required piece of data may be missing or become unreliable for unexpected reasons. In another instance, a decision is made by taking a set of impact factors into consideration, but some of the impact factors cannot be quantified (as computable data). Consequently, this kind of impact factor becomes an uncertainty for data-driven methods. Therefore, concerning the optimization of modern logistics, three main challenges can be summarized as follows:
(i) Ch1: how to collect and make use of the heterogeneous data (on both syntax and semantic aspects) from different sources?
(ii) Ch2: concerning the data quality issue, how to ensure and improve the credibility of the collected data?
(iii) Ch3: for the impact factors that cannot be quantified and the uncertainties in logistics, how to measure and evaluate their influences?
Focusing on the three challenges, we propose a hybrid data-driven and knowledge-driven method for the optimization of modern logistics: DKDM4L. In DKDM4L, we adopt knowledge modelling and knowledge reasoning techniques to enhance data-driven methods. In the context of logistics, we formally define the relevant concepts, attributes, and relations by creating a domain ontology. In this domain ontology, concepts and attributes are defined with precise semantics, and constraints are added to the attribute values. By adopting knowledge reasoning techniques, the data incompleteness and inconsistency issues are addressed. Furthermore, compared with pure data-driven methods, DKDM4L has stronger extendibility and generality. To evaluate the performance of DKDM4L, we carry out a practical use case, which is provided by PFDC. The main contributions of this work are as follows:
(i) Con1: in the context of logistics, we create an extensible domain ontology to formally describe domain concepts, attributes, and relations among them
(ii) Con2: by defining and applying knowledge reasoning rules on this domain ontology, we measure and evaluate the influence of uncertainties (e.g., weather conditions)
(iii) Con3: by cooperating with PFDC, we propose a practical use case (scenario) of modern logistics, which can be used as a baseline in this domain
The structure of this paper is as follows. The second section presents the motivated case provided by PFDC. The third section shows an overview of DKDM4L. The case study with evaluation is given in the fourth section. The fifth section illustrates the related works while a conclusion is given in the sixth section.

Motivated Case
2.1. Project Origin. The research work presented in this paper was initially triggered and funded by the European Horizon 2020 project: Cloud Collaborative Manufacturing Network (C2Net). C2Net is aimed at providing a scalable real-time architecture, platform, and software to the supply network partners. The potential users of the C2Net platform are the small and medium-sized enterprises (SMEs), which do not currently have access to advanced management systems and collaborative tools due to their restricted resources.
In total, around 20 partners took part in this project. These partners came from both academia (research centers and laboratories) and industry (enterprises). C2Net had 7 work packages to cover the entire supply chain, considering all stages of manufacturing, distribution, and sales. The research work presented in this paper originally belonged to work package 4, which focused on the optimization algorithms of logistics.

2.2. Practical Scenario. Pierre Fabre Dermo-Cosmetics (PFDC), as one partner of the C2Net Project, provided a practical scenario to simulate and evaluate logistics optimization algorithms.
PFDC is a French multinational pharmaceutical and cosmetic company. The PFDC supply chain sources, makes, and delivers products for the dermo-cosmetic market. PFDC manages 10 brands and more than 3500 product references in around 140 countries over the world. The PFDC supply chain concerns the following stakeholders: suppliers, manufacturing plants (in France), central distribution centers (in France), local subsidiaries or partners, and final customers (drugstores). Figure 1 shows a general overview of the PFDC business process.
In the local subsidiaries, a local DRP (Distribution Requirements Planning) supported by the FuturMaster solution is used to manage the forecasts and replenishments. In the central distribution center, a central DRP (FuturMaster solution) and MRP II (Manufacturing Resource Planning) by SAP/ERP are used for the distribution and production planning.
2.3. Simplified Use Case. In order to focus on the distribution phase and simplify the real scenario, four hypotheses are defined in a simplified case, which is shown in Figure 2.
(i) H1: there is only one local distribution center (LDC), and it is in charge of delivering five kinds of products to five drug stores (DSs)
(v) Express: it is the fastest mode (one-day lead time) and a one-to-one service (the LDC to one specific DS). It is expensive, suited to light loads, limited in the number of product kinds, etc.
(vi) Daily truck: it is a regular delivery mode with a two-day lead time. It is a one-to-several service (the LDC to several DSs). It is cheap, and the distribution route is fixed
For PFDC, the main goal of managing the supply chain is to improve customer satisfaction. This means that, at any time, stockouts are strictly prohibited in each of the five DSs' warehouses for all five kinds of products. On the other hand, considering the limited storage space and inventory holding costs, the stock of each of the five kinds of products has an upper limit. A brief illustration of the constraints is shown in Figure 3.
There are several factors that need to be considered while setting the minimum and maximum thresholds. These factors concern both internal data and external data of PFDC. The internal data is always structured, such as retailers' sales reports, stock status, and historical sales. The external data can be both structured and unstructured, such as the launch of new products in competing markets (from newspapers or video) and new relevant government guidance policies (from policy documents or TV). In order to set precise thresholds for products in each DS, big data analytic techniques shall be used on all the data mentioned above.
Figure 2: A simplified situation of the practical case.
Wireless Communications and Mobile Computing
2.4. IRP to Be Optimized. Towards IRP, the following three aspects of optimization have to be taken into consideration.
In order to reduce the costs of inventory holding in retailers (DSs), the inventory thresholds of the various products stored in each DS should be set according to the sales situation. If, in a specific period, the demand for a product increases, its inventory thresholds should be raised to increase the delivery volume. Otherwise, the inventory thresholds should be lowered to reduce the delivery volume. Therefore, the inventory thresholds shall be adjusted dynamically.
Delivery recommendation: DKDM4L suggests delivery plans for the inventory replenishment. Based on the sales forecasts of each DS, and considering the required quantities (and volume) of all the products and the transportation costs, several potential delivery plans shall be made and recommended. The suggested plans concern product packaging (quantities and weights), the transportation mode, and the distribution time.
Delivery route recommendation: if the daily truck transportation mode is triggered, route planning is required for the truck. This recommendation mainly concerns the optimization of time and gas costs. Compared with the former two issues, the route recommendation is not vital, and we do not take it into consideration in this paper.
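As a minimal illustration of the dynamic threshold adjustment described above, the following Python sketch raises or lowers a threshold pair according to the demand trend. The function name `adjust_thresholds`, the 10% step, and the comparison of two sales windows are assumptions made for illustration only; they are not the adjustment policy used by PFDC.

```python
def adjust_thresholds(min_level, max_level, recent_sales, previous_sales, step=0.1):
    """Raise both thresholds when demand grows, lower them when it shrinks.

    The 10% step and the window comparison are hypothetical choices.
    """
    recent = sum(recent_sales)
    previous = sum(previous_sales)
    if recent > previous:
        factor = 1 + step      # demand increased: allow more stock
    elif recent < previous:
        factor = 1 - step      # demand decreased: hold less stock
    else:
        factor = 1.0           # stable demand: keep the thresholds
    return round(min_level * factor), round(max_level * factor)


raised = adjust_thresholds(10, 40, recent_sales=[8, 9, 10], previous_sales=[5, 6, 7])
lowered = adjust_thresholds(10, 40, recent_sales=[2, 3, 2], previous_sales=[5, 6, 7])
print(raised, lowered)
```

In practice the step would likely depend on the forecast rather than on a fixed percentage.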

Main Work
3.1. An Overview of DKDM4L. Considering the simplified use case provided by PFDC, we design a framework for DKDM4L. As shown in Figure 4, this framework contains four layers: (i) the physical layer that contains diverse sensors (e.g., from the drug stores, the LDC, and transportation vehicles), production machines, and IT instruments (e.g., servers and PCs), (ii) the data layer that is in charge of collecting and merging data, (iii) the model layer, which can be regarded as a knowledge model setting the unified syntax and semantics constraints of the collected data, and (iv) the reasoning layer, which defines the reasoning rules. The rules can be separated into two groups: one group checks (and corrects) the consistency, integrity, and correctness of the collected data, and another group is used to deduce the optimized distribution plans.
The first two layers mainly focus on data collecting, analyzing, and merging, while the last two layers are concerned more with knowledge representation and reasoning. Therefore, DKDM4L is both a data-driven and knowledge-driven method.
First, the physical layer transmits the collected data to the data layer. Then, the data layer analyzes, validates, and transforms the received data to the unified forms and patterns that are defined in the model layer. Next, the model layer adds the formal semantics and relations to those well-prepared data. A large number of (knowledge) triples are generated on this layer. Finally, the reasoning layer uses these prepared triples to deduce the delivery decisions (optimized distribution and scheduling plans). The four layers depend on each other layer by layer, from bottom to top: the implementation of upper-layer functions relies on the services provided by the layer below. Furthermore, the verification mechanism follows a top-down sequence.
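The bottom-up data flow can be sketched in a few lines of Python. This is only an illustrative outline under simplifying assumptions: the record fields (`store`, `product`, `quantity`), the predicate name `hasInventory`, and the single replenishment rule are hypothetical and do not reproduce the actual DKDM4L implementation.

```python
def data_layer(raw_records):
    """Validate raw readings and map them to one unified record shape."""
    unified = []
    for rec in raw_records:
        if rec.get("quantity") is None or rec["quantity"] < 0:
            continue  # drop incomplete or implausible readings
        unified.append({"store": rec["store"],
                        "product": rec["product"],
                        "inventory": int(rec["quantity"])})
    return unified


def model_layer(records):
    """Attach semantics by emitting (subject, predicate, object) triples."""
    triples = []
    for r in records:
        subject = f'{r["store"]}/{r["product"]}'
        triples.append((subject, "hasInventory", r["inventory"]))
    return triples


def reasoning_layer(triples, min_level):
    """Deduce which store/product pairs need a delivery."""
    return [s for (s, p, o) in triples if p == "hasInventory" and o < min_level]


raw = [
    {"store": "DS1", "product": "P2", "quantity": 4},
    {"store": "DS2", "product": "P1", "quantity": None},   # missing reading
    {"store": "DS4", "product": "P2", "quantity": 30},
]
decisions = reasoning_layer(model_layer(data_layer(raw)), min_level=10)
print(decisions)
```

Each function only consumes the output of the layer below it, which mirrors the layer-by-layer dependency of the framework.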
The objective of designing this architecture is to improve the overall performance of supply relationship management. By separating different layers, staff working in specific positions can focus only on their own roles. The data (and information) transfer and verification between different layers shall be done automatically with mature protocols and software tools. This architecture supports the implementation of DKDM4L.
In the simplified use case, the product distribution happens only between a local distribution center (LDC) and five drug stores. The physical layer contains mainly the IT instruments (e.g., PCs, servers, intelligent sensors, and RFID) in the five drug stores, in the LDC, and on the transportation vehicles. In this paper, we focus mainly on the data layer, the model layer, the reasoning layer, and the connections among these three layers. By combining these three layers, the target of DKDM4L "calculating distribution plans that can reduce the total cost containing transportation cost, inventory holding cost and customer satisfaction cost, etc." can be achieved. The following three subsections present the details of the three layers, respectively.

3.2. The Data Layer. Nowadays, a huge amount of data is being generated at high speed. IoT devices have been employed to provide new opportunities for sensing-based ubiquitous recognition and communication capabilities. However, because the data sources are diverse and the data are heterogeneous (structured, semi-structured, and unstructured), making good use of these data becomes a tough task.
In the data layer of DKDM4L, there are four main data sources: data collected from LDC, data collected from retail (drug) stores, data collected from transportation vehicles, and the weather data. Table 1 is a general illustration of these data.
This table shows the collected data sources and the ways of collecting data. "Irrelevance" means discrete data that are observed by sensors or manually input into computers and that are unrelated to each other. "Relevance" means that the value of the data is calculated from other data rather than collected directly from the data sources.

Concerning the data about the weather, we partially employed "Roussey Catherine's weather ontology" [6]. Roussey Catherine describes a new meteorological dataset based on the SOSA/SSN ontology. This work is the first to publish meteorological data with the new version of the SOSA/SSN ontology. The network of the ontologies in [6] is composed of the following:  instance to an instance of the class "time:Interval." The properties "time:hasBeginning," "time:hasEnd," and "time:hasDuration" specify the beginning, the end, and the duration of the interval, respectively.
The ontology describing the different types of sensors and the ontology describing the units of measurement are employed in DKDM4L to identify data sources. In addition, the domain ontology proposed in this paper also contains some weather factors, such as sunshine intensity and ultraviolet intensity, which are defined in [6] and are related to the skincare products (considering PFDC business).

3.3. The Model Layer. The modern logistics system involves many objects (classes), such as the LDC, retail stores, transportation vehicles, delivery plans, and weather types. Each object may have a certain influence on the system. To represent and simulate the numerous requirements and scenarios, modelling is a widely accepted engineering technique that supports the management of the system [7]. Furthermore, the model layer is also in charge of identifying threats and vulnerabilities in the physical world based on relations and attribute values. Thus, models play a functional role not only in helping people understand the systems being developed but also in the management and monitoring of systems.
The ontological model is suitable for describing the dynamic environment of Internet of Things applications. Furthermore, this kind of model can monitor, learn, and adapt to abnormal situations. A predefined adaptive knowledge base, as part of the ontological model, can raise alerts about threats existing in Internet of Things scenarios.
The definition of ontology [8,9] contains four meanings: (i) conceptualizing domain knowledge, (ii) the concepts should be clear and unambiguous, (iii) formalizing the concepts, and (iv) the concepts should be good for sharing.
Ontology is defined as a five-tuple [10]: (i) concept; (ii) relationships: concepts are not isolated but interrelated; (iii) axiom: rules of reasoning; (iv) function: the mapping relationships between concepts; and (v) instance: unit objects that cannot be further divided.
Some of the existing mainstream knowledge representation languages are RDF [11], OWL [12], KIF [13], CycL [14], and OIL [15]. There are several main knowledge representation methods, such as the logical representation, the production representation, the frame representation, the object-oriented representation, the semantic web representation, the XML-based representation, and the ontology representation.
On the model layer, we propose a domain knowledge model, the "Inventory Routing Problem Ontology" (IRPO). IRPO contains six aspects: the LDC knowledge representation, the drugstore representation, the transportation vehicle knowledge representation, the delivery knowledge representation, the weather knowledge representation, and the product knowledge representation. IRPO is constructed using Protégé. The structure of IRPO is shown in Figure 5. "owl:Thing" is the superclass, which includes five modules: "Organization," "DeliveryCost," "DeliveryModel," "Weather," and "DailySelling." "Organization" is the organizer, which can include other supply modes by means of extension. "Organization" has three instances: the "LDC," the "Drugstore," and the "DeliveryCenter." The "LDC" maintains "Product" and "Inventory"; each "Product" has "Inventory" as an attribute. Similarly, "Drugstore" also has "Product" and "Inventory" as attributes. A matter of concern in "Drugstore" is "DailySelling," which is affected by the "Weather." The "Weather" has five instances: "Shineintensity," "DeliveryModel," "Haze," "Humidity," and "Ult_inensity." "DeliveryModel" has two instances, "Express" and "Daily_Truck," as delivery modes. Both "Delivery_Plan" and "DeliveryModel" are scheduled by the "DeliveryCenter."
In IRPO, there are three kinds of relations defined among concepts. Table 2 lists the three relations and gives explanations about each of them. For each relation, an example is given to illustrate it.
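To make the structure of IRPO more concrete, part of it can be written down as subject-predicate-object triples and queried with plain Python. This is a simplified sketch and not an export of the Protégé model: the predicate names `instanceOf`, `maintains`, `affectedBy`, and `scheduledBy` are hypothetical labels chosen for illustration and are not necessarily the three relations listed in Table 2.

```python
# A fragment of IRPO expressed as (subject, predicate, object) triples.
# Predicate names are hypothetical; class names follow the text above.
TRIPLES = [
    ("LDC", "instanceOf", "Organization"),
    ("Drugstore", "instanceOf", "Organization"),
    ("DeliveryCenter", "instanceOf", "Organization"),
    ("Express", "instanceOf", "DeliveryModel"),
    ("Daily_Truck", "instanceOf", "DeliveryModel"),
    ("LDC", "maintains", "Product"),
    ("LDC", "maintains", "Inventory"),
    ("DailySelling", "affectedBy", "Weather"),
    ("DeliveryModel", "scheduledBy", "DeliveryCenter"),
]


def objects(subject, predicate, triples=TRIPLES):
    """All objects o such that (subject, predicate, o) is asserted."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def subjects(predicate, obj, triples=TRIPLES):
    """All subjects s such that (s, predicate, obj) is asserted."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]


print(subjects("instanceOf", "DeliveryModel"))
print(objects("LDC", "maintains"))
```

A real deployment would keep these statements in OWL and query them through an ontology API; the list of tuples is used here only to keep the example self-contained.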
3.4. The Reasoning Layer. Price and tax management [16] has been considered as a new management technology after supply chain management and customer relationship management (CRM). Its main idea is that a company should optimize the prices of its products and services based on a full understanding of the costs of the supply chain. At the same time, supply chain operations should also be optimized to reflect the revenue generated by different product types and customers. Therefore, prices and supply chain decisions should not be as independent as in the past but should be well integrated, which is another way to inject intelligence into supply chain management. Therefore, we propose the following three questions:
Q1: according to the sales of each retail (drug) store, how does the LDC distribute the product quantity?
Q2: according to distribution tasks, how to plan out a route with the lowest distribution cost?
Q3: for a retail (drug) store, considering the sales that are affected by weather factors, how to dynamically adjust the quantity of products delivered?
3.4.1. A Mathematical Model of Distribution Algorithms. The problem in the motivated case considers a local (subsidiary) distribution center that is in charge of delivering a set of products ($p \in P$) to customer (drug store) warehouses ($i \in W$) over a finite horizon ($t \in T$) through a VMI process. Customers are independent but agree to share the visibility on their demands ($d_{i,p}^{t}$). According to a CPFR midterm process, promises ($Pr_{i,p}^{t}$) of product availability have been made to each customer. Moreover, the current visibility of final consumer demands may no longer match the one planned at the CPFR time. The problem thus arises when all the promises or all the demands cannot be fulfilled because of insufficient supply or production ($R_{p}^{t}$) at the local distribution center. From the vendor's perspective, the problem is to share the shortage ($IL_{i,p}^{t}$) among the customers while avoiding stockouts ($I_{i,p}^{-t}$) at customer warehouses, satisfying promises as much as possible, synchronizing delivery tours, and respecting delivery frequencies ($fr$).
Various routes (r ∈ R) (i.e., multicustomer routes and emergency quick routes) have been defined beforehand for   Cr). This assumption dramatically reduces the IRP complexity, so that an optimization procedure can be used.
The underlying model is inspired by the formulation in [17] but adds some specific constraints in order to model the supply limits and the CPFR promise constraints. The model considers continuous variables for inventories ($I_{i,p}^{t}$), transported quantities ($TR_{i,p}^{t,r}$), low-level inventories ($IL_{i,p}^{t}$), stockouts ($I_{i,p}^{-t}$), and nonsatisfied promises ($NPr_{i,p}^{t}$). Integer variables formalize the decision of launching a transport on a route at a given time ($z_{r}^{t}$). The objective function minimizes the total cost, which contains transportation costs, various warehouse costs associated with inventory holding, costs for nonrespect of the VMI minimal inventory, stockout costs, and costs for nonrespect of the CPFR promises.
Five of the general constraints are listed below. Constraints (2) and (3) express the balance of flows at drug store warehouses and the local distribution center. Constraint (4) defines the stockouts. Constraint (5) expresses the VMI low-level inventory. Constraint (6) models the nonrespect of the CPFR promised quantities.
Some of the unitary costs are hard to be quantified and balanced. Inventory holding costs (hc) and freight costs (fc) can easily be measured, but low-level inventory and promised nonsatisfaction costs are rarely defined in the agreement. Thus, they can be defined as compromises in comparison to the other costs. Moreover, from the vendor's perspective, all the customers are rarely equivalent. So, holding costs are adapted so that important customers are favoured.
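To show how the cost terms of the objective interact, the following Python sketch evaluates a given delivery plan for a single drug store and a single product. It is an illustration under stated assumptions and not the optimization model itself: unmet demand is treated as lost sales, the freight cost `fc` is charged once per delivery day, and the unit costs `sc` (stockout), `lc` (low-level inventory), and `pc` (unmet promise), as well as all numeric values, are hypothetical.

```python
def evaluate_plan(initial, demand, delivered, promised, min_level,
                  hc=0.1, fc=20.0, sc=5.0, lc=1.0, pc=2.0):
    """Total cost of one delivery plan for one warehouse and one product.

    hc: holding cost per unit and day; fc: freight cost per delivery day.
    sc, lc, pc: hypothetical penalties for stockouts, low-level inventory,
    and unmet promises.
    """
    inventory = initial
    total = 0.0
    for d, tr, pr in zip(demand, delivered, promised):
        available = inventory + tr
        stockout = max(0, d - available)           # demand that cannot be served
        inventory = max(0, available - d)          # end-of-day inventory
        low_level = max(0, min_level - inventory)  # shortfall below the VMI minimum
        unmet_promise = max(0, pr - tr)            # promised but not delivered
        total += hc * inventory + sc * stockout + lc * low_level + pc * unmet_promise
        if tr > 0:
            total += fc                            # a transport is launched that day
    return total


cost = evaluate_plan(initial=10, demand=[4, 4, 4], delivered=[0, 0, 6],
                     promised=[0, 0, 8], min_level=5)
print(round(cost, 2))
```

Because the low-level and promise penalties are rarely fixed in the agreement, they act here as tuning weights: raising `lc` or `pc` relative to `hc` and `fc` pushes any optimizer built on this cost toward earlier or larger deliveries.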
In the CPFR context, a demand is sensible to promotions and other market effects. That volatility makes it difficult to forecast. Thus, getting data from the market and modelling its impact on the demand forecasts become a crucial issue.
The PFDC algorithm is built upon a mathematical model to calculate and obtain the optimal distribution plans. However, the mathematical model is not suitable for ontology modelling in the first place. The mathematical model only relies on numerical calculation and has no semantic functions. Considering the weather factors, it cannot be dynamically programmed, so a more appropriate distribution optimization method is needed. In the field of knowledge engineering, rules are an important means to achieve reasoning [18].
The sales of cosmetics (PFDC products) are closely related to the weather, which can be divided into the following situations. When it is cloudy and rainy, the sunshine weakens and the humidity increases, which discourages consumers from buying moisturizers and sunscreens. Sales of sunscreens and hydrating skincare products increase during the UV-heavy months. Promotional activities held to stimulate consumption should also consider weather conditions when determining promotional products. In addition, studies have shown that combined products have higher sales than single products. Therefore, when considering combined products, weather factors should be taken into consideration to combine suitable products; for example, in high-temperature weather, a deep-cleansing facial cleanser and a refreshing hydrating facial mask can be combined.

3.4.2. The Definition and Division of Reasoning Rules. Reasoning refers to the process of deriving conclusions from existing facts according to certain rules. Knowledge-based reasoning rules emphasize the choice and application of knowledge.
By adding semantic information to entities, semantic reasoning can be carried out to make better use of information. Ontological reasoning, with semantics as a prerequisite, can be automated by machines instead of being done manually. The following five types can be deduced automatically: (i) class hierarchy relationships, (ii) class equivalence, (iii) individual identity, (iv) compatibility, and (v) classification. The important role of an ontological language in supporting reasoning includes checking the compatibility of ontology and information, checking the implicit relations between classes, and automating the classification of instances. Automatic reasoning can check more content than manual reasoning, which is very beneficial to large-scale ontology design and to the fusion and sharing of data from different sources.
Ontological reasoning machines can be divided into two categories: special and universal. For the special ontology reasoning machines, some examples are Racer, FaCT, Pellet, etc. They support the main ontological languages, such as RDFS and OWL. For the universal ontological reasoning machines, one typical instance is Jess. At present, there are four main ways to implement ontological reasoning.
First, the reasoning methods are based on traditional description logic. Typical ones are Pellet [19], Racer [20], and FaCT [21], which are ontological reasoning machines designed and implemented based on traditional tableaux algorithms. Furthermore, many tableaux algorithm optimization techniques have been introduced to make efficient reasoning.
Second, the reasoning methods are based on rule-based approaches. Ontological reasoning, as a kind of application, can be mapped to a rule reasoning engine. There are many ready-made conversion tools to express OWL as reasoning rules. The ontological reasoning machines currently implemented as rule-based ones are Jess [22], Jena, etc.
Third, the reasoning methods are based on logic programming. Built on deductive database technologies, two typical system projects are F-OWL and KAON2.
Fourth, the methods are based on first-order predicate provers. Because OWL declaration statements can be easily converted into first-order logic, it is easy to use traditional first-order predicate provers to implement ontological reasoning for OWL. One example is Hoolet, an ontological reasoning machine that uses the Vampire first-order prover.

3.4.3. The Reasoning Rules Adopted in DKDM4L. We use the SWRL [23] rule language to define the reasoning rules in DKDM4L. SWRL (Semantic Web Rule Language) is a language for expressing rules semantically. Parts of the concepts of SWRL rules are derived from RuleML and combined with OWL ontology. SWRL has been published as a W3C Member Submission. SWRL can be regarded as a combination of rules and ontology. Through this combination, the relationships and vocabulary described in the ontology can be used directly while writing rules. Relationships between classes that would otherwise require additional descriptions can be taken directly from the ontology descriptions in SWRL.
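The "antecedent implies consequent" shape of such rules can be imitated in plain Python for readers unfamiliar with SWRL. The two rules below are hypothetical examples written for illustration and are not taken from the rule set of DKDM4L; the fact names (`weather`, `inventory`, `daily_sales`, and so on) and the decision to raise both thresholds by one day of sales are likewise assumptions.

```python
def apply_rules(facts):
    """Apply two illustrative rules in sequence to a dictionary of facts."""
    derived = dict(facts)
    # Rule 1 (hypothetical): severe weather implies an expected delivery delay.
    if facts.get("weather") in {"storm", "heavy_snow"}:
        derived["delivery_delay_expected"] = True
    # Rule 2 (hypothetical): an expected delay combined with stock close to the
    # minimum implies a sold-out warning and raised thresholds.
    near_minimum = facts["inventory"] <= facts["min_level"] + facts["daily_sales"]
    if derived.get("delivery_delay_expected") and near_minimum:
        derived["sold_out_warning"] = True
        derived["min_level"] = facts["min_level"] + facts["daily_sales"]
        derived["max_level"] = facts["max_level"] + facts["daily_sales"]
    return derived


stormy = {"weather": "storm", "inventory": 12, "min_level": 10,
          "max_level": 40, "daily_sales": 5}
sunny = dict(stormy, weather="sunny")
print(apply_rules(stormy))
print(apply_rules(sunny))
```

In SWRL the same logic would be stated declaratively over ontology classes and properties, and a rule engine would decide the firing order; the explicit ordering of the two `if` statements here is a simplification.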
A total of ten rules with explanations are established as follows (Table 3).
... data as testing input. A research team in PFDC provided the data captured from their daily business.
Some of the sensitive records are replaced by placeholders. A specific string of numbers is used to replace the real names of both drug stores and products. Table 4 shows the information of the LDC, drug stores, and products.
For reasons such as cooperation relations and purchase quantities, PFDC sells products to different DSs at different prices. The selling prices directly affect the inventory holding costs in DSs. Table 5 shows the selling prices of all products to each DS.
The weight and the volume of one product are two key issues to consider when making delivery plans. In the simplified situation, only the weights of each product are taken into consideration. The total weight directly affects the load and the cost of each distribution mode (especially the express mode). Table 6 shows the weights of each product. Here, only the net unit weights, which are provided by the producers of these products, are recorded.
As illustrated above, two kinds of distribution modes have been employed by PFDC. Each of them has floating costs based on both transportation distances and loads (weights) being carried. Table 7 shows the express distribution costs (concerning only the weights being distributed), and Table 8 shows the floating costs of the daily truck distribution mode. As shown in Table 7, the express distribution is used to deliver light loads and the cost increases with each kilogram.
If the delivery weight is more than ten kilograms, the express distribution is not a preferable mode due to its high costs. As shown in Table 8, the daily truck distribution is used to deliver medium loads; its cost increases with every ten kilograms. If the delivery weight is more than one hundred kilos, the distribution cost increases by 27.91 Euros per one hundred kilograms. Normally, this distribution mode covers all five DSs in one route.
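The trade-off between the two modes can be sketched as follows. Since the tariffs of Tables 7 and 8 are not reproduced here, the base fees and the per-kilogram and per-ten-kilogram rates in this Python example are hypothetical; only the 27.91 Euros per one hundred kilograms above one hundred kilograms comes from the text, and charging it per started block of one hundred kilograms is an assumption.

```python
import math


def express_cost(weight_kg, base=8.0, per_kg=2.5):
    # Hypothetical tariff: the cost grows with each started kilogram.
    return base + per_kg * math.ceil(weight_kg)


def daily_truck_cost(weight_kg, base=15.0, per_10kg=1.5, per_100kg=27.91):
    # Hypothetical tariff up to 100 kg: the cost grows with each started 10 kg.
    if weight_kg <= 100:
        return base + per_10kg * math.ceil(weight_kg / 10)
    # Above 100 kg: 27.91 EUR per started 100 kg (rate taken from the text).
    extra_blocks = math.ceil((weight_kg - 100) / 100)
    return base + per_10kg * 10 + per_100kg * extra_blocks


def cheaper_mode(weight_kg):
    """Return the cheaper mode and its cost for a given load."""
    e, t = express_cost(weight_kg), daily_truck_cost(weight_kg)
    return ("express", e) if e < t else ("daily_truck", t)


for w in (2, 12, 250):
    print(w, cheaper_mode(w))
```

The comparison deliberately ignores lead time; a real decision would also weigh the one-day express delivery against the two-day daily truck when a stockout is imminent.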
Together with the transportation costs, the inventory level is another important driving factor in making distribution decisions. As illustrated in Figure 3, two thresholds are defined to limit the inventory for all five kinds of products in each DS. Table 9 shows all the threshold pairs.
Considering the distribution driving factors from the inventory aspect, besides these threshold pairs, there are two other items "current inventory records" and "daily sell outs." The current inventory records and daily sell outs are real-time data that are automatically generated. PFDC provided the current inventory records (CIR) and daily sell out (DSO) data collected within a period of one month. As an example, Table 10 shows parts of the data collected from drug store 2 during a five-day period.

The Testing Process. The original collected data is one month of PFDC sales data, including the initial inventories and the daily sales of each drug store. The original data covers 5 stores (DS1, DS2, DS3, DS4, and DS5) with 5 items to sell (P1, P2, P3, P4, and P5). Using this nearly one-month sales data as input, the daily inventories and sales of products and the inventory restrictions can be obtained through model processing.
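The derivation of daily inventories from the initial inventory and the daily sales can be illustrated with a short Python sketch. The starting inventory of 36 and the sales of 6, 5, 7, and 5 mirror the P1 values reported for drug store 2; the delivery of 18 units on the fourth day and the threshold pair (15, 40) are assumptions made for this example.

```python
def simulate_inventory(initial, daily_sales, deliveries, min_level, max_level):
    """Return end-of-day inventories and the days that break a threshold."""
    inventory = initial
    levels, violations = [], []
    for day, (sold, delivered) in enumerate(zip(daily_sales, deliveries), start=1):
        inventory = inventory - sold + delivered
        levels.append(inventory)
        if inventory < min_level or inventory > max_level:
            violations.append(day)  # inventory left the allowed band on this day
    return levels, violations


# With a replenishment on day 4, the inventory stays inside the band.
levels, violations = simulate_inventory(36, [6, 5, 7, 5], [0, 0, 0, 18],
                                        min_level=15, max_level=40)
# Without it, day 4 drops below the (hypothetical) minimum.
no_delivery = simulate_inventory(36, [6, 5, 7, 5], [0, 0, 0, 0],
                                 min_level=15, max_level=40)
print(levels, violations)
print(no_delivery)
```

Running the same check over every store and product pair gives the list of days on which a plan breaks the minimum and maximum inventory restriction.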

The Results and Evaluation
DKDM4L recommends optimized distribution plans for PFDC, considering the delivery costs, the delivery time, the drug store inventory thresholds, etc. Figure 6 compares the original distribution plans with the plans suggested by DKDM4L. The horizontal axis indicates the date, while the vertical axis represents the total delivery costs. According to this chart, the direct distribution costs of the plans suggested by DKDM4L are slightly higher than those of the original plans. In particular, on the first day, the distribution cost contributed almost one-third of the total costs. However, the original plans cannot strictly satisfy the "minimum and maximum inventory" restriction, which threatens the stable supply of products. The two charts in Figures 7 and 8, showing the "DS1-P2" and "DS4-P2" inventory changes, briefly demonstrate this point.
As shown in Figures 7 and 8, the plans suggested by DKDM4L fully satisfy the "maximum and minimum" inventory restriction.
In fact, the distribution plans suggested by DKDM4L largely enhance the stability of product supply, benefiting the drug stores in the long term as well as earning them a good reputation. For PFDC, customer satisfaction always comes first.
(i) Optimization of the product delivery. Given the one-day sales data of each product in each drug store as input, DKDM4L gives the total amount of required products for the next day. DKDM4L can generate distribution alerts before (and after) the [...]
(ii) [...] considers also the weather influence that can cause delivery delays
(iii) Recommendation of the optimization of drug store inventory thresholds. DKDM4L takes the weather conditions into account, predicting the delivery arrival time. If the weather conditions may trigger a sold-out warning, both the minimum and maximum inventory thresholds will be raised, and vice versa

Table 10: Current inventory records (CIR) and daily sell outs (DSO) collected from drug store 2 during a five-day period (excerpt).
         Day 1      Day 2      Day 3      Day 4      Day 5
Product  CIR  DSO   CIR  DSO   CIR  DSO   CIR  DSO   CIR  DSO
P1       36   6     30   5     25   7     18   5     31   8
P2       35   8     27   9     18   10    8    8     28   0
P3       3    1     3    0     3    1     2    0     2    0
P4       21   1     19   1     18   2     16   3     13   0
P5       23   1     21   2     31   2

To sum up, DKDM4L has the following two advantages: it ensures the continuous supply of products, and it gives customers (drug stores) a better service experience. One point should be emphasized: PFDC is a pharmaceutical and cosmetic company, and the selling of these kinds of products is sensitive to the weather. DKDM4L is aimed at helping this kind of company optimize its logistics. This is also a limitation on the usage of DKDM4L.
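The threshold recommendation in item (iii) can be illustrated with a small sketch. It is not the reasoning rule set of DKDM4L, which is defined on the ontology: the risk test, the fixed adjustment step, and all names below are assumptions made only to show the "raise both thresholds when a sold-out is likely, lower them otherwise" behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    minimum: int
    maximum: int


def sold_out_risk(stock: int, avg_daily_sales: float,
                  lead_time_days: int, weather_delay_days: int) -> bool:
    """True if stock would run out before a weather-delayed delivery arrives."""
    return stock < avg_daily_sales * (lead_time_days + weather_delay_days)


def recommend_thresholds(current: Thresholds, at_risk: bool, step: int) -> Thresholds:
    """Shift both thresholds up when at risk, down otherwise (never below zero)."""
    shift = step if at_risk else -step
    new_min = max(current.minimum + shift, 0)
    new_max = max(current.maximum + shift, new_min)
    return Thresholds(new_min, new_max)


# Hypothetical figures: 18 units on hand, 6 sold per day on average,
# 1-day normal lead time, 3 extra days of expected weather delay.
risk = sold_out_risk(stock=18, avg_daily_sales=6, lead_time_days=1, weather_delay_days=3)
raised = recommend_thresholds(Thresholds(10, 40), at_risk=risk, step=5)
```

Shifting both bounds by the same step keeps the width of the allowed inventory range unchanged; a real rule might instead scale the step with the expected delay.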

Related Work
We present the related work from two aspects: traditional data-driven supply chain management methods (especially those focusing on logistics) and modern knowledge-driven supply chain optimization methods. Among the traditional data-driven logistics management methods, "Collaborative Planning Forecasting and Replenishment" (CPFR) and "Vendor Managed Inventory" (VMI) are identified as mutually beneficial good practices [24,25].
CPFR structures long- to mid-term planning processes so that partners jointly plan a number of promotional activities and work out synchronized forecasts, on the basis of which the production and replenishment processes are determined [26]. VMI is based on an agreement in which vendor and buyer agree on a process for sharing data (product sales, forecasts of future sales, and inventory levels) so that the vendor monitors the customers' inventory and organizes replenishments (deciding order quantities, shipping, and timing) [27]. The vendor can take advantage of these data to dynamically adapt lot sizes, synchronize deliveries to several customers, and adjust the delivery frequency.
From the decision support point of view, VMI falls under the Inventory Routing Problem (IRP). Given the planned demand of a set of customers, the IRP objective is to decide on delivery quantities and maintain the customers' inventories within an agreed range while organizing distribution tours so as to minimize the total cost of the supply chain [28]. The complexity of the problem depends on the number of products, the decision horizon (one period, finite, or infinite horizon), the nature of demands (planned or stochastic), the existence of routing alternatives (existing or to be built), and the vendor constraints (finite quantities per product, finite or infinite production capacity). To solve this problem, many heuristics and data-driven optimization procedures have been proposed, depending on the specificities of the problem [7]. To achieve good performance, both CPFR and VMI require the support of large quantities of high-quality relevant data. The process of collecting, storing, retrieving, and processing data is therefore important for applying the two practices.
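As a hedged illustration of the problem statement above, a generic single-product IRP can be written as follows. The notation is an assumption introduced here and is not taken from [28] or from DKDM4L: $I_{i,t}$ is the inventory of customer $i$ at the end of period $t$, $q_{i,t}$ the quantity delivered, $d_{i,t}$ the planned demand, $h_i$ the unit holding cost, $[L_i, U_i]$ the agreed inventory range, and $c_t(\cdot)$ the routing cost of the tour serving the customers visited in period $t$.

```latex
\begin{align}
\min \quad & \sum_{t \in T} \Big( \sum_{i \in N} h_i \, I_{i,t}
             + c_t\big(\{\, i \in N : q_{i,t} > 0 \,\}\big) \Big) \\
\text{s.t.} \quad
  & I_{i,t} = I_{i,t-1} + q_{i,t} - d_{i,t}, && \forall i \in N,\ t \in T, \\
  & L_i \le I_{i,t} \le U_i,                  && \forall i \in N,\ t \in T, \\
  & q_{i,t} \ge 0,                            && \forall i \in N,\ t \in T.
\end{align}
```

The first constraint is the inventory balance, and the second keeps every customer inside its agreed range; vehicle capacity and multi-product variants add further constraints that are omitted from this sketch.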
However, as IoT theories and techniques have matured, more and more intelligent devices are employed in logistics. These devices generate a large volume of heterogeneous data at high speed. Meanwhile, considering the devices themselves, the network transmission environment, and the data processing techniques, the quality of these data is difficult to ensure. Furthermore, some impact factors that cannot be quantified, as well as certain uncertainties, also affect the distribution decisions in logistics. Considering these factors, the application of data-driven optimization algorithms has encountered a bottleneck.
Advanced data transmission and storage technologies, such as wireless sensor networks (WSNs), have enabled modern logistics. A large number of research works focusing on sensor data management have been published. In [29], the authors focus on optimizing data storage in WSNs; blockchain technology is introduced to save the storage space of network nodes. In [30], the authors focus on the data redundancy problem; a two-stage data simplicity method for the sensor network is proposed.
Even though data processing technologies have improved, uncertainty issues still cannot be handled well by data-driven optimization methods. Therefore, knowledge-driven methods have been proposed in both academia and industry. The adoption of knowledge representation (with domain ontologies) and knowledge reasoning in IoT applications (e.g., supply chain management, smart home, and e-health) has become quite common. In the medical Internet of Things, research work [31] proposed an ontology model that is applied to multiple sensors collecting vital sign data and that aims at local ontological semantic expansion through association with open correlation data sources. Since the concept of the intelligent supply chain [32] was put forward, more and more researchers have paid attention to the combination of supply chain management and IoT. Research work [33] reviewed the applications of big data analysis technologies in supply chain management. In [34], the authors built the "TOVE Traceability Ontology" to trace the source of products. Other domain ontologies have been built in the context of supply chain management, such as the works presented in [35,36]. The research works presented in [37,38] are also knowledge-driven methods focusing on the traceability of delivered products. Reference [39] focuses on configuring blockchain architectures for the supply chain. In [40], a noteworthy effort develops the EAGLET ontology to ensure data interoperability between diverse IoT devices over a supply chain.
Benefiting from the rapid development of information technologies, the cost of logistics has been greatly reduced. To make good use of these technologies, domain-specific adaptation is required. Traditional data-driven distribution optimization algorithms have to be adapted and enhanced to face new challenges (e.g., the uncertainties brought by the big data era). Ontology, as a typical way of representing knowledge, has been adopted widely in the combination of supply chain management and IoT. Targeting the IRP specifically, this paper focuses mainly on improving the performance of mature data-driven optimization algorithms with knowledge-driven theories and techniques. In particular, knowledge reasoning is introduced to handle uncertainties and factors that cannot be quantified.

Conclusion
This paper proposes a hybrid data-driven and knowledge-driven method, "DKDM4L," to optimize the IRP of modern logistics. A four-layer theoretical framework is proposed, and as the core of this framework, a specific domain ontology is created on the third layer. This domain ontology is built upon two mature IRP optimization algorithms, and the mechanism for handling uncertainties and factors that cannot be quantified has been integrated as functions and reasoning rules.

Wireless Communications and Mobile Computing
Compared to the traditional data-driven IRP optimization methods, DKDM4L has three main advantages. First, based on the formal, precise semantics of the domain ontology, DKDM4L can better handle data quality issues, such as believability and completeness. Second, as an inherent characteristic of knowledge-driven methods, DKDM4L has better scalability and generality. This means DKDM4L can be tailored or extended easily for other applications. Third, an uncertainty handling mechanism (especially for weather conditions) has been integrated in DKDM4L. With the editable reasoning rules defined on the fourth layer of the framework, the product distribution decisions made by DKDM4L are more reasonable. Based on these three advantages, DKDM4L is a promising solution for modern logistics, which can be regarded as a practical scenario of IoT.
The original trigger of this research work is the C2Net Project. At first, we focused only on proposing a new VMI optimization algorithm (a purely data-driven one). As one partner of the C2Net Project, PFDC provided a practical business scenario with real data as the test case. During the project, we found that a purely data-driven optimization method encountered many limitations. Therefore, we extended our work to its current status. The practical scenario from PFDC (a simplified version with four hypotheses) is the foundation on which DKDM4L was proposed. The performance of DKDM4L has also been tested and evaluated with the collected real data. The testing results indicate that DKDM4L is a potential solution to the IRP of modern logistics.
We will extend DKDM4L in three aspects in the future. First, we will enrich the domain ontology (knowledge model) to improve its generality. The current domain ontology is built mainly on the scenario provided by PFDC. More scenarios and industry standards will be taken into consideration to enrich this ontology. Second, more uncertainty handling mechanisms will be included. Besides the influence of weather conditions, other uncertainties that also affect distribution decisions, such as the launch of new products and the promotion of competing products, should be well addressed. Third, by analyzing industry standards, more reasoning rules need to be defined, both to better control the quality of the collected data and to better address other uncertainties.

Data Availability
The whole testing dataset provided by PFDC is available. We can share it with researchers who submit a formal application.

Conflicts of Interest
The authors declare that they have no conflicts of interest.