An IoT-Aware Architecture for Collecting and Managing Data Related to Elderly Behavior

The world population will be made up of a growing number of elderly people in the near future. Aged people are often affected by physical and cognitive conditions, like mild cognitive impairment (MCI) and frailty, that, if not diagnosed in time, could turn into more severe diseases, like Alzheimer's disease, thus implying high costs for treatment and care. Information and Communication Technologies (ICTs) enabling the Internet of Things (IoT) can be adopted to create frameworks for monitoring elderly behavior which, alongside normal clinical procedures, can help geriatricians detect behavioral changes related to such pathologies early and provide customized interventions. As part of the City4Age project, this work describes a novel approach for collecting and managing data about elderly behavior during their normal activities. The data capturing layer is an unobtrusive and low-cost sensing infrastructure abstracting the heterogeneity of physical devices, while the data management layer manages the large quantity of sensed data, giving them semantic meaning and fostering data shareability. This work provides a functional validation of the proposed architecture and introduces how the data it manages can be used by the whole City4Age platform to identify risks related to MCI/frailty early and intervene promptly.


Introduction
In recent years, the progress of social well-being has been leading to an increase in life expectancy, with a prolonged old age. In fact, as reported by the European Commission, by 2025 more than 20% of Europeans will be over 65, with an increasing number of over 80s (https://ec.europa.eu/health/populationgroups/elderly en). Aging citizens, however, are at greater risk of social vulnerability and exclusion because they are prone to the onset of age-related physical and cognitive disorders, such as mild cognitive impairment (MCI) and frailty. MCI, in particular, causes tangible cognitive changes that can be noticeable during normal social relationships, although they are not severe enough to interfere with daily life activities. People with MCI, especially MCI involving memory problems, are more likely to develop Alzheimer's disease if these symptoms are not addressed in time (https://www.alz.org/dementia/mild-cognitive-impairment-mci.asp). This implies significant negative consequences for the quality of life of both elderly people and their relatives, but also for national health services, because they must face the growing requests for intervention. Because older people have different healthcare requirements, health systems will need to adapt so they can provide adequate care and avoid, in the long term, the risk of unsustainability for social healthcare systems (https://ec.europa.eu/health/state/glanceen). Therefore, ageing could also become an economic concern for all citizens in Europe and one of the greatest social and economic challenges for European societies in the 21st century.
A proper approach to address these issues should not consider the elderly person as an individual with special needs, but it should take into account the context in which he/she lives and the relationships he/she maintains with relatives, friends, and caregivers, on a city scale, not only at home level. Smart cities have recently been emerging as a paradigm that takes advantage of the concentration of resources and services to improve their citizens' lives. Smart cities use the sensing architecture deployed in the city to provide new and disruptive city-wide services both to the citizens and the policy makers. The large quantity of data available allows improving the decision making process, transforming the whole city into an intelligent environment at the service of its inhabitants. In this context, a valuable role could be played by Information and Communication Technologies (ICTs) for data sensing from the physical environment and for data management and provision, with the aim of providing interoperable modules upon which added value and customized services can be built.
Wireless Communications and Mobile Computing
The definition of innovative solutions to improve the well-being of the elderly population (and of their informal caregivers) is one of the major challenges faced by various international research projects [1]. They address the problem by developing frameworks based on ICTs to monitor elderly behaviors and to provide corrective interventions after the analysis of detected data, with the main goal of fostering their independent living. Technologies used in such projects for monitoring elderly people's behavior are heterogeneous, ranging from wireless sensor networks [2] to wearable devices [3], portable interactive devices [4], vision systems [5,6], and augmented reality [4]. Thus, in this context, there could be the risk of adopting solutions based on expensive sensors and devices, making such systems unaffordable and hindering their adoption on a large scale. Another issue to consider in this case is the management of a large amount of data. This process involves the following: (i) the modelling of data handled by the system, with the proper level of abstraction, (ii) the data gathering process from heterogeneous sources, and (iii) the data integration and enrichment phase to give them a semantic meaning shareable among other system components (e.g., reasoner and/or risk detection modules).
To address these and other issues, the "City4Age - elderly-friendly city services for active and healthy ageing" project (http://www.city4ageproject.eu/), a research and innovation project funded by the European Commission under the Horizon 2020 programme, aims to create an innovative framework based on ICT tools and services in order to enhance early detection of risk related to MCI and to provide personalized intervention. The main innovation of the project is the attempt to collect data about elderly behavior by using unobtrusive and low-cost sensing devices (smartphones, wearables, Bluetooth low energy based beacons, smart devices, etc.) that do not interfere with their normal activities. In this way, an up-to-date personal profile of the monitored people is defined, which can be used by domain experts (like geriatricians and neurologists) to assess or evaluate the evolution of MCI and to take the proper countermeasures.
This article presents part of the work done within the City4Age project and describes the first two layers of the architecture used to collect and manage data related to elderly people and their daily activities, in both indoor and outdoor scenarios. These data are then used by upper layers of the system to recognize behavioral changes in the elderly and eventually trigger proper interventions. The work has dealt with three main aspects. Firstly, the main concepts involved in the data capturing phase have been modelled with a fairly high level of abstraction, preventing the whole system from handling raw data with uncertain interpretations. Secondly, a Personal Data Capturing System for collecting data related to elderly people has been developed. It exploits innovative technologies enabling the Internet of Things (IoT) to create an unobtrusive, low-cost, and low-power sensing infrastructure that abstracts the heterogeneity of physical devices and communication technologies. The third component is a data management architecture, called Personal Data Store and Management System, that combines a high-performance REST application programming interface (API) and a Linked Open Data (LOD) API. The REST API makes it easy to manage large quantities of data, while the LOD API maps the information in the database to OWL, providing semantic meaning to the stored data and making them easier to share. The interaction of these three logical components has been validated through a sample use case, and the performance of the data management system has been assessed through stress tests.
Thanks to its innovative approach, which tries to combine many features like unobtrusiveness, low-cost devices, high-level abstraction, semantically rich and linked data, and scalability, the proposed system can be adopted on a large scale by many cities, allowing the creation of a huge dataset that can be cross-analyzed by several domain experts in order to identify new correlations among the collected data. That is the reason why the City4Age project foresees six pilot sites to validate the provided solution.
The article is structured as follows. Section 2 presents a brief overview of the state of the art on low-level sensing and middleware technologies and on data management platforms at smart city level. Section 3 briefly introduces the City4Age project, including its data modelling, whereas the City4Age global platform and the architecture and details of the proposed system are described in Section 4. Section 5 deals with the system validation, and an evaluation discussion is carried out in Section 6. Finally, Section 7 draws the conclusions and outlines future developments of the work.

Related Works
Many research works carried out in the field of Ambient Assisted Living (AAL) can be exploited in the context of IoT-based platforms oriented to the behavioral monitoring of elderly people.
Regarding low-level sensing technologies, recent advancements in mobile and wearable sensors have fostered the implementation of AAL systems. All recent mobile devices are equipped with different sensors such as accelerometer, gyroscope, and Global Positioning System (GPS), which can be used for detecting user mobility. Furthermore, recent advances in electronic and microelectromechanical sensors (MEMS) technology promise a new era of sensor technology for health [7]. Researchers have already developed noninvasive sensors in the form of patches, small holter-type devices, wearable devices, and smart garments to monitor health signals. For example, blood glucose, blood pressure, and cardiac activity can be measured through wearable sensors using techniques such as infrared or optical sensing.
Indoor and outdoor localization is another key component in AAL systems that allows tracking and monitoring the elderly and providing them with fine-grained location-based services. GPS is the most widespread and reliable technology to deal with outdoor localization issues. Nevertheless, in indoor scenarios, GPS has a limited usage due to its limited accuracy and the impact of obstacles on received signals. Therefore, a number of alternative indoor positioning systems have been proposed in the literature [8]. Among these technologies, vision techniques guarantee high accuracy levels. In the context of homecare, a video-based monitoring system for elderly care is proposed in [9]. The main objectives of this system are to preserve the independence of the elderly and increase the efficiency of homecare practices. The main disadvantage of video technology lies in its cost, which is still too high, especially for systems with very high precision. For this reason, with the spread of mobile devices, the interest in indoor location systems using smartphones equipped with a video camera [10,11] has increased. Infrared (IR) technology is also widely used for indoor localization, as shown in [12,13], though the multipath effect drastically reduces the localization accuracy.
Radio frequency identification (RFID) is one of the most popular wireless technologies for tracing and tracking [14-16]. The main advantage of this technology is the ability to work in the absence of Line of Sight (LoS) conditions. Bluetooth (BT) technology represents a valid alternative for indoor localization [17,18]. It is able to guarantee a low cost since it is integrated in most everyday devices, such as tablets and smartphones. Moreover, the spread of the emerging Bluetooth low energy (BLE) technology also makes BT energy-efficient, which is a key requirement in many indoor applications. The recent rise of iBeacons by Apple has contributed to the rapid spread of this technology, used to provide information and location services [19] in a completely innovative way.
Because of the heterogeneity of low-level sensing technologies, middleware solutions are currently used to provide a certain level of abstraction that hides low-level details from the end user. Linksmart [20] is a general-purpose middleware that aims to solve the complexity of a pervasive environment in order to support the medical care routine of patients at home. In [21] the authors propose a middleware enabling the development of context-aware applications, which is also used for an e-healthcare solution. In fact, the middleware is used in a reference application scenario for patient conditions monitoring, alarm detection, and policy-based handling. A solution for tracking daily life activities, by using mobile devices and cloud computing services, is discussed in [22].
The system permits collecting heterogeneous information from sensors located in the house and sharing it in the cloud. The system monitors elderly people and generates reminders for scheduled activities along with alerts for critical situations to caregivers and family members, thus reducing health expenditures. In [23], an IoT-based architecture for providing healthcare services to elderly and incapacitated individuals is proposed. As the underlying technology for implementing this architecture, 6LoWPAN is used for active communications, and radio frequency identification (RFID) and near-field communications (NFC) are used for passive communications. Another platform based on the IoT is proposed in [24]. This platform addresses different limitations (e.g., interoperability, security, and the streaming quality of service). Its feasibility has been verified by installing an IoT-based health gateway on a desktop computer as a reference implementation. A solution for monitoring patients with specific diseases such as diabetes using mobile devices is discussed in [25]. This system provides continuous monitoring and real-time services, collecting the information from healthcare and monitoring devices located in the home environment and connected to mobile devices. Again in this area, in [26] the authors discuss the potential benefits of using m-IoT in noninvasive glucose level sensing and the potential m-IoT-based architecture for diabetes management. In [27] the authors report on the Mobile Sensor Data Processing Engine (MOSDEN), a plugin-based IoT middleware for mobile devices that allows collecting and processing sensor data without programming effort, relying on plugins to communicate with sensor hardware. Its architecture also supports the sensing as a service model. Moreover, MOSDEN is developed in such a way that it is interoperable with other cloud-based middleware solutions such as GSN. As a last instance of service oriented middleware, SOCRADES [28] abstracts physical things as services using device profiles. The middleware simplifies the management of underlying devices or things for enterprise applications, especially in industrial automation. It is an extension of two previous projects [29,30].
In the literature, there is a set of references about the importance of intelligent cities and their impact on governments and citizens' lives. The aim of having a system that can connect cities directly to citizens is a concept that has been evolving in recent years [31,32]. This concept, called "smart cities," opens a new way to establish close contact between citizens and cities. However, this contact requires the creation of a solid infrastructure to fulfil the citizens' demands [33]. The IoT paradigm is opening new ways to connect different types of electronic devices to humans to improve their environment [34] and provide useful services that can satisfy a huge set of basic requirements. In addition, the IoT paradigm opens a new gate to gather users' data and analyze them to obtain useful information, for example, behavioral patterns, physiological conditions, potential health risks, or level of education. This information can be used to create intelligent systems that can adapt some services to the citizen's comprehension, or to create risk detectors that can respond to health problems in time.
The concepts of Linked Open Data and smart cities are establishing a new way of obtaining sources of data to share knowledge between computers and create different services [35]. Cities play a pivotal role in extracting data directly from citizens and covering a wide range of areas with their personal information. These concepts opened a research area in which some projects are emerging to study the impact of smart cities on citizens' lives [36]; other approaches create semantic frameworks to annotate streaming sensor data by using the semantic web to share geographic positions in real time [37]. If the source of data is a smart city, it is possible to create smart city data to gather semantic data from citizens and create an API to connect different applications [38] and provide useful information.
There are other approaches such as the creation of a cloud of things by combining different IoT platforms, cloud computing, and semantic data to combine different IoT middleware technologies and have a visualization tool of gathered data [39], or a platform to design an ontology based on a conversion of SPARQL Protocol and RDF Query Language (SPARQL) queries into Structured Query Language (SQL) queries to gather data from relational database with a semantic meaning by using some rules as a primary conversion tool [40].
In all these approaches, the IoT technology acts as a primary source of data and Linked Open Data is the final tool to represent that data. In general terms, the majority of the solutions try to construct new middleware or a combination of middleware technologies to extract data through IoT devices and share it using the semantic web to find an innovative way of data representation. Some of them use rules to try to map some semantic queries into relational queries but without giving additional meaning to semantic data; others try to create a useful API to connect different applications and use semantic data for different purposes but without expanding data or creating new semantic statements. In all cases, none of them tries to apply a reasoner to generate new knowledge and expand it into the context of the smart cities.

City4Age Project Description
City4Age is a Horizon 2020 research and innovation project with the goal of enabling age-friendly cities. The project aims to create an innovative framework based on ICT tools and services that can be deployed by European cities in order to enhance the early detection of risk related to frailty and MCI, and to provide personalized interventions that can help the elderly population improve their daily life by promoting positive behavior changes. It also includes six pilot sites to test the outcomes of the research, which are located in Athens (GR), Lecce (IT), Birmingham (UK), Madrid (ES), Montpellier (FR), and Singapore.
In the context of the City4Age risk modelling, a set of geriatric factors (GEFs) and geriatric subfactors (GESs) has been defined as quantitative indicators of the MCI/frailty risk associated with an elderly person. These indicators derive from the most commonly used tools in current geriatrics practice, which measure MCI and frailty based on behavior and human activities monitoring. A partial list of the defined GEFs and GESs is shown in Table 1.
The numerical values of GEFs and GESs result from the aggregation of data with a lower level of abstraction and a larger basin of sources. Therefore, in order to address issues related to heterogeneous data sources, low-level technologies, semantic interpretation, and so on, the City4Age project has defined the notion of low-level elementary actions (LEAs). A LEA is the finest-grained atomic information used to detect the behavior of elderly people. It is related to start/stop events of basic user actions and contains additional information about the time and position of the action that is being taken. All this information is enveloped in the defined Common Data Format (see Section 4) and sent to the upper layer of the City4Age platform.
LEAs can be grouped in the following macrocategories:
(i) Person LEAs: for tracking user states about motility, like standing, moving, and walking, but also for collecting data about the usage of the smartphone for calling and the number of visits paid or received.
(ii) Home LEAs: for tracking user positions inside his/her home environment; for collecting data about the usage of home appliances and furniture, like fridge, TV, washing machines, and cabinets; for monitoring ambient parameters, like temperature, humidity, and noise.
(iii) City LEAs: for tracking user positions, both inside monitored places in the city being part of the City4Age pilot scenarios (shops, offices, pharmacies, etc.) and in outdoor spaces in the city (streets, parks, etc.); for tracking the interaction of user with public transportation systems.
LEAs are collected as soon as they happen and sent to the Personal Data Store and Management System for further elaboration (Section 4.2), such as the execution of activity recognition algorithms. Since a large number of LEA occurrences can be generated during the day, the City4Age project has introduced the concept of measure as a daily indicator that synthesizes a set of occurrences of a given LEA. For example, by analyzing all LEAs related to the user entering and exiting a room (such as the bathroom), it is possible to compute how many times the user went to the bathroom in a day and the average time spent there. These are typical examples of measures, generated on a daily basis, which make sense from a geriatric point of view to assess changes of behavior relevant for MCI/frailty. Starting from these and other daily measures, GEF and GES indicators can be computed in order to define a risk profile of each elderly person on a monthly basis.
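The bathroom example can be made concrete with a minimal sketch. It assumes a day of LEAs is available as (label, timestamp) pairs; the labels ROOM_ENTER and ROOM_EXIT, the sample timestamps, and the function name are illustrative assumptions and are not taken from the City4Age vocabulary.

```python
from datetime import datetime
from statistics import mean

# Hypothetical LEA records for one user and one room on one day.
leas = [
    ("ROOM_ENTER", "2017-05-02T07:10:00"),
    ("ROOM_EXIT", "2017-05-02T07:18:00"),
    ("ROOM_ENTER", "2017-05-02T13:40:00"),
    ("ROOM_EXIT", "2017-05-02T13:44:00"),
    ("ROOM_ENTER", "2017-05-02T22:05:00"),
    ("ROOM_EXIT", "2017-05-02T22:11:00"),
]


def daily_room_measures(events):
    """Pair each enter with the next exit; return (visits, mean minutes)."""
    durations = []
    entered_at = None
    # ISO timestamps in the same format sort chronologically as strings.
    for action, stamp in sorted(events, key=lambda e: e[1]):
        when = datetime.fromisoformat(stamp)
        if action == "ROOM_ENTER":
            entered_at = when
        elif action == "ROOM_EXIT" and entered_at is not None:
            durations.append((when - entered_at).total_seconds() / 60)
            entered_at = None
    return len(durations), (mean(durations) if durations else 0.0)


visits, avg_minutes = daily_room_measures(leas)
```

An exit without a preceding enter is ignored here; a real deployment would need a policy for such incomplete pairs, which the text does not specify.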

System Architecture
Figure 1 illustrates a high-level overview of the City4Age architecture [41]. In brief, the City4Age platform deals with the detection, through unobtrusive technologies, of elderly behavior during their everyday life, in both indoor and outdoor environments, at home and city level. Collected data are then integrated and stored in a central repository, so that complex behavioral analysis and risk detection algorithms can be performed on these data. The result of this phase is a list of possible customized interventions for each subject, which can be administered to the elderly either directly or after the evaluation of a multidimensional assessment team. The smartphone plays a central role in this architecture, acting as a gateway for data transmission and as a terminal for interventions.
All the data collected within the City4Age project are used to derive elderly people's behaviors and then relate them to the geriatric factors (GEFs and GESs) described above. According to the City4Age data collection model, the possible ways to collect data for geriatric evaluation of MCI and frailty problems are as follows:
(i) From sensors: sensors can be at home, in the city, on mobile devices (including special devices or phones), wearable devices, and so on. Sensors generate low-level signals that need interpretation before becoming meaningful from a geriatric point of view.
(ii) From external systems and apps: external systems may generate useful data (e.g., the way transportation is used, the way people use electricity, the way people participate in events, etc.). These data can be of low level, therefore requiring "interpretation," or can be already meaningful from a geriatric point of view. Existing external apps can also be used for collecting useful data. Such apps are already available on the market or can be developed by research projects, and they can be used for this purpose after manipulating their output to adapt it to the LEA-measure-GEF data model. Data from apps can be of low level (therefore requiring "interpretation") or can be already meaningful from a geriatric point of view.
(iii) From direct observation: direct observations can be carried out in various ways: interviews, questionnaires, taking videos, and so on. Data from direct observation can be of low level (therefore requiring "interpretation") or can be already meaningful from a geriatric point of view.
The focus of this work is mainly on the first category of data gathering methods, characterized by its unobtrusiveness, since sensors do not require direct interaction with the user. Therefore, this work defines a general architecture for unobtrusively collecting data coming from a heterogeneous sensing infrastructure. In particular, within the City4Age architecture, this work is focused on the first two layers, the so-called Personal Data Capturing System (PDCS) and Personal Data Store and Management System (PDSMS), shown in Figure 2 [42].
The main task of the Personal Data Capturing System is to gather raw data from sensors spread in physical environments (independently of both their specific technologies and communication protocols) and process them to calculate LEAs and measures to be sent to the Personal Data Store and Management System. The PDCS is internally composed of two main logical blocks. The Local Environment Building Block (LEBB) provides a modular set of software components (generally installed on smartphones or embedded devices acting as gateways), which are able to communicate in a uniform way with different sensing technologies according to the respective standards and protocols. This capability abstracts the heterogeneity of the physical devices and provides a high degree of expandability to include upcoming technologies. The LEBB core logic translates raw data into LEAs and sends them, through well-defined REST APIs, to the Cloud Building Block (CBB). The CBB is in charge of completing the data message object if any information is missing, since it has access to a wider range of information. Furthermore, the CBB performs other computations in order to calculate measures based on the given LEAs. Finally, the CBB is in charge of sending both LEAs and measures to the PDSMS.
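The division of labour between the two blocks can be illustrated with a small sketch, under stated assumptions: the raw reading layout, the `Lea` fields, the sensor registry, and the function names are all hypothetical, and the REST transport between LEBB and CBB is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lea:
    """Illustrative LEA record; the field names are assumptions."""
    action: str
    timestamp: str
    user_id: Optional[str] = None
    location: Optional[str] = None


def lebb_translate(raw):
    """LEBB side: turn a raw contact-sensor reading into a LEA."""
    action = "FURNITURE_CLOSED" if raw["contact_closed"] else "FURNITURE_OPEN"
    return Lea(action=action, timestamp=raw["ts"])


# CBB side: knowledge the gateway does not have (hypothetical registry).
SENSOR_REGISTRY = {"sensor-17": {"user_id": "user-42", "location": "kitchen"}}


def cbb_complete(lea, sensor_id):
    """CBB side: fill in only the fields the LEBB left empty."""
    known = SENSOR_REGISTRY.get(sensor_id, {})
    if lea.user_id is None:
        lea.user_id = known.get("user_id")
    if lea.location is None:
        lea.location = known.get("location")
    return lea


raw_reading = {"contact_closed": False, "ts": "2017-05-02T12:01:00"}
lea = cbb_complete(lebb_translate(raw_reading), "sensor-17")
```

Keeping the completion step on the server side means a gateway only needs to know its own sensors, which is consistent with the observation that the CBB has access to a wider range of information.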
The Personal Data Store and Management System (PDSMS) integrates the data received from the IoT infrastructure and provides a semantic meaning to that data following the Linked Open Data paradigm. This process also enriches the gathered data by applying rules that elicit spatial and temporal knowledge, which improve the semantic knowledge and ease the inference and querying processes. The PDSMS is composed of two modules: the REST API, which allows managing large quantities of data in an efficient manner, and the Linked Open Data API, which performs the semanticization process over the stored data.
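As a rough illustration of what giving stored data a semantic meaning can look like, the sketch below turns one relational-style row into RDF statements serialized as N-Triples. It is not the mapping used by the PDSMS, which is not detailed here: the namespace, the class name, the column names, and the function are placeholders chosen for the example.

```python
from typing import Dict, List

BASE = "http://example.org/c4a#"  # placeholder namespace, not the project ontology
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def row_to_triples(row: Dict[str, object]) -> List[str]:
    """Map one stored row to N-Triples lines, one statement per column."""
    subject = f"<{BASE}lea/{row['id']}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}LowLevelElementaryAction> ."]
    for column, value in row.items():
        if column == "id":
            continue
        triples.append(f'{subject} <{BASE}{column}> "{escape_literal(value)}" .')
    return triples


triples = row_to_triples({"id": 7, "action": "POI_ENTER", "location": "pharmacy"})
```

All values are emitted as plain string literals for brevity; typed literals and links to other resources would be needed before a reasoner could exploit them.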
In order to abstract low-level details without losing information during the data acquisition process, the concept of Common Data Format (CDF) has been introduced with the aim of defining a data object to exchange data and information with a uniform and shared meaning, hiding all technological low-level details. In this way, data gathered by different devices can be treated in the same manner, avoiding misalignment of concepts and loss of knowledge. The CDF, along with a well-defined and shared vocabulary of LEA and measure labels, provides a first level of abstraction with respect to raw data generated from sensors. There exist two different CDFs, shown in Tables 2 and 3, used to transmit LEAs and measures to the Personal Data Store and Management System.
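The idea of a uniform data object can be sketched as follows. The actual CDF fields are those defined in Tables 2 and 3; the keys used below (`kind`, `action`, `user`, `timestamp`, `position`, `payload`, `measure`, `day`, `value`) and both function names are assumptions made only to show how two different devices can report the same kind of event in the same shape.

```python
import json


def make_lea_message(action, user, timestamp, position, payload=None):
    """Wrap one LEA in a uniform envelope (field names are illustrative)."""
    return {
        "kind": "lea",
        "action": action,
        "user": user,
        "timestamp": timestamp,
        "position": position,
        "payload": payload or {},
    }


def make_measure_message(measure, user, day, value):
    """Wrap one daily measure in a uniform envelope (illustrative fields)."""
    return {"kind": "measure", "measure": measure, "user": user,
            "day": day, "value": value}


# A beacon-based and a GPS-based detection yield messages of the same shape.
from_beacon = make_lea_message("POI_ENTER", "user-42",
                               "2017-05-02T10:15:00", "pharmacy")
from_gps = make_lea_message("POI_ENTER", "user-42",
                            "2017-05-02T16:30:00", "40.3530,18.1740")
daily = make_measure_message("BATHROOM_VISITS", "user-42", "2017-05-02", 3)

wire = json.dumps(from_beacon, sort_keys=True)  # what would travel over REST
```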

Personal Data Capturing System. The Personal Data Capturing System is in charge of collecting large quantities of data that can be detected from the surrounding environment through a sensing infrastructure, both in the home and in the city environments. Relevant types of collectable data can be roughly grouped into the following categories:
(i) User motility: data related to the ability of the user to actively move parts of the body to perform complex activities, such as walking, running, and motion in general, but also resting, lying down, sleeping, and so on.
(ii) Indoor/outdoor localization: data involved in the process of determining the position of the user inside a private or public indoor place, such as user's homes or shopping malls, pharmacies, and churches, or data related to the position of the user in outside places, like streets and parks.
(iii) Ambient parameters: data concerning the quality of living condition in indoor and outdoor environments, like temperature, humidity, luminosity, and weather conditions.
(iv) User/environment interaction: data related to user interaction with the surrounding environments, especially with home appliances (TVs, HVACs, etc.) and public services, for example, public transportation.
Several technologies can be potentially involved in the process of gathering data for the categories listed above; therefore, the first step has been to analyze sensing requirements expressed by the six pilot sites being part of the City4Age project.
Thanks to a continuous interaction with all pilot sites, it has been possible to identify the main technologies used by pilots to gather data related to the above categories:
(i) User motility: the motility detection process is based on two main solutions: the use of wearable devices, like a BLE wristband (Lecce) or a smartwatch (Birmingham, Singapore, and Montpellier), and the use of the MEMS motion sensors available on the smartphone (Athens and Madrid). In these cases, either publicly available apps and APIs or custom apps can be used. Vision systems, like Kinect, can be used for walking pattern recognition (Montpellier and Singapore). Typical outputs of this module are the BODY STATE START/BODY STATE STOP LEAs indicating the timestamp when the user enters and leaves a particular body state (i.e., still, walking, sleeping, etc.) [43].
(ii) Indoor/outdoor localization: the indoor home monitoring is based on two main solutions, one based on BLE beacons interacting with a smartphone or wristband (Lecce and Madrid) and one based on motion and contact sensors (Montpellier and Singapore). BLE beacons are also used to monitor indoor public places in the city (Athens, Birmingham, Lecce, Montpellier, and Singapore), with this technique being more precise and reliable than the one based on geolocated POIs. The definition of POIs, in fact, is a technique also used for this purpose, but it does not provide the certainty that the elderly people are actually inside the place of interest. Moreover, the definition of POIs, combined with the smartphone's GPS receiver, is the most adopted solution to track the position of elderly people in outdoor environments (all pilots). All of these types of events can be captured by triggering pairs of POI ENTER and POI EXIT LEAs, indicating the location type and/or the GPS coordinates.
(iii) Ambient parameters: they are gathered mainly through wearable sensors (Lecce) or domestic weather stations (AMBIENT REPORT LEA).
(iv) User/environment interaction: several and heterogeneous technologies are involved in this task.For example, the activity of meal preparation can be inferred by using vibration, motion, and contact sensors installed on furniture and tools (FURNITURE OPEN/CLOSED LEAs), like those in Montpellier and Singapore; the usage of public transportation means can be detected by beacons at bus stops and Wi-Fi connections within buses (TRANSPORT ENTER/EXIT LEAs), as in Madrid; the usage of home appliances can be detected by using BLE smart plugs (Lecce) or with unobtrusive smart meters (APPLIANCE ON/OFF LEAs).
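The POI-based localization described in item (ii) can be sketched as a simple geofence over GPS fixes. This is one possible way to generate enter/exit pairs, not the method used by any specific pilot: the circular fence, the 50 m radius, the sample coordinates, and the underscore-style labels are assumptions.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))


def poi_events(fixes, poi, radius_m):
    """Emit POI_ENTER / POI_EXIT when consecutive fixes cross the fence."""
    inside = False
    events = []
    for stamp, lat, lon in fixes:
        now_inside = haversine_m(lat, lon, poi[0], poi[1]) <= radius_m
        if now_inside and not inside:
            events.append(("POI_ENTER", stamp))
        elif inside and not now_inside:
            events.append(("POI_EXIT", stamp))
        inside = now_inside
    return events


pharmacy = (40.3530, 18.1740)  # hypothetical POI centre
fixes = [
    ("10:00", 40.3540, 18.1740),  # about 111 m north of the centre
    ("10:05", 40.3531, 18.1740),  # about 11 m away
    ("10:10", 40.3530, 18.1741),  # still within the fence
    ("10:20", 40.3545, 18.1740),  # about 167 m away
]
events = poi_events(fixes, pharmacy, radius_m=50)
```

As noted in item (ii), a fence crossing only suggests that the person is near the place of interest; beacons inside the venue give stronger evidence that they actually entered it.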
By analyzing all pilots' scenarios, two main solutions emerge as gateways for gathering data from the sensing infrastructure and sending them to a local platform for a first stage of elaboration. The first solution uses the smartphone, which interacts with physical devices mainly through a BLE connection and relays data by using its cellular data connection. The second solution, adopted by the Montpellier and Singapore pilot sites, uses a wired home gateway, which receives data from the home sensing infrastructure and relays them through a DSL or cellular data connection. Both types of gateways implement the LEBB module. In all cases, instead, the CBB is implemented on a (local or cloud) server, locally managed by the pilots' managers, which interacts directly with the PDSMS through the communication protocol described above. It deals with the collection and forwarding of LEAs and with the calculation of measures to be sent to the Personal Data Store and Management System. Although each municipality is free to implement its own CBB service, this work proposes the WoX framework discussed below as a reference implementation. In order to fulfil its requirements, the WoX implementation of the CBB also includes a module that acts as a smartphone-based implementation of the LEBB. This reference implementation is called Local WoX (L-WoX) and it is explained later in this section.
WoX Reference Model. The capturing system should define a sensing middleware, which provides access to the underlying technologies, hiding their intrinsic heterogeneity and complexity. Furthermore, in order to make the architecture highly scalable, the structure of the middleware should be modular, so that new technologies can be easily integrated into the system. The sensing middleware should be free from any business logic, since it only has to act as an interface for collecting data coming from physical devices. To make this kind of approach possible, the middleware should be equipped with appropriate software modules, called adapters, which are able to communicate with the sensing technologies according to the respective standards and protocols. A software framework able to fulfil all the above requirements is the WoX (Web of Topics) [44,45] IoT platform, which is proposed here as a reference model for the implementation of the Personal Data Capturing System. The WoX model requires a robust ICT architecture capable of facing the extremely high numbers of IoT entities and topics, the intense exchange of messages, and the heterogeneity of the IoT technologies. Given the nature of WoX, the best-fit architectural design pattern is publish-subscribe (pub/sub), which centralizes the core information and separates those who provide data from those who consume it. IoT-based apps perform a one-time subscription to the topic(s) of interest, without perceiving the hardware layer. Figure 3 shows the WoX technical architecture. A pub/sub architecture alone cannot satisfy the requirements of hardware abstraction, event filtering, and standard-compliant persistence of IoT middleware. For this purpose, the pub/sub architecture sits on top of a component that acts as middleware towards the physical technologies in order to guarantee abstraction and transparency.
The component is called Hardware Abstraction Layer (HAL) and the whole architecture is composed of the following:
(i) The environment level: it comprises the physical layer as well as any virtual environment that can generate events. Social network chats can be a source of events too.
(ii) The middleware core (HAL): it is responsible for querying/piloting the environment level and packing event reports for the upper layers.
In particular, the Data Aggregation Module is tasked with the following operations:
(i) Listening for WoX Topic updates, which are temporarily stored in different collections of a MongoDB database as they are received, according to the LEA macrocategory they belong to.
(ii) Extracting LEA-formatted records from the appropriate MongoDB collection, according to a predefined schedule.
(iii) Transforming the information so that it is compliant with the City4Age information model, by removing unnecessary data and performing data aggregation operations, with the final aim of generating a set of measures.
(iv) Loading the resulting measures in the PDSMS repository through the provided endpoint.
Since LEAs are grouped in several macrocategories (like person, home, and city LEAs), the Data Aggregation Module provides a different ETL workflow for each of these classes. This approach greatly increases the modularity and ease of maintenance of the system, while at the same time reducing the number of records processed by each pipeline, thus increasing the overall performance. Furthermore, the Data Aggregation Module is also tasked, through another ETL pipeline, with loading batches of LEAs on a daily basis into the PDSMS repository, so that they can be used for further analysis.
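The per-macrocategory organization described above can be illustrated with a minimal sketch. It is not the City4Age implementation: the record fields (`macrocategory`, `user_id`, `action`), the underscore spelling of the action names, and the toy counting transform are all assumptions made for illustration, and the MongoDB extraction and PDSMS loading steps are left out.

```python
from collections import defaultdict


def split_by_macrocategory(records):
    """Group LEA records so that each ETL workflow sees only its own class."""
    groups = defaultdict(list)
    for record in records:
        groups[record["macrocategory"]].append(record)
    return dict(groups)


def count_actions_per_user(batch):
    """Toy transform (assumption): one count per (user, action) pair."""
    counts = defaultdict(int)
    for record in batch:
        counts[(record["user_id"], record["action"])] += 1
    return dict(counts)


# One workflow per macrocategory; the keys follow the person/home/city examples.
WORKFLOWS = {
    "person": count_actions_per_user,
    "home": count_actions_per_user,
    "city": count_actions_per_user,
}


def run_workflows(records, workflows=WORKFLOWS):
    """Run each macrocategory's workflow on its own, smaller batch."""
    return {
        category: workflows[category](batch)
        for category, batch in split_by_macrocategory(records).items()
        if category in workflows
    }


# Hypothetical records, already extracted from temporary storage.
records = [
    {"macrocategory": "person", "user_id": "u1", "action": "BODY_STATE_START"},
    {"macrocategory": "home", "user_id": "u1", "action": "APPLIANCE_ON"},
    {"macrocategory": "home", "user_id": "u1", "action": "APPLIANCE_ON"},
]
measures = run_workflows(records)
```

Because every workflow receives only the records of its own class, a new macrocategory can be supported by registering one more entry in `WORKFLOWS` without touching the others.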
L-WoX Reference Architecture. The Local WoX (L-WoX) is a subset of the whole WoX architecture running on the personal user device. In L-WoX, the personal device works as an aggregator towards local WoT entities. Retaining and managing some topics locally on the device is more efficient in some circumstances. For example, multiple mobile apps on the same device may be interested in the same topic, whose value is updated by an onboard sensor (e.g., the accelerometer). Another example is that of two mobile apps interested in two different topics that are updated using the same onboard sensor.
L-WoX has the following components:
(i) The L-WoX service, instantiated as a singleton in the mobile operating system, retaining the topic instances.
(ii) The mobile apps, subscribing to the local topics after binding to the L-WoX service.
(iii) The native sensor APIs, offered by the operating system to access the available sensors.
(iv) A set of WoX adapters, which use the values incoming from the sensors to update some specific topic.
(v) The L-WoX API, a library that provides developers with the methods necessary to communicate with the service. A developer can access the WoX model by using this library.
A legacy app talks directly to the physical layer, using its own application layer protocol. In order to share its existing services, it should use the WoX APIs to translate the event into the common WoX topic. If necessary, the app forwards the new topic information to the global WoX architecture. New IoT apps (not legacy) can directly use all the L-WoX advantages.
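The interplay between the local service, the adapters, and the subscribing apps can be sketched as follows. This is only an illustration of the pub/sub idea, not the L-WoX API: the class `LocalTopicService`, the topic names, and the 1.2 threshold used by the adapter are hypothetical.

```python
class LocalTopicService:
    """Retains the latest value of each local topic and notifies subscribers."""

    def __init__(self):
        self._values = {}
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a callback; replay the current value if one is retained."""
        self._subscribers.setdefault(topic, []).append(callback)
        if topic in self._values:
            callback(topic, self._values[topic])

    def publish(self, topic, value):
        """Store the new topic value and push it to every subscriber."""
        self._values[topic] = value
        for callback in self._subscribers.get(topic, []):
            callback(topic, value)


# A single shared instance plays the role of the singleton service.
SERVICE = LocalTopicService()


def accelerometer_adapter(service, x, y, z):
    """Hypothetical adapter: one sensor reading updates two different topics."""
    magnitude = (x * x + y * y + z * z) ** 0.5
    service.publish("user/acceleration", magnitude)
    service.publish("user/moving", magnitude > 1.2)  # illustrative threshold


# Two apps interested in different topics fed by the same onboard sensor.
app_a, app_b = [], []
SERVICE.subscribe("user/acceleration", lambda topic, value: app_a.append(value))
SERVICE.subscribe("user/moving", lambda topic, value: app_b.append(value))
accelerometer_adapter(SERVICE, 0.0, 0.0, 1.0)
```

The apps never touch the sensor API: the sensor is read once by the adapter, and each app receives only the topics it subscribed to.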

Personal Data Store and Management System.
The IoT infrastructure described in the previous sections captures a large set of different LEAs, activities, and measures. This information needs to be stored while guaranteeing the citizens' privacy and ensuring that their personal information will be used only by authorized entities or persons. This information also needs to be normalized and given meaning so that it can be properly consumed by the stakeholders and third parties. This motivates the search for a solution that (i) gathers the data in a secure manner, (ii) provides the needed technologies to store and persist data over time, (iii) normalizes the data, providing a semantic structure, and (iv) shares the information in a flexible way with authorized third parties.
In order to achieve these goals, the Personal Data Store and Management System combines the use of traditional REST interfaces with a Linked Open Data (http://www.w3.org/standards/semanticweb/data) approach in order to provide a common shared repository. Figure 4 depicts the overall architecture of the proposed solution. The system is composed of two main blocks:
(1) A RESTful application service called REST API Interface (RAI) that provides an access point to gather and store the LEAs, activities, and measures captured by the IoT infrastructure.
(2) A Linked Open Data service called Linked Data Interface (LDI) which obtains the stored data from the database and, using a set of different tools, transforms the relational data into semantic data to be shared using a SPARQL (http://www.w3.org/TR/rdf-sparqlquery/) endpoint.

Wireless Communications and Mobile Computing
Both interfaces share the same relational database with information related to the users. Figure 5 depicts the workflow of the proposed solution to manage the data. The RAI uses a web server called Nginx (https://nginx.org/) which receives the pilot's data and sends it to an internal Unix socket. A uWSGI (https://uwsgi-docs.readthedocs.io/en/latest/) process listens on this socket, reads and writes the requests, and executes the needed Flask (http://flask.pocoo.org/) methods. Flask provides a web service that executes various pieces of Python code to handle user requests and uses an internal ORM to store the user requests in the relational database.
When the storing process is finished, the LDI is in charge of semanticizing that data. The D2RQ (http://d2rq.org/) engine retrieves the data from the database and gives them a semantic meaning using a mapping file which connects each table and column of the database to a class and properties of a previously designed ontology. Then, the loaded semantic data is processed using a semantic rule engine to make the implicit knowledge explicit, following a process similar to the one proposed by Almeida and López-de-Ipiña [46]. The reasoner infers new statements based on a set of spatiotemporal rules. Table 4 shows an example of two different rules used in the system. As can be seen in the first rule, the "when" declaration contains a statement which relies on having a subject with a location, a predicate with a boolean indoor value of true, and an object with the id of a pilot. If the forward-chaining rule is satisfied, then it executes the "then" declaration to create a new statement which expresses a new relationship. This relationship provides new information about the registered location, saying that this location is a "building." The second rule is quite simple: if a given action was performed in a time interval, then the system will consider it as a registered action; thus it will create a new statement to make the logical connection. Finally, all the generated knowledge is sent to a web server called Fuseki (https://jena.apache.org/documentation/serving data/). This server contains both SPARQL and HTML endpoints that can be used to query the stored information. The server provides a single access point for sending requests that extract the loaded knowledge and use these data as a source of analytical data for different purposes.
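To make the "when/then" mechanism concrete, the following sketch applies a forward-chaining rule to a small set of triples. It is a plain-Python illustration and not the rule engine or rule syntax used in the system: the triple layout, the predicate names (`indoor`, `pilot`, `type`), and the identifiers are assumptions loosely modeled on the description of the first rule.

```python
def forward_chain(triples, rules):
    """Apply the rules repeatedly until no new statement can be inferred."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_facts = rule(facts) - facts
            if new_facts:
                facts |= new_facts
                changed = True
    return facts


def indoor_location_is_building(facts):
    """When a location is indoor and belongs to a pilot, then it is a building."""
    indoor = {s for (s, p, o) in facts if p == "indoor" and o is True}
    with_pilot = {s for (s, p, o) in facts if p == "pilot"}
    return {(s, "type", "building") for s in indoor & with_pilot}


# Hypothetical statements as (subject, predicate, object) triples.
facts = {
    ("location:1", "indoor", True),
    ("location:1", "pilot", "pilot:lecce"),
    ("location:2", "indoor", False),
    ("location:2", "pilot", "pilot:lecce"),
}
inferred = forward_chain(facts, [indoor_location_is_building])
```

Only `location:1` satisfies the "when" part, so a single new statement is added to the knowledge base; the original statements are kept unchanged.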

Functional Validation of PDCS and PDSMS.
This section aims to describe and validate the data flow of the Personal Data Capturing System (PDCS) and Personal Data Store and Management System (PDSMS), by illustrating a simple use case focused on the detection of the still/moving body state. This task is directly related to the "motility" geriatric factor, because knowing the quantity and time of particular user body activities (at least how much time the user is still) is an important factor to define the behavioral profile of the elderly and to assess his/her MCI risk. This use case is mainly based on the user motility subsystem; however, the same principles apply to the remaining subsystems (indoor/outdoor positioning, ambient parameters, and user/environment interaction), which offer likewise important information for the MCI risk detection process.
As briefly explained in Section 4.1, each city or pilot site can deploy its own sensing infrastructure, provided that it is fully compliant with the reference architecture model. This example deals with the generation of the BODY STATE START and BODY STATE STOP LEAs based on the sensing infrastructure set up in the Lecce pilot of the City4Age project. Considering only the indoor environment, the sensing infrastructure of the Lecce pilot consists of a set of BLE beacons for indoor positioning (Easibeacon mini) and a couple of smart plugs for detecting the usage of domestic appliances, like TV and washing machine. Moreover, the elderly person is equipped with a wristband based on the SensorTag (http://www.ti.com/ww/en/wireless connectivity/sensortag/) for motility detection, indoor positioning, and ambient parameters collection, and with a smartphone (LG G3) acting as a gateway for data gathering and forwarding.
More precisely, the motility subsystem consists of two main components: the SensorTag, able to detect the user's state, and a background service installed on the smartphone that collects the data from the SensorTag, preformats them as LEA, and provides the right information to the Cloud Building Block. From the hardware point of view, this represents an interesting solution, because by using only a low-cost and open wristband associated with a common smartphone, it is possible to gather data related to user body motility among other important tasks (like the collection of data related to user indoor position and ambient parameters, not in the scope of this work).
The classification algorithm to identify these body states is based on a machine learning approach. In particular, the triaxial acceleration signals are sampled at a frequency of 25 Hz. Each sample is then filtered with a median filter to eliminate noise, and the signal magnitude vector is computed to make the signal independent of device orientation. Samples are collected over a 3 s wide time window, after which the standard deviation of the samples is extracted as a feature characterizing the whole sample set. The classification is based on this parameter. If it is above a given threshold, set after the training phase, then the user is classified as moving; otherwise the user is classified as still.
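The processing chain just described (median filtering, signal magnitude, standard deviation over a 3 s window, threshold comparison) can be sketched as follows. This is a hedged illustration rather than the firmware running on the SensorTag: the kernel size of the median filter, the per-axis application of the filter, the use of the population standard deviation, and the threshold passed by the caller are all assumptions, since the text does not specify them.

```python
import math
import statistics

SAMPLE_RATE_HZ = 25                            # sampling frequency from the text
WINDOW_SECONDS = 3                             # window width from the text
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 75 samples per window


def median_filter(values, kernel=3):
    """Sliding median; the kernel size is an assumption, edges use shorter windows."""
    half = kernel // 2
    return [
        statistics.median(values[max(0, i - half):i + half + 1])
        for i in range(len(values))
    ]


def classify_window(samples, threshold):
    """Classify one window of (x, y, z) accelerations as 'moving' or 'still'."""
    xs, ys, zs = zip(*samples)
    fx, fy, fz = median_filter(xs), median_filter(ys), median_filter(zs)
    # The magnitude makes the feature independent of device orientation.
    magnitude = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(fx, fy, fz)]
    feature = statistics.pstdev(magnitude)
    return "moving" if feature > threshold else "still"
```

For example, a window of 75 identical samples yields a standard deviation of zero and is classified as "still" for any positive threshold, while a window whose magnitude alternates between two clearly different levels is classified as "moving".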
The result of this classification algorithm, implemented on the SensorTag, is the update of the BLE movement characteristic, whose value is equal to "0x0000" when the absence of movement is detected. This datum is the output of the sensor device and it is notified to the smartphone app subscribed to this BLE characteristic.
Once the notification is received by the smartphone's app, it recognizes that this value represents a BODY STATE START LEA, so the Common Data Format object is created and all available information, such as the user id, the instance id, and the timestamp of the action, is inserted (Figure 6).
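A minimal sketch of this step is given below. The actual Common Data Format is the one shown in Figure 6, which is not reproduced here, so the dictionary keys, the underscore spelling of the action names, and the little-endian decoding of the characteristic are assumptions. The mapping of any non-zero value to a BODY STATE STOP LEA is also an assumption, since the text only specifies the meaning of the zero value.

```python
import json
from datetime import datetime, timezone

STILL_VALUE = 0x0000  # characteristic value reported when no movement is detected


def build_body_state_lea(characteristic, user_id, instance_id, timestamp):
    """Turn a raw BLE characteristic value into a LEA-like dictionary."""
    value = int.from_bytes(characteristic, byteorder="little")
    # Assumption: zero opens a "still" session, any other value closes it.
    action = "BODY_STATE_START" if value == STILL_VALUE else "BODY_STATE_STOP"
    return {
        "action": action,
        "user": user_id,
        "instance_id": instance_id,
        "timestamp": timestamp.isoformat(),
    }


lea = build_body_state_lea(
    b"\x00\x00",
    user_id="user-1",  # hypothetical identifiers
    instance_id=42,
    timestamp=datetime(2017, 5, 4, 10, 0, 0, tzinfo=timezone.utc),
)
payload = json.dumps(lea)
```

The resulting `payload` string is what a gateway could hand over to the next stage; keeping the same `instance_id` for the matching stop event is what later allows the two LEAs to be paired.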
At this point the LEA can be sent to the CBB by calling a proper method of L-WoX running on the smartphone app. The CBB is based on the WoX platform and its deployment includes the installation of the following components (running on the same machine):
(i) An Apache Tomcat v7.0 (JDK 7) servlet container where the capturingapp.war web application archive is deployed.
(ii) An Ubuntu Server machine (14.04 LTS) with WSO2 ESB, BRS, and AS modules.
On the CBB, the WoX platform can complete the CDF with further information, such as the current location of the user, gathered by the indoor/outdoor positioning subsystem of the PDCS (Figure 7). The full CDF data object is then temporarily stored in the CBB, waiting to be sent, at a given time of the day, to the City4Age platform. The same instance id value for the two data objects allows matching the BODY STATE START LEA with the related BODY STATE STOP LEA. By computing the difference between the two timestamps, it is possible to calculate the duration of this "session" of body state (in this example the duration is 4:40 min).
A specifically designed ETL pipeline of the Data Aggregation Module on the CBB then extracts the LEA-formatted records, transforms and aggregates the data, and generates a set of high-level measures, such as the sum of all "still" sessions' durations during the whole day, so the STILL TIME value can be computed and sent to the City4Age platform as a measure, as shown in Figure 8.
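The pairing of start and stop LEAs and the daily aggregation can be sketched as follows. The record layout (`action`, `instance_id`, `timestamp`) and the underscore spelling of the action names are assumptions made for illustration; the sample timestamps are invented so that the first session lasts 4 minutes and 40 seconds, as in the example discussed above.

```python
from datetime import datetime


def session_durations(leas):
    """Pair START/STOP LEAs by instance id and return durations in seconds."""
    open_sessions = {}
    durations = []
    for lea in sorted(leas, key=lambda item: item["timestamp"]):
        key = lea["instance_id"]
        if lea["action"] == "BODY_STATE_START":
            open_sessions[key] = lea["timestamp"]
        elif lea["action"] == "BODY_STATE_STOP" and key in open_sessions:
            started = open_sessions.pop(key)
            durations.append((lea["timestamp"] - started).total_seconds())
    return durations


def still_time_seconds(leas):
    """Daily measure: sum of the durations of all completed 'still' sessions."""
    return sum(session_durations(leas))


# Hypothetical LEAs of one user for one day.
day = [
    {"action": "BODY_STATE_START", "instance_id": 1,
     "timestamp": datetime(2017, 5, 4, 10, 0, 0)},
    {"action": "BODY_STATE_STOP", "instance_id": 1,
     "timestamp": datetime(2017, 5, 4, 10, 4, 40)},
    {"action": "BODY_STATE_START", "instance_id": 2,
     "timestamp": datetime(2017, 5, 4, 15, 30, 0)},
    {"action": "BODY_STATE_STOP", "instance_id": 2,
     "timestamp": datetime(2017, 5, 4, 15, 31, 0)},
]
durations = session_durations(day)
still_time = still_time_seconds(day)
```

Sessions that are still open at aggregation time (a start without its stop) are simply ignored in this sketch; how the real pipeline treats them is not stated in the text.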
The steps illustrated above (showing how a BLE characteristic update is transformed, after some elaboration, into a daily measure) demonstrate the goals of the Personal Data Capturing System. Starting from raw data coming from heterogeneous devices, each with its own technologies and data formats, the LEBB module installed on the gateway (either a smartphone or a home gateway) creates a proper LEA, inserting information with a high level of abstraction, independent of any low-level technology. Then the Common Data Format object is sent to the CBB, which can add further information before sending it to the Personal Data Store and Management System, according to the defined communication protocol.
To perform these tests, a dedicated server based on an Intel(R) Xeon(R) E5606 with a clock speed of 2.13 GHz was used, with 8 GB of RAM at 1333 MHz (dual channel mode) and a 500 GB ATA disk with a maximum transfer rate of 300 MB/s and a nominal media rotation rate of 7200 rpm.

Stress Test.
In the stress test simulation, Apache JMeter (http://jmeter.apache.org/) was used to put a massive load on the add action API endpoint. The idea is to simulate a set of different clients sending LEAs using the add action endpoint in a short period of time. The API must store all data without losing any of it and must still be online after finishing all requests. In the stress test, a sample file with 30 random samples of add action was used, simulating 800 clients sending information to the system at intervals of 0.5 seconds. Each client sends three different requests.
The first request is a GET to the main page of the API. In this test, the API returns a status code of 200 (connection ok) and a piece of HTML code containing a welcome message with information on the available endpoints. The second request is a POST in which the user sends the login credentials to be authenticated by the API. If the given credentials are correct, the system returns a status code of 200 and an encrypted cookie containing the login credentials needed by the API. The third request is a POST in which the user sends its login credentials and a list of JSON values containing the 30 instances of add action. Figure 9 shows an example of one of the mentioned JSON samples. If this operation is completed successfully and the API performs a commit in the database, the API returns a status code of 200. The total execution took 6 minutes and 41 seconds to complete. The test was repeated five times to ensure that results were consistent across runs. The final value reported in this document is the mean value of all performed tests. In all iterations, the stress test finished successfully without errors and the server stored all sent data in the database.
Reviewing the gathered results, it can be seen that the login process takes longer than the other requests. This is normal behavior, since the login process needs to create an encrypted cookie and return the needed information to the end user; thus the response times of the server are slower. Once the user has its login credentials, the process of adding 30 samples of add action data is faster and the returned results confirm the reliability of the system.
It is important to remark that the login process is done only once, after which the user does not need to log in to the system again; thus this experiment confirms that the API can handle massive requests from different users in a short period of time without problems.
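The load implied by these figures can be made explicit with a short calculation. It uses only the numbers reported above (800 clients, three requests per client, 30 add action instances per client, a total run of 6 minutes and 41 seconds); reading "intervals of 0.5 seconds" as one client being started every 0.5 s is an assumption, not something the text states.

```python
clients = 800
requests_per_client = 3            # GET main page, POST login, POST add action
instances_per_client = 30          # add action instances in the third request
total_seconds = 6 * 60 + 41        # reported total execution time

total_requests = clients * requests_per_client
total_instances = clients * instances_per_client
requests_per_second = total_requests / total_seconds
instances_per_second = total_instances / total_seconds

# Assumption: one client is started every 0.5 s.
start_interval_seconds = 0.5
start_phase_seconds = clients * start_interval_seconds

print(f"{total_requests} requests, {total_instances} add action instances")
print(f"about {requests_per_second:.2f} requests/s "
      f"and {instances_per_second:.2f} instances/s on average")
print(f"client start phase: {start_phase_seconds:.0f} s of {total_seconds} s")
```

Under that reading, starting the clients alone takes 400 s, which is almost the whole 401 s run; the reported total time would then be governed mainly by the client start schedule rather than by the processing speed of the server.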

Pilot Deployments.
The proposed system has been deployed in the cities that are part of the City4Age project: Athens, Birmingham, Lecce, Madrid, Singapore, and Montpellier. Each city integrated its sensor architecture with the system presented in this paper, providing data from its citizens.
Table 5 reports the data gathered from the involved pilots. In the test, 135 real users participated in real environments, each city with a different sensor configuration. These sensors detected the users' actions and helped in the acquisition of different types of measures. The relevant data was managed and filtered by the IoT middleware. This test was executed over a period of 6 months. During this time, the middleware gathered relevant information from the users. In addition, each user was supervised by a geriatrician to collect their behavioral data and their personal status. This data was sent to expert caregivers and, using their experience plus a collection of computational algorithms, the risks related to MCI and frailty were assessed. The results gathered in the table show the total information sent by the pilots from the users in each city. The "LEA" subtable instances contain the total number of add actions sent to the API, and the "MEA" subtable instances contain the total number of add activity sent to the API.

Discussion
The City4Age architecture differs from other AAL approaches in that it tackles the problem of creating an assistive environment at a multicity level. In order to do this, the proposed system presents a novel data model for AAL systems and works on several modular layers that allow capturing and processing heterogeneous data. The proposed data model allows the different cities that are part of the City4Age project to choose the level of abstraction that they will use for the integration. In this way one city can provide more detailed information in the form of low-level elementary actions, while another city can use aggregated information that describes the behavior of the user in a week (i.e., the geriatric measures). This allows the cities to reuse the existing IT infrastructure already deployed in order to maximize resources (e.g., if a city has already deployed an AAL system which provides user activities or geriatric measures, it can be easily integrated with the City4Age architecture). In cases when cities are deploying a new IoT network, the proposed system provides a low-cost and low-power sensing infrastructure that abstracts the heterogeneity of physical devices and communication technologies. This, coupled with the data abstraction layer, makes it easy to create an IoT deployment which captures the users' geriatric factors starting from LEAs. The proposed data abstraction mechanism is also able to properly model the uncertainty in the measures, allowing for a richer inference in the higher layers.
The data captured by this city-level IoT infrastructure is then integrated in a common shared repository that provides multicity data integration, allowing the reuse of the geriatric developments among multiple cities. The shared repository normalizes the received data using a semantic representation following the Linked Open Data paradigm. It also uses a semantic reasoning process to extract and make explicit the implicit information in the data. In this way, the integrated data can be easily queried and reused by cities' administrators or by third parties in order to conduct wide scale studies related to frailty and MCI by analyzing large populations. This is facilitated by the multiple data access interfaces provided by the shared repository, allowing users with different expertise levels to query the stored data.
The data provided by the different subsystems of this architecture are used to define an MCI risk profile of the elderly user (also known as care recipient) according to as many GEFs/GESs as possible. Starting from LEAs generated when the events occur, measures are computed on a daily basis and sent to the common shared repository, for the purposes explained above. Once per month, a risk detection algorithm computes the numerical values of each GEF/GES by elaborating the related daily measures, which are finally shown in a visual dashboard used by caregivers and domain experts, mainly geriatricians. The dashboard provides them with a synthetic and multidimensional view of the whole history of the care recipient, which can be analyzed in depth in order to define the proper intervention for the subject. Depending on how evident the behavioral change of some GEFs/GESs between two adjacent time periods is, the corrective action can be, for example, an informative communication related to current behavior, some life-style improving suggestions, or an invitation to further medical analysis.
As can be seen, the risk evaluation and intervention phases are not fully automated, because one of the main goals of the City4Age project is to assess whether the unobtrusively gathered data are actually useful for MCI risk detection or whether they could represent an overload of (useless) information for the caregiver. That is why the setup and running of the six pilot testbeds are important to validate the project's outcomes and the transversality of the adopted approach. Nevertheless, the provided system architecture is flexible enough to be adapted to other application scenarios with minimal effort.

Conclusions
In this article, a smart city-oriented infrastructure for capturing and managing data related to elderly people's behavior has been presented. The infrastructure, developed within the City4Age project, combines the Internet of Things and Linked Open Data paradigms to provide a scalable and responsive system able to provide services for multiple cities concurrently. This infrastructure is composed of the first two layers of the general project infrastructure: the Personal Data Capturing System (PDCS) and the Personal Data Store and Management System (PDSMS). The presented architecture is flexible enough to be deployed in different scenarios, supporting thousands of users in several cities; in this case it has been applied to the analysis of behavioral changes of elderly people related to MCI problems. In this context, a considerable effort has been made to define a data model that can properly abstract the domain complexity and the technological heterogeneity. The proposed solution allows the cities to integrate their data on different abstraction levels, providing a semantic endpoint that offers an expressive data format for inference and querying purposes.
Currently the system is being deployed in five European cities (Lecce, Madrid, Montpellier, Athens, and Birmingham) and in Singapore, as pilot sites related to the City4Age project. The aim of these testbeds is to verify, in the long run, the correctness of the proposed system, both from a technological point of view and from a domain point of view. In fact, it will be interesting to evaluate whether the system correctly manages the flow of gathered data coming from heterogeneous devices and to assess whether the behavior analyses carried out for elderly people in different cities are consistent with one another. Outcomes of this testing phase will be the object of future work.

Figure 1: System architecture of the City4Age project.

Figure 2: System architecture of the Personal Data Capturing System (PDCS) and the Personal Data Store and Management System (PDSMS).

Figure 4: Architecture of the Personal Data Store and Management System.

Figure 5: Workflow of the Personal Data Store and Management System.

Figure 6: CDF of the BODY STATE START LEA produced by the LEBB.

Figure 7: CDF of the BODY STATE START/STOP LEAs after the elaboration made by CBB.

Figure 8: CDF of the STILL TIME measure produced by the CBB.

Table 1: Excerpt of GEFs and GESs list.

Table 4: Example of rules used in the test.
shows a line chart containing the results of the test. The x values define the elapsed time in minutes and seconds and the y values define the response time in