Augmented Reality Experience: From High-Resolution Acquisition to Real Time Augmented Contents

This paper presents the results of the research project “dUcale”, which experiments with ICT solutions for the museum of Palazzo Ducale (Urbino). In this project, the famed painting “Città Ideale” becomes a case study that exemplifies a specific approach to the digital mediation of cultural heritage. An augmented reality (AR) mobile application, able to enhance the museum visit experience, is presented. The computing technologies involved in the project (websites, desktop and social applications, mobile software, and AR) constitute a persuasive environment for knowledge of the artwork. The overall goal of our research is to provide cultural institutions with best practices that work efficiently on low budgets. Therefore, we present a low-cost method for the high-resolution acquisition of paintings; the resulting image is used as the base of the AR approach. The proposed methodology consists of an improved SIFT extractor for real-time images. The other novelty of this work is the multipoint probabilistic layer. Experimental results, obtained through extensive use of the AR application in front of the “Città Ideale” painting, demonstrated the robustness of the proposed approach. To prove the usability of the application and to ensure a good user experience, we also carried out several user tests in the real scenario.


Introduction
In cultural heritage (CH) valorisation and communication, information and communication technology (ICT) offers easier access to artefacts and a multiperspective view of them, and it can also improve cultural heritage education thanks to innovative learning/teaching methods [1]. Thanks to the introduction of ICT, museum visitors can interact with content in the way they find most interesting. In the last several years, many research projects have aimed to provide museums with interfaces and software tools. Their intention is to develop web-based virtual museum systems integrating 3D computer graphics and augmented reality (AR). Sylaiou et al. [2] demonstrate that, when visitors interact with a museum simulation system and live a more attractive experience, a high level of presence is related to their satisfaction and gratification. The diffusion of smartphones and the growing use of social networks on mobile devices suggest integrating social applications, AR applications, and digital storytelling. A specific goal of the project therefore becomes to translate people's communicative empowerment [3], due to social networks, into communicative empowerment for art. The free use of painting copies, with sharing and editing, might seem to downplay their importance. In this context, a new reading of Benjamin's essay "The Work of Art in the Age of Its Technical Reproducibility" is possible. Nowadays, many of the reproductions that reactivate the object, as Benjamin suggested, take the form of social reproduction and sharing. In this way the "aura" is not eliminated but augmented [4].
Through digital tools, the museum experience becomes a lay experience of culture, and art appears more democratic. The "watch and tweet" custom, that is, social interaction on the web while using a medium, is growing and engaging more and more of the public. The Google Art Project represents a reference standard in the field of social digital libraries (DL), but it is not available or suitable for many Italian museums. The "Explore-Collect-Share" practice is adopted on the websites of many museums around the world. The five most popular museums of the world in 2012 embraced this social communication strategy. The first Italian museum (Uffizi), however, is placed 19th (The Art Newspaper) and has experimented with mobile apps only recently. Our project aims to introduce an "Explore-Collect-Share" approach [5] in a mobile AR environment that increases museum attractiveness and allows easy content management for public cultural institutions, which are very fragmented in the Italian panorama. This project is conducted by the Soprintendenza per I Beni Storici Artistici ed Etnoantropologici delle Marche (SBSAE Marche), Università Politecnica delle Marche (UNIVPM), Università di Urbino (Uniurb), and Alma Mater Studiorum Università di Bologna (Unibo). The demo within the "dUcale" project, presented here, experiments with ICT solutions for the Palazzo Ducale of Urbino and its museum and explores the potential of image-based systems (IBS) as a technical platform for creating a connection between technology and setting in a use situation.
The overall goal of our research is to provide cultural institutions with best practices that work efficiently on low budgets (a detailed comparison with the state of the art is reported in the results section). Therefore, we present a low-cost method for the high-resolution acquisition of paintings; the acquired image is used as the knowledge base of an image-based AR approach. The proposed methodology consists of an improved SIFT extractor for real-time images that uses robust matching. The other novelty of this work is the multipoint probabilistic layer. Experimental results, obtained through extensive use of the AR application in front of the "Città Ideale" painting, demonstrated the robustness of the proposed approach. User tests in the real scenario were also conducted to prove the usability of the application and to ensure a good user experience.
The paper is organized as follows: Section 2 reviews AR techniques applied to digital cultural heritage; Section 3 describes the high-resolution acquisition process; Section 4 describes the proposed image-based AR methodology in depth; and finally Sections 5 and 6 present qualitative results, user test results, and a comparison of the current solution with the state of the art and the relative costs, together with conclusions and future works.

Augmented Reality (AR)
Related Work.
The 1960s can be considered the origin of AR, in particular thanks to cinematography, when Morton Heilig created Sensorama [6], a system that made the viewing experience immersive by enhancing the sensorial perception of reality. Afterwards, the computer scientist Sutherland created the first virtual reality and augmented reality head-mounted display system [7,8]. Then, in 1975, Krueger et al. realized the Videoplace [9], the first virtual reality (VR) system that gave the user the sensation of interacting with virtual objects, although the term VR was coined only in 1989 with the realization of the first 3D software. It was only in the 1990s that VR became AR. The difference is that, whereas before the user worked inside a computer-generated environment, now the information comes out of the device, thanks to the overlay of digital content.
The first AR software was ARToolKit by Kato and Billinghurst [10]; using video tracking, it places a virtual camera at the same point of view as the observer and, starting from this, overlays content.
By the year 2000, augmented reality technology had already reached a high degree of development; thereafter, it was introduced into applications such as navigation, medicine, urban environment, education, and cultural heritage.
ARAC Maps [11], a mobile AR application for archaeological content that runs on commercially available Android devices, aims to augment archaeological paper maps with 3D models and other interactive approaches.
Another recent application for mobile devices is AR-TagBrowse [12], which allows tagging and browsing virtual 3D objects that pop up in a scene observed by a mobile camera. The view is interactive, and a user can also insert notes for specific parts of the object.
In Italy, too, there are similar projects, launched by the Italian Ministry of University and by Regional Governments, that implement augmented experiences. For example, the Tell MeWhere project [13], designed by the University of Modena and Reggio Emilia, realizes a system that offers augmented experiences to visitors of the Enzo Ferrari Museum in Modena. Using smartphones and glass cameras, this system recognizes visitor gestures and provides car details so that visitors can learn more about the luxury cars. Other projects, DICET and SMST, were launched in Italy in 2013 with the common aim of augmenting the visitor experience through increased interactivity and connectivity [13].
Given the above, we can summarize AR as the enrichment of the sensorial understanding through a series of digital or computer-generated contents, which enhance the knowledge of the environment, or objects, with artificial information overlaid "above" a screen.
Augmented reality (AR) is the phenomenon of adding virtual elements into our physical reality, allowing the visitors to communicate directly with the exhibitions.

Cultural Heritage AR.
Virtual and augmented reality technologies have spread into many disciplines, including those not usually related to computer science. In fact, AR technologies have been widely adopted in the domains of military and medical training, urban planning, and architecture, as well as for industrial maintenance work (e.g., in the automotive and aerospace industries) and entertainment. A further field of application is furnishing, in particular interior design and retail solutions. Although cultural heritage adopted these technologies late, a large number of related studies have been carried out in recent years [14]. An example is Archiguide; this system lets the user experience a VR world through computer-generated 3D reconstructions of ruined sites while always keeping him in the "real world" [15]. Lifeplus is another example of AR applied to historical and archaeological sites, using both handheld devices and on-site displays [16]. Particularly significant is the Arco project, which proposes to rebuild a museum collection, or at least part of it, virtually. This is very important because it tackles, for the first time, the problem of how to exhibit the entire collection of a museum, which is too often concealed because of space limitations or long restoration periods [17]. Moreover, the Louvre-DNP Museum Lab (LDML) is a three-year project comprising six presentations, with the purpose of gaining experience in innovative multimedia approaches that bring together visitors and artworks [18]. An interesting "virtual" AR scenario was provided in 2003 by the Dino-Hunter design of the Senckenberg paleontological museum in Germany [19]. In this project, young visitors viewing the museum's website could start a (virtual) mystery tour by manipulating a (virtual) PDA that augmented the dinosaurs' skeletons, reconstructing what they would have looked like [20].
Finally, in 2010, a virtual exhibition took place at the Museum of Modern Art (MoMA) in New York; it was visible only through a mobile phone application called "Layer Augmented Reality browser," which displayed numerous additional works on each floor. The main aim of the organizers was to analyse the impact of AR on our public and private spaces.
The following section describes the AR experience tested directly on one of the most important artworks of the Italian Renaissance (XV century): the "Città Ideale." The main purpose of this test was to increase "artwork appreciation" by giving visitors who have some background knowledge an enjoyable instrument for seeing art, and CH, in an alternative way, so that they can notice and enjoy the characteristics of the artwork.

"Città Ideale": High-Resolution Photographs of the Painting
The first purpose of our research is to obtain a low-cost way of acquiring high-resolution images. Several systems have been developed in recent years to provide a digital replacement for conventional photography of paintings [21]. They integrate complex acquisition systems and control software and use dedicated hardware to obtain a high level of accuracy (resolution, colour, etc.). However, they are not suitable for public cultural institutions, because they are expensive and often offer more than what is needed in the field of CH protection and exhibition [22]. The challenge is to guarantee quality of content, performance, and scalability while, at the same time, pursuing a cost reduction, not only in the digitizing process but also in its sustainability. First of all, in our method [23] the camera is mounted on a stand, and we do not use a motorized track. As Figure 1 and the workflow in Figure 2 show, the painting stays in its position, which makes the acquisition fast and easy and minimizes disruption in the museum. The panel of the "Città Ideale" measures 239.5 cm × 67.5 cm and hangs at a distance of 120 cm from the floor. During the shooting phase, we obtained RAW files; we then checked and managed the parameters in the postprocessing phase. From a distance of 5 m from the painting, we took 32 snapshots. We guaranteed the correct overlap between the pictures thanks to a paper grid placed 60 cm from the painting. Table 1 reports the acquisition results with the timing and the provisioning.
Considering the HW and SW resources used, we achieve a zoom level and an image quality that represent a very good standard, and not only for digital applications on the web and on mobile devices. Indeed, the acquisition enables a reproduction of the painting both in digital printing and in offset printing.
A critical analysis of acquisition procedure of painting is presented in the following paragraphs.
The MTF tests [24] show that the use of the duplicator is a weak point (resolution is measured in LW/PH, that is, line widths per picture height, over the range [1300, 4000]). The image resolution falls from 3613 LW/PH (200 mm focal length, F/11, lens centre) to 3055 LW/PH under the same conditions; at the lens edge, the resolution is 2171 LW/PH, which is the lowest sharpness.
Chromatic aberrations (CA) are well controlled at 135 mm and 200 mm but are slightly higher at 70 mm, with an average value of 1.44 pixels wide open. With the lens at its 200 mm setting, the duplicator increases CA significantly.
The lens showed mild distortion for its lens class (1.7% at 200 mm). This problem can be corrected in the processing phase using common software.
Considering vignetting (light falloff), we perform the acquisition at F/14 to preserve sharpness. We set ISO to 200 to reduce noise in the shooting phase.
The fall in sharpness due to the duplicator is the main problem. After a quality study of the individual snapshots, we decided to use control masks during the stitching phase.
The main step of the workflow for obtaining the high-resolution photograph is the stitching phase. We arrange the 32 snapshots into a macrophoto using the PTGui software. An overlap of 30% is necessary to obtain an accurate stitching. The final image (22537 × 6433 pixels, resolution 240) has a size of 848 MB. We use the Adobe RGB 1998 colour space, so that the colour spectrum of the acquisition phase and of the preprocessing phase (Camera Raw) is still available after the stitching phase.
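As a quick consistency check on the figures above, the pixel dimensions of the stitched image can be related to the physical size of the panel. The following is only illustrative arithmetic: the function name is introduced here, and the only inputs are the panel size and the image size reported in the text.

```python
CM_PER_INCH = 2.54

def pixels_per_inch(pixels: int, length_cm: float) -> float:
    """Sampling density along one side of the panel, in pixels per inch."""
    return pixels / (length_cm / CM_PER_INCH)

# Values reported in the text: stitched image size and panel size.
width_px, height_px = 22537, 6433
width_cm, height_cm = 239.5, 67.5

ppi_w = pixels_per_inch(width_px, width_cm)
ppi_h = pixels_per_inch(height_px, height_cm)
total_pixels = width_px * height_px

print(f"{ppi_w:.0f} ppi horizontally, {ppi_h:.0f} ppi vertically")
print(f"{total_pixels / 1e6:.1f} megapixels")
```

Both densities come out close to the stated resolution of 240, which suggests (as an interpretation, not a statement from the acquisition report) that the stitched image corresponds roughly to a life-size reproduction at that print resolution.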

"Città Ideale" AR Experience
The AR experience is divided into two substeps: global localization of the user in the museum using AR, and AR discovery of the painting details.
The first one is an AR tool for global localization inside the museum based on the whole painting image; this section shows the main features of the application from the user's point of view. Once the app is launched, the device camera is activated and ready to recognize a target (according to the point selection framework described below). When a target image is framed, an information pop-up is shown and a button in the action bar becomes active; if it is pressed, a radar is displayed at the top left of the screen. This feature shows the user the position of the other points of interest (POIs). Each of these points is represented as a marker and a billboard containing the name of the painting. In this way, the system uses AR to show the user the direction to follow to reach the painting. For further information on how to reach a point of interest, the user only has to tap the corresponding geometry. This displays the path on the museum map from the current user position to the selected painting.
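The text does not say how the radar places the POI markers. As a minimal sketch, assuming the user's position and heading are known on a 2D museum map once a painting has been recognized, the direction towards each POI could be computed as below. The function name, the coordinate convention, and the sample coordinates are all hypothetical.

```python
import math

def relative_bearing(user_xy, heading_deg, poi_xy):
    """Distance (map units) and bearing of a POI relative to the user's heading.

    heading_deg is measured clockwise from the map's +y axis ("north").
    The returned bearing lies in [-180, 180): negative means "to the left".
    """
    dx = poi_xy[0] - user_xy[0]
    dy = poi_xy[1] - user_xy[1]
    distance = math.hypot(dx, dy)
    absolute = math.degrees(math.atan2(dx, dy))  # clockwise from +y
    relative = (absolute - heading_deg + 180.0) % 360.0 - 180.0
    return distance, relative

# Hypothetical layout: the user stands at the origin facing "north".
ahead = relative_bearing((0.0, 0.0), 0.0, (0.0, 10.0))
right = relative_bearing((0.0, 0.0), 0.0, (10.0, 0.0))
left = relative_bearing((0.0, 0.0), 90.0, (0.0, 10.0))
print(ahead, right, left)
```

A radar widget would then draw each marker at an angle given by the relative bearing and at a radius proportional to the distance.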
The second module is devoted to details discovery using AR and is examined in depth in the following section and in the results section.
The meaningful steps for the creation of the dUcale AR experience are explained in the following.
The whole procedure consists of three phases, summarized in the block diagram of Figure 3: content creation, the web, and the final visualization.

Content Creation.
First of all, a few items of interest in the painting were identified, such as the doves, architectural elements like the floor or the environment in the background, and all the elements suggested by the historians. More specifically, we decided to use the portions of the artwork that contain interesting details that a common visitor of the museum could not discover by himself. In this way the visitor can be immediately informed about the highlights of the painting. For the proposed application we decided to create a limited number of items, such as visual content, interactive buttons to discover the lines conceived by the artist, and a video guide that explains some details about Renaissance mathematics and the construction of the perspective; however, tracking images and content can be modified or updated at any time, which is a great advantage for the museum managers because the museum exhibition can be easily changed and improved.
Each item has been treated as a tracking image (also known as planar markerless tracking), that is, an image recognizable by the device once the camera is pointed at it. Then, once the content mask had been created for each tracking image, it was possible to assign the content to the corresponding image (Figure 4). This step is mandatory for the visualization of pointed objects in an AR environment.
The whole high-resolution image is used for the AR-based global localization of the user in the museum, when a painting is framed by the mobile device camera, as described before.

Cloud and Web Service.
The AR and content environment is based on a cloud service, ready to be used during the visit to the museum and ready to collect data and services for cultural heritage data management. The heritage organizations and agencies that expressed their opinion in this report show significant experience of, and enthusiasm for, participating in cloud-based development. For public administrations, the main obstacles to participation are the lack of knowledge and skills, trust, and legal issues; the main legal obstacle is that many companies are charged with the governance of their data, and there are often restrictions on where that data may be placed and to whom it may be given. This limitation has been avoided thanks to the contribution of Mcloud, a public regional cloud infrastructure that hosts the described services.
In this project, we introduced an XML standard description using SOAP Web Services to define the portrait entity (i.e., "Città Ideale" with descriptions such as painter, age, location, short history, comments, and audio guide), communication points for AR (i.e., tracking area, virtual layer, description, and social interaction), and user behaviours (i.e., statistics on the use of the system, interactions, and AR tracking area activations).
This description is public and available for the future standardization of this AR interaction with CH.
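The schema itself is not reproduced in the text. As a minimal sketch of what such a description could look like, the snippet below builds an XML payload with Python's standard `xml.etree.ElementTree`. Every element name, attribute, and value is a hypothetical placeholder chosen to mirror the three groups listed above (artwork entity, AR communication points, user behaviours); the SOAP envelope is omitted.

```python
import xml.etree.ElementTree as ET

def build_artwork_description() -> ET.Element:
    """Build a hypothetical XML payload for one artwork and its AR points."""
    artwork = ET.Element("artwork", id="citta-ideale")  # hypothetical schema
    ET.SubElement(artwork, "title").text = "Città Ideale"
    ET.SubElement(artwork, "location").text = "Palazzo Ducale, Urbino"

    # Communication points for AR: tracking area, virtual layer, description.
    points = ET.SubElement(artwork, "arPoints")
    point = ET.SubElement(points, "arPoint", id="poi-1")
    # Made-up pixel rectangle inside the high-resolution image.
    ET.SubElement(point, "trackingArea", x="1200", y="800", width="600", height="400")
    ET.SubElement(point, "virtualLayer", type="video")
    ET.SubElement(point, "description").text = "Placeholder description"

    # User behaviours: here only a counter of tracking-area activations.
    stats = ET.SubElement(artwork, "userBehaviour")
    ET.SubElement(stats, "activation", arPoint="poi-1", count="0")
    return artwork

root = build_artwork_description()
xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)  # round-trip to check well-formedness
print(xml_text)
```

Keeping tracking areas, content layers, and usage statistics in one document is one way to let museum staff update content without touching the mobile application.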

The AR Real Time Visualization and Point Selection Framework.
The final result works as follows: the content becomes available when the visitor points a common smartphone (or tablet) at the painting in front of him. The device recognizes the tracked image and, using this image, connects to the internet, retrieves the associated images, visuals, and 3D shapes, and then puts them into the view. For this purpose, we propose a method based on an improved SIFT extractor for real-time images that uses a robust matching technique developed in the robotics field.
The Scale Invariant Feature Transform (SIFT), developed by Lowe et al. [25,26], is invariant to image translation, scaling, and rotation. SIFT features are also partially invariant to illumination changes and affine 3D projection. These features have been widely used in the robot localization field as well as many other computer vision fields [27,28].
Even though the SIFT algorithm is invariant to scale and rotation and robust to other image transforms, its main disadvantage is that the SIFT feature description of an image is typically large and slow to compute. Consequently, we compute the image similarity using a reduced and optimized SIFT approach with 64 feature descriptors, and we introduce time-saving improvements through two main steps: adaptation of the SIFT parameters to each subimage, as shown in Figure 5, and extraction of a fixed number of key points. In particular, the number of scales computed for the original image is defined according to its dimensions, so in some cases not all SIFT scales need to be computed. The adaptive contrast threshold depends on a scale factor k, on the x and y image dimensions Dim_x and Dim_y, on the grey-level intensity I(x, y) of the image, and on the mean intensity value Ī(x, y) of the processed image. In Lowe's SIFT implementation the contrast threshold is statically defined, and low-contrast key points are discarded because of their sensitivity to noise. In our implementation, the threshold is computed for each subimage, which sometimes avoids the time-consuming feature extraction process altogether and in any case copes with different lighting conditions.
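The exact expression of the adaptive threshold is not reproduced in the text; only its ingredients are named (a scale factor, the image dimensions, the grey-level intensities, and their mean). As a sketch under that limitation, the code below assumes one plausible form, namely the mean absolute deviation of the grey levels multiplied by the scale factor. The default value of `k`, the `min_threshold` cut-off, and both function names are hypothetical.

```python
def adaptive_contrast_threshold(image, k=0.5):
    """Assumed form: k times the mean absolute deviation of the grey levels.

    image is a list of rows of grey-level intensities; its width and
    height play the role of the x and y image dimensions.
    """
    dim_y = len(image)
    dim_x = len(image[0])
    n = dim_x * dim_y
    mean = sum(sum(row) for row in image) / n
    deviation = sum(abs(value - mean) for row in image for value in row)
    return k * deviation / n

def should_extract(image, k=0.5, min_threshold=1.0):
    """Skip feature extraction on subimages that are almost uniform."""
    return adaptive_contrast_threshold(image, k) >= min_threshold

flat = [[10, 10], [10, 10]]        # uniform patch: no contrast at all
textured = [[0, 100], [100, 0]]    # high-contrast patch
print(adaptive_contrast_threshold(flat), adaptive_contrast_threshold(textured))
```

With a per-subimage threshold of this kind, a nearly uniform region (for example a patch of sky) yields a very low value and can be skipped before any key point is computed, which is the time saving described above.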
To address the aforementioned problem, and thanks to the adaptive threshold, we also reduced the number of key points and their corresponding extraction time, while maintaining the same descriptor for each key point. This is a common way of reducing the dimensions and complexity of the problem when the image is very distinctive and has little perceptual aliasing.
In the classical SIFT approach, key points are detected by testing each value in the DoG (Difference of Gaussians) at each scale against the 8 surrounding values of the same scale as well as against the 9 neighbouring values in the scale above and the 9 neighbouring values in the scale below. The first and last DoG scales are not examined. This means 26 × N × M comparisons for a DoG of size N × M, taking into consideration that points within a given border of each DoG are not included in the key point detection [29].
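The neighbourhood test described above can be sketched in a few lines. This is a plain-Python illustration of the classical 26-neighbour check on a small DoG stack stored as nested lists (scale, row, column); it is not the optimized extractor used in the application, and the function names are introduced here.

```python
def is_local_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26
    neighbours: 8 in its own scale, 9 in the scale above, 9 in the scale below."""
    value = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return all(value > n for n in neighbours) or all(value < n for n in neighbours)

def find_extrema(dog):
    """Scan interior scales and interior pixels only (borders are skipped)."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_local_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found

# Toy stacks of three 3x3 scales: one flat, one with a single central peak.
flat = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
peak = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
peak[1][1][1] = 1.0
print(find_extrema(flat), find_extrema(peak))
```

Each candidate costs up to 26 comparisons, so reducing the number of scales or skipping low-contrast subimages reduces the total work directly.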
The other novelty of this work concerns the multipoint probabilistic layer. The goal is to define a way to decide which AR interactive point should be selected when more than one is in the current view.
In order to define the best point of interest in the scene and the related AR content, we compared two algorithms against a dataset of real user choices. The algorithms are a winner-takes-all (WTA) model and a Bayesian model with maximum a posteriori (MAP) estimation. For the final implementation we used the string-based MAP approach. The MAP estimator was trained on manual choices of AR points in the images. The AR software uses this estimator to select the proposed action and to propose the related content. This method solves the issue of multiple points of interest in the same scene by providing a way to select one of them based on MAP action estimates.
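The difference between the two selection rules can be illustrated with a small sketch. It assumes that each visible point has a matching score, used here as an unnormalised likelihood, and that a prior over points is estimated from recorded user choices. The scores, the choice counts, the smoothing constant, and the function names are all hypothetical, and the string-based representation mentioned above is not modelled.

```python
def winner_takes_all(match_scores):
    """Pick the point with the highest matching score, ignoring any prior."""
    return max(match_scores, key=match_scores.get)

def train_prior(user_choices, points, smoothing=1.0):
    """Estimate P(point) from recorded user choices with additive smoothing."""
    total = len(user_choices) + smoothing * len(points)
    return {p: (user_choices.count(p) + smoothing) / total for p in points}

def map_select(match_scores, prior):
    """Maximum a posteriori choice: argmax of likelihood times prior."""
    return max(match_scores, key=lambda p: match_scores[p] * prior[p])

# Hypothetical data: three points of interest visible in the same frame.
points = ["doves", "floor", "background"]
recorded_choices = ["doves"] * 8 + ["floor"] + ["background"]
prior = train_prior(recorded_choices, points)
scores = {"doves": 0.5, "floor": 0.6, "background": 0.2}

wta_choice = winner_takes_all(scores)
map_choice = map_select(scores, prior)
print(wta_choice, map_choice)
```

In this toy case WTA follows the slightly better match, while MAP prefers the point that users chose far more often, which is the behaviour a trained estimator is meant to capture.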

Results
A first result of this work is the successful creation and validation of an easy workflow for HD images. The use of the "Città Ideale" macrophotograph in mobile and web-based applications has shown that the image obtained with our process performs well. At the same time, the methodology and equipment used remain in the field of low-cost acquisition, as demonstrated by the comparison with the best market solutions shown in Table 2.
Augmented reality interfaces perform the visualization of the digital contents of the artwork. The interfaces combine an app-based form of presentation with AR virtual exhibitions. The app allows users to reach the database contents through a well-known interface, whereas the VR and AR exhibitions let them examine virtual reconstructions of selected objects in virtual environments. The virtual exhibitions displayed in the end-user interfaces are dynamically generated from parameterized visualization templates and the database contents, as shown in Figure 6. On the portrait we defined 8 different points of interest for AR detection and user content overlay. In particular, we tested image-based contents, videos (with transparent background), shapes with text, and interactive buttons to cope with social network activities and to share contents from the AR application. Figure 7 shows the mobile application working in front of the real painting.
All tests performed in the real scenario, at different times of day and also using different reproductions of the famous painting, demonstrated the great robustness of the proposed approach, even when compared with commercial AR products. During the testing phase we did not observe any false positives, and all points of interest in the portrait were correctly detected.
It is important to emphasize that the entire demo was performed on the real painting, the "Città Ideale," directly in the exhibition hall of the museum.

Users Test.
To characterize the interviewed sample, preliminary questions about age, academic qualifications, work, and general use of technology were asked first.
The system was tested on 15 different human subjects aged between 22 and 48. Five subjects have a high school qualification and are students. The other 10 subjects have a university qualification and are workers: 2 are self-employed and 8 are salaried workers (employed).
Of the total, 5 subjects are expert users of new technologies and 10 have good skills. Nine subjects spend more than five hours a day surfing the net, 4 between one and five hours, and 2 less than one hour. Fourteen subjects have had a smartphone for more than one year and 1 does not have a smartphone. Moreover, 9 subjects have a tablet and the other 6 do not.
Table 3 lists the answers of the users concerning the "Città Ideale" application. From the answers we can deduce that the app was greatly appreciated by the users, bearing in mind that the respondents' level of education is medium-high and that they are skilled in the use of technology, even though most of them rarely visit museums. They generally found the application clear, intuitive, and very simple to use.
In addition, we must also emphasize that it was the first time for 9 subjects to use AR application, while the remaining

Conclusions and Future Works
The presented solution for the AR museum exhibition of cultural content enables museums to become enjoyable places to spend time. Web pages and augmented reality techniques are needed to capture the visitor's attention; they become an attractive tool that helps the visitor to look actively, to identify important facts, and to visit the museum with new insight.
When visitors simply point a handheld device, the AR terminal must help them to better understand the artwork they are contemplating.
For the aforementioned reasons, augmented reality has become an efficient, automatic, and playful means towards the appreciation and understanding of tangible and intangible cultural heritage. The user tests described in the previous paragraph indicate that such a technology will improve the way a growing public approaches museums. The tests were intentionally carried out with users of any age or social background in order to simulate a typical journey inside the museum, for both experts and nonexperts.
Although AR is a growing technology in many fields, this sort of mobile app applied to artworks is still broadly missing. In the future, we plan to evaluate our work in comparison with other similar tools.
Future works in the dUcale project include the design of an ad hoc application in which a large number of services are dedicated to the visitors. Thus, starting from the ticket reservation, the visitor is encouraged to use his own device as the main instrument for the entire visit. The application could also include virtual route guidance that leads the visitor through the museum, and it would be much more interesting if he could plan the visit beforehand, choosing in advance what to see or what to study in depth.
Finally, a next major step will be to develop and increase the number of tracking images for handheld devices, directly from this app.
In our opinion, a main development for the "Città Ideale" app is to carry out experimental user tests that examine the engagement and emotional response of visitors. We expect that Urbino's Ducal Palace will participate in the debate recently engaged among international museum critics, curators, and neuroscientists [1].
Shelley Bernstein, the Chief of Technology at the Brooklyn Museum of Art [30], wrote that "experimentation without perfection is a good thing" and that "it is our responsibility, collectively, to try new approaches and provide as many entry points into content and the museum as possible." The main goal is to examine these entry points and to take advantage of new media tools in museum management. The cloud service will be able to monitor, evaluate, and easily modify tools and applications.

Figure 1: HR acquisition phases and equipment.

Figure 2: The workflow for the high-resolution image.

Figure 3: The general scheme of dUcale AR demo.

Figure 4: The markerless tracking creation.

Figure 5: Adaptive SIFT approach. Feature extraction parameters are adapted to the processed image and then the SIFT algorithm is performed.

Figure 6: Virtual content visualized in real time in a real test in front of the "Città Ideale" in Palazzo Ducale, Urbino, Italy.

Figure 7: Image and text based point of interest in the AR application. In particular we tested image-based contents, videos (with transparent background), and shapes with text and interactive buttons to cope with social network activities and share contents from the AR application.

Table 2: Equipment for HD acquisition and comparison with market solution.