Addressing user expectations in mobile content delivery

Multimedia services such as television programs and live streaming of mobile videos can be delivered to mobile terminals via different access technologies. The question is: how do users perceive such services on mobile terminals? The objective of this study is to find the correlation between video quality thresholds and the user context. Our study reveals the thresholds of users' quality of experience (QoE) in a mobile environment by using different categories of content types, in relation to different access technologies and terminal capability. The mobile terminals used are: (i) a 3G mobile phone, (ii) a Personal Digital Assistant (PDA) and (iii) a laptop. We argue that quality of service (QoS) management should be driven by the user perception of quality rather than resulting from raw engineering parameters such as latency, jitter and bandwidth. Our results should be of interest to network operators, service providers, terminal manufacturers, and researchers working in the area of quality of service management.


Introduction
Evolution in mobile technologies is enabling more services, e.g. mobile videos and television programs, for end-user consumption. It is therefore vital for operators to measure and manage their networks efficiently. Quality of Service (QoS) is a metric commonly used to represent the capability of a network to provide guarantees to selected network traffic. However, a crucial measure of a network and the services it provides depends on how end users perceive the performance of the application. By contrast, Quality of Experience (QoE) is the term used to describe user perception of quality. Fulfilling all QoS parameters might not guarantee a satisfied user; thus we need to understand QoE in order to use QoS effectively. The way in which QoS management is currently addressed is illustrated in Fig. 1. For instance, even though the maximum bit rate is fixed by the access network, multimedia content may have been encoded at a higher frame rate and lower frame quality, or vice versa.
Focusing only on network-level QoS (NQoS), which deals with parameters such as bandwidth, packet loss and jitter, will fail to capture the requirements at the application level. Application-level QoS (AQoS) is concerned with the control of parameters such as content resolution, frame rate, colour depth, codec type, layering strategy, sampling rate and number of channels. Understanding the user requirements at the AQoS level could further lead to better QoS management.
In this study we argue that when providing services targeted at end users, the user's perception should be placed at the centre of the service delivery chain. Hence we focus on a methodology for quantifying the user's perception of quality. The QoE for a user watching a news clip on a PDA might differ from that of another user watching the same news clip on a 3G mobile phone. This is because mobile terminals differ in display screens, bandwidth capabilities, frame rates, codecs and processing power.
This paper describes the experiments conducted to determine the thresholds of video quality acceptability across many different content types by considering mobile user requirements for different access networks. In addition, we examine the optimum way of saving on network bandwidth when delivering these services to the end users. The determination of these thresholds of acceptability across content types is crucial because service providers need to know the minimum levels of quality their customers find acceptable. This can also aid in implementing suitable charging schemes, because there will also be consumer demand for lower quality at a lower cost. Knowing this information can help in providing good QoE, prevent customer migration, and enable efficient use of network resources. To ensure the validity of results, this study was carried out on the physical mobile devices.
The organisation of this paper is as follows. In Section 2 we discuss related work and issues affecting mobile applications. Materials and methods used to carry out our subjective assessment are described in Section 3. Results are illustrated and discussed in Section 4, and final remarks are included in Section 5.

Related work
Measuring and ensuring good QoE in a mobile environment is non-trivial, as it is very subjective in nature and includes other factors such as environments, terminals, sensations, technical quality and content types. These factors make it almost impossible to measure QoE using objective methods. Before applications intended for end-user consumption are developed, it is essential to determine how end users perceive QoS. In other words, we need to map QoS strategies to application-level specifications. Identifying these parameters and controlling QoS can help network infrastructure operators exploit network resources to deliver good QoE as required by the users.

Image size
In contrast to conventional television watching, mobile content delivery is constrained by requirements that include low bit rates, small display screens and low frame rates. A number of studies [5,17,18,23] have investigated human perception of image sizes. A survey carried out by Reeves et al. [23] revealed that image size does increase user arousal, although this result was not specific to mobile terminals. It is generally accepted that larger image sizes can influence viewers' evaluation of quality. This is further supported by the study in [5], which investigated the effects of image size and motion. The findings of that study indicate that people have a stronger response and feel more in control with a large image size than with a smaller one.
From a mobile applications point of view, reducing image resolution has the advantage of utilizing less bandwidth, but quality is reduced as a result of fewer pixels being used to represent visual details of the image. In a study carried out by Knoche et al. [15] to identify the minimum acceptable image resolution for mobile TV, the authors found that resolutions smaller than 168 × 126 received poor ratings. In the case of education using video-based mobile learning, Maniar et al. [19] suggest that screen sizes typical of a PDA device may facilitate a more effective learning experience in comparison to screen sizes of a mobile phone.

Application-level Quality of Service (AQoS)
Substantial attention has been given to AQoS [1,20], with a focus on frame rates and how they influence users' perception of multimedia quality. The authors of [20] found that subjects were more sensitive to reductions in frame quality than to changes in frame rates for small screens. Audio and video quality in terms of the overall ratings on audiovisual quality have also been studied [9,14,15,32]. The integration of audio and video quality tends to be content dependent [9]: for less complex scenes (e.g. head and shoulder content), audio quality is weighted slightly higher than video quality. By contrast, for high motion content, video quality is weighted significantly higher than audio quality. Winkler et al. [32] investigated the influence of various encoding parameters on audio, video and audiovisual quality. One finding in [32] was that more complex scenes would benefit from the allocation of a higher bitrate with relatively more bits allocated towards audio, because a high audio bitrate seemed to produce the best overall quality.
According to most of the literature reviewed, good audio leads to a better video quality rating. However, the authors of [15] found that better audio led participants to rate the video quality as unacceptable, which is contrary to previous studies [14,32]. In [15], low audio quality reduced participants' expectation of video quality, and as such they were less likely to rate the video as unacceptable.

Network-level Quality of Service (NQoS)
Different types of traffic demand different QoS requirements. For example, IP telephony and video conferencing require more stringent delay constraints compared to email and Internet browsing applications. Studies on NQoS parameters (bandwidth, delay, packet loss, and jitter) which affect the quality of multimedia services and applications are continuously being explored [4,10,11,27]. The results in [10] suggest that for head and shoulder content in QCIF format, a bandwidth of only 64 kbit/s is needed to provide good QoS. No added benefit on QoS was derived from increasing bandwidth to 128 kbit/s. Packet loss tends to be the dominant factor that affects multimedia services. Packet loss that occurs in a long video stream is tolerated better than packet loss that occurs in a short video stream [27]. In another study [4], it was concluded that jitter can be nearly as important as packet loss in influencing perceptual video quality.

[Figure panel labels: News clip; Football clip.]

Quality of Experience (QoE)
Some research [12,24,25] and proposals of models [3,22] have been carried out on QoE for mobile applications, but no work to our knowledge has experimented on QoE using the diverse range of contents we have considered in this study to represent typical new services provided for mobile terminals. Furthermore, most work in this area has not been implemented using physical devices as we have done. Using physical devices instead of simulated devices gives the participants the freedom to move the screen closer to suit their needs, rather than feeling constrained to watching the contents on a fixed monitor, thus yielding results that correlate well with real-world scenarios. Siller et al. [25] presented a framework for improving QoE. Their work shows that better QoE was achieved by prioritization mechanisms. Improved video quality was observed, but at the expense of other media services being degraded. For real-time communication, Hestnes et al. [12] identified network conditions such as delay and packet loss to be the important parameters affecting QoE in a packet-switched network.
To meet end-user expectations, we need to understand QoE in order to use QoS management effectively. The work presented herein differs significantly from an earlier poster paper [33], where the study focused only on web browsing over mobile UMTS networks. Herein, we reuse some of the fundamental concepts and methods, but we assess the user's QoE for real-time streaming services. We study users' perception of quality (QoE) for different categories of contents in relation to encoding bitrates appropriate for different access networks (3G and WiFi) and terminal capability.

Selection of test materials
The test materials used in this study were chosen to be a representative set of multimedia contents provided for mobile terminals and, according to a survey carried out by [8], are also the types of contents consumers desire to watch on their mobile devices. Our test materials contain different amounts of spatial and temporal information which, in turn, span a wide range of coding complexity. A sample of this variation is illustrated in Fig. 2, showing the original frames, whilst Fig. 3 shows the temporal variations between successive frames.
From Fig. 3, it can be seen that the temporal motion of news clips is on average significantly lower than that present in football game clips. The descriptions of the contents used in this study are summarized in Table 1. Multimedia contents used in this study comply with the specifications stated in the Production of Video Test Sequences [28]. The Moving Picture Experts Group (MPEG) uses similar categories of contents in its subjective testing [21]. However, the contents in [21] were not suitable for this study because they do not include audio and their duration is not sufficient to capture the user's QoE.

Methodology
The method used in this study is known as the Method of Limits, which was proposed by Fechner [7]. It is used to detect thresholds by changing a single stimulus in successive, discrete steps in either an ascending or a descending series. A series terminates when the intensity of the stimulus becomes detectable. The subjects give a binary response of "yes" or "no" when the stimulus is perceived.
In adopting this method and guided by [6,16,30], we designed two sets of experiments. In the first experiment (descending series), we gradually decreased the quality of the video parameters from good to bad. In the second experiment (ascending series), we gradually increased the quality of the video parameters from bad to good. The purpose is to determine whether there is a correlation between the user's thresholds of acceptability for both experiments. The audio parameters were kept constant in both experiments because audio consumes less bandwidth relative to video. Also, previous research on audiovisual quality [14] suggests that better audio increases quality ratings. The Method of Limits has been successfully implemented in [15,20].
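The threshold detection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function name `threshold_position` and the list-of-booleans encoding of the "yes"/"no" responses are hypothetical, and the returned value is a position in presentation order rather than a row of the encoding tables.

```python
from typing import List, Optional


def threshold_position(responses: List[bool], series: str) -> Optional[int]:
    """Return the 1-based position at which the series would terminate.

    `responses` holds one binary answer per segment, in presentation order:
    True for "yes" (acceptable), False for "no" (unacceptable).
    - descending series (good -> bad): first segment judged unacceptable
    - ascending series (bad -> good): first segment judged acceptable
    Returns None if the response never changes within the series.
    """
    if series not in ("descending", "ascending"):
        raise ValueError("series must be 'descending' or 'ascending'")
    # In an ascending series we wait for the first "yes";
    # in a descending series we wait for the first "no".
    wanted = series == "ascending"
    for position, answer in enumerate(responses, start=1):
        if answer == wanted:
            return position
    return None
```

For example, a subject who accepts the first three segments of a descending series and rejects the fourth would yield position 4, while a subject who never changes response yields `None` and would need separate handling.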

Preparation of source materials (video clips) for mobile terminals
The duration of the video clips used for quality evaluation in previous studies varied between 8 and 30 seconds [2,10,21,29,31]. However, based on a preliminary test we carried out, this duration is insufficient to capture the user's QoE. The source materials were prepared as follows. Sections of TV programs were recorded for approximately 2 minutes and 40 seconds. This seemed appropriate since, according to studies by Knoche et al. [15] and Sodergard [26], the watching time of mobile television is very short, usually within 2 to 5 minutes. The VirtualDub software was used to divide each of the recorded contents into 8 segments (labelled 1 to 8 in Tables 2 to 4), and each segment was encoded with different parameters as follows: for the 3G mobile phone, using the trial version of Helix Mobile Producer (video codec: MPEG-4; audio codec: AMR-NB); for the PDA and laptop, using Windows Media Encoder 9 Series (video codec: Windows Media Video 9; audio codec: Windows Media Audio 9.1). Segments were encoded as illustrated in Tables 2 to 4 respectively. After encoding, the segments were concatenated using TMPGEnc 3.0 XPress software to produce a continuous stream.
The idea is to gradually decrease or increase video quality to determine the thresholds of quality acceptability for users. Participants had no knowledge of these testbed combinations and were asked to indicate when they felt that video quality had become unacceptable, or vice versa. Tables 2, 3 and 4 show the testbed combinations.
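As a rough illustration of the segmentation step, the sketch below computes segment boundaries for a 160-second recording split into 8 parts. It assumes equal-length segments, which the text does not state, and the helper name `segment_bounds` is hypothetical.

```python
from typing import List, Tuple


def segment_bounds(total_seconds: float, n_segments: int) -> List[Tuple[int, float, float]]:
    """Return (label, start, end) in seconds for equal-length segments.

    Equal segment lengths are an assumption made for this sketch only.
    """
    if n_segments <= 0:
        raise ValueError("n_segments must be positive")
    length = total_seconds / n_segments
    return [(i + 1, i * length, (i + 1) * length) for i in range(n_segments)]


# 2 minutes 40 seconds = 160 seconds, divided into segments labelled 1 to 8.
bounds = segment_bounds(160, 8)
```

Under this assumption each segment would last 20 seconds, which lies inside the 8 to 30 second range used for whole clips in the earlier studies cited above.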

Specification of mobile terminals
(i) Nokia N70 3G mobile phone with a display of 28 × 35 mm, 18-bit colour depth, a resolution of 176 × 220 pixels and a Nokia HS-3 headphone for audio playback.
(ii) HP iPAQ rx1950 PDA with a 3.5 in TFT active matrix display, 16-bit colour support, a maximum resolution of 320 × 240 pixels and a Goodmans PRO CD 3100 headphone for audio playback.
(iii) Sony FR315B laptop with a 15-inch TFT display and a screen resolution of 1024 × 768 pixels (the image size actually used was 640 × 480 pixels), and a Goodmans PRO CD 3100 headphone for audio playback.

Subjective assessment
Ninety-six subjects participated in this study. Their ages ranged from 22 to 36 years. Prior to each test session, each subject completed and passed a two-eyed Snellen test for 20/20 vision and an Ishihara test for colour blindness. A training session was also given to make sure the subjects understood what was required. They were asked to use a binary response of "yes" for acceptable and "no" for unacceptable. The purpose is to find the threshold at which quality became unacceptable, or vice versa. This is crucial information for service providers, because they need to know the minimum levels of quality their customers find acceptable in order to pursue a good level of customer satisfaction. In accordance with ITU recommendation BT.500-11 [13], this methodology generates data that exhibits a logistic relationship to the perceived quality metric [20]. The binary response for acceptability is used here as a guide, because in a mobile environment using DVD quality as the "excellent" reference is not practical. The Method of Limits, when used along with qualitative responses from subjects, gives useful information about the user's QoE.

Results
The analysis presented below addresses the aim of this study: to find the correlation between video quality thresholds. The binary responses of "yes" for acceptable or "no" for unacceptable were mapped to numbers allocated to portions of the video stream in order to measure each subject's QoE. The bar chart in Fig. 4 shows the comparison between acceptability thresholds of multimedia contents on a 3G mobile phone. The mean QoE was obtained by taking the average of the subjects' rating thresholds across content types for each respective experimental set. The numbers used to represent mean QoE correspond to the first column of Table 2, which indicates the levels of encoding parameters at which video quality becomes acceptable (unacceptable) for the corresponding series.
When we presented the descending series to subjects, they generally remained satisfied for longer before quality became unacceptable to them. When the series was presented in the reverse order (ascending series), subjects kept wanting further improvement to the video quality, which led to over-provisioning before they accepted the quality of the video.
Comments gathered for the mobile phone mentioned the loss of visual details, "especially text details", and that it was impossible to identify players in contents like football and cricket except when players were zoomed in.
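The averaging step behind the mean QoE values can be sketched as follows. This is a minimal illustration that assumes the thresholds are averaged over subjects separately for each content type and each series; the function name `mean_qoe`, the nested-dictionary layout and the numbers in `example` are all hypothetical placeholders and are not data from the study.

```python
from statistics import mean
from typing import Dict, List


def mean_qoe(thresholds: Dict[str, Dict[str, List[int]]]) -> Dict[str, Dict[str, float]]:
    """Average the subjects' threshold levels per content type and per series.

    `thresholds[content][series]` is a list with one threshold level per
    subject, where a level is a row number of the encoding table.
    """
    return {
        content: {series: mean(levels) for series, levels in by_series.items()}
        for content, by_series in thresholds.items()
    }


# Placeholder values for illustration only (not measured results).
example = {
    "news": {"descending": [5, 6, 7], "ascending": [3, 4, 5]},
    "football": {"descending": [4, 4, 5], "ascending": [2, 3, 3]},
}
result = mean_qoe(example)
```

Each entry of `result` would correspond to one bar in a chart of the kind described for Fig. 4, with one pair of bars (descending and ascending) per content type.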

Fig. 5. Correlation of video quality thresholds for the PDA (mean QoE per content type, for the descending and ascending series).
In the case of the PDA, over-provisioning for the ascending series is not so pronounced when compared to the mobile phone. The reason for this may lie in the fact that as the screen size becomes larger the visual detail improves. The numbers used to represent mean QoE correspond to the first column of Table 3, which indicates the levels of encoding parameters at which video quality becomes acceptable (unacceptable) for the corresponding series.
For all content types, audio quality was acceptable. Comments gathered from participants regarding their ratings indicated that for news clips they preferred better audio quality, as they were primarily interested in the clarity of audio rather than video.
The correlation between video quality thresholds for the laptop (with 640 × 480 image size) is illustrated in Fig. 6. The numbers used to represent mean QoE correspond to the first column of Table 4, which indicates the levels of encoding parameters at which video quality becomes acceptable (unacceptable) for the corresponding series. Subjects' perception of news, romance movie and cartoon contents was also similar to the ratings obtained for the 3G mobile terminal in the descending series. Jerkiness of video was more pronounced for all content types on the laptop compared to the other terminals.
For all terminals that were used in this study, video game content had the highest requirements in the ascending series. In general, users are accustomed to watching and playing video games in good enough video quality, where their opponents and obstacles are clearly seen. Initially starting from a bad quality and then gradually increasing the video quality consequently led to a ...
In order to demonstrate the differences in the subjects' ratings for the three mobile terminals, the statistical metric Z-score was used. This is because it allows us to compare the results of different normal distributions. Our comparison here is based on results obtained from different hardware, having differing characteristics such as codecs, screen size, brightness and contrast. From Table 5, it can be seen that the same trend is noticed across the different terminals regardless of the methodology type.
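For reference, the standard Z-score of a value x is z = (x − μ) / σ, where μ and σ are the mean and standard deviation of the set it belongs to. The sketch below standardizes one set of ratings in this way. It is an illustration only: the text does not say whether the population or the sample standard deviation was used, so the population form (`pstdev`) is an assumption here, and the function name `z_scores` is hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def z_scores(values: List[float]) -> List[float]:
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]
```

Standardizing the ratings of each terminal separately puts them on a common scale, so that trends can be compared across devices whose raw threshold levels refer to different encoding tables.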

Conclusion
In this article we provide a new perspective on QoS management, arguing that this should be driven by the user perception of quality, rather than resulting from raw engineering parameters such as latency, jitter and bandwidth. To the best of the authors' knowledge, although subjective studies have been reported in the literature, the concept of 'user QoE' has not been studied in the context of real-time streaming media. We also addressed the requirements of mobile systems. The results presented herein represent a first step towards QoE-based QoS management, which has the potential to substantially increase the level of user satisfaction and, in turn, lead to a better use of network resources. The correlation between video quality thresholds for the methodology used in this study indicates that QoS can be managed more efficiently when implemented in the descending series (good → bad) than in the reverse order.
Initial results presented herein indicate that QoE does change considerably with the screen size and the mode of presentation. This was expected to some extent, but it is only via a quantitative assessment of user perception thresholds that one can optimize the service provisioning process. In this article we use a methodology to pursue this goal.
Service and content providers may work together with the network operators to maximize user satisfaction in relation to the type of terminal, access network, and type of content being offered to the user. This will, in turn, enable the provisioning of adaptable services, which have been envisioned for years but have not been delivered so far. Our work will evolve in the direction of studying the factors that need to be considered for the deployment of ubiquitous, adaptable services following the vision projected by 3GPP (the Third Generation Partnership Project) of services available anytime, anywhere, from any terminal. The new dimension that we are adding to that vision is that the user perception has to be placed at the centre of the service delivery chain for mobile services. Future subjective studies will help in understanding how existing QoS management mechanisms should be used more effectively.
Another lesson learned from our study is that media adaptation cannot be achieved merely by acting on encoding parameters, compression ratio, frame rate and so forth. Contents must be edited specifically (e.g. larger text size for smaller screen size) for the type of terminal that will be used to access the content. This is a further level of optimization which can make a significant difference to the end user. During our experiments we created the clips as described in Tables 2, 3 and 4, starting from recordings of television programs. This approach would hardly satisfy the user watching, for instance, a football match on a mobile terminal. The loss of visual details and the difficulty in identifying players or in detecting ball movement are bound to substantially decrease user satisfaction. Various editing techniques could address these issues.
On the other hand, we did notice that users' expectations are considerably lower when they use a tiny terminal; the user is more tolerant of visual impairments when using a mobile phone. These 'psychological' factors cannot be taken into account via conventional QoS management studies. This is another example showing the potential of subjective studies, which are so rare in the engineering community.
Our immediate target is to extend the work presented here to a peer-to-peer environment using JOOST, which provides IPTV services using peer-to-peer technology, as a case study. We would also look into the realisation of a QoE model that could be based on the classification of video contents in the spatio-temporal domain. This is because, from the subjective results obtained in this study, we noticed that certain content types had similar rating thresholds.

Fig. 6. Correlation of video quality thresholds for the Laptop.

Table 1
Description of source materials