Online Discrete Anchor Graph Hashing for Mobile Person Re-Identification

With the advance of mobile technologies, mobile devices such as unmanned aerial vehicles (UAVs) are becoming more important in video surveillance. By applying mobile person re-identification (re-id), mobile devices can monitor pedestrians in transportation systems in complex environments. Since the computing and storage resources of mobile devices are limited, traditional person re-id methods are not appropriate for mobile conditions. Besides, the mobile person re-id task also requires real-time processing. In this paper, we propose a novel hashing method, online discrete anchor graph hashing (ODAGH), for mobile person re-id. ODAGH integrates the advantages of online learning and hashing technology. In ODAGH, we propose an online discrete optimization algorithm to improve the efficiency of anchor graph learning in the online scenario. Experimental results demonstrate the superiority of ODAGH in terms of both effectiveness and efficiency.


Introduction
With person re-identification (re-id) technology, the same identity can be found across different, nonoverlapping cameras. Person re-id can be widely used for video surveillance; moreover, it is a key technology in pedestrian traffic monitoring [1]. Detecting and tracking a person across cameras is important in traffic monitoring systems [2]. Person re-id technology can also intelligently and efficiently identify and track pedestrians in streets, airports, or other transportation systems.
The task of person re-id is an image retrieval problem. Given a probe image (query), the purpose of person re-id is to search a traffic database of established personnel characteristics for images that contain the same person [3]. Traditional person re-id technology is used in scenes where cameras are unable to move, such as fixed camera networks in different public areas, including urban transport systems.
In recent years, with the advance of mobile technologies, mobile systems such as unmanned aerial vehicles (UAVs) have been widely used for video surveillance and traffic monitoring [2]. For example, UAVs with cameras can play an important role in tracking people in transportation systems in complex environments such as rural areas, mountains, and the sea, where fixed cameras are lacking or cannot reach, and they can accomplish dangerous and tedious visual information gathering tasks well. Therefore, mobile person re-id is required. Unlike the settings assumed by traditional person re-id methods, mobile systems have limited computing and storage resources, and in real-world applications the communication bandwidth is also limited. However, image processing methods usually have high time and space complexity, so traditional person re-id methods are not suitable for mobile systems.
The hashing technique is an efficient image retrieval method for fast person re-id. It converts high-dimensional data into short binary codes while preserving the similarity between data points. With operations such as XOR and bit-counting, a fast search becomes easy. Also, hashing methods based on machine learning have been shown to be superior to hashing methods based on random projection [4][5][6]. However, most existing hashing methods use a batch learning strategy: they learn hash functions offline from specific training data. If new person images are collected and added to the database, they still use the pre-trained model to obtain the new hashing codes. If the new images differ from the training data, the performance of the offline hashing model degrades significantly. To adapt, offline hashing methods must accumulate the whole database, retrain new hash functions, and recompute all the hashing codes. This is very inefficient, especially when the database is frequently updated with newly collected images.
Existing hashing methods are therefore not appropriate for mobile systems, which require fast real-time methods to solve tasks such as person re-id. To cope with the real-time demand of mobile systems, a mobile person re-id method should have two important characteristics: (1) it should support fast retrieval of images and (2) it should learn hashing functions from changing training data in an online manner. As an emerging technology, the online hashing technique [7,8] can be applied to cope with online retrieval over a changing database.
In this paper, online discrete anchor graph hashing (ODAGH) is proposed for mobile person re-id. ODAGH utilizes the advantages of graph learning to effectively preserve the visual correlation of person images. However, traditional graph learning still requires much storage and computing resources. In ODAGH, we propose online anchor graph learning, which first uses an anchor graph to reduce the space cost of graph construction and then uses an online learning algorithm to optimize the graph model effectively and efficiently. The main contributions of this paper are summarized as follows:
(i) ODAGH integrates the advantages of online learning and hashing, and as a hashing method, it can be easily applied to mobile systems with limited computing and storage resources. By leveraging an online discrete algorithm, ODAGH can efficiently update hashing functions and learn discrete hashing codes when new images are collected. It can fulfil the real-time demand of mobile systems.
(ii) ODAGH is an unsupervised hashing method, so it requires no training labels, which are hard to obtain. By improving the graph learning, the visual correlation can be effectively preserved in hashing codes. Moreover, by using a GPI-based online optimization, quantization loss can be largely avoided in the hashing process.
(iii) ODAGH relies on only one parameter, which supports its robustness in real applications. Also, experimental results demonstrate the effectiveness and efficiency of ODAGH compared to other person re-id methods.
The rest of this paper is organized as follows. The related work is reviewed in Section 2. We present the details of our proposed method in Section 3. The experimental configuration and results are introduced in detail in Section 4. Finally, the conclusion of the paper is given in Section 5.
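As a small illustration of the XOR and bit-counting search mentioned above, the following sketch packs ±1 hash codes into bytes and computes Hamming distances with NumPy. The helper names `pack_codes` and `hamming_distances` are hypothetical and not part of the paper; this is only one possible way to implement such a search.

```python
import numpy as np

def pack_codes(codes):
    """Pack an (n, k) array of +/-1 hash codes into bytes, one row per image."""
    bits = (np.asarray(codes) > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

def hamming_distances(query_packed, db_packed):
    """Hamming distance from one packed query code to every packed database code."""
    xor = np.bitwise_xor(db_packed, query_packed)   # differing bits become 1
    return np.unpackbits(xor, axis=1).sum(axis=1)   # bit-counting per row

rng = np.random.default_rng(0)
db = rng.choice([-1, 1], size=(5, 16))   # five 16-bit codes (illustrative data)
query = db[2].copy()
query[:3] *= -1                          # flip three bits of the third code
dists = hamming_distances(pack_codes(query[None, :]), pack_codes(db))
ranking = np.argsort(dists)              # database indices sorted by distance
```

Ranking the database by `dists` gives the retrieval order; a production system would more likely use a hardware popcount instruction than `np.unpackbits`.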

Image Hashing.
In recent years, image hashing has gained much attention from researchers in image retrieval. Generally, there are two kinds of hashing methods: supervised hashing [9][10][11] and unsupervised hashing [6,12]. A supervised hashing method learns binary hash codes from class labels or other supervised information. Representative supervised hashing methods include column sampling-based discrete supervised hashing (COSDISH) [9], supervised discrete hashing (SDH) [10], deep supervised discrete hashing (DSDH) [13], and fast scalable supervised hashing (FSSH) [14]. However, because annotating supervised labels is extremely expensive, supervised hashing methods have a limited range of application in large-scale image retrieval.
Unsupervised hashing methods have one obvious advantage: they can learn hash functions without any label information. Therefore, the original geometric structure, including visual information, can be retained in the hash codes. Representative unsupervised hashing methods include spectral hashing (SH) [4], iterative quantization (ITQ) [6], angular reconstructive embeddings (AREs) [15], unsupervised hashing with binary deep neural network (UH-BDNN) [16], and similarity-adaptive deep hashing (SADH) [17]. Nevertheless, most unsupervised and supervised hashing methods learn hash functions in an offline manner, and they cannot cope with the scenario where new data are continuously added to the database.

Person Re-Identification.
Traditional research on person re-id mainly includes visual feature representation [18,19] and distance metric learning [20,21]. In [22], feature effectiveness was identified in a query-adaptive manner for feature fusion. The method proposed in [23] learns discriminative and robust representations via dictionary learning. In [24], Fisher vectors were used for person re-id. A part loss network was proposed in [25] for deep representation learning. There are also many studies related to distance metric learning. In [26], the person re-identification problem was formulated as a ranking problem, and Ensemble RankSVM was developed to solve it. KISSME [27] considers scalability and learns a distance metric from equivalence constraints. Mahalanobis metric learning was also used in several methods [28,29]. Feature representation learning can be regarded as the pre-processing step of person re-id, and it can be combined with metric learning or our proposed hashing method.
Recently, several hashing-based person re-id methods [30][31][32][33] were proposed to improve the search efficiency. Fast indexing from raw image data to binary hashing codes is essential. To realize end-to-end fast indexing, deep regularized similarity comparison hashing (DRSCH) [34] adopts a convolutional neural network (CNN). Part-based deep hashing (PDH) integrates spatial information by mapping horizontal parts to binary codes for feature discrimination [35]. A novel coarse-to-fine (CtF) method [36] complementarily learns short and long codes in a pyramid structure from a CNN. Most existing hashing-based person re-id methods use deep networks for feature representation learning, and they ignore the preservation of visual correlation in hashing codes. Moreover, deep learning methods [37,38] consume substantial computing and storage resources, so they are not appropriate for the mobile person re-id task, which is performed on mobile systems with limited resources.

Proposed Method
In this section, we introduce online discrete anchor graph hashing (ODAGH) in detail. The overall framework of ODAGH is shown in Figure 1. When a mobile device (e.g., a UAV) collects new images of a person, visual features are extracted for online discrete anchor graph learning. ODAGH can efficiently update the hashing functions using only the new features and a few small training variables. Then the hashing codes of the new images are added to the database. In the query step, users can use a query image of a specific person to search the database for images of this person.

Problem Setting.
Suppose the database consists of streaming images. When new images come in, we update the hash functions. We define $X \in \mathbb{R}^{N \times d}$ as the image matrix, where $N$ is the number of all training images in the database and $d$ is the dimension of the image feature. In the online learning process, the image matrix $X$ can be represented as $[X_{old}^T, X_{new}^T]^T$, where $X_{old} \in \mathbb{R}^{N_{old} \times d}$ denotes old images in the database and $X_{new} \in \mathbb{R}^{N_{new} \times d}$ denotes new images, with $N = N_{new} + N_{old}$. Our goal is to learn hashing functions and hashing codes $H \in \mathbb{R}^{N \times k}$ for all images, where $k$ is the code length. To guarantee efficiency, we directly use a linear projection to reduce the time cost. Thus, the hashing function is defined as

$$H = \mathrm{sgn}(XW), \tag{1}$$

where $W \in \mathbb{R}^{d \times k}$ is the weight matrix. The main notations used in this paper are summarized in Table 1.
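The linear hashing function described above, a projection by the weight matrix followed by the sign function (the same operation as the step H_new = sgn(X_new W) in the online algorithm), can be sketched as follows. The function name `hash_codes` is hypothetical, and mapping exact zeros to +1 is an assumption of this sketch; the paper does not state a tie-breaking rule.

```python
import numpy as np

def hash_codes(X, W):
    """Linear hashing: project features with W, then binarize with the sign function.

    X: (n, d) feature matrix, W: (d, k) weight matrix.
    Returns an (n, k) array with entries in {-1, +1}.
    """
    codes = np.sign(X @ W)
    codes[codes == 0] = 1        # assumption: break ties towards +1
    return codes.astype(np.int8)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))                    # four images, 6-dim features
W, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # orthonormal columns, W^T W = I
H = hash_codes(X, W)
```

Using a QR factorization here is only a convenient way to obtain a matrix satisfying the orthogonality constraint for the example.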

Basic Formulation.
We use graph learning [39] to preserve the visual information in hashing codes. However, traditional graph learning consumes substantial computing and storage resources, and it is hard to design an online algorithm for it. Therefore, we use anchor graph learning, which constructs an anchor graph to approximate the full graph.
To construct the anchor graph, $N_a$ anchors $\{x_j^a\}_{j=1}^{N_a}$ are first randomly selected from the training data and used to approximate the data neighbourhood structure. Then, the truncated similarity matrix $Z \in \mathbb{R}^{N \times N_a}$ is computed from the distances between images and anchors as in equation (2), where $\mathrm{dist}(x_i - x_j^a)$ is the distance between image $x_i$ and anchor $x_j^a$ (in this paper, we use the L2 distance for the similarity matrix) and $\sigma$ is the mean of all distances. In addition, we can approximate the graph matrix by $A = ZZ^T$.
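To make the anchor graph construction more concrete, here is a minimal sketch under stated assumptions. It follows a common anchor graph recipe: for each image keep only the `s` nearest anchors, weight them with `exp(-dist / sigma)`, and normalize each row to sum to 1. The truncation size `s`, the exact kernel, and the row normalization are assumptions of this sketch and may differ from the paper's equation (2); only the L2 distance and sigma as the mean distance come from the text.

```python
import numpy as np

def truncated_similarity(X, anchors, s=3):
    """Sketch of a truncated similarity matrix Z of shape (N, N_a).

    Assumed recipe (not confirmed by the paper): keep the s nearest anchors per
    image, weight them by exp(-dist / sigma), and normalize each row.
    """
    diff = X[:, None, :] - anchors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))        # L2 distances, shape (N, N_a)
    sigma = dist.mean()                            # mean of all distances
    nearest = np.argsort(dist, axis=1)[:, :s]      # indices of the s closest anchors
    rows = np.arange(X.shape[0])[:, None]
    Z = np.zeros_like(dist)
    Z[rows, nearest] = np.exp(-dist[rows, nearest] / sigma)
    Z /= Z.sum(axis=1, keepdims=True)              # each row sums to 1
    return Z

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))
anchors = X[rng.choice(20, size=5, replace=False)]  # anchors drawn from the data
Z = truncated_similarity(X, anchors, s=2)
A = Z @ Z.T                                          # approximated graph matrix
```

Because `Z` has only `N_a` columns, products such as `Z.T @ X` can be used in place of the `N × N` matrix `A`, which is what keeps the storage cost low.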
Based on the construction of the anchor graph, we can formulate anchor graph learning for hashing as equation (3), which minimizes a trace term built from the hashing codes $H$ and the graph Laplacian $L$. The formulation is similar to traditional graph learning. The biggest difference is that the Laplacian matrix is computed as $L = I - A$, where $A$ is the approximated graph matrix. $\mathrm{Tr}(\cdot)$ denotes the trace operator.
It is infeasible to directly solve the problem under the discrete constraint on the hashing matrix $H$ in equation (3). Therefore, we relax the hard discrete constraint by introducing a continuous matrix $F = XW$ to replace $H$ in equation (3). We also add the term $\|F - H\|_F^2$ to make $H$ close to $F$. The overall basic formulation, equation (4), thus combines the graph term on $F$ with $\alpha \|F - H\|_F^2$, where $\alpha$ is the parameter for the second term. The constraint $W^T W = I$ is used to avoid the trivial solution of $W$. The objective function (4) is optimized by alternating between $W$ and $H$: with $H$ fixed, the subproblem in $W$ is equation (5).

Figure 1: The overall framework of ODAGH for mobile person re-id.

It is obvious that equation (5) is a quadratic problem on the Stiefel manifold [40]. Its Lagrangian function adds a multiplier term for the constraint $W^T W = I$, and setting the derivative with respect to $W$ to 0 gives the optimality condition. We can use generalized power iteration (GPI) [40] to solve for $W$; the detailed algorithm of GPI is shown in Algorithm 1.
Then, we fix $W$ and optimize $H$, and (4) becomes

$$\min_{H} \|XW - H\|_F^2 \quad \text{s.t.} \quad H \in \{-1, 1\}^{N \times k}.$$

The solution can be easily obtained by

$$H = \mathrm{sgn}(XW).$$
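The W-step relies on generalized power iteration. The sketch below implements the standard GPI scheme for a generic problem of the form min Tr(WᵀQW − 2WᵀG) subject to WᵀW = I. The names `quad`, `lin`, `gpi`, and `objective` are hypothetical, and how the paper's matrices B and M map onto `quad` and `lin` is not specified here; this is an illustration of the technique, not the authors' Algorithm 1.

```python
import numpy as np

def objective(W, quad, lin):
    """Value of Tr(W^T quad W - 2 W^T lin)."""
    return np.trace(W.T @ quad @ W - 2 * W.T @ lin)

def gpi(quad, lin, n_iter=100, seed=0):
    """Generalized power iteration for min Tr(W^T quad W - 2 W^T lin), W^T W = I.

    quad: (d, d) symmetric matrix, lin: (d, k) matrix.
    """
    d, k = lin.shape
    quad = (quad + quad.T) / 2                     # enforce symmetry numerically
    eta = np.linalg.eigvalsh(quad).max() + 1.0     # shift so eta*I - quad is positive definite
    shifted = eta * np.eye(d) - quad
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((d, k)))   # random orthonormal start
    for _ in range(n_iter):
        M = 2 * shifted @ W + 2 * lin
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        W = U @ Vt                                 # closest orthonormal matrix to M
    return W

rng = np.random.default_rng(1)
S = rng.standard_normal((6, 6))
quad = S @ S.T                                     # illustrative symmetric matrix
lin = rng.standard_normal((6, 2))
W = gpi(quad, lin)
```

Each iteration keeps `W` on the Stiefel manifold by construction, and with the positive definite shift the objective does not increase from one iteration to the next. A fixed iteration count is used for simplicity; a convergence test on the objective would be a natural refinement.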

Online Algorithm.
In this section, we consider the online optimization of (4). In the online learning process, the image matrix $X$ can be represented as $[X_{old}^T, X_{new}^T]^T$. Then, we consider the online improvement of GPI, and we can obtain that

$$C_{new} = X^T X = C_{old} + X_{new}^T X_{new}, \tag{11}$$

where $C_{old} = X_{old}^T X_{old}$. Since $C_{old}$ is only related to old images, it can be preserved from previous learning. So, in the online process, we only need to compute $X_{new}^T X_{new}$, whose time complexity is $O(N_{new})$. The time of computing $C_{new}$ is linear in the number of new images and independent of the size of the database. The truncated similarity matrix $Z$ can also be represented as $[Z_{old}^T, Z_{new}^T]^T$. Therefore, we can obtain that

$$D_{new} = Z^T X = D_{old} + Z_{new}^T X_{new}, \tag{12}$$

where $D_{old} = Z_{old}^T X_{old}$. Similar to (11), the time complexity of computing $D_{new}$ is also linear in the number of new images and independent of the size of the whole database.
Based on (11) and (12), we can obtain the online update of $B_{new}$ in equation (13), with linear time complexity that is independent of the size of the database. Similarly, when we compute $M$, its term $X^T H$ can be computed in an online manner:

$$E_{new} = X^T H = E_{old} + X_{new}^T H_{new}, \tag{14}$$

where $E_{old} = X_{old}^T H_{old}$, and the time complexity of computing $E_{new}$ is linear in the size of the new data.
According to the above division for online updating, we propose an online optimization algorithm based on GPI to solve for $W$ and optimize $H$. The detailed procedure is summarized in Algorithm 2.
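As a worked step, the identity below shows why the two accumulated matrices suffice for the graph term. It uses only the definitions in the text ($L = I - A$, $A = ZZ^T$, $C = X^T X$, $D = Z^T X$); whether the paper's $B$ contains further terms, for example a multiple of $C$ coming from the $\alpha$-weighted term, is not assumed here.

```latex
\begin{aligned}
X^{T} L X &= X^{T}\left(I - ZZ^{T}\right)X \\
          &= X^{T}X - \left(Z^{T}X\right)^{T}\left(Z^{T}X\right) \\
          &= C - D^{T}D .
\end{aligned}
```

Since $C$ is $d \times d$ and $D$ is $N_a \times d$, the $N \times N$ Laplacian never has to be formed, and both matrices can be updated from the new images alone.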
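The online accumulation of the matrices C, D, and E described above can be sketched as a small running-statistics class. The class name `OnlineStats` and its method names are hypothetical; the update rules themselves follow the text (each matrix is its old value plus a product involving only the new data).

```python
import numpy as np

class OnlineStats:
    """Running sums C = X^T X, D = Z^T X, E = X^T H, updated per batch of new images."""

    def __init__(self, d, n_anchors, k):
        self.C = np.zeros((d, d))
        self.D = np.zeros((n_anchors, d))
        self.E = np.zeros((d, k))

    def update(self, X_new, Z_new, H_new):
        # Each product touches only the new rows, so the cost is linear in N_new.
        self.C += X_new.T @ X_new
        self.D += Z_new.T @ X_new
        self.E += X_new.T @ H_new

    def graph_term(self):
        """X^T (I - Z Z^T) X, computed without the N x N Laplacian."""
        return self.C - self.D.T @ self.D

rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((7, 4)), rng.standard_normal((3, 4))
Z1, Z2 = rng.random((7, 5)), rng.random((3, 5))
H1, H2 = rng.choice([-1.0, 1.0], size=(7, 3)), rng.choice([-1.0, 1.0], size=(3, 3))

stats = OnlineStats(d=4, n_anchors=5, k=3)
stats.update(X1, Z1, H1)     # initial database
stats.update(X2, Z2, H2)     # one online round
```

The memory held between rounds is independent of the database size: only the three small matrices are kept, not the old images.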
In Algorithm 2, according to above analysis, we can find

Overall Process.
Suppose the initial database has $N_I$ images $X_I$, and it becomes larger when new images come in. The overall process of our ODAGH is described in Algorithm 3.
Suppose that the current database size is $N$. We can easily find that the time complexity of Algorithm 1 is $O(N_I)$, and the time complexity of Algorithm 2 is $O(N_{new})$. Therefore, the time cost of our overall iterative online process is linear in the total number of images. It is also equivalent to the time cost of one round of learning in offline hashing.

Datasets.
We use the Market-1501 dataset [41] to evaluate the performance of our method. There are 32,668 bounding boxes of 1,501 identities in the Market-1501 dataset. It is the largest person re-id dataset, with 14.8 cross-camera ground truths for each query on average. Also, it is closer to realistic situations than previous datasets, so it can better show the effectiveness of online methods. Since we mainly focus on the efficiency of person image retrieval and not on feature learning, we directly use IDE_ResNet_50 features.

Experimental Settings.
We compare our method with two non-hashing methods, Euclidean [41] and KISSME [27], and two supervised offline hashing methods, COSDISH [9] and SDH [10]. We set $N_I = 1000$, which means that the database contains 1000 images at the beginning. Then, we set $N_{new} = 1000$, which means that at each round, 1000 new images are added to the database. Finally, the database contains 32668 images; at the last round, 668 images are added. Since COSDISH and SDH are offline methods and it is time-consuming to retrain them in every round, to be consistent with real applications, we only train hash functions at the first round for COSDISH and SDH. Deep learning-based hashing methods [34][35][36] usually require substantial computing and storage resources, which makes them inappropriate for mobile embedded systems. Therefore, we do not use them for comparison.
The non-interpolated mean average precision (MAP) score [42] is used to evaluate the performance of all compared methods. Given a query, the average precision (AP) is defined as

$$AP = \frac{1}{p} \sum_{i} \mathrm{pre}(i)\,\mathrm{rel}(i),$$

where the sum runs over the ranks $i$ of the retrieved images, $p$ is the number of relevant images, $\mathrm{pre}(i)$ is the precision of the top $i$ retrieved images, and $\mathrm{rel}(i) = 1$ if the $i$-th retrieved image is relevant to the query; otherwise, $\mathrm{rel}(i) = 0$. The MAP score is the mean of the AP scores over all queries. Besides MAP, precision-recall (PR) curves are also used to measure the performance of all methods.

Algorithm 2 (excerpt):
(2) Compute $C_{new}$ according to (11);
(3) Compute $D_{new}$ according to (12);
(4) Compute $B_{new}$ according to (13);
(5) Initialize an orthogonal matrix $W$;
(6) for iter < max_iter do
(7) Compute $H_{new} = \mathrm{sgn}(X_{new} W)$;
(8) Update $E_{new}$ according to (14);

Journal of Advanced Transportation

Table 2 shows the MAP scores of all compared methods on the Market-1501 dataset. The results of Euclidean and KISSME are reported in [41], and we carefully tune the parameters of COSDISH and SDH to report their best results. From Table 2, we can find that ODAGH performs best. Although ODAGH is an unsupervised method, it can effectively preserve the visual correlation of images in hashing codes. The performance of the two offline hashing methods is much worse than that of the other methods, even though they are supervised. The main reason is that they cannot support online learning of new images, and they only use the 1000 initial images for training. The MAP score obtained by the non-hashing method KISSME is very close to that of ODAGH.
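The AP and MAP definitions used for evaluation can be sketched as follows. The function names are hypothetical; the input is assumed to be a 0/1 relevance list in ranked order, and p is taken to be the number of relevant images that appear in that list.

```python
import numpy as np

def average_precision(rel):
    """Non-interpolated AP for one query.

    rel: sequence of 0/1 flags, rel[i-1] = 1 if the i-th retrieved image is relevant.
    """
    rel = np.asarray(rel, dtype=float)
    p = rel.sum()                                   # number of relevant images
    if p == 0:
        return 0.0                                  # assumption: AP is 0 with no relevant image
    ranks = np.arange(1, len(rel) + 1)
    pre = np.cumsum(rel) / ranks                    # precision of the top-i results
    return float((pre * rel).sum() / p)

def mean_average_precision(rel_lists):
    """MAP: mean of the AP scores over all queries."""
    return float(np.mean([average_precision(r) for r in rel_lists]))

ap = average_precision([1, 0, 1, 0])                # (1/1 + 2/3) / 2
m = mean_average_precision([[1, 0, 1, 0], [1, 1]])
```

For the ranked list `[1, 0, 1, 0]`, the precision at the two relevant positions is 1 and 2/3, so the AP is 5/6.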

Experimental Results.
The main reason is that hashing introduces quantization loss, so it would be reasonable for a hashing method to perform somewhat worse than non-hashing methods. Figure 2 shows the PR curves of all compared methods, from which we can further observe their retrieval performance. The results are similar to those for the MAP scores. ODAGH and the two non-hashing methods significantly outperform the other two hashing methods. Also, the yellow curve of KISSME is very close to the blue curve of ODAGH.
Both MAP scores and PR curves demonstrate the superiority of our ODAGH. It clearly outperforms the offline hashing methods, and it can even outperform classical non-hashing methods. The above results indicate that the proposed online discrete anchor graph learning process is applicable and effective.
Finally, we also evaluate the total training time of ODAGH. Table 3 shows the comparison of training time; the experiment was conducted on a PC with a Core i5 2.11 GHz CPU and 16 GB of memory. For offline hashing, we use SDH as the representative. When new images come in, offline hashing retrains on the whole database, while ODAGH only needs to process the new data. As a result, in our experiment, ODAGH processes 32668 images in total, whereas SDH processes 560668 images. Therefore, the training time of ODAGH is significantly less than that of SDH. If new data continue to come in and the database continues to grow, the advantage of our method in total training time will become even more significant.
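The image counts quoted above follow from the round schedule in the experimental settings (1000 initial images, 1000 new images per round, 32668 images in total). The short sketch below reproduces the arithmetic; the function name `round_sizes` is hypothetical.

```python
def round_sizes(n_initial, n_new, n_total):
    """Database size after each round, starting from the initial database."""
    sizes = [n_initial]
    while sizes[-1] < n_total:
        sizes.append(min(sizes[-1] + n_new, n_total))
    return sizes

sizes = round_sizes(1000, 1000, 32668)   # 1000, 2000, ..., 32000, 32668
offline_total = sum(sizes)               # offline: retrain on the whole database each round
online_total = sizes[-1]                 # online: each image is processed once

print(len(sizes), online_total, offline_total)   # 33 32668 560668
```

The offline total is the sum 1000 + 2000 + ... + 32000 + 32668 = 528000 + 32668 = 560668, which grows quadratically with the number of rounds, while the online total grows linearly.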

Influence of Code Length.
Traditional graph-based hashing methods cannot directly handle the discrete constraint on hashing codes, which introduces considerable quantization loss. When the code length increases, the quantization loss also increases, so their performance cannot be improved and may even deteriorate [43]. Figure 3 shows the performance of ODAGH with increasing code length, and we can observe that the performance increases consistently with the code length. The main reason is that ODAGH uses the GPI-based online discrete optimization procedure, which can effectively avoid the quantization loss. We can also find that when the code length is larger than 512, the increase in MAP score is not significant, which indicates that 512 bits are sufficient for this person re-id task.

Parametric Analysis.
In this section, we analyse the influence of the parameter $\alpha$ on our method. Figure 4 shows the MAP scores for different values of $\alpha$, where $\alpha$ varies from 1 to $10^5$. We can observe that the performance of ODAGH is stable under the variation of $\alpha$, and thus our method is not sensitive to the parameter and is robust in person re-id. Moreover, we can find that the MAP score achieves a relatively high value when $\alpha > 10^4$, which is of the same order of magnitude as the number of images. This phenomenon indicates that we can easily set the parameter $\alpha$ according to the database size.

Conclusions
In this paper, we propose online discrete anchor graph hashing (ODAGH) for mobile person re-id. We first introduce the basic formulation of discrete anchor graph learning, which can effectively preserve the visual correlation of images and avoid quantization loss. Then, we derive a novel online optimization algorithm to update hashing functions and compute discrete hashing codes in an online manner. The time complexity of the online optimization algorithm is linear in the number of new images and independent of the database size. Experimental results on the real-world dataset Market-1501 demonstrate both the effectiveness and efficiency of ODAGH compared to several state-of-the-art non-hashing and offline hashing person re-id methods.
In future work, we will continue to improve ODAGH to make it more suitable for real-world applications of mobile person re-id. For example, several UAVs usually work together, and distributed hashing [44] is required for person re-id with a UAV swarm.

Data Availability
The Market-1501 data used in this study are from a previously reported study, which has been cited, and they are publicly available at https://github.com/zhunzhong07/IDEbaseline-Market-1501. The experimental result data used to support the findings of this study are available from the first author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.