Classification and Statistical Trend Analysis in Detecting Glaucomatous Visual Field Progression

Aim To evaluate the agreement between different methods in detection of glaucomatous visual field progression using two classification-based methods and four statistical approaches based on trend analysis. Methods This is a retrospective and longitudinal study. Twenty Caucasian patients (mean age 73.8 ± 13.43 years) with open-angle glaucoma were recruited in the study. Each visual field was assessed by Humphrey Field Analyzer, program SITA standard 30-2 or 24-2 (Carl Zeiss Meditec, Inc., Dublin, CA). Full threshold strategy was also accepted for baseline tests. Progression was analyzed by using Hodapp–Parrish–Anderson classification and the Advanced Glaucoma Intervention Study visual field defect score. For the statistical analysis, linear regression (r2) was calculated for mean deviation (MD), pattern standard deviation (PSD), and visual field index (VFI), and when the regression was significant, the visual field series was considered progressive. We also used Progressor to look for a significant progression of each visual field series. The agreement between methods, based on statistical analysis and classification, was evaluated using a weighted kappa statistic. Results Thirty-eight visual field series were analyzed. The mean follow-up time was 6.2 ± 1.53 years (mean ± standard deviation). At baseline, the mean MD was −7.34 ± 7.18 dB; at the end of the follow-up, the mean MD was −9.25 ± 8.65 dB; this difference was statistically significant (p < 0.001). The agreement to detect progression was fair between all methods based on statistical analysis and classification except for PSD r2. A substantial agreement (κ = 0.698 ± 0.126) was found between MD r2 and VFI r2. All the statistical methods required less evaluation time than the classification systems. Conclusions The best agreement to detect progression was found between MD r2 and VFI r2. VFI r2 showed the best agreement with all the other methods. GPA2 can help ophthalmologists to detect glaucoma progression and to make treatment decisions. PSD r2 was the worst method to detect progression.


Introduction
Glaucoma is a chronic disease characterized by an optic neuropathy with irreversible damage to the optic nerve head (ONH) and visual field (VF). From population studies, we know that increased intraocular pressure is associated with increased prevalence [1,2] and incidence [3] of glaucoma. With increasing life expectancy, it is fundamental to slow down its progression to avoid visual disability and blindness. Unfortunately, the disease often progresses despite treatment; for this reason, it is important to monitor changes and, in particular, rates of change. These changes may be observed by analyzing the ONH and the VF [4]. It is not easy to recognize VF changes due to test variability [5][6][7], even if field series are evaluated by experienced observers. Many algorithms are able to distinguish between fluctuation of sensitivity and progression, but no single one has been identified as being superior to the others [8][9][10].
There are many approaches for detecting and quantifying clinical progression [11]. Most often, progression is based on the comparison of serial visual field printouts by expert clinicians. However, this is often insufficient to reliably determine progression; for this reason, classification and statistical approaches exist. Classification systems such as the Hodapp-Parrish-Anderson [12] classification and the Advanced Glaucoma Intervention Study [8] (AGIS) score are systematic scoring protocols, developed for research, that evaluate changes in both depth and location (cluster analysis). These methods are time-consuming but less subjective. The statistical methods are based on event and trend analysis. While event analysis relies on the subjective identification of a confirmed event of change in a visual field series compared to a baseline exam, trend analysis uses linear regression of the VF indices or of the sensitivity of the tested points to detect glaucoma progression over time.
The purpose of this study was to examine the level of agreement between classification systems (Hodapp et al. [12], AGIS [8]) and trend statistical methods (linear regression of MD and PSD, Humphrey Guided Progression Analysis (GPA) 2, and Progressor) in assessing VF progression.

Patients and Methods
This was a retrospective and longitudinal study with at least 5 years of follow-up (F/U). We followed in part the methods of Iester et al. [4]. The study, conducted in agreement with the tenets of the Declaration of Helsinki, included 20 Caucasian patients (mean age 73.8 ± 13.43 years) with primary open-angle glaucoma recruited from the MI's Glaucoma Clinic. Visual fields were assessed by Humphrey Field Analyzer 750 II (HFA, Carl Zeiss Meditec, Dublin, California, USA), using the 30-2 or 24-2 SITA standard (Swedish interactive thresholding algorithm) test. The first 3 fields in each series were excluded to minimize learning effects: the fourth and fifth full threshold or SITA Standard exams were used as baseline. Full threshold strategy was accepted only for baseline tests. Patients were classified as having primary open-angle glaucoma when they had a typical abnormal ONH and/or a typical glaucomatous VF, open angle at gonioscopy, IOP > 21 mmHg before treatment, and no clinically apparent secondary cause for their glaucoma [13]. The abnormal ONH classification [14] was based on the presence of an optic rim notch or of diffuse/generalized loss of optic rim tissue, vertical cup/disc diameter ratio asymmetry unexplained by differences in optic disc size, or disc hemorrhage. A glaucomatous VF defect [14] was defined as three adjacent points depressed by 5 dB, with one of the points depressed by at least 10 dB and two adjacent points depressed by 10 dB, or a 10 dB difference across the nasal horizontal meridian in two adjacent points. None of the points could be edge points unless immediately above or below the nasal horizontal meridian. In addition, visual field testing was considered reliable only when false-negative responses and fixation losses were less than 20%; unreliable VFs were not included in the analyses. Mean deviation (MD) and pattern standard deviation (PSD) were considered in the study to describe the included patients. 
Included patients presented with a typical glaucomatous visual field (baseline MD>3). Exclusion criteria were concomitant ocular disease (for example, cataract), previous ocular surgery, systemic disease or medication known to affect the VF, refractive error exceeding 8D spherical equivalent or 3D of astigmatism, and visual acuity < 20/50 at baseline or during the F/U. Progression was analyzed by using classification systems and statistical analysis.

Hodapp-Parrish-Anderson Classification.
This classification is based on two criteria. The first criterion is the overall extent of damage, which is calculated by using both the MD value and the number of defective points in the Humphrey Statpac 2 pattern deviation probability map of the full threshold test. The second criterion is based on the defect proximity to the fixation point. This system distinguishes early, moderate, and severe glaucomatous visual field defects [13] and recognizes progression of glaucomatous visual field damage when any of the following appears: a new defect in a previously normal area, a decrease in sensitivity within a previous defect, enlargement of a previous defect, or a general depression of visual field sensitivity [12].

Advanced Glaucoma Intervention Study Scoring System.
This is a quantitative method used to assess test reliability and to measure visual field defect severity using the Humphrey threshold test. The AGIS visual field defect score is based on both the number and depth of adjacent depressed test locations in the nasal area, upper hemifield, and lower hemifield. This score is obtained from the total deviation plot of the Statpac 2 single field analysis. A point is considered to be defective when a minimum amount of sensitivity depression is reached. Scores for each hemifield and for the nasal area are summed. The maximum possible score is 20 (two for the nasal field and nine for each hemifield). Progression is defined as an increase in score by 4 or more in three consecutive follow-up fields [15].
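
The progression rule above can be sketched in a few lines of Python. This is only an illustration, not the AGIS software: the function name and parameters are hypothetical, the scores are assumed to be already computed, and the increase is assumed to be measured against a single baseline score.

```python
from typing import Optional, Sequence


def agis_progression_index(
    baseline_score: int,
    followup_scores: Sequence[int],
    min_increase: int = 4,
    consecutive: int = 3,
) -> Optional[int]:
    """Return the index of the follow-up field that completes the first run of
    `consecutive` fields scoring at least `min_increase` above baseline,
    or None if the criterion is never met.

    Assumption: the increase is measured against one baseline score.
    """
    run = 0
    for i, score in enumerate(followup_scores):
        if not 0 <= score <= 20:
            raise ValueError("AGIS scores range from 0 to 20")
        # Extend the run if this field is worse than baseline by the required
        # amount; otherwise the run of consecutive worsened fields is broken.
        run = run + 1 if score - baseline_score >= min_increase else 0
        if run == consecutive:
            return i
    return None


# Hypothetical series: baseline score 3, five follow-up fields.
print(agis_progression_index(3, [5, 7, 8, 7, 9]))  # confirmed at the 4th follow-up field (index 3)
print(agis_progression_index(3, [7, 8, 4, 7, 8]))  # run is broken, so no confirmed progression
```

Requiring three consecutive fields is what makes the rule robust to a single noisy test, at the cost of a later confirmation.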

Guided Progression Analysis 2 (GPA2).
The visual field index (VFI) [16] is the trend analysis algorithm included in the GPA2. The VFI is a global parameter adjusted for age and expressed in percentage (a perimetrically normal field is set at 100% and a perimetrically blind one at 0%). To decrease the influence of cataracts, the pattern deviation probability map is used to identify test points with normal sensitivity (considered normal and scored 100%), those showing relative loss (these are scored as a function of total deviation and age-corrected normal threshold) and those showing no sensitivity (scored 0%). The VFI implements a weighting procedure that assigns more importance to central points than to peripheral points (considering 5 concentric rings of increasing eccentricity in the visual field plot). A trend was considered significant when GPA2 reported the VFI slope as statistically significant (p < 0.05). A minimum of five exams over 3 years must be included in GPA2 for the linear regression results to be presented. The length of projection is equal to the number of years of GPA2 data that is available, up to a maximum projection time of 5 years.

Progressor.
It performs a point-by-point linear regression analysis of sensitivity on time for the whole visual field series. The program produces a cumulative graphical output of each test location over time, using bar graphs. The height of the bar graphs above or below a horizontal line represents the sensitivity of the testing location above or below 30 dB, respectively, allowing visual comparison of each point over time by the height of the subsequent bars. Significant rates of change are shown through color coding. Pointwise linear regression analysis requires at least two fields to generate a slope and a minimum of five fields to be clinically useful. The pointwise linear model has been demonstrated to provide a valid framework for detecting and forecasting glaucomatous loss [10]. In this study, the level of statistical significance we used was p < 0.05, and the rate of decibel loss we considered clinically significant was >1 dB/year.
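
As an illustration of the pointwise criterion described above (slope significant at p < 0.05 and loss greater than 1 dB/year), the following Python sketch fits an ordinary least-squares line to each test location. It is not Progressor's implementation: the function names and sample data are hypothetical, and significance is checked against tabulated two-sided 5% critical values of Student's t rather than an exact p value.

```python
import math
from typing import List, Sequence, Tuple

# Two-sided 5% critical values of Student's t for 1..20 degrees of freedom.
T_CRIT_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
    16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093, 20: 2.086,
}


def slope_and_t(years: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope (dB/year) and its t statistic against a zero slope."""
    n = len(years)
    if n < 3 or n != len(values):
        raise ValueError("need at least three paired exams to test a slope")
    mx = sum(years) / n
    my = sum(values) / n
    sxx = sum((x - mx) ** 2 for x in years)
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    se = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    if se == 0:
        t = 0.0 if slope == 0 else math.copysign(math.inf, slope)
    else:
        t = slope / se
    return slope, t


def progressing_points(
    years: Sequence[float],
    series_by_point: Sequence[Sequence[float]],
    min_loss: float = 1.0,
) -> List[int]:
    """Indices of test locations losing more than `min_loss` dB/year with p < 0.05."""
    df = len(years) - 2
    # Beyond 20 degrees of freedom, reuse the df = 20 value (slightly conservative).
    critical = T_CRIT_05[min(df, 20)]
    flagged = []
    for idx, values in enumerate(series_by_point):
        slope, t = slope_and_t(years, values)
        if slope < -min_loss and abs(t) > critical:
            flagged.append(idx)
    return flagged


# Hypothetical sensitivities (dB) at three locations over six yearly exams.
years = [0, 1, 2, 3, 4, 5]
steady_loss = [28.0, 26.5, 25.2, 23.4, 22.1, 20.3]
stable = [30.0, 29.0, 31.0, 30.0, 29.0, 30.0]
noisy = [25.0, 30.0, 18.0, 29.0, 17.0, 22.0]
print(progressing_points(years, [steady_loss, stable, noisy]))
```

In this made-up series only the first location is flagged: the noisy location also has a slope steeper than −1 dB/year, but its variability keeps the slope from reaching significance, which is why both conditions are applied together.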

Mean Deviation (MD).
It gives an overall value of the total amount of visual field loss, with normal values typically within 0 dB to −2 dB.

Pattern Standard Deviation (PSD).
It measures irregularity by summing the absolute value of the difference between the threshold value for each point and the average visual field sensitivity at each point.

Statistical Analysis.
For the statistical analysis, linear regression (r2) was obtained from GPA2 (VFI) and Progressor, and r2 was calculated for MD and PSD. When significant (p < 0.05), each VF series was considered progressive. The agreement between methods, based on statistical analysis and classification, was evaluated using a weighted kappa statistic. The κ statistic interpretation was as follows: κ < 0, no agreement; κ = 0.0 to 0.19, poor; κ = 0.20 to 0.39, fair; κ = 0.40 to 0.59, moderate; κ = 0.60 to 0.79, substantial; and κ = 0.80 to 1.0, almost perfect agreement.
Time to decide if the VF series were getting worse was calculated for each method.
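
For readers who wish to reproduce this kind of agreement analysis, the sketch below computes Cohen's kappa for two methods that each label a VF series as progressing or not, and maps the result onto the interpretation bands listed above. It is a minimal illustration with hypothetical function names and made-up calls, not the software used in the study. With only two categories, a weighted kappa coincides with the unweighted statistic, so no weights appear.

```python
from typing import Sequence


def cohen_kappa(calls_a: Sequence[bool], calls_b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary raters (True = series judged progressing)."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("both methods must rate the same non-empty set of series")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    pa = sum(calls_a) / n  # proportion called progressing by method A
    pb = sum(calls_b) / n  # proportion called progressing by method B
    expected = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when both methods give one constant call")
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa: float) -> str:
    """Label kappa using the bands stated in the Statistical Analysis section."""
    if kappa < 0:
        return "no agreement"
    if kappa < 0.20:
        return "poor"
    if kappa < 0.40:
        return "fair"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical calls for ten VF series by two methods.
method_a = [True, True, True, False, False, False, False, False, False, False]
method_b = [True, True, False, False, False, False, False, False, False, True]
k = cohen_kappa(method_a, method_b)
print(round(k, 3), interpret_kappa(k))
```

Here the two methods agree on 8 of 10 series, but because 58% agreement would be expected by chance alone, kappa is considerably lower than the raw agreement rate.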

Results
A total of 303 visual fields, divided into 38 VF series from 20 patients, were analyzed. Visual field testing could not be performed in two eyes because of low visual acuity. When possible, the visual fields of both eyes of each patient were included, and each eye was analyzed independently, masked to the status of the fellow eye. The mean follow-up time was 6.2 ± 1.53 years (mean ± standard deviation). At baseline, the mean MD was −7.34 ± 7.18 dB and the mean PSD was 5.67 ± 4.09 dB. At the end of the follow-up, the mean MD was −9.25 ± 8.65 dB and the mean PSD was 6.92 ± 4.67 dB. The differences between the perimetric indices at baseline and those at the end of follow-up were statistically significant (Student's t-test, p < 0.05 and p < 0.001, respectively, for MD and PSD).
Among the 38 VF series, 21 were considered as progressing using the Hodapp classification, 11 using the AGIS scoring system, 13 with MD r2, 5 with PSD r2, 12 with VFI r2, and 13 with Progressor. The agreement in detecting progression was at least fair for all methods except PSD r2. A substantial agreement (κ = 0.698 ± 0.126) was found between MD r2 and VFI r2 (Table 1). The mean time, expressed in minutes, needed to evaluate the progression of VF series using the different methods is summarized in Table 2. The difference in time between classification systems (Hodapp et al. [12], AGIS [8]) and statistical methods (MD r2, PSD r2, VFI r2, and Progressor) was statistically significant (Wilcoxon/Mann-Whitney test; p < 0.001), with statistical methods being less time-consuming.

Discussion
The ability to detect the progression of visual field defects remains one of the most challenging aspects of glaucoma management. We found that all the statistical analyses saved time, which is very important in clinical practice [17]. In the event-based statistical approach [18], the criterion for progression is defined at the start of the study, and progression is confirmed when changes in VF have dipped below the preset threshold. Information at baseline and that from the most recent test is used to decide whether an eye has progressed or not. Trend-based statistical methods [18] can be adopted to improve progression rate measurement. These may be applied to the MD or VF sectors or at individual test locations over time by linear regression analysis. These approaches have been shown to be more sensitive in detecting progression than event-based analysis because all VF measurements over the course of follow-up are taken into consideration for the analysis; however, they are less accurate in indicating the location of the damage.
In our study, the Hodapp et al. [12] classification identified a greater number of progressive VF series compared to the other methods. The Hodapp et al. [12] classification can be of great use in deciding when to start treatment once glaucoma has been diagnosed and how aggressive therapy should be, which is usually based on individual visual defect severity. The disadvantages include the fact that this three-stage subdivision is too simplified, which may make it inappropriate for a fine categorization of visual field defects. Moreover, it requires an accurate and time-consuming analysis of every single visual field test result [19].
We can explain the poor agreement found between PSD r2 and all the other methods by the fact that the PSD analyzes localized VF defects. For this reason, it is not a good index when it is used in the POAG follow-up. In fact, while the PSD is able to detect localized defects in early glaucoma, it is completely insensitive to a decline in the global background visual field level. PSD improves in the advanced stage of the disease; thus, it is not useful in the follow-up when the damage is advanced.
In our study, four eyes (11%) were identified as progressing by all statistical methods except PSD r2. This low concordance amongst different techniques has also been reported by other investigators [20,21].
Up to now, many methods have been developed for assessing glaucomatous VF progression, but there is no gold standard method to detect progression. Methods that quantify the rate of visual field progression seem, however, to be the most appropriate for guiding treatment decisions [22].
In our study, VFI r2 showed the best agreement with all other methods, especially with MD r2. The substantial agreement found between VFI r2 and MD r2 arises because both are statistical reductions of the visual field sensitivity. VFI was built considering MD values corrected with some coefficients calculated on the position of the tested points: the paracentral points weigh more heavily in VFI than the more peripheral ones [16]. Furthermore, VFI expresses the field as a percentage of the "normal," whereas MD is expressed on a dB scale. The MD makes no such assumptions, so it is physiologically more robust. VFI is less affected by cataract and other media changes than MD when using the pattern deviation probability map. It allows a quantification of the VF series by comparing the defect depth with the normal age-adjusted visual field. This is done by taking into account the functional damage related to eccentricity to correlate with ganglion cell density. Moreover, it shows field status as a percentage. The latter allows even an ophthalmologist not specialized in glaucoma to determine the rate of VF progression easily, and therefore to set a target intraocular pressure and institute an individualized treatment. Furthermore, in the literature, opinions still differ about the merits of the VFI compared with the long-established MD, and the question remains under study [16,[23][24][25][26].
In conclusion, the best agreement to detect progression was found between MD r2 and VFI r2, while PSD r2 was the worst method to detect progression. VFI r2 showed the best agreement with all the considered event analysis methods. GPA2 could help ophthalmologists to detect glaucoma progression and to make treatment decisions because of the VFI analysis and the event analysis graph, which could help to identify the VF area where the changes occur.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.