Mismatch Based Diagnosis of PV Fields Relying on Monitored String Currents

This paper presents a DC-side-oriented diagnostic method for photovoltaic fields that operates on string currents supplied by a suitable monitoring system. The relevance of the work lies in the definition of an effective and reliable day-by-day target for the power that every string of the field should have produced. The procedure compares the instantaneous power produced by all solar strings having the same orientation and attributes to all of them, as producible power, the maximum value. As a figure of merit, the difference between the maximum allowed energy production (evaluated as the integral of the power over a defined time interval) and the energy actually produced by the strings is defined. Such a definition accounts for both weather and irradiance conditions without needing additional sensors. The reliability of the approach was experimentally verified by analyzing the performance of two medium-size solar fields that were monitored over a period of four years. The results made it possible to quantify the energy losses attributable to underperforming solar strings and to precisely locate those strings in the field.


Introduction
It is commonly known that photovoltaic (PV) fields can have a lower energy yield than expected. This can be due to many factors, the main ones being the adoption of low-quality materials (in order to reduce costs), careless assembly (because of poorly skilled manpower), and wrong design. Often the bad performance is only recognized after the degradation has become so severe that revenues fall well below the nominal targets. The chance to detect malfunction events early depends on the adoption of reliable monitoring/diagnostic (M&D) systems (a complete literature survey of M&D methods can be found in [1]).
A rough classification of M&D techniques can be based on the "level of granularity" (LoG). The lowest LoG corresponds to treating the solar field as a whole, thus monitoring the instantaneous output power at either the DC side or the AC side, while the highest LoG corresponds to monitoring each individual solar panel embedded in the solar field. Independently of the LoG, the yield of a photovoltaic system can be evaluated by means of the performance ratio (PR), which, according to IEC 61724 [2], is defined as the ratio between the measured instantaneous power, $P$, and the nominal power of the system, $P_{\mathrm{nom}}$, corrected by taking into account the instantaneous irradiance $G$ with respect to the irradiance at STC, $G_{\mathrm{STC}}$:
$$\mathrm{PR} = \frac{P / P_{\mathrm{nom}}}{G / G_{\mathrm{STC}}}. \quad (1)$$
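In order to take thermal effects into account, an improved version was proposed in [3]:
$$\mathrm{PR} = \frac{P}{P_{\mathrm{nom}}\,(G / G_{\mathrm{STC}})\,(1 + \gamma\,\Delta T)}, \quad (2)$$
where $\gamma$ is the temperature coefficient and $\Delta T$ is the temperature increment with respect to 25 °C; the other symbols are as in (1).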
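As a minimal sketch of how a performance ratio of this kind could be evaluated for a single sample, consider the following Python fragment. The function names, the sample values, and the temperature coefficient of -0.4%/°C are illustrative assumptions and are not taken from [2] or [3]; the temperature correction is assumed to scale the expected power by a factor (1 + gamma * delta_t).

```python
def performance_ratio(p_meas, p_nom, g, g_stc=1000.0):
    """Measured-to-nominal power ratio, normalized by the irradiance ratio G/G_STC."""
    if p_nom <= 0 or g <= 0:
        raise ValueError("nominal power and irradiance must be positive")
    return (p_meas / p_nom) / (g / g_stc)


def performance_ratio_temp(p_meas, p_nom, g, delta_t, gamma=-0.004, g_stc=1000.0):
    """Assumed thermal correction: expected power scaled by (1 + gamma * delta_t)."""
    return performance_ratio(p_meas, p_nom, g, g_stc) / (1.0 + gamma * delta_t)


# Hypothetical sample: 5 kW string group, 800 W/m^2, cells 20 degC above 25 degC.
pr = performance_ratio(p_meas=3200.0, p_nom=5000.0, g=800.0)
pr_t = performance_ratio_temp(p_meas=3200.0, p_nom=5000.0, g=800.0, delta_t=20.0)
```

With a negative temperature coefficient the corrected value is higher than the uncorrected one, because part of the apparent shortfall is attributed to heating rather than to a malfunction. Both variants need an irradiance reading, which is the dependency the method proposed in this paper avoids.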

International Journal of Photoenergy
In [4], $P_{\mathrm{nom}}$ was replaced with the expected ac power $P_{\mathrm{ac}}$, evaluated as
$$P_{\mathrm{ac}} = G\,(a_1 + a_2 G + a_3 \log(G))\,(1 + a_4\,\Delta T), \quad (3)$$
where $a_1$, $a_2$, $a_3$, and $a_4$ are fitting parameters and $G$ and $\Delta T$ are the irradiance and the temperature increment introduced above, while in [5,6] (1) was modified into a further expression, (4), which includes a loss term taking into account both temperature and mismatch effects. Methods based on monitoring at a low LoG have the drawback of not being suitable for locating faulty components. Fault location can be achieved by moving towards a higher LoG, which implies increasing the number of monitored parameters. This approach is illustrated in [7], where string currents are compared with a previously defined nominal reference.
A slightly different approach can be found in [8], where an inferential algorithm is adopted to identify, after an initial training, one or more reference strings whose yields are used in place of the nominal power for the definition of the performance ratio. In [9], a restricted dataset of observed string currents and voltages is analyzed to determine the possible faults that could have caused that dataset.
M&D methods based on the PR require the definition of a range where the PR is considered normal (even though less than 1); the range is usually defined on a statistical basis by considering the standard deviation $\sigma$. The optimal width of the confidence interval in terms of $\sigma$ is still debated; as an example, in [10] the ineffectiveness of the $3\sigma$ rule, usually adopted to recognize outlier strings, was shown. Alternative methods were proposed in [11,12], where the plot of the whole I-V curve of individual strings was used to recognize six categories of faults, including shadow effects and bypass diode faults.
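One general reason why a fixed-width rule can struggle on small groups of strings is masking: a single strong outlier inflates the standard deviation used to detect it. The sketch below illustrates this effect on hypothetical PR values for ten strings; it is not the argument developed in [10], and the function name and data are assumptions made for illustration only.

```python
from statistics import mean, stdev


def sigma_outliers(values, k=3.0):
    """Return the indices of values lying more than k sample standard deviations from the mean."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sd]


# Hypothetical PR values: nine healthy strings and one clearly degraded string.
pr_values = [0.82, 0.81, 0.83, 0.80, 0.82, 0.81, 0.83, 0.82, 0.81, 0.40]
flagged_3 = sigma_outliers(pr_values, k=3.0)
flagged_2 = sigma_outliers(pr_values, k=2.0)
```

In this example the degraded string sits about 2.8 sample standard deviations from the mean, so it escapes the 3σ rule and is caught only when the interval is narrowed to 2σ. With ten samples and the sample standard deviation, no single value can deviate by more than 9/√10 ≈ 2.85σ, whatever its magnitude.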
Improved fault location capability can be attained by pushing the monitoring to the individual solar panel level. This solution is relatively simple to implement in distributed conversion systems [13], where each solar panel has its own dc/dc converter that can be properly controlled to plot or estimate the I-V characteristic [14]. Otherwise a restricted set of parameters can be measured. In [15] both the instantaneous solar panel operating voltage and the operating string current (which coincides with the panel current) are measured, so that the instantaneous power delivered by the solar panel can be calculated. In [16-18] the insertion of a series-connected switch is exploited to temporarily disconnect individual solar panels from the string, thus allowing the measurement of both open circuit voltage and short circuit current. Unfortunately, single panel monitoring systems are too expensive and their adoption is very limited.
On the other hand, almost all large solar fields have some form of current monitoring. This information is usually provided by current sensors embedded into the parallel boxes (string-boxes), which log operating currents into a dedicated database. Querying the database is usually allowed and data can be organized in a large variety of plots and tables; however, interpretation is left to the customer. The main issue is that the data might not be significant by themselves; in fact, there is no way to interpret, for example, the power delivered by a PV string on a given day without information about the expected power on that day. Unfortunately, as evidenced by (1)-(4), the latter information depends on the specific weather conditions. This problem has been tackled (e.g., [3,8]) by enriching the monitoring system with additional sensors (irradiance, temperature, wind speed, and so on), which make it possible to convert weather information into suitable yield targets. The main drawback of this approach (in addition to the need for a more pervasive sensor network) is that it requires accurate models and reliable parameters.
In this paper, an automated procedure for analyzing string currents is proposed. Differently from other methods, the proposed approach does not require irradiance and/or temperature sensors. Moreover, since no predictive models are adopted, calibration and parameter extraction are not required either. The method, indeed, relies on the generation of a site-dependent target, which is built by exploiting only the string currents, with no need for weather information. As will be detailed in Section 2, the target is built by evaluating the instantaneous power produced by all solar strings and by attributing, as producible power for all of them, the maximum value. Thanks to this approach, real-time warnings can be given, along with detailed information about production losses. Moreover, malfunctioning strings can be located and, in selected cases, the origin of the performance limitation can be suggested.
Since the procedure is based on DC-side measurements, its main limit is that losses coming from AC faults cannot be quantified. However, it should be considered that catastrophic faults occurring on the AC side (e.g., undervoltage/overvoltage and underfrequency/overfrequency) result in the disconnection of the inverter from the utility grid. This event is recognized thanks to the simultaneous (and sudden) zeroing of all string currents.
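The recognition of an inverter disconnection from the simultaneous zeroing of the string currents can be sketched as follows. This is only an illustration under stated assumptions: the function name, the current thresholds `i_zero` and `i_on`, and the requirement that at least half of the strings were producing at the previous sample are hypothetical choices, not parameters reported in this paper.

```python
def inverter_trip_instants(currents, i_zero=0.05, i_on=1.0, min_fraction=0.5):
    """Return sample indices at which all string currents are ~zero
    while a sizeable fraction of the strings was producing one sample earlier.

    currents: dict mapping string id -> list of current samples (A) on a common time grid.
    """
    strings = list(currents.values())
    n_samples = min(len(s) for s in strings)
    trips = []
    for t in range(1, n_samples):
        all_zero_now = all(s[t] <= i_zero for s in strings)
        producing_before = sum(1 for s in strings if s[t - 1] >= i_on)
        if all_zero_now and producing_before >= min_fraction * len(strings):
            trips.append(t)
    return trips


# Hypothetical data: strings 1 and 2 drop to zero together at sample 2;
# string 3 is permanently at zero and must not prevent the detection.
currents = {
    1: [5.0, 5.1, 0.0, 0.0, 4.9],
    2: [4.8, 5.0, 0.0, 0.0, 5.0],
    3: [0.0, 0.0, 0.0, 0.0, 0.0],
}
trips = inverter_trip_instants(currents)
```

Requiring a minimum current at the previous sample is meant to separate a sudden disconnection from the gradual decay of all currents at dusk.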
The paper is organized as follows. Section 2 reports the method for generating the power versus time (P-T) reference curves. Section 3 illustrates the results of a wide experimental campaign performed on two solar fields (280 kWp and 420 kWp); discussion and comments about the results are also provided. Conclusions are drawn in Section 4.

Target Generation
As mentioned above, string current monitoring systems (string-boxes) are widely adopted in large PV fields. In this paper the commercial system [19] was exploited. This system populates a dedicated database with string currents measured every 5 min; subsequently, currents were converted into power by reading the string voltages (since the performance of strings belonging to the same subarray was compared, the voltage was just a proportionality constant between current and power). It must be remarked that neither of the analyzed solar fields was equipped with irradiance/temperature sensors; hence, data analysis with approaches presented elsewhere (e.g., [3,8]) was not possible.
Power profiles available for each string had the form shown in Figure 1. This figure reports the power produced, for example, on March 28, 2015, by PV string #3, connected to inverter #1 (the notation $(i, j)$ in the legend means that the curve refers to string #$i$, which is connected to inverter #$j$).
The analysis of the curve poses several issues. First, a medium-size solar field consists of tens of strings (the two fields analyzed in this paper had 61 and 85 PV strings, resp.). Thus, a large number of curves are produced every day. Second, the curves do not give information by themselves because the power produced at a given instant of time is the result of the specific irradiance conditions. As a consequence, the curves must be compared with an expected reference; references might be constructed by adopting historical weather series (as done in commercial tools for PV system design). In such a case, only a statistical comparison over a long observation period could be performed. Otherwise, references could be obtained by exploiting real-time data on weather conditions. In such a case, very reliable parameters and accurate modeling of the solar field structure would be needed [20]; special cases like architectural shadows [21,22] could not be recognized.
In this paper, the target P-T curve was built by exploiting the string currents (converted into power) provided by the monitoring system. More specifically, it was assumed that, in a large solar field, it is likely that at each instant of time at least one PV string is working properly. On this basis, a virtual target curve is created according to (5) by considering, for each instant of time, the power produced by the best performing string:
$$P_{\mathrm{target},j}(t) = \max_{i = 1, \ldots, N_j} P_{i,j}(t). \quad (5)$$
In (5) the subscripts $i$ and $j$ have the meaning defined above (string #$i$ connected to inverter #$j$), and $N_j$ is the number of strings connected to inverter #$j$. In other terms, the target P-T curve (for a group of strings which are parallel-connected to the same inverter) consists of several pieces, and each piece is extracted from the P-T curve of the string that, in the specified interval of time, is producing more than all the other strings. The rationale of this definition is that if the array contains a string that can actually produce that power (as defined in (5)), all the strings could do the same, on the assumption that weather conditions, irradiance, and ambient temperature are the same for all of them.
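The construction of the target curve as a pointwise maximum over the strings of one inverter can be sketched in a few lines of Python. The function names and the sample powers are hypothetical; the only assumption about the data is that all strings of the group are sampled on the same time grid.

```python
def target_curve(power_by_string):
    """Pointwise maximum of the P-T curves of the strings connected to one inverter.

    power_by_string: dict mapping string id -> list of power samples (W).
    """
    curves = list(power_by_string.values())
    if not curves:
        raise ValueError("at least one string is required")
    n = len(curves[0])
    if any(len(c) != n for c in curves):
        raise ValueError("all strings must share the same time grid")
    return [max(samples) for samples in zip(*curves)]


def best_string_per_sample(power_by_string):
    """Id of the string that supplies each piece of the target curve."""
    ids = list(power_by_string)
    n = len(power_by_string[ids[0]])
    return [max(ids, key=lambda s: power_by_string[s][t]) for t in range(n)]


# Hypothetical powers (W) for three strings; string 3 degrades in the afternoon.
power = {
    1: [100, 400, 900, 850, 300],
    2: [110, 390, 880, 860, 310],
    3: [105, 395, 600, 200, 150],
}
target = target_curve(power)
pieces = best_string_per_sample(power)
```

Here the target is assembled from pieces of strings 1 and 2 only, and string 3 never contributes, which is the signature of an underperforming string.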
As an example, Figure 2 shows, along with the curve already depicted in Figure 1, the target P-T curve evaluated according to (5). The analysis of Figure 2 clearly points out a reliability issue affecting string #3, while Figure 1, without the comparison with the target, could have been interpreted as a reduction of irradiance (clouds) occurring at 15:00.
It is worth noting that the definition of the reference P-T curve allows the daily energy target to be determined as well; indeed, energy is defined as the integral of the P-T curve and is graphically illustrated in Figure 3(b).
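The integration of a sampled P-T curve, and the resulting relative energy loss of a string with respect to the target, can be sketched as follows. The trapezoidal rule and the function names are illustrative assumptions; the default step of 5 min matches the sampling interval of the monitoring system mentioned above.

```python
def energy_wh(power_w, step_min=5.0):
    """Trapezoidal integral of equally spaced power samples (W), returned in Wh."""
    dt_h = step_min / 60.0
    return sum(0.5 * (a + b) * dt_h for a, b in zip(power_w, power_w[1:]))


def relative_loss(string_power_w, target_power_w, step_min=5.0):
    """Fraction of the target energy that the string failed to produce."""
    e_target = energy_wh(target_power_w, step_min)
    if e_target == 0:
        return 0.0
    return (e_target - energy_wh(string_power_w, step_min)) / e_target


# Hypothetical hourly samples, chosen so that the arithmetic is easy to check.
target_w = [0.0, 1200.0, 1200.0, 0.0]
string_w = [0.0, 600.0, 600.0, 0.0]
e_target = energy_wh(target_w, step_min=60.0)
loss = relative_loss(string_w, target_w, step_min=60.0)
```

In this example the target energy is 2400 Wh and the string delivers half of it, so the relative loss is 0.5.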

Experiments and Discussion
The daily production target defined above was exploited for the diagnosis of malfunctions occurring in two solar fields, both exhibiting unexpectedly low energy production.
Two kinds of analysis were performed: a horizontal analysis, where the performance of each PV string was compared with that of the other strings and with the target over one year of observation, and a vertical analysis, where the performance of each string was compared with itself over four years of observation.
Comparisons were made at the subarray level, where a subarray is defined as a group of parallel-connected strings with the same orientation. Since different subarrays had slightly different tilt and azimuth orientations, dedicated P-T targets were defined for each of them. With the aim of avoiding the presentation of almost duplicate data, the results of the horizontal analysis will be shown with reference to the 280 kWp solar field, while the results of the vertical analysis refer to the 420 kWp solar field.

Horizontal Analysis.
In the horizontal analysis, the energy produced every day by each PV string of a subarray was monitored for a one-year period (2015) and compared with the energy target defined by integrating the corresponding P-T curve, the target being defined with reference to the same subarray. The comparison between the energy actually produced and the target made it possible to quantify the energy losses accumulated by the specified string during the given day.
In Figure 4, an example of the results is shown. For the sake of simplicity, only the energy losses referring to four days of the year are reported. These four days are representative of the four seasons; in order for the data to be representative, four bright days were chosen. The figure shows the performance of the ten strings connected to the inverter denoted as #1.
From the figure, it can be inferred that two strings, #6 and #7, perform almost like the target (which means that, according to (5), the target was built mainly by taking pieces of the P-T curves of string #6 and string #7); some of the others show relevant energy losses, with peaks of about 50% in December (yellow bars). The latter circumstance suggested that the losses could be attributed to architectural shadowing. In order to point out the effects of architectural shading, the relative energy losses were also evaluated by restricting the integration window of the P-T curve to only two hours centered at noon (from 11:00 to 13:00), so as to have the sun as high as possible in the sky. Figure 5 shows the results.
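Restricting the integration window can be sketched as follows. The function name, the representation of time as minutes since midnight, and the sample data are assumptions made for illustration; the example mimics a string that is shaded when the sun is low and almost unshaded around noon.

```python
def windowed_loss(times_min, string_w, target_w, start_min, end_min):
    """Relative energy loss of a string with respect to the target,
    integrated (trapezoidal rule) only over samples with start_min <= t <= end_min."""
    sel = [(t, s, g) for t, s, g in zip(times_min, string_w, target_w)
           if start_min <= t <= end_min]
    e_string = e_target = 0.0
    for (t0, s0, g0), (t1, s1, g1) in zip(sel, sel[1:]):
        dt_h = (t1 - t0) / 60.0
        e_string += 0.5 * (s0 + s1) * dt_h
        e_target += 0.5 * (g0 + g1) * dt_h
    return 0.0 if e_target == 0 else (e_target - e_string) / e_target


# Hypothetical hourly samples from 9:00 to 15:00 (minutes since midnight).
times = [540, 600, 660, 720, 780, 840, 900]
target_w = [500, 800, 1000, 1000, 1000, 800, 500]
string_w = [100, 300, 950, 950, 950, 300, 100]

full_day = windowed_loss(times, string_w, target_w, 540, 900)
around_noon = windowed_loss(times, string_w, target_w, 11 * 60, 13 * 60)
```

For these numbers the loss over the whole window is about 30%, while the loss between 11:00 and 13:00 is 5%, the kind of gap that points to shadowing rather than to a permanent defect.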
The figure reveals a strong reduction of the percentage losses, thus confirming that when the sun is lower in the sky the solar field is strongly subject to shadowing. However, significant gaps can still be seen around noon as well. In the proposed approach, those strings exhibiting more than a given percentage (set by the customer) of energy losses during a given interval of time (also set by the customer) are classified as malfunctioning. In other terms, the customer can decide to receive a real-time alarm if a string produces (for example) less than 50% of the target during one hour of observation or a daily alarm if the daily production is less than (say) 20%. The analysis shown in Figures 4 and 5 was repeated for all eight inverters of the field. The results are collected in Figure 6.
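A real-time alarm of the kind described above can be sketched with a sliding one-hour window. The function name, the use of plain sample sums as a proxy for energy (valid only for equally spaced samples), and the data are illustrative assumptions, not the implementation of the tool.

```python
def hourly_alarms(string_w, target_w, samples_per_hour=12, min_ratio=0.5):
    """Return the start indices of the one-hour windows in which the string
    produced less than min_ratio of the target energy.

    With equally spaced samples, sums of power samples are proportional to energy.
    """
    alarms = []
    for start in range(len(target_w) - samples_per_hour + 1):
        e_target = sum(target_w[start:start + samples_per_hour])
        e_string = sum(string_w[start:start + samples_per_hour])
        if e_target > 0 and e_string < min_ratio * e_target:
            alarms.append(start)
    return alarms


# Hypothetical data with four samples per hour to keep the example short:
# the string falls to 40% of the target halfway through the record.
target_w = [100.0] * 8
string_w = [100.0, 100.0, 100.0, 100.0, 40.0, 40.0, 40.0, 40.0]
alarms = hourly_alarms(string_w, target_w, samples_per_hour=4, min_ratio=0.5)
```

The alarm is raised only for the window that lies entirely within the degraded interval, which shows how the observation time set by the customer delays the warning but filters out short transients.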
It can be observed that there are large variations, both among the subarrays and within each subarray. In Table 1, the strings experiencing energy losses greater than 5% are reported. A cumulative loss of about 5 MWh/year can be attributed to them. The cumulative energy losses evaluated by considering all the strings (61) were about 10 MWh/year.
A further summary of the results is shown in Figure 7, which reports the average energy loss (with respect to the target) cumulated by each subarray; the standard deviation is also reported.
It is important to point out that each subarray had its own target; therefore the percentage losses shown in the figure are reliable estimators of the energy that proper maintenance could recover. The width of the standard deviation can be used to identify those subarrays containing outlier strings. For example, the large standard deviation found for inverter #7 is a clear warning about the presence of some strongly underperforming string, as confirmed by Figure 6(g) (inverter #7), which shows large losses attributable to string #5; the fact that losses strongly increase in December suggests that the string is subject to architectural shadowing. The above results are not trivial. It should be pointed out, indeed, that the analysis was fully automated. The tool can operate in blind mode on every kind of database containing string currents. Moreover, as reported in Table 1, strongly underperforming strings are precisely located, and the amount of energy loss can be used as a criterion for deciding the priority of maintenance.
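The use of the standard deviation as a warning at the subarray level can be sketched as follows. The loss values, the limit of 5 percentage points on the standard deviation, and the function name are hypothetical and serve only to illustrate the screening step.

```python
from statistics import mean, stdev


def summarize(losses_pct):
    """Mean loss, sample standard deviation, and worst string for each inverter.

    losses_pct: dict mapping inverter id -> dict mapping string id -> energy loss (%).
    """
    summary = {}
    for inverter, per_string in losses_pct.items():
        values = list(per_string.values())
        summary[inverter] = {
            "mean": mean(values),
            "std": stdev(values),
            "worst": max(per_string, key=per_string.get),
        }
    return summary


# Hypothetical yearly losses (%) for two subarrays.
losses = {
    1: {1: 2.0, 2: 3.0, 3: 2.5, 4: 3.5},
    7: {1: 2.0, 2: 3.0, 3: 2.0, 4: 3.0, 5: 40.0},
}
summary = summarize(losses)
suspect = [inv for inv, s in summary.items() if s["std"] > 5.0]
```

A subarray is flagged when its spread is large, and the string with the largest loss is reported as the first candidate for inspection.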

Vertical Analysis.
In the vertical analysis the performance of each solar string was monitored over a period of four years. This kind of analysis made it possible to quantify deterioration over time. Examples of vertical analyses are reported in Figure 8. The comparison of the measurements performed in October 2012 and October 2015 shows a dramatic deterioration of string #9. It is important to note that the target curves were almost identical; this observation made it possible to classify the string as faulty with confidence. In the current version, the procedure automatically classifies as "probably faulty" those strings exhibiting an increment of energy loss (with respect to the target) greater than 5% from one year to the next.
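The year-to-year classification rule can be sketched as follows. The sketch assumes that the 5% increment is expressed in percentage points of loss with respect to the target and that consecutive available years are compared; the function name and the loss values are hypothetical.

```python
def probably_faulty(loss_pct_by_year, max_increment=5.0):
    """Flag strings whose loss (% of target) grows by more than max_increment
    percentage points between two consecutive available years.

    loss_pct_by_year: dict mapping year -> dict mapping string id -> loss (%).
    Returns a dict mapping string id -> list of (previous_year, year) pairs.
    """
    years = sorted(loss_pct_by_year)
    flagged = {}
    for prev, curr in zip(years, years[1:]):
        for string_id, loss in loss_pct_by_year[curr].items():
            previous = loss_pct_by_year[prev].get(string_id)
            if previous is not None and loss - previous > max_increment:
                flagged.setdefault(string_id, []).append((prev, curr))
    return flagged


# Hypothetical losses (%) for two strings; one year is missing from the record.
history = {
    2012: {8: 2.0, 9: 4.0},
    2014: {8: 3.0, 9: 6.0},
    2015: {8: 3.5, 9: 40.0},
}
flagged = probably_faulty(history)
```

Only string 9 is flagged, for the 2014 to 2015 transition, while the slow drift of string 8 stays below the limit.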
For the sake of completeness, Figure 9 illustrates the cumulated energies corresponding to the P-T curves of Figure 8.
As can be seen, string #9 lost about 40% of the daily energy.
Summary results of the vertical analysis for all the subarrays are reported in Figure 10 (notice that results referring to 2013 were omitted because in that year the solar field was subject to a dramatic failure on the AC side).
As can be seen, in some cases (see, e.g., Figure 10(b), string #9) degradation occurs suddenly from one year to the next. This fact indicates some disruptive phenomenon occurring in the solar panels rather than a gradual worsening of the parameters. Another observation is that, in many cases, those strings which show a large deterioration from one year to the next (see, e.g., Figure 10(c), string #1) had been characterized by high losses since the first year of operation, thus suggesting the presence of defective solar panels, already prone to reliability issues. The above considerations can be better appreciated by analyzing degradation on a statistical basis. Let us consider Figure 11, which represents the percentage energy yield (with respect to the maximum allowed energy production given by the target) over time for the strings connected, for example, to inverter #4. The bars refer, respectively, to the best performing string, the worst performing string, and the average yield, the latter being evaluated by considering all the strings connected to inverter #4. From the figure it can be inferred that the decrease of the average value is almost entirely caused by the strong degradation of the worst string (as confirmed by Figure 10(d), string #9), thus confirming, again, that degradation is not a uniform phenomenon. As mentioned above, it is also interesting to note that the worst string had been underperforming since the first year; this observation suggests the hypothesis that disruptive phenomena take place because of poor components, which might cause damage propagation.
A summary referring to the entire field is reported in Figure 12. The figure shows the mean percentage energy yield (with respect to the target) evaluated by considering the whole solar field; the standard deviation is also reported (results referring to inverter #4 are shown for comparison). The analysis of the figure confirms that degradation is mainly a localized phenomenon. Indeed, the standard deviation increases with time as a result of the widening spread of individual performance.

How to Set the Alarm Threshold.
From the above discussion it follows that the threshold set for the alarms affects the number of generated alarms.
At first glance, the minimum applicable value of the threshold should be adopted so as to achieve the maximum fault detection sensitivity. Such a minimum value depends on the accuracy of the instrumentation used to perform the electrical measurements (voltage and current sensing). However, two drawbacks occur when the threshold is too low. First, the lower the threshold, the higher the possible occurrence of false positive alarms, especially if the value of the threshold is close to the measurement error. Second, a low threshold might flag as significant some events associated with negligible energy losses, where the concept of negligible should be related to the economic value of the losses and the expected cost of fixing the fault. Therefore, the threshold should be set so as to fulfill two constraints: it should be sufficiently higher than the measurement error (about 1% for the systems under investigation), and its value should select only energetically significant losses.

Figure 9: Vertical analysis of accumulated energies. Data refer to the P-T curves reported in Figure 8.
The value of 5% adopted in the above discussed experiments takes into account both the minimum amount of energy loss justifying a maintenance activity and the accuracy provided by the monitoring system.
The effect of a different choice is shown in Figure 13, where the number of alarms is shown as a function of the threshold for both the aforementioned horizontal and vertical analyses. As expected, the algorithm identifies about one fault per string if the threshold is 1%, while the number of alarms drops as the threshold value increases.
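The dependence of the number of alarms on the threshold can be sketched with a simple sweep. The loss values and the function name are hypothetical and are not the data plotted in Figure 13.

```python
def alarms_vs_threshold(losses_pct, thresholds_pct):
    """Number of strings whose energy loss (%) exceeds each threshold."""
    return {th: sum(1 for loss in losses_pct if loss > th) for th in thresholds_pct}


# Hypothetical yearly losses (%) for eight strings.
losses = [0.5, 1.2, 1.8, 2.5, 4.0, 6.0, 12.0, 48.0]
counts = alarms_vs_threshold(losses, [1.0, 5.0, 10.0, 20.0])
```

With a threshold of 1%, close to the measurement error, seven of the eight strings raise an alarm, while a threshold of 5% retains only the three strings with energetically significant losses.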
It is interesting to note from Figure 13(b) (referring to the vertical analysis performed on the 420 kWp field) that the number of alarms increases with time. This fact means that some strings deteriorate faster than others, with an acceleration from 2014 to 2015, as already observed with reference to Figure 10.

Conclusions
In this paper, a method for analyzing the data provided by PV field monitoring systems working at the single string level was presented. The method takes advantage of a site-dependent target definition for the power that each PV string could produce. The target inherently accounts for weather conditions and makes it possible to quantify the energy losses attributable to each underperforming PV string. As a case study, the data provided by the monitoring systems of two medium-size solar fields were analyzed. The analysis was conducted among the strings during a given year (horizontal analysis) and over time for a given string (vertical analysis). The horizontal analysis showed energy losses of up to 50% for some strings with respect to the reference and cumulative losses of about 10 MWh in a year, 5 MWh being attributable

Figure 3: (a) Actual and target P-T curves in Figure 1 and (b) corresponding energy versus time (E-T) curves.

Figure 6: Percentage of energy losses for all subarrays under investigation.

Figure 10: Vertical analysis of energy losses for four subarrays of the solar field.

Figure 11: Relative energy yield over time. The performance of both the best and worst strings is compared with the mean yield of the subarray and with the target.

Figure 13: Number of alarms versus threshold: (a) horizontal analysis and (b) vertical analysis.

Table 1: Yearly losses of the worst strings.