Regression is often conducted assuming independent model errors, and the standard methods for detecting atypical values in regression (leverage and influential points) rely on the same assumption. However, such independence could be unrealistic in geostatistics. In this article, we propose a methodology based on least squares and geostatistics to identify such values in spatial regression. Our procedure uses the hat matrix to detect leverage points. A modified Cook distance is then employed to confirm whether these points are influential. The methodology is evaluated with stationary and non-stationary geostatistical data. We apply this methodology to real georeferenced data related to depth, dissolved oxygen, and temperature. First, an autoregressive model is fitted to depth data. Second, a regression between oxygen and temperature is estimated. In both models, spatial correlation is taken into account when determining the parameters, leverage, and influential observations. Our methodology can be used in regression with geographical information to avoid misinterpreted results. Ignoring this information may lead to under- or over-estimation of geographical indicators, such as the mean depth, which can affect the circulation of water masses or dissolved oxygen variability. Our results reveal that including spatial dependence to identify high leverage points is relevant and must be considered in any geostatistical analysis.
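To make the idea of leverage and influence under spatially correlated errors concrete, the following is a minimal sketch and not the procedure of the article. It assumes a known exponential covariance model, uses the generalized least squares hat matrix H = X (X' Σ⁻¹ X)⁻¹ X' Σ⁻¹ for leverage, and computes a deletion-based generalization of Cook's distance by refitting without each observation. The function names, the covariance parameters, the simulated data, and the 2p/n flagging rule of thumb are all illustrative assumptions; the modified Cook distance defined in the article may differ.

```python
import numpy as np


def exponential_covariance(coords, sill=1.0, range_=1.0, nugget=0.0):
    """Assumed covariance model: sill * exp(-distance / range_) plus a nugget."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_) + nugget * np.eye(len(coords))


def gls_fit(X, y, Sigma):
    """Generalized least squares estimate and the matrix A = X' Sigma^{-1} X."""
    Si_X = np.linalg.solve(Sigma, X)       # Sigma^{-1} X
    A = X.T @ Si_X                         # X' Sigma^{-1} X
    beta = np.linalg.solve(A, Si_X.T @ y)  # A^{-1} X' Sigma^{-1} y
    return beta, A


def gls_leverage(X, Sigma):
    """Diagonal of the GLS hat matrix H = X A^{-1} X' Sigma^{-1}."""
    Si_X = np.linalg.solve(Sigma, X)
    A = X.T @ Si_X
    H = X @ np.linalg.solve(A, Si_X.T)
    return np.diag(H)


def gls_cook_distance(X, y, Sigma):
    """Deletion-based Cook-type distance under a known error covariance."""
    n, p = X.shape
    beta, A = gls_fit(X, y, Sigma)
    resid = y - X @ beta
    s2 = resid @ np.linalg.solve(Sigma, resid) / (n - p)
    D = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Refit without observation i, using the matching covariance submatrix.
        beta_i, _ = gls_fit(X[keep], y[keep], Sigma[np.ix_(keep, keep)])
        diff = beta - beta_i
        D[i] = diff @ A @ diff / (p * s2)
    return D


# Simulated example (hypothetical data, not the data of the article).
rng = np.random.default_rng(0)
n = 30
coords = rng.uniform(0.0, 10.0, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = exponential_covariance(coords, sill=1.0, range_=3.0)
errors = np.linalg.cholesky(Sigma) @ rng.normal(size=n)  # spatially correlated
y = X @ np.array([1.0, 2.0]) + errors

leverage = gls_leverage(X, Sigma)
cook = gls_cook_distance(X, y, Sigma)

# Common rule of thumb (an assumption here): flag leverage above 2p/n.
p = X.shape[1]
flagged = np.where(leverage > 2 * p / n)[0]
print("high-leverage candidates:", flagged)
print("their Cook-type distances:", cook[flagged])
```

When Σ is the identity matrix, both quantities reduce to the classical ordinary least squares leverage and Cook's distance, which gives a simple sanity check for the sketch.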
|Publication||International Journal of Geographical Information Science|
|Status||Accepted/In press - 2022|
|Published externally||Yes|