spsur: an R package for spatial seemingly unrelated regressions

Román Mínguez

University of Castilla La Mancha, Area of Statistics

Fernando A. López

Technical University of Cartagena

Jesús Mur

University of Zaragoza

2019-12-12

Abstract

In recent decades, several methodological improvements have been suggested to specify, estimate and validate Seemingly Unrelated Regression (SUR) models in a spatial framework. These new procedures allow us testing for the presence of spatial dependence, and estimate spatial models using different algorithms. Furthermore, new misspecification tests, singularly in a maximum likelihood framework, have also been developed. Unfortunately, no standard, friendly software is easily avaliable. This is the deficit that the spsur R package pretends to fill. The new package allows for the estimation of the most popular spatial econometric models by maximum likelihood or instrumental variable procedures. Moreover, spsur implements a collection of Lagrange Multipliers and Likelihood Ratios to test for misspecifications in the SUR. Additional functions allow for the estimation of the so-called spatial impacts (direct, indirect and total impacts) and also obtains random data sets, of a SUR nature, with the features decided by the user. An important aspect of spsur is that it operates both in a pure cross-sectional setting or in panel data sets. We include well-known examples in the applied literature on spatial data to illustrate the main functionalities of the new R package.

1 Introduction

Seemingly unrelated regression models (SUR from now on) are a type of multivariate econometric formulation very popular since the seminal papers of Zellner (1962), Malinvaud (1970) or Theil (1971). SUR have been used in many research areas in economics and others fields (for instance, see Fiebig 2007), especially in the topics of consumption, production and environmental economics, but also on economic growth, health economics, real estate or competition between local agents. Distinct mechanisms may result in the type of dependence that lies behind SUR models such as omitted variables, unobserved effects or common factors, with a dfferent impact for each equation but uniform among the individuals. It is clear that if the errors are connected, a SUR framework assures efficiency gains by estimating the system jointly rather than processing each equation separately. These gains explain the relevant position that SUR models occupy in the applied research agenda. Spatial econometrics is, surprisingly, an exception: the applied literature is very short in spite of a quite extensive methodological research.

Anselin (1988) introduced the term spatial SUR in reference to a model made of ‘an equation for each time period, which is estimated for a cross section of spatial units’ (p. 141). The approach of Anselin focuses on the problem of serial dependence in the errors of an equation, which is perfectly natural if the same agents intervene in every period. According to usual practice at the end of the eighties, the proposal of Anselin allows for a very limited heterogeneity. In fact, the regression coefficients are assumed to be the same across individuals, and unobserved effects are excluded. These constraints have been overcomed in the more recent works of Mur, López, and Herrera (2010) and Anselin (2016), where spatial and nonspatial parameters can take on different values in different cross-section.

Our case assumes that the model may contain G equations, with spatial structure. Additionally, it is assumed that the cross-sectional dimension (N, number of individuals) is large, and that the time dimension (Tm, number of periods) may increase. Rey and Montouri (1999), Egger and Pfaffermayr (2004), Le Gallo and DallErba (2006), Fingleton (2007), Moscone, Tosetti, and Knapp (2007), Le Gallo and Chasco (2008), Lauridsen et al. (2010), Kakamu, Polasek, and Wago (2012) or Bech et al. (2012) are well-known examples in this line. If the number of individuals is small and they are observed for a long time period (finite N and large Tm), the inference should be based on the time dimension, for example, specifying an equation for each spatial unit. Arora and Brown (1977), Hordijk and Nijkamp (1977) and White and Hewings (1982) follow the last approach.

A significant contribution is Wang and Kockelman (2007), who allow for greater heterogeneity among the individuals by introducing unobserved random effects. They extend the equation of the errors by including a mechanism of error components ‘a la KKP’, from Kapoor, Kelejian, and Prucha (2007), where the unobserved effects are intermingled with the random term of the spatial errors. This scheme is different from the proposal of Baltagi, Song, and Koh (2003), BSK, where the error is the sum of a spatial error mechanism plus an unobserved individual effect. Another point of interest in the work of Wang and Kockelman is that they built a standard SUR model, with G equations, Tm cross-sections and N individuals. Baltagi and Bresson (2011) extend the results of Wang and Kockelman by developing a collection of misspecification tests, under a maximum-likelihood, ML, framework, whereas Baltagi and Pirotte (2011) introduce generalized method of moments, GMM, in a spatial SUR with spatial errors, SEM, also ‘a la KKP’. Finally López, Mur, and Angulo (2014) address the problem of selecting the best specification for spatial SUR models, using a battery of Lagrange Multipliers obtained in a ML framework. Lundberg (2006), Zhou and Kockelman (2009), Cotteleer and Kooten (2012), Izón et al. (2016) and López, Martínez-Ortiz, and Cegarra-Navarro (2017) present nice examples in this direction.

From the discussion above, it is clear that there is a growing interest in the case of seemingly unrelated regressions in a spatial context, although the lack of specific software has slowed its adoption in applied research. Our purpose is to introduce a new R package, called spsur, to fill this gap. The preferred framework is a large N and an increasing Tm. The core of the package lies in ML estimation because, at this time, we value exactitude and reliability over quickness. There is no need to recall that the hypothesis of normality plays a key role in this approach, which may be difficult to admit in some cases. For that reason, and also to allow for speed of calculus in large samples, the ML approach is completed with a module devoted to IV procedures.

Among the limitations, let us note that we rely explicitly on the assumptions of linearity and dynamic relations are not allowed. This version of the package can remove unobserved individual effects from the equations, which are not estimated. Finally, we require the user to specify the weighting matrix, which is assumed exogenous, known, time invariant and constant across equations.

Second Section introduces notation and reviews some basic results in the literature on spatial SUR models, using the approach of Anselin. Section 3 presents the structure and some of the main functions included in spsur. Section 4 focusses on misspecification tests included in spsur. Section 5 is devoted to the issue of spatial multipliers whereas Section 6 presents the general case of a SUR model with G equations, Tm time periods and N individuals. Section 7 introduces an additional functionality that allows for simulation experiments. Finally, Section 8 concludes with a brief summary and prospects for the future.

2 SUR and Spatial SUR models

The SUR model consist of several equations, possibly with different regressors, where the error terms are correlated among equations. We focus on the Anselin’s case with a single equation, G=1, several time periods, Tm, and N individuals in the cross-section; we assume that N>Tm (note that the discussion is almost similar if Tm=1 and G>1). Our baseline model, without spatial effects, is the following: \[\begin{equation} {{\bf y}_t}={{\bf X}_t} {\boldsymbol{\beta} }_t+{\boldsymbol{\varepsilon}}_t\ ; \ \ \ \ \ \ E[{\boldsymbol \varepsilon}_t]=0\ ;\ \ \ \ \ E[{\boldsymbol \varepsilon}_t {\boldsymbol \varepsilon^{T}}_s]= \sigma_{ts} {\bf I}_N \ \ \ \ \ \ t,s=1,...,Tm \tag{2.1} \end{equation}\] where \({\bf y_t}\), \({\bf u}_t\) and \({\boldsymbol \varepsilon}_t\) are N\(\times\)1 vectors, \({\bf X}_t\) is a N\(\times p_t\) matrix, with \(p_t\) the number of regressors that appear in the t-th equation and \(\boldsymbol{\beta}_t\) the \((p_t \times 1)\) vector of coefficients. We call this model as SUR-SIM, from spatially independent model following the terminology in López, Mur, and Angulo (2014).

The singular aspect of (2.1) is that the SUR structure appears because there is intra-individual serial dependence; inter-individual serial dependence is excluded. The serial dependence in the errors is not parameterized, but estimated in the (Tm\(\times\)Tm) sampling covariance matrix, \(\boldsymbol \Sigma\)=\((\sigma_{ts})\).

The equation of (2.1) may be enough in some circumstances but not in a spatial setting where a potential problem, often present in applied research, is the existence of spatial dependence. The spatial mechanisms may have different forms (Elhorst 2014). A general SUR model that include most of the possible spatial effects of interest is:

\[\begin{equation} {\bf y}_t=\lambda_t {\bf W}_t^y {\bf y}_t+ {\bf X}_t {\boldsymbol \beta}_t+ {\bf W}_t^x {\bf X}_t^* {\boldsymbol \gamma}_t + {\bf u}_t;\ {\bf u}_t=\rho_t {\bf W}_t^u {\bf u}_t +{\boldsymbol \varepsilon}_t\ ; \ \ E[{\boldsymbol \varepsilon}_t]=0\ ;\ \ E[{{\boldsymbol \varepsilon}_t} {{\boldsymbol \varepsilon^{T}}_s}]=\sigma_{ts}{\bf I}_N \tag{2.2} \end{equation}\]

where \({\bf X}_t^*\) is the matrix of regressors excluded the intercept term; \({\bf W}_t^y\), \({\bf W}_t^x\), \({\bf W}_t^u\) are (N\(\times\)N) weighting matrices. The model in (2.2) is often referred to as a General Nesting Model, SUR-GNM. Several constrained specifications emerge from this equation, such as:

  1. The SUR-SIM, spatially independent model, \(\lambda_t = \rho_t = \gamma_t = 0, (\forall t)\).
  2. The SUR-SLX, spatial lags in Xs model, where \(\lambda_t = \rho_t = 0, (\forall t)\).
  3. The SUR-SLM, spatial lag model, where \(\rho_t=\gamma_t = 0, (\forall t)\).
  4. The SUR-SEM, spatial error model, where \(\lambda_t=\gamma_t = 0, (\forall t)\).
  5. The SUR-SDM, spatial Durbin model, where \(\rho_t = 0, (\forall t)\).
  6. The SUR-SDEM, spatial Durbin error model, where \(\lambda_t= 0, (\forall t)\).
  7. The SUR-SARAR, spatial lag model with autoregresive errors, where \(\gamma_t = 0, (\forall t)\).

In the applied literature, it is usual to assume that the weighting matrices are the same \({\bf W}_t^x\) = \({\bf W}_t^u\) = \({\bf W}_t^{\varepsilon}\) = \({\bf W}\). The user should supply this matrix, based on apriori knowledge, which, in any case, must conform to he requisites described in Kelejian and Prucha (2004): the terms on the main diagonal are zero and the row and column sums of \(\textbf W\) are uniformly bounded in absolute value, the same as \(\textbf A_{t}^{-1} =\mathop{\left({\bf I}_N -\lambda _{t} {\bf W}\right)}\nolimits^{-1}\) and \(\textbf B_{t}^{-1} =\mathop{\left({\bf I}_N -\rho _{t} {\bf W}\right)}\nolimits^{-1}\) which must exist.

The model of (2.2), for the case of G=1, can be expressed using obvious matrix notation:

\[\begin{equation} \begin{array}{ll} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ {\bf Ay}={\bf X}{\boldsymbol \beta} + ({\bf I}_{Tm} \otimes {\bf W}){\bf X^*} {\boldsymbol \gamma} +{\bf u}; \ \ {\bf Bu}={\boldsymbol \varepsilon}; \ \ {\boldsymbol \varepsilon} \sim N(0,{\boldsymbol \Omega} ) \\ \\ {{\bf y}=\mathop{\left[\begin{array}{c} {\mathop{\bf y}\nolimits_{1} } \\ {\mathop{\bf y}\nolimits_{2} } \\ {\cdots } \\ {\mathop{\bf y}\nolimits_{Tm} } \end{array}\right]}\limits_{TmN\times1} ; {\bf X} =\mathop{\left[\begin{array}{cccc} {\mathop{\bf X}\nolimits_{1} } & {0} & {\cdots } & {0} \\ {0} & {\mathop{\bf X}\nolimits_{2} } & {\cdots } & {0} \\ {\cdots } & {\cdots } & {\cdots } & {\cdots } \\ {0} & {0} & {\cdots } & {\mathop{\bf X}\nolimits_{Tm} } \end{array}\right]}\limits_{TmN\times p} ;\, \, {\boldsymbol \beta} =\mathop{\left[\begin{array}{c} {\mathop{{\boldsymbol \beta}}\nolimits_{1} } \\ {\mathop{{\boldsymbol \beta}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\boldsymbol \beta}}\nolimits_{Tm} } \end{array}\right]}\limits_{p\times 1} ;{\bf u}=\mathop{\left[\begin{array}{c} {\mathop{{\bf u}}\nolimits_{1} } \\ {\mathop{{\bf u}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\bf u}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmN\times1} ;\, \, {\boldsymbol \varepsilon} =\mathop{\left[\begin{array}{c} {\mathop{{\boldsymbol \varepsilon}}\nolimits_{1} } \\ {\mathop{{\boldsymbol \varepsilon}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\boldsymbol \varepsilon}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmN\times1} } \end{array} \tag{2.3} \end{equation}\]

where \(p=\sum p_t\), \({\bf A}={\bf I}_{TmN}-{\boldsymbol \Lambda} \otimes {\bf W}\) with \({\boldsymbol \Lambda}=diag(\lambda_1,\dots,\lambda_{Tm})\); \({\bf B}={\bf I}_{TmN}- {\boldsymbol \Gamma} \otimes {\bf W}\) with \({\boldsymbol \Gamma}=diag(\rho_1,\dots,\rho_{Tm})\) and \({\boldsymbol \Omega}={\boldsymbol \Sigma} \otimes {\bf I}_{TmN}\), being \(\otimes\) the Kronecker product. Moreover, \({\bf X}^*\) is a NTm \(\times\) (p-Tm) that omits the columns of the intercepts because these terms cannot be spatially lagged. Assuming that the errors are normally distributed, the log-likelihood function of (2.2) can be written as:

\[\begin{multline} l({\bf y};{\boldsymbol \eta})=-\frac{NTm}{2} \ln (2\pi)-\frac{N}{2} \ln |{\bf \Sigma}|+Tm\sum _{t=1}^{Tm}\ln|{\bf B}_t| +Tm\sum _{t=1}^{Tm}\ln|{\bf A}_t| \\ -\frac{\left({\bf Ay}-{\bf \bar{X}}[{\boldsymbol \beta} \ {\boldsymbol \gamma}]^{T} \right)^{T}{\bf B}^{T}\mathop{{\bf \Omega}}\nolimits^{-1} {\bf B}\left({\bf Ay}-{\bf \bar{X}}[{\boldsymbol \beta} \ {\boldsymbol \gamma}]^{T} \right)}{2} \tag{2.4} \end{multline}\]

where \(\bar{\textbf X}=[\textbf X,\bf W {\bf X}^*]\) is a NTm \(\times\) (2p-Tm).

The vector of parameters is \({\boldsymbol \eta}^{T}=\left[{\boldsymbol \beta}^{T};{\boldsymbol \gamma}^{T};\lambda_{1}; \dots ;\lambda_{Tm};\rho_{1};\dots ;\rho_{Tm} ;\mathop{\sigma }\nolimits_{ij} \right]\), where there are P= (2p-Tm)+Tm+Tm(Tm+1)/2 unknown coefficients Wang and Kockelman (2007) and López, Mur, and Angulo (2014) use numerical optimization techniques to solve the maximum likelihood estimates. Note that the presence of the log of the Jacobian terms, \(\ln|{\bf B}_t|\) and \(\ln|{\bf A}_t|\), forces the spatial parameters, \(\lambda_t\) and \(\rho_t\), to lie inside the so-called stability interval (if the weighting matrix is row-standardized, a rough approximation is -1,+1). Under the assumption that the spatial SUR model is correctly specified (including normality), the ML estimators are consistent, efficient and asymptotically normally distributed (Davidson and MacKinnon 1993).

2.1 Testing for spatial effects

The estimation of spatial SUR models can be complex for strong spatial dependence mechanisms, so it is advisable to avoid needless efforts. This is the purpose of the Lagrange Multipliers presented below. As usual in the spatial econometrics literature, our starting point is the SUR-SARAR model.

The first Multiplier is a global test for the presence of spatial effects in the SUR:

\[\begin{equation} H_0 :\lambda_t =\rho_t =0 \ \ (\forall t) \ \ \ \ vs \ \ \ \ H_A: No \ \ H_0 \tag{2.5} \end{equation}\]

obtained as:

\[\begin{equation} \mathop{{\rm LM}}\nolimits_{{\rm SARAR}}^{{\rm SUR}} = {\mathop{{\bf g}^{T}}\nolimits_{({\boldsymbol\eta})_{\left|\mathop{H}\nolimits_{0} \right. }}} \left[\mathop{{\bf I}}\nolimits_{({\boldsymbol\eta})_{\left|\mathop{H}\nolimits_{0} \right. }} \right]^{-1} {\mathop{{\bf g}}\nolimits_{({\boldsymbol\eta})_{\left|\mathop{H}\nolimits_{0} \right. }}} \mathop{\sim }\limits_{as} \mathop{\chi }\nolimits^{2} (2Tm) \tag{2.6} \end{equation}\]

where \(\mathop{{\bf g}}\nolimits_{({\boldsymbol\eta})_{\left|\mathop{H}\nolimits_{0} \right. }}\) is the score vector of order (Px1), evaluated under the null hypothesis. \(\mathop{\bf I}\nolimits_{({\boldsymbol \eta} )_{\left|\mathop{H}\nolimits_{0} \right. }}\) is the (PxP) information matrix of the SUR-SARAR model also under the null of (2.5); more details in Mur, López, and Herrera (2010).

If we change the model of the alternative, from the SUR-SARAR to the SUR-SLM, we can test for the significance of the spatial lags:

\[\begin{equation} H_0 :\lambda_t =0 \ \ (\forall t) \ \ \ \ vs \ \ \ \ H_A: No \ \ H_0 \tag{2.7} \end{equation}\]

The model of the null hypothesis continues to be the SUR-SIM, and it is implicitly assumed that the \(\rho\) parameters are zero. The discussion is entirely similar to the test of (2.6) in the SARAR case, just changing this model by the SLM. The corresponding Lagrange Multiplier is called \({\rm LM}_{\rm SLM}^{\rm SUR}\).

Similarly, a SEM model can be placed in the alternative, assuming that the \(\lambda\) parameters are zero. We can test for the presence of spatial structure in the errors:

\[\begin{equation} H_0 :\rho_t =0 \ \ (\forall t) \ \ \ \ vs \ \ \ \ H_A: No \ \ H_0 \tag{2.8} \end{equation}\]

The Lagrange Multiplier associated to the hypothesis of (2.8) is called \({\rm LM}_{\rm SEM}^{\rm SUR}\).

Bera and Yoon (1993) show that the last two raw Multipliers, \({\rm LM}_{\rm SLM}^{\rm SUR}\) and \({\rm LM}_{\rm SEM}^{\rm SUR}\), are oversized in case of misspecification of the alternative hypothesis. This lack of robustness complicates the identification of the correct specification. Let us note that this problem is not specific of a spatial setting but extends to other cases where there are groups of parameters whose scores are not orthogonal. The solution, as suggested in Anselin et al. (1996), is to robustify the raw Multipliers to obtain the so-called robust Lagrange Multipliers, denoted as \({\rm LM}_{\rm SLM}^{\rm *SUR}\) and \({\rm LM}_{\rm SEM}^{\rm *SUR}\) respectively. The size of the robust Multipliers is more balanced at the cost of a small decrease in power; in any case, they are very popular among practitioners in spatial data analysis.

2.2 The IV estimation for spatial SUR models

The ML approach offers an adequate framework for solving the inference of spatial SUR models. However, as the size of the cross-section, N, increases they become highly demanding in terms of computational effort due to the presence of the Jacobian matrices in the likelihoods. Under some circumstances, it may be reasonable to use other algorithms, such as Instrumental Variables, IV.

Durbin (1954) and Sargan (1958) introduced the IV approach as a method to deal with problems of endogeneity. This is the typical situation in a spatial model that includes lags of the explained variable among the regressors. For example, in the SUR-SLM case is inmediate to conclude that \(E\left[\bar{{\bf y}}^{T} {\bf u}\right]=tr\left[({\boldsymbol \Delta}^{-1})^{T} \left({\bf I}_{TmN} \otimes \textbf{W}^{T} \right) {\boldsymbol \Omega} \right] \neq 0\), being \({\boldsymbol \Delta}={\bf I}_{Tm} \otimes {\boldsymbol \Lambda} \otimes {\bf W}\) and \(\bar{{\bf y}}\) the spatial lag of the explained variable, \(\bar{{\bf y}} = \left[{\bf I}_{Tm} \otimes {\bf W} \right] {\bf y}\). Kelejian and Prucha (1998) and Kelejian and Prucha (1999) adapt the IV procedure to a spatial setting, advocating for the use of the so-called spatial instrumental variables, which is the approach implemented in spsur.

As known, a good instrument, say \({\bf Z}\), should combine two properties:

  • the instrument must be orthogonal with the error term, \(E\left[{\bf Z}^{T} {\bf u}\right]=0\)
  • the instrument should be highly correlated with the endogenous regressor \(E\left[{\bf Z}^{T} \bar{{\bf y}} \right] \neq 0\)

Kelejian and Prucha show that if the \({\bf X}\) variables are really exogenous, the spatial lags of these variables are good candidates to become IV for the spatial lag of the explained variable. However, our problem is not standard because we are working in a multivariate SUR framework, which lead us to a Three-Stage Least-Squares approach using spatial instrumental variables. The procedure is as follows:

  1. List the instruments in matrix \(\bf H\). By default, spsur sets \({\bf H}=[{\bf X},\left( {\bf I}_{Tm} \otimes {\bf W}\right){\bf X}^*], \left( {\bf I}_{Tm} \otimes {\bf W}^{2} \right){\bf X}^*]\) for the SUR-SLM and \({\bf H}=\left[{\bf X},\left( {\bf I}_{Tm} \otimes {\bf W}\right){\bf X}^*,\left( {\bf I}_{Tm} \otimes {\bf W}^{2} \right){\bf X}^*, \left( {\bf I}_{Tm} \otimes {\bf W}^{3} \right){\bf X}^*\right]\) for the SUR-SDM. This means that, in the SLM, \(\bf X\) variables instrument themselves, \(\bf WX^*\) and \(\bf W^2X^*\) serves as instrument for \(\bf Wy\); in the SDM case, \(\bf X\) and \(\bf WX^*\) instrument themselves and \(\bf W^2X^*\), \(\bf W^3X^*\) instrument \(\bf Wy\).

  2. Solve a Least-Squares, LS, regresion of the explained variable on the list of instruments to obtain an approximation to the optimal instruments, \(\hat{{\bf y}} = {\bf H}\left( {\bf H^{T}H} \right)^{-1} {\bf H^{T}} {{\bf y}}\)

  3. Substitute the spatial lag of the explained variable, \({\bf Wy}\), for its estimates \({\bf W\hat{y}}\) in the corresponding SUR equation. In this moment, the endogeneity problem has been corrected.

  4. The errors of the SUR model are not standard because they are correlated among the Tm equations. However, LS produces unbiased and consistent estimates of the \(\beta\), \(\theta\) and \(\lambda\) parameters. Accordingly, the second stage is to solve a LS regression with this group of variables.

  5. Using the LS residuals of the previous step, we obtain a consistent estimate of the covariance matrix of the SUR equations, \({\bf \hat{\Sigma}}_{g,h} = \frac{ \sum\limits_{t=1}^{Tm} \hat{{\bf u}^{T}}_{tg}\hat{{\bf u}}_{th}} {Tm}\), being \(\hat{{\bf u}}_{tg}\) the \((N \times 1)\) vector of LS residuals of the g-th equation in time period t (from now on we will use two subscrits for time and equation, respectively). This estimate is consistent with Tm.

  6. The last step is a Feasible Generalized Least-Squares estimate of the \(\beta\), \(\theta\) and \(\lambda\) parameters with the aim of improving their efficiency; that is:

\[\begin{equation} \hat{{\boldsymbol \eta}}_{3sls}^{*}=\left[{\bf \hat{X}}^{T}\left({\bf \hat{\Sigma}}^{-1} \otimes\ {\bf I}_{TmN} \right) {\bf \hat{X}}\right]^{-1} \left[{\bf \hat{X}}^{T}\left({\bf \hat{\Sigma}}^{-1} \otimes\ {\bf I}_{TmN}\right) {\bf y} \right] \tag{2.9} \end{equation}\]

being \({\boldsymbol \eta}^{*}=\left[{\boldsymbol \beta}; {\boldsymbol \theta}; {\boldsymbol \lambda}\right]^{T}\) and \(\bf \hat{X}\) the corresponding matrix of regressors of the third step described above.

In sum, the spsur3sls() function consists in a Three-Stages Least-Squares, 3SLS, algorithm based on spatial instrumental variables. This estimation procedure is almost linear, which expedites the computational effort. On the negative side, let us note that the estimation of the \(\lambda\) parameters is not restricted by the Jacobian term, which allows for estimates of \(\lambda\) outside the stability interval: \(\dfrac{1}{\lambda_{t}^{-}} < \lambda_{t} < \dfrac{1}{\lambda_{t}^{+}}\), being \({\lambda_{t}^{-}}\) and \({\lambda_{t}^{+}}\) the greatest negative and positive eigenvalues, respectively, of the weighting matrix. Moreover, the parameters associated with the spatial errors, \(\rho\)s, cannot be estimated using this algorithm.

2.3 Software review

The aim of spsur is to stimulate the use of spatial SUR models in applied research, by offering a friendly software easily accessible. As said, at present the user has very limited options, which has slowed the spreading of spatial SUR models.

In the field of quantitative analysis for spatial data, we can cite several R specialized packages. Probably, the most popular are spdep (Bivand, Pebesma, and Gomez-Rubio 2013) and spml (Millo, Piras, and others 2012). The spdep is an R package with high functionalities for econometric analysis of cross-sectional data whereas spml is more focused in the case of spatial panels. On the other hand, the R package systemfit (Henningsen, Hamann, and others 2007) is devoted to SUR models, but do not incorporate spatial effects. That is, none of the three packages have specific utilities to solve inference of spatial SUR models.

Indeed, to our knowledge, the closest antecedent of spsur is the beta version of Spacestat (Anselin 1999), called SpaceStat v.May91, which had limited functionalities and required Windows 98 (see video YouTube). Of course, nowadays it would be very difficult doing applied research using SpaceStat v.May91.

There is a second alternative in Python; it is the library pysal (Rey and Anselin 2007). The module spreg.sur provides SUR estimation of models with no spatial effects; the output includes several diagnostic measures among which appear the raw Lagrange Multpliers. Moreover, the codes spreg.sur_error and spreg.sur_lag provides the estimation of spatial SUR models with spatial errors and with spatial lags of the explained variable, respectively. The SUR-SEM model is estimated by ML whereas the SUR-SLM is estimated using spatial instrumental variables throug 3SLS. Finally, there is some code in MATLAB, related to the work of (López, Mur, and Angulo 2014), but the source is not free software and the codes are not fully integrated.

3 Introducing the spsur package

spsur is an open-source software for the R computing platform (R Core Team 2018). spsur depends on the packages Formula, MASS, Matrix, methods, numDeriv, minqa, sparseMVN, spdep and stats. The package can be freely dowloaded from CRAN and a development version is available in the link Github-spsur. The users will find a complete helping documentation and a vignette to guide them through the functionalities of the package.

spsur is made of 10 different functions which allow for a thoroughful analysis of a spatial SUR specification. Figure 1 shows the main functionalities of the package. spsur includes one function to evaluate the Lagrange Multipliers for the presence of spatial correlation in a SUR-SIM model (lmtestspsur()); two functions to estimate spatial SUR models, by maximum likelihood (spsurml()) or using instrumental variables (spsur3sls()) for the general case of G>1 (including both Tm=1 and Tm>1), and a third one designed to deal with pure spatial panels (spsurtime()) for G=1 and Tm>1; four additional functions allow to improve the specification (lrtestspsur(); wald_betas(), wald_deltas() and lr_betas_spsur()); another function obtains the spatial impacts of the regressors (impacts()). Finally dgp_spsur() allows the user to solve Monte Carlo experiments using spatial SUR models.

\label{Fig1} Main functionatities of spsur package

Figure 3.1: Main functionatities of spsur package

3.1 Datasets and baseline models

Two main data sets have been included in the package with the aim of showing the functionalities of spsur, namely, spc and NCOVR. Each data set consists of the corresponding data frame plus a weighting matrix that must be row-standardized; if the matrix is not row-standardized, then spsur does it.

Example 1: The first dataset, spc, constitutes a classical example taken from Anselin (1988) p. 203-211, for 25 counties in South-West Ohio and two time periods, 1981 and 1983; the associated weighting matrix is Wspc. This is a Phillips-Curve problem where the explained variable measures the changes in wage rates (WAGE), whereas the regressors are unemployment rate (UN), net-migration rate (NMR) and a dummy variable (SMSA) with a value of 1 for metropolitan counties.

The SUR model estimated by Anselin is: \[\begin{equation} \begin{array}{ll} WAGE_{83} = \beta_{10} + \beta_{11} \ UN_{83} + \beta_{12} \ NMR_{83} + \beta_{13} \ SMSA + \varepsilon_{83} \\ WAGE_{81} = \beta_{20} + \beta_{21} \ UN_{80} + \beta_{22} \ NMR_{80} + \beta_{23} \ SMSA+ \varepsilon_{81} \end{array} \tag{3.1} \end{equation}\]

The package is installed from the CRAN repository and the data set loaded as usual,

We use the Formula package [Zeileis and Croissant (2010) to especify the multiequational SUR model of (3.1):

Note that in the left side of the formula, two dependent variables have been included separated by the symbol |. In the right side appear the independent variables for each equation, separated again by a vertical bar |, and keeping the same order that in the left side.

Example 2: The second data set (NCOVR) comes from GeoDa Data and Lab collection and refers to homicide rates for 3,085 continental U.S. counties for four time periods (1960, 1970, 1980 and 1990), whose full description can be found in GeoDa. This data set includes a high number of socio-economic characteristics of the counties. The matrix \(\textbf W\) is included in spsur. This data set has been used by Baller et al. (2001) and collected via a non-gridded sampling design.

To illustrate the functionalities of spsur, we build a SUR model with three equations and different number of regressors in each equation: \[\begin{equation} \begin{array}{lll} HR_{80} = \beta_{10} + \beta_{11} \ PS_{80} + \beta_{12} \ UE_{80} + \varepsilon_{HR} \\ DV_{80} = \beta_{20} + \beta_{21} \ PS_{80} + \beta_{22} \ UE_{80} + \beta_{23} \ SOUTH + \varepsilon_{DV} \\ FP_{79} = \beta_{30} + \beta_{31} \ PS_{80}+ \varepsilon_{FP} \end{array} \tag{3.2} \end{equation}\] \(HR_{80}\) is homicide rates per 100,000 in 1980, \(PS_{80}\) measures the population structure in 1980, \(UE_{80}\) the unemployment rate, \(DV_{80}\) is the divorce rate, \(FP_{79}\) is the percentage of families below poverty in 1980 whereas SOUTH is a dummy variable for Southern counties. Note that the model of (3.2) has three equations but only one cross-section. To input this model into R we should write:

3.1.1 The LM tests in spsur package

The function lmtestspur() computes 5 Lagrange Multipliers, LM, for omitted spatial dependence in a SUR-SIM model. The syntax of this function is:

lmtestspsur(Form = NULL, data = NULL, W = NULL, X = NULL,Y = NULL, time = NULL, G = NULL, N = NULL, Tm = NULL, print_table = TRUE)

Note that, as the examples below show, most of the argument with NULL default can be omitted from the syntaxis, if they are not needed, so that the final expressions simplify to a great deal. The simplest way to obtain the LMs is through a formula, Form, previously defined by the user, a data frame, data, and a weighting matrix, W, such as:

spc contains information for only 25 spatial units and, therefore, the power of the LM tests is very low. Similarly, for the homicide rates case of (3.2), we obtain:

There is a second way to compute the LMs, based on matrices instead of formulas. In first place, it is necessary to create the requiered matrices; for example, according to model (3.2) we should write:

The last matrix, \(\bf X\) it is block-diagonal. Then, the user introduces these matrices as arguments in lmtestspsur(). The number of individuals (N), equations (G) and time periods (Tm) must also be specified,

In the example above, all the LMs reject their respective null hypotheses, which clearly discards the SUR-SIM model.

3.1.2 Maximum Likelihood estimation of spatial SUR models

If the LM tests reject their associated null hypothesis, a spatial SUR model must be estimate. The function spsurml() obtains the ML estimation of the different models listed in Section 2. The syntax of this fuction is:

spsurml(Form = NULL, data = NULL, R = NULL, r = NULL, W = NULL, X = NULL, Y = NULL, G = NULL, N = NULL, Tm = NULL, p = NULL, demean = FALSE, type = "sim", cov = TRUE, control = list(tol = 0.001, maxit = 200, trace = TRUE))

Similarly, most of the argument with NULL default can be omitted from the syntaxis, if they are not needed. Once again, the simplest way to estimate a spatial SUR is by using a formula, a data frame and a weighting matrix, which are the arguments that appear in Form, data and W. The second way implies the use of matrices, similar as before.

To estimate a SUR-SLM model with the data set of Example 1, the syntax is:

The output of spsurml() includes details about the iteration process and time of computation. Let us also note that the user can configure the optimization process using the control list, as an additional argument to the function. In this way, the tolerance of the convergence process can be set to tol; the same as the number of maximum iterations, maxit, whereas trace allows the user hide intermediate results.

Consistent with the conventions in the R environment, the summary() method prints the output equation by equation.

The summary includes basic information for each equation such as estimated coefficients, standard deviations, etc. Moreover, in a SUR context, the matrices of covariances and of correlations of the residuals among the Tm equations are of special interest for the user. The summary also reports the value of maximum log likelihood of the model, some measures of goodness of fit, the Breusch-Pagan test of diagonality, (Breusch and Pagan 1980) and the corresponding marginal LM tests, for omitted spatial errors in SLM or SDM models or omitted lags of the explained variable in SEM or SDEM models (more details in Section 4.2 below). The marginal multipliers are printed in the output simply as \(LMM\).

The syntax for other spatial specifications is very similar, just changing the argument type. For example:

For reasons of space, we skip the output for these other models, and continue using the SUR-SLM case.

As said before, the control options can be used to customize the iteration process. In the case below, the convergence process will stop if the difference in the estimated likelihood between two consecutive iterations is smaller than tol=0.1 or the number of iterations reaches the value maxit=20. The controls also advises to hidden the intermediate results, with the argument trace.

The second way to estimate a SUR-SLM model uses arguments based on matrices such as, for example:

The output of the funtion spsurml() is an object of an spsur class.

3.1.3 3SLS estimation

The function spsur3sls() estimates spatial SUR models by spatial IVs in a 3SLS framework. The syntax of this fuction is very similar to spsurml().

spsur3sls(Form = NULL, data = NULL, R = NULL, b = NULL, W = NULL,X = NULL, Y = NULL, G = NULL, N = NULL, Tm = NULL, p = NULL, demean = FALSE, type = "slm", maxlagW = 2)

The main differences between the two are: (i) the argument type now only supports two spatial SUR models: slm and sdm and (ii) a new argument is needed, maxlagW, to set the maximum order of spatial lags of the regressors to obtain the IVs. The default value is maxlagW=2 for SUR-SLM models and maxlagW=3 for the SDM variant.

The 3SLS algorithm is especially adequate in cases of large datasets or when spsur3sls() has to deal with very dense Jacobians. Our Example 2 includes 3,085 US counties and it is a good candidate for spsur3sls(). The code to estimate a SUR-SLM model for the SUR of (3.2) is,

NCOVRSUR.slm.3sls <-spsur3sls(Form = NCOVRformula, data = NCOVR, type = "slm", W = W)
#> Time to fit the model:  0.17  seconds
summary(NCOVRSUR.slm.3sls)
#> Call:
#> spsur3sls(Form = NCOVRformula, data = NCOVR, W = W, type = "slm")
#> 
#>  
#> Spatial SUR model type:  slm 
#> 
#> Equation  1 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_1  8.877617   1.307821  6.7881 1.206e-11 ***
#> PS80_1         0.892351   0.138898  6.4245 1.388e-10 ***
#> UE80_1        -0.060245   0.033200 -1.8146   0.06962 .  
#> lambda_1      -0.221472   0.191153 -1.1586   0.24664    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.1405 
#>   Equation  2 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_2 2.9935206  0.3798303  7.8812  3.61e-15 ***
#> PS80_2        0.2533626  0.0262643  9.6467 < 2.2e-16 ***
#> UE80_2        0.0914432  0.0095959  9.5294 < 2.2e-16 ***
#> SOUTH_2       0.0346575  0.0471028  0.7358    0.4619    
#> lambda_2      0.2124110  0.0900048  2.3600    0.0183 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.291 
#>   Equation  3 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_3  7.670117   1.095699  7.0002 2.734e-12 ***
#> PS80_3        -1.585681   0.160403 -9.8856 < 2.2e-16 ***
#> lambda_3       0.386346   0.087474  4.4167 1.014e-05 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.5611 
#>   Variance-Covariance Matrix of inter-equation residuals:                              
#>  45.377766  1.867308 18.961586
#>   1.867308  1.938112 -1.427426
#>  18.961586 -1.427426 34.820491
#> Correlation Matrix of inter-equation residuals:                                
#>  1.0000000  0.1991154  0.4770190
#>  0.1991154  1.0000000 -0.1737588
#>  0.4770190 -0.1737588  1.0000000
#> 
#>  R-sq. pooled: 0.3595

A great advantage of spsur3sls() is the serious reduction of computing time, compared with ML methods. In the case of NCOVR, with 3,085 observations and 3 equations, the estimation took less than a second. The ML estimation, using the function spsurml() for the same problem, required of almost 10 minutes (using a Intel Core i7 2.93 GHz).

Like in the case of spsurml(), the output of the funtion spsur3sls() is an object of the spsur class.

3.2 Spatial panel SUR models

It is not unusual, in applied work, to have data for only one equation, \(G\)=1, such as in Anselin (1988). We refer to this case as an Spatial panel SUR with, for example, a SLM structure: \[\begin{equation} {\begin{array}{ll} {\mathop{{\bf y}}}_{t} = \mathop{\lambda}_{t} {\bf W} \mathop{{\bf y}}_{t} +\mathop{{\bf x}}_{t} {\boldsymbol \beta}_{t} +\mathop{{\boldsymbol \varepsilon}}_{t}; {t=1,2,\ldots ,Tm} \\ {} \\ {E\left[\mathop{{\boldsymbol \varepsilon}}_{t} \right]=0} \ \ {E\left[\mathop{{\boldsymbol \varepsilon}}_{\mathop{t}} \mathop{{\boldsymbol \varepsilon^{T}}}_{\mathop{s}} \right]=\mathop{\sigma}_{t,s} {\bf I}_{N}} \end{array}} \tag{3.3} \end{equation}\]

Now \({\bf y}_t\) and \({\boldsymbol \varepsilon}_t\) are (N\(\times\)1) vectors and \(\textbf{x}_{t}\) in a \((N \times p_{t})\) matrix where \(p_{t}\) is the number of regressors in the t-th cross-section. The spatial parameters, \(\lambda\)s, as well as the \(\beta\)s are allowed to change from cross-section to cross-section. The SUR structure appears because there is serial dependence in the error terms of each individual in the sample; serial dependence among differente individuals is not allowed. The serial dependence is estimated in the (Tm\(\times\)Tm) \({\bf \Sigma}\) covariance matrix by the function spsurtime(). An important constraint to mention is that the serial dependence is assumed to be uniform for all individuals in the sample.

The syntax of spsurtime() combines arguments of spsurml() and spsur3sls() and it is quite transparent:

spsurtime(Form, data, time, type = "sim", method = "ml", maxlagW = 2, W = NULL, cov = TRUE, demean = FALSE, trace = TRUE, R = NULL, b = NULL)

The spatial models supported by spsurtime() are the seven specifications listed in Section 2, SUR-SIM, SUR-SLX, SUR-SLM, SUR-SEM, SUR-SDM, SUR-SDEM and SUR-SARAR. Moreover, the user can decide if the estimation algorithm is ML, which is the default, or 3SLS following the same treatment described in Section 3.

Finally, this function also allows to demean the data with the purpose of removing potential unobserved effects in the sample. We use the most popular transformation, which simply subtracts the time sampling average of each individual from the corresponding observation, using the demeaning \(\bf Q\) matrix: \({\bf Q}= {\bf Q}^{+} \otimes {\bf I}_{N}\), where \({\bf Q}^{+}=\left({\bf I}_{Tm}-\dfrac{1}{Tm}{\bf J}_{Tm}\right)\), with \({\bf J}_{Tm}={\boldsymbol \tau}_{Tm}{\boldsymbol \tau}_{Tm}^{T}\) and \({\boldsymbol \tau}_{Tm}\) a (Tm\(\times\)1)) vector of 1s. Note that the \(\bf Q\) matrix needs the data to be ordered, first, by time and second by individuals. If this is not the case, the data are routinely sorted by spsurtime(), according to the argument time.

Demeaning the data can produce strange results, in spatial SUR panels, if the time dimension, Tm, is very large and N, number of spatial units, is finite in which case we have a problem of high-dimensionality: we need to estimate \(\dfrac{Tm \times (Tm-1)}{2}\) covariance terms with N\(\times\)Tm observations. Moreover, in the general case where G>1, if Tm is very small and N is finite demeaning can produce problems of singularity in the covariance matrix because it inflates the correlations among the error terms by a proportionality factor of order \(O(\dfrac{1}{Tm})\), (Hsiao 2003). Thus, our advice is using demeaning for large N and moderate values of Tm in the spatial panel case and avoid demeaning if Tm is very small and N is moderate, in both cases.

3.2.1 A case of example

In this example we employ, once more, the data set NCOVR. We should remind that this data frame has not a panel data structure. So, the first question is to reshape NCOVR. This is the purpose of the indices index_time and index_indiv. The panel equation explains the homicides rates, HR, in each county in function of the population structure, PS, and the unemployment rate, UE; the equation has a SLM structure, which is estimated by 3sls.

data(NCOVR,package="spsur")
N <- nrow(NCOVR)
Tm <- 4
index_time <- rep(1:Tm, each = N)
index_indiv <- rep(1:N, Tm)
pHR <- c(NCOVR$HR60, NCOVR$HR70, NCOVR$HR80, NCOVR$HR90)
pPS <- c(NCOVR$PS60, NCOVR$PS70, NCOVR$PS80, NCOVR$PS90)
pUE <- c(NCOVR$UE60, NCOVR$UE70, NCOVR$UE80, NCOVR$UE90)
pNCOVR <- data.frame(indiv = index_indiv, time = index_time,
                      HR = pHR, PS = pPS, UE = pUE)
rm(NCOVR,pHR,pPS,pUE,index_time,index_indiv)
form_pHR <- HR ~ PS + UE
pHR_slm <- spsurtime(Form = form_pHR, data = pNCOVR, W = W,
                     time = pNCOVR$time, type = "slm", method = "3sls")
#> Time to fit the model:  0.33  seconds
summary(pHR_slm)
#> Call:
#> spsur3sls(W = W, X = X, Y = Y, G = G, N = N, Tm = Tm, p = p, 
#>     type = type, maxlagW = maxlagW)
#> 
#>  
#> Spatial SUR model type:  slm 
#> 
#> Equation  1 
#>                Estimate Std. Error t value Pr(>|t|)
#> (Intercept)_1  2.501017   1.645140  1.5202   0.1285
#> PS_1          -0.029887   0.105860 -0.2823   0.7777
#> UE_1           0.029898   0.046422  0.6441   0.5196
#> lambda_1       0.405674   0.392876  1.0326   0.3018
#> R-squared: 0.2843 
#>   Equation  2 
#>                Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)_2 1.8871930  1.9497433  0.9679  0.33310  
#> PS_2          0.1705918  0.1651996  1.0326  0.30179  
#> UE_2          0.0099723  0.0497241  0.2006  0.84105  
#> lambda_2      0.6971788  0.2942217  2.3696  0.01782 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.3604 
#>   Equation  3 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_3  6.881003   1.220794  5.6365 1.774e-08 ***
#> PS_3           0.658929   0.135891  4.8489 1.256e-06 ***
#> UE_3           0.183858   0.032266  5.6982 1.239e-08 ***
#> lambda_3      -0.172457   0.178288 -0.9673    0.3334    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.06152 
#>   Equation  4 
#>               Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_4 3.917895   0.369609 10.6001 < 2.2e-16 ***
#> PS_4          0.909703   0.119384  7.6200 2.723e-14 ***
#> UE_4          0.319217   0.042374  7.5332 5.295e-14 ***
#> lambda_4      0.023133   0.073747  0.3137    0.7538    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.134 
#>   Variance-Covariance Matrix of inter-equation residuals:                                    
#>  31.72416 20.95984 16.31181 13.23690
#>  20.95984 53.75722 26.29607 19.95032
#>  16.31181 26.29607 45.37777 19.69094
#>  13.23690 19.95032 19.69094 38.89755
#> Correlation Matrix of inter-equation residuals:                                        
#>  1.0000000 0.5075448 0.4299180 0.3768166
#>  0.5075448 1.0000000 0.5324158 0.4362851
#>  0.4299180 0.5324158 1.0000000 0.4686880
#>  0.3768166 0.4362851 0.4686880 1.0000000
#> 
#>  R-sq. pooled: 0.1608

The output reported by spsurtime() is quite detailed. Note that the ML estimation of the SUR-SLM panel equation takes almost 4 minutes while the 3sls requires of only 0.28 seconds.

4 Misspecification tests and other utilities

The estimation of spatial SUR models needs of a series of assumptions that are critical to guarantee the good properties of the (ML, 3SLS) estimates. Some of the tests required to build the spatial structure of the model have been presented in Section 2.1. Below we add a series of misspecification tests, implemented in spsur, that may help to improve the specification.

4.1 Testing for the diagonality of the covariance matrix

The problem is well known in the econometric literature: if the matrix is diagonal, the SUR structure collapses to a set of non-related equations that can be estimated separately. The null and alternative hypotheses are:

\[\begin{equation} \begin{array}{ll} H_0: {\bf \Sigma} = diag(\sigma_{11},...,\sigma_{TmTm}) \\ H_A: no \ \ H_0 \end{array} \tag{4.1} \end{equation}\]

Breusch and Pagan (1980) obtained a simple Lagrange Multiplier test, that is become very popular: \(\mathop{LM}\nolimits_{\boldsymbol \Sigma }^{} =N\sum _{t=1}^{Tm}\sum _{j=1}^{t-1}\mathop{r}\nolimits_{ij}^{2} = \chi^2_{Tm(Tm-1)/2}\). The correlation coefficients, \(\mathop{r}\nolimits_{ij}\), are obtained from the ML residuals corresponding to the Tm time periods, as said, estimated separately. Note that in the general case of G>1, Tm>1 and N>1, the covariance matrix is of order (G\(\times\)G). The LM test of diagonality is implement in the function spsurml() and printed routinely.

4.2 Marginal Lagrange Multipliers

The purpose of the Marginal Lagrange Multipliers is to check if there remains correlation in the error terms, once we have estimated a SUR-SLM model or if a spatial lag of the explained variable has been omitted, once we have estimated a SUR-SEM model (Anselin and Bera 1998). The null hypothesis for the first case is the same as that in (2.7), but now the hypothesis is tested in a SUR-SLM, not in a SUR-SIM model. The model of the alternative is a SUR-SARAR equation. Similarly, the null and alternative hypotheses of the second case are the same as those that appear in (2.8), which are tested in a SUR-SEM ML estimation. The model of the alternative is, once again, a SUR-SARAR.

Both tests are printed routinely in the output of the spsurml() function, denoted as \(LMM\). There is no risk of confusion: if we have estimated a SUR-SLM model (or a SUR-SDM) with spsurml(), \(LMM\) tests for omitted spatial errors and for omitted spatial lags in case of having estimated a SUR-SEM model (or a SUR-SDEM).

4.3 Likelihood Ratio tests

The models listed in Section 2 are related in a nesting sequence of restrictions. Given that spsurml() operates in a ML framework, we can easily obtain the corresponding Likelihood Ratios, LR, to compare all the competing models. This is the purpose of the function lrtestspsur(), whose use is very simple. In case of Example 1, the syntax to obtain the LR test is:

The output of the function only shows the value of the corresponding LRs and their associated pvalues, obtained from a \(\chi^2\) distribution with degrees of freedom equal to the number of restrictions in the respective null hypothesis. lrtestspsur() does not show the estimated models, although they are computed by the function.

The last LR in the example above, COMFAC, is known as the Common Factor test and occupies a relevant position into a Gets, General to specific, selection strategy of spatial models because the two most popular specifications, SLM and SEM, are not nested and cannot be directly related. The COMFAC offers the possibility of, indirectly, compare the two specifications through a spatial Durbin equation (Mur and Angulo 2006).

4.4 Testing for linear restrictions on the parameters

As described before, the functions spsurml() and spsur3sls() solve ML and 3SLS estimation of the spatial SUR models. Now we are interested in a strategy to test linear restrictions on the parameters of the model, which shoul rely on the asymptotic properties of both algorithms.

Lee (2004) showed that, in a pure cross-sectional setting, the ML estimates of the SLM model are \(\sqrt{N}\)-consistent. Under mild regularity conditions, they are also asymptotically normally distributed. These conclusions can be extended to other spatial specifications, such as those computed in spsur. As indicated in Elhorst (2014), the ML estimates maintain their good properties in a panel data framework, which now can be consistent with N, Tm or with both simultaneously (the exception is the recovery of the unobserved effects, that requires of a fixed N). In a SUR context, Oberhofer and Kmenta (1974) showed that the ML algorithm is also consistent, efficient and asymptotically normally distributed, with Tm. Similarly, Schmidt (1990) demonstrates the asymptotic good properties of the 3SLS estimates in multivariate models with endogenous regressors, assuming that all instruments are orthogonal with all error terms.

That is, in both cases it is possible to sustain the asymptotic normality, with Tm and/or N, of the corresponding estimates. In the case of ML, under the standard conditions described in Lee (2004) and Elhorst (2014), we can write that:

\[\begin{equation} \hat{{\boldsymbol \eta}} \underset{as}{\sim} N \left({\boldsymbol \eta}; {\bf V}({\boldsymbol \eta}) \right) \tag{4.2} \end{equation}\]

where \({\boldsymbol \eta}\) is the vector of parameters included in the spatial SUR model, as defined in (2.4) and \({\bf V}({\boldsymbol \eta})\) the corresponding covariance matrix of the ML estimates, obtained from the Information Matrix as described in López, Mur, and Angulo (2014). A similar result can be stated for the case of the 3SLS estimates, under the conditions of orthogonality stated previouly (the details on the covariance matrix can be found in Chapter 7 of Wooldridge (2010)). Let us note that the \(\rho\) and \(\sigma\) parameters are excluded from the inference in the 3SLS approach.

Using the asymptotic distribution of (4.2), it is immediate to test for any kind of linear restriction on the vector of parameters of the model, such as exclusion restrictions or across-equation linear restrictions. The procedure is as usual. First we need to write down the (\(r \times p\)) matrix \(\bf R\), with rank r, that captures the linear restrictions:

\[\begin{equation} H_0:\, \mathop{\ \ \ \textbf{R} {\boldsymbol \eta}} = \textbf{b} \ \ \ \textit{vs} \ \ \ H_{A} :\, \mathop{\ \ \ \textbf{R} {\boldsymbol \eta}} \neq \textbf{b} \tag{4.3} \end{equation}\]

The Wald statistic for testing (4.3) is

\[\begin{equation} W = \left( \textbf{R} \hat{{\boldsymbol \eta}} - \textbf{b} \right)^{T} \left(\textbf{R} \hat{{\bf V}}\textbf{R}^{T}\right)^{-1} \left(\textbf{R}\hat{{\boldsymbol \eta}} - \textbf{b} \right) \underset{as}{\sim} \chi_r^2 \tag{4.4} \end{equation}\]

With slight variations, we obtain a similar result for the 3SLS algorithm.

As a complement to the Wald tests, spsur offers the possibility to test linear restrictions on the \(\beta\) parameters through a Likelihood Ratio. Of course, this option is restricted to ML estimates and it is a bit more costly than the Wald test because we need to estimate two models, the model of the null hypothesis and the model of the alternative hypothesis.

4.4.1 Illustration

spsur includes two functions to test linear restrictions on the parameters of the model. The first, is the function wald_betas(), directed at testing linear restrictions on the \(\beta\) parameters whereas the second, wald_deltas(), evaluates linear restrictions on the spatial coefficients (\(\lambda\) or \(\rho\) parameters). Both functions operate with the Wald statistic of (4.4). As said before, the linear restrictions on the \(\beta\) parameters can also be evaluated through a LR test using the function lr_betas_spsur().

wald_betas() operates with the \(\beta\) parameters. The function needs three arguments. The first is an object of spsur class obtained with the functions spsurml() or spsur3sls(), the second is a matrix that must reflect the restrictions on the \(\beta\) parameters and the third is a vector with the values of the restrictions, under the null hypothesis. Let us assume that we need to test whether the two intercepts, in the two equations of the SUR-SLM model of Example 1, are equal:

\[\begin{equation} H_0: \beta_{10}=\beta_{20} \ \ \ \ vs \ \ \ \ H_A: \beta_{10} \neq \beta_{20} \tag{4.5} \end{equation}\]

The code for this problem is:

If, as in the example above, the wald_betas() leads to accept the restrictions, the user, probably, would want to estimate the constrained version of the corresponding model, such as:

\[\begin{equation} \begin{array}{ll} WAGE_{83} = \beta_{0} + \beta_{11} \ UN_{83} + \beta_{12} \ NMR_{83} + \beta_{31} \ SMSA + \lambda_{1} WAGE_{83} + \varepsilon_{83} \\ WAGE_{81} = \beta_{0} + \beta_{21} \ UN_{80} + \beta_{22} \ NMR_{80} + \beta_{32} \ SMSA + \lambda_{2} WAGE_{81} + \varepsilon_{81} \end{array} \tag{4.6} \end{equation}\]

spsurml() and spsur3sls() provide restricted ML and 3SLS estimation for the spatial SUR models through the optional arguments R=R1 and b=b1 in the description of the functions. The code, for the example that we are considering is,

Note that in the output only the intercept is in the first equation but is the same for all the equations.

The purpose of the function, wald_deltas(), is to obtain a Wald test to evaluate linear restrictions on the spatial parameters of the model, \(\lambda\) or \(\rho\) parameters. The use of the function is totally analogous to wald_betas(), and extends to ML and 3SLS algorithms. For example, in the SUR-SLM of (4.6) the user may be interested in testing if the \(\lambda\) parameters of the two equations are equal, that is:

\[\begin{equation} H_0: \lambda_1=\lambda_2 \ \ \ \ vs \ \ \ \ H_A: \lambda_1 \neq \lambda_2 \tag{4.7} \end{equation}\]

lr_betas_spsur() operates only with objects produced by spsurml(), which correspond to ML estimates of spatial SUR models. For the previous example, the code is simply:

5 Direct, Indirect and Total impacts

According to LeSage and Pace (2009), a major difference between times series models and spatial models consists in the interpretation of the \(\beta\) parameters. In a time series model, the \(\beta\) parameters have a unequivocal meaning: they measure the expected increase in the explained variable as a consequence of a unitary change in a given regressor. This impact is constant across time, \(\dfrac{\partial y_{t}}{\partial x_{t}} = \beta_{k}\), and its significance can be checked by a simple tratio.

Things are not so clear in a spatial setting because of the feed-back effects. In fact, a change in a regressor, in a given location, will induce a chain of reactions that will spill over into the space, depending on the weights matrix, \(\bf W\). Consider, for example, the reduced form of a SUR-SARAR model:

\[\begin{equation} {\bf y}={\bf A}^{-1}{\bf X}{\boldsymbol \beta}+{\bf A}^{-1}{\bf B}^{-1}{\boldsymbol \varepsilon}={\bf I}_{Tm}\otimes\left[{\bf I}_{N}+{\boldsymbol \Lambda} \otimes {\bf W}+{\boldsymbol \Lambda}^{2}\otimes {\bf W}^{2}+\ldots \right] {\bf X} {\boldsymbol \beta}+{\bf A}^{-1}{\bf B}^{-1}{\boldsymbol \varepsilon} \tag{5.1} \end{equation}\]

The expression above shows that a change in the value of a regressor, in certain point in space, will impact the explained variable in the same location and also in other points of space. Note that \({\bf A}\) is a block-diagonal matrix whose non-zero matrices are \({\bf A}_{t}={\bf I}_{N}-\lambda_{t}\textbf{W}; t=1,2, \ldots Tm\). The inverse of \({\bf A}_{t}\) admits a power expansion on the weights matrix, , and the spatial parameters \(\lambda_{t}\), which lead us to the result in (5.1).

Turning to the general case of G>1 and Tm>1 and using the notation in LeSage and Pace (2009), we may write the g-th equation of (5.1), in period t, as:

\[\begin{equation} {\mathop{{\bf y}}\nolimits_{tg} }=\mathop{\left[\begin{array}{c} {\mathop{y}\nolimits_{tg1} } \\ {\mathop{y}\nolimits_{tg2} } \\ {\cdots } \\ {\mathop{y}\nolimits_{tgN} } \end{array}\right]} = \sum_{h=1}^{p_{q}} \mathop{\left[\begin{array}{cccc} {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{11}} & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{12}} & {\cdots } & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{1N}} \\ {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{21}} & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{22}} & {\cdots } & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{2N}} \\ {\cdots } & {\cdots } & {\cdots } & {\cdots } \\ {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{N1}} & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{N2}} & {\cdots } & {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{NN}} \end{array}\right]} \mathop{\left[\begin{array}{c} {\mathop{x}\nolimits_{tg1} } \\ {\mathop{x}\nolimits_{tg2} } \\ {\cdots } \\ {\mathop{x}\nolimits_{tgN} } \end{array}\right]}+ {\mathop{{\bf A}}\nolimits_{g}^{-1}} {\mathop{{\bf B}}\nolimits_{g}^{-1}} \mathop{\left[\begin{array}{c} {\mathop{\varepsilon}\nolimits_{tg1} } \\ {\mathop{\varepsilon}\nolimits_{tg2} } \\ {\cdots } \\ {\mathop{\varepsilon}\nolimits_{tgN} } \end{array}\right]} \tag{5.2} \end{equation}\]

being \({\mathop{{\bf S}}\nolimits^{h}_{tg}({\bf W})}= {\mathop{\beta}\nolimits_{h}} {\mathop{{\bf A}}\nolimits_{g}^{-1}}\) and \({\mathop{S}\nolimits^{h}_{tg}({\bf W})_{ij}}\) its \(i,j-th\) element. From above, it is clear that the value of the partial derivative of \(y\) with respect to \(x_{h}\) depends on the point in space that we are impacting; that is: \(\dfrac{\partial y_{itg}}{\partial x_{h_{jtg}}} = {\mathop{S}\nolimits^{h}_{tg}({\bf W})_{ij}}\). Moreover, it is evident, form (5.2), that a shock in a point will also spread out through the space, but the discussion about spatial multipliers excludes the case of impacting the error terms.

Specifically, for each equation and every regressor, we can evaluate three effects, such as:

\[\begin{equation} \begin{array}{c} {\mathop{{\bf M}}\nolimits_{Total_{g}^{h}}} = \dfrac{1}{N} \tau^{T}_{N} {\bf S}^{h}_{.g} \tau_{N} \ \ \ \\ {\mathop{{\bf M}}\nolimits_{Direct_{g}^{h}}} = \dfrac{1}{N} tr {\bf S}^{h}_{.g}\\ {\mathop{{\bf M}}\nolimits_{Indirect_{g}^{h}}} = {\mathop{{\bf M}}\nolimits_{Total_{g}^{h}}} -{\mathop{{\bf M}}\nolimits_{Direct_{g}^{h}}} \\ \\ h=1,\ldots p_{g}; g=1 \ldots G \end{array} \tag{5.3} \end{equation}\]

being \(\tau_{N}\) a N\(\times\)1 unitary vector and \({\bf S}^{h}_{.g}= \sum_{t=1}^{Tm} {\bf S}^{h}_{tg}({\bf W})\).

Under the assumption that the spatial parameters and the weighting matrices are constant across time, the partial derivatives in (5.2) are also constant. Futhermore, it is important to recall that the models in spsur have no time dynamics, so the impact of a modification in a regressors must be absorved in the same time period.

Finally, we can aggregate the multipliers in (5.3) to obtain overall measures for each variable:

\[\begin{equation} \begin{array}{c} {\mathop{{\bf M}}\nolimits_{Total_{\bullet}^{h}}} = \sum_{g=1}^{G} {\mathop{{\bf M}}\nolimits_{Total_{g}^{h}}} \\ {\mathop{{\bf M}}\nolimits_{Direct_{\bullet}^{h}}} = \sum_{g=1}^{G} {\mathop{{\bf M}}\nolimits_{Direct_{g}^{h}}} \\ {\mathop{{\bf M}}\nolimits_{Indirect_{\bullet}^{h}}} = \sum_{g=1}^{G} {\mathop{{\bf M}}\nolimits_{Indirect_{g}^{h}}} \\ \\ h=1,\ldots p_{g} \end{array} \tag{5.4} \end{equation}\]

The multipliers of (5.4) should be accompanied by measures of statistical significance to be useful for the user. LeSage and Pace (2009) suggest two possibilities. First is the Bayesian Markov Chain Monte Carlo method, MCMC, which draws samples of parameters from their posterior distribution and computes random sequences of multipliers, to calibrate the estimates. However, to our knowledge, there are no results available for spatial SUR models using MCMC. The second alternative is bootstrapping. In the case of spsur, we prefer a parametric bootstrap, which means that:

  • We estimate the spatial SUR model by ML.
  • We draw B samples of random vectors of the parameters in our spatial SUR, \({\boldsymbol \eta}\), using a multivariate normal distribution, \(N \left(\hat{{\boldsymbol \eta}}; \hat{{\bf V}}(\hat{{\boldsymbol \eta}})\right)\), where \(\hat{{\boldsymbol \eta}}\) is the ML estimation of the parameters and \(\hat{{\bf V}}(\hat{{\boldsymbol \eta}})\) their corresponding estimated covariance matrix. Let denote by \({\boldsymbol \eta}_b\) the b-th draw.
  • We re-evaluate the effects using the boostrapped vectors \({\boldsymbol \eta}_b\), maintaining the data of the regressors. In this way, a sequence of values: \({\mathop{M}\nolimits_{Total_{g}^{h^{(b)}}}}\), \({\mathop{M}\nolimits_{Direct_{g}^{h^{(b)}}}}\) and \({\mathop{M}\nolimits_{Indirect_{g}^{h^{(b)}}}}\) for \(b=1, \ldots B\) is obtained.
  • Finally, we calculate the corresponding measure of dispersion as the standard deviation of the series of estimated multipliers:

\[\begin{equation} {\mathop{\sigma}\nolimits_{M_{Effect_{g}^{h}}}} = \sqrt{ \dfrac{1}{B} \sum_{b=1}^B \left( \mathop{M}\nolimits_{Effect_{g}^{h^{(b)}}} - \overline{\mathop{M}\nolimits_{Effect_{g}^{b}}} \right)^{2}} \end{equation}\]

with Effect=Total, Direct, Indirect.

5.0.1 A case of example

The discussion about the multipliers is solved quite easily in spsur through the function impacts(), which has only two arguments: an object of class spsur and the number of simulations to estimate the significance measures, nsim, whose default value is 1000.

Below we continue with the case of Example 1 where a SUR-SLM model was specified for the data frame spc. Let us recall that this is a spatial Phillips curve explaining unemployment, UN, through net migration, NMR, and a dummy variable, SMSA, for metropolitan counties. The code is very simple

The output of the function are three tables of impacts for each regressor, with their respective measures of statistical significance. As indicated, the user can specify the number of draws for the measures of significance. See the help of the function for more examples using the database NCOVR.

6 A general case: G>1, Tm>1 and N>1

The framework of the previous Sections can be generalized quite easily to a case where the user has data for more than one single equation; that is, G>1. The equivalent to the GNM model of expression (2.2) is:

\[\begin{align} {\bf y}_{tg} &= \lambda_{g} {\bf W}{\bf y}_{tg} + {\bf X}_{tg} {\boldsymbol \beta}_{g} + {\bf W} {\bf X}_{tg}^* {\boldsymbol \gamma}_{g} + {\bf u}_{tg} \\ {\bf u}_{tg} &= \rho_{g} {\bf W} {\bf u}_{tg} + {\boldsymbol \varepsilon}_{tg} \nonumber \tag{6.1} \end{align}\]

\[\begin{equation*} E \left[ {\boldsymbol \varepsilon}_{tg} \right] = 0 \quad E \left[ {\boldsymbol \varepsilon}_{tg} {\boldsymbol \varepsilon}^{T}_{sh} \right] = \left\{ \begin{array}{ll} \sigma_{gh} {\bf I}_{N} & t = s \\ {\bf 0}_{N} & t \ne s \\ \end{array} \right. \end{equation*}\]

for g=1,\(\ldots\)G; t=1,\(\ldots\)Tm. The SUR nature of the model appears in the form of contemporaneous correlation among the errors of the different equations, corresponding to the same individual. This correlation is zero for errors belonging to equations of different individuals and/or different time periods. Serial correlation is only allowed in Anselin’s approach (treated with the function spsurtime(), as shown before). If we stack the G vectors of errors corresponding to period t in a (NG\(\times\)1) vector \({\boldsymbol \varepsilon}_{t}\), its covariance matrix can be written as \(E\left[\mathop{{\boldsymbol \varepsilon}}\nolimits_{t} \mathop{{\boldsymbol \varepsilon}}\nolimits_{t}^{T} \right]= {\boldsymbol \Sigma} \otimes \mathop{{\bf I}}\nolimits_{N}\); this is a (NG\(\times\)NG) matrix. In continuation, we can stack these Tm vectors into a general (TmNG\(\times\)1) vector by ordering the observations, first, by time period, then by equation and finally by individuals. Using obvious matrix notation, we obtain:

\[\begin{equation} \begin{array}{l} \left. \begin{array}{l} {\, \begin{array}{ll} {\bf Ay}={\bf X}{\boldsymbol \beta} +({\bf I}_{Tm} \otimes {\bf I}_{G} \otimes {W}) {\bf X}^* {\boldsymbol \gamma}+{\bf u} & {} \\ {{\bf Bu}={\boldsymbol \varepsilon};\, \, {{\boldsymbol \varepsilon} \sim (0,{\boldsymbol \Omega })} } \end{array}} \end{array}\right\} \\ {} \\ {\bf y}=\mathop{\left[\begin{array}{c} {\mathop{{\bf y}}\nolimits_{1} } \\ {\mathop{{\bf y}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\bf y}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmGN\times1} ; \, \, \, {\bf X}=\mathop{\left[\begin{array}{c} {\mathop{{\bf X}}\nolimits_{1} } \\ {\mathop{{\bf X}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\bf X}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmGN\times p} ; \, \, \, {\bf X}^*=\mathop{\left[\begin{array}{c} {\mathop{\bf X}^*_{1}} \\ {\mathop{\bf X}^*_{2}} \\ {\cdots } \\ {\mathop{\bf X}^*_{Tm}} \end{array}\right]}\limits_{TmGN\times (p-G)} ; \, \, \, {\bf u}=\mathop{\left[\begin{array}{c} {\mathop{{\bf u}}\nolimits_{1} } \\ {\mathop{{\bf u}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\bf u}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmGN\times 1} ; \, \, \, {\boldsymbol \varepsilon} =\mathop{\left[\begin{array}{c} {\mathop{{\boldsymbol \varepsilon}}\nolimits_{1} } \\ {\mathop{{\boldsymbol \varepsilon}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\boldsymbol \varepsilon}}\nolimits_{Tm} } \end{array}\right]}\limits_{TmGN\times 1};\\ {} \\ \mathop{{\bf X}}\nolimits_{t} =\mathop{\left[\begin{array}{cccc} {\mathop{{\bf X}}\nolimits_{1t} } & {0} & {\cdots } & {0} \\ {0} & {\mathop{{\bf X}}\nolimits_{2t} } & {\cdots } & {0} \\ {\cdots } & {\cdots } & {\cdots } & {\cdots } \\ {0} & {0} & {\cdots } & {\mathop{{\bf X}}\nolimits_{Gt} } \end{array}\right]}\limits_{GN\times p} ; \mathop{{\bf X}}^*_{t} =\mathop{\left[\begin{array}{cccc} \mathop{{\bf X}}^*_{1t} & {0} & {\cdots } & {0} \\ {0} & \mathop{{\bf X}}^*_{2t} & {\cdots } & {0} \\ {\cdots } & {\cdots } & {\cdots } & {\cdots } \\ {0} & {0} & {\cdots } & \mathop{{\bf X}}^*_{Gt} \end{array}\right]}\limits_{GN\times (p-G)}; {\boldsymbol \beta} =\mathop{\left[\begin{array}{c} {\mathop{{\boldsymbol \beta}}\nolimits_{1} } \\ {\mathop{{\boldsymbol \beta}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\boldsymbol \beta}}\nolimits_{G} } \end{array}\right]}\limits_{p\times 1} ; {\boldsymbol \gamma} =\mathop{\left[\begin{array}{c} {\mathop{{\boldsymbol \gamma}}\nolimits_{1} } \\ {\mathop{{\boldsymbol \gamma}}\nolimits_{2} } \\ {\cdots } \\ {\mathop{{\boldsymbol \gamma}}\nolimits_{G} } \end{array}\right]}\limits_{(p-G)\times 1} ; \end{array} \tag{6.2} \end{equation}\]

where \({\bf A}={\bf I}_{Tm} \otimes \left[{\bf I}_{GN} -{\boldsymbol \Lambda} \otimes {\bf W}\right] = {\bf I}_{Tm} \otimes diag\left\{{\bf A}_{g}; g=1,2,..,G\right\}\), and \({\bf B}={\bf I}_{Tm} \otimes \left[{\bf I}_{GN} -{\boldsymbol \Upsilon } \otimes {\bf W}\right]=diag \left\{{\bf B}_{g} ;g=1,2,..,G\right\}\), \({\boldsymbol \Lambda}\) and \({\boldsymbol \Upsilon}\) are two (G\(\times\)G) diagonal matrices containing the spatial parameters \({\lambda}_{g}\) and \({\rho}_{g}\), respectively. Moreover \({\boldsymbol \Omega} ={\bf I}_{Tm} \otimes {\boldsymbol \Sigma} \otimes {\bf I}_{N}\). The same as before, we allow for a different number of regressors in the right hand side of each equation, \(p_{g}\), where the total number of \(\beta\) parameters is: \(p=\Sigma_{g=1}^{G} p_{g}\) and p-G is the total number of \(\gamma\) parameters.

Assuming that the errors are normally distributed, the log-likelihood function of the spatial GNM-SUR model of (6.2) can be written as:

\[\begin{align} l({\bf y};{\boldsymbol \eta}) =& -\frac{NTmG}{2} \ln \left(2\pi \right)-\frac{NTm}{2} \ln \left|{\boldsymbol \Sigma} \right|+Tm\left\{\sum _{g=1}^{G}\ln \left|{\bf B}_{g} \right| +\sum _{g=1}^{G}\ln \left|{\bf A}_{g} \right| \right\} \\ & -\frac{\left\{{\bf Ay}-[{\bf X};{\bf X}_{{\bf W}}^*] {\boldsymbol \Pi} \right\}^{T}{\bf B}^{T}\mathop{{\boldsymbol \Omega} }\nolimits^{-1} {\bf B}\left\{{\bf Ay}-[{\bf X};{\bf X}_{{\bf W}}^*]{\boldsymbol \Pi} \right\}}{2} \tag{6.3} \end{align}\]

where \({\boldsymbol \eta}^{T}=\left[{\boldsymbol \Pi}^{T}; \mathop{\lambda }\nolimits_{1} ;\cdots ;\mathop{\lambda }\nolimits_{G} ;\mathop{\rho }\nolimits_{1} ;\cdots ;\mathop{\rho }\nolimits_{G} ;\mathop{\sigma }\nolimits_{ij} \right]\) is the vector parameters of the model, being \({\boldsymbol \Pi}=[{\boldsymbol \beta}; {\boldsymbol \gamma}]^{T}\). The discussion that continues from this point is totally analogous to the spatial SUR panel analyzed so far, thus we leave it now to avoid repetitions.

The example below shows the case of a SUR-SLM model, for NCOVR, where G=3, Tm=4 and N=3,085, including the reshaping of a database into a classical panel:

The estimation of a SLM model with G=3, Tm=4 and N=3,085 can be done as usual using the spsurml() function (to prevent overflows of memory we set cov argument equals to FALSE). The last sentence of the code just print the vector of estimates of the \(\lambda\) and \(\beta\) parameters and the correlation matrix among the residuals of the three equations:

7 How to generate random SUR datasets

This section describes an additional functionality included in spsur, which offers the possibility of generate random data sets with the dimensions and spatial structure decided by the user. This is the purpose of dgp_spsur(). The function has great flexibility to accomodate the necessities of the user and may be of interest in two different situations: (i) as part of a larger simulation experiment where the user needs random data sets with specific features (SUR nature, spatial structure, etc.) or (ii) with the aim of showing specific properties of spatial SUR models and inferential procedures related to them.

dgp_spsur() is a Data Generating Process where the user decides the dimensions of the required data set, specificies the values of the parameters that intervene in the corresponding SUR model and selects the distribution function from which the regressors and the random terms are to be drawn. The syntax of the function is the following:

dgp_spsur <- function(Sigma, Tm = 1, G, N, Betas, Thetas = NULL, durbin = FALSE, rho = NULL, lambda = NULL, p = NULL, W = NULL, X = NULL, pdfU = "nvrnorm", pdfX = "nvrnorm")

The dimensions of the data set are defined by the arguments Tm, N and G. Then the user specifies the SUR structure, begining by the covariance matrix among the residuals of the G equations, which is the core of the SUR model, with the argument Sigma. Of course, a (N\(\times\)N) spatial weighting matrix should be uploaded in the argument W.

In continuation, the user should think in the parameters that intervene in the equations of the SUR. This is the purpose of the arguments Betas, Thetas, rho and lambda which are defined through row vectors of the adequate dimensions. A fundamental piece of information is the argument p that defines the number of regressors (that is, x variables) that appear in each equation; if p is a scalar, every equation has the same number of regressors whereas if we need a model with different number of regressor in each equation, p must be a (1 \(\times\) G) vector. In both cases, Betas is a row vector of order (\(1 \times G^{*}\)), where \(G^{*}\) is pG in the first case and \(\Sigma_{1}^{G}p_g\) in the second case.

Note that, depending on the values given to these arguments, the type of spatial model is unequivocally determined. For example, if Thetas=NULL, rho=NULL but lambda is not NULL, the user is specifying a SUR-SLM model or a SUR-SDEM model in the case of lambda=NULL but Thetas and rho different from NULL.

There are two possibilities to build the matrix \(\textbf{X}\) of regressors. In Monte Carlo experiments is very usual to maintain fixed the values of the regressors and repeatedly draw random matrices of error terms, to simulate the explained variable. If this is the case, the user should upload the required \(\textbf{X}\) matrix in the argument X, which must be consistent with the dimensions of the SUR model. If X=NULL the user prefers to randomly draw this matrix using the function dgp_spsur(). This is the purpose of the argument pdfX. By default, dgp_spsur() uses a multivariate standard Normal distribution to draw the observations of the regressors; the alternative is a Uniform distribution, in the interval \(\left[0,1\right]\), for each regressor. In both cases, the regressors are generated independently.

Finally, the argument dfU informs which multivariate probability distribution function should be used to draw the values of the error terms. dgp_spsur() offers two possibilities, the multivariate Normal, which is the default, and the log-Normal distribution which means to exponentiate the sample drawn from the multivariate Normal distribution. In both cases, the covariance matrix matches the \({\bf \Sigma}\) matrix uploaded previously. The log-Normal should be taken as a clear departure from the assumption of normality.

The output of dgp_spsur() is a vector, called \({\bf Y}\), of order (TmNG\(\times\)1), with the values generated for the explained variable in the G equations of the spatial SUR model. If the argument X is set to NULL the user will receive another matrix, called \({\bf XX}\), of order (TmNG\(\times \Sigma_{g=1}^{G}p_g)\), with the values generated for the regressors of the SUR.

The example below shows the random generation of 3 variables, with a SUR structure, for 500 individuals and 1 cross-section, using a SUR-SDM model. Each equation contains 3 regressors, as specified in p. The weighting matrix, W, is obtained randomly applying the 5 nearest neighbors criteria, after having drawn the lattitude and longitude of each point in space

As said, the output of this function is a list with the data (explained variable and regressors) in matrix form. Next code estimates a SDM model for the simulated data:

8 Conclusions and Directions for Future Research

This article introduces a new R package, spsur, directed at estimating and solving inference on spatial SUR models. Our intention was to fill a gap in the applied spatial econometrics literature, in the sense that there are few papers which use spatial SUR models as a useful research technique. Our impression is that the lack of an efficient, friendly software has slowed its adoption in this field

The package consist of 10 inter-related functions. lmtestspsur() test for the existence of spatial effects in a model specified without spatial mechanisms, or SUR-SIM model. Then, assuming that some of these tests are positive, three functions spsurml(), spsur3sls() and spsurtime() allow to estimate, by ML or 3SLS, the prefered spatial specification. Four additional functions, wald_betas(), wald_deltas(), lr_betas_spsur() and lrtestspsur() can be used to improve the specification. The function impacts() is useful to obtain the spatial multipliers of the variables in the model selected, that come with a collection of measures of statistical significance. Finally, the function dgp_sur() is intented to be used in Monte Carlo experiments, involving spatial SUR models, or to illustrate specific features of this type of data.

Of course, this is only the first, basic version of the package that should be improved in the future. A list of actions already in the desk of the coauthors are (i) the extension of 3SLS to the treatment of endogenous regressors, others than the spatial lag of the explained variable, and the use of external instruments, (ii) a more efficient treatment of the unobserved effects, which allowed for their estimation and, at the same time, enlarge the options to demean the data, (iii) the introduction of time dynamics in a strictly stationary framework.

References

Anselin, L. 1999. “SpaceStat, a Software Program for the Analysis of Spatial Data, Version 1.90. Bio Medware.” Ann Arbor.

Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Studies in Operational Regional Science. Dordrecht: Kluwer Academic Publishers.

———. 2016. “Estimation and Testing in the Spatial Seemingly Unrelated Regression (Sur).” Geoda Center for Geospatial Analysis; Computation, Arizona State University. Working Paper 2016-01.

Anselin, Luc, and Anil Bera. 1998. “Spatial Dependence in Linear Regression Models with an Introduction to Spatial Econometrics.” In Handbook of Applied Economic Statistics, edited by Andrew Ullah and Derek Giles, 237–89. Marcel Dekker.

Anselin, Luc, Anil K. Bera, Raymond Florax, and Mann J. Yoon. 1996. “Simple Diagnostic Tests for Spatial Dependence.” Regional Science and Urban Economics 26 (1): 77–104.

Arora, Swarnjit S., and Murray Brown. 1977. “Alternative Approaches to Spatial Autocorrelation: An Improvement over Current Practice.” International Regional Science Review 2: 67–78.

Baller, Robert D, Luc Anselin, Steven F Messner, Glenn Deane, and Darnell F Hawkins. 2001. “Structural Covariates of Us County Homicide Rates: Incorporating Spatial Effects.” Criminology 39 (3): 561–88.

Baltagi, Badi H, and Georges Bresson. 2011. “Maximum Likelihood Estimation and Lagrange Multiplier Tests for Panel Seemingly Unrelated Regressions with Spatial Lag and Spatial Errors: An Application to Hedonic Housing Prices in Paris.” Journal of Urban Economics 69 (1): 24–42.

Baltagi, Badi H, and Alain Pirotte. 2011. “Seemingly Unrelated Regressions with Spatial Error Components.” Empirical Economics 40 (1): 5–49.

Baltagi, Badi H, Seuck Heun Song, and Won Koh. 2003. “Testing Panel Data Regression Models with Spatial Error Correlation.” Journal of Econometrics 117 (1): 123–50.

Bech, Mickael, Maja Hansen, Jørgen Lauridsen, and Christian Kronborg. 2012. “Can the Municipalities Prevent Medication of Mental Diseases?” The Journal of Mental Health Policy and Economics 15 (2): 53–60.

Bera, Anil K., and Mann J. Yoon. 1993. “Specification Testing with Locally Misspecified Alternatives.” Econometric Theory 9 (4): 649–58.

Bivand, Roger S., Edzer Pebesma, and Virgilio Gomez-Rubio. 2013. Applied Spatial Data Analysis with R, Second Edition. Springer, NY.

Breusch, Trevor Stanley, and Adrian Rodney Pagan. 1980. “The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics.” The Review of Economic Studies 47 (1): 239–53.

Cotteleer, Geerte, and G Cornelis van Kooten. 2012. “Expert Opinion Versus Actual Transaction Evidence in the Valuation of Non-Market Amenities.” Economic Modelling 29 (1): 32–40.

Davidson, Russell, and James G MacKinnon. 1993. Estimation and Inference in Econometrics. OUP Catalogue. Oxford University Press.

Durbin, James. 1954. “Errors in Variables.” Revue de L’institut International de Statistique, 23–32.

Egger, Peter, and Michael Pfaffermayr. 2004. “Distance, Trade and Fdi: A Hausman–Taylor Sur Approach.” Journal of Applied Econometrics 19 (2): 227–46.

Elhorst, Paul. 2014. Spatial Econometrics from Cross-Sectional Data to Spatial Panels. Berlin: Springer.

Fiebig, Denzil G. 2007. “Seemingly Unrelated Regression.” In A Companion to Theoretical Econometrics, edited by Badi Baltagi, 101–21. Wiley-Blackwell.

Fingleton, Bernard. 2007. “Multi-Equation Spatial Econometric Model, with Application to Eu Manufacturing Productivity Growth.” Journal of Geograph Systems 119-144: 119–44.

Henningsen, Arne, Jeff D Hamann, and others. 2007. “Systemfit: A Package for Estimating Systems of Simultaneous Equations in R.” Journal of Statistical Software 23 (4): 1–40.

Hordijk, Leen, and Peter Nijkamp. 1977. “Dynamic Models of Spatial Autocorrelation.” Environment and Planning A 9 (5): 505–19.

Hsiao, Cheng. 2003. Analysis of Panel Data. Cambridge University Press.

Izón, Germán M., Michael S. Hand, Daniel W. Mccollum, Jennifer A. Thacher, and Robert P. Berrens. 2016. “Proximity to Natural Amenities: A Seemingly Unrelated Hedonic Regression Model with Spatial Durbin and Spatial Error Processes.” Growth and Change 47 (4): 461–80.

Kakamu, Kazuhiko, Wolfgang Polasek, and Hajime Wago. 2012. “Production Technology and Agglomeration for Japanese Prefectures During 1991–2000.” Papers in Regional Science 91 (1): 29–41.

Kapoor, Mudit, Harry H Kelejian, and Ingmar R Prucha. 2007. “Panel Data Models with Spatially Correlated Error Components.” Journal of Econometrics 140 (1): 97–130.

Kelejian, Harry H, and Ingmar R Prucha. 1998. “A Generalized Spatial Two-Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances.” The Journal of Real Estate Finance and Economics 17 (1): 99–121.

———. 1999. “A Generalized Moments Estimator for the Autoregressive Parameter in a Spatial Model.” International Economic Review 40 (2): 509–33.

———. 2004. “Estimation of Simultaneous Systems of Spatially Interrelated Cross Sectional Equations.” Journal of Econometrics 118 (1-2): 27–50.

Lauridsen, Jorgen, Mickael Bech, Fernando López, and Mariluz Maté Sánchez. 2010. “A Spatiotemporal Analysis of Public Pharmaceutical Expenditure.” The Annals of Regional Science 44 (2): 299–314.

Lee, Lung-Fei. 2004. “Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models.” Econometrica 72 (6): 1899–1925.

Le Gallo, Julie, and Coro Chasco. 2008. “Spatial Analysis of Urban Growth in Spain, 1900–2001.” Empirical Economics 34 (1): 59–80.

Le Gallo, Julie, and Sandy DallErba. 2006. “Evaluating the Temporal and Spatial Heterogeneity of the European Convergence Process, 1980–1999.” Journal of Regional Science 46 (2): 269–88.

LeSage, James, and Robert Kelley Pace. 2009. Introduction to Spatial Econometrics. Chapman; Hall/CRC.

López, Fernando A, Pedro J Martínez-Ortiz, and Juan-Gabriel Cegarra-Navarro. 2017. “Spatial Spillovers in Public Expenditure on a Municipal Level in Spain.” The Annals of Regional Science 58 (1): 39–65.

López, Fernando A, Jesús Mur, and Ana Angulo. 2014. “Spatial Model Selection Strategies in a Sur Framework. The Case of Regional Productivity in Eu.” The Annals of Regional Science 53 (1): 197–220.

Lundberg, Johan. 2006. “Spatial Interaction Model of Spillovers from Locally Provided Public Services.” Regional Studies 40 (6): 631–44.

Malinvaud, Edmond. 1970. Statistical Methods of Econometrics. North Holland.

Millo, Giovanni, Gianfranco Piras, and others. 2012. “Splm: Spatial Panel Data Models in R.” Journal of Statistical Software 47 (1): 1–38.

Moscone, Francesco, Elisa Tosetti, and Martin Knapp. 2007. “SUR Model with Spatial Effects: An Application to Mental Health Expenditure.” Health Economics 16 (12): 1403–8.

Mur, Jesús, and Ana Angulo. 2006. “The Spatial Durbin Model and the Common Factor Tests.” Spatial Economic Analysis 1: 207–26.

Mur, Jesús, Fernando López, and Marcos Herrera. 2010. “Testing for Spatial Effects in Seemingly Unrelated Regressions.” Spatial Economic Analysis 5 (4): 399–440.

Oberhofer, William, and James Kmenta. 1974. “A General Procedure for Obtaining Maximum Likelihood Estimates in Generalized Regression Models. ,” Econometrica 42: 579–590.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Rey, Sergio J., and Luc Anselin. 2007. “PySAL: A Python Library Onf Spatial Analytical Methods.” Review of Regional Studies 37: 5–27.

Rey, Sergio J., and Brett D. Montouri. 1999. “US Regional Income Convergence: A Spatial Econometric Perspective.” Regional Studies 33 (2): 143–56.

Sargan, John D. 1958. “The Estimation of Economic Relationships Using Instrumental Variables.” Econometrica: Journal of the Econometric Society, 393–415.

Schmidt, Paul. 1990. “Three-Stage Least Squares with Different Instruments for Different Equations.” Journal of Econometrics 43: 389–94.

Theil, Henri. 1971. Principles of Econometrics. JohnWiley; Sons.

Wang, Xiaokun, and Kara M Kockelman. 2007. “Specification and Estimation of a Spatially and Temporally Autocorrelated Seemingly Unrelated Regression Model: Application to Crash Rates in China.” Transportation 34 (3): 281–300.

White, Eugene N, and Geoffrey JD Hewings. 1982. “Space-Time Employment Modeling: Some Results Using Seemingly Unrelated Regressions Estimators.” Journal of Regional Science 22 (3): 283–302.

Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, 2 Edition. Cambridge, Massachussetts: Massachusets Institut of Technology Press.

Zeileis, Achim, and Yves Croissant. 2010. “Extended Model Formulas in R: Multiple Parts and Multiple Responses.” Journal of Statistical Software 34 (1): 1–13.

Zellner, Arnold. 1962. “An Efficient Method of Estimating Seemingly Unrelated Regressions and Test of Aggregation Bias.” Journal of the American Statistical Association 57: 500–509.

Zhou, Bin Brenda, and Kara M Kockelman. 2009. “Predicting the Distribution of Households and Employment: A Seemingly Unrelated Regression Model with Two Spatial Processes.” Journal of Transport Geography 17 (5): 369–76.