python gmm

Dynamic panel model using the generalized method of moments (GMM) estimation of Arellano and Bond


Based on the work of Kuo et al. (Kuo, H.-I., Chen, C.-C., Tseng, W.-C., Ju, L.-F., Huang, B.-W. (2007). Assessing impacts of SARS and Avian Flu on international tourism demand to Asia. Tourism Management. Retrieved from: https://www.sciencedirect.com/science/article/abs/pii/S0261517707002191?via%3Dihub), I am measuring the effect of COVID-19 on tourism demand.

My panel data can be found here: https://www.dropbox.com/s/t0pkwrj59zn22gg/tourism_covid_data-total.csv?dl=0

I would like to use a first-difference transformation model (GMM-DIFF) and treat the lags of the dependent variable (tourism demand) as instruments for the lagged dependent variable. The dynamic, first-differenced version of the tourism demand model is:

$\Delta y_{it} = \eta_2 \Delta y_{i,t-1} + \eta_3 \Delta S_{it} + \Delta u_{it}$

where y is tourism demand, i indexes the COVID-19 infected countries, t is time, S is the number of SARS cases in Kuo et al. (monthly COVID-19 cases in my data), and u is the fixed effects decomposition of the error term.
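To make the differencing and the "lags as instruments" idea concrete, here is a minimal sketch of the simplest version of it: the just-identified Anderson-Hsiao IV estimator, which instruments the lagged difference of y with the level of y two periods back and treats the differenced S as its own instrument. This is not the full Arellano-Bond GMM estimator (which uses more lags and a weighting matrix). The function names `simulate` and `anderson_hsiao`, the toy series, and the exogeneity of S are assumptions made only for illustration.

```python
def simulate(alpha, eta2, eta3, s):
    """Noise-free toy series y_t = alpha + eta2 * y_{t-1} + eta3 * s_t (hypothetical data)."""
    y = [0.0]
    for t in range(1, len(s)):
        y.append(alpha + eta2 * y[t - 1] + eta3 * s[t])
    return y, s


def anderson_hsiao(panel):
    """Just-identified IV estimate of (eta2, eta3) in the first-differenced model.

    panel maps a unit name to (y, s): two equal-length lists ordered by time.
    Instruments: y_{i,t-2} for the lagged difference of y, and the difference
    of s for itself (s is assumed exogenous here).
    """
    # Accumulate Z'X (2x2) and Z'dy (2x1) over all units and usable periods.
    a11 = a12 = a21 = a22 = b1 = b2 = 0.0
    for y, s in panel.values():
        for t in range(2, len(y)):
            dy = y[t] - y[t - 1]          # differencing removes the unit effect alpha
            dy_lag = y[t - 1] - y[t - 2]  # endogenous regressor
            ds = s[t] - s[t - 1]
            z = y[t - 2]                  # instrument for dy_lag
            a11 += z * dy_lag
            a12 += z * ds
            a21 += ds * dy_lag
            a22 += ds * ds
            b1 += z * dy
            b2 += ds * dy
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("instrument matrix is singular")
    # Solve the 2x2 system (Z'X) theta = Z'dy by Cramer's rule.
    eta2 = (b1 * a22 - a12 * b2) / det
    eta3 = (a11 * b2 - a21 * b1) / det
    return eta2, eta3


# Two toy "countries" with different fixed effects but the same eta2 and eta3.
panel = {
    "A": simulate(alpha=1.0, eta2=0.5, eta3=2.0, s=[1, 3, 2, 5, 4, 6]),
    "B": simulate(alpha=4.0, eta2=0.5, eta3=2.0, s=[2, 1, 4, 3, 7, 5]),
}
eta2_hat, eta3_hat = anderson_hsiao(panel)
print(eta2_hat, eta3_hat)
```

Because the toy data contain no noise, the differenced equation holds exactly and the estimator returns the coefficients used in the simulation, even though the two units have different fixed effects. With real data the estimates would of course be noisy.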

So far, using Python, I have managed to get some results with PanelOLS:

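import pandas as pd
from linearmodels import PanelOLS
import statsmodels.api as sm

# Load the panel; month_year is parsed as a date so it can serve as the time index
tourism_covid_data = pd.read_csv('../Data/Data - Dec2021/tourism_covid_data-total.csv', header=0, parse_dates=['month_year'])

# PanelOLS expects a MultiIndex of (entity, time)
tourism_covid_data = tourism_covid_data.set_index(['Country', 'month_year']).sort_index()

# Lag tourism demand within each country, not across the stacked panel
tourism_covid_data['l.tourism_demand'] = tourism_covid_data.groupby(level='Country')['tourism_demand'].shift(1)
tourism_covid_data = tourism_covid_data.dropna()
exog = sm.add_constant(tourism_covid_data[['l.tourism_demand', 'monthly cases']])
mod = PanelOLS(tourism_covid_data['tourism_demand'], exog, entity_effects=True)
fe_res = mod.fit()
fe_res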


I am trying to find a way to apply GMM to my data. However, GMM does not seem to be widely used in Python, and I could not find similar questions on Stack Overflow. Any ideas on how I can proceed?


Solution

  • I just tried your data. I don't think difference GMM or system GMM suits it, because it is a long panel with T (= 48) >> N (= 4). Still, pydynpd produces results. For both estimators, I had to collapse the instrument matrix to mitigate the too-many-instruments problem.

    Model 1: diff GMM; treating "monthly cases" as a predetermined variable

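    import pandas as pd
    from pydynpd import regression

    df = pd.read_csv("tourism_covid_data-total.csv")
    # Copy the column under a name without spaces so it can be used in the command string
    df['monthly_cases'] = df['monthly cases']
    # Lags 2-6 of tourism_demand and lags 1-2 of monthly_cases as GMM instruments;
    # nolevel requests difference GMM, collapse shrinks the instrument matrix
    command_str = 'tourism_demand L1.tourism_demand monthly_cases  | gmm(tourism_demand, 2 6) gmm(monthly_cases, 1 2)| nolevel collapse '
    mydpd = regression.abond(command_str, df, ['Country', 'month_year'])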
    

    The output:

    Python 3.9.7 (default, Sep 10 2021, 14:59:43) 
    [GCC 11.2.0] on linux
    Warning: system and difference GMMs do not work well on long (T>=N) panel data
    Dynamic panel-data estimation, two-step difference GMM
     Group variable: Country       Number of obs = 184  
     Time variable: month_year     Number of groups = 4 
     Number of instruments = 7                          
    +-------------------+-----------------+---------------------+------------+-----------+
    |   tourism_demand  |      coef.      | Corrected Std. Err. |     z      |   P>|z|   |
    +-------------------+-----------------+---------------------+------------+-----------+
    | L1.tourism_demand |    0.7657082    |      0.0266379      | 28.7450196 | 0.0000000 |
    |   monthly_cases   | -182173.5644815 |    171518.4068348   | -1.0621225 | 0.2881801 |
    +-------------------+-----------------+---------------------+------------+-----------+
    Hansen test of overid. restrictions: chi(5) = 3.940 Prob > Chi2 = 0.558
    Arellano-Bond test for AR(1) in first differences: z = -1.04 Pr > z =0.299
    Arellano-Bond test for AR(2) in first differences: z = 1.00 Pr > z =0.319
    

    Model 2: diff GMM; treating the lag of "monthly cases" as an exogenous variable

    command_str='tourism_demand L1.tourism_demand L1.monthly_cases  | gmm(tourism_demand, 2 6) iv(L1.monthly_cases)| nolevel collapse '
    mydpd = regression.abond(command_str, df, ['Country', 'month_year'])
    

    Output:

    Warning: system and difference GMMs do not work well on long (T>=N) panel data
    Dynamic panel-data estimation, two-step difference GMM
     Group variable: Country       Number of obs = 184  
     Time variable: month_year     Number of groups = 4 
     Number of instruments = 6                          
    +-------------------+-----------------+---------------------+------------+-----------+
    |   tourism_demand  |      coef.      | Corrected Std. Err. |     z      |   P>|z|   |
    +-------------------+-----------------+---------------------+------------+-----------+
    | L1.tourism_demand |    0.7413765    |      0.0236962      | 31.2866594 | 0.0000000 |
    |  L1.monthly_cases | -190277.2987977 |    164169.7711072   | -1.1590276 | 0.2464449 |
    +-------------------+-----------------+---------------------+------------+-----------+
    Hansen test of overid. restrictions: chi(4) = 1.837 Prob > Chi2 = 0.766
    Arellano-Bond test for AR(1) in first differences: z = -1.05 Pr > z =0.294
    Arellano-Bond test for AR(2) in first differences: z = 1.00 Pr > z =0.318
    

    Model 3: similar to Model 2, but a system GMM.

    command_str='tourism_demand L1.tourism_demand L1.monthly_cases  | gmm(tourism_demand, 2 6) iv(L1.monthly_cases)| collapse '
    mydpd = regression.abond(command_str, df, ['Country', 'month_year'])
    

    Output:

    Warning: system and difference GMMs do not work well on long (T>=N) panel data
    Dynamic panel-data estimation, two-step system GMM
     Group variable: Country       Number of obs = 188  
     Time variable: month_year     Number of groups = 4 
     Number of instruments = 8                          
    +-------------------+-----------------+---------------------+------------+-----------+
    |   tourism_demand  |      coef.      | Corrected Std. Err. |     z      |   P>|z|   |
    +-------------------+-----------------+---------------------+------------+-----------+
    | L1.tourism_demand |    0.5364657    |      0.0267678      | 20.0414904 | 0.0000000 |
    |  L1.monthly_cases | -216615.8306112 |    177416.0961037   | -1.2209480 | 0.2221057 |
    |        _con       |  -10168.9640333 |     8328.7444649    | -1.2209480 | 0.2221057 |
    +-------------------+-----------------+---------------------+------------+-----------+
    Hansen test of overid. restrictions: chi(5) = 1.876 Prob > Chi2 = 0.866
    Arellano-Bond test for AR(1) in first differences: z = -1.06 Pr > z =0.288
    Arellano-Bond test for AR(2) in first differences: z = 0.99 Pr > z =0.322
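
As a rough sanity check on the instrument counts and Hansen degrees of freedom reported above, the following sketch counts GMM-style instruments for one variable with and without collapsing. It follows the usual textbook counting convention for a model with one lag of the dependent variable (differenced equation usable from the third period); it is an assumption that pydynpd counts in exactly this way, and `count_gmm_instruments` is a hypothetical helper, not part of pydynpd. T = 48 is taken from the answer.

```python
def count_gmm_instruments(n_periods, min_lag, max_lag, collapse):
    """Count GMM-style instruments that one variable contributes to the
    differenced equation (illustrative counting convention only)."""
    if collapse:
        # Collapsed: one column per lag distance that is ever available.
        return min(max_lag, n_periods - 1) - min_lag + 1
    total = 0
    # Uncollapsed: a separate column for every (period, lag) pair.
    for t in range(3, n_periods + 1):
        deepest = min(max_lag, t - 1)  # cannot reach back before period 1
        total += max(0, deepest - min_lag + 1)
    return total


T = 48
y_collapsed = count_gmm_instruments(T, 2, 6, collapse=True)   # gmm(tourism_demand, 2 6)
y_full = count_gmm_instruments(T, 2, 6, collapse=False)
x_collapsed = count_gmm_instruments(T, 1, 2, collapse=True)   # gmm(monthly_cases, 1 2)
print("tourism_demand lags 2-6:", y_collapsed, "collapsed vs", y_full, "uncollapsed")
print("Model 1 total:", y_collapsed + x_collapsed)  # compare with the reported 7
print("Model 2 total:", y_collapsed + 1)            # plus one iv() column; reported 6

# Hansen degrees of freedom = number of instruments - number of coefficients.
# Tuples are (instruments, coefficients, reported chi-square df) from the outputs above.
reported = {"Model 1": (7, 2, 5), "Model 2": (6, 2, 4), "Model 3": (8, 3, 5)}
for name, (n_instr, n_coef, hansen_df) in reported.items():
    print(name, "df check:", n_instr - n_coef == hansen_df)
```

With only N = 4 countries, the uncollapsed instrument set for tourism_demand alone would far exceed the number of groups, which is why collapsing matters here. The count of 8 for Model 3 would be consistent with the 6 instruments of Model 2 plus one collapsed level-equation instrument and the constant, but that breakdown is an inference from the output rather than something the answer states.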