Books
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 1, Appendix B)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 1)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 1)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 1, Appendix B)
Contents
Section 1: Introduction
Overview of Linear Regression
Definition and purpose of simple linear regression
Importance in statistical analysis and predictive modeling
Section 2: Theoretical Framework
The Model Equation
Form of the simple linear regression model
Explanation of each component
Assumptions of the Model
Linearity
Random sampling (or independence of errors)
No perfect multicollinearity
Homoscedasticity
No endogeneity
Normal distribution of errors
Section 3: Estimation of Coefficients
Method of Least Squares
Derivation of the least squares estimators
Geometric interpretation of least squares
Properties of the Estimators
Unbiasedness
Efficiency under the Gauss-Markov theorem
Consistency
Section 4: Model Interpretation
Interpretation of Coefficients
Goodness-of-Fit
Coefficient of Determination
Adjusted R² and its interpretation
Section 5: Hypothesis Testing
Statistical Inference
Hypothesis tests for beta
Confidence intervals for beta
Assumptions Testing
Diagnostic plots (residual plots, QQ plots)
Tests for normality, homoscedasticity, and autocorrelation
Section 6: Model Diagnostics and Remedies
Detection of Model Violations
Identifying outliers and high leverage points
Influence measures (Cook's distance)
Remedial Measures
Transformations of dependent and independent variables
Robust regression techniques
Section 7: Extensions and Applications
Using Simple Linear Regression in Practice
Example case studies from economics, business, and science
Interpretation of results in a real-world context
Software Implementation
Code snippets for implementing simple linear regression in R and Python
Interpretation of software output
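As a minimal sketch of the software-implementation topic above, the closed-form least squares estimators for the simple model y = b0 + b1*x + u can be computed with the Python standard library alone. The function name `simple_ols` and the small data set are illustrative assumptions, not course material.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the model y = b0 + b1*x + u."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products around the sample means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Coefficient of determination: 1 - SSR / SST
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ssr / sst
    return intercept, slope, r_squared


# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, r_squared = simple_ols(x, y)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.4f}")
```

In practice a library routine that also reports standard errors would normally be used; the sketch only makes the formulas explicit.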
Section 8: Summary and Conclusions
Key Takeaways
Recap of main points about simple linear regression
Importance of checking assumptions and model fit
Further Reading
Suggested books and articles for deeper understanding
Books
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 4 + 5)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 3)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 2)
Johnston, J., & DiNardo, J. (1963). Econometric methods. (Chapter 2).
Contents
Section 1: Introduction
Purpose and Scope
Introduction to multiple linear regression (MLR)
Importance and ubiquity in statistical analysis across disciplines
Section 2: The Model
Formulation of the Multiple Linear Regression Model
Mathematical representation
Explanation of terms: response variable, predictors, coefficients, error term
Section 3: Assumptions of the Model
Key Assumptions
Linearity in parameters
Random sampling (or Independence of errors)
No perfect multicollinearity among predictors
Homoscedasticity of errors
Exogeneity
Normal distribution of error terms
Explanation and Implications of Each Assumption
How these assumptions affect model estimation and interpretation
Section 4: Estimation of Parameters
Ordinary Least Squares (OLS) Method
Derivation of the OLS estimators
Matrix notation and computation
Properties of OLS Estimators
Unbiasedness, consistency, and efficiency under the Gauss-Markov theorem
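For the derivation in matrix notation listed in this section, the standard steps can be sketched as follows. The notation is an assumption: y is the n×1 response vector, X the n×(k+1) regressor matrix including a column of ones, β the coefficient vector, and u the error vector.

```latex
\begin{aligned}
y &= X\beta + u, \\
\hat\beta &= \arg\min_{b}\; (y - Xb)'(y - Xb), \\
\text{first-order condition:}\quad & -2X'(y - X\hat\beta) = 0
  \;\Longrightarrow\; X'X\hat\beta = X'y, \\
\hat\beta &= (X'X)^{-1}X'y
  \quad \text{(requires no perfect multicollinearity, so that } X'X \text{ is invertible)}, \\
\operatorname{Var}(\hat\beta \mid X) &= \sigma^2 (X'X)^{-1}
  \quad \text{(under homoscedastic, uncorrelated errors)}.
\end{aligned}
```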
Section 5: Interpretation of the Model
Coefficient Interpretation
How to interpret coefficients of multiple regressors
Partial effect of predictors on the response variable
Model Fit and Diagnostics
R² and Adjusted R²
F-test for overall significance of the model
Section 6: Hypothesis Testing and Confidence Intervals
Testing Hypotheses About Regression Coefficients
t-tests for individual coefficients
F-tests for groups of coefficients
Building Confidence Intervals
For coefficients and predicted values
Section 7: Diagnostics for Model Assumptions
Residual Analysis
Residual plots for detecting non-linearity, heteroscedasticity, and outliers
Influence and Leverage
Cook's distance, leverage plots, and influence plots
Multicollinearity Diagnosis
Variance inflation factor (VIF) and tolerance
Section 8: Remedial Measures for Assumption Violations
Transformation and Weighting
Log transformations, polynomial terms, interaction effects
Weighted least squares for heteroscedastic errors
Addressing Multicollinearity
Dropping variables, changing model specification, principal component regression
Section 9: Extensions and Advanced Topics
Generalized Least Squares (GLS)
When and how to use GLS
Dummy Variables
Incorporating categorical variables into regression
Interaction Effects
How and when to include interaction terms
Section 10: Practical Applications
Case Studies
Real-world examples from economics, business analytics, and other fields
Software Implementation
Step-by-step guide using R and Python (including code snippets)
Section 11: Summary and Conclusion
Recap of Key Points
Summary of MLR essentials, diagnostics, and applications
Further Reading and Resources
Recommended books, articles, and software tools
Books
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 6)
Gujarati & Porter (2009) - Chapter 4, pp. 85-113
Gujarati & Porter (2009) - Chapter 5, pp. 117-139
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 3)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 2)
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapters 2 and 3)
Hayashi, F. (2000). Econometrics. Princeton University Press (Chapter 2)
Articles
Bertrand, M., & Mullainathan, S. (2004). Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review, 94(4), 991-1013.
Sacerdote, B. (2007). How large are the effects from changes in family environment? A study of Korean American adoptees. The Quarterly Journal of Economics, 122(1), 119-157.
Carrell, S. E., Hoekstra, M., & West, J. E. (2011). Is poor fitness contagious?: Evidence from randomly assigned friends. Journal of Public Economics, 95(7-8), 657-663.
Dale, S., & Krueger, A. B. (2011). Estimating the return to college selectivity over the career using administrative earnings data (No. w17159). National Bureau of Economic Research.
Contents
Incorporating Non-Linearity
Polynomial Regression
Introducing polynomial terms to capture non-linear relationships
Double log, semi log model
Splines and Local Regression
Using splines for flexible, piecewise-polynomial regression
Local regression techniques for adaptive fitting
Interaction Effects
Modeling Interaction Effects
How and why to include interaction terms in a model
Interpretation of coefficients with interactions
Case Studies
Examples demonstrating the importance of interaction effects in econometric analyses
Books
PR: Chapter 5, R: Chapter 6
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 8)
Quadratic and interaction terms: WOO, Chapters 6, 7 and 9.
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 6)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 2)
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 6)
Contents
Section 1: Introduction
Purpose of Model Specification
Importance of correct model specification in regression analysis
Overview of the consequences of misspecification
Section 2: Conceptual Framework
Defining the Model
Criteria for choosing dependent and independent variables
Theoretical considerations and empirical evidence guiding variable selection
Form of the Model
Linear vs. non-linear relationships
Decision factors for including interaction terms and polynomial terms
Section 3: Choosing Variables
Variable Selection Techniques
Stepwise selection (forward, backward, and bidirectional)
Best subset selection
Criteria for selection (AIC, BIC, Adjusted R²)
Dealing with Data Limitations
Data availability and quality
Handling missing data
Section 4: Dealing with Non-Linearity
Transformations to Achieve Linearity
Logarithmic, square root, and inverse transformations
When and why to apply these transformations
Flexible Forms
Incorporating splines and polynomial terms
Deciding the degree of polynomial
Section 5: Interaction and Dummy Variables
Incorporating Interaction Effects
How to model interactions between continuous variables
Interactions between continuous and categorical variables
Using Dummy Variables
Correctly specifying dummy variables in regression models
Handling the dummy variable trap
Section 6: Assumptions Checking
Linearity and Additivity
Diagnostic tests and visual inspections for linearity
Independence of Errors
Tests for autocorrelation (Durbin-Watson test)
Homoscedasticity
Identifying heteroscedasticity (Breusch-Pagan test)
Remedial measures for heteroscedasticity
Normality of Errors
Normality tests (Shapiro-Wilk test)
Transformations for achieving normality
Section 7: Multicollinearity
Detection
Variance inflation factor (VIF) and tolerance levels
Implications and Solutions
When and how to address multicollinearity
Impact on coefficient estimates and standard errors
Section 8: Model Extension and Refinement
Advanced Variable Interaction
Modelling complex interactions and their practical implications
Threshold and Segment Analysis
Identifying and modeling threshold effects within regression
Robustness Checks
Sensitivity analyses to verify model stability
Section 9: Practical Applications and Case Studies
Industry-specific Modeling
Examples from finance, healthcare, marketing, and environmental economics
Common Pitfalls in Model Specification
Real-world examples of what can go wrong and how to avoid these mistakes
Section 10: Model Validation
Cross-Validation Techniques
Using holdout data and k-fold cross-validation to test model reliability
Model Comparison and Selection
Criteria for comparing multiple models
Books
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 6, 7)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 7)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 3, 7, 9)
Contents
Section 1: Introduction to Qualitative Variables
Definition and Types of Qualitative Variables
Understanding qualitative vs. quantitative variables
Types of qualitative variables: Nominal, ordinal
Importance of Qualitative Variables in Analysis
Why and when qualitative variables are crucial in research
Common scenarios across different disciplines (e.g., social sciences, marketing, health)
Section 2: Coding Qualitative Variables
Converting Qualitative Data to Numeric Form
Strategies for converting qualitative variables into dummy variables
Section 3: Using Dummy Variables
Dummy Variables Basics
Creating dummy variables for nominal categories
Coding schemes for binary and multi-category variables
Interpreting Dummy Variables in Models
Interpretation of coefficients for dummy variables in linear models
Understanding the impact of reference categories
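The coding scheme described in this section can be sketched in a few lines: a nominal variable with k categories becomes k - 1 indicator columns, with one category left out as the reference. The helper `make_dummies` and the `region` data are hypothetical names used only for illustration.

```python
def make_dummies(values, reference):
    """Code a nominal variable as k - 1 dummy columns, omitting `reference`.

    Returns a dict mapping each non-reference category to a 0/1 list.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference category {reference!r} not found in data")
    kept = [c for c in categories if c != reference]
    return {c: [1 if v == c else 0 for v in values] for c in kept}


# Hypothetical illustrative data: a three-category nominal variable
region = ["north", "south", "west", "south", "north"]
dummies = make_dummies(region, reference="north")
print(dummies)
# Each dummy coefficient in a regression is then read relative to "north".
# Including a dummy for every category together with an intercept would make
# the dummy columns sum to the constant column: the dummy variable trap.
```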
Section 5: Interaction Effects with Qualitative Variables
Modeling Interactions
Interaction between qualitative and quantitative variables
Interaction between two qualitative variables
Interpreting Interaction Effects
How to interpret complex interactions in the context of the model
Visualization techniques for better understanding
Section 8: Practical Considerations and Challenges
Multicollinearity and the Dummy Variable Trap
Identifying and addressing multicollinearity issues with qualitative variables
Avoiding the dummy variable trap by dropping a reference category
Section 9: Case Studies and Applications
Case Study 1: Marketing Analysis
Using customer demographics as qualitative variables in market segmentation
Case Study 2: Social Sciences
Analyzing survey data with qualitative variables in sociological research
Case Study 3: Health Research
Modeling patient outcomes using qualitative variables for treatment types
Section 10: Implementation in Statistical Software
Coding and Analysis in R
Practical Guide in Python
Other Software Tools
Books
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 6, 7, 9)
Articles
Angrist, J. D., & Krueger, A. B. (1991). Does compulsory school attendance affect schooling and earnings?. The Quarterly Journal of Economics, 106(4), 979-1014.
Graddy, K. (1995). Testing for imperfect competition at the Fulton fish market. The RAND Journal of Economics, 75-92.
DiNardo, J. E., & Pischke, J. S. (1997). The returns to computer use revisited: Have pencils changed the wage structure too?. The Quarterly Journal of Economics, 112(1), 291-303.
Angrist, J., & Evans, W. N. (1996). Children and their parents' labor supply: Evidence from exogenous variation in family size.
Dale, S. B., & Krueger, A. B. (2002). Estimating the payoff to attending a more selective college: An application of selection on observables and unobservables. The Quarterly Journal of Economics, 117(4), 1491-1527.
Krueger, A. B. (1993). How computers have changed the wage structure: evidence from microdata, 1984–1989. The Quarterly Journal of Economics, 108(1), 33-60.
Contents
Section 1: Introduction to Multicollinearity
Definition and Concept
What is multicollinearity?
Types of multicollinearity: Perfect and imperfect (near) multicollinearity
Mathematical Representation
The concept of linear dependence among explanatory variables
Multicollinearity in the context of the OLS estimator
Section 2: Causes of Multicollinearity
Common Sources
Using highly correlated variables
Dummy variable traps
Polynomial terms and interaction effects
Data Issues
Limited sample size and overfitting
Data aggregation and measurement error
Section 3: Consequences of Multicollinearity
Inflated standard errors and unreliable coefficient estimates
Reduced statistical power of hypothesis tests
Sensitivity to changes in model specification
Misleading Coefficient Estimates
Changes in sign and magnitude of coefficients
Counterintuitive results and lack of precision
Impact on Model Predictions
Potential for overfitting and poor predictive performance
Unstable model outputs in different samples
Section 4: Identifying and Measuring Multicollinearity
Correlation Matrix Analysis
Identifying highly correlated predictor pairs
Visual tools like heatmaps for correlation
Variance Inflation Factor (VIF)
Calculation and interpretation of VIF values
Thresholds for identifying problematic multicollinearity
Condition Index and Eigenvalues
Calculation and interpretation of condition indices
Relationship between eigenvalues and multicollinearity
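As a rough illustration of the VIF calculation listed in this section: in general VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the other predictors. The sketch below covers only the special case of exactly two predictors, where R_j² equals the squared pairwise correlation; the function names and data are assumptions made for the example.

```python
import math


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two regressors.

    With two predictors the auxiliary R^2 is the squared pairwise correlation.
    Perfect collinearity (r^2 = 1) makes the VIF undefined (division by zero).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical illustrative data
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}")
```

With more than two predictors, each auxiliary R_j² has to come from a full multiple regression, which a statistics library would normally handle.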
Section 5: Remedies for Multicollinearity
Variable Selection and Dropping Variables
Strategies for selecting and excluding variables
Using domain knowledge to choose relevant predictors
Combining or Transforming Variables
Principal component analysis (PCA) and factor analysis
Creating composite indices or sums
Regularization Techniques
Ridge regression and Lasso regression as solutions
Trade-offs between bias and variance
Centering and Standardization
Centering predictor variables to reduce multicollinearity
Standardizing variables for better interpretability
Section 6: Practical Applications
Case Study 1: Economic Data Analysis
Multicollinearity in macroeconomic forecasting models
Case Study 2: Marketing Research
Addressing multicollinearity in customer segmentation and profiling
Case Study 3: Medical Research
Managing multicollinearity in clinical trial data
Section 7: Implementing Solutions in Software
R Implementation
Step-by-step guide for detecting and addressing multicollinearity in R
Code snippets for VIF, PCA, and regularization techniques
Python Implementation
Using Python libraries like statsmodels, scikit-learn for multicollinearity diagnostics and remedies
Stata and Other Tools
Quick guide to detecting multicollinearity in Stata, SPSS, and SAS
Section 8: Summary and Best Practices
Key Takeaways
Summary of the causes, detection methods, and solutions for multicollinearity
Guidelines for Model Building
Best practices to prevent and address multicollinearity
Further Reading and Resources
Suggested books, articles, and online resources for a deeper understanding
Books
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 3)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 4)
Contents
Section 1: Introduction to Heteroskedasticity
Definition and Concept
What is heteroskedasticity?
Distinguishing heteroskedasticity from homoskedasticity
Why heteroskedasticity matters in regression analysis
Common scenarios where heteroskedasticity is observed (e.g., cross-sectional data)
Mathematical Representation
The regression model with heteroskedastic errors
Implications for the Ordinary Least Squares (OLS) estimator
Section 2: Causes of Heteroskedasticity
Common Sources
Changes in the scale of dependent variable with the size of the predictors
Economic or social factors leading to variability in error terms
Examples from Various Fields
Income disparities in household data
Variability in investment returns
Section 3: Consequences of Heteroskedasticity
Biased standard errors leading to unreliable hypothesis tests
Inefficiency of OLS estimators
Impact on confidence intervals and model predictions
Implications for Model Interpretation
Distorted t-tests and F-tests
Misleading significance levels
Impact on Forecasting and Policy Recommendations
The risks of invalid inference from biased standard errors in applied research
Section 4: Detecting Heteroskedasticity
Graphical Methods
Residual plots against predicted values
Plotting residuals against each independent variable
Statistical Tests
Goldfeld-Quandt test
Breusch-Pagan test
White test
Park test
Cook and Weisberg test
Szroeter’s rank test
Information matrix test
Cameron and Trivedi’s decomposition
Many other tests
Interpreting Test Results
How to interpret the outcomes of heteroskedasticity tests
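To make one of the tests above concrete, here is a minimal sketch of the Breusch-Pagan idea in its studentized (Koenker) form for a single regressor: regress the squared OLS residuals on x and use LM = n * R² from that auxiliary regression, compared with a chi-square distribution with 1 degree of freedom. The function names and the constructed data are assumptions for illustration only.

```python
def ols_fit(x, y):
    """Return (intercept, slope, residuals) from regressing y on x with a constant."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def breusch_pagan_lm(x, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on x."""
    _, _, resid = ols_fit(x, y)
    e2 = [e * e for e in resid]
    _, _, aux_resid = ols_fit(x, e2)
    mean_e2 = sum(e2) / len(e2)
    sst = sum((v - mean_e2) ** 2 for v in e2)
    if sst == 0:
        # Squared residuals are constant: no evidence of heteroskedasticity
        return 0.0
    ssr = sum(r * r for r in aux_resid)
    return len(x) * (1.0 - ssr / sst)


# Constructed data: the error spread grows with x (errors are +/- x)
x = [1, 1, 2, 2, 3, 3, 4, 4]
u = [1, -1, 2, -2, 3, -3, 4, -4]
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]
lm = breusch_pagan_lm(x, y)
# The 5% critical value of chi-square with 1 degree of freedom is about 3.84
print(f"LM = {lm:.3f}, reject homoskedasticity at 5%: {lm > 3.84}")
```

A large LM statistic relative to the critical value points to heteroskedasticity related to x; a small one means the test finds no such evidence, which is not proof of homoskedasticity.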
Section 5: Remedies and Solutions
Transformations
Logarithmic transformation of the dependent variable
Generalized least squares (GLS) and weighted least squares (WLS)
Robust Standard Errors
Using heteroskedasticity-consistent standard errors (e.g., White’s standard errors)
Advantages and limitations of robust standard errors
Alternative Estimation Techniques
Feasible Generalized Least Squares (FGLS)
Estimation with weighted observations
Section 7: Applying Solutions in Practice
Choosing the Right Remedy
Criteria for selecting the appropriate solution based on the context
Practical considerations for model specification and interpretation
Case Studies
Case Study 1: Correcting heteroskedasticity in income inequality data
Case Study 2: Using robust standard errors in a corporate finance model
Section 8: Advanced Topics in Heteroskedasticity
Modeling Heteroskedasticity Directly
Autoregressive conditional heteroskedasticity (ARCH) and Generalized ARCH (GARCH) models
Applications in financial time series
Panel Data Models
Heteroskedasticity in panel data and solutions
Applications in macroeconomic studies
Section 9: Software Implementation
Implementing Diagnostics and Remedies in R
Step-by-step guide to performing tests and applying corrections
Example code snippets for detecting and correcting heteroskedasticity
Practical Guide in Python
Using Python libraries like statsmodels and scikit-learn for heteroskedasticity diagnosis and treatment
Other Software Tools
Implementing solutions in Stata, SPSS, and SAS
Section 10: Summary and Best Practices
Recap of Key Concepts
Summary of detection methods and remedies for heteroskedasticity
Guidelines for Applied Research
Best practices for dealing with heteroskedasticity in econometric modeling
Further Reading and Resources
Suggested books, articles, and software tutorials for deeper understanding
Books
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 9)
DM, Sec. 7.5
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 5)
GB Chapter 6 - 145-175
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 8)
JD, Chapter 6
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-hill. (Chapter 5)
Articles
Andrews, D. W. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica: Journal of the Econometric Society, 817-858.
Andrews, D. W., & Monahan, J. C. (1992). An improved heteroskedasticity and autocorrelation consistent covariance matrix estimator. Econometrica: Journal of the Econometric Society, 953-966.
Dale, S. B., & Krueger, A. B. (2002). Estimating the payoff to attending a more selective college: An application of selection on observables and unobservables. The Quarterly Journal of Economics, 117(4), 1491-1527.
Den Haan, W. J., & Levin, A. T. (2000). Robust covariance matrix estimation with data-dependent VAR prewhitening order.
Newey, W. K., & West, K. D. (1994). Automatic lag selection in covariance matrix estimation. The Review of Economic Studies, 61(4), 631-653.
Contents
Section 1: Introduction to Endogeneity
What is endogeneity?
Common sources of endogeneity: omitted variables, measurement errors, simultaneity
Bias and inconsistency in Ordinary Least Squares (OLS) estimates
Impact on hypothesis testing and economic interpretation
Section 2: Primary reasons for endogeneity
Omitted Variable Bias
Measurement Error
Simultaneity or Reverse Causality
Dynamic Endogeneity
Other sources of bias
Sample Selection Bias
Self-Selection
Section 3: The Instrumental Variables (IV) Approach
Conditions for a Valid Instrument
Relevance: correlation with the endogenous regressor
Exogeneity: uncorrelated with the error term
Overidentification and the need for multiple instruments
Examples of potential instruments in empirical research
Section 4: The Two-Stage Least Squares (2SLS) Method
Introduction to 2SLS
Concept and rationale behind 2SLS
When and why to use 2SLS instead of OLS
The 2SLS Estimation Process
Stage 1: Regress the endogenous variable on all exogenous variables, including instruments
Stage 2: Regress the dependent variable on the exogenous variables and the predicted values from Stage 1, which replace the endogenous regressor
Interpretation of coefficients obtained from 2SLS
Properties of 2SLS Estimators
Consistency and asymptotic normality
Efficiency relative to OLS under endogeneity
Assumptions Required for 2SLS
Linearity and no perfect multicollinearity
Validity of instruments
Potential violations and their consequences
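The two stages described in this section can be sketched for the simplest case of one endogenous regressor, one instrument, and no other covariates. The data below are constructed (an assumption for illustration) so that the instrument z is exactly uncorrelated with the error u in the sample while x is correlated with u; the helper names are hypothetical.

```python
def ols(x, y):
    """Return (intercept, slope) from regressing y on x with a constant."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(y, x, z):
    """2SLS point estimates with one endogenous regressor x and one instrument z."""
    # Stage 1: regress the endogenous regressor on the instrument
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress the outcome on the Stage 1 fitted values
    # Note: standard errors from a naive Stage 2 OLS would be incorrect;
    # only the point estimates are computed here.
    return ols(x_hat, y)


# Constructed data: true model y = 1 + 2*x + u, with x depending on u
z = [-1.0, -1.0, 1.0, 1.0]   # instrument, orthogonal to u in this sample
u = [1.0, -1.0, 1.0, -1.0]   # structural error
x = [zi + 2 * ui for zi, ui in zip(z, u)]
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

ols_b0, ols_b1 = ols(x, y)
iv_b0, iv_b1 = two_stage_least_squares(y, x, z)
print(f"OLS slope = {ols_b1:.2f}, 2SLS slope = {iv_b1:.2f}")
```

In this constructed sample OLS gives a slope of 2.4 because x is correlated with u, while 2SLS recovers the true slope of 2. With real data the recovery is only asymptotic and depends on instrument relevance and exogeneity.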
Section 5: Testing the Validity of Instruments
Relevance of Instruments: Weak instruments
First-stage F-statistics for weak instruments
Problems with Weak Instruments
Bias in 2SLS estimates
Reduced power of hypothesis tests
Detection and Remedies
Testing for weak instruments using the Cragg-Donald statistic
Using Limited Information Maximum Likelihood (LIML) and other robust methods
Exogeneity of Instruments
Overidentification tests: Sargan-Hansen test
Intuition and application of overidentification tests
Section 6: Practical Applications and Case Studies
Case Study 1: Demand and Supply Models
Using instruments to estimate supply and demand functions
Case Study 2: Education and Earnings
Addressing endogeneity in returns to education using natural experiments
Case Study 3: Policy Impact Evaluation
Evaluating the effect of policy interventions with endogenous treatment variables
Section 7: Extensions and Advanced Topics
Generalized Method of Moments (GMM)
GMM as a generalization of IV/2SLS methods
Application in panel data and time series contexts
Dynamic Panel Data Models
Addressing endogeneity in panel data using lagged instruments (Arellano-Bond estimator)
Control Function Approach
Alternative to 2SLS for handling endogeneity
Section 8: Implementation in Statistical Software
Implementing 2SLS in R
Step-by-step guide to 2SLS estimation in R using ivreg package
Code snippets and interpretation of output
Using 2SLS in Stata
Command ivregress for 2SLS and diagnostics
Interpretation and common pitfalls
Other Software Tools
Implementing 2SLS in Python (statsmodels), SAS, and SPSS
Section 9: Summary and Conclusion
Key Takeaways
Recap of the importance of addressing endogeneity
Summary of the 2SLS approach and its applications
Best Practices
Guidelines for selecting and testing instruments
Practical advice for applied researchers
Books
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 7 and 15)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 12)
Cameron, A. C., and P. K. Trivedi (2005), Microeconometrics: Methods and Applications. Cambridge University Press, New York, NY. (Chapters 4 and 6.)
Articles
Acemoglu, D., Johnson, S., Robinson, J., & Thaicharoen, Y. (2002). Institutional Causes, Macroeconomic Symptoms: Volatility, Crises and Growth.
Angrist, J. D. (1990). Lifetime earnings and the Vietnam era draft lottery: evidence from social security administrative records. The American Economic Review, 313-336.
Angrist, J. D., & Krueger, A. B. (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15(4), 69-85.
Angrist, J. D., Graddy, K., & Imbens, G. W. (2000). The interpretation of instrumental variables estimators in simultaneous equations models with an application to the demand for fish. The Review of Economic Studies, 67(3), 499-527.
Ang, D. (2023). The birth of a nation: Media and racial hate. American Economic Review, 113(6), 1424-1460.
Brollo, F., & Troiano, U. (2013). What happens when a woman wins an election? Evidence from Close Races in Brazil. MPRA Paper, 52244.
Clarke, D., & Matta, B. (2018). Practical considerations for questionable IVs. The Stata Journal, 18(3), 663-691.
Campante, F. R., & Do, Q. A. (2014). Isolated capital cities, accountability, and corruption: Evidence from US states. American Economic Review, 104(8), 2456-2481.
DiNardo, J., & Lee, D. S. (2004). Economic impacts of new unionization on private sector employers: 1984–2001. The Quarterly Journal of Economics, 119(4), 1383-1441.
Donald, S. G., & Newey, W. K. (2001). Choosing the number of instruments. Econometrica, 69(5), 1161-1191.
Epple, D., & Romano, R. E. (1996). Public provision of private goods. Journal of Political Economy, 104(1), 57-84.
Hahn, J., & Hausman, J. (2002). A new specification test for the validity of instrumental variables. Econometrica, 70(1), 163-189.
Heckman, J. (1997). Instrumental variables: A study of implicit behavioral assumptions used in making program evaluations. Journal of human resources, 441-462.
Imbens, G. (2014). Instrumental variables: An econometrician's perspective (No. w19983). National Bureau of Economic Research.
Krueger, A. B. (1993). How computers have changed the wage structure: Evidence from microdata, 1984–1989. The Quarterly Journal of Economics, 108(1), 33-60.
Słoczyński, T., Uysal, S. D., & Wooldridge, J. M. (2024). Abadie's Kappa and Weighting Estimators of the Local Average Treatment Effect. Journal of Business & Economic Statistics, 1-28.
Weak instruments
Bound, J., D. Jaeger and R. Baker (1995), Problems with Instrumental Variables Estimation when the Correlation between the Instruments and the Endogenous Explanatory Variable is Weak, Journal of the American Statistical Association, 90(430), pp. 443-450.
Stock, J., J. Wright, and M. Yogo (2002), A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments, Journal of Business and Economic Statistics, 20(4), 518-529.
Hahn, J., and J. Hausman (2003), Weak Instruments: Diagnosis and Cures in Empirical Econometrics, American Economic Review, 93(2), 118-125.
Cruz, L. M. and Moreira, M. J. (2005), On the Validity of Econometric Techniques with Weak Instruments: Inference on Returns to Education Using Compulsory School Attendance Laws, Journal of Human Resources, 40(2), 393-410.
Staiger, D. O., & Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica, 65(3), 557-586.
Stock, J. H., & Yogo, M. (2002). Testing for weak instruments in linear IV regression.
Contents
Section 1: Introduction to GMM
Why GMM? Addressing limitations of traditional estimation methods
Applications in econometrics and beyond
Development and key contributors to GMM theory
Comparison with other estimation methods (OLS, IV, ML)
Section 2: Theoretical Foundation of GMM
Basic Concepts and Notation
Moments and moment conditions: Definition and examples
Population moments vs. sample moments
Moment Conditions
Deriving moment conditions from economic theory
Examples of moment conditions in different contexts
The GMM Estimator
Definition of the GMM estimator
The method of moments: Classical method and limitations
Generalizing to GMM: Using multiple moment conditions
Objective Function
Constructing the GMM objective function
The role of the weighting matrix
Optimal Weighting Matrix
Properties of the optimal weighting matrix
Two-step GMM: Estimating the optimal weighting matrix
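The objective function and two-step procedure outlined in this section can be written compactly as follows. The notation is an assumption: w_i denotes the data for observation i, g(·) the vector of moment functions, θ the parameter vector, and W_n a positive definite weighting matrix; the formula for S is stated for the i.i.d. case.

```latex
\begin{aligned}
&\text{Moment conditions:}\quad E\big[g(w_i,\theta_0)\big] = 0,
  \qquad \bar g_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} g(w_i,\theta), \\
&\text{GMM estimator:}\quad
  \hat\theta = \arg\min_{\theta}\; \bar g_n(\theta)'\, W_n\, \bar g_n(\theta), \\
&\text{Optimal weighting matrix:}\quad W = S^{-1},
  \qquad S = E\big[g(w_i,\theta_0)\, g(w_i,\theta_0)'\big], \\
&\text{Step 1:}\quad \hat\theta_1 \text{ from } W_n = I
  \text{ (or another positive definite matrix)}, \\
&\text{Step 2:}\quad
  \hat S = \frac{1}{n}\sum_{i=1}^{n} g(w_i,\hat\theta_1)\, g(w_i,\hat\theta_1)',
  \qquad \hat\theta_2 \text{ from } W_n = \hat S^{-1}.
\end{aligned}
```

When the number of moment conditions equals the number of parameters, the choice of W_n does not affect the estimate, and GMM reduces to the classical method of moments.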
Section 4: Properties of GMM Estimators
Consistency and Asymptotic Normality
Proofs and intuitive explanation
Conditions required for consistency and asymptotic normality
Efficiency
Efficiency of GMM relative to other estimators
When GMM is more efficient than OLS and IV
Robustness to Heteroskedasticity and Serial Correlation
Robust standard errors in the context of GMM
Section 5: Implementation of GMM
Choosing Moment Conditions
Guidelines for selecting appropriate moment conditions
Trade-offs between over-identification and efficiency
Practical Steps in GMM Estimation
Defining the model and identifying moment conditions
Estimating the parameters and testing the model
Section 6: Testing and Model Specification
Overidentification Tests
Hansen’s J-test: Purpose, computation, and interpretation
Testing the validity of moment conditions
Hypothesis Testing in GMM Framework
Wald tests, Lagrange Multiplier tests, and likelihood ratio tests
Application in hypothesis testing
Section 7: Common Applications of GMM
Dynamic Panel Data Models
Arellano-Bond estimator: GMM for dynamic panel data
Application to economic growth and investment models
System GMM
Introduction to System GMM and its advantages over difference GMM
Application in panel data with persistent series
Financial Econometrics
Estimating asset pricing models with GMM
GMM in risk modeling and option pricing
Macroeconomic Models
Estimating rational expectations models
GMM in DSGE models
Weak Instruments in GMM
Detection and implications of weak instruments
Remedies and robust estimation techniques
Nonlinear GMM
Application of GMM in nonlinear models
Examples from discrete choice models and nonlinear panel data models
Section 9: Practical Considerations and Challenges
Identification Issues
Problems with weak or irrelevant moment conditions
Strategies to improve identification
Finite Sample Properties
Small sample bias in GMM estimators
Use of bootstrapping techniques for inference
Section 10: Software Implementation
GMM in R
Step-by-step guide using gmm and ivreg packages
Example code and interpretation of output
GMM in Stata
Using xtabond and xtdpdsys commands for panel GMM
Practical tips and troubleshooting common errors
Other Software Tools
GMM implementation in Python (statsmodels), SAS, and EViews
Section 11: Case Studies
Case Study 1: Estimating Consumption Functions
Using GMM to estimate intertemporal consumption models
Case Study 2: Investment Models
Application of GMM in estimating investment under uncertainty
Case Study 3: Financial Market Analysis
Estimating and testing CAPM using GMM
Section 12: Summary and Best Practices
Key Takeaways
Recap of GMM advantages, properties, and applications
Best Practices for Applied Research
Guidelines for implementing GMM effectively
Common pitfalls and how to avoid them
Further Reading
Recommended books, articles, and tutorials for deeper understanding
Books
Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press. (Chapters 10-11)
AP, Chapter 4.
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 15)
Hayashi, F. (2000). Econometrics. Princeton University Press (Chapter 3 and 4)
Hall, A.R., ch. 4-5
Supplementary lecture notes on GMM
Articles
Andrews, D. W., Moreira, M. J., & Stock, J. H. (2006). Optimal two‐sided invariant similar tests for instrumental variables regression. Econometrica, 74(3), 715-752.
Andrews, I., & Mikusheva, A. (2016). Conditional inference with a functional nuisance parameter. Econometrica, 84(4), 1571-1612.
Chamberlain, G. (1987). Asymptotic efficiency in estimation with conditional moment restrictions. Journal of econometrics, 34(3), 305-334.
Hansen, L. P., & Singleton, K. J. (1982). Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica: Journal of the Econometric Society, 1269-1286.
Hansen, L. P., Heaton, J., & Yaron, A. (1996). Finite-sample properties of some alternative GMM estimators. Journal of Business & Economic Statistics, 14(3), 262-280.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica: Journal of the econometric society, 1029-1054.
Kitamura, Y., & Stutzer, M. (1997). An information-theoretic alternative to generalized method of moments estimation. Econometrica: Journal of the Econometric Society, 861-874.
Mavroeidis, S., Plagborg-Møller, M., & Stock, J. H. (2014). Empirical evidence on inflation expectations in the New Keynesian Phillips Curve. Journal of Economic Literature, 52(1), 124-188.
Newey, W. K. (1985). Generalized method of moments specification testing. Journal of econometrics, 29(3), 229-256.
Newey, W. K., & West, K. D. (1986). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix.
Staiger, D. O., & Stock, J. H. (1994). Instrumental variables regression with weak instruments.
Stock, J. H., & Wright, J. H. (2000). GMM with weak identification. Econometrica, 68(5), 1055-1096.
Contents
Section 1: Introduction to Seemingly Unrelated Regression
Overview of the SUR Model
Definition and purpose of SUR
Historical context and development
Motivation for Using SUR
Why SUR is used instead of separate OLS regressions
Situations where SUR is beneficial
Section 2: Theoretical Foundation of SUR
Model Structure
The system of equations
Correlation of Disturbances
Importance of contemporaneous correlation between error terms
Implications of ignoring correlated disturbances
Section 3: Estimation Methods
Ordinary Least Squares (OLS) vs. SUR
Limitations of separate OLS estimation
Efficiency gains from using SUR
Feasible Generalized Least Squares (FGLS)
Derivation of the FGLS estimator for SUR
Calculating the covariance matrix of the error terms
Iterative estimation procedure for FGLS
Section 4: Properties of the SUR Estimator
Consistency and Efficiency
Conditions under which SUR is more efficient than OLS
Comparison of SUR and seemingly independent regression (SIR) models
Asymptotic Properties
Asymptotic normality and consistency of the SUR estimator
Section 5: Model Testing and Validation
Testing for the Need of SUR
Breusch-Pagan test for independence of errors
Likelihood ratio test for joint estimation
Hypothesis Testing in SUR Framework
Joint hypothesis testing across multiple equations
Testing for parameter equality across equations
Section 6: Applications of the SUR Model
Economic and Financial Applications
Estimating systems of demand equations
Simultaneous modeling of asset returns
Cross-Sectional Analysis
Application to household surveys with multiple outcomes
Time-Series Applications
Modeling interrelated economic indicators
Section 7: Special Topics and Extensions
Panel Data and SUR
SUR in the context of panel data models
Combining SUR with fixed and random effects
Nonlinear SUR Models
Introduction to nonlinear systems with correlated errors
Examples and estimation techniques
Dynamic SUR Models
Extending SUR to dynamic models with lagged variables
Applications in macroeconomic modeling
Section 8: Practical Implementation of SUR
Implementation in R
Using the systemfit package for SUR estimation
Example code and interpretation of results
Implementation in Stata
Step-by-step guide using sureg command
Practical tips and troubleshooting common issues
Other Software Tools
Implementing SUR in SAS and EViews
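The two-step FGLS procedure for SUR listed in Section 3 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: the function name `sur_fgls` and the simulated two-equation system are hypothetical, and the dense Kronecker product is only practical for small samples.

```python
import numpy as np

def sur_fgls(ys, Xs):
    """Two-step FGLS for a SUR system; ys and Xs are per-equation lists."""
    n, m = len(ys[0]), len(ys)
    # Step 1: equation-by-equation OLS residuals give Sigma-hat (m x m).
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0] for y, X in zip(ys, Xs)
    ])
    sigma = resid.T @ resid / n
    # Step 2: GLS on the stacked system with Omega = Sigma kron I_n.
    ks = [X.shape[1] for X in Xs]
    X_big = np.zeros((m * n, sum(ks)))   # block-diagonal regressor matrix
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + ks[i]] = X
        col += ks[i]
    y_big = np.concatenate(ys)
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    A = X_big.T @ omega_inv @ X_big
    beta = np.linalg.solve(A, X_big.T @ omega_inv @ y_big)
    return beta, sigma

# Simulated system (hypothetical): different regressors, correlated errors.
rng = np.random.default_rng(1)
n = 400
x1, x2 = rng.normal(size=(2, n))
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
y1 = 1.0 + 2.0 * x1 + errors[:, 0]
y2 = -1.0 + 0.5 * x2 + errors[:, 1]
X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([np.ones(n), x2])
beta_hat, sigma_hat = sur_fgls([y1, y2], [X1, X2])
print(beta_hat)      # stacked as [equation 1 coefficients, equation 2 coefficients]
print(sigma_hat)
```

The example uses different regressors in each equation on purpose: when every equation has identical regressors, SUR and separate OLS give the same coefficient estimates.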
Section 9: Case Studies
Case Study 1: Demand Systems
Application of SUR to estimate demand functions for multiple goods
Case Study 2: Labor Economics
Modeling wage determination and labor supply jointly
Case Study 3: Financial Econometrics
Joint modeling of returns on multiple asset classes
Section 10: Limitations and Considerations
When Not to Use SUR
Situations where SUR does not offer efficiency gains
Practical limitations in the presence of weak correlations
Multicollinearity in SUR
Impact of multicollinearity among independent variables
Solutions and diagnostic tools
Section 11: Summary and Best Practices
Key Takeaways
Summary of the advantages and applications of SUR
Guidelines for choosing between SUR and other estimation methods
Practical Recommendations
Best practices for implementing SUR in empirical research
Common pitfalls and how to avoid them
Further Reading
Suggested books, articles, and resources for deeper understanding
Articles
Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association, 57(298), 348-368.
Further reading
Contents
Overview of Econometric Models
Definition of SEMs
Importance in economic analysis
Historical Background
Development of SEMs in econometrics
Key contributions and theorists
Basic Concepts
Endogeneity and exogeneity
Identification problem
Structural and Reduced Forms
Derivation and interpretation
Difference between structural and reduced forms
Conditions for Identification
Order and rank conditions
Theoretical criteria for model identification
Problems and Solutions
Common pitfalls and how to avoid them
Practical examples of identification issues
Ordinary Least Squares (OLS) Limitations
Why OLS fails in the context of SEMs
Instrumental Variables (IV)
Concept and application
Choosing valid instruments
Two-Stage Least Squares (2SLS)
Implementation and examples
Comparison with other estimators
Three-Stage Least Squares (3SLS) and Full Information Maximum Likelihood (FIML)
Advanced estimation techniques
When to use each method
Model Specification
Choosing the correct model structure
Tests for structural stability
Hypothesis Testing
Tests for over-identifying restrictions
Tests for endogeneity
Economic Models
Supply and demand systems
Keynesian models of income determination
Policy Analysis
Impact of fiscal and monetary policy
Case studies in policy evaluation
Financial Markets
SEMs in asset pricing
Risk and return models
Dynamic Models
Introduction to dynamic SEMs
Application in time series data
Panel Data
Panel SEMs
Differences and challenges compared to cross-sectional data
Nonlinear SEMs
Nonlinear models and estimation techniques
Case studies and applications
Computational Considerations
Software options for SEM analysis
Example codes and algorithms
Data Issues
Data collection and preparation challenges
Dealing with missing data and measurement error
Detailed Walkthroughs of SEM Analysis
Real-world examples from economics and finance
Step-by-step analysis and interpretation
Software
Stata: reg3 - Three-stage estimation for systems of simultaneous equations
EViews: Dealing with simultaneous equation in Eviews
R: systemfit - A Package for Estimating Systems of Simultaneous Equations
Estimating and interpreting structural equation models in Stata 12 - David M. Drukker
Contents
Section 1: Introduction to Sample Selection Bias
What is sample selection bias?
How sample selection bias differs from other biases (e.g., omitted variable bias, measurement error)
Consequences of ignoring sample selection bias
Real-world examples where sample selection bias affects results
Historical Background and Development
Key contributions to the theory of sample selection bias
Overview of seminal works, including Heckman’s 1979 paper
Section 2: Causes of Sample Selection Bias
Non-Random Sample Selection
Overview of non-random selection mechanisms
Examples: Survey non-response, self-selection, treatment effects
Truncation and Censoring
Understanding truncation in samples (e.g., left and right truncation)
Censored data and its impact on estimation
Endogenous Sampling
Selection based on the outcome variable
Behavioral and economic factors leading to self-selection
Section 3: Theoretical Framework of Sample Selection Bias
The Econometric Model with Selection Bias
Mathematical representation of selection bias
The classical regression model vs. the model with sample selection
Selection Equation and Outcome Equation
The role of the selection equation in modeling sample selection
Separating the selection and outcome equations in empirical models
Bias in Ordinary Least Squares (OLS) Estimators
Demonstrating how OLS estimators become biased and inconsistent
Graphical illustration of the impact of sample selection bias on regression lines
Section 4: Detection and Diagnosis of Sample Selection Bias
Visual and Descriptive Techniques
Inspecting data for potential selection bias (e.g., histograms, scatterplots)
Comparing characteristics of included vs. excluded observations
Statistical Tests for Selection Bias
Heckman’s two-step test for sample selection bias
Other statistical tests and methods to diagnose bias
Identifying Selection Mechanisms
Using theory and data to hypothesize selection mechanisms
Testing for common selection processes in empirical studies
Section 5: The Heckman Two-Step Correction Method
Overview of the Heckman Model
Introduction to Heckman’s correction procedure
Theoretical basis and mathematical derivation
Step 1: Estimating the Selection Equation
Probit model for the selection equation
Calculating the inverse Mills ratio (IMR)
Step 2: Correcting the Outcome Equation
Incorporating the IMR into the outcome equation
Interpretation of coefficients and the selection term
Properties of the Heckman Estimator
Consistency, efficiency, and potential limitations of the Heckman method
Section 6: Alternative Methods to Correct for Sample Selection Bias
Full Information Maximum Likelihood (FIML) Estimation
Introduction to FIML in the context of sample selection models
Comparison of FIML and Heckman’s two-step method
Control Function Approach
Using control functions to address endogeneity and selection bias
Practical implementation and interpretation
Propensity Score Matching (PSM)
Using matching techniques to address selection on observables
Application of PSM in observational studies
Section 7: Extensions and Advanced Topics
Panel Data Models with Sample Selection Bias
Modeling sample selection in panel data frameworks
Dynamic sample selection models
Multiple Equation Selection Models
Simultaneous modeling of multiple outcomes with selection bias
Applications in household surveys and labor economics
Nonparametric and Semiparametric Methods
Introduction to nonparametric approaches to correct for selection bias
Kernel methods and matching estimators
Section 8: Practical Applications and Case Studies
Case Study 1: Labor Market Outcomes
Correcting for sample selection bias in wage equations
Using the Heckman model to analyze labor force participation
Case Study 2: Health Economics
Addressing selection bias in studies of healthcare utilization
Application of sample selection models to insurance claims data
Case Study 3: Educational Research
Analyzing returns to education with selection bias
Dealing with selection into schooling levels
Section 9: Implementation in Statistical Software
Stata Implementation
Using the heckman command for Heckman’s two-step model
Step-by-step guide with examples and interpretation
R Implementation
Implementing the Heckman model using the sampleSelection package
Example code and interpretation of output
Other Software Tools
Implementing sample selection models in SAS, Python (statsmodels), and SPSS
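The two steps of Section 5 (probit selection equation, then an outcome regression augmented with the inverse Mills ratio) can be sketched with NumPy and the standard library. All function names and the simulated data below are illustrative assumptions. The probit is fitted by plain Fisher scoring without step-size control, and the second-step standard errors, which need a correction for the estimated IMR, are not computed.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(a):
    return 0.5 * (1.0 + _erf(a / math.sqrt(2.0)))

def norm_pdf(a):
    return np.exp(-0.5 * a ** 2) / math.sqrt(2.0 * math.pi)

def probit(d, W, iters=50):
    """Probit by Fisher scoring; d is 0/1 and W includes a constant."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        P = np.clip(norm_cdf(idx), 1e-10, 1 - 1e-10)
        p = norm_pdf(idx)
        score = W.T @ (p * (d - P) / (P * (1 - P)))
        info = (W * (p ** 2 / (P * (1 - P)))[:, None]).T @ W
        g = g + np.linalg.solve(info, score)
    return g

def heckman_two_step(y, X, d, W):
    # Step 1: selection equation and inverse Mills ratio phi/Phi.
    gamma = probit(d, W)
    idx = W @ gamma
    imr = norm_pdf(idx) / np.clip(norm_cdf(idx), 1e-10, None)
    # Step 2: OLS on the selected sample with the IMR as an extra regressor.
    sel = d == 1
    X_aug = np.column_stack([X[sel], imr[sel]])
    coef = np.linalg.lstsq(X_aug, y[sel], rcond=None)[0]
    return coef, gamma

# Simulated data (hypothetical): w enters selection only (exclusion restriction).
rng = np.random.default_rng(2)
n = 5000
x, w, u, eps = rng.normal(size=(4, n))
e = 0.6 * u + 0.8 * eps                       # outcome error, corr(e, u) = 0.6
d = (0.5 + x + w + u > 0).astype(float)       # selection indicator
y = 1.0 + 2.0 * x + e                         # treated as observed only when d == 1
X = np.column_stack([np.ones(n), x])
W = np.column_stack([np.ones(n), x, w])
coef, gamma = heckman_two_step(y, X, d, W)
print(coef)    # [intercept, slope, coefficient on the IMR]
print(gamma)
```

The coefficient on the IMR estimates the covariance between the two error terms scaled by the selection error's standard deviation, so in this design its sign should match the positive error correlation.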
Section 10: Common Pitfalls and Best Practices
Misinterpretation of the Selection Coefficient
Common errors in interpreting the inverse Mills ratio
Overreliance on Parametric Assumptions
Limitations of parametric approaches in the presence of non-normality
Choosing the Correct Selection Model
Guidelines for selecting the appropriate correction method
Practical advice for empirical researchers
Section 11: Summary and Conclusions
Recap of Key Concepts
Implications for Applied Research
Importance of addressing sample selection bias in empirical studies
Future directions in sample selection research
Books
Greene, William H. 2018. Econometric Analysis, Eighth Edition. Pearson. (Chapter 19.1-19.4)
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, Second Edition. MIT Press. (Chapter 17, 19.1-19.6, 19.9)
Articles
Heckman, J. J. (1977). Sample selection bias as a specification error (with an application to the estimation of labor supply functions) (Vol. 172). Cambridge, MA: National Bureau of Economic Research.
Vella, F. (1998). Estimating models with sample selection bias: a survey. Journal of Human Resources, 127-169.
Bourguignon, F., Fournier, M., & Gurgand, M. (2007). Selection bias corrections based on the multinomial logit model: Monte Carlo comparisons. Journal of Economic Surveys, 21(1), 174-205.
Contents
Section 1: Introduction to Truncated and Censored Data
What are truncated data?
What are censored data?
Differences between truncation and censoring
Real-world examples in various fields such as economics, healthcare, and finance
Why standard OLS is inappropriate for truncated and censored data
Consequences of ignoring truncation and censoring
Section 2: Theoretical Framework of Truncated Data
Understanding Truncated Data
Definition and types: left truncation, right truncation, and interval truncation
Probability Density Function for Truncated Data
Adjusted probability distribution for truncated samples
Implications for likelihood functions and estimation
Truncated Regression Models
Specification and estimation of the truncated regression model
Maximum Likelihood Estimation (MLE) for truncated data
Consequences of Truncation on Estimation
Bias and inconsistency in OLS estimates
Examples illustrating the effect of truncation on parameter estimates
Section 3: Theoretical Framework of Censored Data
Understanding Censored Data
Definition and types: left censoring, right censoring, interval censoring, and top-coding
Probability Density Function for Censored Data
Adjusted probability distribution for censored samples
Likelihood functions and censored data
The Tobit Model for Censored Data
Introduction to the Tobit model: specification and estimation
Interpretation of Tobit model coefficients
Limitations and assumptions of the Tobit model
Consequences of Censoring on Estimation
Bias and inefficiency in OLS estimates
Impact of censoring on the variance and mean of estimates
Section 4: Estimation Techniques for Truncated Data
Maximum Likelihood Estimation (MLE) for Truncated Models
Step-by-step guide to MLE for truncated regression
Likelihood functions for left, right, and interval truncation
Estimation and interpretation of parameters
Applications of Truncated Models
Real-world applications: labor market analysis, healthcare data, and survival analysis
Case study: Estimating income distribution with truncated data
Section 5: Estimation Techniques for Censored Data
Maximum Likelihood Estimation (MLE) for Censored Models
Step-by-step guide to MLE for censored regression
Likelihood functions for different types of censoring
The Tobit Model in Practice
Estimation and interpretation of the Tobit model
Extensions of the Tobit model: Type I, Type II, and Type III Tobit models
Limitations and alternative models (e.g., censored least absolute deviations)
Two-Part and Hurdle Models
Introduction to two-part models for semi-continuous data
Hurdle models as alternatives to the Tobit model
Applications and case studies
Section 6: Dealing with Both Truncation and Censoring
Combined Models
Handling data that are both truncated and censored
Specification and estimation of models that combine truncation and censoring
Selection Models for Truncated and Censored Data
Heckman’s sample selection model for censored and truncated data
Application in labor market studies and treatment effect analysis
Section 7: Advanced Topics and Extensions
Panel Data Models with Truncation and Censoring
Handling truncation and censoring in panel data contexts
Dynamic models with censored and truncated data
Nonparametric and Semiparametric Approaches
Nonparametric methods for censored and truncated data
Semiparametric estimators: Cox proportional hazards model for censored data
Copula-Based Models
Modeling dependencies in truncated and censored data using copulas
Applications in finance and risk management
Section 8: Practical Applications and Case Studies
Case Study 1: Income and Expenditure Analysis
Analyzing expenditure data with top-coding using the Tobit model
Case Study 2: Duration Models
Modeling time to event with right-censored and left-truncated data
Case Study 3: Healthcare Utilization
Modeling healthcare expenditure with censored data
Case Study 4: Financial Markets
Modeling truncated returns data in financial markets
Section 9: Implementation in Statistical Software
Implementation in Stata
Using the truncreg and tobit commands
Step-by-step guide with example datasets
Implementation in R
Using the censReg package for censored models and the truncreg package for truncated models
Example code and interpretation of output
Implementation in Other Software
Using SAS, SPSS, and Python (statsmodels) for truncated and censored models
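As a small illustration of the censored-data likelihood from Sections 3 and 5, the sketch below writes out the Tobit log-likelihood for left-censoring at zero and compares it at the true parameters and at the OLS estimates from the censored sample. The function name `tobit_loglik` and the simulated data are assumptions for illustration. No optimizer is included, so this evaluates the likelihood rather than fitting the model.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def tobit_loglik(beta, sigma, y, X, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`."""
    xb = X @ beta
    cens = y <= lower
    z = (y - xb) / sigma
    # Uncensored observations contribute the normal density of y.
    ll_unc = -0.5 * z ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    # Censored observations contribute P(y* <= lower) = Phi((lower - xb) / sigma).
    cdf = 0.5 * (1.0 + _erf((lower - xb) / (sigma * math.sqrt(2.0))))
    ll_cen = np.log(np.clip(cdf, 1e-300, None))
    return float(np.sum(np.where(cens, ll_cen, ll_unc)))

# Simulated data (hypothetical): latent y* = 0.5 + 1.5 x + e, observed y = max(y*, 0).
rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
y_star = 0.5 + 1.5 * x + rng.normal(size=n)
y = np.maximum(y_star, 0.0)
X = np.column_stack([np.ones(n), x])

# OLS on the censored outcome, for comparison.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
s_ols = float(np.std(y - X @ b_ols))

ll_true = tobit_loglik(np.array([0.5, 1.5]), 1.0, y, X)
ll_ols = tobit_loglik(b_ols, s_ols, y, X)
print(b_ols, ll_true, ll_ols)
```

In this design the OLS slope is pulled toward zero by the pile-up of observations at the censoring point, which is the bias that the Tobit likelihood is meant to address.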
Section 10: Summary and Conclusions
Key Takeaways
Summary of methods and their applications for handling truncated and censored data
Best Practices for Empirical Research
Guidelines for choosing and implementing the appropriate model
Future Directions
Emerging techniques and ongoing research in handling truncated and censored data
Books
Greene, William H. 2018. Econometric Analysis, Eighth Edition. Pearson. (Chapter 19.1-19.4)
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, Second Edition. MIT Press. (Chapter 17, 19.1-19.6, 19.9)
Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications.
Articles
Winkelmann, R. (2004). Health care reform and the number of doctor visits—an econometric analysis. Journal of Applied Econometrics, 19(4), 455-472.