Reading list: regression with cross-sectional data

Introduction to Econometrics

Scope of the study of econometrics

Books

Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 1, Appendix B)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 1)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 1)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 1, Appendix B)

The Simple Linear Regression Model

Contents

Section 1: Introduction

Overview of Linear Regression
- Definition and purpose of simple linear regression
- Importance in statistical analysis and predictive modeling

Section 2: Theoretical Framework

The Model Equation
- Form of the simple linear regression model
- Explanation of each component
Assumptions of the Model
- Linearity
- Random sampling (or Independence of errors)
- No perfect multicollinearity
- Homoscedasticity
- No endogeneity
- Normal distribution of errors

Section 3: Estimation of Coefficients

Method of Least Squares
- Derivation of the least squares estimators
- Geometric interpretation of least squares
Properties of the Estimators
- Unbiasedness
- Efficiency under the Gauss-Markov theorem
- Consistency
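
The closed-form least squares estimators listed in this section can be sketched as follows. This is a minimal illustration on simulated data; the sample size, the true intercept (1.0) and the true slope (2.0) are assumptions chosen for the example, not values from any of the readings.

```python
import numpy as np

# Simulated data; the true intercept (1.0) and slope (2.0) are illustrative assumptions.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)

# Closed-form least squares estimators for y = b0 + b1*x + u:
#   b1 = sum((x - xbar) * (y - ybar)) / sum((x - xbar)^2)
#   b0 = ybar - b1 * xbar
x_dev = x - x.mean()
b1 = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
b0 = y.mean() - b1 * x.mean()

# By construction the residuals sum to zero and are orthogonal to x.
residuals = y - (b0 + b1 * x)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

The two orthogonality properties of the residuals are the first-order conditions of the least squares problem, which is also the basis of its geometric interpretation.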

Section 4: Model Interpretation

Interpretation of Coefficients
Goodness-of-Fit
- Coefficient of Determination
- Adjusted R² and its interpretation
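
As a small sketch of the two goodness-of-fit measures above, the function below computes the coefficient of determination and its adjusted version from observed and fitted values. The function name `r_squared` and the toy numbers are hypothetical, introduced only for illustration.

```python
import numpy as np

def r_squared(y, fitted, n_regressors):
    """Return (R^2, adjusted R^2); n_regressors excludes the intercept."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    ssr = np.sum((y - fitted) ** 2)      # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)    # total sum of squares
    n = y.size
    r2 = 1.0 - ssr / sst
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_regressors - 1)
    return r2, adj

# Toy example: SSR = 0.10 and SST = 5.0, so R^2 = 0.98 and adjusted R^2 = 0.97.
r2, adj = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8], n_regressors=1)
print(f"R2={r2:.2f}, adjusted R2={adj:.2f}")
```

The adjustment penalizes additional regressors through the degrees-of-freedom ratio, so adjusted R² can fall when a weak regressor is added even though R² cannot.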

Section 5: Hypothesis Testing

Statistical Inference
- Hypothesis tests for beta
- Confidence intervals for beta
Assumptions Testing
- Diagnostic plots (residual plots, QQ plots)
- Tests for normality, homoscedasticity, and autocorrelation

Section 6: Model Diagnostics and Remedies

Detection of Model Violations
- Identifying outliers and high leverage points
- Influence measures (Cook's distance)
Remedial Measures
- Transformations of dependent and independent variables
- Robust regression techniques

Section 7: Extensions and Applications

Using Simple Linear Regression in Practice
- Example case studies from economics, business, and science
- Interpretation of results in a real-world context
Software Implementation
- Code snippets for implementing simple linear regression in R and Python
- Interpretation of software output

Section 8: Summary and Conclusions

Key Takeaways
- Recap of main points about simple linear regression
- Importance of checking assumptions and model fit
Further Reading
- Suggested books and articles for deeper understanding

Books

Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapters 4 and 5)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 3)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 2)
Johnston, J., & DiNardo, J. (1963). Econometric methods. (Chapter 2).

The Multiple Linear Regression Model

Contents

Section 1: Introduction

Purpose and Scope
- Introduction to multiple linear regression (MLR)
- Importance and ubiquity in statistical analysis across disciplines

Section 2: The Model

Formulation of the Multiple Linear Regression Model
- Mathematical representation
- Explanation of terms: response variable, predictors, coefficients, error term

Section 3: Assumptions of the Model

Key Assumptions
- Linearity in parameters
- Random sampling (or Independence of errors)
- No perfect multicollinearity among predictors
- Homoscedasticity of errors
- Exogeneity
- Normal distribution of error terms
Explanation and Implications of Each Assumption
- How these assumptions affect model estimation and interpretation

Section 4: Estimation of Parameters

Ordinary Least Squares (OLS) Method
- Derivation of the OLS estimators
- Matrix notation and computation
Properties of OLS Estimators
- Unbiasedness, consistency, and efficiency under the Gauss-Markov theorem
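
The matrix form of the OLS estimator and its classical standard errors can be sketched as below. This is an illustration on simulated data under the homoskedasticity assumption; the coefficient values in `beta_true` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
beta_true = np.array([1.0, 0.5, -2.0])                      # illustrative values
y = X @ beta_true + rng.normal(size=n)

# OLS in matrix form: solve the normal equations (X'X) b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Classical (homoskedastic) variance estimate: s^2 (X'X)^{-1}
resid = y - X @ beta_hat
s2 = resid @ resid / (n - X.shape[1])
std_err = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

# t-statistics for the null hypothesis that each coefficient is zero
t_stats = beta_hat / std_err
print(np.round(beta_hat, 3), np.round(std_err, 3))
```

Solving the normal equations directly keeps the link to the textbook formula visible; for ill-conditioned design matrices a QR or least-squares routine such as `np.linalg.lstsq` is usually the safer numerical choice.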

Section 5: Interpretation of the Model

Coefficient Interpretation
- How to interpret coefficients of multiple regressors
- Partial effect of predictors on the response variable
Model Fit and Diagnostics
- R² and Adjusted R²
- F-test for overall significance of the model

Section 6: Hypothesis Testing and Confidence Intervals

Testing Hypotheses About Regression Coefficients
- t-tests for individual coefficients
- F-tests for group of coefficients
Building Confidence Intervals
- For coefficients and predicted values

Section 7: Diagnostics for Model Assumptions

Residual Analysis
- Residual plots for detecting non-linearity, heteroscedasticity, and outliers
Influence and Leverage
- Cook's distance, leverage plots, and influence plots
Multicollinearity Diagnosis
- Variance inflation factor (VIF) and tolerance

Section 8: Remedial Measures for Assumption Violations

Transformation and Weighting
- Log transformations, polynomial terms, interaction effects
- Weighted least squares for heteroscedastic errors
Addressing Multicollinearity
- Dropping variables, changing model specification, principal component regression

Section 9: Extensions and Advanced Topics

Generalized Least Squares (GLS)
- When and how to use GLS
Dummy Variables
- Incorporating categorical variables into regression
Interaction Effects
- How and when to include interaction terms

Section 10: Practical Applications

Case Studies
- Real-world examples from economics, business analytics, and other fields
Software Implementation
- Step-by-step guide using R and Python (including code snippets)

Section 11: Summary and Conclusion

Recap of Key Points
- Summary of MLR essentials, diagnostics, and applications
Further Reading and Resources
- Recommended books, articles, and software tools

Books

Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 6)
Gujarati & Porter (2009), Chapter 4, pp. 85-113
Gujarati & Porter (2009), Chapter 5, pp. 117-139
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 3)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 2)
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapters 2 and 3)
Hayashi, F. (2000). Econometrics. Princeton University Press (Chapter 2)

Articles

Bertrand, M., & Mullainathan, S. (2004). Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review, 94(4), 991-1013. (link)
Sacerdote, B. (2007). How large are the effects from changes in family environment? A study of Korean American adoptees. The Quarterly Journal of Economics, 122(1), 119-157. (link)
Carrell, S. E., Hoekstra, M., & West, J. E. (2011). Is poor fitness contagious? Evidence from randomly assigned friends. Journal of Public Economics, 95(7-8), 657-663. (link)
Dale, S., & Krueger, A. B. (2011). Estimating the return to college selectivity over the career using administrative earnings data (No. w17159). National Bureau of Economic Research. (link)

Software

Stata: regress command (link)
Stata: Regression (link)
Eviews: (link)
R: Fitting Linear Models (link)

Functional Form Extensions of the Multiple Regression Model

Contents

Incorporating Non-Linearity

Polynomial Regression
- Introducing polynomial terms to capture non-linear relationships
Double-Log and Semi-Log Models
Splines and Local Regression
- Using splines for flexible, piecewise-polynomial regression
- Local regression techniques for adaptive fitting

Interaction Effects

Modeling Interaction Effects
- How and why to include interaction terms in a model
- Interpretation of coefficients with interactions
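
The interpretation of coefficients with an interaction term can be made concrete with a short worked equation. The two-regressor model and the symbols below are assumptions introduced for illustration, together with the usual condition $\mathrm{E}[u \mid x_1, x_2] = 0$.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 (x_1 x_2) + u,
\qquad
\frac{\partial\, \mathrm{E}[y \mid x_1, x_2]}{\partial x_1} = \beta_1 + \beta_3 x_2 .
```

In this sketch $\beta_1$ alone is the partial effect of $x_1$ only when $x_2 = 0$; at any other value of $x_2$ the effect shifts by $\beta_3 x_2$, which is why such effects are usually reported at meaningful values of $x_2$ such as its sample mean.
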
Case Studies
- Examples demonstrating the importance of interaction effects in econometric analyses

Books

PR: Chapter 5, R: Chapter 6
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 8)
Quadratic and interaction terms: WOO, Chapters 6, 7 and 9.
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 6)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 2)
Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 6)

Model Specification

Contents

Section 1: Introduction

Purpose of Model Specification
- Importance of correct model specification in regression analysis
- Overview of the consequences of misspecification

Section 2: Conceptual Framework

Defining the Model
- Criteria for choosing dependent and independent variables
- Theoretical considerations and empirical evidence guiding variable selection
Form of the Model
- Linear vs. non-linear relationships
- Decision factors for including interaction terms and polynomial terms

Section 3: Choosing Variables

Variable Selection Techniques
- Stepwise selection (forward, backward, and bidirectional)
- Best subset selection
- Criteria for selection (AIC, BIC, Adjusted R²)
Dealing with Data Limitations
- Data availability and quality
- Handling missing data

Section 4: Dealing with Non-Linearity

Transformations to Achieve Linearity
- Logarithmic, square root, and inverse transformations
- When and why to apply these transformations
Flexible Forms
- Incorporating splines and polynomial terms
- Deciding the degree of polynomial

Section 5: Interaction and Dummy Variables

Incorporating Interaction Effects
- How to model interactions between continuous variables
- Interactions between continuous and categorical variables
Using Dummy Variables
- Correctly specifying dummy variables in regression models
- Handling the dummy variable trap

Section 6: Assumptions Checking

Linearity and Additivity
- Diagnostic tests and visual inspections for linearity
Independence of Errors
- Tests for autocorrelation (Durbin-Watson test)
Homoscedasticity
- Identifying heteroscedasticity (Breusch-Pagan test)
- Remedial measures for heteroscedasticity
Normality of Errors
- Normality tests (Shapiro-Wilk test)
- Transformations for achieving normality

Section 7: Multicollinearity

Detection
- Variance inflation factor (VIF) and tolerance levels
Implications and Solutions
- When and how to address multicollinearity
- Impact on coefficient estimates and standard errors

Section 8: Model Extension and Refinement

Advanced Variable Interaction
- Modelling complex interactions and their practical implications
Threshold and Segment Analysis
- Identifying and modeling threshold effects within regression
Robustness Checks
- Sensitivity analyses to verify model stability

Section 9: Practical Applications and Case Studies

Industry-specific Modeling
- Examples from finance, healthcare, marketing, and environmental economics
Common Pitfalls in Model Specification
- Real-world examples of what can go wrong and how to avoid these mistakes

Section 10: Model Validation

Cross-Validation Techniques
- Using holdout data and k-fold cross-validation to test model reliability
Model Comparison and Selection
- Criteria for comparing multiple models
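
A minimal sketch of k-fold cross-validation for comparing two candidate specifications is given below. The helper `kfold_mse`, the simulated data and the assumed quadratic data-generating process are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def kfold_mse(X, y, k=5, seed=0):
    """Average out-of-sample mean squared error of OLS over k folds."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(idx, test_idx)          # all rows not in this fold
        beta, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        errors.append(np.mean((y[test_idx] - X[test_idx] @ beta) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(6)
n = 400
x = rng.uniform(-2.0, 2.0, size=n)
# Assumed data-generating process: the true relation is quadratic.
y = 1.0 + x - 0.5 * x ** 2 + rng.normal(scale=0.5, size=n)

X_linear = np.column_stack([np.ones(n), x])
X_quad = np.column_stack([np.ones(n), x, x ** 2])
mse_linear = kfold_mse(X_linear, y)
mse_quad = kfold_mse(X_quad, y)
print(f"linear: {mse_linear:.3f}, quadratic: {mse_quad:.3f}")
```

Because each fold is predicted by a model that never saw it, the comparison rewards specifications that generalize rather than those that merely fit the estimation sample.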

Books

Greene, W. (2012) Econometric Analysis. 7th Edition (Chapters 6, 7)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 7)
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 3, 7, 9)

Software

Stata: Model Specification (link)
Stata: Specification link test for single-equation models (link)
Stata: Test linear hypotheses after estimation (link)
Stata: Regression (link)
R: Model Specification (link)
R: Model Specification for Multiple Regression (link)

Regression with Dummy Variables

Contents

Section 1: Introduction to Qualitative Variables

Definition and Types of Qualitative Variables
- Understanding qualitative vs. quantitative variables
- Types of qualitative variables: Nominal, ordinal
Importance of Qualitative Variables in Analysis
- Why and when qualitative variables are crucial in research
- Common scenarios across different disciplines (e.g., social sciences, marketing, health)

Section 2: Coding Qualitative Variables

Converting Qualitative Data to Numeric Form
- Strategies for converting qualitative variables into dummy variables

Section 3: Using Dummy Variables

Dummy Variables Basics
- Creating dummy variables for nominal categories
- Coding schemes for binary and multi-category variables
Interpreting Dummy Variables in Models
- Interpretation of coefficients for dummy variables in linear models
- Understanding the impact of reference categories

Section 5: Interaction Effects with Qualitative Variables

Modeling Interactions
- Interaction between qualitative and quantitative variables
- Interaction between two qualitative variables
Interpreting Interaction Effects
- How to interpret complex interactions in the context of the model
- Visualization techniques for better understanding

Section 8: Practical Considerations and Challenges

Multicollinearity and the Dummy Variable Trap
- Identifying and addressing multicollinearity issues with qualitative variables
- Avoiding the dummy variable trap by dropping a reference category
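
The dummy variable trap can be checked numerically: with an intercept and one dummy per category, the design matrix loses full column rank. The sketch below uses a hypothetical three-category `region` variable; the labels and data are made up for illustration.

```python
import numpy as np

# Hypothetical three-category variable (labels are illustrative).
region = np.array(["north", "south", "west", "south", "north", "west", "west", "north"])
categories = np.unique(region)                      # sorted: north, south, west

# One 0/1 column per category
dummies = (region[:, None] == categories[None, :]).astype(float)
intercept = np.ones((region.size, 1))

# Trap: intercept + all dummies, so the dummy columns sum to the intercept column
X_trap = np.hstack([intercept, dummies])
# Fix: drop one reference category (here the first, "north")
X_ok = np.hstack([intercept, dummies[:, 1:]])

print(np.linalg.matrix_rank(X_trap), X_trap.shape[1])  # rank is below the column count
print(np.linalg.matrix_rank(X_ok), X_ok.shape[1])      # full column rank
```

With the reference category dropped, each remaining dummy coefficient is read as the difference in the expected outcome relative to that omitted category.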

Section 9: Case Studies and Applications

Case Study 1: Marketing Analysis
- Using customer demographics as qualitative variables in market segmentation
Case Study 2: Social Sciences
- Analyzing survey data with qualitative variables in sociological research
Case Study 3: Health Research
- Modeling patient outcomes using qualitative variables for treatment types

Section 10: Implementation in Statistical Software

Coding and Analysis in R
Practical Guide in Python
Other Software Tools

Books

Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 6, 7, 9)

Articles

Angrist, J. D., & Krueger, A. B. (1991). Does compulsory school attendance affect schooling and earnings? The Quarterly Journal of Economics, 106(4), 979-1014. (link)
Graddy, K. (1995). Testing for imperfect competition at the Fulton fish market. The RAND Journal of Economics, 75-92. (link)
DiNardo, J. E., & Pischke, J. S. (1997). The returns to computer use revisited: Have pencils changed the wage structure too? The Quarterly Journal of Economics, 112(1), 291-303. (link)
Angrist, J., & Evans, W. N. (1996). Children and their parents' labor supply: Evidence from exogenous variation in family size. (link)
Dale, S. B., & Krueger, A. B. (2002). Estimating the payoff to attending a more selective college: An application of selection on observables and unobservables. The Quarterly Journal of Economics, 117(4), 1491-1527. (link)
Krueger, A. B. (1993). How computers have changed the wage structure: evidence from microdata, 1984–1989. The Quarterly Journal of Economics, 108(1), 33-60. (link)

Software

Stata: Create dummy variables (link)
Stata: How can I create dummy variables in Stata? (link)
Eviews: Dummy Variables (link)

Multicollinearity

Contents

Section 1: Introduction to Multicollinearity

Definition and Concept
- What is multicollinearity?
- Types of multicollinearity: Perfect and imperfect (near) multicollinearity

Mathematical Representation
- The concept of linear dependence among explanatory variables
- Multicollinearity in the context of the OLS estimator

Section 2: Causes of Multicollinearity

Common Sources
- Using highly correlated variables
- Dummy variable traps
- Polynomial terms and interaction effects
Data Issues
- Limited sample size and overfitting
- Data aggregation and measurement error

Section 3: Consequences of Multicollinearity

- Inflated standard errors and unreliable coefficient estimates
- Reduced statistical power of hypothesis tests
- Sensitivity to changes in model specification
Misleading Coefficient Estimates
- Changes in sign and magnitude of coefficients
- Counterintuitive results and lack of precision
Impact on Model Predictions
- Potential for overfitting and poor predictive performance
- Unstable model outputs in different samples

Section 4: Identifying and Measuring Multicollinearity

Correlation Matrix Analysis
- Identifying highly correlated predictor pairs
- Visual tools like heatmaps for correlation
Variance Inflation Factor (VIF)
- Calculation and interpretation of VIF values
- Thresholds for identifying problematic multicollinearity
Condition Index and Eigenvalues
- Calculation and interpretation of condition indices
- Relationship between eigenvalues and multicollinearity
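
The VIF and eigenvalue-based diagnostics above can be sketched with plain NumPy. The helper `vif` and the simulated regressors are hypothetical; the condition number here is computed on unit-length regressor columns without an intercept column, which is a simplification of the usual condition-index procedure.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (regressors only, no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Auxiliary regression of column j on an intercept and the other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)   # nearly collinear with x1
x3 = rng.normal(size=300)                   # unrelated regressor
X = np.column_stack([x1, x2, x3])
vifs = vif(X)

# Condition number: sqrt(largest / smallest eigenvalue) of the scaled X'X
Xs = X / np.linalg.norm(X, axis=0)          # scale columns to unit length
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)
condition_number = np.sqrt(eigvals.max() / eigvals.min())
print(np.round(vifs, 1), round(float(condition_number), 1))
```

In this design the first two VIFs are large while the third stays near one, which matches the common rule of thumb that values above roughly 10 signal a collinearity problem worth investigating.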

Section 5: Remedies for Multicollinearity

Variable Selection and Dropping Variables
- Strategies for selecting and excluding variables
- Using domain knowledge to choose relevant predictors
Combining or Transforming Variables
- Principal component analysis (PCA) and factor analysis
- Creating composite indices or sums
Regularization Techniques
- Ridge regression and Lasso regression as solutions
- Trade-offs between bias and variance
Centering and Standardization
- Centering predictor variables to reduce multicollinearity
- Standardizing variables for better interpretability

Section 6: Practical Applications

Case Study 1: Economic Data Analysis
- Multicollinearity in macroeconomic forecasting models
Case Study 2: Marketing Research
- Addressing multicollinearity in customer segmentation and profiling
Case Study 3: Medical Research
- Managing multicollinearity in clinical trial data

Section 7: Implementing Solutions in Software

R Implementation
- Step-by-step guide for detecting and addressing multicollinearity in R
- Code snippets for VIF, PCA, and regularization techniques
Python Implementation
- Using Python libraries like statsmodels, scikit-learn for multicollinearity diagnostics and remedies
Stata and Other Tools
- Quick guide to detecting multicollinearity in Stata, SPSS, and SAS

Section 8: Summary and Best Practices

Key Takeaways
- Summary of the causes, detection methods, and solutions for multicollinearity
Guidelines for Model Building
- Best practices to prevent and address multicollinearity
Further Reading and Resources
- Suggested books, articles, and online resources for a deeper understanding

Books

Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 3)
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 4)

Software

Stata: Regression with Stata Chapter 2 – Regression Diagnostics (link)
Stata: Multicollinearity Testing in Stata (link)
Eviews: Multicollinearity - Chapter 8 (link)
R: Collinearity Diagnostics, Model Fit & Variable Contribution (link)

Heteroskedasticity

Contents

Section 1: Introduction to Heteroskedasticity

Definition and Concept
- What is heteroskedasticity?
- Distinguishing heteroskedasticity from homoskedasticity
- Why heteroskedasticity matters in regression analysis
- Common scenarios where heteroskedasticity is observed (e.g., cross-sectional data)

Mathematical Representation
- The regression model with heteroskedastic errors
- Implications for the Ordinary Least Squares (OLS) estimator

Section 2: Causes of Heteroskedasticity

Common Sources
- Changes in the scale of dependent variable with the size of the predictors
- Economic or social factors leading to variability in error terms
Examples from Various Fields
- Income disparities in household data
- Variability in investment returns

Section 3: Consequences of Heteroskedasticity

- Biased standard errors leading to unreliable hypothesis tests
- Inefficiency of OLS estimators
- Impact on confidence intervals and model predictions
Implications for Model Interpretation
- Distorted t-tests and F-tests
- Misleading significance levels
Impact on Forecasting and Policy Recommendations
- The risks of biased estimates in applied research

Section 4: Detecting Heteroskedasticity

Graphical Methods
- Residual plots against predicted values
- Plotting residuals against each independent variable
Statistical Tests
- Goldfeld-Quandt test
- Breusch-Pagan test
- White test
- Park test
- Cook and Weisberg test
- Szroeter’s rank test
- Information matrix test
- Cameron and Trivedi’s decomposition
- Many other tests
Interpreting Test Results
- How to interpret the outcomes of heteroskedasticity tests
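
As a sketch of one of the tests listed above, the function below computes the studentized (Koenker) version of the Breusch-Pagan statistic as n times the R² from regressing squared OLS residuals on the regressors. The function name and the simulated designs are assumptions made for illustration; the original Breusch-Pagan statistic uses a different scaling.

```python
import numpy as np

def breusch_pagan_lm(y, X):
    """Koenker's studentized Breusch-Pagan statistic: n * R^2 from regressing
    squared OLS residuals on X (X must include an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ beta) ** 2
    gamma, *_ = np.linalg.lstsq(X, u2, rcond=None)
    fitted = X @ gamma
    r2 = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
    return len(y) * r2, X.shape[1] - 1   # statistic, degrees of freedom

rng = np.random.default_rng(3)
n = 1000
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y_het = 1.0 + 2.0 * x + rng.normal(size=n) * x   # error s.d. grows with x
y_hom = 1.0 + 2.0 * x + rng.normal(size=n)       # constant error variance

lm_het, df = breusch_pagan_lm(y_het, X)
lm_hom, _ = breusch_pagan_lm(y_hom, X)
# Compare each statistic with the chi-square(df) critical value (3.84 at 5% for df = 1).
print(round(lm_het, 1), round(lm_hom, 1), df)
```

A statistic above the critical value rejects the null of homoskedasticity; a small statistic does not prove constant variance, only that this linear form of heteroskedasticity was not detected.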

Section 5: Remedies and Solutions

Transformations
- Logarithmic transformation of the dependent variable
- Generalized least squares (GLS) and weighted least squares (WLS)
Robust Standard Errors
- Using heteroskedasticity-consistent standard errors (e.g., White’s standard errors)
- Advantages and limitations of robust standard errors
Alternative Estimation Techniques
- Feasible Generalized Least Squares (FGLS)
- Estimation with weighted observations

Section 7: Applying Solutions in Practice

Choosing the Right Remedy
- Criteria for selecting the appropriate solution based on the context
- Practical considerations for model specification and interpretation
Case Studies
- Case Study 1: Correcting heteroskedasticity in income inequality data
- Case Study 2: Using robust standard errors in a corporate finance model

Section 8: Advanced Topics in Heteroskedasticity

Modeling Heteroskedasticity Directly
- Autoregressive conditional heteroskedasticity (ARCH) and Generalized ARCH (GARCH) models
- Applications in financial time series
Panel Data Models
- Heteroskedasticity in panel data and solutions
- Applications in macroeconomic studies

Section 9: Software Implementation

Implementing Diagnostics and Remedies in R
- Step-by-step guide to performing tests and applying corrections
- Example code snippets for detecting and correcting heteroskedasticity
Practical Guide in Python
- Using Python libraries like statsmodels and scikit-learn for heteroskedasticity diagnosis and treatment
Other Software Tools
- Implementing solutions in Stata, SPSS, and SAS

Section 10: Summary and Best Practices

Recap of Key Concepts
- Summary of detection methods and remedies for heteroskedasticity
Guidelines for Applied Research
- Best practices for dealing with heteroskedasticity in econometric modeling
Further Reading and Resources
- Suggested books, articles, and software tutorials for deeper understanding

Books

Greene, W. (2012) Econometric Analysis. 7th Edition (Chapter 9)
DM, Sec. 7.5
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 5)
GB Chapter 6 - 145-175
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 8)
JD, Chapter 6
Gujarati, D. N., & Porter, D. C. (2009). Basic econometrics. McGraw-Hill. (Chapter 5)

Articles

Andrews, D. W. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica: Journal of the Econometric Society, 817-858.
Andrews, D. W., & Monahan, J. C. (1992). An improved heteroskedasticity and autocorrelation consistent covariance matrix estimator. Econometrica: Journal of the Econometric Society, 953-966.
Dale, S. B., & Krueger, A. B. (2002). Estimating the payoff to attending a more selective college: An application of selection on observables and unobservables. The Quarterly Journal of Economics, 117(4), 1491-1527. (link)
Den Haan, W. J., & Levin, A. T. (2000). Robust covariance matrix estimation with data-dependent VAR prewhitening order.
Newey, W. K., & West, K. D. (1994). Automatic lag selection in covariance matrix estimation. The Review of Economic Studies, 61(4), 631-653.

Software

Stata: Postestimation tools for regress (link)
Stata: Heteroskedasticity (link) or (link)
Stata: regress postestimation — Postestimation tools for regress (link)
Stata: hetregress — Heteroskedastic linear regression
Eviews: Basic Regression - Part E (link)

Endogeneity and the Two-Stage Least Squares (2SLS) Method

Contents

Section 1: Introduction to Endogeneity

- What is endogeneity?
- Common sources of endogeneity: omitted variables, measurement errors, simultaneity
- Bias and inconsistency in Ordinary Least Squares (OLS) estimates
- Impact on hypothesis testing and economic interpretation

Section 2: Primary reasons for endogeneity

- Omitted Variable Bias
- Measurement Error
- Simultaneity or Reverse Causality
- Dynamic Endogeneity

Other sources of bias

- Sample Selection Bias
- Self-Selection

Section 3: The Instrumental Variables (IV) Approach

Conditions for a Valid Instrument
- Relevance: correlation with the endogenous regressor
- Exogeneity: uncorrelated with the error term
- Overidentification and the need for multiple instruments
- Examples of potential instruments in empirical research

Section 4: The Two-Stage Least Squares (2SLS) Method

Introduction to 2SLS
- Concept and rationale behind 2SLS
- When and why to use 2SLS instead of OLS
The 2SLS Estimation Process
- Stage 1: Regress the endogenous variable on all exogenous variables, including instruments
- Stage 2: Replace the endogenous variable in the original equation with its predicted values from Stage 1 and estimate by OLS
- Interpretation of coefficients obtained from 2SLS
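
The two stages can be sketched by hand on simulated data. All names and parameter values below are assumptions for illustration: `z` is constructed to be a valid instrument, `x` is made endogenous by sharing the error `u`, and the true slope is set to 2.0.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
z = rng.normal(size=n)                       # instrument (relevant and exogenous by construction)
u = rng.normal(size=n)                       # structural error
x = 1.0 * z + 0.8 * u + rng.normal(size=n)   # endogenous regressor: correlated with u
y = 1.0 + 2.0 * x + u                        # true slope is 2.0 (illustrative)

const = np.ones(n)
X = np.column_stack([const, x])
Z = np.column_stack([const, z])

def ols(A, b):
    """Least squares coefficients of b on the columns of A."""
    return np.linalg.lstsq(A, b, rcond=None)[0]

beta_ols = ols(X, y)                         # inconsistent here because cov(x, u) != 0

# Stage 1: regress x on all exogenous variables (constant and instrument)
x_hat = Z @ ols(Z, x)
# Stage 2: regress y on the constant and the Stage 1 fitted values
beta_2sls = ols(np.column_stack([const, x_hat]), y)

print(np.round(beta_ols, 3), np.round(beta_2sls, 3))
```

Running the second stage as a plain OLS regression gives the right coefficients but the wrong standard errors, because its residuals are computed with the fitted rather than the actual regressor; dedicated routines such as Stata's ivregress correct for this.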

Properties of 2SLS Estimators
- Consistency and asymptotic normality
- Efficiency relative to OLS under endogeneity
Assumptions Required for 2SLS
- Linearity and no perfect multicollinearity
- Validity of instruments
- Potential violations and their consequences

Section 5: Testing the Validity of Instruments

Relevance of Instruments: Weak instruments
- First-stage F-statistics for weak instruments
Problems with Weak Instruments
- Bias in 2SLS estimates
- Reduced power of hypothesis tests
Detection and Remedies
- Testing for weak instruments using the Cragg-Donald statistic
- Using Limited Information Maximum Likelihood (LIML) and other robust methods
Exogeneity of Instruments
- Overidentification tests: Sargan-Hansen test
- Intuition and application of overidentification tests

Section 6: Practical Applications and Case Studies

Case Study 1: Demand and Supply Models
- Using instruments to estimate supply and demand functions
Case Study 2: Education and Earnings
- Addressing endogeneity in returns to education using natural experiments
Case Study 3: Policy Impact Evaluation
- Evaluating the effect of policy interventions with endogenous treatment variables

Section 7: Extensions and Advanced Topics

Generalized Method of Moments (GMM)
- GMM as a generalization of IV/2SLS methods
- Application in panel data and time series contexts
Dynamic Panel Data Models
- Addressing endogeneity in panel data using lagged instruments (Arellano-Bond estimator)
Control Function Approach
- Alternative to 2SLS for handling endogeneity

Section 8: Implementation in Statistical Software

Implementing 2SLS in R
- Step-by-step guide to 2SLS estimation in R using the ivreg package
- Code snippets and interpretation of output
Using 2SLS in Stata
- Command ivregress for 2SLS and diagnostics
- Interpretation and common pitfalls
Other Software Tools
- Implementing 2SLS in Python (statsmodels), SAS, and SPSS

Section 9: Summary and Conclusion

Key Takeaways
- Recap of the importance of addressing endogeneity
- Summary of the 2SLS approach and its applications
Best Practices
- Guidelines for selecting and testing instruments
- Practical advice for applied researchers

Books

Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapters 7 and 15)
Stock, J. H., & Watson, M. W. (2020). Introduction to econometrics. Pearson. (Chapter 12)
Cameron, A. C., and P. K. Trivedi (2005), Microeconometrics: Methods and Applications. Cambridge University Press, New York, NY. (Chapters 4 and 6.)

Articles

Acemoglu, D., Johnson, S., Robinson, J., & Thaicharoen, Y. (2002). Institutional Causes, Macroeconomic Symptoms: Volatility, Crises and Growth. (link)
Angrist, J. D. (1990). Lifetime earnings and the Vietnam era draft lottery: evidence from social security administrative records. The American Economic Review, 313-336. (link)
Angrist, J. D., & Krueger, A. B. (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15(4), 69-85. (link)
Angrist, J. D., Graddy, K., & Imbens, G. W. (2000). The interpretation of instrumental variables estimators in simultaneous equations models with an application to the demand for fish. The Review of Economic Studies, 67(3), 499-527. (link)
Ang, D. (2023). The birth of a nation: Media and racial hate. American Economic Review, 113(6), 1424-1460. (link)
Brollo, F., & Troiano, U. (2013). What happens when a woman wins an election? Evidence from Close Races in Brazil. MPRA Paper, 52244. (link)
Clarke, D., & Matta, B. (2018). Practical considerations for questionable IVs. The Stata Journal, 18(3), 663-691. (link)
Campante, F. R., & Do, Q. A. (2014). Isolated capital cities, accountability, and corruption: Evidence from US states. American Economic Review, 104(8), 2456-2481. (link)
DiNardo, J., & Lee, D. S. (2004). Economic impacts of new unionization on private sector employers: 1984–2001. The Quarterly Journal of Economics, 119(4), 1383-1441. (link)
Donald, S. G., & Newey, W. K. (2001). Choosing the number of instruments. Econometrica, 69(5), 1161-1191.
Epple, D., & Romano, R. E. (1996). Public provision of private goods. Journal of Political Economy, 104(1), 57-84. (link)
Hahn, J., & Hausman, J. (2002). A new specification test for the validity of instrumental variables. Econometrica, 70(1), 163-189.
Heckman, J. (1997). Instrumental variables: A study of implicit behavioral assumptions used in making program evaluations. Journal of Human Resources, 441-462.
Imbens, G. (2014). Instrumental variables: An econometrician's perspective (No. w19983). National Bureau of Economic Research. (link)
Krueger, A. B. (1993). How computers have changed the wage structure: evidence from microdata, 1984–1989. The Quarterly Journal of Economics, 108(1), 33-60. (link)
Słoczyński, T., Uysal, S. D., & Wooldridge, J. M. (2024). Abadie's Kappa and Weighting Estimators of the Local Average Treatment Effect. Journal of Business & Economic Statistics, 1-28.

Weak instruments

- Bound, J., D. Jaeger and R. Baker (1995), Problems with Instrumental Variables Estimation when the Correlation between the Instruments and the Endogenous Explanatory Variable is Weak, Journal of the American Statistical Association, 90(430), pp. 443-450.
- Stock, J., J. Wright, and M. Yogo (2002), A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments, Journal of Business and Economic Statistics, 20(4), 518-529.
- Hahn, J., and J. Hausman (2003), Weak Instruments: Diagnosis and Cures in Empirical Econometrics, American Economic Review, 93(2), 118-125.
- Cruz, L.M. and Moreira, J. (2005), On the Validity of Econometric Techniques with Weak Instruments. Inference on Returns to Education Using Compulsory School Attendance Laws, Journal of Human Resources, 40(2), 393-410.
- Staiger, D. O., & Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica, 65(3), 557-586.
- Stock, J. H., & Yogo, M. (2002). Testing for weak instruments in linear IV regression. (link)

Software

Stata: ivregress — Single-equation instrumental-variables regression (link)
Stata: ivregress postestimation — Postestimation tools for ivregress (link)
Eviews: two-stage least squares (link)
R: ivreg: Two-Stage Least-Squares Regression with Diagnostics (link)

The Generalized Method of Moments (GMM)

Contents

Section 1: Introduction to GMM

- Why GMM? Addressing limitations of traditional estimation methods
- Applications in econometrics and beyond
- Development and key contributors to GMM theory
- Comparison with other estimation methods (OLS, IV, ML)

Section 2: Theoretical Foundation of GMM

Basic Concepts and Notation
- Moments and moment conditions: Definition and examples
- Population moments vs. sample moments
Moment Conditions
- Deriving moment conditions from economic theory
- Examples of moment conditions in different contexts
The GMM Estimator
- Definition of the GMM estimator
- The method of moments: Classical method and limitations
- Generalizing to GMM: Using multiple moment conditions

Section 3: Objective Function and Weighting Matrix

Objective Function
- Constructing the GMM objective function
- The role of the weighting matrix
Optimal Weighting Matrix
- Properties of the optimal weighting matrix
- Two-step GMM: Estimating the optimal weighting matrix
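
As a compact reference for the objective function and weighting matrix topics above, the standard textbook formulation can be written as follows. The notation ($w_i$ for the data, $g$ for the moment function, $W_n$ for the weighting matrix, $S$ for the long-run variance of the moments) is an assumption introduced here for illustration and is not taken from the course materials.

```latex
\bar g_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} g(w_i,\theta),
\qquad
\hat\theta_{GMM} = \arg\min_{\theta}\; \bar g_n(\theta)'\, W_n\, \bar g_n(\theta),
\qquad
W_n^{opt} = \hat S^{-1}, \quad
\hat S \xrightarrow{p} S = \lim_{n\to\infty} \operatorname{Var}\!\left(\sqrt{n}\,\bar g_n(\theta_0)\right).
```

In two-step GMM, a first estimate is obtained with an arbitrary positive definite $W_n$, the residual-based estimate $\hat S$ is computed from it, and the model is re-estimated with $W_n = \hat S^{-1}$.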

Section 4: Properties of GMM Estimators

Consistency and Asymptotic Normality
- Proofs and intuitive explanation
- Conditions required for consistency and asymptotic normality
Efficiency
- Efficiency of GMM relative to other estimators
- When GMM is more efficient than OLS and IV
Robustness to Heteroskedasticity and Serial Correlation
- Robust standard errors in the context of GMM

Section 5: Implementation of GMM

Choosing Moment Conditions
- Guidelines for selecting appropriate moment conditions
- Trade-offs between over-identification and efficiency
Practical Steps in GMM Estimation
- Defining the model and identifying moment conditions
- Estimating the parameters and testing the model
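
The practical steps above can be sketched for the simplest case, a linear model with instruments. This is a minimal illustration on simulated data, not a reference implementation: the function names `gmm_linear` and `two_step_gmm`, the data-generating process, and the i.i.d. heteroskedasticity-robust estimate of the moment covariance are all assumptions made for the example.

```python
import numpy as np

def gmm_linear(y, X, Z, W):
    """Minimise g(b)' W g(b) with g(b) = Z'(y - X b) / n (closed form for a linear model)."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

def two_step_gmm(y, X, Z):
    n = len(y)
    # Step 1: initial weighting matrix (Z'Z/n)^{-1}, which reproduces 2SLS.
    b1 = gmm_linear(y, X, Z, np.linalg.inv(Z.T @ Z / n))
    # Step 2: estimate the moment covariance from first-step residuals, reweight with its inverse.
    Zu = Z * (y - X @ b1)[:, None]
    W_opt = np.linalg.inv(Zu.T @ Zu / n)
    b2 = gmm_linear(y, X, Z, W_opt)
    # Hansen's J statistic: n * g' W g evaluated at the second-step estimate.
    g = Z.T @ (y - X @ b2) / n
    return b2, n * g @ W_opt @ g

# Simulated data (illustrative only): x is endogenous, z1 and z2 are valid instruments.
rng = np.random.default_rng(0)
n = 5000
z1, z2, v, e = rng.normal(size=(4, n))
x = z1 + z2 + v
u = 0.5 * v + e                      # error correlated with x through v
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])

beta_hat, j_stat = two_step_gmm(y, X, Z)
print("two-step GMM estimate:", beta_hat)
print("J statistic:", j_stat)
```

With three moment conditions and two parameters, the J statistic would be compared with a chi-square distribution with one degree of freedom.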

Section 6: Testing and Model Specification

Overidentification Tests
- Hansen’s J-test: Purpose, computation, and interpretation
- Testing the validity of moment conditions
Hypothesis Testing in GMM Framework
- Wald tests, Lagrange Multiplier tests, and likelihood ratio tests
- Application in hypothesis testing

Section 7: Common Applications of GMM

Dynamic Panel Data Models
- Arellano-Bond estimator: GMM for dynamic panel data
- Application to economic growth and investment models
System GMM
- Introduction to System GMM and its advantages over difference GMM
- Application in panel data with persistent series
Financial Econometrics
- Estimating asset pricing models with GMM
- GMM in risk modeling and option pricing
Macroeconomic Models
- Estimating rational expectations models
- GMM in DSGE models

Section 8: Weak Instruments and Nonlinear GMM

Weak Instruments in GMM
- Detection and implications of weak instruments
- Remedies and robust estimation techniques
Nonlinear GMM
- Application of GMM in nonlinear models
- Examples from discrete choice models and nonlinear panel data models

Section 9: Practical Considerations and Challenges

Identification Issues
- Problems with weak or irrelevant moment conditions
- Strategies to improve identification
Finite Sample Properties
- Small sample bias in GMM estimators
- Use of bootstrapping techniques for inference

Section 10: Software Implementation

GMM in R
- Step-by-step guide using gmm and ivreg packages
- Example code and interpretation of output
GMM in Stata
- Using xtabond and xtdpdsys commands for panel GMM
- Practical tips and troubleshooting common errors
Other Software Tools
- GMM implementation in Python (statsmodels), SAS, and EViews

Section 11: Case Studies

Case Study 1: Estimating Consumption Functions
- Using GMM to estimate intertemporal consumption models
Case Study 2: Investment Models
- Application of GMM in estimating investment under uncertainty
Case Study 3: Financial Market Analysis
- Estimating and testing CAPM using GMM

Section 12: Summary and Best Practices

Key Takeaways
- Recap of GMM advantages, properties, and applications
Best Practices for Applied Research
- Guidelines for implementing GMM effectively
- Common pitfalls and how to avoid them
Further Reading
- Recommended books, articles, and tutorials for deeper understanding

Books

Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press. (Chapter 10-11)
AP, Chapter 4.
Wooldridge, J. M. (2000). Introductory Econometrics: A Modern Approach 4th ed. (Chapter 15)
Hayashi, F. (2000). Econometrics. Princeton University Press (Chapter 3 and 4)
Hall, A.R., ch. 4-5

Further lecture notes on GMM

Articles

Andrews, D. W., Moreira, M. J., & Stock, J. H. (2006). Optimal two‐sided invariant similar tests for instrumental variables regression. Econometrica, 74(3), 715-752.
Andrews, I., & Mikusheva, A. (2016). Conditional inference with a functional nuisance parameter. Econometrica, 84(4), 1571-1612.
Chamberlain, G. (1987). Asymptotic efficiency in estimation with conditional moment restrictions. Journal of Econometrics, 34(3), 305-334.
Hansen, L. P., & Singleton, K. J. (1982). Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica: Journal of the Econometric Society, 1269-1286.
Hansen, L. P., Heaton, J., & Yaron, A. (1996). Finite-sample properties of some alternative GMM estimators. Journal of Business & Economic Statistics, 14(3), 262-280.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica: Journal of the Econometric Society, 1029-1054.
Kitamura, Y., & Stutzer, M. (1997). An information-theoretic alternative to generalized method of moments estimation. Econometrica: Journal of the Econometric Society, 861-874.
Mavroeidis, S., Plagborg-Møller, M., & Stock, J. H. (2014). Empirical evidence on inflation expectations in the New Keynesian Phillips Curve. Journal of Economic Literature, 52(1), 124-188.
Newey, W. K. (1985). Generalized method of moments specification testing. Journal of Econometrics, 29(3), 229-256.
Newey, W. K., & West, K. D. (1986). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix.
Staiger, D. O., & Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica, 65(3), 557-586.
Stock, J. H., & Wright, J. H. (2000). GMM with weak identification. Econometrica, 68(5), 1055-1096.

Software

Stata: gmm — Generalized method of moments estimation (link)
An introduction to GMM estimation using Stata - David M. Drukker (link)
GMM in EViews (link)

The SUR Model (Seemingly Unrelated Regression)

Contents

Section 1: Introduction to Seemingly Unrelated Regression

Overview of the SUR Model
- Definition and purpose of SUR
- Historical context and development
Motivation for Using SUR
- Why SUR is used instead of separate OLS regressions
- Situations where SUR is beneficial

Section 2: Theoretical Foundation of SUR

Model Structure
- The system of equations
Correlation of Disturbances
- Importance of contemporaneous correlation between error terms
- Implications of ignoring correlated disturbances
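
For reference, the system of equations and the contemporaneous error correlation listed above are usually written as below. The symbols ($M$ equations, $T$ observations per equation, $\sigma_{ij}$ for the error covariances) are notation assumed here for illustration.

```latex
y_i = X_i \beta_i + \varepsilon_i, \quad i = 1,\dots,M,
\qquad
E[\varepsilon_i \varepsilon_j' \mid X] = \sigma_{ij} I_T,
\qquad
\operatorname{Var}(\varepsilon \mid X) = \Omega = \Sigma \otimes I_T, \quad \Sigma = [\sigma_{ij}].
```

The equations look unrelated because each has its own regressors and coefficients, but they are linked through the off-diagonal elements of $\Sigma$.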

Section 3: Estimation Methods

Ordinary Least Squares (OLS) vs. SUR
- Limitations of separate OLS estimation
- Efficiency gains from using SUR
Feasible Generalized Least Squares (FGLS)
- Derivation of the FGLS estimator for SUR
- Calculating the covariance matrix of the error terms
- Iterative estimation procedure for FGLS
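
The FGLS steps listed above can be sketched with NumPy as follows. This is a minimal, non-iterated version under stated assumptions: the function name `sur_fgls`, the simulated two-equation system, and the choice of dividing the residual cross-products by n (rather than a degrees-of-freedom correction) are all choices made for the example.

```python
import numpy as np

def sur_fgls(ys, Xs):
    """One round of FGLS for a SUR system.

    ys: list of (n,) outcome vectors, one per equation.
    Xs: list of (n, k_i) regressor matrices, one per equation.
    Returns the stacked coefficient vector and the estimated error covariance.
    """
    n, m = len(ys[0]), len(ys)
    # Step 1: equation-by-equation OLS residuals give an estimate of Sigma.
    resid = np.column_stack(
        [y - X @ np.linalg.lstsq(X, y, rcond=None)[0] for y, X in zip(ys, Xs)]
    )
    sigma = resid.T @ resid / n
    s_inv = np.linalg.inv(sigma)
    # Step 2: GLS on the stacked system with Omega = Sigma kron I_n.
    # Block (i, j) of X' Omega^{-1} X equals s_inv[i, j] * X_i' X_j.
    ks = [X.shape[1] for X in Xs]
    off = np.cumsum([0] + ks)
    A = np.zeros((off[-1], off[-1]))
    c = np.zeros(off[-1])
    for i in range(m):
        for j in range(m):
            A[off[i]:off[i + 1], off[j]:off[j + 1]] = s_inv[i, j] * Xs[i].T @ Xs[j]
            c[off[i]:off[i + 1]] += s_inv[i, j] * Xs[i].T @ ys[j]
    return np.linalg.solve(A, c), sigma

# Simulated two-equation system with correlated errors (illustrative only).
rng = np.random.default_rng(1)
n = 2000
errors = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=n)
x1, x2 = rng.normal(size=(2, n))
X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([np.ones(n), x2])
y1 = X1 @ np.array([1.0, 2.0]) + errors[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + errors[:, 1]

beta_hat, sigma_hat = sur_fgls([y1, y2], [X1, X2])
print("SUR coefficients:", beta_hat)
print("estimated error covariance:\n", sigma_hat)
```

Iterated FGLS would recompute the residuals from `beta_hat`, re-estimate the covariance, and repeat until the coefficients stop changing.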

Section 4: Properties of the SUR Estimator

Consistency and Efficiency
- Conditions under which SUR is more efficient than OLS
- Comparison of SUR and seemingly independent regression (SIR) models
Asymptotic Properties
- Asymptotic normality and consistency of the SUR estimator

Section 5: Model Testing and Validation

Testing for the Need of SUR
- Breusch-Pagan test for independence of errors
- Likelihood ratio test for joint estimation
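
The Breusch-Pagan test for a diagonal error covariance matrix can be sketched as below. The function name `breusch_pagan_lm` and the simulated residuals are assumptions for illustration; in practice the input would be the matrix of OLS residuals from each equation, which have zero mean when every equation includes an intercept.

```python
import numpy as np
from scipy.stats import chi2

def breusch_pagan_lm(residuals):
    """LM test of zero contemporaneous correlation across equations.

    residuals: (n, m) array of OLS residuals, one column per equation.
    The statistic is n times the sum of squared pairwise residual correlations,
    compared with a chi-square distribution with m(m-1)/2 degrees of freedom.
    """
    n, m = residuals.shape
    corr = np.corrcoef(residuals, rowvar=False)
    lm = n * np.sum(np.tril(corr, k=-1) ** 2)
    df = m * (m - 1) // 2
    return lm, chi2.sf(lm, df)

# Simulated residuals with correlation 0.5 (illustrative only).
rng = np.random.default_rng(2)
resid = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=500)
lm_stat, p_value = breusch_pagan_lm(resid)
print("LM statistic:", lm_stat, "p-value:", p_value)
```

A small p-value suggests the errors are correlated across equations, which is the case where joint estimation can improve on separate OLS regressions.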
Hypothesis Testing in SUR Framework
- Joint hypothesis testing across multiple equations
- Testing for parameter equality across equations

Section 6: Applications of the SUR Model

Economic and Financial Applications
- Estimating systems of demand equations
- Simultaneous modeling of asset returns
Cross-Sectional Analysis
- Application to household surveys with multiple outcomes
Time-Series Applications
- Modeling interrelated economic indicators

Section 7: Special Topics and Extensions

Panel Data and SUR
- SUR in the context of panel data models
- Combining SUR with fixed and random effects
Nonlinear SUR Models
- Introduction to nonlinear systems with correlated errors
- Examples and estimation techniques
Dynamic SUR Models
- Extending SUR to dynamic models with lagged variables
- Applications in macroeconomic modeling

Section 8: Practical Implementation of SUR

Implementation in R
- Using the systemfit package for SUR estimation
- Example code and interpretation of results
Implementation in Stata
- Step-by-step guide using sureg command
- Practical tips and troubleshooting common issues
Other Software Tools
- Implementing SUR in SAS and EViews

Section 9: Case Studies

Case Study 1: Demand Systems
- Application of SUR to estimate demand functions for multiple goods
Case Study 2: Labor Economics
- Modeling wage determination and labor supply jointly
Case Study 3: Financial Econometrics
- Joint modeling of returns on multiple asset classes

Section 10: Limitations and Considerations

When Not to Use SUR
- Situations where SUR does not offer efficiency gains
- Practical limitations in the presence of weak correlations
Multicollinearity in SUR
- Impact of multicollinearity among independent variables
- Solutions and diagnostic tools

Section 11: Summary and Best Practices

Key Takeaways
- Summary of the advantages and applications of SUR
- Guidelines for choosing between SUR and other estimation methods
Practical Recommendations
- Best practices for implementing SUR in empirical research
- Common pitfalls and how to avoid them
Further Reading
- Suggested books, articles, and resources for deeper understanding

Articles

Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association, 57(298), 348-368.

Software

Stata: sureg - Zellner's seemingly unrelated regression (link) or (link)
Stata: suest — Seemingly unrelated estimation (link)

Further reading

Seemingly Unrelated Regressions

Simultaneous Equations Models

Contents

Topic 1: Introduction to Simultaneous Equations Models

Overview of Econometric Models
- Definition of SEMs
- Importance in economic analysis
Historical Background
- Development of SEMs in econometrics
- Key contributions and theorists

Topic 2: Theoretical Foundations

Basic Concepts
- Endogeneity and exogeneity
- Identification problem
Structural and Reduced Forms
- Derivation and interpretation
- Difference between structural and reduced forms

Topic 3: Identification

Conditions for Identification
- Order and rank conditions
- Theoretical criteria for model identification
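
The order condition lends itself to a small counting exercise. The helper below is a hypothetical illustration (the function name and argument names are not from the course materials) of the usual textbook rule: an equation satisfies the order condition when the number of exogenous variables excluded from it is at least the number of endogenous variables it contains minus one. The condition is necessary but not sufficient; the rank condition still has to be checked.

```python
def order_condition(total_exog, included_exog, included_endog):
    """Classify one structural equation by the order condition.

    total_exog:     number of exogenous variables in the whole system
    included_exog:  number of exogenous variables appearing in this equation
    included_endog: number of endogenous variables in this equation (including the left-hand side)
    """
    excluded = total_exog - included_exog
    needed = included_endog - 1
    if excluded < needed:
        return "under-identified"
    if excluded == needed:
        return "exactly identified"
    return "over-identified"

# Supply-demand example: the system has two exogenous variables (income, cost).
# The demand equation contains quantity and price (endogenous) and income (exogenous).
demand_status = order_condition(total_exog=2, included_exog=1, included_endog=2)
print("demand equation:", demand_status)
```

Here the constant term is ignored in the counts, since it appears in every equation and does not help with identification.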
Problems and Solutions
- Common pitfalls and how to avoid them
- Practical examples of identification issues

Topic 4: Estimation Methods

Ordinary Least Squares (OLS) Limitations
- Why OLS fails in the context of SEMs
Instrumental Variables (IV)
- Concept and application
- Choosing valid instruments
Two-Stage Least Squares (2SLS)
- Implementation and examples
- Comparison with other estimators
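
A minimal sketch of 2SLS on a simulated supply-demand system follows. The structural parameters, variable names, and the helper `two_stage_least_squares` are assumptions chosen for illustration; the point is only to contrast OLS and 2SLS estimates of the demand equation when price is determined jointly with quantity.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """2SLS: regress X on the instruments Z, then regress y on the fitted values."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first stage
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]    # second stage

# Simulated market (illustrative only):
#   demand: q = 10 - 1.0 * p + 0.5 * income + u_d
#   supply: q =  2 + 1.5 * p - 1.0 * cost   + u_s
rng = np.random.default_rng(3)
n = 5000
income, cost, u_d, u_s = rng.normal(size=(4, n))
price = (8.0 + 0.5 * income + cost + u_d - u_s) / 2.5   # equilibrium (reduced form) price
quantity = 10.0 - 1.0 * price + 0.5 * income + u_d

# Demand equation: price is endogenous; cost is excluded from demand and serves as the instrument.
X = np.column_stack([np.ones(n), price, income])
Z = np.column_stack([np.ones(n), income, cost])

b_ols = np.linalg.lstsq(X, quantity, rcond=None)[0]
b_2sls = two_stage_least_squares(quantity, X, Z)
print("OLS  price coefficient:", b_ols[1])
print("2SLS price coefficient:", b_2sls[1])
```

Because the demand shock `u_d` also moves the equilibrium price, the OLS price coefficient is pulled toward zero in this simulation, while 2SLS uses only the variation in price that comes from `cost`.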
Three-Stage Least Squares (3SLS) and Full Information Maximum Likelihood (FIML)
- Advanced estimation techniques
- When to use each method

Topic 5: Specification and Testing

Model Specification
- Choosing the correct model structure
- Tests for structural stability
Hypothesis Testing
- Tests for over-identifying restrictions
- Tests for endogeneity

Topic 6: Applications of SEMs

Economic Models
- Supply and demand systems
- Keynesian models of income determination
Policy Analysis
- Impact of fiscal and monetary policy
- Case studies in policy evaluation
Financial Markets
- SEMs in asset pricing
- Risk and return models

Topic 7: Extensions and Recent Developments

Dynamic Models
- Introduction to dynamic SEMs
- Application in time series data
Panel Data
- Panel SEMs
- Differences and challenges compared to cross-sectional data
Nonlinear SEMs
- Nonlinear models and estimation techniques
- Case studies and applications

Topic 8: Practical Issues in SEMs

Computational Considerations
- Software options for SEM analysis
- Example codes and algorithms
Data Issues
- Data collection and preparation challenges
- Dealing with missing data and measurement error

Topic 9: Case Studies

Detailed Walkthroughs of SEM Analysis
- Real-world examples from economics and finance
- Step-by-step analysis and interpretation

Reference lectures

Simultaneous Equations Models (link)
Simultaneous Equations Models: what are they and how are they estimated - Omar M. G. Keshk (link)

Books

Gujarati, D. N. (2009). Basic econometrics - Chapter 18 (link)
Maddala, G. S. (1983). Limited-dependent and qualitative variables in econometrics (Vol. 149). Cambridge University Press. (link)

Articles

Clemens, M. A. (2007). Do visas kill? Health effects of African health professional emigration. Center for Global Development Working Paper, (114). (link)
Graddy, K. (1995). Testing for imperfect competition at the Fulton fish market. The RAND Journal of Economics, 75-92. (link)

Software

Stata: reg3 — Three-stage estimation for systems of simultaneous equations (link)
Eviews: Dealing with simultaneous equation in Eviews (link)
R: systemfit - A Package for Estimating Systems of Simultaneous Equations (link) (link)
Estimating and interpreting structural equation models in Stata 12 - David M. Drukker (link)

Sample Selection Bias

Contents

Section 1: Introduction to Sample Selection Bias

- What is sample selection bias?
- How sample selection bias differs from other biases (e.g., omitted variable bias, measurement error)
- Consequences of ignoring sample selection bias
- Real-world examples where sample selection bias affects results
Historical Background and Development
- Key contributions to the theory of sample selection bias
- Overview of seminal works, including Heckman’s 1979 paper

Section 2: Causes of Sample Selection Bias

Non-Random Sample Selection
- Overview of non-random selection mechanisms
- Examples: Survey non-response, self-selection, treatment effects
Truncation and Censoring
- Understanding truncation in samples (e.g., left and right truncation)
- Censored data and its impact on estimation
Endogenous Sampling
- Selection based on the outcome variable
- Behavioral and economic factors leading to self-selection

Section 3: Theoretical Framework of Sample Selection Bias

The Econometric Model with Selection Bias
- Mathematical representation of selection bias
- The classical regression model vs. the model with sample selection
Selection Equation and Outcome Equation
- The role of the selection equation in modeling sample selection
- Separating the selection and outcome equations in empirical models
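
The selection and outcome equations are commonly written as below. The notation ($z_i$ for selection covariates, $x_i$ for outcome covariates, $\rho$ for the error correlation, $\lambda$ for the inverse Mills ratio) and the joint normality of the errors are the standard textbook assumptions, stated here for illustration.

```latex
s_i = \mathbf{1}\{ z_i'\gamma + v_i > 0 \}, \qquad
y_i = x_i'\beta + u_i \ \text{ observed only if } s_i = 1,
\qquad
(u_i, v_i) \sim N\!\left(0,
\begin{pmatrix} \sigma_u^2 & \rho\sigma_u \\ \rho\sigma_u & 1 \end{pmatrix}\right),

E[\,y_i \mid x_i, z_i, s_i = 1\,] = x_i'\beta + \rho\sigma_u\, \lambda(z_i'\gamma),
\qquad
\lambda(c) = \frac{\phi(c)}{\Phi(c)}.
```

The extra term $\rho\sigma_u \lambda(z_i'\gamma)$ is what OLS on the selected sample omits, which is why selection can be viewed as an omitted variable problem.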
Bias in Ordinary Least Squares (OLS) Estimators
- Demonstrating how OLS estimators become biased and inconsistent
- Graphical illustration of the impact of sample selection bias on regression lines

Section 4: Detection and Diagnosis of Sample Selection Bias

Visual and Descriptive Techniques
- Inspecting data for potential selection bias (e.g., histograms, scatterplots)
- Comparing characteristics of included vs. excluded observations
Statistical Tests for Selection Bias
- Heckman’s two-step test for sample selection bias
- Other statistical tests and methods to diagnose bias
Identifying Selection Mechanisms
- Using theory and data to hypothesize selection mechanisms
- Testing for common selection processes in empirical studies

Section 5: The Heckman Two-Step Correction Method

Overview of the Heckman Model
- Introduction to Heckman’s correction procedure
- Theoretical basis and mathematical derivation
Step 1: Estimating the Selection Equation
- Probit model for the selection equation
- Calculating the inverse Mills ratio (IMR)
Step 2: Correcting the Outcome Equation
- Incorporating the IMR into the outcome equation
- Interpretation of coefficients and the selection term
Properties of the Heckman Estimator
- Consistency, efficiency, and potential limitations of the Heckman method
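
The two steps can be sketched with NumPy and SciPy as follows. This is an illustrative sketch on simulated data: the function names `probit_fit` and `heckman_two_step`, the data-generating process, and the use of a generic BFGS optimiser for the probit are assumptions. The second-step OLS standard errors are not computed because they would need a correction for the estimated inverse Mills ratio.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_fit(d, Z):
    """Probit MLE for P(d = 1 | Z) = Phi(Z @ gamma)."""
    def negloglik(gamma):
        idx = Z @ gamma
        return -np.sum(d * norm.logcdf(idx) + (1 - d) * norm.logcdf(-idx))
    return minimize(negloglik, np.zeros(Z.shape[1]), method="BFGS").x

def heckman_two_step(y, X, d, Z):
    # Step 1: probit for selection, then the inverse Mills ratio phi/Phi at the fitted index.
    gamma = probit_fit(d, Z)
    idx = Z @ gamma
    imr = norm.pdf(idx) / norm.cdf(idx)
    # Step 2: OLS on the selected sample with the IMR added as a regressor.
    sel = d == 1
    X_aug = np.column_stack([X[sel], imr[sel]])
    coef = np.linalg.lstsq(X_aug, y[sel], rcond=None)[0]
    return coef[:-1], coef[-1], gamma

# Simulated data (illustrative only): w enters selection but not the outcome.
rng = np.random.default_rng(4)
n = 5000
x, w = rng.normal(size=(2, n))
errs = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=n)
u, v = errs[:, 0], errs[:, 1]
d = (0.5 + x + w + v > 0).astype(float)
y = np.where(d == 1, 1.0 + 2.0 * x + u, np.nan)   # outcome observed only when selected
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), x, w])

beta_hat, lambda_coef, gamma_hat = heckman_two_step(y, X, d, Z)
print("outcome coefficients:", beta_hat)
print("coefficient on the inverse Mills ratio:", lambda_coef)
```

In this setup the coefficient on the inverse Mills ratio estimates the product of the error correlation and the outcome error standard deviation, so a value far from zero points to selection on unobservables.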

Section 6: Alternative Methods to Correct for Sample Selection Bias

Full Information Maximum Likelihood (FIML) Estimation
- Introduction to FIML in the context of sample selection models
- Comparison of FIML and Heckman’s two-step method
Control Function Approach
- Using control functions to address endogeneity and selection bias
- Practical implementation and interpretation
Propensity Score Matching (PSM)
- Using matching techniques to address selection on observables
- Application of PSM in observational studies

Section 7: Extensions and Advanced Topics

Panel Data Models with Sample Selection Bias
- Modeling sample selection in panel data frameworks
- Dynamic sample selection models
Multiple Equation Selection Models
- Simultaneous modeling of multiple outcomes with selection bias
- Applications in household surveys and labor economics
Nonparametric and Semiparametric Methods
- Introduction to nonparametric approaches to correct for selection bias
- Kernel methods and matching estimators

Section 8: Practical Applications and Case Studies

Case Study 1: Labor Market Outcomes
- Correcting for sample selection bias in wage equations
- Using the Heckman model to analyze labor force participation
Case Study 2: Health Economics
- Addressing selection bias in studies of healthcare utilization
- Application of sample selection models to insurance claims data
Case Study 3: Educational Research
- Analyzing returns to education with selection bias
- Dealing with selection into schooling levels

Section 9: Implementation in Statistical Software

Stata Implementation
- Using the heckman command for Heckman’s two-step model
- Step-by-step guide with examples and interpretation
R Implementation
- Implementing the Heckman model using the sampleSelection package
- Example code and interpretation of output
Other Software Tools
- Implementing sample selection models in SAS, Python (statsmodels), and SPSS

Section 10: Common Pitfalls and Best Practices

Misinterpretation of the Selection Coefficient
- Common errors in interpreting the inverse Mills ratio
Overreliance on Parametric Assumptions
- Limitations of parametric approaches in the presence of non-normality
Choosing the Correct Selection Model
- Guidelines for selecting the appropriate correction method
- Practical advice for empirical researchers

Section 11: Summary and Conclusions

Recap of Key Concepts
Implications for Applied Research
- Importance of addressing sample selection bias in empirical studies
- Future directions in sample selection research

Books

Greene, William H. 2018. Econometric Analysis, Eighth Edition. Pearson. (Chapter 19.1-19.4)
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, Second Edition. MIT Press. (Chapter 17, 19.1-19.6, 19.9)

Articles

Heckman, J. J. (1977). Sample selection bias as a specification error (with an application to the estimation of labor supply functions) (Vol. 172). Cambridge, MA: National Bureau of Economic Research.
Vella, F. (1998). Estimating models with sample selection bias: a survey. Journal of Human Resources, 127-169.
Bourguignon, F., Fournier, M., & Gurgand, M. (2007). Selection bias corrections based on the multinomial logit model: Monte Carlo comparisons. Journal of Economic Surveys, 21(1), 174-205.

Software

Stata: heckman — Heckman selection model (link)
Stata: xtheckman - Heckman selection model for panel data (link)

Regression with truncated and censored data

Contents

Section 1: Introduction to Truncated and Censored Data

- What are truncated data?
- What are censored data?
- Differences between truncation and censoring
- Real-world examples in various fields such as economics, healthcare, and finance
- Why standard OLS is inappropriate for truncated and censored data
- Consequences of ignoring truncation and censoring

Section 2: Theoretical Framework of Truncated Data

Understanding Truncated Data
- Definition and types: left truncation, right truncation, and interval truncation
Probability Density Function for Truncated Data
- Adjusted probability distribution for truncated samples
- Implications for likelihood functions and estimation
Truncated Regression Models
- Specification and estimation of the truncated regression model
- Maximum Likelihood Estimation (MLE) for truncated data
Consequences of Truncation on Estimation
- Bias and inconsistency in OLS estimates
- Examples illustrating the effect of truncation on parameter estimates

Section 3: Theoretical Framework of Censored Data

Understanding Censored Data
- Definition and types: left censoring, right censoring, interval censoring, and top-coding
Probability Density Function for Censored Data
- Adjusted probability distribution for censored samples
- Likelihood functions and censored data
The Tobit Model for Censored Data
- Introduction to the Tobit model: specification and estimation
- Interpretation of Tobit model coefficients
- Limitations and assumptions of the Tobit model
Consequences of Censoring on Estimation
- Bias and inefficiency in OLS estimates
- Impact of censoring on the variance and mean of estimates

Section 4: Estimation Techniques for Truncated Data

Maximum Likelihood Estimation (MLE) for Truncated Models
- Step-by-step guide to MLE for truncated regression
- Likelihood functions for left, right, and interval truncation
- Estimation and interpretation of parameters
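
For the left-truncated case, the MLE steps above can be sketched as follows. The function names, the truncation point of zero, and the simulated data are assumptions for illustration; the log-likelihood divides the normal density by the probability of being above the truncation point.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def truncreg_negloglik(params, y, X, lower):
    """Negative log-likelihood for a normal regression observed only when y > lower."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)           # optimise log(sigma) so that sigma stays positive
    xb = X @ beta
    loglik = (norm.logpdf((y - xb) / sigma) - log_sigma
              - norm.logcdf((xb - lower) / sigma))   # log P(y > lower | x)
    return -loglik.sum()

def truncreg_fit(y, X, lower=0.0):
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS as starting values
    start = np.append(b_ols, np.log(np.std(y - X @ b_ols)))
    res = minimize(truncreg_negloglik, start, args=(y, X, lower), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])

# Simulated data (illustrative only): observations with y* <= 0 are never seen.
rng = np.random.default_rng(5)
n = 6000
x = rng.normal(size=n)
y_star = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)
keep = y_star > 0
y = y_star[keep]
X = np.column_stack([np.ones(keep.sum()), x[keep]])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_mle, sigma_mle = truncreg_fit(y, X)
print("OLS slope on truncated sample:", b_ols[1])
print("truncated-regression MLE slope:", b_mle[1], "sigma:", sigma_mle)
```

In this simulation the OLS slope on the truncated sample is attenuated relative to the true value of 2, while the truncated-regression MLE accounts for the missing lower tail.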
Applications of Truncated Models
- Real-world applications: labor market analysis, healthcare data, and survival analysis
- Case study: Estimating income distribution with truncated data

Section 5: Estimation Techniques for Censored Data

Maximum Likelihood Estimation (MLE) for Censored Models
- Step-by-step guide to MLE for censored regression
- Likelihood functions for different types of censoring
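
For left-censoring at zero, the likelihood combines a probability mass for censored observations with a density for uncensored ones. The sketch below illustrates this with simulated data; the function names, the censoring point, and the data-generating process are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, y, X, lower):
    """Negative log-likelihood for a regression left-censored at `lower`."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)           # optimise log(sigma) so that sigma stays positive
    xb = X @ beta
    cens = y <= lower
    # Censored observations contribute P(y* <= lower | x).
    ll_cens = norm.logcdf((lower - xb[cens]) / sigma)
    # Uncensored observations contribute the normal density of y.
    ll_obs = norm.logpdf((y[~cens] - xb[~cens]) / sigma) - log_sigma
    return -(ll_cens.sum() + ll_obs.sum())

def tobit_fit(y, X, lower=0.0):
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS as starting values
    start = np.append(b_ols, np.log(np.std(y - X @ b_ols)))
    res = minimize(tobit_negloglik, start, args=(y, X, lower), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])

# Simulated data (illustrative only): the latent outcome is recorded as 0 when negative.
rng = np.random.default_rng(6)
n = 4000
x = rng.normal(size=n)
y_star = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)
y = np.maximum(y_star, 0.0)
X = np.column_stack([np.ones(n), x])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_tobit, sigma_tobit = tobit_fit(y, X)
print("OLS slope on censored data:", b_ols[1])
print("Tobit slope:", b_tobit[1], "sigma:", sigma_tobit)
```

The Tobit coefficients describe the latent outcome; marginal effects on the observed, censored outcome are smaller and depend on the probability of being uncensored.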
The Tobit Model in Practice
- Estimation and interpretation of the Tobit model
- Extensions of the Tobit model: Type I, Type II, and Type III Tobit models
- Limitations and alternative models (e.g., censored least absolute deviations)
Two-Part and Hurdle Models
- Introduction to two-part models for semi-continuous data
- Hurdle models as alternatives to the Tobit model
- Applications and case studies

Section 6: Dealing with Both Truncation and Censoring

Combined Models
- Handling data that are both truncated and censored
- Specification and estimation of models that combine truncation and censoring
Selection Models for Truncated and Censored Data
- Heckman’s sample selection model for censored and truncated data
- Application in labor market studies and treatment effect analysis

Section 7: Advanced Topics and Extensions

Panel Data Models with Truncation and Censoring
- Handling truncation and censoring in panel data contexts
- Dynamic models with censored and truncated data
Nonparametric and Semiparametric Approaches
- Nonparametric methods for censored and truncated data
- Semiparametric estimators: Cox proportional hazards model for censored data
Copula-Based Models
- Modeling dependencies in truncated and censored data using copulas
- Applications in finance and risk management

Section 8: Practical Applications and Case Studies

Case Study 1: Income and Expenditure Analysis
- Analyzing expenditure data with top-coding using the Tobit model
Case Study 2: Duration Models
- Modeling time to event with right-censored and left-truncated data
Case Study 3: Healthcare Utilization
- Modeling healthcare expenditure with censored data
Case Study 4: Financial Markets
- Modeling truncated returns data in financial markets

Section 9: Implementation in Statistical Software

Implementation in Stata
- Using the truncreg and tobit commands
- Step-by-step guide with example datasets
Implementation in R
- Using the censReg package for censored models and the truncreg package for truncated models
- Example code and interpretation of output
Implementation in Other Software
- Using SAS, SPSS, and Python (statsmodels) for truncated and censored models

Section 10: Summary and Conclusions

Key Takeaways
- Summary of methods and their applications for handling truncated and censored data
Best Practices for Empirical Research
- Guidelines for choosing and implementing the appropriate model
Future Directions
- Emerging techniques and ongoing research in handling truncated and censored data

Books

Greene, William H. 2018. Econometric Analysis, Eighth Edition. Pearson. (Chapter 19.1-19.4)
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, Second Edition. MIT Press. (Chapter 17, 19.1-19.6, 19.9)
Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications.

Articles