How to interpret interaction dummies of multiple categories and main effect


























I have a cross-country panel regression with the following structure ($y$ is the drug addiction rate of the country, $x$ the number of homeless people in the country, and $m$ the HIV infection rate of the country). I categorize my countries into four world regions, which I code as dummies $D_1$, $D_2$, $D_3$, with the fourth region as the reference category:

$y = b_1x + b_2m + b_3D_1m + b_4D_2m + b_5D_3m$ (1)

When I change my base category, every coefficient and significance value except $b_1$ changes.

When I change my regression to:

$y = b_1x + b_3D_1m + b_4D_2m + b_5D_3m + b_6D_4m$ (2)

each interaction coefficient in (2) equals the $b_2$ I get from regression (1) when the corresponding region is the reference category, with the same significance value.

Now I don't understand what I am seeing. Is the main-effect coefficient $b_2$ the effect for the reference category rather than the mean effect of the HIV infection rate? What does my main-effect coefficient $b_2$ say? In regression (1), why do the significance values of $b_3$, $b_4$, and $b_5$ change when I change my reference category, and what does the significance of $b_3$, $b_4$, and $b_5$ mean with regard to my main effect $b_2$? I am completely confused right now.



Best regards,
Rub_n






























  • How are you modelling the error terms? What kind of model is this? Ordinary least squares? Logistic regression?
    – StatsStudent
    5 hours ago










  • Do you really have crosscountry data or is this supposed to be cross-sectional?
    – StatsStudent
    5 hours ago










  • I use an OLS regression with group and time fixed effects. Yes I have crosscountry data.
    – Rub_n
    5 hours ago





































regression mean interpretation categorical-encoding














edited 5 hours ago by StatsStudent
asked 5 hours ago by Rub_n











1 Answer














Consider a model with only 3 regions and hence two dummies $D_1$ and $D_2$. Assume the data is a cross-country panel, so $i=1,\dots,n$ indexes countries and $t$ the time period. Let the model equation be

$$y_{it} = b_1 x_{it} + b_2 m_{it} + b_3 D_1 m_{it} + b_4 D_2 m_{it} + \epsilon_{it}$$

implying that the conditional expected rate of drug addiction is

$$\mathbb E[y \lvert \text{data}] = b_1 x_{it} + b_2 m_{it} + b_3 D_1 m_{it} + b_4 D_2 m_{it}$$

Hence the model allows different regions to have different marginal effects of the HIV infection rate $m$ on the drug addiction rate $y$: their drug addiction rate may respond differently to a change in the HIV infection rate than it does in the reference region.



For the reference region, $D_1=D_2=0$, the conditional expectation reduces to

$$\mathbb E[y \lvert \text{data}] = b_1 x_{it} + b_2 m_{it}$$

Differentiating with respect to $m_{it}$ gives

$$\frac{\partial \mathbb E[y \lvert \text{data}]}{\partial m_{it}} = b_2$$

which is the marginal effect of the HIV infection rate $m$ on the drug addiction rate $y$ for countries in the reference region. An increase of one unit in the HIV infection rate in a country $i$ from the reference region results in a change of $b_2$ units in the expected drug addiction rate of country $i$.



For countries from the region defined by $D_1=1$ and $D_2=0$ the conditional expectation is

$$\mathbb E[y \lvert \text{data}] = b_1 x_{it} + b_2 m_{it} + b_3 m_{it}$$

and the marginal effect is

$$\frac{\partial \mathbb E[y \lvert \text{data}]}{\partial m_{it}} = b_2 + b_3$$

Hence $b_3$ is the difference between the marginal effect of the HIV infection rate $m$ on the drug addiction rate $y$ for countries in region $D_1=1$ and that for the reference region, for which the marginal effect was simply $b_2$. So if $b_3$ is positive (given a positive $b_2$), the drug addiction rate in countries from region $D_1=1$ responds more strongly to changes in the HIV infection rate than it does in the reference region.
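As a quick numerical check of this interpretation, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the sample size, the region-specific slopes in `true_slopes`, the coefficient 1.5 on `x`, and the noise level. It also uses plain pooled OLS without the fixed effects mentioned in the comments. The fitted $b_2$ should land close to the reference-region slope, and $b_2 + b_3$ close to the slope of region $D_1=1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 900

# Hypothetical data: region 0 is the reference, regions 1 and 2 get dummies.
region = rng.integers(0, 3, size=n)
x = rng.normal(size=n)
m = rng.uniform(0, 10, size=n)
true_slopes = np.array([0.5, 0.8, 0.2])  # assumed effect of m in each region
y = 1.5 * x + true_slopes[region] * m + rng.normal(scale=0.1, size=n)

d1 = (region == 1).astype(float)
d2 = (region == 2).astype(float)

# Columns follow the model equation: x, m, D1*m, D2*m (no intercept).
X = np.column_stack([x, m, d1 * m, d2 * m])
b1, b2, b3, b4 = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"b2 (reference-region slope): {b2:.3f}")
print(f"b2 + b3 (region 1 slope):    {b2 + b3:.3f}")
print(f"b2 + b4 (region 2 slope):    {b2 + b4:.3f}")
```

The interaction coefficients $b_3$ and $b_4$ on their own are not slopes; only their sums with $b_2$ are.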



So $b_2$ measures the change in the drug addiction rate resulting from a 1 unit increase in the HIV infection rate $m$ for the countries in the reference region. The value of $b_3$ changes when you change the reference category because it is the difference in the marginal effect between some region (here $D_1=1$) and the reference, and of course that difference depends on which region it is compared to. A significant $b_3$ means that you can reject the null hypothesis that countries from region $D_1=1$ have the same marginal effect as countries from the reference region.
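To see why only the labelling changes when the reference category changes, the sketch below fits the same model twice on simulated data, once with region 0 and once with region 1 as the reference. The data-generating values and the helper `fit_with_reference` are hypothetical, and the fit is again plain pooled OLS without fixed effects. The coefficient on `x`, the fitted values, and the implied slope of $m$ in every region should agree across the two fits; only the way those slopes are split into a "main effect" and "differences" is different.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 600

# Hypothetical data with three regions and region-specific slopes of m.
region = rng.integers(0, 3, size=n)
x = rng.normal(size=n)
m = rng.uniform(0, 10, size=n)
y = 1.5 * x + np.array([0.5, 0.8, 0.2])[region] * m + rng.normal(size=n)


def fit_with_reference(ref):
    """OLS of y on x, m, and D_k*m for every region k other than `ref`."""
    others = [k for k in range(3) if k != ref]
    X = np.column_stack([x, m] + [(region == k) * m for k in others])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef, X @ coef


coef_ref0, fitted_ref0 = fit_with_reference(0)  # interactions: regions 1, 2
coef_ref1, fitted_ref1 = fit_with_reference(1)  # interactions: regions 0, 2

# Implied slope of m in each region, recovered from either parameterisation.
slopes_ref0 = {0: coef_ref0[1],
               1: coef_ref0[1] + coef_ref0[2],
               2: coef_ref0[1] + coef_ref0[3]}
slopes_ref1 = {1: coef_ref1[1],
               0: coef_ref1[1] + coef_ref1[2],
               2: coef_ref1[1] + coef_ref1[3]}

print("coefficient on x:", coef_ref0[0], coef_ref1[0])
for k in range(3):
    print(f"slope of m in region {k}:", slopes_ref0[k], slopes_ref1[k])
```

Note that the interaction for region 1 against reference 0 is the negative of the interaction for region 0 against reference 1, which is why its sign and value move with the choice of reference.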



In the second model there is no reference category, so the coefficients $b_3, b_4, b_5$ and $b_6$ are region-specific marginal effects (not differences in the marginal effect). The purpose of this model is to let you test whether the HIV infection rate has a significant marginal effect on the drug addiction rate in each region simply by testing the significance of the coefficients. To test for differences between regions in this model you have to test differences in coefficients, for example $H_0: b_3 = b_4$, which can easily be done with a Wald test. In model (1), by contrast, the comparison between a region and the reference region in the responsiveness of the drug addiction rate to the HIV infection rate is performed simply by testing the significance of a single coefficient.
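The equivalence between the two ways of testing can be sketched numerically. The example below is an illustration under stated assumptions, not the asker's actual setup: simulated data, three regions, pooled OLS with the classical homoskedastic covariance matrix, and a hypothetical helper `ols`. It computes the Wald statistic for equal slopes in regions 1 and 2 from model (2), and the squared t-statistic of the corresponding interaction term in model (1) with region 2 as the reference. Because the two models are reparameterisations of each other, the two numbers should coincide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 600

# Hypothetical data with three regions and region-specific slopes of m.
region = rng.integers(0, 3, size=n)
x = rng.normal(size=n)
m = rng.uniform(0, 10, size=n)
y = 1.5 * x + np.array([0.5, 0.8, 0.2])[region] * m + rng.normal(size=n)


def ols(X, y):
    """Return OLS coefficients and the classical (homoskedastic) covariance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return coef, sigma2 * XtX_inv


# Model (2): one slope of m per region, no reference category.
X2 = np.column_stack([x] + [(region == k) * m for k in range(3)])
coef2, cov2 = ols(X2, y)

# Wald statistic for H0: slope in region 1 equals slope in region 2.
R = np.array([0.0, 0.0, 1.0, -1.0])
wald = (R @ coef2) ** 2 / (R @ cov2 @ R)

# Model (1) with region 2 as reference: columns x, m, D0*m, D1*m.
X1 = np.column_stack([x, m, (region == 0) * m, (region == 1) * m])
coef1, cov1 = ols(X1, y)
t_squared = coef1[3] ** 2 / cov1[3, 3]

print("Wald statistic from model (2):   ", wald)
print("squared t-statistic in model (1):", t_squared)
```

With a single restriction, the Wald statistic can be compared with a chi-square distribution with one degree of freedom (5% critical value about 3.84). With fixed effects or clustered errors the covariance matrix would differ, but the same logic applies as long as both models use the same covariance estimator.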





























  • Oh man, that really helps! So the significance of b2 is the significance of the effect of region 3 on my drug addiction rate?
    – Rub_n
    5 hours ago












  • b_2 measures the effect on the drug addiction rate of a 1 unit increase in the HIV infection rate for countries belonging to the reference region. Its significance means it is significantly different from 0, so you can reject the null hypothesis that the HIV infection rate does not affect the drug addiction rate in countries from this region (I don't know what you define as region 3?)
    – Jesper Hybel
    4 hours ago












  • Perfect, thank you so much. So is there a benefit to using regression (2) instead of running four separate regressions, one per region, other than having a bigger sample size for the effect of x?
    – Rub_n
    4 hours ago










  • See the edit of my response, last two paragraphs.
    – Jesper Hybel
    4 hours ago










  • pls. accept and upvote if you think the answer was helpful :)
    – Jesper Hybel
    4 hours ago










































































edited 4 hours ago
answered 5 hours ago by Jesper Hybel












  • Oh man, that really helps! So the significance of b2 is the significance of the effect of region 3 on my drug addiction rate?
    – Rub_n
    5 hours ago












  • b_2 measure the effect on drug addiction rate of a 1 unit increase in the HIV infection rate for countries belonging to the reference region. It's significance means it is significantly different from 0 therefore you can reject the null hypothesis that HIV infection rate do not affect drug addiction rate in countries from this region (I dont know what you define as region 3??)
    – Jesper Hybel
    4 hours ago












  • perfect thank you so much, so do i have a benefit of using regression (2) instead of doing four different regressions for each region except having a bigger sample size for the effect of x?
    – Rub_n
    4 hours ago










  • See the edit to my response (last two paragraphs).
    – Jesper Hybel
    4 hours ago










  • pls. accept and upvote if you think the answer was helpful :)
    – Jesper Hybel
    4 hours ago



























