How to interpret interaction dummies of multiple categories and main effect
I have a cross-country panel regression with the following structure, where $y$ is a country's drug addiction rate, $x$ is its number of homeless people, and $m$ is its HIV infection rate. I group the countries into four world regions, which I code as dummies $D_1$, $D_2$, $D_3$, with the fourth region as the reference category:
$y = b_1x + b_2m + b_3D_1m + b_4D_2m + b_5D_3m$ (1)
When I change the base category, every coefficient and significance value changes except $b_1$.
When I change my regression to
$y = b_1x + b_3D_1m + b_4D_2m + b_5D_3m + b_6D_4m$ (2)
each coefficient in (2) equals the $b_2$ that regression (1) gives when the corresponding region is the reference category, with the same significance value.
Now I don't understand what I am seeing. Is the main-effect coefficient $b_2$ the effect for the reference category rather than the mean effect of the HIV infection rate? What does my main-effect coefficient $b_2$ tell me? In regression (1), why do the significance values of $b_3$, $b_4$, and $b_5$ change when I change the reference category, and what does the significance of $b_3$, $b_4$, and $b_5$ mean with regard to my main effect $b_2$? I am completely confused right now.
Best regards,
Rub_n
regression mean interpretation categorical-encoding
How are you modelling the error terms? What kind of model is this? Ordinary least squares? Logistic regression?
– StatsStudent
5 hours ago
Do you really have cross-country data, or is this supposed to be cross-sectional?
– StatsStudent
5 hours ago
I use an OLS regression with group and time fixed effects. Yes, I have cross-country data.
– Rub_n
4 hours ago
asked 5 hours ago by Rub_n (new contributor), edited 5 hours ago by StatsStudent
1 Answer
Consider a model with only three regions and hence two dummies, $D_1$ and $D_2$. Assume the data are cross-country, so that $i=1,\dots,n$ index countries. Let the model equation be
$$y_{it} = b_1 x_{it} + b_2 m_{it} + b_3 D_1 m_{it} + b_4 D_2 m_{it} + \epsilon_{it}$$
implying that the conditional expected drug addiction rate is
$$\mathbb E[y \mid \text{data}] = b_1 x_{it} + b_2 m_{it} + b_3 D_1 m_{it} + b_4 D_2 m_{it}$$
Hence the model allows different regions to have different marginal effects of the HIV infection rate $m$ on the drug addiction rate $y$: a region's drug addiction rate may respond differently to a change in the HIV infection rate than the reference region's does.
For the reference region, $D_1=D_2=0$, the conditional expectation reduces to
$$\mathbb E[y \mid \text{data}] = b_1 x_{it} + b_2 m_{it}$$
Differentiating with respect to $m_{it}$ gives
$$\frac{\partial \mathbb E[y \mid \text{data}]}{\partial m_{it}} = b_2$$
which is the marginal effect of the HIV infection rate $m$ on the drug addiction rate $y$ for countries in the reference region. An increase of one unit in the HIV infection rate of a country $i$ from the reference region results in a change of $b_2$ units in the expected drug addiction rate of country $i$.
For countries from the region defined by $D_1=1$ and $D_2=0$, the conditional expectation is
$$\mathbb E[y \mid \text{data}] = b_1 x_{it} + b_2 m_{it} + b_3 m_{it}$$
and the marginal effect is
$$\frac{\partial \mathbb E[y \mid \text{data}]}{\partial m_{it}} = b_2 + b_3$$
Hence $b_3$ is the difference between the marginal effect of the HIV infection rate $m$ on the drug addiction rate $y$ for countries in region $D_1=1$ and the marginal effect for the reference region, which was simply $b_2$. If $b_3$ is positive, the marginal effect in region $D_1=1$ is larger than in the reference region: the drug addiction rate there rises more (or falls less) with the HIV infection rate.
So $b_2$ measures the change in the drug addiction rate resulting from a one-unit increase in the HIV infection rate $m$ for countries in the reference region. The value of $b_3$ changes when you change the reference category because it is the difference in marginal effect between some region (here $D_1=1$) and the reference, and of course that difference depends on what the region is compared to. Significance of $b_3$ means that you can reject the null hypothesis that countries from region $D_1=1$ have the same marginal effect as countries from the reference region.
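The effect of switching the reference category can be checked on simulated data. The following is only a minimal sketch under stated assumptions: the data are made up, the region-specific slopes in `true_slopes` are hypothetical, there are three regions as in the simplified model above, and the fit is plain least squares without the group and time fixed effects mentioned in the comments. The helper name `fit_model_1` is introduced here for illustration.

```python
import numpy as np

# Simulated data (an assumption, not the asker's data): three regions, each
# with its own hypothetical effect of m on y; x has the same effect everywhere.
rng = np.random.default_rng(0)
n = 3000
region = rng.integers(0, 3, size=n)          # region label 0, 1 or 2
x = rng.normal(size=n)
m = rng.normal(size=n)
true_slopes = np.array([0.5, 1.2, -0.3])     # hypothetical region-specific slopes
y = 2.0 * x + true_slopes[region] * m + rng.normal(scale=0.1, size=n)

def fit_model_1(reference):
    """Least-squares fit of y on x, m and D_k * m for each non-reference region k."""
    others = [k for k in range(3) if k != reference]
    X = np.column_stack([x, m] + [(region == k) * m for k in others])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1, b2 = coef[0], coef[1]
    # Implied slope of m: b2 for the reference region, b2 + interaction otherwise.
    slopes = {reference: b2}
    for k, b in zip(others, coef[2:]):
        slopes[k] = b2 + b
    return b1, b2, slopes

b1_a, b2_a, slopes_a = fit_model_1(reference=0)
b1_b, b2_b, slopes_b = fit_model_1(reference=2)

print("b1 with reference 0 / reference 2:", b1_a, b1_b)
print("b2 with reference 0 / reference 2:", b2_a, b2_b)
print("implied region slopes (reference 0):", slopes_a)
print("implied region slopes (reference 2):", slopes_b)
```

In this sketch the coefficient on `x` and the implied region slopes do not depend on the choice of reference; only `b2` and the interaction coefficients are relabelled, which matches the pattern described in the question.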
In the second model there is no reference category, so the coefficients $b_3, b_4, b_5$ and $b_6$ are region-specific marginal effects (not differences in marginal effects). Each of them equals the sum of $b_2$ and the corresponding interaction coefficient from model (1), so the two models are reparameterizations of the same fit. The advantage of model (2) is that you can test whether the HIV infection rate has a significant marginal effect on the drug addiction rate in each region simply by testing the significance of the corresponding coefficient. To test for differences between regions in this model, you have to test differences between coefficients, for example $H_0: b_3 = b_4$, which can be done with a Wald test. In model (1), by contrast, the comparison of a region with the reference region in the responsiveness of the drug addiction rate to the HIV infection rate is made simply by testing the significance of a single coefficient.
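A test such as $H_0: b_3 = b_4$ in model (2) can be sketched as follows. This is an illustration under assumptions, not the asker's setup: the data are simulated with hypothetical slopes, the covariance matrix is the classical homoskedastic OLS one, the p-value uses a normal approximation, and no fixed effects or clustered standard errors are included. The function name `wald_equal_slopes` is made up for this example.

```python
import math
import numpy as np

# Simulated data (an assumption): three regions with hypothetical slopes of m.
rng = np.random.default_rng(1)
n = 3000
region = rng.integers(0, 3, size=n)
x = rng.normal(size=n)
m = rng.normal(size=n)
true_slopes = np.array([0.5, 1.2, -0.3])
y = 2.0 * x + true_slopes[region] * m + rng.normal(scale=0.1, size=n)

# Model (2): one interaction D_k * m per region, no separate main effect of m.
X = np.column_stack([x] + [(region == k) * m for k in range(3)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)        # classical OLS covariance estimate

def wald_equal_slopes(j, k):
    """z-statistic and normal-approximation p-value for H0: slope_j == slope_k."""
    c = np.zeros(X.shape[1])
    c[1 + j], c[1 + k] = 1.0, -1.0           # column 0 is x, columns 1.. are regions
    z = (c @ coef) / math.sqrt(c @ cov @ c)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
    return z, p

region_slopes = coef[1:]                     # region-specific marginal effects of m
z01, p01 = wald_equal_slopes(0, 1)
print("region-specific slopes:", region_slopes)
print("H0: slope_0 == slope_1 -> z =", z01, ", p =", p01)
```

With a single linear restriction the Wald statistic is the square of this z-statistic, so both lead to the same conclusion. For real panel data, the covariance estimate would normally be replaced by one suited to the fixed-effects setup.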
Oh man, that really helps! So the significance of $b_2$ is the significance of the effect of region 3 on my drug addiction rate?
– Rub_n
4 hours ago
$b_2$ measures the effect on the drug addiction rate of a one-unit increase in the HIV infection rate for countries belonging to the reference region. Its significance means it is significantly different from 0, so you can reject the null hypothesis that the HIV infection rate does not affect the drug addiction rate in countries from this region. (I don't know what you define as region 3.)
– Jesper Hybel
4 hours ago
Perfect, thank you so much. So is there a benefit to using regression (2) instead of running four separate regressions, one per region, other than having a bigger sample size for the effect of $x$?
– Rub_n
4 hours ago
See the edit to my response (last two paragraphs).
– Jesper Hybel
4 hours ago
pls. accept and upvote if you think the answer was helpful :)
– Jesper Hybel
4 hours ago
answered 4 hours ago by Jesper Hybel (edited 4 hours ago)