# Why transform continuous variables into categorical variables?

Revision history: revision as of 02:56, 28 May 2008 by Doug; latest revision as of 20:57, 7 September 2009 by Doug (2 intermediate revisions not shown).

## Latest revision as of 20:57, 7 September 2009

It is possible to transform continuous variables into categorical variables:

• "Dichotomizing" is where you split a continuous variable into a categorical variable that has two levels. For example, imagine a study about happiness where your happiness question (or composite) ranges from 1 to 7. You might categorize the subjects as either low happiness (1 through 4 on the scale) or high happiness (5 through 7 on the scale). The two ranges must not overlap, so each score falls into exactly one of the two groups.
• You can also split a continuous variable into thirds, fourths, fifths, or as many different categories as you wish.
• Another reason to transform a continuous variable into a categorical variable is that some answer choices in the continuous variable received only a few responses. For example, imagine a scale ranging from 1 to 11 in which answer choice 4 and answer choice 9 received only 1 response each. One response is not enough data for meaningful interpretation, so you may want to collapse the 11-point scale into 3 or 4 categories.
• Keep in mind that an analysis of the new categorical variable answers a different research question than an analysis of the original continuous variable. When you split the continuous variable into categories, any data analysis using the new categorical variable must be interpreted in terms of those categories.
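
The two transformations described in the list can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function names `dichotomize` and `collapse`, the cutoff of 4 (a score of 4 is assumed to count as "low"), the three bins 1-4, 5-7, and 8-11, and all of the response data are made up for the example.

```python
from bisect import bisect_left
from collections import Counter


def dichotomize(score, cutoff=4):
    """Return "low" for scores at or below the cutoff, "high" otherwise."""
    return "low" if score <= cutoff else "high"


def collapse(score, upper_bounds=(4, 7, 11), labels=("low", "middle", "high")):
    """Return the label of the first bin whose upper bound is >= score."""
    if not 1 <= score <= upper_bounds[-1]:
        raise ValueError(f"score {score} is outside the scale")
    return labels[bisect_left(upper_bounds, score)]


# Made-up responses on a 1-7 happiness scale (assumed data).
happiness = [1, 4, 5, 7, 3, 6]
happiness_groups = [dichotomize(s) for s in happiness]

# Made-up responses on a 1-11 scale; choices 4 and 9 each appear only once.
responses = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 7, 8, 8, 9, 10, 10, 11]
raw_counts = Counter(responses)
collapsed_counts = Counter(collapse(s) for s in responses)

print(happiness_groups)
print(collapsed_counts)
```

Comparing `raw_counts` with `collapsed_counts` shows the point of collapsing: the sparse answer choices disappear into larger groups, but any later analysis now describes "low", "middle", and "high" respondents rather than individual scale points.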
