How To Create Dummy Variables In Stata Panel Data

How to Generate Dummy Variables in Stata

Dummy variables are categorical variables that take on binary values of 0 or 1. For example, a dummy for gender might take a value of 1 for 'Male' observations and 0 for 'Female' observations. Coding string values ('Male', 'Female') in this manner allows us to use these variables in regression analysis with meaningful interpretations. In this post we are going to learn how to generate dummy variables in Stata.

In this article we use the 1978 Automobile dataset built into Stata. This data can be accessed through the command:

sysuse auto

Before beginning to work with a dataset, it is a good idea to examine the variable list, storage types and labels. The descriptive information for this dataset is displayed through:

describe

To begin our discussion of dummy variables, let's look at the variable for repair records in 1978, rep78.

tabulate rep78, missing

The tabulate command above allows us to see that the variable has five categories coded numerically from 1 to 5. The missing option allows us to see the number of missing values (.) in the variable as well. We will explore five methods of generating dummy variables in this article:

  1. Two-step method
  2. One-step method
  3. Dummy based on an inequality condition
  4. Dummies for multiple categories
  5. Dummies based on multiple conditions

Two-Step Method to Generate Dummy Variable in Stata:

Step 1:

generate rep2 = 1 if rep78==2

This command generates a new variable named 'rep2' which takes on the value of 1 only for observations where rep78 is equal to 2. Where rep78 equals 1, 3, 4 or 5, or where rep78 is itself missing, rep2 will be populated with missing values (.).

Step 2:

replace rep2 = 0 if missing(rep2)

This command deals with the missing values generated in rep2. It sets rep2 to 0 in every observation where rep2 is missing.

However, this also means that rep2 takes on a value of 0 where rep78 itself had missing values (.). This is an inaccuracy that needs to be addressed.

The (incomplete) command above served to illustrate the importance of being mindful of missing data in relevant variables; otherwise data cleaning, variable creation and other data operations will be plagued with data entry and misspecification errors. We shall modify this command to account for missing values in rep78 as well.

replace rep2 = 0 if missing(rep2) & !missing(rep78)

This additional condition directs Stata to populate rep2 with 0 only where rep2 is missing and rep78 is not missing. !missing indicates 'not missing', where '!' is the operator for 'not'.
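
The two steps can be put together and checked in one short run. This is a minimal sketch that assumes a fresh copy of auto.dta (hence the clear option, which discards any data in memory); the final cross-tabulation is only one possible way to inspect the result.

```stata
* Minimal sketch of the complete two-step method on a fresh copy of auto.dta
sysuse auto, clear

* Step 1: 1 where rep78 is 2, missing everywhere else
generate rep2 = 1 if rep78 == 2

* Step 2: fill in 0 only where rep78 itself is observed
replace rep2 = 0 if missing(rep2) & !missing(rep78)

* Cross-tabulate to confirm that missing rep78 stays missing in rep2
tabulate rep78 rep2, missing
```

In the resulting table, the row for missing rep78 should fall entirely in the missing column of rep2.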

Another Way of Generating Dummies:

There is another similar but slightly different approach to generating a dummy variable. Let's generate a dummy, rep3, that takes a value of 1 when rep78 is equal to 3. This also involves two commands:

generate rep3 = 0 if !missing(rep78)
replace rep3 = 1 if rep78 == 3

In this example, we first generate rep3, which equals 0 wherever rep78 is not missing. Where rep78 is missing, rep3 will also be missing. We then replace rep3 with 1 wherever rep78 takes on a value of 3.
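
A quick way to confirm that rep3 behaves as described is Stata's assert command, which stops with an error if any observation violates the stated condition. The sketch below assumes a fresh copy of auto.dta; the particular checks are illustrative rather than exhaustive.

```stata
* Minimal sketch: build rep3 and verify how it is coded
sysuse auto, clear

generate rep3 = 0 if !missing(rep78)
replace rep3 = 1 if rep78 == 3

* rep3 is 0 or 1 wherever rep78 is observed
assert inlist(rep3, 0, 1) if !missing(rep78)

* rep3 is 1 exactly where rep78 equals 3
assert rep3 == 1 if rep78 == 3

* rep3 stays missing wherever rep78 is missing
assert missing(rep3) if missing(rep78)
```

If all three assert commands run silently, the dummy is coded as intended.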

One-Step Method to Create Dummy Variable in Stata:

generate rep4 = rep78 == 4 if !missing(rep78)

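This command achieves the same kind of result that we obtained for rep2 and rep3, but within a single command. The expression rep78 == 4 evaluates to 1 when true and 0 when false, and the if qualifier leaves rep4 missing wherever rep78 is missing.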
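
To see why a single command is enough, it may help to look at how Stata evaluates a logical expression and then compare the one-step dummy with a two-step equivalent. In this sketch, rep4_check is a hypothetical comparison variable introduced only for illustration.

```stata
* Minimal sketch: the one-step method and a check against the two-step approach
sysuse auto, clear

* A logical expression evaluates to 1 when true and 0 when false
display 4 == 4
display 3 == 4

* One-step dummy for rep78 equal to 4
generate rep4 = rep78 == 4 if !missing(rep78)

* Hypothetical comparison variable built with the two-step approach
generate rep4_check = 0 if !missing(rep78)
replace rep4_check = 1 if rep78 == 4

* The two variables agree in every observation, including the missing ones
assert rep4 == rep4_check
```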

Dummy Created Based on an Inequality Condition

In the above examples, we generated dummies based on equality conditions. Now, we wish to generate a dummy, repg, which takes on a value of 1 if rep78 is greater than or equal to 3.

generate repg = rep78>=3 if !missing(rep78)

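repg takes on a value of 1 only when rep78 is greater than or equal to 3 and is not missing. The if qualifier matters here because Stata treats missing values as larger than any number, so without it the expression rep78>=3 would be true for missing observations and they would be coded as 1.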
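
The effect of leaving out the if qualifier can be seen by building both versions side by side. This is a minimal sketch on a fresh copy of auto.dta; repg_naive is a hypothetical name used only to illustrate the pitfall.

```stata
* Minimal sketch: inequality dummy with and without the missing-value guard
sysuse auto, clear

* Hypothetical "naive" version without the if qualifier
generate repg_naive = rep78 >= 3

* Guarded version from the text
generate repg = rep78 >= 3 if !missing(rep78)

* Observations with missing rep78 that the naive version codes as 1
count if repg_naive == 1 & missing(rep78)

* Compare the two versions, showing missing values as their own category
tabulate repg_naive repg, missing
```

Any observations reported by count are cases where rep78 is unknown but the naive dummy claims a repair record of 3 or more.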

Dummies For Multiple Categories

We saw that rep78 has five categories with numeric values of 1 to 5. Instead of generating a dummy variable for each category individually, we can use the tabulate command with the generate() option to create five dummies at once.

tabulate rep78, generate(dummy)

This comprehensive command creates five dummy variables from the five categories of rep78. The dummies take on the names 'dummy1', 'dummy2', 'dummy3', 'dummy4' and 'dummy5'. 'dummy1' will equal 1 whenever rep78 equals 1, 'dummy2' will equal 1 whenever rep78 equals 2, and so on. The command also takes care of the missing-value issue we discussed earlier: each dummy is set to missing wherever rep78 is missing.
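
One way to check the five dummies is to confirm that exactly one of them equals 1 in each observation where rep78 is observed. The sketch below assumes a fresh copy of auto.dta; n_ones is a hypothetical helper variable, not something tabulate creates.

```stata
* Minimal sketch: create the five dummies and check them
sysuse auto, clear

tabulate rep78, generate(dummy)

* Hypothetical helper variable: number of dummies equal to 1 in each row
* (rowtotal() treats missing values as 0)
egen n_ones = rowtotal(dummy1-dummy5)
assert n_ones == 1 if !missing(rep78)

* The dummies are missing wherever rep78 is missing
assert missing(dummy1) if missing(rep78)

* The mean of each dummy is the share of non-missing observations in that category
summarize dummy1-dummy5
```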

Dummy Created Based on Multiple Conditions

What if we wish to create a dummy variable that takes on the value of 1 whenever more than one condition is satisfied? To illustrate this, let's bring the 'price' variable into our example.

We want to create a dummy (called 'dummy') which equals 1 if the price variable is less than or equal to 6000, and if rep78 is greater than or equal to 3. Both these conditions need to be met simultaneously. If, for example, price is less than or equal to 6000 but rep78 is not greater than or equal to 3, 'dummy' will take on a value of 0.

generate dummy = price<=6000 & rep78>=3 if !missing(price, rep78)

The command above makes use of operators like & and the conditional if qualifier to achieve that. It also addresses missing values in both the price and rep78 variables.
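
The three possible outcomes (1, 0 and missing) can each be checked with assert. This is a minimal sketch on a fresh copy of auto.dta; the checks are one illustrative way of confirming the coding, not the only one.

```stata
* Minimal sketch: dummy based on two conditions, with checks on each outcome
sysuse auto, clear

generate dummy = price <= 6000 & rep78 >= 3 if !missing(price, rep78)

* Both conditions hold: dummy must be 1
assert dummy == 1 if price <= 6000 & rep78 >= 3 & !missing(rep78)

* At least one condition fails on observed data: dummy must be 0
assert dummy == 0 if (price > 6000 | rep78 < 3) & !missing(price, rep78)

* Either variable missing: dummy stays missing
assert missing(dummy) if missing(price, rep78)

tabulate dummy, missing
```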

Source: https://thedatahall.com/stata-dummy-variable/

Posted by: martincouseed1937.blogspot.com
