Mathematical and Statistical Application – Stata – Common FAQs

Why do I get “.smcl” files when I try to output “.log” files in Stata?

How can I change the format of data for my axes in my graph?

How can I convert a string date variable into date formats that can be used for age calculations?

Why is it that when I paste into the Stata data editor, all the columns from Microsoft Excel appear jammed together in one column of Stata?

How can I save the graph without displaying it?

Is there any way to compute the time which a do-file spends since you run it until it is finished?

How can I compute the time which a Stata command spends since you run it until it is finished?

How can I specify all observations for which two or more of a set of dummy variables equal one?

How can I eliminate duplicate observations?

How do I create two random normal variables with a given correlation?

Can Stata perform ANOVA with both random and fixed effects and estimate the variance components?

How can I run Stata interactively on the Research Computing server?

How do I write a Stata startup script when running it on the Research Computing server?

Can I change the search path of Stata so it can find my own ado-files that are placed on the Research Computing server?

How do I print Stata graphs when running it on the Research Computing server?

Additional help

Why do I get “.smcl” files when I try to output “.log” files in Stata?

By default, Stata writes log files in SMCL format (“.smcl”, pronounced “smickle”), which displays properly only in the Stata Viewer. This is what you get when you type “log using filename”; when you use Stata’s pull-down menu, you can choose between “.smcl” and “.log”. To convert a “.smcl” file into a “.log” (ASCII/plain text) file, use the translate command:

  . translate mylog.smcl mylog.log

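Alternatively, either of the following commands starts logging to an ASCII log file directly. The “text” option requests plain text explicitly, and a “.log” extension implies it:

  . log using filename.txt, text

or

  . log using filename.log

To make plain text the default log format, use: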

  . set logtype text
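
As a minimal sketch of how these pieces fit together in a do-file, the lines below close any log that is already open, start a plain-text log, run some commands, and close the log again. The file name mylog.log and the auto example dataset are assumptions chosen for illustration only.

  . * close a log left open by an earlier run, ignoring the error if none is open
  . capture log close
  . * mylog.log is a hypothetical file name; replace overwrites an existing file
  . log using mylog.log, text replace
  . sysuse auto, clear
  . summarize price mpg
  . log close

The capture prefix keeps the do-file from stopping when no log is open, and the replace option avoids the “file already exists” error on repeated runs.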

To recover a log, when you forgot to open one, use:

  . translate @Results mylog.txt

This uses a translator called “Results2txt”. To view a list of available translators, type:

  . translator query

How can I change the format of data for my axes in my graph?

By default, axis labels on a graph take their format from the plotted variables, so one way to change the format of a number (or date) on an axis is to change the display format of the variable itself. For example, given the data set

  . list

               x          y
    1.  .1369841   .2551499
    2.  .6432207   .0445188
    3.  .5578017   .4241557
    4.  .6047949   .8983462
    5.   .684176   .5219247
    6.  .1086679   .8414094
    7.  .6184582   .2110077
    8.  .0610638   .5644092
    9.  .5552388   .2648021
   10.  .8714491   .9477426

Here the display format of both variables is %9.0g. Use

  . format x y %6.2f

to give the variables a new display format:

  . list

            x      y
    1.   0.14   0.26
    2.   0.64   0.04
    3.   0.56   0.42
    4.   0.60   0.90
    5.   0.68   0.52
    6.   0.11   0.84
    7.   0.62   0.21
    8.   0.06   0.56
    9.   0.56   0.26
   10.   0.87   0.95
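
If you would rather leave the variables' display formats alone, the axis label options of the graph commands also accept a format() suboption. The line below is a sketch that assumes a release with the graphics system introduced in Stata 8 and the x and y variables listed above; the %3.1f format is only an illustrative choice.

  . * format the axis labels only; the variables keep their own display formats
  . scatter y x, xlabel(, format(%3.1f)) ylabel(, format(%3.1f))

This affects only the labels drawn on the graph, so list and other commands continue to show the data in their original format.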

How can I convert a string date variable into date formats that can be used for age calculations?

If your dates are stored as strings (for example, storage type str8 with display format %8s) and look like 01/24/93, 1/24/93 or 01/5/93 (month, day, two-digit year, sometimes without the leading 0 for a single-digit month or day), you can convert them with the date() function. The missing leading zeros do not matter. Pass the variable name unquoted and the mask as a quoted string. In Stata 10 and later the mask is written in uppercase; older releases use "md19y".

  . gen edate = date(string_varname, "MD19Y")

This creates an “elapsed” date, which is the number of days since January 1, 1960. Then format it as you like, e.g. “. format edate %dN/D/CY”. To compute ages, subtract the birthdate from the current date, divide by 365.25, and use the int() function to trim off the fraction:

  . gen age = (interview_date - birthdate) / 365.25
  . replace age = int(age)
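
Because date() returns a missing value for any string it cannot parse, it can be worth checking the conversion before computing ages. The sketch below assumes Stata 10 or later, a hypothetical string variable named bdate_str, and an interview_date variable that already holds a numeric Stata date.

  . * bdate_str is a hypothetical string variable such as "1/24/93"
  . gen birthdate = date(bdate_str, "MD19Y")
  . format birthdate %td
  . * count and inspect strings that failed to convert
  . count if missing(birthdate) & bdate_str != ""
  . list bdate_str if missing(birthdate) & bdate_str != ""
  . * interview_date is assumed to be a numeric Stata date
  . gen age = int((interview_date - birthdate) / 365.25)

A nonzero count points to strings that do not match the month/day/year mask and need cleaning first.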

Why is it that when I paste into the Stata data editor, all the columns from Microsoft Excel appear jammed together in one column of Stata?

Known problems are documented at http://www.stata.com/support/faqs/data/newexcel.html. Check them against your specifics.

This problem is not limited to Excel or Access. For example, I pasted the text

  Type Id h1 h2 h3 h4 h5 h8 h18
   C 2 . 1 9 . 8 . 2
   L 2 2 3 5 9 . 2 4
   B 2 6 9 8 2 5 3 7

into the data editor and Stata decided it was all one string variable. (I should probably have left off the header line.) My fix, using the user-written commands strparse and renvars (locate them with findit), was

  . strparse var1, gen(d)
  . renvars d* \ Id h1-h5 h8 h18
  . destring Id h1-h5 h8 h18, replace

which in this case is, admittedly, not that competitive with re-pasting or typing in the raw data. Alternatively, you can try closing the do-file editor. Sometimes it does the trick, too.
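
The same repair can be sketched with official commands only. The lines below assume that the pasted text landed in a string variable called var1, that the header line became the first observation, and that your release supports the grouped form of rename (Stata 12 or later); adjust the names to your own data.

  . * assumes the pasted header line became observation 1
  . drop in 1
  . * split var1 on spaces into d1, d2, ..., d9
  . split var1, generate(d)
  . rename (d1 d2 d3 d4 d5 d6 d7 d8 d9) (Type Id h1 h2 h3 h4 h5 h8 h18)
  . * convert the numeric columns; "." becomes a numeric missing value
  . destring Id h1-h5 h8 h18, replace

Type stays a string variable because it holds letters.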

How can I save the graph without displaying it?

Turn off graph display before you create the graph:

  . set graphics off

Later, if desired, turn it back on:

  . set graphics on
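
A complete sequence might look like the sketch below, which assumes variables x and y in memory and uses the hypothetical file name myplot. The saving() option writes the graph to myplot.gph without drawing it on screen.

  . set graphics off
  . * myplot is a hypothetical file name; replace overwrites an existing myplot.gph
  . scatter y x, saving(myplot, replace)
  . set graphics on

You can display the stored graph later with “graph use myplot.gph”.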

Is there any way to compute the time which a do-file spends since you run it until it is finished?

You can use elapse to time an arbitrary number of commands. Find it with “findit etime”. Alternatively, Stata has a global system macro, $S_TIME, that holds the current time and can be displayed at the start and end of the do-file.
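
Both approaches can be sketched as follows. The example assumes a release that includes the built-in timer command and a hypothetical do-file named myanalysis.do.

  . * show the clock time before the run
  . display "$S_TIME"
  . timer clear 1
  . timer on 1
  . * myanalysis.do is a hypothetical do-file
  . do myanalysis.do
  . timer off 1
  . * report the elapsed seconds recorded by timer 1
  . timer list 1
  . * show the clock time after the run
  . display "$S_TIME"

The timer lines can also go at the top and bottom of the do-file itself, so that every run reports its own duration.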

How can I compute the time which a Stata command spends since you run it until it is finished?

Stata can report how long each command takes. The “set rmsg” setting determines whether a return message, which includes the elapsed time, is displayed at the completion of each command. The initial setting is off, so to time each command you submit, type “set rmsg on”.
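
For instance, a short session might look like the sketch below; the auto dataset and the regression are arbitrary choices for illustration.

  . set rmsg on
  . sysuse auto, clear
  . regress price mpg weight
  . * turn the return message off again when you are done timing
  . set rmsg off

While the setting is on, each command is followed by a return-message line that includes the time it took.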

How can I specify all observations for which two or more of a set of dummy variables equal one?

Assuming you can live with missing values being treated as zero, one option is to use egen with the rowtotal() function (formerly rsum()) to add up the dummies in a new variable, and then select the cases where that sum is greater than or equal to 2:

  . egen temp=rowtotal(x1 x2 x3 x4)
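
The selection step can then be written as an if qualifier on temp. The sketch below assumes the four dummies x1 to x4 used above; the indicator name multi is hypothetical.

  . * count and list observations with two or more dummies equal to one
  . count if temp >= 2
  . list x1 x2 x3 x4 if temp >= 2
  . * multi is a hypothetical name for an indicator that marks those observations
  . gen byte multi = temp >= 2

The same “if temp >= 2” qualifier works with any other command, such as summarize or regress.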

How can I eliminate duplicate observations?

Using the auto data as an example,

  . use auto
  (1978 Automobile Data)

The duplicates command was designed specifically for this purpose, but the following is a good example of how Stata can be made to compare values with those of the previous observation.

To generate an identifier,

  . gen id = _n

and change some identifiers to equal their predecessors,

  . replace id = id[_n-1] if uniform( ) < 0.1

  (5 real changes made)

After counting duplicates,

  . sort id
  . by id : gen count = _n
  . tab count

        count |      Freq.     Percent        Cum.
  ------------+-----------------------------------
            1 |         69       93.24       93.24
            2 |          5        6.76      100.00
  ------------+-----------------------------------
        Total |         74      100.00

You can drop the duplicates:

  . drop if count > 1
  (5 observations deleted)
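
For comparison, the duplicates command mentioned earlier does the same job in fewer steps. This sketch assumes the id variable created above, before any observations have been dropped; the variable name dup is hypothetical.

  . * summarize how many observations share each value of id
  . duplicates report id
  . * dup is a hypothetical name; it holds the number of duplicates of each observation
  . duplicates tag id, generate(dup)
  . * keep the first observation for each id and drop the rest
  . duplicates drop id, force

The force option is required when you name specific variables, because the dropped observations may differ on the variables that are not listed.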

How do I create two random normal variables with a given correlation?

In Stata 8 and later, you can use “drawnorm” to draw random deviates from a multivariate normal distribution with a known correlation matrix, or “corr2data” to generate data whose sample correlation matrix is exactly the one specified.

  . set obs 100
  obs was 0, now 100

  . matrix c = (1, 0.2 \ 0.2, 1)

  . mat list c

  symmetric c[2,2]
      c1  c2
  r1   1
  r2  .2   1

  . drawnorm x y, corr(c)

  . correlate x y
  (obs=100)

               |        x        y
  -------------+------------------
             x |   1.0000
             y |   0.2123   1.0000

  . drop x y

  . set obs 100
  obs was 0, now 100

  . corr2data x y, corr(c)

  . correlate x y
  (obs=100)

               |        x        y
  -------------+------------------
             x |   1.0000
             y |   0.2000   1.0000
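
Because drawnorm draws random numbers, the sample correlation changes from run to run unless you set the random-number seed first. The sketch below makes the draw reproducible; the seed value 12345 is arbitrary, and the n() option sets the number of observations in place of a separate “set obs”.

  . clear
  . * any fixed seed makes the draw reproducible; 12345 is an arbitrary choice
  . set seed 12345
  . matrix c = (1, 0.2 \ 0.2, 1)
  . drawnorm x y, n(100) corr(c)
  . correlate x y

Running these lines again with the same seed reproduces the same x and y.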

Can Stata perform ANOVA with both random and fixed effects and estimate the variance components?

Yes, Stata has several commands that do most of what SAS’s PROC GLM and PROC MIXED do, including “glm” and the “xt” family of commands. With xtreg, Stata can estimate variance components in a linear model with two levels of random errors, but xtreg does not handle more than two levels. Stata also has a general “anova” command, but you have to tell it which error terms are appropriate for testing the various effects in complicated mixed and/or partially nested models, and even then it does not provide estimates of the variance components. For models with more levels, Stata 9 added “xtmixed” (renamed “mixed” in Stata 13), which fits multilevel mixed-effects linear models and reports the estimated variance components.
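
As a sketch of a model with a fixed effect and nested random effects, the command below assumes Stata 13 or later. All variable names are hypothetical: y is the outcome, treatment is a fixed factor, and classroom is nested within school.

  . * hypothetical variables: fixed effect of treatment,
  . * random intercepts for school and for classroom within school
  . mixed y i.treatment || school: || classroom:

The random-effects part of the output lists the estimated variance for each level along with the residual variance.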

How can I run Stata interactively on the Research Computing server?

Before you run the GUI-based Stata interactively on a Research Computing server, you must have an X-Windows client installed and running on your PC or terminal. We strongly encourage you to use an X-Win32 client in conjunction with SSH Secure Shell when you access the Research Computing server. Once you have installed and set up the X-Windows program properly, log on to the Research Computing server via SSH Secure Shell and type “xstata” (for Kure) or “bsub -IS xstata” (for Killdevil) to run Stata interactively. Please refer to the Stata and X-Win32 help pages for detailed information.

How do I write a Stata startup script when running it on the Research Computing server?

If you want to execute the same commands every time you invoke Stata on the server, you should create a file called “profile.do”. There are four steps to set up a profile.do file on the Research Computing server:

Step 1: Write a “profile.do” file. Here is a simple do file.

  . set memory 1000
  . set logtype text
  . set matsize 100

Remember these are things you want Stata to do every time you invoke it.

Note: Your “profile.do” is run even when you run a Stata do-file in batch mode, so keep that in mind.

Step 2: Place “profile.do” in a location that Stata on the Research Computing server can access.

  • Option 1: Put “profile.do” in your directory on /nas space. First create a “bin” directory, then put your “profile.do” file in the /nas/uncch/home/o/y/onyen/bin directory.
    Note: /nas space is permanent storage space, and users may need to purchase space in order to access it.
  • Option 2: Put “profile.do” in your directory on scratch space, e.g. /netscr or /largefs. First create a “bin” directory, then put your “profile.do” file in the /netscr/onyen/bin directory.
    Note: Scratch space is temporary storage space that is shared with many other users. Files on scratch space are deleted if they are not used or modified for 21 days.

Step 3: Modify your personal environment file on the Research Computing server by adding the correct path to the “profile.do” file. Stata finds the “profile.do” file by searching the directories in your Linux PATH variable. The personal environment file can be created or modified in your /afs/isis/home/o/y/onyen/public directory.

  • For bash and ksh users, add /nas/uncch/home/o/y/onyen/bin or /netscr/onyen/bin to your path in your .profile.personal file:
  export PATH="${PATH}:/nas/uncch/home/o/y/onyen/bin"
  • For csh and tcsh users, add /nas/uncch/home/o/y/onyen/bin or /netscr/onyen/bin to your path in your .cshrc.personal file:
  setenv PATH "${PATH}:/nas/uncch/home/o/y/onyen/bin"

Step 4: Now, most importantly, log out and log back into the Research Computing server. Start Stata and you should see that the “profile.do” file runs as it should.

Can I change the search path of Stata so it can find my own ado-files that are placed on the Research Computing server?

If you want Stata to find your own ado-files on the server try the following.

Step 1: Log into Kure or KillDevil and create an “ado” directory in your /nas02/home/o/n/onyen directory space.

Step 2: Place your ado files in the above created /ado directory. If you are using a Windows machine and need to transfer files from your personal machine to the server you can use the file transfer client in SSH Secure Shell. If you are using a UNIX/Linux or a Mac computer you can use the scp utility to transfer files.

Step 3: Create a file called “profile.do” and place this file in your /nas02/home/o/n/onyen/bin directory.

Step 4: In the file “profile.do” put the text

  adopath ++ /nas02/home/o/n/onyen/ado

Step 5: Log out and log back into the server.
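
After logging back in, you can confirm from within Stata that the new directory is being searched. In the sketch below, mycommand stands for one of your own ado-files and is a hypothetical name.

  . * list the directories Stata searches for ado-files
  . adopath
  . * mycommand is a hypothetical name; which reports where Stata found the file
  . which mycommand

If your ado directory does not appear in the adopath listing, “profile.do” was probably not found at startup.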

How do I print Stata graphs when running it on the Research Computing server?

Stata includes a windowed interface for Unix, and you can print your current active graph if your system printer can be recognized by Stata. Additionally, Stata has a powerful translation tool “translate” available that converts “.gph” files to “.ps” files. The printing procedure with windowed Stata is as follows:

  • Option 1: Define your system printer by typing:

  . printer define prn ps "lp -d lpxx @"

where “lpxx” is your system printer (for more detailed information about identifying your local printer on the Research Computing servers, please see Printing from Research Computing Servers or type “man lp” at your Unix prompt). After creating a graph in your X-Window Stata session, type

  . graph print

Or if the graph has been stored on disk, say as “myplot.gph” on your X-Window Stata session, type

  . graph use myplot.gph
  . graph print

  • Option 2: After creating a “.gph” file, say “myplot.gph”, in your X-Window Stata session, type

  . translate myplot.gph myplot.ps

This will produce a PostScript file named myplot.ps. You can then run the “lp -d lpxx myplot.ps” command at your Unix prompt (not in your windowed Stata session).

  • Option 3: Using the character-based version of Stata, you can also define your own default printer and print graphs. For example, if your system printer is called lpxx, then by typing:

  . printer def prn ps "lp -d lpxx @"
  . graph use myplot.gph
  . graph print

you should be able to print your Stata graph file myplot.gph.
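
As a further sketch, the graph export command writes the current graph straight to a file whose format is chosen by the extension. This assumes a release with the graphics system introduced in Stata 8 and the stored graph myplot.gph used in the examples above.

  . graph use myplot.gph
  . * the file extension selects the output format
  . graph export myplot.ps, replace
  . graph export myplot.eps, replace

The resulting myplot.ps can be sent to the printer from the Unix prompt in the same way as a file produced by translate.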

Additional help

More on Stata

Research Computing home page