  • A Gentle Introduction to Stata

    Alan Acock
    Oregon State University

    A Stata Press Publication
    STATA CORPORATION
    College Station, Texas

  • Stata Press, 4905 Lakeway Drive, College Station, Texas 77845

    Copyright © 2005 by StataCorp LP
    All rights reserved
    Typeset in LaTeX 2ε
    Printed in the United States of America

    10 9 8 7 6 5 4 3 2 1

    ISBN !!

    This book is protected by copyright. All rights are reserved. No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the prior written permission of StataCorp LP.

    Stata is a registered trademark of StataCorp LP. LaTeX 2ε is a trademark of the American Mathematical Society.

  • Acknowledgements

    I would like to acknowledge the support of the Stata staff who have worked with me on this project. Special thanks go to Lisa Gilmore, the Production Manager; xxxxx, the Copy Editor; and xxx for verifying all the commands used in this volume. I also want to thank my students who have tested my ideas for the book. They are too numerous to mention, but special thanks go to Patricia Meierdiercks and Shannon Wanless.

    Bennet Fauber, during the time he was affiliated with Stata Corporation, provided hours and hours of support on all aspects of this project. He taught me the LaTeX 2ε document preparation system used by Stata Press, and his patience with many of my problems and mistakes has inspired me to have more patience with my own students. Bennet also had major input on the topical coverage and organization of this volume. He provided the initial draft of chapter 4, and his superior expertise on Stata commands, data management, and do-files was critical. Bennet also provided extensive editorial suggestions and substantive editing for the first three chapters of the book. Whatever quality this book has owes an enormous debt to Bennet's conceptual and technical contributions. The book's completion owes a lot to his encouragement.

    Finally, I would like to thank my wife, Toni Acock, for her support and for her tolerance for my endless excuses for why I could not do things. She had to pick up many tasks I should have done, and she usually smiled when told it was because I had to finish this book.

  • Contents

    Preface

    Notation and Typography

    1 Getting Started
        1.1 Introduction
        1.2 The Stata Screen
        1.3 Using an existing dataset
        1.4 An example of a short Stata session
        1.5 Conventions
        1.6 Chapter Summary
        1.7 Exercises

    2 Entering Data
        2.1 Creating a dataset
        2.2 An example questionnaire
        2.3 Develop a coding system
        2.4 Entering data
            2.4.1 Labeling values
        2.5 Saving your dataset
        2.6 Checking the data
        2.7 Chapter Summary
        2.8 Exercises

    3 Preparing Data for Analysis
        3.1 Introduction
        3.2 Plan your work
        3.3 Create value labels
        3.4 Reverse code variables
        3.5 Create and modify variables
        3.6 Create scales
        3.7 Save some of your data
        3.8 Summary
        3.9 Exercises

    Author index

    Subject index

  • List of Tables

    2.1 Example questionnaire
    2.2 Example codebook
    2.3 Example coding sheet
    3.1 Sample project task list
    3.2 NLSY97 sample codebook entries
    3.3 Reverse coding plan
    3.4 Arithmetic symbols

  • List of Figures

    1.1 Stata's opening screen
    1.2 The Prefs > Save Windowing Preferences menu
    1.3 The Prefs > Stata compact setting appearance menu
    1.4 The Stata screen layout used in this book
    1.5 Stata's tool bar
    1.6 Stata command to open cancer dataset
    1.7 Histogram of age
    1.8 Histogram dialog box
    1.9 The tabs on the histogram dialog box
    1.10 The Title tab of the histogram dialog box
    1.11 The Options tab of the histogram dialog box
    1.12 First attempt at an improved histogram
    1.13 Final histogram of age
    2.1 Data editor window
    2.2 Variable name and variable label
    2.3 Define schemes dialog box
    2.4 Define schemes dialog box
    2.5 Describe dataset dialog box
    3.1 Create new variable
    3.2 Recode: specifying recode rules on the main tab
    3.3 Recode: specifying new variable name on the options tab
    3.4 Create new variable
    3.5 Two-way tabulation dialog
    3.6 The extended generate dialog
    3.7 The extended generate dialog
    3.8 Selecting variables to drop
    3.9 Selecting observations with an expression

  • Preface

    This book was written with a particular reader in mind. This reader needs to learn Stata, but has no prior experience with other statistical software packages and is learning social statistics. When I learned Stata myself, I found no books that I felt were written explicitly for this reader. There are certainly excellent books on Stata, but they assume prior experience with other packages such as SAS or SPSS, a fairly advanced working knowledge of statistics, or both. These books are able to move more quickly to more advanced topics, but they leave my intended reader in the dust. Readers who have more background in statistical software and statistics than I am assuming here will be able to read chapters quickly and even skip sections. The goal is to move the true beginner to a level of competence using Stata.

    With this target reader in mind, I make far more use of the Stata menu system than any other book about Stata. Advanced users may not see the value of using the menus, and the more people learn about Stata, the less they will rely on the menus. Also, even when using the menu system, it is still important to save a record of the sequence of commands you ran. Even though I rely on the commands much more than the menus in my own work, I still find value in the menus. They include many options that I might not have known or might have forgotten. This is most evident with graphs, where the visual quality can be greatly enhanced using the menu system.

    To illustrate the menu system as well as graphics, I have included over 80 figures, many of which show menus. There are numerous tables and extensive Stata results that are presented as they appear on the screen and are given a substantive interpretation. This is done in the belief that beginning Stata users need to learn more than just how to produce the results. It is also necessary to go through the results and interpret them.

    I have tried to use real data. There are a few examples