A thorough understanding of the data is critical to your success. Documentation is your key. It may include codebooks or dictionaries, manuals, and any reports resulting from the use of the data set. If such documentation is not available, you should consider developing your own codebook.
Documentation should cover the variables: their names, labels, and definitions. Without the definitions, you may interpret a variable name differently from what the original researchers intended. The codebook should also indicate how the fields are organized.
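If you end up developing your own codebook, one way to keep it consistent is to record each variable as a small structured entry. The sketch below is only an illustration: the class `CodebookEntry`, the helper `undocumented`, and the variable names `hhinc`, `region`, and `educ` are all hypothetical and do not come from any particular data set.

```python
from dataclasses import dataclass


@dataclass
class CodebookEntry:
    """One variable as a codebook might describe it (hypothetical layout)."""
    name: str        # variable name as it appears in the data file
    label: str       # short human-readable label
    definition: str  # what the original researchers meant by the term
    position: int    # order of the field within each record


# Invented entries, for illustration only.
codebook = [
    CodebookEntry("hhinc", "Household income",
                  "Total pre-tax income of all household members", 1),
    CodebookEntry("region", "Region of residence",
                  "Region in which the respondent lived at interview", 2),
]


def undocumented(columns, entries):
    """Return the column names that have no codebook entry."""
    documented = {entry.name for entry in entries}
    return [column for column in columns if column not in documented]


# Columns found in the data file; "educ" has no entry in the codebook.
missing_entries = undocumented(["hhinc", "region", "educ"], codebook)
```

A check like `undocumented` is a quick way to find variables that still need a definition before you start your analysis.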
The codebook should explain how missing data were handled. Researchers follow different practices, so cells for missing data may have been left blank or may be marked with a standard designation such as 9, 99, or 999. The researcher may also have substituted an estimated value for the missing data, in which case it is important for you to know what procedure was used to determine that value. There may be additional information on how much data is missing for each variable and how much is missing overall.
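The missing-data conventions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended procedure: the variable names, the records, and the mapping `MISSING_CODES` are invented, and in practice the codes for each variable would come from the codebook. The codes are kept per variable because a value such as 9 may be a missing-data designation for one variable and a legitimate value for another.

```python
# Hypothetical missing-data codes for each variable, as a codebook might list them.
MISSING_CODES = {"age": {"999"}, "income": {"99"}, "region": {"9"}}


def recode_missing(rows, missing_codes):
    """Return copies of the rows with blanks and designated codes set to None."""
    cleaned = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            text = "" if value is None else str(value).strip()
            if text == "" or text in missing_codes.get(name, set()):
                new_row[name] = None
            else:
                new_row[name] = text
        cleaned.append(new_row)
    return cleaned


def missing_summary(rows):
    """Count missing values per variable and the overall share of missing cells."""
    counts = {}
    total_cells = 0
    for row in rows:
        for name, value in row.items():
            counts.setdefault(name, 0)
            total_cells += 1
            if value is None:
                counts[name] += 1
    overall = sum(counts.values()) / total_cells if total_cells else 0.0
    return counts, overall


# Invented records: the second has a blank cell and two coded cells, and the
# third has an age of 9, which is a real value rather than a missing-data code.
rows = [
    {"age": "34", "income": "52000", "region": "2"},
    {"age": "999", "income": "", "region": "9"},
    {"age": "9", "income": "99", "region": "1"},
]

cleaned = recode_missing(rows, MISSING_CODES)
counts, overall = missing_summary(cleaned)
```

Here `counts` gives the number of missing values for each variable and `overall` gives the share of all cells that are missing. Values that the original researcher estimated and filled in cannot be detected this way; only the documentation can tell you which values those are.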
Additional components of the codebook include copies of the research instruments, a detailed description of the methodologies used, procedures for data editing and coding as well as information about error rates.
If you anticipate having questions about using the data set for your own research questions, consider whether a contact person from the original study is available.