Rutgers University Libraries

Large Data Sets in Nursing Research- RUL: Metasearch Tools

Introduction to existing data set sources that nurse researchers might use for secondary analysis

Finding data sets from federal agencies and non government sources


            Accelerate, a suite of services from the Clinical and Translational Science Institute for researchers at UC San Francisco, includes public access to the Large Dataset Inventory, a compilation of more than 80 local and national health datasets.  The datasets have clinical and administrative content and may be searched by domain, population, time frame, publisher, observation unit, and study design.  Users may also view all of the datasets in a categorized list.  A typical record includes a brief description, links to the data and supporting documentation, and the searchable elements.

 Society of General Internal Medicine (SGIM)

            SGIM offers a Dataset Compendium which may be searched by title or by topic.  The Compendium has 42 datasets.  Twenty-six are also part of the Accelerate Large Dataset Inventory.  The remaining datasets cover clinical subjects and include registries for cancer, cardiovascular disease, and end-stage renal disease.  A typical record includes a brief description, a list of dataset details, availability, cost and a bibliography of articles based on the datasets.  Links take users to the PubMed records for the articles.

            The Compendium also lists 6 proprietary datasets whose developers are interested in collaborating with researchers who want to use them.  Dataset content is split equally between clinical and administrative categories.  Contact information is provided in each dataset description.

 Health Services Research Information Central (HSRIC)

            HSRIC was established in 2005 by the National Information Center on Health Services Research and Health Care Technology located at the National Library of Medicine.  As a portal for health services researchers, it contains links to funding announcements, reports, podcasts, discussion groups, statistics, and data.  HSRIC’s data section lists 56 datasets from government and other groups such as professional associations, research institutes and universities.  The datasets fall into both clinical and administrative categories.  Records for the datasets include title, source URL, purpose, description, media, methods/techniques, sample design, interval, years, and population.  Users can locate appropriate datasets with a keyword search or by browsing the dataset title listing.

Inter-University Consortium for Political and Social Research (ICPSR)

            ICPSR, from the University of Michigan, originated in 1962 as a consortium of academic and research organizations to provide leadership and training in data access, curation, and analysis for social sciences researchers.  Currently the data archive contains over 500,000 files.  While nurse researchers have access to studies in aging, education, and demographics, there is also a collection for health and mental health which includes data from the Health and Medical Care Archive.  Deposited by the Robert Wood Johnson Foundation, the Archive has content in the areas of health care providers, cost/access to health care, substance abuse and health, and chronic health conditions.

            ICPSR offers searching by keyword and browsing by topic, series, geography or investigator.  Users may also look for studies by entering specific variables into the search box.  When the variables are separated with commas, they can be sorted by “Variable Relevance” on the results page.   Searching of the Health and Medical Care Archive may be done separately.

            Records in both the Health and Medical Care Archive (HMCA) and the larger ICPSR database include title, principal investigator, summary, access notes, datasets, study description, funding, study scope with subject terms, geographic coverage, time period, collection date, population, notes, methodology, version history, metadata exports, and the amount of study usage.

RAND Corporation

            Established as a nonprofit, nonpartisan organization in 1948, the RAND Corporation “helps improve policy and decision making through research and analysis.”  Research areas encompass health, education, national security, international affairs, law, business and the environment.  In addition to a vast array of reports, RAND makes data sets and tools available for use by the public.  Nurse researchers may be interested in data sets on health among the oldest old, families, health and fertility, health and retirement, health literacy, and public health preparedness.  RAND also offers databases with state statistics.  Some data sets require a fee.

Finding data sets from a federal agency

         Federal agencies gathering data through various initiatives have set up search engines that will find results from all data sources within the agency.

Department of Health and Human Services (DHHS)

            The DHHS data site includes access to more than 900 datasets from the vaults of the Department of Health and Human Services.  It was organized as part of the Department’s Health Data Initiative and debuted in February 2013.  The intent is to encourage innovators and entrepreneurs to create applications, products, services and features to help improve health and health care.  As a result, the website includes opportunities for communication with other innovators and entrepreneurs.  The datasets are high value and high quality, suitable for health care delivery practitioners.  Where data is not yet in immediately usable form, staff are making it machine readable, downloadable, and accessible via application programming interfaces while protecting privacy and confidentiality.

            Users can locate datasets by searching first with keywords and then with filters for subjects, agency, date and media format.  After selecting a dataset from a brief record, the user will find a complete entry which includes the title, a description, the location to download the data and agency and program information.
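The keyword-then-filter workflow described above can be sketched in a few lines of Python. This is only an illustration of the search pattern, not the site's actual interface: the `DatasetRecord` fields, the `search` function, and the three sample records are all hypothetical and were made up for this example.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    # Hypothetical fields, loosely modeled on the filters named in the text.
    title: str
    description: str
    subject: str
    agency: str
    media_format: str


def search(records, keyword, **filters):
    """Match the keyword against title and description, then apply exact-match filters."""
    kw = keyword.lower()
    hits = [r for r in records
            if kw in r.title.lower() or kw in r.description.lower()]
    for field_name, wanted in filters.items():
        hits = [r for r in hits
                if getattr(r, field_name).lower() == wanted.lower()]
    return hits


# Invented sample records, for illustration only.
CATALOG = [
    DatasetRecord("Hospital Readmissions Sample", "Illustrative readmission counts",
                  "Quality", "CMS", "CSV"),
    DatasetRecord("Hospital Staffing Sample", "Illustrative nurse staffing levels",
                  "Workforce", "HRSA", "CSV"),
    DatasetRecord("Immunization Coverage Sample", "Illustrative vaccination rates",
                  "Prevention", "CDC", "JSON"),
]

results = search(CATALOG, "hospital", agency="CMS")
print([r.title for r in results])
```

Starting with a broad keyword and adding one filter at a time, as in the last two lines, tends to show how many records each filter removes, which helps when a search returns nothing.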


Wonder (CDC)

               Wonder was developed by the Centers for Disease Control and Prevention to provide access to public health data for epidemiologic research.  The content and easy access were designed for users in state and local health departments, the Public Health Service, and the academic public health community.  By using a “fill in the blank” search page, users can locate data from public use datasets about mortality, cancer incidence, HIV and AIDS, tuberculosis, vaccinations, natality, census data and other topics.  Users may also select browsing by CDC database, topic or an A-Z index.  Wonder includes the capability to analyze the data which may also be downloaded.

              Wonder staff recommend that users gain familiarity with the dataset documentation to make sure that there is complete understanding of the characteristics and limitations of the data.  Brief summaries as well as comprehensive documentation of the datasets are available.


Health Indicators Warehouse (National Center for Health Statistics)

            The Health Indicators Warehouse provides access to national, state and local indicators of population health, health care, and health determinants.  There is supporting descriptive information to aid the researcher’s understanding and use of the indicators.  The data is drawn from the County Health Rankings (R. W. Johnson Foundation/University of Wisconsin Population Health Institute), Community Health Indicators, Healthy People 2020, and the Centers for Medicare and Medicaid Services as well as additional sources.  Minimum statistical standards have been established to assure data quality.  Detailed metadata, measures of variability and limitations are supplied so users can evaluate the data’s appropriateness for their research.

            Health Indicators Warehouse’s search page allows browsing by topic, geography, and health indicators initiative.  The Indicators tab at the top left of the page lets users narrow a search from a broad filter to a specific indicator.  Studies for the indicator are then listed.  A box for keyword searching is located at the upper right of every page.

            Typical records for studies include title, numerator, denominator, methodology and references.  Data and download tabs appear under the study’s title.
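Because each record reports a numerator and a denominator, a researcher can recompute an indicator value to check their reading of the record. The sketch below assumes the indicator is a simple rate scaled to a fixed population base. That scaling, the function name `indicator_rate`, and the sample counts are assumptions for illustration; the methodology section of each record states how that indicator is actually calculated.

```python
def indicator_rate(numerator, denominator, per=100_000):
    """Return numerator / denominator scaled to a population base (assumed: per 100,000)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * per


# Invented counts: 45 events in a population of 150,000.
rate = indicator_rate(numerator=45, denominator=150_000)
print(f"{rate:.1f} per 100,000")
```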


Rutgers, The State University of New Jersey, an equal access/equal opportunity institution. Individuals with disabilities are encouraged to direct suggestions, comments, or complaints concerning any accessibility issues with Rutgers web sites to: or complete the Report Accessibility Barrier / Provide Feedback Form.