Tinnitus Archive

Data processing and statistical analysis


Data management

Data for the Tinnitus Archive were obtained from the Tinnitus Data Registry, the computerized data base that houses questionnaire and test data obtained from those patients of the OHSU Tinnitus Clinic who met the criteria for inclusion.

Data coding for computer entry

Coding of all data for a given patient was performed by the same staff member who conducted the confirmatory interview for that patient. Emphasis was placed on maintaining long-term consistency of coding decisions among the different staff members. To that end, consistency was promoted by (1) using printed data-coding sheets, (2) preparing typed documentation (coding notes) for use as reference material during the coding process, and (3) holding frequent staff meetings to discuss coding issues and difficulties as they arose.

Data base hardware and software

Three different computer systems were employed over the period covered by this investigation. From 1981 to 1986 the Tinnitus Data Registry was housed in a mainframe computer (Harris 300) located near the OHSU Tinnitus Clinic; communication with the computer was handled by an intelligent terminal that provided local data entry followed by batch transmission of records to the mainframe. In 1986 the Registry was moved to a smaller system, located within the OHSU Tinnitus Clinic, in which several local terminals were connected to a minicomputer (Digital Equipment Corp. PDP 11/73). A final change in 1992 adopted IBM-PC-compatible desktop computers, both as the host data server and as data-processing workstations.

Relational data base software has been employed since the start of the Tinnitus Data Registry. From 1981 to 1986 the Registry used INFO®; in 1986 it switched to PATS® (Patient Analysis & Tracking System, Dendrite, Inc., Portland, Oregon) and has continued to use that data base software since then.

Data quality control

Quality control is a prime concern in a large research data base. Following are the main features of the quality control system employed by the Registry:

  1. All data entry programs have employed automatic data-checking at the point of entry to screen for invalid entries and missing data and to substantiate the presence of valid record-identification information.
  2. Immediately after a given patient's data are entered, a detailed printout is made showing each data entry in an easily readable format. This printout is then checked against that patient's questionnaires and test forms (not the data coding forms), with all checking done by a person other than the one who entered the data. This method serves to catch both coding and data-entry errors and also provides a consistency check among the various data coders, as it requires them to review each other's coding decisions.
  3. Prior to data analysis, the entire set of data entries corresponding to the variables of interest (i.e., questionnaire answers or test observations) is carefully scanned. Missing values, unlikely values ("outliers"), and unusual distributions of responses undergo detailed evaluation, often by the entire Registry staff, to be sure that the entries in question are valid.

Statistical analysis

The data analyses presented here were performed using either (1) SPSS/PC+® (Statistical Package for the Social Sciences, SPSS Inc., Chicago, IL) or (2) the statistical processing capabilities provided by the PATS data base software. The latter were used primarily for simpler types of analyses such as frequency distributions, measures of central tendency and dispersion, and cross-tabulations.
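The simpler analyses mentioned above can be re-created in a few lines of modern Python. The sketch below uses only the standard library and hypothetical data; it illustrates the kinds of computations involved (frequency distribution, central tendency and dispersion, cross-tabulation), not the original SPSS/PC+ or PATS procedures.

```python
# Sketch of the simpler analysis types: frequency distribution,
# central tendency/dispersion, and cross-tabulation.
# The variables and values are hypothetical, not Registry data.

from collections import Counter
from statistics import mean, median, stdev

severity = [4, 7, 7, 5, 9, 3, 7, 6]                 # hypothetical 0-10 ratings
location = ["left", "both", "both", "right",
            "both", "left", "right", "both"]        # hypothetical categories

# Frequency distribution of a categorical variable.
print(Counter(location))          # e.g. Counter({'both': 4, ...})

# Measures of central tendency and dispersion.
print(mean(severity), median(severity), stdev(severity))

# Cross-tabulation of two variables (location vs. severity >= 6).
crosstab = Counter(zip(location, (s >= 6 for s in severity)))
for (loc, severe), n in sorted(crosstab.items()):
    print(f"{loc:5s}  severe={severe}  n={n}")
```

Each step maps directly onto one of the analysis types named above; anything beyond such summaries would, as in the original work, call for a full statistical package.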