Reading/entering data

Because of security restrictions associated with Java Applets, only datasets stored on the Rice University server can be read. A few datasets are available now; more are under development. If you have a dataset that you would like to contribute, please contact David Lane. There are currently two "libraries" of datasets. The default library is "RVLS_data." You can change the library by using the pop-up menu. Just under the pop-up menu for libraries is a pop-up menu of datasets in the library. Choose the dataset you want to analyze. A description of the dataset will be shown on the right side of the display.
    Rather than read a dataset from the Rice server, you can enter your own data. Begin by clicking on the button labeled "Enter/Edit User Data." A window will open with an area for you to enter or paste in your data. The first line should contain the names of the variables (separated by spaces or tabs). The remaining lines should contain the data themselves. Missing data cannot be handled, so all observations must have valid data for all variables. All variables must be numeric. If one of your variables is to be used as a "Grouping" or "Classification" variable, then the values of the grouping variable must be integers ranging from one to the total number of groups. Use a grouping variable either to do a separate analysis for each level of the variable or to serve as the independent variable in an analysis of variance. Once you have entered your data, click on the "Accept data" button. The data will be temporarily saved so that if you click the "Enter/Edit User Data" button again, they will be shown.
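
The format rules above can be checked before pasting data into the window. The following is a minimal sketch in Python, not the applet's own code; the function name `parse_user_data` and its `grouping` parameter are hypothetical and exist only for illustration.

```python
def parse_user_data(text, grouping=None):
    """Parse pasted text into {variable name: list of floats}.

    Hypothetical helper that mirrors the rules described above:
    a header line of variable names, then one observation per line,
    all values numeric, no missing values.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    names = lines[0].split()  # split() accepts spaces or tabs
    columns = {name: [] for name in names}
    for line_number, line in enumerate(lines[1:], start=2):
        fields = line.split()
        if len(fields) != len(names):
            raise ValueError(
                f"line {line_number}: expected {len(names)} values, "
                f"got {len(fields)}"
            )
        for name, field in zip(names, fields):
            columns[name].append(float(field))  # ValueError if not numeric
    if grouping is not None:
        # Grouping values must be the integers 1..k, where k is the
        # number of distinct groups.
        levels = set(columns[grouping])
        expected = {float(k) for k in range(1, len(levels) + 1)}
        if levels != expected:
            raise ValueError(
                f"grouping variable '{grouping}' must use the integers "
                f"1 to {len(levels)}"
            )
    return columns
```

For example, `parse_user_data("score group\n3.5 1\n4.0 2\n5.5 1\n", grouping="group")` accepts the data, while a row with a missing value raises a `ValueError`.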


Choosing Variables

   To analyze a variable, select it in the "Y" pop-up menu and then click on the type of analysis you wish to perform. To see the relationship between two variables, select one in the "X" menu and the other in the "Y" menu. Then click on the "Correlation/regression" button. Specify a grouping variable to do an analysis separately for each group of observations. You also specify a grouping variable to conduct an analysis of variance.


Statistical Analyses

Descriptive statistics
  1. N
  2. Mean
  3. Median
  4. Trimean
  5. Minimum
  6. Maximum
  7. 25th percentile
  8. 75th percentile
  9. Standard deviation (sd)
  10. Standard error of the mean
  11. Skew
  12. Kurtosis
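
Several of the statistics listed above can be sketched with Python's standard library. This is an illustration under stated assumptions, not the applet's implementation: the percentile interpolation method (`method="inclusive"`) and the N-1 denominator for the standard deviation are assumptions, and the applet may use different definitions. Skew and kurtosis are omitted because their definitions vary between packages. The function name `describe` is hypothetical.

```python
import math
import statistics


def describe(values):
    """Return a dict of common descriptive statistics for a list of numbers."""
    n = len(values)
    # Quartiles; the interpolation rule is an assumption (see lead-in).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    sd = statistics.stdev(values)  # sample sd, N-1 denominator (assumption)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "median": median,
        # Trimean: weighted average of the quartiles and the median.
        "trimean": (q1 + 2 * median + q3) / 4,
        "minimum": min(values),
        "maximum": max(values),
        "p25": q1,
        "p75": q3,
        "sd": sd,
        "se_mean": sd / math.sqrt(n),
    }
```

The trimean gives the median twice the weight of each quartile, which makes it less sensitive to extreme scores than the mean.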

Histogram

Boxplots

Stem and Leaf Displays

t-tests and confidence intervals
  1. one sample t-test
  2. confidence interval on the mean
  3. independent groups t-test
  4. confidence interval on the difference between means
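
The test statistics behind these analyses can be sketched as follows. This is a sketch, not the applet's code. It assumes the independent-groups test pools the two variances (the equal-variance form), which the text above does not state. The critical value `t_crit` must be supplied by the caller (for example, from a t table), because the Python standard library has no t distribution. All function names are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for testing whether the population mean equals mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se, n - 1


def mean_confidence_interval(sample, t_crit):
    """Return (lower, upper) limits: mean +/- t_crit * standard error."""
    n = len(sample)
    mean = statistics.mean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width


def independent_groups_t(a, b):
    """Return (t, df) using a pooled variance estimate (assumption)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, na + nb - 2
```

A confidence interval on the difference between means follows the same pattern as `mean_confidence_interval`, with the difference between the sample means in place of the mean and `se_diff` in place of the standard error of the mean.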

Correlation/Regression
  1. r
  2. slope
  3. intercept
  4. Standard error of the estimate (SEest)
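
The four quantities above can be computed from the sums of squares and cross-products of X and Y. The sketch below is illustrative only. In particular, it assumes the standard error of the estimate uses an N-2 denominator; the applet may define it differently. The function name `simple_regression` is hypothetical.

```python
import math
import statistics


def simple_regression(x, y):
    """Return r, slope, intercept, and standard error of the estimate."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared errors around the regression line.
    sse = max(syy - slope * sxy, 0.0)  # guard against tiny negative rounding
    return {
        "r": sxy / math.sqrt(sxx * syy),
        "slope": slope,
        "intercept": intercept,
        "se_est": math.sqrt(sse / (n - 2)),  # N-2 denominator (assumption)
    }
```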

One-Way ANOVA
To compute an ANOVA, select the independent variable from the "Group" pop-up menu and the dependent variable from the "Y" menu.
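
As a rough illustration of what the analysis computes, the F ratio for a one-way ANOVA can be sketched in Python as below. This is not the applet's code; the function name `one_way_anova` is hypothetical, and the p-value is left out because the standard library has no F distribution.

```python
import statistics


def one_way_anova(y, group):
    """Return the F ratio and its degrees of freedom.

    y     -- values of the dependent variable
    group -- group code for each observation (same length as y)
    """
    groups = {}
    for value, level in zip(y, group):
        groups.setdefault(level, []).append(value)
    grand_mean = statistics.mean(y)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(vals) * (statistics.mean(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    # Variation of the scores around their own group mean.
    ss_within = sum(
        (value - statistics.mean(vals)) ** 2
        for vals in groups.values()
        for value in vals
    )
    df_between = len(groups) - 1
    df_within = len(y) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"F": f_ratio, "df_between": df_between, "df_within": df_within}
```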


Consequences of Assumption Violations

The consequences of assumption violations can be explored by assuming a population with the same-shaped distribution as the sample. For the one-sample t-test, random samples are taken from the population and a t-test on the difference between the sample mean and the population mean is computed for each sample. The proportion of times the null hypothesis is incorrectly rejected is computed. The difference between this empirical Type I error rate and the nominal (e.g., 0.05) Type I error rate can be used to assess the consequences of an assumption violation. Keep in mind that for small sample sizes, the shape of the population distribution may be quite different from that of the sample distribution.
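
The simulation described above can be approximated with a short Python sketch. It rests on an assumption the text does not confirm: the "population" is built by resampling the observed scores with replacement, so the population mean is the sample mean. The applet may construct its population differently. The function name `empirical_type1_rate` is hypothetical, and the two-tailed critical value `t_crit` must be supplied by the caller (for example, 2.262 for 9 degrees of freedom at the 0.05 level).

```python
import math
import random
import statistics


def empirical_type1_rate(sample, n, t_crit, trials=10000, seed=0):
    """Estimate how often a true null hypothesis is rejected.

    sample -- observed scores, treated as the population (assumption)
    n      -- size of each simulated sample
    t_crit -- two-tailed critical t for n - 1 degrees of freedom
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    population_mean = statistics.mean(sample)
    rejections = 0
    for _ in range(trials):
        draw = rng.choices(sample, k=n)  # sampling with replacement
        sd = statistics.stdev(draw)
        if sd == 0:
            continue  # t is undefined; counted as a non-rejection
        t = (statistics.mean(draw) - population_mean) / (sd / math.sqrt(n))
        if abs(t) > t_crit:
            rejections += 1
    return rejections / trials
```

Comparing the returned proportion with the nominal level (e.g., 0.05) indicates how much the shape of the distribution distorts the test at the chosen sample size.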