SPSS Wiki




Welcome to SPSS Wiki. If you're new to wikis, it might help to read this article. Users of this site are constantly updating the many articles, and you can help. SPSS Wiki is intended to be a reference and workbook for SPSS statistical procedures, for both novice and expert. While statistical procedures are explained to some extent, SPSS Wiki is not primarily a statistical text; there are plenty of other resources on the net for that.

Hot Topics

 * Check out Raynald's SPSS Tools.
 * The SPSS Wiki logo is Greek for SPSS, right?
 * SPSS help and tutorials: free SPSS online videos
 * Advanced wiki page editing
 * How to cite SPSS Wiki
 * Free open source statistical software links
 * Reliable Change (and Clinical Change)
 * Sample Size, Effect Size, and Power
 * Ask your SPSS questions at the SPSS LOG
 * SPSS Forum: a forum for SPSS users
 * In2Software.dk: free online SPSS scripts

FAQ

 * 1. How to embed or integrate SPSS with Visual Studio 2005
 * 2. What software is needed to use an SPSS (*.sav) file as a data source in Visual Studio 2005

Getting Started
A few settings under Options in SPSS are worth examining or changing before you start.

Select "Options" from the Edit menu.
 * In "General", make sure that Session Journal is turned on, is set to append, and is being saved somewhere sensible (i.e. where you have permission to save).
 * Change "Recently used file list" to 9.
 * Set "no scientific notation for small numbers in tables" (if you don't know what this is, you want it on).
 * In "Viewer", select "Display commands in the log" (this will be useful later).
 * If you are ever likely to be analysing large datasets, choose the "Data" tab, and change "Transform and Merge Options" to "Calculate values before used."
 * Notice what the "Set century range for 2 digit years" is set to.
 * Many people always change the "Display format ..." to 0 decimal places. You might consider this, depending on what sort of variables you create.
 * If you use the programmability features introduced in SPSS 14, set Text Output Page Size on the Viewer tab to Wide.
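
The century-range setting mentioned in the list above decides which century a two-digit year is assigned to. The idea can be sketched in Python, assuming the range is a 100-year window that begins at a chosen start year. The function name `expand_two_digit_year` and the start year 1940 are illustrative assumptions, not SPSS's own code or default.

```python
def expand_two_digit_year(yy: int, range_start: int) -> int:
    """Map a two-digit year onto the 100-year window starting at range_start."""
    if not 0 <= yy <= 99:
        raise ValueError("yy must be between 0 and 99")
    # Place yy in the same century as the start of the window.
    candidate = range_start - (range_start % 100) + yy
    # If that falls before the window opens, it belongs to the next century.
    if candidate < range_start:
        candidate += 100
    return candidate


# Hypothetical window 1940-2039 (an assumed setting, for illustration only).
for yy in (5, 39, 40, 99):
    print(yy, expand_two_digit_year(yy, 1940))
```

With a window of this kind, the same two digits in a date of birth can land in different centuries depending on the setting, which is why it is worth checking before reading in dates with two-digit years.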

Common

Hypothesis testing: The methodological equivalent of trying to understand somebody by only asking closed questions: good if you already know them well enough to ask the right questions, bad because what you usually get are part-truths biased by your preconceptions. The alternative? Grounded theory. That is, as long as it is done systematically and not as some lame excuse to avoid statistics. The downside of grounded theory? It doesn’t always apply well to new groups or situations.

Pilot study: More likely to be code for a third-rate study with too few participants and too shoddy a design for anything to be concluded, yet we’ll use it as a lead-in to the ‘real’ study anyway. Less often (and more correctly), it means we checked some assumptions about our population, or tried to make sure our dependent measures were valid, before we rolled ahead and spent a lot of time and money.

Qualitative vs. Quantitative: Sure, people have incorrectly wrapped themselves in the qualitative flag when in fact they are ignorant of what it really means, or are only attempting to avoid statistics and rigorous methodology. But when used correctly, it systematically (and usually statistically) addresses people’s experience, as opposed to the quantitative question about generalising from one group to the next. One almost never actually opposes the other; it’s not a competition. Rather, they usually complement each other. If someone characterises himself or herself as either a qualitative or quantitative researcher, they are in fact saying "I intentionally limit the range of research questions I may ask", a distinctly unscientific stance.

Random sample: The term is grossly overused; truly random samples are extremely rare. The vast majority of samples are in fact disproportionate stratified convenience samples. That is, the proportions of important characteristics of the group(s) differ from those of the population under consideration (not that researchers generally say who the population is anyway). ‘Convenient’ because you accepted the first prospects who would say yes.

Highly significant results: p<0.001 is better than p<0.05, right? What you’re really saying is that the probability of thinking something is happening when it is not (type I error) is less than 1 chance in 1000 as opposed to 1 chance in 20. Still sound good? Here’s the problem: most sample sizes are far less than 1000, or even 100, so how can you logically be claiming the probability of a type I error is less than 1 chance in 100, or 1000, or whatever, when your sample was only (say) 52 participants? There is also the problem that by artificially reducing the chance of a type I error (i.e. using 0.01 or 0.001 cut-offs) you are likely to think nothing is happening when it may well be (a type II error). At any rate, significant doesn’t mean big, it means consistent. The size of the effect might be very small even when it is consistent. If you want to talk about effect size, that’s a different matter. One last downside: using extreme significance cut-offs almost always reduces the probability that you will correctly detect an effect (power) to unacceptable levels. Take our advice and stick with reporting the p<0.05 criterion regardless of what SPSS spits out, unless you really know what you’re doing.
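
The trade-off between stricter cut-offs and power can be illustrated with a rough calculation. The Python sketch below uses a normal approximation for a two-sided, two-sample comparison. The effect size of 0.5 and the 26 participants per group (52 in total) are assumed values chosen for illustration, and the function name `two_sample_power` is hypothetical. It is not an SPSS procedure.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float) -> float:
    """Approximate power of a two-sided, two-sample z-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value for the chosen cut-off
    shift = effect_size * sqrt(n_per_group / 2)  # expected test statistic under the effect
    # Probability that the statistic falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Assumed scenario: standardised effect of 0.5, 26 participants per group.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha={alpha}: power={two_sample_power(0.5, 26, alpha):.2f}")
```

Under these assumptions the printed power falls as the cut-off becomes stricter, which is the type II error cost described above.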

Theoretical eclecticism: A convenient excuse to do what appeals to you under the guise of some post hoc rationalisation.

Theoretical model: A model tries to explain only WHAT is happening; a theory tries to explain WHY it is happening. The explanation is therefore either a model OR a theory. Unless it’s merely a tautology, people usually mean proposed model rather than theoretical model.

Type I, II, and III Error: Type I error is believing something happened when it didn’t, and type II error is believing nothing happened when it really did. Both these mistakes can be overcome with some clever maths and/or careful design and reasoning. But what about type III error? What about the possibility that you’re misinterpreting what’s happening? Or to put it another way, you think something is happening, and in a sense you’re right, something is happening, just not what you think. The classic example of this is novelty. Let’s say you treat an individual or group who subsequently improve. For obvious reasons you think it’s your intervention that has made the difference, but who’s to say that it wasn’t just the novelty of the exercise they were responding to, and not the particular treatment technique itself? Perhaps not, but the point is you don’t know for sure. How does the placebo effect fit into all this? Well, the placebo effect could explain some type III error, but not all type III error is necessarily caused by the placebo effect. That is to say, if the placebo effect can be described as a change simply due to believing it will happen (mind over matter), then that is not the only possible source of type III error. However, type III error can be ruled out in much the same way as the placebo effect. The only way around the problem is to design around it, but that’s a topic too detailed and boring to cover here.

Procedures

 * ALSCAL multidimensional scaling - SPSS Base
 * Anomaly detection - SPSS Data Validation
 * ANOVA - simple factorial - SPSS Base
 * AREG - SPSS Trends
 * ARIMA - SPSS Trends
 * Bayesian estimation with Markov chain Monte Carlo algorithm - Amos
 * Binary logistic regression - SPSS Regression Models
 * Bivariate (correlate) - SPSS Base
 * CATPCA (categorical principal components analysis) - SPSS Categories
 * Case summaries (reports) - SPSS Base
 * CHAID - SPSS Classification Trees
 * Classification & regression trees - SPSS Classification Trees
 * Cluster - SPSS Base
 * Complex samples descriptives - SPSS Complex Samples
 * Complex samples general linear models - SPSS Complex Samples
 * Complex samples logistic regression - SPSS Complex Samples
 * Complex samples tabulate - SPSS Complex Samples
 * Confirmatory factor analysis - Amos
 * Conjoint - SPSS Conjoint
 * Constrained nonlinear regression (CNLR) - SPSS Regression Models
 * Correspondence analysis - SPSS Categories
 * Cox regression - SPSS Advanced Models
 * Crosstabs (descriptive statistics) - SPSS Base
 * Curve estimation - SPSS Base
 * Descriptive statistics - SPSS Base
 * Descriptive ratio statistics - SPSS Base
 * Discriminant - SPSS Base
 * Exact tests - SPSS Exact Tests
 * Exhaustive CHAID - SPSS Classification Trees
 * Explore (descriptive statistics) - SPSS Base
 * EXSMOOTH (exponential smoothing) - SPSS Trends
 * Factor - SPSS Base
 * Fit - SPSS Base
 * Frequencies (descriptive statistics) - SPSS Base
 * GENLOG - SPSS Advanced Models
 * GLM (general linear models) - SPSS Advanced Models
 * HLM (hierarchical linear models) - see linear mixed models
 * HILOGLINEAR - SPSS Advanced Models
 * HOMALS - SPSS Categories
 * Inferential statistics - SPSS Tables
 * Kaplan-Meier - SPSS Advanced Models
 * Linear mixed models - SPSS Advanced Models
 * Linear regression - SPSS Base
 * LOGLINEAR - SPSS Advanced Models
 * Means (compare means) - SPSS Base
 * Mixed level models - see linear mixed models
 * Modeling statistics - SPSS Base
 * Multinomial logistic regression (MLR) - SPSS Regression Models
 * Multiple correspondence analysis - SPSS Categories
 * Multiple response - SPSS Base
 * Naïve Bayes algorithm - SPSS Server
 * Nonlinear regression (NLR) - SPSS Regression Models
 * Nonparametric tests - SPSS Base
 * OLAP cubes (reports) - SPSS Base
 * Oneway ANOVA (compare means) - SPSS Base
 * Orthoplan - SPSS Conjoint
 * Overals - SPSS Categories
 * Partial (correlate) - SPSS Base
 * Plancards - SPSS Conjoint
 * PoLytomous universal models (PLUM) - SPSS Advanced Models
 * Predictor selection algorithm - SPSS Server
 * Preference scaling (PREFSCAL) - syntax only - SPSS Categories
 * Probit - SPSS Regression Models
 * Proximities - SPSS Base
 * PROXSCAL (multidimensional scaling) - SPSS Categories
 * QUEST - SPSS Classification Trees
 * Quick cluster - SPSS Base
 * Random effects regression - see linear mixed models
 * Receiver operating characteristic (ROC) analysis - SPSS Base
 * Reliability - SPSS Base
 * Report summaries (reports) - SPSS Base
 * SEASON - SPSS Trends
 * SEM (structural equation modeling) - Amos
 * SPECTRA - SPSS Trends
 * Survival analysis procedures - SPSS Advanced Models
 * t tests (compare means) - SPSS Base
 * Two-stage least squares (2SLS) - SPSS Regression Models
 * TwoStep cluster - SPSS Base
 * VARCOMP (variance component estimation) - SPSS Advanced Models
 * Validate data procedure - SPSS Data Validation
 * Weighted least squares (WLS) - SPSS Regression Models