
Thursday, March 4, 2010

Anderson and Burnham: Avoiding Pitfalls When Using Information-Theoretic Methods

Anderson and Burnham discuss the growing shift away from simple null hypothesis testing toward the more complex, and arguably more meaningful, information-theoretic approach, specifically in wildlife biology. The shift has been prompted by the limitations inherent in null hypothesis testing. However, as the use of information-theoretic methods rises, so does their misuse. Luckily for us, Anderson and Burnham are here to point out our mistakes and move us in the right direction.

The common mistakes fall into three categories: basic science mistakes, methodology mistakes, and outright mistakes.

Basic Science Mistakes:
1) Poor Science Question - Null hypotheses are silly! They don't ask relevant, interesting questions, and since when is science black and white? Considering multiple plausible models will most likely yield a more descriptive answer. However, models must be chosen carefully, because the more specific a model becomes, the more likely it is to miss some part of reality.
2) Too Many Models - Some scientists rely too heavily on computer statistical analysis and don't put enough thought into their data analysis. They may let the computer fit all possible models rather than reducing the set after some consideration of the science behind the issue.
3) The True Model is Not in the Set - No model can fully represent the system being investigated, so the "true model" is never in the candidate set. Information-theoretic methods instead estimate which model in the set is closest to the truth.
4) Information-theoretic Methods Are Not a "Test" - Null hypothesis tests (and their resulting p-values) cannot be mixed with information-theoretic results. They are two different approaches to answering a question, and they lead to different results with different interpretations.

Methodology Mistakes:
1) Poor Modeling of Hypotheses - Translating alternative hypotheses into mathematical models is difficult and often leads to mistakes or oversimplification. Linear models are often chosen for their simplicity but are frequently unrealistic; asymptotes, thresholds, etc. should be considered.
2) Failure to Consider Various Aspects of Model Selection Uncertainty - Scandalous! The statistical uncertainty that comes from selecting a best-fit model is often not carried into estimates of precision. Akaike weights (see the sketch after this list) are one way to account for it.
3) Failure to Consider Overdispersion in Count Data - Overdispersion occurs when count data are assumed to be independent but are actually correlated (animals traveling in groups, for example), so the true variation exceeds what the model expects and the sampling variance is underestimated. The sketch after this list shows the usual fix: a variance inflation factor, c-hat, estimated from the global model.
4) Post hoc Explanation of Data Not Admitted - If hypotheses or models were suggested by the data after the fact, say so; presenting post hoc explanations as if they had been proposed a priori is misleading.
5) Statistical Significance versus Quantitative Evidence - P-values must be euthanized! Information-theoretic methods have no predetermined alpha level that magically makes a result significant or not. Instead, conclusions rest on the strength of the evidence and on the biology being addressed.
6) Goodness of Fit Should Be Assessed Using the Global Model - GOF should not be assessed for each model in the set, only for the global (most highly parameterized) model.
7) Failure to Provide All the Needed Information - Many papers using information-theoretic methods omit key values from their results (e.g., the sample size, the number of parameters K, which criterion was used, the delta values), which hinders the interpretation of their models.
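
To make mistakes 2 and 3 above concrete, here is a minimal sketch in Python (the AICc values and goodness-of-fit numbers are made up for illustration). It computes Akaike weights, which carry model selection uncertainty into the results, and estimates the variance inflation factor c-hat from the global model to get QAIC:

import numpy as np

# Hypothetical AICc values for a four-model candidate set.
aicc = np.array([310.2, 311.5, 314.8, 322.1])

# Delta values: distance from the best (lowest-AICc) model.
delta = aicc - aicc.min()

# Akaike weights: the weight of evidence for each model. Reporting these,
# rather than anointing a single "best" model, acknowledges model
# selection uncertainty.
w = np.exp(-0.5 * delta)
w = w / w.sum()
print(np.round(w, 3))  # -> [0.615 0.321 0.062 0.002]

# Overdispersion: estimate c-hat from the GLOBAL model's goodness-of-fit
# statistic (Pearson chi-square / degrees of freedom), then use it to
# compute QAIC. Burnham and Anderson count the estimated c-hat as one
# extra parameter, hence K + 1 below.
chi2, df = 58.3, 40.0    # hypothetical GOF result for the global model
c_hat = chi2 / df        # about 1.46: mild overdispersion
log_lik, K = -150.1, 6   # hypothetical fit of one model in the set
qaic = -2.0 * log_lik / c_hat + 2.0 * (K + 1)
print(round(c_hat, 2), round(qaic, 1))  # -> 1.46 220.0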

Outright Mistakes:
1) The Incorrect Number of Estimable Parameters (K) - K must count every parameter that is estimated, including the intercept and the residual variance, which are easy to forget (the sketch after this list counts K for a simple regression).
2) Use of AIC Instead of AICc - AICc, the small-sample correction, should be used when the sample size is small relative to the number of parameters (roughly n/K < 40); it converges to plain AIC as n grows. See the sketch after this list.
3) Information Criteria Are Not Comparable Across Different Data Sets or Different Response Variables - pretty self-explanatory. Data sets must be analyzed separately and response variables must be kept consistent.
4) Failure of Numerical Methods to Converge - Computer software can fail to converge on the maximum of the log-likelihood. When this happens, seek professional help.
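
For outright mistakes 1 and 2, here is a minimal sketch of the formulas in Python (the sample size, parameter count, and log-likelihood are hypothetical):

def aic(log_lik, K):
    # Akaike's Information Criterion: AIC = -2*log(L) + 2*K
    return -2.0 * log_lik + 2.0 * K

def aicc(log_lik, K, n):
    # Small-sample correction: AICc = AIC + 2K(K + 1) / (n - K - 1).
    # Burnham and Anderson recommend AICc whenever n / K < ~40;
    # it converges to plain AIC as n grows.
    return aic(log_lik, K) + (2.0 * K * (K + 1)) / (n - K - 1)

# A regression with 3 predictors fit to n = 30 points. K counts every
# estimated parameter: 3 slopes + 1 intercept + 1 residual variance = 5.
# (The intercept and the variance are the ones people forget.)
n, K, log_lik = 30, 5, -45.2
print(round(aic(log_lik, K), 1))      # -> 100.4
print(round(aicc(log_lik, K, n), 1))  # -> 102.9

With n = 30 and K = 5, the correction already adds 2.5 units, which is enough to reorder models whose AIC values are close.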

So, is null hypothesis testing dead, and should p-values be "euthanized"? If it's so terrible and outdated, why is it still so common and taught as a ritual to students?
