Hints, Code and Assignments page


  Code for experiments with MNIST character data. See the comments after the readme for usage.


  Data file for iris in Weka arff format.


  Data file for letter data in Weka arff format.


  Notes on CFS and ReliefF.


  Ripper slides.


  Statistics online module link.


  Spambase.arff file (learn to filter spam).


  Perl code to add a "fake" class to test data for Weka when the test data is supplied without class labels.
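
  The Perl script itself is not reproduced here. As a rough illustration of the idea only, the Python sketch below appends a nominal class attribute to an ARFF header and a placeholder class value to every data row, so that Weka can load test data that has no class column. The function name add_fake_class, its parameters, and the default attribute name "class" are assumptions made for this example and are not taken from the Perl code. It handles only dense ARFF rows, not the sparse {index value} format.

```python
def add_fake_class(arff_text, class_labels, fake_value=None, class_name="class"):
    """Return ARFF text with a class attribute and a placeholder class value added.

    Hypothetical helper: class_labels must list the same labels, in the same
    order, as the training file. fake_value defaults to the first label.
    """
    if fake_value is None:
        fake_value = class_labels[0]

    out = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data:
            if stripped.lower() == "@data":
                # Declare the class as the last attribute, just before @data.
                out.append("@attribute %s {%s}" % (class_name, ",".join(class_labels)))
                in_data = True
            out.append(line)
        elif stripped and not stripped.startswith("%"):
            # Dense data row: append the placeholder class value.
            out.append(stripped + "," + fake_value)
        else:
            # Blank line or comment inside the data section: keep unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "@relation t\n@attribute a numeric\n@data\n1.0\n2.5\n"
    print(add_fake_class(sample, ["yes", "no"]), end="")
```

  Because the placeholder label is arbitrary, any accuracy Weka reports on such a file is meaningless; only the predicted classes are of interest.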


  OneR notes.

  Ensemble notes.


  Data Mining Proposal form.


  In Weka, neural networks are called multilayer perceptrons (MultilayerPerceptron) and are listed under functions.

  Graph mining notes.


  Notes on fuzzy clustering.


  Wrapper notes.


  Collaborative filtering notes.


  Feature creation notes.


  CPU data with vendor. You may remove the vendor field within Weka if you like.


  Receiver Operating Characteristic Analysis (ROC) notes.


  Some statistics notes.


  Slides on 5x2 fold evaluation.


  Data files for Weka (most UCI sets), including breast-cancer and iris.


  Weka, a Java data mining tool.


  Labor data set.


  Pima diabetes data set.


  Vote data set for Weka.


  Bibliography on an approach to dealing with large data sets.


  Hypothyroid data set for Weka.


  The RIPPER paper.


  The quickprop neural network C code.


  UCI Repository of machine learning databases.