Please edit this page to propose changes to the outline or to suggest new topics.

TODO

  • Finish 1st draft revisions
  • Show at least 1 predicted ROC curve
  • Make distance matrix figures easier to read
  • Correct references
  • Add Pareto Rank for performance data

Some datasets are not shown for ROCs because the regression yields negative values for TP or FP, which are not valid. Dominance results are generally bad, so added a Gaussian process regression algorithm to see if it helps. The run crashed with a heap size error.

Added isotonic regression, which here is basically the least-squares line and is not useful. Since there is only 1 line, all points have Pareto rank 1.

Added Pareto Rank correlation. Not good for isotonic regression.

SVM regression predicted negative values for some points, so the output failed to get past the dataset checker. Fixed this; need to re-run experiments.

Abstract

Understanding the performance of a learner under changing bias is an important problem in bias learning and meta-learning. We investigate the use of meta-models as an easily computable summary of the performance of a learner. A meta-model is a regression model that predicts performance measures of a base learner as a function of its parameters. Meta-models are unique across datasets and algorithms, sensitive to bias changes, and amenable to regression. We compare two bases for meta-models: ROC curves and standard performance metrics. Our experiments show that regression preserves the tradeoffs between different performance measures as bias changes.

Draft

The draft is available in PDF format. You can check out the directory to edit the paper.

Questions

Notes

  • Compute within-block average and stdev, compare to other blocks. Show that distance within the block is much smaller than across other blocks.
  • Figures are hard to read, try SubCategoryAxis

Outline

Theme: Meta-models are just like dataset signatures for learned models. We investigate how to create them and how to use them.

Plan

  • Evaluate different meta-models we can create:
    • ROC Curves
    • Common Performance Metrics

  • Compare how well each works as a basis for meta-models, using the following criteria:
    • Distance between different models
    • Ability to do regression
    • Preservation of relative performance characteristics, such as dominance and pairwise comparisons

Paper

Meta-Models

  • Motivate the requirements and basis for the study. What could you do if you had a meta-model? What we really want to know: if I try lots of different parameters, can I use meta-models to understand the performance and its sensitivity to the parameters?
    • Regression can help you draw tradeoff surfaces showing tradeoffs between objectives. Good parameters are those along the tradeoff surface.
    • Can predict performance around key parameters. Sampling, then prediction
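
One way to read "sampling, then prediction" is sketched below: measure performance at a few sampled parameter vectors, then predict performance at an unsampled one. The k-nearest-neighbour regressor is only a stand-in chosen for brevity, not the regression method used in the experiments (the notes mention SVM, Gaussian process and isotonic regression). The name fit_knn_meta_model and the sample values are hypothetical.

```python
from math import dist

def fit_knn_meta_model(samples, k=2):
    """Return a predictor built from (parameter_vector, performance_vector) pairs."""
    def predict(params):
        # Average the measured performance of the k closest sampled parameter vectors.
        nearest = sorted(samples, key=lambda s: dist(s[0], params))[:k]
        n_measures = len(nearest[0][1])
        return tuple(sum(perf[m] for _, perf in nearest) / len(nearest)
                     for m in range(n_measures))
    return predict

# Hypothetical samples: one parameter value -> (TP rate, FP rate).
samples = [((0.0,), (0.0, 0.0)), ((0.5,), (0.6, 0.2)), ((1.0,), (1.0, 1.0))]
meta_model = fit_knn_meta_model(samples, k=2)
predicted_tp, predicted_fp = meta_model((0.4,))
```

Because each prediction is an average of measured rates, it cannot fall below the smallest observed value, so this particular stand-in never produces negative TP or FP.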

Experiments

Implementation

Notes

Reorganize experiment notes.

Distance Matrix

  • See Results
  • Need to change name of results page.
  • Do distance matrix for performance data again
  • Change figure so that it is easier to see which block is which data / alg.
  • If figure is too big, compute the average intra-block and inter-block distances, compare for different distance functions.
  • The distance matrix can be used to evaluate how good a performance measure is for different but similar datasets.
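
The intra-block versus inter-block comparison could be computed along these lines. This is a minimal sketch that assumes the distance matrix is a symmetric list of lists and that each row carries a block label (for example a dataset / algorithm pair). The function name, labels and distances are made up for illustration.

```python
from statistics import mean, pstdev

def block_distance_summary(dist_matrix, labels):
    """Mean and standard deviation of distances within blocks and across blocks."""
    within, across = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):   # upper triangle only, skip the diagonal
            target = within if labels[i] == labels[j] else across
            target.append(dist_matrix[i][j])
    # Note: mean() raises StatisticsError if either list is empty.
    return {
        "within_mean": mean(within), "within_std": pstdev(within),
        "across_mean": mean(across), "across_std": pstdev(across),
    }

# Hypothetical 4x4 distance matrix with two blocks of two models each.
labels = ["a", "a", "b", "b"]
dist_matrix = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
summary = block_distance_summary(dist_matrix, labels)
```

A within-block mean much smaller than the across-block mean is the pattern the figure is meant to show. The same summary can be repeated for each distance function being compared.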

Classification

  • Results
  • Do experiment with performance data
  • See if we can predict classifier, dataset, and the pair with distance matrix.
  • Run all experiments again.
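
A simple way to test whether the distance matrix can predict the classifier, dataset, or pair is nearest-neighbour classification on the precomputed distances. The sketch below is an assumption about how this could be done, not the method used in the experiments. It uses leave-one-out rather than the 5x5 CV mentioned in the notes, and the function name and data are hypothetical.

```python
def loo_nearest_neighbor_accuracy(dist_matrix, labels):
    """Leave-one-out accuracy of 1-nearest-neighbour on a precomputed distance matrix."""
    n = len(labels)
    correct = 0
    for i in range(n):
        # The nearest other model supplies the predicted label for model i.
        nearest = min((j for j in range(n) if j != i), key=lambda j: dist_matrix[i][j])
        correct += labels[nearest] == labels[i]
    return correct / n

# Hypothetical distances between four models; labels could be dataset,
# learner, or (dataset, learner) pairs.
labels = ["a", "a", "b", "b"]
dist_matrix = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
accuracy = loo_nearest_neighbor_accuracy(dist_matrix, labels)
```

Running it three times with different label lists (dataset, learner, pair) would give one accuracy per prediction target.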

Parameter Sensitivity

  • Results
  • Analyze the correlation between ROC distance (or performance data) and a key feature of the weights; use the maximum weight.

Regression

  • Regression Error
    • See Results.
    • Do regression for all UCI datasets
    • Do regression for performance data.

  • Dominance Measure

  • Compute rank correlation for Pareto Rank
    • Pareto rank: the index of the layer of undominated points in which a point is removed, when layers are peeled off one at a time.
    • The first layer of undominated points has rank 1. The next layer has rank 2, and so on.
    • The simplified formula for Spearman's rank correlation does not apply if the ranks have ties. Ties are to be expected in Pareto ranking.
    • Instead, use regular correlation on the ranks. Need to compute the Pareto ranks first.
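
The layer-peeling definition and the correlation step can be sketched as follows. The code assumes every objective is "larger is better", so an FP rate would be passed as 1 - fp, and it reads "regular correlation" as Pearson correlation on the ranks. The function names and example points are illustrative only.

```python
from math import sqrt

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_ranks(points):
    """Peel off layers of undominated points; the first layer gets rank 1."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    rank = 1
    while remaining:
        layer = {i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining if j != i)}
        for i in layer:
            ranks[i] = rank
        remaining -= layer
        rank += 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)

# Hypothetical (tp, 1 - fp) pairs: measured, and predicted by a regression model.
measured = [(0.9, 0.2), (0.8, 0.6), (0.5, 0.9), (0.7, 0.5), (0.4, 0.4)]
predicted = [(0.85, 0.25), (0.75, 0.55), (0.55, 0.8), (0.8, 0.6), (0.3, 0.3)]
measured_ranks = pareto_ranks(measured)
predicted_ranks = pareto_ranks(predicted)
rank_correlation = pearson(measured_ranks, predicted_ranks)
```

If a regression model puts every predicted point on a single front, all predicted ranks equal 1, the variance is zero, and the correlation is undefined. The sketch returns NaN in that case.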

Notes

  • Distance using performance data, 10 folds for distance matrix, 5x5 for dataset prediction.
  • Change ROC classification to use 5x5 CV.
  • Generated performance data by training with 100 sampled parameter vectors on 1 training fold for each dataset. The dataset has the following fields: weights || acc, tp, fp, prec, predicted, dataset index, learner index
  • Generated the ROC dataset in the Dataset Analysis experiment package. It has the following fields: weights, threshold || fp, tp, pred fp, pred tp, dom1, dom2, shuff (odd or even ROC curve index, used in cross-validation). ROCs still show infinity as the initial threshold; added a filter in the parameter sensitivity code.
  • For regression, used nu = 0.8 for SVM.

Related Work

Please add references and suggestions for references here.

Read papers on ROC curves, see if similar work has been done.

ROC Analysis
  • Found a good paper (see Refs) that uses Gaussian processes to do regression on the parameter space of SVMs. It used an online learner to refine the regression model as more parameters were sampled by an online optimization algorithm. The same person had already tried GAs.

Bias Learning

Applications
    • Any predictive modeling or optimization




