dc.contributor.author | Delany, Sarah Jane | |
dc.contributor.author | Cunningham, Padraig | |
dc.contributor.author | Doyle, Donal | |
dc.date.accessioned | 2008-01-28T10:26:52Z | |
dc.date.available | 2008-01-28T10:26:52Z | |
dc.date.issued | 2005-02-05 | |
dc.identifier.citation | Delany, Sarah Jane; Cunningham, Padraig; Doyle, Donal. 'Generating Estimates of Classification Confidence for a Case-Based Spam Filter'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-20, 2005, pp12 | en |
dc.identifier.other | TCD-CS-2005-20 | |
dc.identifier.uri | http://hdl.handle.net/2262/13438 | |
dc.description.abstract | Producing estimates of classification confidence is surprisingly
difficult. One might expect that classifiers that can produce numeric
classification scores (e.g. k-Nearest Neighbour or Naive Bayes)
could readily produce confidence estimates based on thresholds. In fact,
this proves not to be the case, probably because these are not probabilistic
classifiers in the strict sense. The numeric scores coming from
k-Nearest Neighbour or Naive Bayes classifiers are not well correlated
with classification confidence. In this paper we describe a case-based
spam filtering application that would benefit significantly from an ability
to attach confidence predictions to positive classifications (i.e. messages
classified as spam). We show that 'obvious' confidence metrics for
a case-based classifier are not effective. We propose an ensemble-like solution
that aggregates a collection of confidence metrics and show that
this offers an effective solution in this spam filtering domain. | en |
dc.format.extent | 166954 bytes | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | en |
dc.publisher | Trinity College Dublin, Department of Computer Science | en |
dc.relation.ispartofseries | Computer Science Technical Report | en |
dc.relation.ispartofseries | TCD-CS-2005-20 | en |
dc.relation.haspart | TCD-CS-[no.] | en |
dc.subject | Computer Science | en |
dc.title | Generating Estimates of Classification Confidence for a Case-Based Spam Filter | en |
dc.type | Technical Report | en |
dc.identifier.rssuri | https://www.cs.tcd.ie/publications/tech-reports/reports.05/TCD-CS-2005-20.pdf | |