Note: Scrubber 3.X is being ported to Apache cTAKES; this is an interim BETA release.

1. Intended usages

1.1 Default configuration

...

  • Scrubber can switch between classifier implementations without recompiling the software.
  • By default, Scrubber dynamically loads the WEKA C4.5 decision tree classifier, which supports multi-class classification.
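
The plugin idea above can be illustrated with a minimal sketch of loading a classifier by class name at runtime. Everything named here is an assumption made for illustration and is not taken from Scrubber's code: the `TokenClassifier` interface, the `DefaultTreeClassifier` stand-in, and the `classifier.class` property key are all hypothetical, and the toy rule merely takes the place of the WEKA decision tree.

```java
import java.util.Properties;

public class ClassifierPluginSketch {

    /** Hypothetical plugin contract; not Scrubber's actual interface. */
    public interface TokenClassifier {
        String classify(String token);
    }

    /** Stand-in for the default classifier (a role the WEKA decision tree plays in Scrubber). */
    public static class DefaultTreeClassifier implements TokenClassifier {
        @Override
        public String classify(String token) {
            // Toy rule for illustration only: all-digit tokens are labeled PHI.
            return token.matches("\\d+") ? "PHI" : "NOT_PHI";
        }
    }

    /**
     * Instantiates the classifier named by the (hypothetical) "classifier.class"
     * property, falling back to the stand-in default when the key is absent.
     */
    public static TokenClassifier load(Properties props) throws ReflectiveOperationException {
        String className = props.getProperty(
                "classifier.class", DefaultTreeClassifier.class.getName());
        Class<?> cls = Class.forName(className);
        return (TokenClassifier) cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Changing this value selects a different implementation without recompiling.
        props.setProperty("classifier.class", DefaultTreeClassifier.class.getName());
        TokenClassifier classifier = load(props);
        System.out.println(classifier.getClass().getSimpleName()
                + " labels 12345 as " + classifier.classify("12345"));
    }
}
```

Because the class name is read from configuration, any class on the classpath that implements the interface and has a public no-argument constructor could be substituted by editing the property value alone.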

2. Software Features

2.1 Annotation

  • Annotate word tokens and redact PHI from physician notes
  • cTAKES lexical parsing and medical dictionary annotation
  • WEKA multi-class decision tree classifier (plugin default)
  • Protégé UI support for human expert curators (reads output)
  • Generate feature sets containing lexical properties, medical concept codes, and human-defined rules

...

  • Compare lexical properties and distributions of public and private text sources 

3. How To

3.X Install / Train / Test / Scrub

...

4. scrubber.properties

4.1 Java Object

...