Date: Mon, 28 Jan 2013 10:23:46 -0500
Please join Jeff Solka, PhD, for the Colloquium on Tuesday, January 29, 2013, at 4:30 pm. The presentation will be held in room 256 of Bull Run Hall on the PW campus.
The speaker will be Dr. Elizabeth Hohman (NSWCDD).
> TITLE: Statistical Methods in Text Analysis
>
> ABSTRACT: This talk is structured like a mini-tutorial of text
> analysis using the R programming language and environment. We use
> PubMed to download an example corpus and perform the parsing,
> classification, and clustering in R. Instead of using R text packages
> such as tm, we represent the documents as a matrix and apply some
> standard classification and clustering techniques. All code is
> included in the slides and can be run on your own PubMed download. The
> focus is on understanding the math behind the techniques, not on
> efficiency. After understanding basics such as the TFIDF (term
> frequency inverse document frequency) representation of a corpus, one
> can be better prepared to use the available text mining packages.
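
For readers unfamiliar with the TFIDF representation mentioned in the abstract, here is a minimal sketch of building a document-by-term matrix by hand. It is not the speaker's code: the talk's examples are in R and are not reproduced in this announcement, so Python is used here purely for illustration. The toy corpus, the function name tfidf_matrix, and the choice of log(N / df) as the inverse document frequency are all assumptions; several other TFIDF weightings are in common use.

```python
import math
from collections import Counter

# Toy corpus (hypothetical; stands in for a real PubMed download).
docs = [
    "gene expression in cancer cells",
    "cancer therapy and gene therapy",
    "clustering of text documents",
]

def tfidf_matrix(docs):
    """Return (vocab, matrix): one row per document, one column per term."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Assumed IDF variant: log(N / df). Terms in every document get weight 0.
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        # Term frequency is the count normalized by document length.
        matrix.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vocab, matrix

vocab, X = tfidf_matrix(docs)
```

Each row of X is a document vector, so standard classification or clustering methods can be applied to the rows directly, which is the matrix view the abstract describes.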