BIOSCIENCES-L Archives

January 2013

BIOSCIENCES-L@LISTSERV.GMU.EDU

Sender: Biosciences Graduate Students <[log in to unmask]>
Subject:
From: Tiffany Sandstrum <[log in to unmask]>
Date: Mon, 28 Jan 2013 10:23:46 -0500
Comments: To: SSB Faculty <[log in to unmask]>, binf phd students entry students entry <[log in to unmask]>, binf ms students entry students entry <[log in to unmask]>, BIOL MS <[log in to unmask]>; cc: [log in to unmask]
Reply-To:
Please join Jeff Solka, PhD, for the Colloquium on Tuesday, January 29, 2013, at 4:30 pm. The presentation will be held in room 256 of Bull Run Hall on the PW campus.

The speaker will be Dr. Elizabeth Hohman (NSWCDD).

> TITLE: Statistical Methods in Text Analysis
> 
> ABSTRACT: This talk is structured like a mini-tutorial of text
> analysis using the R programming language and environment. We use
> PubMed to download an example corpus and perform the parsing,
> classification, and clustering in R. Instead of using R text packages
> such as tm, we represent the documents as a matrix and apply some
> standard classification and clustering techniques. All code is
> included in the slides and can be run on your own PubMed download. The
> focus is on understanding the math behind the techniques, not on
> efficiency. After understanding basics such as the TFIDF (term
> frequency-inverse document frequency) representation of a corpus, one
> is better prepared to use the available text mining packages.
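
For readers who want a feel for the TFIDF representation mentioned in the abstract before the talk, here is a minimal sketch. It is not the speaker's code (the talk uses R and its code is in the slides); it is written in Python, and the function names, the three sample documents, and the particular weighting (raw term count times log(N / document frequency), one common variant among several) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Crude parser: lowercase the text and keep runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_matrix(docs):
    """Return (vocab, matrix): one row per document, one column per term."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = {t: sum(1 for c in counts if t in c) for t in vocab}
    # Assumed idf variant: log(N / df). A term found in every document gets 0.
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    matrix = [[c[t] * idf[t] for t in vocab] for c in counts]
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity between two rows; 0.0 if either row is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up stand-ins for PubMed abstracts.
docs = [
    "gene expression in tumor cells",
    "tumor gene mutation analysis",
    "clustering of text documents",
]
vocab, X = tfidf_matrix(docs)
print(vocab)
print(cosine(X[0], X[1]), cosine(X[0], X[2]))
```

Once the corpus is a matrix like X, standard classification and clustering methods can be applied to its rows, which is the approach the abstract describes.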