Dissertation Defense Announcement
To:  The George Mason University Community

Candidate: Tugba Onal-Suzek
Program:    PhD Bioinformatics & Computational Biology

Date:   Friday, July 20, 2012
Time:   11:00 a.m.
Place:  George Mason University
        Occoquan Bldg. Room 203
        Prince William Campus
Dissertation Director: Dr. Jeffrey L. Solka
Committee members: Dr. James D. Willett, Dr. Donald Seto
Title: "The Text-Mining Based Neighboring and Automated Annotation of PubChem BioAssays"

The dissertation is on reserve in the Johnson Center Library, Fairfax campus.
The doctoral project will not be read at the meeting, but should be read in advance.

All members of the George Mason University community are invited to attend.


The number of high-throughput screening (HTS) assays, namely BioAssays, deposited in PubChem has grown quickly in recent years. Currently available grouping tools and single retrieval analysis approaches for the BioAssays have become impractical given the rapidly increasing volume of data. In this work, a text-mining based approach is proposed for the automated neighboring and annotation of BioAssays using their unstructured text descriptions. Our results from assay neighbor clustering analysis, compared with the existing assay neighboring methods, suggest that strong correlations among the BioAssays can be identified from their conceptual relevance and that this approach complements existing neighboring methods in PubChem. A novel method to extract keywords from a single unstructured text document is described, and the comparative performance of the method when applied to the BioAssay descriptions is reported. Finally, results of the automated biomedical annotation of the BioAssay text descriptions towards the discovery of chemical substances that satisfy the probe criteria are presented.