MS-CS-L Archives

MS-CS-L@LISTSERV.GMU.EDU


Subject:

Reminder: [GRAND Seminar] Clustering Algorithms for Streaming and Online Settings - Friday, April 13, noon, Claire Monteleoni

From:

Jyh-Ming Lien <[log in to unmask]>

Date:

Thu, 12 Apr 2012 22:02:34 -0400

[Apologies for multiple postings]

Please join us.

**************************************************
*
*
*    GRAND Seminar
*
*    http://cs.gmu.edu/~robotics/Main/GrandSeminar
*
*
**************************************************


*Title*

Clustering Algorithms for Streaming and Online Settings

*Time/Venue*

April 13, noon, Friday
ENGR 4201

*Speaker*

Claire Monteleoni
Assistant professor
Department of Computer Science
George Washington University

*Host*

Amarda Shehu

*Abstract*

Clustering techniques are widely used to summarize large quantities of
data (e.g., aggregating similar news stories); however, their outputs
can be hard to evaluate. While a domain expert could judge the quality
of a clustering, having a human in the loop is often impractical.
Probabilistic assumptions have been used to analyze clustering
algorithms, for example i.i.d. data, or even data generated by a
well-separated mixture of Gaussians. Without any distributional
assumptions, one can analyze clustering algorithms by formulating some
objective function, and proving that a clustering algorithm either
optimizes or approximates it. The k-means clustering objective, for
Euclidean data, is simple, intuitive, and widely cited; however, it is
NP-hard to optimize, and few algorithms approximate it, even in the
batch setting (the algorithm known as "k-means" does not have an
approximation guarantee). Dasgupta (2008) posed open problems for
approximating it on data streams.
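
As a rough illustration of the objective discussed above (not code from
the talk), the sketch below computes the k-means cost of a set of
centers and performs k-means++-style seeding, where each new center is
sampled with probability proportional to its squared distance from the
nearest center chosen so far. The function names `sq_dist`,
`kmeans_cost`, and `dsq_seeding` are hypothetical, and this is the batch
seeding step only, under the assumption that points are tuples of
floats.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, centers):
    """k-means objective: sum over points of the squared distance
    to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def dsq_seeding(points, k, rng):
    """k-means++-style seeding (hypothetical helper, batch setting).

    The first center is drawn uniformly; each later center is drawn
    with probability proportional to the squared distance from the
    point to its nearest already-chosen center.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = dsq_seeding(points, 2, random.Random(0))
cost = kmeans_cost(points, centers)
```

Because already-chosen points get weight zero, the second center here
always differs from the first; the seeding tends to spread centers
across well-separated groups, which is the intuition behind its
approximation guarantee.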

In this talk, I will discuss my ongoing work on designing clustering
algorithms for streaming and online settings. First I will present a
one-pass, streaming clustering algorithm which approximates the
k-means objective on finite data streams. This involves analyzing a
variant of the k-means++ algorithm, and extending a divide-and-conquer
streaming clustering algorithm from the k-medoid objective. Then I
will turn to endless data streams, and introduce a family of
algorithms for online clustering with experts. We extend algorithms
for online learning with experts to the unsupervised setting, using
intermediate k-means costs, instead of prediction errors, to re-weight
experts. When the experts are instantiated as k-means approximate
(batch) clustering algorithms run on a sliding window of the data
stream, we provide novel online approximation bounds that combine
regret bounds extended from supervised online learning, with k-means
approximation guarantees. Notably, the resulting bounds are with
respect to the optimal k-means cost on the entire data stream seen so
far, even though the algorithm is online. I will also present
encouraging experimental results.
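
The re-weighting idea described above can be sketched with a generic
exponential-weights update. This is an assumption-laden illustration,
not the algorithm presented in the talk: each expert is taken to be a
fixed list of centers, its loss on a new point is the squared distance
from that point to its nearest center, and the learning rate `eta` and
the names `reweight` and `expert_centers` are hypothetical.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def reweight(weights, expert_centers, x, eta=0.5):
    """One exponential-weights step (illustrative sketch).

    Each expert's loss is the k-means cost it incurs on the new point x,
    i.e. the squared distance from x to that expert's nearest center.
    Weights are scaled by exp(-eta * loss) and renormalized to sum to 1.
    """
    losses = [min(sq_dist(x, c) for c in centers) for centers in expert_centers]
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Two hypothetical experts: one with two well-placed centers, one with
# a single center in the middle.
expert_centers = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(5.0, 5.0)],
]
weights = [0.5, 0.5]
for x in [(0.0, 1.0), (10.0, 9.0), (0.0, 0.0)]:
    weights = reweight(weights, expert_centers, x)
```

After the three points, the first expert carries most of the weight,
since its centers sit close to the observed data. In the setting the
abstract describes, the experts would instead be batch clustering
algorithms rerun on a sliding window of the stream.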

This talk is based on joint work with Nir Ailon, Ragesh Jaiswal, and
Anna Choromanska.

Short bio:

Claire Monteleoni is an assistant professor of Computer Science at
George Washington University. Previously, she was research faculty at
the Center for Computational Learning Systems, and adjunct faculty in
the Department of Computer Science, at Columbia University. She did a
postdoc in Computer Science and Engineering at the University of
California, San Diego, and completed her PhD and Master's in Computer
Science at MIT. Her research focus is on machine learning algorithms
and theory for problems including learning from data streams, learning
from raw (unlabeled) data, learning from private data, and Climate
Informatics: accelerating discovery in Climate Science with machine
learning. Her papers have received several awards, and she currently
serves on the Senior Program Committee of the International Conference
on Machine Learning, and the Editorial Board of the Machine Learning
Journal.

-- 
*Jyh-Ming Lien*
Assistant Professor, George Mason University
+1-703-993-9546
http://cs.gmu.edu/~jmlien
