We construct hierarchical Bayesian models of user preference.
These models use nonparametric priors, which give great flexibility in modeling the
true distribution of user preferences and lead to better predictions.
(Current project)
Intrusion attackers differ in their preferences for the attack tools they use and the networks they target.
We model each attacker's preference with a preference function, and the distribution of these
functions across the attacker population with a Gaussian process prior. The Bayesian model is learned from
shared cyber-alert reports and then used to predict future attacks.
(Oct, 06 -- Oct, 07, Collaborators: Phil Porras, Johannes Ullrich)
We proposed a novel framework for co-clustering related objects, such as documents and the words
they contain. In this framework, an approximate clustering structure on one type of
object (e.g., terms) is used to refine the similarity between objects of the other type
(e.g., documents). Alternating this refinement between the two types leads to better clustering of both.
(Jan, 07 -- Jun, 07)
We designed and implemented a scalable algorithm for mining correlations among objects in extremely
large data sets using locality-sensitive hashing. The algorithm outperforms the next-best algorithm by
several orders of magnitude.
(Jan, 06 -- Jun, 06, Collaborator: Joan Feigenbaum)
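As a rough illustration of how locality-sensitive hashing avoids comparing every pair of objects, the sketch below uses MinHash signatures with banding to propose candidate pairs of similar sets. It is a generic textbook construction, not the algorithm from this project; the function names, the parameters `num_hashes` and `bands`, and the choice of BLAKE2 as the hash are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes=32):
    """Return a MinHash signature for a non-empty collection of hashable items."""
    signature = []
    for seed in range(num_hashes):
        # Each seed acts as an independent hash function over the items.
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{x}".encode(), digest_size=8).digest(),
                "big",
            )
            for x in items
        ))
    return signature


def candidate_pairs(sets, num_hashes=32, bands=8):
    """Propose pairs of set names whose signatures agree on at least one band.

    `sets` maps a name to a non-empty set of items (hypothetical input format).
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for name, items in sets.items():
        sig = minhash_signature(items, num_hashes)
        for b in range(bands):
            # Sets that share every row of a band land in the same bucket.
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            pairs.add((a, b))
    return pairs
```

Only pairs that share a bucket are compared further, so the work grows with the number of candidate pairs rather than with the square of the number of objects. More bands make the filter more permissive; more rows per band make it stricter.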
We applied wavelet analysis to extract features from BGP update sequences.
The features were then used by a nearest-neighbor learning process to identify outliers/anomalies in
BGP updates.
(Sep, 04 -- May, 05, Collaborators: Joan Feigenbaum, Jennifer Rexford)
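The pipeline described above can be sketched in a few lines, under several assumptions that are not taken from the project itself: each sequence is a list of update counts per time bin, the wavelet is the Haar wavelet, the features are per-level detail energies, and the outlier score is the distance to the k-th nearest neighbor. All names below are hypothetical.

```python
import math


def haar_level_energies(signal):
    """Energy of the Haar detail coefficients at each decomposition level.

    The signal length must be a power of two. The first entry is the finest level.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


def knn_outlier_scores(features, k=2):
    """Score each feature vector by its distance to its k-th nearest neighbor.

    Requires k < len(features). Larger scores indicate more isolated points.
    """
    scores = []
    for i, f in enumerate(features):
        dists = sorted(
            math.dist(f, g) for j, g in enumerate(features) if j != i
        )
        scores.append(dists[k - 1])
    return scores


# Hypothetical update counts per time bin; the last series contains a burst.
series = [
    [5, 6, 5, 6, 5, 6, 5, 6],
    [6, 5, 6, 5, 6, 5, 6, 5],
    [5, 5, 6, 6, 5, 5, 6, 6],
    [6, 6, 5, 5, 6, 6, 5, 5],
    [5, 6, 5, 90, 5, 6, 5, 6],
]
scores = knn_outlier_scores([haar_level_energies(s) for s in series])
most_anomalous = scores.index(max(scores))
```

In this toy data the burst concentrates energy at the finest wavelet level, which places the last series far from the others in feature space.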
We designed algorithms specific to the data-stream model for geometric and graph problems such as
diameter and shortest-path distances. We also proved lower bounds on the space these problems require in the streaming model.
(2002 -- 2004, Supervisor: Joan Feigenbaum)
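To give a flavor of what a one-pass, small-space algorithm for the diameter problem looks like, here is a standard textbook sketch, not one of the algorithms from this project. It stores only the first point and the largest distance seen from it; by the triangle inequality, the true diameter lies between that distance and twice that distance. The function name is hypothetical.

```python
import math


def streaming_diameter_estimate(points):
    """One-pass, constant-space estimate r with r <= diameter <= 2 * r.

    `points` is any iterable of equal-length coordinate tuples.
    """
    it = iter(points)
    try:
        anchor = next(it)  # the only point kept in memory
    except StopIteration:
        return 0.0
    farthest = 0.0
    for p in it:
        farthest = max(farthest, math.dist(anchor, p))
    return farthest
```

The factor of two is the price of keeping a single point; tighter approximations need more memory, which is the kind of trade-off that space lower bounds make precise.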
We designed and implemented computational models to simulate how human attention affects visual processing.
(1999, Supervisor: Larry Abbott)