Multimedia databases are growing rapidly in the digital age. To increase the value of these data and enhance the user experience, the videos they hold must be made searchable through automatic indexing. Because people appearing and talking in ...
Motivation: Unbiased clustering methods are needed to analyze growing numbers of complex data sets. Currently available clustering methods often depend on user-set parameters, lack stability, and are not applicable to small data sets. ...
We present new techniques to analyze natural local search algorithms for several variants of the max-sum diversification problem which, in its most basic form, is as follows: given an $n$-point set $X \subseteq \mathbb{R}^d$ and an integer $k$, select $k$ points in $X$ so th ...
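As an illustration only, a single-swap local search for this kind of problem can be sketched as follows. The abstract is cut off before it states the objective, so the sketch assumes the usual max-sum reading (maximize the sum of pairwise Euclidean distances among the chosen points); the function names `dispersion` and `local_search` are hypothetical, and the swap rule is a generic one, not necessarily the variant the paper analyzes.

```python
import itertools
import math


def dispersion(points):
    """Sum of pairwise Euclidean distances (assumed objective)."""
    return sum(math.dist(p, q) for p, q in itertools.combinations(points, 2))


def local_search(X, k):
    """Single-swap local search (illustrative sketch).

    Starts from the first k points and keeps exchanging one chosen point
    for one unchosen point as long as the exchange strictly increases
    the objective. Returns the sorted indices of the chosen points.
    """
    chosen = list(range(k))
    improved = True
    while improved:
        improved = False
        best = dispersion([X[i] for i in chosen])
        for pos in range(k):
            for cand in range(len(X)):
                if cand in chosen:
                    continue
                # Try replacing the point at position `pos` with `cand`.
                trial = chosen[:pos] + [cand] + chosen[pos + 1:]
                value = dispersion([X[i] for i in trial])
                if value > best + 1e-12:
                    chosen, best, improved = trial, value, True
    return sorted(chosen)


if __name__ == "__main__":
    X = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    picked = local_search(X, 2)
    print(picked, dispersion([X[i] for i in picked]))
```

The loop terminates because every accepted swap strictly increases the objective and there are finitely many k-subsets. The result is a local optimum, which need not be a global one.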
Power gating is a common approach for reducing circuit static power consumption. In FPGAs, resources that dominate static power consumption lie in the routing network. Researchers have proposed several heuristics for clustering multiplexers in the routing ...
Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models on this task. We address the real-world problem of modeling patent-to ...
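To make "vector space model" concrete, here is a minimal sketch of one such model: TF-IDF weighting with cosine similarity. The abstract does not say which models the paper evaluates, so TF-IDF is an assumption chosen for illustration; the function names, the whitespace tokenizer, the smoothed IDF formula, and the toy documents are likewise hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency (one common variant).
            term: count * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = [
        "a method for cooling a battery pack",
        "a battery pack cooling method",
        "a recipe for baking bread",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In this toy corpus the first two documents share most of their terms, so they score higher against each other than either does against the third.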
Data is pervasive in today's world and has been for quite some time. With the increasing volume of data to process, there is a need for techniques that are faster than, and at least as accurate as, those we already have. In particular, the last decade recorded t ...
We study the problem of constructing $\epsilon$-coresets for the $(k, z)$-clustering problem in a doubling metric $M(X, d)$. An $\epsilon$-coreset is a weighted subset $S \subseteq X$ with weight function $w : S \to \mathbb{R}_{\ge 0}$, such that for any $k$-subset $C \in$ ...