## clustering question

### clustering question

Hello,

Maybe I can get some help on this site.

I am looking for a clustering algorithm that is not swayed by point density and can isolate a single point as its own cluster, even when the other clusters are high-density groups.

I would like to give my example:

I have around 100 points scattered around x=1, y=1.

I have another 100 points scattered around x=1.2, y=1.2.

Now I add another point (just one) at x=100, y=100.

I would like an algorithm that will either see it as a new cluster, or see the 200 points as one cluster and the last point as another cluster.

Most algorithms I tested join the last point to one of the groups. Even when I define more than 3 groups (I even tried 10), none of them isolated it.

What type of algorithm will be able to deal with such scenarios?

My real data set is 5D, not 2D as I presented here. I'd be glad to provide it if anyone wants to play with it.

I would appreciate any help on this matter,

T.

### Have you looked at MST-based clustering algos?

You're probably looking for a graph-based (MST-based) clustering algo. Here is a link to an algo that I've found highly useful: http://people.cs.uchicago.edu/~pff/segment/

The algorithm only works on pairwise distances, so if you define your pairwise metric appropriately, the dimensionality of your space will no longer matter.

I have it coded in Matlab and can port it to Python if there is enough interest.
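To make the general idea concrete, here is a rough Python sketch. It is not a port of the linked code or of the Matlab version, just a simpler stand-in: build the minimum spanning tree over the points, cut every tree edge that is much longer than the typical edge, and treat the remaining connected pieces as clusters. The names `mst_edges`, `mst_clusters` and the `factor` threshold are made up for illustration, and the 0.05 spread of the two groups is an assumption, since the original question does not give one. Because `math.dist` accepts points of any dimension, the same code should run unchanged on 5D data.

```python
import math
import random
from statistics import median


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the point outside the tree that is cheapest to attach
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, factor=10.0):
    """Cut MST edges longer than factor * median edge length; label the pieces.

    `factor` is a hypothetical tuning knob, not a value from the thread.
    """
    if len(points) < 2:
        return [0] * len(points)
    edges = mst_edges(points)
    cutoff = factor * median(w for w, _, _ in edges)
    root = list(range(len(points)))  # union-find forest

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for w, i, j in edges:
        if w <= cutoff:  # keep short edges, drop the long ones
            root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


# The example from the question, with an assumed spread of 0.05 per group.
random.seed(0)
pts = [(random.gauss(1.0, 0.05), random.gauss(1.0, 0.05)) for _ in range(100)]
pts += [(random.gauss(1.2, 0.05), random.gauss(1.2, 0.05)) for _ in range(100)]
pts.append((100.0, 100.0))

labels = mst_clusters(pts)
print("points sharing the far point's label:", labels.count(labels[-1]))
```

The cut depends only on edge lengths in the tree, not on how many points sit on either side of an edge, which is why a single far-away point can end up as its own cluster. Whether the two dense groups stay separate or merge depends on `factor` and on how much they overlap. The dense Prim loop is quadratic in the number of points, so this is only meant for small data sets.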

Yakov
