I'm working on a data mining project on clustering.

I've implemented k-means in C++, but I'm having problems with it.

My code gives an "illegal operation" error every time I run it, but when I step through it in the debugger the trace comes out correct.

Will writing the code in Java make a difference?

Also, I want the input to come from an Excel file (currently I'm taking input from a text file).
Is there any way I can read data from an Excel file in a C++ or Java application?

I would be really grateful if anyone could mail me (or link to) working k-means code in Java or C++, so that I can cross-check my implementation and possibly find out where I'm going wrong.

Also, are there any good standard books on clustering?

Dang :

I do not think that there is any automated tool for converting C++ programs to C#.

C# can be used to handle datasets of essentially any size. How big is your dataset?

C# offers the ArrayList, which is comparable to a dynamically sized array.

I believe that you can convert your code to C#. How many classes and lines of code are you talking about?

If your code is not proprietary and the source code is not copyrighted, you can start by posting the pseudocode and algorithm that completely describe your k-means implementation.

If it cannot be disclosed publicly on a forum we can look at other possible means of collaboration.


Hi Kingsley Tagbo,

I've tried running my code in Java, and the illegal operation problem is solved.

The code runs on a very small dataset.

But now, if I run it on a larger dataset, say with 1000 data items, an array index out of bounds exception arises.

Can you please suggest some other means of collaboration, so that we can discuss the source code and work out where I'm going wrong?

This would be of great help.

I tried Java because my guide told me that array size is not restricted in Java, and that this property can help in data mining since we have very large datasets.


To defend C++...

C++ can handle datasets of up to any size (bounded by the stack/heap size as with C#) if you use the STL, or alloc the memory yourself...

C++ offers std::list, std::vector, etc. There are lots of containers in C++; if these are unfamiliar you are missing out, but you can resort to malloc. If you know how big the file is, then assuming every byte of the file is a double, (size of file * size of double) will be plenty for the data. For truly HUGE datasets, you probably want some code that processes a few data points at a time rather than reading the whole lot into memory.

Frances.
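To make the two options above concrete, here is a minimal sketch. It assumes the input is whitespace-separated numbers; the names readAll, RunningMean and streamMean are made up for this example and are not from anyone's code in this thread. The first function lets a std::vector grow as data arrives, so no array size is fixed up front. The second keeps only a running sum and count, which is one way to process data points as they are read instead of holding them all in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the std::istringstream usage shown below
#include <vector>

// Read whitespace-separated numbers from any input stream into a vector
// that grows as needed, so no array size has to be chosen in advance.
std::vector<double> readAll(std::istream& in) {
    std::vector<double> values;
    double x;
    while (in >> x) {
        values.push_back(x);
    }
    return values;
}

// Streaming alternative: remember only a running sum and a count,
// so memory use stays constant however large the input is.
struct RunningMean {
    double sum = 0.0;
    std::size_t count = 0;

    void add(double x) {
        sum += x;
        ++count;
    }

    double mean() const {
        assert(count > 0 && "mean of zero values is undefined");
        return sum / static_cast<double>(count);
    }
};

// Consume the stream one value at a time without storing the values.
RunningMean streamMean(std::istream& in) {
    RunningMean m;
    double x;
    while (in >> x) {
        m.add(x);
    }
    return m;
}
```

Both functions take a std::istream, so the same code works with a std::ifstream opened on a text file or with a std::istringstream such as `std::istringstream in("1 2 3 4");` for a quick check.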

Hi, I am a student doing my final project, which is on data mining with clustering. I'll be comparing it with the k-means algorithm in VB .NET, so please, could someone help me with source code for the k-means method? I want to know how it works.

Please send it to my email at

Yes, I can explain it to you; it's easy. Can you tell me which language you prefer? Or would you like to see sample code along with the output?

I have some demographic data and movie types.

I want to construct a decision tree and test it using some simple Java code, but there are a lot of algorithms (ID3, C4.5, C&RT and so on). Which algorithm is appropriate for this?

Is there a java code for this implementation?

Could you please provide me with Java code for the QuickReduct algorithm?

Hi,

I am desperately in need of C++ code for the FD_Mine algorithm for decision trees, executable on Windows XP. Could someone please mail me the code? yxj_aqi@163.com

Thanks,
Yande

Dang :

If you are looking for an object oriented programming language that is easier to debug than C++ and also very productive, try C#.NET.

I would not recommend Java even though you can certainly build the program using Java.

C# will probably take you one day to learn as a C++ programmer. However, the simplicity of the language does not do adequate justice to the flexibility of C#.

C# is a modern object-oriented programming language which provides a more productive environment than either Java or C++.



Actually, the reason I want to switch over to Java is that the size of an array is not limited in Java as it is in C++.

Will I be able to process huge datasets in C#?

I've already implemented the code in C++, as I wrote in my previous message (though there is some problem with it).

Is it possible to convert it to C# easily?

Thanks for your advice.

Hi everyone,

I would like to get some sample source code in VB that clusters data read from a file.

Thanks for your help.

I am currently beta testing the second version of my C# .NET Apriori algorithm implementation.

The design and implementation of the tool took me less than 48 hours.

please see

When you say "illegal operation", what exactly is the message, and when does it happen? Try putting in some breakpoints to isolate where it occurs; this may help you spot the problem.

There is probably no point in us e-mailing you code to compare and contrast; when variable names etc. differ, this can cloud the issue. If you want, post your code, and we may be able to spot mistakes. Also, the people at ACCU are very good at helping with C++/Java issues, if you are specific. As a first port of call: are you reading your text file into a statically sized array? If so, check that it is big enough for the data. If it is not, writing past the end will cause a crash or an exception. Or are you dividing by zero somewhere? Like I said, find out where the problem happens, and then it may be easier to solve.
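For anyone following along, here is a rough k-means sketch showing where those two failure modes can be guarded against. It is not the original poster's code (which was never posted), and the names Point, squaredDistance, nearestCentroid and kMeans are made up for this example. It assumes the caller supplies the initial centroids. Storage is std::vector, so nothing is sized statically, and a cluster that ends up with no points keeps its old centroid rather than dividing by a zero count.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of the same dimension.
double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        d += diff * diff;
    }
    return d;
}

// Index of the centroid closest to p.
std::size_t nearestCentroid(const Point& p, const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        const double d = squaredDistance(p, centroids[c]);
        if (d < bestDist) {
            bestDist = d;
            best = c;
        }
    }
    return best;
}

// Lloyd's iterations. `centroids` holds the initial guesses and is updated
// in place; the return value is the cluster index assigned to each point.
std::vector<std::size_t> kMeans(const std::vector<Point>& points,
                                std::vector<Point>& centroids,
                                int maxIterations) {
    std::vector<std::size_t> labels(points.size(), 0);
    const std::size_t dim = centroids.empty() ? 0 : centroids[0].size();

    for (int iter = 0; iter < maxIterations; ++iter) {
        // Assignment step: attach every point to its nearest centroid.
        bool changed = false;
        for (std::size_t i = 0; i < points.size(); ++i) {
            const std::size_t c = nearestCentroid(points[i], centroids);
            if (c != labels[i]) {
                labels[i] = c;
                changed = true;
            }
        }

        // Update step: move each centroid to the mean of its points.
        std::vector<Point> sums(centroids.size(), Point(dim, 0.0));
        std::vector<std::size_t> counts(centroids.size(), 0);
        for (std::size_t i = 0; i < points.size(); ++i) {
            for (std::size_t j = 0; j < dim; ++j) {
                sums[labels[i]][j] += points[i][j];
            }
            ++counts[labels[i]];
        }
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            if (counts[c] == 0) {
                continue;  // empty cluster: keep the old centroid, no division by zero
            }
            for (std::size_t j = 0; j < dim; ++j) {
                centroids[c][j] = sums[c][j] / static_cast<double>(counts[c]);
            }
        }

        // Stop once an iteration after the first changes no assignment.
        if (!changed && iter > 0) {
            break;
        }
    }
    return labels;
}
```

Keeping the old centroid for an empty cluster is only one possible policy; re-seeding it from a random data point is another common choice.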

As for Excel, this is not straightforward; the best way is probably with the #import directive. Also, hunt for the COMEXCEL sample somewhere in the MSDN.
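A different route from the #import approach, offered only as an alternative: save the worksheet from Excel as a CSV file and read that as plain text. The sketch below assumes every data cell is a plain number with no quotes or embedded commas; the names parseCsvRow and readCsv are made up for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split one comma-separated line into numeric cells.
// Assumes plain numeric cells: no quoting, no commas inside a cell.
std::vector<double> parseCsvRow(const std::string& line) {
    std::vector<double> row;
    std::istringstream cells(line);
    std::string cell;
    while (std::getline(cells, cell, ',')) {
        // std::stod throws std::invalid_argument if the cell is not a number.
        row.push_back(std::stod(cell));
    }
    return row;
}

// Read a whole CSV stream into rows of doubles.
// Set skipHeader to true when the first line holds column names.
std::vector<std::vector<double>> readCsv(std::istream& in, bool skipHeader) {
    std::vector<std::vector<double>> rows;
    std::string line;
    if (skipHeader) {
        std::getline(in, line);  // discard the header line
    }
    while (std::getline(in, line)) {
        // Files saved on Windows end lines with "\r\n"; drop the stray '\r'.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (line.empty()) {
            continue;  // ignore blank lines
        }
        rows.push_back(parseCsvRow(line));
        // Debug-time check that every row has as many columns as the first.
        assert(rows.back().size() == rows.front().size());
    }
    return rows;
}
```

Passing a std::ifstream opened on the exported file gives one std::vector<double> per spreadsheet row, which can then be fed to the clustering code.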


Please send the C++ version of k-means to

Hello,

I wanted to know how one should implement the k-means algorithm in a decision tree. I know that a decision tree program finally outputs a tree with all the splitting attributes (calculated on the basis of gain and entropy), but in a normal decision tree program the output attribute does not need to be grouped. For example:

employee id   salary   income
1             1000     low
2             2000     high

Here, as you can see, the output attribute (income in this example) has the values high and low. But suppose we are given age as the output attribute, with many values, e.g. age = {2, 5, 6, 7, 8, 9}. How do we implement the decision tree then? Do we use k-means clustering to obtain the clusters and then make a decision tree out of them? Can it all be done in a single program?

I would prefer it if someone posted the code in Java, but other languages or tools are also accepted.

Thanks in advance,
Pankaj

