April 2008 - Posts

29 April 2008
Association rule for medical images

Has anyone worked on association rules for medical images such as mammography or MRI? I have an overall idea, but please give me some implementation details on how to get started as soon as possible.

mail to: samshesudan@gmail.com 

28 April 2008
Teradata Universe Beijing 2008 (I am presenting there)

I am presenting at the Teradata Universe Beijing this year.  I will be the first speaker for the telecommunications sessions (the other sessions being 'Banking' and 'National Accounts').

My talk will be a high level summary of the customer analytics we conduct, how we present to corporate executives, and the benefits our analysis has brought our customers and the company.

It's a single-day conference, and I'll be taking a few extra days for sightseeing in Beijing with my wife.

Feel free to say hello if you see me at the Teradata Universe Beijing 2008, May 30th at The Kerry Centre Hotel, Beijing.

Cheers

Tim

 

28 April 2008
Use C5.0 to study all attributes leading to outcome
Is there any way to relax constraints to get more rules towards a target outcome? I just want to derive the information gain statistics on individual attributes.
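
A minimal sketch of computing information gain per attribute outside the modelling tool, which may be enough if the statistic itself is what is needed. This is plain information gain on categorical attributes, not C5.0's own output or its internal split criterion; the column names and rows are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical data: "plan" separates the outcome perfectly, "region" not at all.
rows = [
    {"plan": "prepaid", "region": "north", "churn": "yes"},
    {"plan": "prepaid", "region": "south", "churn": "yes"},
    {"plan": "contract", "region": "north", "churn": "no"},
    {"plan": "contract", "region": "south", "churn": "no"},
]
gains = {a: information_gain(rows, a, "churn") for a in ("plan", "region")}
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:.3f}")
```

Ranking every attribute this way does not depend on which rules the tree happens to keep, so no constraints need to be relaxed to get the figures.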
25 April 2008
Association Rule

Hello... 

I'm working on a script for association rules on a transaction database using Java.

I am a newbie at Java programming, and I am having a lot of difficulty with it.

May I have some script documentation about association rules (the Apriori algorithm) in Java, please?

My email : naldy_s@yahoo.com

Thanks a lot.
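
For orientation, a minimal sketch of Apriori on a small transaction list. It is written in Python rather than Java to keep it short; the sets and dictionaries map fairly directly onto Java's HashSet and HashMap. The function names, the basket data and the thresholds are illustrative assumptions, and support is counted by a plain scan with no optimisation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for each rule meeting min_confidence."""
    found = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


# Hypothetical baskets; min_support is an absolute transaction count.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(baskets, min_support=2)
strong = rules(frequent, min_confidence=0.9)
for lhs, rhs, confidence in strong:
    print(sorted(lhs), "->", sorted(rhs), round(confidence, 2))
```

With these baskets the only rule that reaches 0.9 confidence is butter -> bread, since both baskets containing butter also contain bread.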
 

21 April 2008
Best of two outputs (with confidence values) Clementine

Hi all

I have two Clementine streams, one trained with back propagation and the other with a radial basis function network.

The output is a binary field for prediction. I get two confidence values, one from each stream. I am not sure whether I should just take the prediction from the stream with the higher confidence value, or whether I need to apply some equation to combine them. Please help...

I did read an earlier thread: http://www.kdkeys.net/forums/thread/7236.aspx

I don't know if I can use the same logic here. Please advise, and thanks in advance for your time.
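
One simple option, sketched here under a strong assumption: that each stream's confidence can be read as a rough probability of its own predicted class (values in 0..1). That may not hold for neural network confidences, so treat this as an illustration rather than a recommended formula. The function names and the "T"/"F" labels are hypothetical.

```python
def positive_probability(prediction, confidence, positive="T"):
    """Turn (predicted label, confidence in that label) into P(positive)."""
    return confidence if prediction == positive else 1.0 - confidence


def combine(pred_a, conf_a, pred_b, conf_b, positive="T", negative="F"):
    """Average the two models' P(positive) and report the winning label."""
    p = (positive_probability(pred_a, conf_a, positive)
         + positive_probability(pred_b, conf_b, positive)) / 2.0
    if p >= 0.5:
        return positive, p
    return negative, 1.0 - p


# The streams disagree: one says T at 0.60, the other says F at 0.90.
label, confidence = combine("T", 0.60, "F", 0.90)
print(label, round(confidence, 2))
```

With equal weights, a disagreement is always resolved in favour of the stream with the higher confidence, so the label matches the "take the higher confidence" rule. The difference is that the combined confidence comes out lower, which reflects the disagreement.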

20 April 2008
Need some codes

Hi,
I am a data mining student; we use the Weka tools for our practicals.
I need source code for some algorithms, implemented in any language (C/C++/VB).
If anyone has source code for the following algorithms (or links to where I can find it), please mail it to me at chirag.s1@gmail.com
Thanks.

1. Apriori (Large Item sets and Association Rules)
2. Agglomerative Algorithm
3. Decision Tree (ID3 / C4.5 / C5.0 anyone)
4. K-Means Clustering algorithm

20 April 2008
Need the codes for these

Hi,
I am a data mining student; we use the Weka tools for our practicals.
I want source code for some algorithms, implemented in any language (C/C++/VB).
If anyone has source code for the following algorithms, please mail it to me at chirag.s1@gmail.com
Thanks.

1. Apriori (Large Item sets and Association Rules)
2. Agglomerative Algorithm
3. Decision Tree (ID3/C4.5/C5.0 anyone)
4. K-Means Clustering

Thanks again

18 April 2008
Interview with Gregory Piatetsky-Shapiro

Hello,

For interested people, you can find an interview with Gregory Piatetsky-Shapiro, data mining expert and president of KDnuggets on Data Mining Research:

http://dataminingresearch.blogspot.com/2008/04/interview-with-gregory-piatetsky.html

 

15 April 2008
How to explain differences between CRT and C5.0?

Hi,

I'm using CRT and C5.0 to model decision trees. Though this works pretty well, I'm unable to understand why the output I get from CRT differs from the output I get from C5.0.

I read the manuals and did some Google research, but could not find any information on what actually causes the C5.0 algorithm to produce different output from CRT. Unfortunately C5.0 is closed source, so there is not much information about its internals.
 
Does anyone have an idea where to find out more?

Cheers,

Peter 

15 April 2008
Apriori algorithm implementation source code in C
Can anyone help me get source code for the Apriori algorithm for association rules, written in C? I would be very grateful for any kind of solution. Are there any download sites for this purpose? Please help!
14 April 2008
date format conversion

Hello.

I am working on a dataset that I received via RSS. The date field arrives as a string in the format "Tue Apr 15 03:50:00 IDT 2008".

How can I convert it to, for example, dd/mm/yy, stored as a real or date type?

Thanks

selash
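
A minimal sketch of the conversion in Python, assuming every value follows the layout in the post (weekday, month, day, time, zone, year) with English month abbreviations. The zone token ("IDT") is simply dropped, so the result is the local wall-clock time as written; `parse_rss_date` is a hypothetical helper name.

```python
from datetime import datetime

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def parse_rss_date(text):
    """Parse 'Tue Apr 15 03:50:00 IDT 2008' into a naive datetime."""
    # The weekday and zone tokens are read but not used.
    _weekday, month, day, clock, _zone, year = text.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


stamp = parse_rss_date("Tue Apr 15 03:50:00 IDT 2008")
formatted = stamp.strftime("%d/%m/%y")
print(formatted)  # 15/04/08
```

Splitting the string by hand avoids relying on the parser recognising the zone abbreviation, which varies by platform.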

14 April 2008
Working with model node outputs

I have an SPSS source file supplying a K-Means model and a Binomial Logistic Regression model in Clementine 12.  The BLR model is looking at a mere 1298 cases.  I'd like to extract the probabilities from that model so I can score my main data file, but when I go to execute the spss export file node that's connected to my generated model node via a Type node, it crashes the Clementine Local Server.

Likewise, with the cluster ID variable from the K-Means model, I'm having a heck of a time producing an output file that contains this information.  I looked at the help files and saw the information about connecting an Analysis or Table node and then generating a Derive node, but the Derive node doesn't seem to select the cluster variable.  I've been surprised by how difficult it's been to dig up clear information on what seems like it would be a pretty common practice (producing a model, then applying the output of that model to existing data).  Do I need to include an SPSS output node somewhere and write syntax to get what I want?

Any thoughts are appreciated. 

10 April 2008
Asynchronous periodic pattern

Hi,

I am looking for the source code of an asynchronous periodic pattern miner for time series data. If anyone has this, please share. Many thanks. Here is my email: sajib_ca@hotmail.com

Sajib

03 April 2008
Detect random variables and dependency between variables

Hi all, 

I'm not from the data mining field. However, my research has led me to a difficult problem that seems to be related to data mining. The problem is:

I have a set of variables (for example, variables A, B, C, D and E).
- Some variables (for example A, B and D) are random variables. Those variables can take any value.
- The other variables are dependent on those random variables (for example: C = A+B, E = A*B+D).

Now suppose I am given a set of variables A, B, C, D, E and a dataset for those variables. For example:
A  B  C  D  E
=============
1  5  6  7  12
2  3  5  3  9
......

I have no information about which variables are random, which variables depend on the others, or the form of the dependency. All I have is a set of variables and a dataset.

Now, from the dataset, I want to find out which variables are random variables and which are not.

How do I do that? Which algorithm should I use?
Could you please give me some ideas, or just some keywords that I can search for on Google?

Thanks in advance

Ball Agile.
 

02 April 2008
Kevin Hillstrom on Multichannel Forensics

Hello,

For interested people, Kevin Hillstrom describes Multichannel Forensics on Data Mining Research:

http://dataminingresearch.blogspot.com/2008/04/guest-blogger-on-dmr-kevin-hillstrom.html

Regards.

Sandro.