Ask A Data Miner - 75,000+ Members

Hi, I'm trying to use Cox regression. I have 30 different cases, each with a failure event at the end. Question: is it reasonable to put all cases in ONE training set, or should I use 30 training sets to train my model? Thanks a lot for any hint. Werner ...

: Data Mining    : werner   

I have Clementine using an ODBC connection to a Postgres server on a machine within the same IP range (for example, the server is at 123.45.67.100 and my remote computer sits at 123.45.67.24). However, when I select a table through the database source node, the only field that shows up is "RowsAffected" with a value of -1. All I did was: 1. Add the node. 2. Set the database to my database. 3. Select my table out of the database. But nothing comes up... anyone ...

: Data Mining    : feshmania    :2 replies

I have a field in my database which is 9 digits long, but with leading zeroes, so it is stored as a string. When I read this data into SPSS, it comes into the Type node as Typeless. However, this means that I cannot give it an In or Out direction, so it cannot be used in a CART node. Does anyone know what I can do? ...

: Data Mining    : feshmania    :3 replies

I was wondering if anyone had a good resource for the text mining aspects of Clementine. I've fiddled around with the interactive feature but I'm not really sure what to do with it after this... maybe somebody has a tutorial on it somewhere? I Googled it, but to no avail... ...

: Data Mining    : feshmania    :1 reply

I am studying for a Bachelor's degree. I am now working on my thesis, whose topic is Data Mining Intrusion Detection. I have collected raw data which contain "Time, Source-Destination IP, Source-Destination MAC Address, Type of Network and Transport Protocol, Length, Source-Destination Port". However, I have some problems with how to start analysing the patterns in Clementine. - How can I start the data mining process? - Which modeling should I use? - What kind of raw data ...

: Data Mining    : gobbie    :1 reply

Hi, I'm trying to derive a new field which is simply the last character of the string value held in a different field. I have tried last(STRING) and endstring(N, STRING) but it only works for strings which are 9 characters long or greater... The field which holds the string is typeless, but Clem acknowledges its storage is of the STRING type. Examples: field1 => 1465443LA should give 'A' (works) ...

: Data Mining    : DaveBassant    :4 replies
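A minimal sketch of the logic being asked for, written in Python rather than CLEM. The function name `last_char` is hypothetical, and stripping trailing spaces is only an assumption about why short values might behave differently; the truncated post does not confirm the cause.

```python
def last_char(value: str) -> str:
    """Return the final character of value, or '' if nothing is left."""
    # Assumption: trailing padding spaces should not count as the last character.
    text = value.rstrip()
    # Slicing with [-1:] is safe on an empty string, unlike text[-1].
    return text[-1:]


print(last_char("1465443LA"))
```

Slicing instead of indexing means empty or all-blank values return an empty string rather than raising an error.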

Hi, I would like to know if there is a native function in Clementine to encrypt (e-mail addresses, in my case) as an MD5 key? I need to merge e-mail databases, and one is encrypted with an MD5 key. I'm pretty sure I have seen something like that, but I can't remember where. Any information about that would be great. Thx, David ...

: Data Mining    : degahoz   
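For reference, a sketch of what an MD5 key for an e-mail address looks like when computed outside Clementine, using Python's standard `hashlib`. The function name `md5_hex` is hypothetical, and trimming and lower-casing the address before hashing is an assumption: the other database may have hashed the raw value, in which case the keys will only match without that step.

```python
import hashlib


def md5_hex(email: str) -> str:
    """Return the 32-character hexadecimal MD5 digest of an e-mail address."""
    # Assumption: both sources normalise addresses the same way before hashing.
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


print(md5_hex("someone@example.com"))
```

Hashing the plain-text side once and then merging on the digest avoids any need to reverse the hashed side.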

Hi everyone, I want to read my Persian data with Clementine 11.1. - When I select the Excel node as my source node, it can't read more than about 250 rows. - When I select the Database node as my source node and connect to MySQL with ODBC, it can't read data with type 'varchar' and 'te ...

: Data Mining    : B-Data miner   

Hello. Has anyone experienced TLA execution failures with large files? Error: Error: Text Link Analysis Error: Exception: SessionManager: Cannot remove temporary directory - Memory : 628899kb - Memory peak : 663449kb. Error: Text Link Analysis Execution Error. Information: Stream execution complete, Elapsed=1315.48 sec, CPU=1313.19 sec. Warning: Execution was interrupted. What I am seeing are massive "matching text" cells within records. Anyone know how to prev ...

: Data Mining    : G B   

Hi, I am working on intrusion detection in computer networks. I need some help with how to start analysing the patterns in Clementine using the attack data I have. It's in the format time, src ip, dst ip, src port, dst port, protocol, flags... I want to know how to start the data mining process. Cheers ...

: Data Mining    : pari2682    :2 replies

Hi! I am new to Clementine, so I have two questions for more experienced people. 1. I can sometimes output JPG files with distributions and so on. If I output 3 or 4 like this, how can I combine them into one HTML file, other than writing an HTML file that presents these JPGs myself? 2. How can I get a scatter matrix in Clementine? Thanks a lot ...

: Data Mining    : matatovna    :2 replies

Hi! I got a list of 10,000 customers and was asked to predict who will buy the company's new product. How can I attack this business problem? Do I need to use clustering? ...

: Data Mining    : matatovna    :5 replies

I want to perform custom binning. I have 9 predefined bins and I want to convert them to 9 nominal values with names. How can I write a CASE/SWITCH statement in a Derive node? Or a nested if? ...

: Data Mining    : matatovna    :1 reply
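A sketch of the CASE/SWITCH idea in Python rather than CLEM: 8 cut points define 9 bins, and a binary search picks the bin name. The cut points in `EDGES`, the labels in `NAMES`, and the function `bin_name` are all hypothetical placeholders, since the post does not give the actual bins.

```python
from bisect import bisect_right

# Hypothetical cut points: 8 edges split the number line into 9 bins.
EDGES = [10, 20, 30, 40, 50, 60, 70, 80]
# Hypothetical nominal names, one per bin, in ascending order.
NAMES = ["bin1", "bin2", "bin3", "bin4", "bin5", "bin6", "bin7", "bin8", "bin9"]


def bin_name(x: float) -> str:
    """Map a numeric value to the name of the bin it falls into."""
    # bisect_right counts how many edges are <= x, which is the bin index.
    return NAMES[bisect_right(EDGES, x)]


print(bin_name(5), bin_name(10), bin_name(95))
```

A value exactly equal to an edge goes to the upper bin here; a nested if would need the same decision made explicitly at every boundary.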

Hi, is it possible to script a single export to a .csv file of each set or ordered set field crossed by a field named "target" (a set field) with the Matrix node? If that's not possible, could they be exported to different .csv files? Could you give me an example, or the solution ;-) Thank you very much, Hugo ...

: Data Mining    : hugo59    :2 replies

Hi, can you instruct me: using scripts, how do I select and generate one or more model nuggets in the Binary Classifier Results output that is generated by the automated Binary Classifier model in SPSS Clementine v11.1? Looking forward to hearing from you. Thanks! ...

: Data Mining    : jiangjixiu    :4 replies

Hi, when running a stream that contains CHAID and Logistic nodes in batch mode, no model was created for CHAID, so the execution was interrupted and the remaining actions failed to run. ======== Clem_batch log file shows as follows: ========== Information: Stream execution started. E3138: Stopping rules prevent any tree growth. Error: Model building completed but no model was created. Information: Stream execution complete, ...

: Data Mining    : jiangjixiu    :2 replies

Evening. I have a Clementine version without CATs, and I want to try CATs. So, where can I get CATs? Thank you! ...

: Data Mining    : baoxi   

Hi, I'm running a loop with a script. The number of iterations and the values used in the loop are determined by an output table. For example:
####################################################
~~~~~
set TABLE.output_mode=Screen
execute TABLE
var count_c
set count_c=TABLE.output.row_count
for c from 1 to ^count_c
var value_c
set value_c=value TABLE.output at ^c 2
create selectnode at ^c*100 100
set selectnode.custom_name = ^value_c
~~~~~~~
endfor
################################################# ...

: Data Mining    : ajsooajs    :1 reply

Hi, does anyone know of a quick way to calculate a model's Gini index in Clementine (v10) to evaluate the discriminatory power of the generated model? Many thanks, Ronan ...

: Data Mining    : brennanro   
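One common definition, assumed here, is Gini = 2 × AUC − 1, where AUC is the probability that a randomly chosen positive case is scored above a randomly chosen negative case. The sketch below computes it in plain Python from exported scores; the function name `gini_index` and the 1/0 label coding are assumptions, and the pairwise loop is only practical for modest sample sizes.

```python
def gini_index(labels, scores):
    """Gini coefficient of a scoring model, computed as 2 * AUC - 1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1


print(gini_index([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

A value of 1 means perfect separation and 0 means the scores rank cases no better than chance.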

Does the K-Means modelling node automatically standardize clustering inputs, or should pre-standardization be carried out? The TwoStep node appears to have a standardize checkbox option. Thanks, R ...

: Data Mining    : brennanro    :6 replies

Hi, while performing memory-intensive operations in Clementine v10 (e.g. merging millions of records from multiple sources or displaying a 1000-record table from a large data file), an error occurs, with the following error message appearing: "X4001". Unusually for Clementine, there is no further information about the error, although it seems to be memory related. The log file entry for the error is as follows: 2008/06/24 15:39:17 [2716-2716]: X4001: XMemoryI t ...

: Data Mining    : brennanro    :2 replies

Hi, I've been asked by my DBA to apply a lock mode setting statement to my queries on an Informix database (set lock mode to wait 20;) in order to handle possible update errors. Could someone advise on how this type of statement could be applied in Clementine? I've tried the most obvious solution of placing the statement before the query, i.e. set lock mode to wait 20; SELECT * FROM TABLE_NAME; However, this generates the error "Cannot use a select or any ...

: Data Mining    : brennanro    :1 reply

Hi, could someone explain why, after building a C&RT model (Clementine 12) on a dataset with hundreds of potential attributes, the generated model appears to be based on a certain number of attributes, but a greater number of attributes is listed as Inputs within the Summary tab of the tree's model nugget? For example, viewing a generated model built on 120 potential inputs, 4 attributes are used in creating the classification rules. However, in the Inputs section of th ...

: Data Mining    : brennanro    :4 replies

I am using an Ensemble node to combine two decision tree models, with the adjusted-propensity weighted voting method and adjusted propensity if voting is tied. For customers 1, 2 & 3 respectively, Decision Tree Model 1 predicts CHURN, CHURN and NON_CHURN, calculating raw propensity scores ($RRP-CHURN) of 0.98, 0.67, 0.10 and adjusted propensity scores ($RAP-CHURN) of 0.33, 0.04 and 0.01. Decision Tree Model 2, applied to the same customers, follows the same prediction categories (CH ...

: Data Mining    : brennanro    :2 replies

Happy New Year to you all! I have a question on the calculation of raw propensity scores for a tree model (in this case a CHAID tree for predicting customer churn). I've built a tree on a random sample of 60% of the population, and am analysing the distribution of rule assignments on the 40% hold-out sample. Looking at a given terminal node where the confidence level was 0.645, the calculated churn raw propensity score ($RRP-CHURN) = 0.355 for most customer ...

: Data Mining    : brennanro    :1 reply

Not 100% sure this is the best forum to post the question, but it does relate to a Clementine analysis project. I don't have a great recollection of the basic theory behind Factor Analysis/PCA, but I've always regarded it as a data reduction technique (which I haven't had the opportunity/need to use before) useful for identifying a smaller number of underlying factors in a large dataset. Anyway - in advance of an upcoming segmentation project, a co ...

: Data Mining    : brennanro    :2 replies

Hi, I am so relieved to see such a strong and knowledgeable community of data miners! I am a beginner-level data miner working in India. I am not able to interpret the use of the CARMA algorithm. If anyone can post relevant case studies with an explanation of the data set, it would be of great help. Also, please help me find some good case studies for cluster analysis. If possible, post the stream :) Thanks in advance. ...

: Data Mining    : kumarchitran   

How can we perform spatial data mining in Clementine 11.1? Also, please help me understand the use of dimension files. How many dimensions can Clementine support? Thank you! ...

: Data Mining    : kumarchitran    :1 reply

Hi, once we create a model, how do we deploy it? We have not purchased the SPSS Predictive Enterprise solution. What are the other ways to deploy the model in the data warehouse? ...

: Data Mining    : kumarchitran    :4 replies

Hello, I want to ask if you know how we can try the SVM algorithm with Clementine. If you have a tutorial which explains the method, I would be grateful if you could send it to me. Thanks a lot. ...

: Support Vector Machine    : Rosa    :10 replies

Hello everybody, I am a student from Germany, working on my bachelor thesis about a multinomial logistic regression issue. I'm using Clementine 11.1 and I hope someone can answer my questions. OK, my response variable falls into 5 categories/values. These values are ordered. Because I have to mention some theoretical aspects in my thesis as well, I have to know whether the Logistic node is able to deal with this information or not. So far I haven't seen an opt ...

: Data Analysis and Statistics    : jay_r    :2 replies

Hello friends! I am using Clementine to try to predict substantial increases in time series data (stock prices). I used the Partition node to break up my data into two parts and generate a C&R decision tree. The analysis of the resultant model shows that the model works with 84 percent accuracy on both the testing and the training dataset. I was naturally surprised by this outcome because I didn’t spend much time on selecting the predictors and was expecting a far ...

: Data Mining    : dima777    :2 replies

Hi! I am having difficulty generating a time series forecast for a data series. Even though the model generates the nugget OK, it says some of the fields have not been specified, while in truth they have. I checked all possible settings and still cannot figure out why this happens. I attached the zipped version of the stream and source file. Hope you can help me ...

: Data Mining    : dima777   

Hi, I am using SPSS Clementine 11.1. For example, I have 5 fields: field1, field2, field3, field4, field5. I want to replace the values in all the fields based on a condition, i.e. if the value of a field is greater than 3600, then that value should be replaced by field/2. For example, if field1 = 7200 then the value should be replaced by 7200/2, i.e. 3600. I am using the Filler node, and the condition given under the condition option is (@FIELD > 3600) ...

: Data Mining    : manjumc    :2 replies
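The rule described in the post, sketched in Python rather than as a Filler node expression: every field value above 3600 is replaced by half of itself, and all other values pass through. The names `halve_if_over` and `record` are hypothetical.

```python
def halve_if_over(value, threshold=3600):
    """Return value / 2 when value exceeds threshold, else value unchanged."""
    return value / 2 if value > threshold else value


# Hypothetical record with the five fields from the post.
record = {"field1": 7200, "field2": 1200, "field3": 3600, "field4": 5000, "field5": 0}
cleaned = {name: halve_if_over(v) for name, v in record.items()}
print(cleaned)
```

Note that the comparison is strict, so a value of exactly 3600 is left as it is.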

Hi, can anyone help with the calculation below? I am using Clementine 11.1. The following calculation is to be done for Cum Cap (results shown). For example, for the 2nd record the Cum Cap value of 7.5 is obtained by adding the 2nd record of Cap to the first record of Cum Cap, i.e. 3.75 + 3.75. When ID changes to a new value, a fresh summation should be started.

Sl No   ID   Cap    Cum Cap
1       1    3.75   3.75
2       1    3.75   7.5    "=3.75+3.75"
3       1    4      11.5   "=4+7.5"
4       1    5      16.5
...

: Data Mining    : manjumc    :2 replies
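The running total described above, sketched in Python rather than CLEM. It assumes the records arrive already sorted so that rows with the same ID are adjacent; the function name `cumulative_by_id` and the second ID in the example are hypothetical.

```python
def cumulative_by_id(rows):
    """rows: iterable of (id, cap) pairs, grouped by id. Returns the Cum Cap column."""
    result = []
    previous_id = object()  # sentinel that never equals a real ID
    running = 0.0
    for record_id, cap in rows:
        if record_id != previous_id:
            # A new ID starts a fresh summation.
            running = 0.0
            previous_id = record_id
        running += cap
        result.append(running)
    return result


print(cumulative_by_id([(1, 3.75), (1, 3.75), (1, 4), (1, 5), (2, 2)]))
```

The same two pieces of state, the previous ID and the running sum, are what any record-by-record implementation would need to carry forward.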

Hi, I am using Clementine 11.1. The problem is filling in values from the previous records.

X     by filler node   result
200   200              200
0     200              200
0     0                200
0     0                200
0     0                200
300   300              300
250   250              250
0     250              250
0     0                250
0     0                250
0     0                250
0     0                250

The original X field contains 0 values which have to be replaced by the previous filled value. If I use the Filler node, only one record gets fi ...

: Data Mining    : manjumc    :4 replies
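The desired "result" column can be sketched in Python rather than CLEM as a forward fill: each 0 takes the most recent non-zero value seen so far. The function name `fill_zeros_forward` is hypothetical, and leading zeros (with no earlier value to copy) are left as 0 by assumption.

```python
def fill_zeros_forward(values):
    """Replace each 0 with the most recent non-zero value before it."""
    filled = []
    last_seen = 0  # assumption: zeros before any real value stay 0
    for v in values:
        if v != 0:
            last_seen = v
        filled.append(last_seen)
    return filled


x = [200, 0, 0, 0, 0, 300, 250, 0, 0, 0, 0, 0]
print(fill_zeros_forward(x))
```

The key difference from looking only one record back is that the remembered value is the last filled value, not the raw previous X.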

Hi, I am running a for loop in which I output a 1 by k vector observation at each iteration. I want to create an array of all the observations. At the moment, I am only able to either create a separate table for each observation, or a single table that has only the observation from the final iteration. The pseudocode for what I want to do is something like this:
create empty array
for i=1 to n
   select a record from set
   store record as the ith entry in array
endfor
export ...

: Data Mining    : AES    :2 replies

Is there any way to relax constraints to get more rules towards a target outcome? I just want to derive the information gain statistics on individual attributes. ...


: Data Mining    : jas4710    :3 replies

Neural networks and SVMs sometimes work better than other models. However, we just don't know how to use the generated models to help us make decisions, e.g. If Attr1 > 3 AND Attr3 = "Yes" then category = 1. Are NN and SVM black boxes, so that we have to feed every new instance into a computer and let the models decide for us? Do we have no choice but to resort to, say, a decision tree to understand what has been learnt from the dataset? ...

: Data Mining    : jas4710    :4 replies

Hi all, I have two Clementine streams, one trained with back-propagation and another with radial basis. The output is a binary field for prediction. I got two confidence values, one for each stream. I am not sure whether I should just take the prediction from the stream with the higher confidence value or whether I need to apply an equation (if any). Please help... I did read an earlier thread http://www.kdkeys.net/best-of-two-outputs-with-confidence-values-clementine/#link-53 I don' ...

: Data Mining    : leevikraman    :1 reply

Hi, I'm using CRT and C5.0 to model decision trees. Though this works pretty well, I am unable to comprehend the different output I get with CRT and C5.0. I read the manuals and did some Google research but could not find any information on what actually causes the C5.0 algorithm to produce different output compared to CRT. Unfortunately C5.0 is closed source, so there is not much information about its internals. Does anyone have an idea where to find out more? Cheers ...

: Data Mining    : peter.neu    :2 replies

Hi, I was wondering if there are any options to export a stream I put together in Clementine to SQL. I saw that it is possible to export a model to SQL, but that's not really what I'm looking for. I really want to have the process of selecting meaningful data put into SQL. Cheers, Pete ...

: Data Mining    : peter.neu    :4 replies

Hello. I am working on a dataset that I received via RSS. The format of the date field is "Tue Apr 15 03:50:00 IDT 2008", as a string. How can I convert the date format to dd/mm/yy, for example, as a real or date format? Thanks, selash ...

: Data Mining    : selash    :4 replies
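A sketch of the conversion in Python rather than CLEM. It splits the string into its six space-separated parts and ignores the weekday, time and time-zone abbreviation, which is an assumption: if the time zone matters for the date, it would need separate handling. The names `MONTHS` and `to_ddmmyy` are hypothetical.

```python
from datetime import date

# English month abbreviations as they appear in the feed.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def to_ddmmyy(stamp: str) -> str:
    """Convert e.g. 'Tue Apr 15 03:50:00 IDT 2008' to 'dd/mm/yy'."""
    # Expected layout: weekday, month, day, time, zone, year.
    _weekday, month, day, _time, _zone, year = stamp.split()
    return date(int(year), MONTHS[month], int(day)).strftime("%d/%m/%y")


print(to_ddmmyy("Tue Apr 15 03:50:00 IDT 2008"))
```

Picking the parts out by position sidesteps the zone abbreviation, which many date parsers do not recognise.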

I am working with PASW 13. I need to split a data set (of 13,000 posts). The Partition node does not allow the data set to be partitioned according to the cross-validation methodology (in this procedure you need to split the data set into 5 to 10 parts). Does anybody have an idea how we can do this? Thanks, Dan. ...

: Data Mining    : selash   
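One way to think about the split, sketched in Python rather than as a PASW stream: give every record a fold number from 0 to k-1 after a random shuffle, then train on k-1 folds and test on the remaining one, rotating k times. The function name `assign_folds` and the fixed seed are assumptions made for the illustration.

```python
import random


def assign_folds(n_rows: int, k: int, seed: int = 0):
    """Return a list of fold numbers (0..k-1), one per row, in near-equal sizes."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)  # fixed seed keeps the split repeatable
    folds = [0] * n_rows
    for position, row in enumerate(order):
        # Dealing rows out round-robin keeps fold sizes within one of each other.
        folds[row] = position % k
    return folds


folds = assign_folds(13000, 10)
print([folds.count(f) for f in range(10)])
```

Inside a stream, the same idea amounts to deriving a fold field and then selecting on it once per iteration.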

The Analysis node calculates the accuracy (correct hits) of a given classification model. How can I obtain the recall and precision from the matrix? Thanks, Dan ...

: Data Mining    : selash    :2 replies
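For a binary target, both measures follow from three cells of the coincidence (confusion) matrix: precision = TP / (TP + FP) and recall = TP / (TP + FN). A small Python sketch, with hypothetical counts and a hypothetical function name `precision_recall`:

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Compute (precision, recall) from confusion-matrix counts."""
    # Guard against empty denominators, e.g. a model that never predicts positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives.
precision, recall = precision_recall(tp=40, fp=10, fn=20)
print(precision, recall)
```

For a multi-class target, the same formulas apply per class by treating that class as positive and all others as negative.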

Hi, I have a table that has many fields (attributes). There is a particular field that has unique values throughout, and I would like to create a list in a standalone script that stores all the unique values. However, I'm not sure how to go about accessing the values of a particular field and storing them in a list. These values will be used for a Select node later. Hope someone can help me. Thank you. Best regards, Jeff ...

: Data Mining    : Jefflink    :3 replies

Hi, I'm managing a project in Clementine which has multiple streams. Each stream runs its own script. However, there are some variables that are global (e.g. the root directory to read files from) that should be shared among all the scripts. This can easily be done by setting a variable in the session parameters within the interface itself. However, is there a way to set it within a standalone script such that all the other scripts can reference it without the n ...

: Data Mining    : Jefflink   

Hello everybody, I'm a newbie in Clementine and have a small difficulty to solve, so I am putting it to you ;-) I'm writing a script in a stream to create a set Derive node. There is no problem with the beginning, only with the condition. It's something like this:
set derivenode.set_default = 0
set k = 1
set LIST = [12 34 56]
for Capa in LIST
   set derivenode.set_value_cond.1 = "Kant = ">< ^k " and Nb < ">< ^Capa ...

: Data Mining    : Davideik    :2 replies

Greetings: I have an 'Event_Date' field in the data and I need to create a new date field where 'New_Date' = 'Event_Date' - 60 days. In other words, I need to derive a new variable that is 60 days prior to "Event Date". I have been struggling with this for about 4 hours, reading the manual and trying different formulas, and I keep hitting a wall. If you could lend a hand I would be very grateful to you. Cheers: Patrick ...

: Data Mining    : Patrick1111    :2 replies
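The arithmetic itself, sketched in Python rather than CLEM, is a subtraction of a 60-day interval from a true date value (not from its string form). The function name `days_before` is hypothetical.

```python
from datetime import date, timedelta


def days_before(event_date: date, days: int = 60) -> date:
    """Return the calendar date that falls `days` days before event_date."""
    return event_date - timedelta(days=days)


print(days_before(date(2008, 3, 1)))
```

Working on a real date type lets month lengths and leap years be handled automatically, which is usually where hand-built formulas go wrong.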

I would like to use a Derive node to find the space in a field. Please could anyone help with what expression I need to use? The formula in Microsoft Access is InStr([postcode]," ") and in Excel it is =SEARCH(" ",A1,1). Any help would be gratefully received. Kind regards, Carly ...

: Data Mining    : Carly    :2 replies
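For comparison, the equivalent lookup sketched in Python rather than CLEM. The function name `space_position` is hypothetical; it returns a 1-based position to match the Access and Excel formulas quoted in the post, and 0 when there is no space.

```python
def space_position(text: str) -> int:
    """1-based index of the first space in text, or 0 if there is none."""
    # str.find is 0-based and returns -1 when nothing is found,
    # so adding 1 gives a 1-based position and 0 for "not found".
    return text.find(" ") + 1


print(space_position("AB1 2CD"))
```

Whatever expression is used in the Derive node, it is worth checking whether its result is 0-based or 1-based before using it to split the postcode.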

I have a field which contains a list of surnames and I would like to check whether the first letter of this field is in lower case. I have tried islowercode(subscrs(1, 'Surname')) but this returns an error: Invalid Type for Field : BOOLEAN. Is there any other expression that I can use? I have also tried deriving a new field that just contains the first letter of the field Surname and using islowercode(Derive1), but this also doesn't work. I kno ...

: Data Mining    : Carly    :2 replies
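The check being attempted, sketched in Python rather than CLEM: take the first character and test whether it is a lower-case letter. The function name `starts_lowercase` is hypothetical.

```python
def starts_lowercase(surname: str) -> bool:
    """True when the first character of surname is a lower-case letter."""
    # Slicing with [:1] avoids an error on empty strings, which yield False.
    return surname[:1].islower()


print(starts_lowercase("smith"), starts_lowercase("Smith"))
```

The result is a true/false flag, so the derived field receiving it needs to be a flag rather than a string or number.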

I've connected a SetToFlag node to a source node (SPSS data file), but when I try to execute, I get an error saying that "there are no executable nodes."  Do I need to do something to my source node to make it executable, or do I need a downstream output node from the SetToFlag node?Thanks!  ...


: Data Mining    : nicheplayer    :1 reply

I have an SPSS source file supplying a K-Means model and a Binomial Logistic Regression model in Clementine 12. The BLR model is looking at a mere 1298 cases. I'd like to extract the probabilities from that model so I can score my main data file, but when I go to execute the SPSS Export node that's connected to my generated model node via a Type node, it crashes the Clementine Local Server. Likewise, with the cluster ID variable from the K-Means mode ...

: Data Mining    : nicheplayer    :2 replies

Hi to all! I'm working on an academic thesis about churn analysis in the retail banking business. The goal is to compare the performance of 3 different forecasting models (logistic regression, classification trees and ANN). My dataset contains 112454 cases, but only 2106 of them are churners. How do you handle this issue? I really don't know how to build a good training set and validation set from this data... oversampling? downsizing? At what rate? I'm using SPSS 16, but I ...

: Data Mining    : Jaqen    :1 reply
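As one illustration of the "downsizing" option, here is a Python sketch of random undersampling: keep all churners and draw a random subset of non-churners at a chosen ratio. The function name `undersample`, the ratio of 3 non-churners per churner, and the fixed seed are all assumptions for the example, not a recommended rate.

```python
import random


def undersample(majority, minority, ratio=1.0, seed=0):
    """Keep every minority case plus about ratio * len(minority) majority cases."""
    keep = min(len(majority), int(round(len(minority) * ratio)))
    rng = random.Random(seed)  # fixed seed keeps the sample repeatable
    return rng.sample(majority, keep) + list(minority)


# Small hypothetical data set: 1000 non-churner IDs and 20 churner IDs.
non_churners = list(range(1000))
churners = list(range(1000, 1020))
training = undersample(non_churners, churners, ratio=3)
print(len(training))
```

Balancing should be applied to the training data only, so that the validation set still reflects the real churn rate.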

Hi, in my stream I have multiple database source nodes that use the same data source. I want to set the user and password so that I can execute the stream in batch mode. I know that I can use set Databasesourcenode1:databasenode.username = "myname" and set Databasesourcenode1:databasenode.password = "mypwd", but this means I have to do it for every node. Is it possible to set the username and password for the data source instead of each node? Thanks in ad ...

: Data Mining    : avtegeia    :2 replies

Could some of you share with me how you typically utilize the results of data mining?  Do you export the model to SQL and place the query in some kind of CRM?  Do you extract the discovered patterns and put them into a report for executives? ...


: Data Mining    : rdi    :2 replies

Which ones are more effective in a decision tree? ...


: Data Mining    : rdi    :1 reply

A value called $CC is assigned to each prediction that is made by a C5.0 model.  Can someone help me understand exactly what this value means and how it is calculated? ...


: Data Mining    : rdi    :3 replies

In Clementine, what is the best way to determine the optimal class distribution? ...


: Data Mining    : rdi    :1 reply

Hello, I'm new to Clementine and I'm trying to run my model against my Test set so that I can see how accurate it is, how the lift looks, etc. I'm trying to predict a binary outcome. I'm bringing my data in from Excel and running it through the following nodes in this order: Type, Balance, Partition, Neural Net. The Partition node splits the dataset into 75% train and 25% test. I'd like to test my model against the Test set and g ...

: Data Mining    : rdi    :5 replies

Hello, I have some concerns about the Relative Importance of Inputs list that is given in a generated neural net model.  I generated a neural net model to predict a binary outcome (churn/no churn).  According to the Relative Importance of Inputs list, "age" is a very important attribute.  However, in my training set, the average age of people who churn is almost exactly the same as the average age of people who do not churn.  I ran into the ...


: Data Mining    : rdi    :3 replies

Please see the attached lift chart.  It is very strange... it starts below 2, rises sharply, and then curves downwards along the BEST line.  Can anyone provide some input on what that might mean? ...


: Data Mining    : rdi    :1 reply

I'd like to write a script in Clementine that would build and test a model using every possible combination of options in the C5.0 algorithm.  That would allow me to look over the results and select the most appropriate set of options. Is that possible? ...


: Data Mining    : rdi    :1 reply

Do you guys do your preprocessing and dataset building outside of Clementine or inside? Right now, I'm writing all of my SQL outside of Clementine and then importing the completed dataset into Clementine. ...

: Data Mining    : rdi    :1 reply

How would you recommend explaining a lift chart to a non-data miner? ...


: Data Mining    : rdi    :3 replies

Can you implement nearest-neighbor classification in Clementine?  Has anyone done it? ...


: Data Mining    : rdi    :3 replies

When you use the prune/exhaustive prune neural nets, it forces you to create a train/test division. Is there any way to force the algorithm to use MY train/test partitions? If it's using its own partition, there is no way for me to get an accurate test of the data ...

: Data Mining    : rdi    :2 replies

Hello, I have come across the mentioned SPSS training book, but I do not have the CD with the datasets, so the book is useless at this point. Can anyone please upload the datasets? Thanks a lot. ...

: Data Mining    : EMoscosoCam   

Hello, I know that Clementine can execute the mining algorithms of SQL Server or Oracle, but what would be the motivation for doing so? I mean, if you already have a tool for data mining, why would you use the algorithms of another tool? Thanks a lot. ...

: Data Mining    : EMoscosoCam    :4 replies

Hello. Considering that v16 includes some algorithms that are found in SPSS Clementine, is SPSS suitable for data mining? When would you consider using Clementine instead of SPSS for data mining? That is, what are the limitations of SPSS regarding data mining? Thanks a lot. ...

: Data Mining    : EMoscosoCam    :1 reply

Hello, besides the tutorial that comes with the product, where on the internet can I find resources for learning Clementine in a self-paced way? Thanks a lot. ...

: Data Mining    : EMoscosoCam    :5 replies

Hello all, I am new to Clementine scripting and I could not find any examples of a task similar to mine. So, if anyone knows how to send dynamic parameters into a Balance node, please help me. Thanks! What I need to do is change the multiplicative parameter based on a condition, like this: set :balancenode.directives=[ {1 "ncounts = 1 " } {2 "ncounts = 2 " } {3 "ncounts = 3 " }  {4 "ncounts = 4 " } {5 "ncounts = 5 " } { ...

: Data Mining    : natalie    :2 replies

Is there a date format which Clementine can detect automatically in the Var. File input node, so that it is not necessary to define the data type manually? I have been using YYYY-MM-DD but it is always detected as a string. Regards, Ken Aston ...

: Data Mining    : kenasto    :2 replies

Hello everyone, is there any way I can call an external program (batch file, exe file, etc.) within a stream? Sometimes I would like to start a process from within a stream. Ideally there would also be a way to tell Clementine to wait until the external process has finished, returned some parameters, etc. before continuing to execute the stream. Regards, Ken ...

: Data Mining    : kenasto    :2 replies

Whenever I try to use specified sort order function in a Merge node (Optimization tab), I get an error message like this one: "Specified previous sort order invalidated at record: (8)"Both tables are definitely in the specified order because I had just sorted them. Is this a bug?  ...


: Data Mining    : kenasto    :3 replies

Is it possible to delete text files within a stream? The stream produces large temporary output files. I would like them to be automatically deleted during the stream execution. I checked the scripting manual but couldn't find a command for it. Regards, Ken Aston ...

: Data Mining    : kenasto    :3 replies

Most of the time I use text files for input and output. But when the stream becomes long, I would actually prefer to store all the data in a database using ODBC. This involves reading from and writing back to the database many times during stream execution (with multiple streams after each other). I tested it with a dataset of 100,000 records, each with 10 fields, and both reading and writing are faster when I use simple text files instead of a database. I just tested it with the slow ...

: Data Mining    : kenasto    :2 replies

After doing some tests, it seems like Clementine's database access is extremely slow. Has anyone experienced the same? Here is the test I conducted. Stream setup: import node -> filter node -> select node -> screen output. The select node selects about 300,000 of the 20,000,000 rows in the dataset, by an ID number (integer). The stream is executed on a client; everything else is on the server (SQL Server database and raw data in text file format). I've created ...

: Data Mining    : kenasto    :5 replies

Hello, neither the scripting documentation nor any online help seems to list the explicit commands for setting node properties. I found examples, but is there a complete list of all the commands? In particular, I am trying to find out how to set "Create table" or "Insert into table" in a database output node, such as (fictional command): set databasenode.databaseExportType=create or set databasenode.databaseExportType=append I ...

: Data Mining    : kenasto    :3 replies

We experience the following error. Error reading file Stream_name.str: Invalid entry CRC (expected 0x7990f41e but got 0x5e90fe1e). Thanks in advance for any support you can provide. Roberto ...

: Data Mining    : RobertoRapozzati    :1 reply

Hi all, and thanks in advance. I am a newbie in Clementine, and sorry for my English. I've been investigating decision trees and I used the classic example of playing golf. So, I used a C5.0 node and finally a Table node. In this last node Clementine shows the original fields (outlook, temp, humidity, wind and play) and also two more fields: $C-play and $CC-play. I know the first one is the predicted value, and the second one I think is the confiden ...

: Data Mining    : Castillo   

Is there any way I can make Clementine ask me for some values when the execution starts? Why do I need this? I have many repeating tasks, for example a churn prediction model which executes once per month. Before I start the stream I need to open nodes and change the date (YYYYMM) and some other parameters. I have many database nodes where I use queries for gathering data, and I always need to enter this date (many times the same date). I tried using '$P-MONTH' ...

Read More

: Data Mining    : ludimax    :4 replies

Hi, everybody: I use Clementine client and server (both 10.1, simplified Chinese version) for a data mining project. Clementine server is installed on an IBM P550 machine whose operating system is AIX 5L 5.4 (simplified Chinese version). Clementine client is installed on a PC whose operating system is Windows XP SP2. My problem is the following: the data file used in the project is a csv file, exported from Oracle 11i, which was installed on a Unix system. All of th ...

Read More

: Data Mining    : wlzlyg    :1 reply

Dear All, I have this problem and I don't know what to do: for a specific field called "volume", any value exceeding 999,999 is turned into $null$. I have tried to override the field type (which was set to String rather than Range) on the file node, and I have also tried to specify a range value on the Type node. Nothing works. Is there anything else I could do? Many thanks, Harry ...

Read More

: Data Mining    : hgwelec    :3 replies

Dear All, After performing a classification task using C5, CART and CHAID, I am using the generated models in a stream to get the results from an Analysis node. The target field is called CLASS in my stream. In the output of the Analysis node I get these model names: $C-CLASS, $R-CLASS, $R1-CLASS. OK, so $C-CLASS refers to the C5 node, but what about $R-CLASS and $R1-CLASS? Which one is CHAID and which one is CART? I tried renaming the models i ...

Read More

: Data Mining    : hgwelec    :2 replies

Dear All, I am trying to derive a new field as follows: 1) a field called "text" has the String value "the sun is shining"; 2) a field called "score" has the value 0.9. If score >= 1.0 then deriveField = contents of field "text" + the String "_GREATER"; if score < 1.0 then deriveField = contents of field "text" + the String "_LESS". So in the above example the derived f ...

Read More

: Data Mining    : hgwelec    :2 replies

Is there any way to do this? I don't use the Clementine folder for my projects, so it would speed things up to have the program display my projects folder instead of the program's. I checked the help, some program options and the Clementine folder but couldn't find out how to do it. Is there a way? Thanks. ...

Read More

: Data Mining    : Arkantos    :3 replies

Hi everyone, I have an AMD X2 4200+ with the optimizer installed, and I have noticed Clementine sometimes uses both cores and sometimes doesn't. It is quite annoying because when it uses both, the processing speed literally doubles. A PC / Clementine reboot sometimes fixes this, but why does it happen? Thank you! ...

Read More

: Data Mining    : Arkantos    :6 replies

Hi guys, I'm using Discriminant to predict a categorical output with 8 values. My problem is that the model's selection may not get past the filters in other departments, so I would need to always have the "next best choice" available. This information should be available due to the nature of all data mining classification methods, right? I think I read some months ago that Tim (Manns) recommended creating a separate output field for each value when ...

Read More

: Data Mining    : Arkantos    :2 replies

Hey guys, I was wondering if anyone had any experience working with Clem 12 under Vista 64. I had to leave max memory at the default (256 MB) because changing it would give me the thread title's error. In fact, by default, Clem 12 wouldn't launch: I have to start it as an administrator. But this doesn't help when changing max memory. Any ideas? I have 4 GB of RAM, sticking to the 256 just for compatibility's sake would real ...

Read More

: Data Mining    : Arkantos    :4 replies

So, after reading the help in Clementine and seeing the demo, I concluded that for SLRM you must have: 1) a "PRODUCT_OFFERED" string field with different product offers; 2) an "ACCEPTED" flag field with a T/F value, stating whether the client has accepted the offer or not. So now you should have a base with, for example, 5 different and balanced (10-30% each category, suppose) values for PRODUCT_OFFERED, and in each category, T/F values for "ACCEPTED" bal ...

Read More

: Data Mining    : Arkantos   

Hi everyone, I have been using the "Action -> Discard" function included in the Data Audit node, but the problem is that this has to be done individually for each field. Isn't there a way to do it automatically? I tried selecting many fields and setting the option, but it still changes only one. I looked for help in the manual but found nothing. Clem 11.1, by the way. Thanks ...

Read More

: Data Mining    : Arkantos    :2 replies

Hello, I may have to migrate from Clementine to WEKA one of these days. Has anyone had experience with this program? I'm downloading it now and I will start testing it, but I have no idea. Any comments/opinions/suggestions will be very useful, thank you. A. PS: I don't intend to open Clementine streams in WEKA ("migrate" can be misinterpreted), I intend to do the same things I am doing with Clementine, but in WEKA. ...

Read More

: Data Mining    : Arkantos    :1 reply

Hi guys, maybe somebody can help me. I have a database with several columns containing history, for example: SALARY_JAN, SALARY_FEB and so on. If I wanted to get the average from the salary group, I would just type it. But the thing is I have 200 groups with 4 fields each. That would be a lot of typing, especially because I don't want to get just the average, I want to get about 4 or 5 equations from each group. Any suggestions to avoid wasting my time with a repetitive task? Than ...

Read More

: Data Mining    : Arkantos    :2 replies

Hi guys, maybe somebody can help me. I have a database with several columns containing history, for example: SALARY_JAN, SALARY_FEB and so on. If I wanted to get the average from the salary group, I would just type it. But the thing is I have 200 groups with 4 fields each. That would be a lot of typing, especially because I don't want to get just the average, I want to get about 4 or 5 equations from each group. Any suggestions to avoid wasting my time with a repetitive task? Than ...

Read More

: Data Mining    : Arkantos    :4 replies

Is there a way, besides using the same amount of data for training as for the prediction, to get a real probability value from neural networks or any other model? Confidence is not probability; it's only useful for determining the probability relative to other records. Right now I'm just comparing to reality and creating subgroups with a probability value. Thanks! Arkantos ...

Read More

: Data Mining    : Arkantos    :2 replies

Hi, I have two questions regarding Neural Networks. 1) Suppose today I generate a Neural Network model, and in a couple of months I get new data, and I want to train my generated model with the new records. Do I have to start over? Do I have to append the records and make one big pass with the node, generating a new network but with more data? The Neural Network node help says that if the "Continue training existing model" option is selected, then the model previously ...

Read More

: Data Mining    : Arkantos    :2 replies

Right now my Clementine is running a process; I've been working 12 hours straight with just some breaks for eating and going to the bathroom. My objective is to assign a probability number to 1,750,000 records, being the probability of acquiring a product next month. Each month, only 5,000 of those records acquire the product. I have 150 usable fields, with many different kinds of distributions, storage classes, types, and so on. I even have one set field with 100 diff ...

Read More

: Data Mining    : Arkantos    :5 replies

It's a simple situation that I really don't know how to solve in Clementine: I made prediction number 1. I have the primary keys, the $N-X and the $NC-X fields. I get the new data that contains what really happened, so I make a field named X with a T or F value, according to the new data. Then I run the Evaluation node to see a Gain chart. Then the same process but with prediction number 2. How can I display Gain curve 1 and Gain curve 2 in the same chart? That' ...

Read More

: Data Mining    : Arkantos    :2 replies

I've been building some churn neural network models, and I've arrived at some conclusions, but this is something I can't understand very well. The category T I'm trying to predict is very small, so I use a small sample of F for training. I leave the default options. So I arrived at two "best models". Model A is the one with the best gain chart considering only the best scored records, and model B is the one with the best gain chart overal ...

Read More

: Data Mining    : Arkantos    :7 replies

Hi, I have a database with 100k lines; each line is a record. There are occasions where a line is defective or one of its values is reported as "unknown value". I would like Clementine to ignore these lines automatically (or delete them before the C5.0 node is executed). Thanks in advance, ...

Read More

: Data Mining    : warpy    :5 replies

Hi, I am trying to discover sequential patterns by using the Sequence node of SPSS Clementine 9.0. Unfortunately, after the transaction data has been read, the following alert appears: Internal Error: Bad function call in file "data_file.cpp", line 180. I don't use a server, and execute the sequence analysis only on my own PC (RAM 1MB); the transaction data has roughly 1 million lines. Does anyone know what's wrong and why there is this internal error in Clementine? ...

Read More

: Data Mining    : CScheffler    :2 replies

Hi, I am trying to apply the K-Means model to a data set I have in SPSS Clementine. I have managed to generate 5 clusters, and have added a new column to the table containing the data (that specifies which cluster a record falls into), but I do not know what to do next. Is there a way of graphically representing these clusters in Clementine? Using a graph, or a histogram of some kind? Or is the list of clusters (the actual K-Means model) as far as it goes? Thank you v ...

Read More

: Data Mining    : LizAndTodd    :2 replies

Hi, I was wondering if anyone knew of any online tutorials or books which give a walk-through of how to use the linear regression model in SPSS Clementine? I have never used it before, and want to apply it to a small data set that I have. Thank you again! ...

Read More

: Data Analysis and Statistics    : LizAndTodd    :2 replies

Hello, I'm trying to design a CEMI node for parallel processing. In it I have put a LIST control on the "Settings" tab, and based on the selection I want to make some sheets visible or invisible. Sheets created for the "Settings" tab are working fine, but sheets created for the extra tab (called EXPERT) can not handle the VISIBLE option. I am attaching my specification file for better insight. If I try to add this spec file, Clementine adds it to the proper location (" ...

Read More

: Data Mining    : ybadhe    :1 reply

Hi, I'm new to Clementine and I heard it was better, if not necessary, to work with data cubes for feeding data to Clementine. I've searched a lot on Google but I haven't found a real answer to my question. Any help appreciated. Juan ...

Read More

: Data Mining    : juancarlosr    :1 reply

Hi, is it possible to script the connection to a database? Our DB2P connection times out after 10 mins and kicks me out. By the time I have run one section of the stream, the database nodes are no longer connected and the stream stops. I have tried to re-create the database nodes using script, or to use a small dummy database connection to keep the connection alive, but neither works. Can the database connection be scripted? Thanks ...

Read More

: Data Mining    : fuzz    :2 replies

Hi! I am in the process of researching and writing my master's thesis at the university on the topic of Data Mining for CRM. This topic is still quite new where I come from (Serbia), so I have no real support or on-hand help to guide me with it. I was recommended to use SPSS Clementine for this project. I have version 8.1 to use. As I have no data sets available from companies, I was wondering if someone could guide me on building a simple model (nothing too ...

Read More

: Data Mining    : ivan.karlovic    :4 replies

I am working on a project in which we are given some random data (e.g. demographic data, purchasing data) to train a model and then apply it to another set of data. We are required to use that model to predict who is going to buy our product in that new set of data. However, we've tried every method (e.g. logistic regression, decision tree and neural network) and still couldn't come up with a very pleasing accuracy rate with the trained model. ...

Read More

: Data Mining    : HKHK    :3 replies

Hi, I want to store only the R square value from the regression output in txt files through scripting. The reason for doing this is to do further processing depending on this R square value. If a regression model is run with the stepwise method, some variables are excluded from the regression model. The problem is that we want the names of the variables that are included in the regression to be written to a separate txt file. Is it ...

Read More

: Data Analysis and Statistics    : nivrutti    :2 replies

Hi, if I develop a data mining algorithm and import it into Clementine via CEMI, then after the model is trained, can I publish the generated model with the Publisher node? In other words, can I publish a CEMI model to meet the needs of a real-time scoring application? Thanks in advance. Neil ...

Read More

: Data Mining    : flycoco    :5 replies

Hi: In Clementine, Derive Conditional nodes use If-Then-Else statements to derive the value of the new field. How can I create a multi-condition (if-then, elseif-then, elseif-then ... else) Derive node? For example, the original field is "age". I want to modify the abnormal values and keep the normal values, because the abnormal values are "dirty". If Age <= 0 then Age = 0 ...

Read More

: Data Mining    : flycoco    :1 reply

Hi, I am using Clementine 10.1 to generate a cluster map of people with the Kohonen node. One of my inputs is a set which describes the group that a person works for. This set has 40 different group codes in it. When I run the network, I get a warning message for the stream saying that it is 'ignoring large set input field x'. Is there a limit to how many things you can have in a set? If so, what is it? Thanks ...

Read More

: Data Mining    : elj729    :4 replies

Help with Churn Project. I'm working on a churn project for a retailer (supermarket) with Clementine. I have 3 months of transactional information (with some demographics like gender, marital status, age, occupation, etc.) and 3 weeks in which to verify desertion. The period of time for desertion is relatively short because I'm working with 2 segments of customers who are very frequent (on average 1 visit every 5 days). I have tried to model using the algorithms of Neural Net ...

Read More

: Data Mining    : vspl    :3 replies

Hello, help! I need to run reliability analysis on the following data: multiple subjects, each subject has 2 judges for the 50 possible variables (each subject only expresses about 15, min 6, max 29), and each variable can be repeated at irregular intervals within the same trial (hence why I need something that accounts for time). I've had a couple of different people tell me that I need to create a time series, but that keeps normalizing my times and therefore ...

Read More

: Data Mining    : Shadowsclaw13    :1 reply

When running one of my streams, I need to be able to close a table after the output has been created. I have tried the following commands in Clementine: execute FIRST:tablenode; delete FIRST:tablenode -- this just deletes the node, not the output; delete tablenodeoutput -- does not do a thing. The following commands just result in a syntax error: close FIRST.tablenode; close FIRST.tablenode.output; delete FIRST:tablenode.output; delete FIRST.tablenode.output - (period between tableno ...

Read More

: Data Mining    : oval    :4 replies

A really poorly delimited file was received, with field values corresponding to field x ending up in field y. After a bit of clean-up a good chunk of the file was usable; unfortunately the numeric values in the file remained as strings, and the commas that separate the thousands and millions places remained. I need to get rid of the commas in the string value so that I can convert the string to a numeric value. Trying to go from string to numeric does not work: ...

Read More

: Data Mining    : oval    :2 replies
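
For the comma question above, a minimal sketch of the idea in Python (not CLEM or Clementine script): strip the separators from the string first, then convert. The function name to_number is hypothetical, and the sketch assumes commas are only ever used as thousands separators.

```python
def to_number(raw: str) -> float:
    """Convert a string such as '1,234,567.5' to a float.

    Assumption: commas appear only as thousands separators,
    never as a decimal mark.
    """
    cleaned = raw.strip().replace(",", "")  # drop padding and separators
    return float(cleaned)


print(to_number("1,234,567"))
print(to_number(" 12,500.75 "))
```

The same two-step order (replace inside the string, then cast) should carry over to whatever string-replace and conversion functions the tool at hand provides.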

I need some help... I want to know about the Clementine topic "Getting the most from models". ...

Read More

: Data Mining    : norbak    :3 replies

Please let me know if this is the right way to do z-scoring: ITEM_PROFIT / ( @GLOBAL_MAX(ITEM_PROFIT) - @GLOBAL_MIN(ITEM_PROFIT)). And how can I arrange the Set Globals node so it runs automatically each time I do the above transformation in a Derive node? I am doing this to prepare data for clustering. Another question: some of the item profits are negative (-0.5 for example, compared to an average profit of +5$). Is this a problem? Many thanks. ...

Read More

: Data Mining    : hunterdong    :4 replies
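
On the z-scoring question above: the posted expression divides by the range (max minus min), which is a form of range scaling rather than a z-score. A z-score subtracts the mean and divides by the standard deviation. A minimal Python sketch of both, for comparison; the function names are hypothetical, the population standard deviation is an assumption (the sample version is equally common), and unlike the posted expression min_max here also subtracts the minimum. Negative profits are not a problem for either transformation.

```python
from statistics import mean, pstdev


def z_scores(values):
    """(x - mean) / standard deviation, using the population SD."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """(x - min) / (max - min), which maps the values onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; range is zero")
    return [(v - lo) / (hi - lo) for v in values]


profits = [-0.5, 5.0, 10.5]  # made-up example values
print(z_scores(profits))
print(min_max(profits))
```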

Hi, does anybody know how I can set up the stream so that when someone clicks "run", a window pops up and allows (or prompts) them to change a value in a SELECT node? Or can anything on the script side achieve that effect? Thanks. ...

Read More

: Data Mining    : hunterdong   

Hi, I am trying to create a series of Derive nodes and assign them some strings (13 digits long, may begin with 0), but it keeps converting the string value to scientific format. I don't want anything like 7.87897423432432E14. Is there a way to make it remember that I gave the variable a string value? Thanks mates. var sku nodey i set sku=["ok","00ok787","0000003432432"] set i=5 for codeA in yr set i=i+5 set code="OK" set co ...

Read More

: Data Mining    : hunterdong    :6 replies

Hi dear all, I have got questionnaire data (SPSS format) with over 2,000 columns. A few questions: 1. Is it usual to have thousands of variables? (In the questionnaire some questions are multiple choice with a few hundred options. I don't know how they carried out the survey! It must be a very long page in IE...) My answer currently is Yes. 2. How can I store this in a database (e.g. SQL Server supports 1024 columns only)? ...

Read More

: Data Mining    : hunterdong    :2 replies

Hi, if I have a table like the one below, where each customer has a row containing a value (income) and some rows of spending in different categories, and I want to derive a field for each row to represent that row's value as a percentage of his yearly income (there is always a row that records his income). CustName  Field  Value ...

Read More

: Data Mining    : hunterdong    :3 replies

Hi, I want to choose every pairwise combination of, say, my top 100 products. A RESTRUCTURE node can place the names of these 100 products into a list []. But how can I retrieve each pair of these 100 elements in a list like [ProdA1, ProdA2, ProdB1....]? (I will create select criteria using these pairs.) The only alternative way I have done this is to hard-code it in stream script, assigning a string (luckily all the product codes are the same length): set str1="ProdA1ProdA2ProdB ...

Read More

: Data Mining    : hunterdong    :4 replies

Basically I need to produce output to different folders. (I didn't find a way to create a folder in script, but that is another issue.) So I use strPath="C:" >< varFolder1 >< "name.csv", which would NOT work, maybe because I couldn't find in the help file how to place a backslash where I want it to be. Single quotes won't work in stream script. Solutions I found: 1. use substring() or trim space ...

Read More

: Data Mining    : hunterdong    :2 replies

Hi guys, do you have experience scripting nodes in a supernode? My Clementine (stream script) doesn't allow me to manipulate a Restructure node in a supernode, while it does allow manipulating a Select node! set SuperNode.parameters.'Restructure2.fields_from.Drug'=["A","B"] #will NOT work; set Restructure1.fields_from.Drug=["A","B"] #will work. Does anyone have a similar problem? I also can not find a way to clea ...

Read More

: Data Mining    : hunterdong    :1 reply

Hi, I have got one table which contains sales records for each week (Week, ProductName, Value): wk1 ProdA 5; wk1 ProdB 3; wk2 ProdA 4; wk2 ProdC 5. If a certain product is not sold in a week, no row is stored for it for that week. If I want to insert these non-sold products with a 0 value, will I be able to do this? I can do this if the table only conta ...

Read More

: Data Mining    : hunterdong    :9 replies
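
For the weekly sales question above, one common approach is to build every week x product combination and look each one up in the existing rows, defaulting to 0. A minimal sketch in Python rather than Clementine; fill_missing is a hypothetical name, and the optional all_products argument is an assumption for the case where the full product list comes from elsewhere.

```python
from itertools import product


def fill_missing(sales, all_products=None):
    """Return one row per (week, product), using 0 where nothing was sold.

    sales: list of (week, product, value) tuples.
    all_products: optional full product list; defaults to products seen in sales.
    """
    weeks = sorted({week for week, _, _ in sales})
    products = sorted(all_products or {prod for _, prod, _ in sales})
    known = {(week, prod): value for week, prod, value in sales}
    return [(w, p, known.get((w, p), 0)) for w, p in product(weeks, products)]


sales = [("wk1", "ProdA", 5), ("wk1", "ProdB", 3),
         ("wk2", "ProdA", 4), ("wk2", "ProdC", 5)]
for row in fill_missing(sales):
    print(row)
```

In stream terms this corresponds to a full cross join of distinct weeks and distinct products, followed by an outer join back to the sales table with nulls replaced by 0.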

Hi friends, question again: I have the following transaction data (TransID, Product): 1 jam; 2 milk; 3 jam; 3 bread; 4 jam; 4 bread; 4 milk; 5 meat; 6 cheese; 6 beer; 6 water; 6 crisp; 7 veg. And Group A: meat, bread, cheese, crisp. Group B: milk, beer, water. If I need to get the number of transactions for every possible combination of one product from A and one product from B, is there a normal way to do it in a single Clementine route? E.g. number of transactions containing Meat+Milk: Meat+B ...

Read More

: Data Mining    : hunterdong    :5 replies

Hi dear all, During analysis of transaction data I am a bit confused about the boundary of ETL and analysis. I am supposed to extract summary information from transactional format and carry out further analysis. As there is no data warehouse or data mart or anything like that, streams are built to generate summary data of different granularities and subjects for analysis. But if not considering loading incremental data, I suspect the efficiency and long term benefit ...

Read More

: Data Mining    : hunterdong    :1 reply

In segmenting customers using transaction data, are there any alternative metrics, say using change of frequency (steadily increasing, fluctuating, or decreasing visit frequency) instead of frequency in RFM? If there are, what are the common algorithms or approaches to capture that using Clementine from transaction-format data? I can derive things like: Customer1: Jan + (increased frequency compared to last month); Feb + (increased); Mar - (decreased); Apr =  ...

Read More

: Data Mining    : hunterdong    :1 reply

Hi dear all, what is the common cause of this Internal Error: Stream optimisation failed and is switching off? It is a supernode with a single input and two branches inside; each branch can be optimised separately, but when the two branches are merged using a single field, this warning appears. Thanks. Previewing SQL: SELECT T0.C0 AS C0,T0.C1 AS C1,T0.C2 AS C2,T0.C4 AS C3,T0.C6 AS C4,T0.C7 AS C5 FROM (SELECT T0."TRANS_ID" AS C0,T0."TRANS_DATE" AS C1,T0."CU ...

Read More

: Data Mining    : hunterdong    :1 reply

Got a question and need some insights: I have 20 types of product in a transaction table, and I want to know the value for each pair of product types (which means I need to build a table of 190 rows). The source table is like: Trans1 TypeA 5$; Trans1 TypeC 1$; Trans2 TypeA 2$; Trans3 TypeA 1$; Trans3 TypeC 2$; Trans3 TypeT 1$, which means transaction Trans1 contains 2 types of items with total value 6$; Trans2 contains 1 type with total 2$. I need a table like: TypeA_TypeB  0$Type ...

Read More

: Data Mining    : hunterdong    :8 replies
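
For the pair-value question above, a rough Python sketch. The excerpt is cut off before it defines the value of a pair, so this assumes it means the combined value of the two types, summed over transactions that contain both; pair_values is a hypothetical name. With 20 types, itertools.combinations yields the 190 pairs mentioned in the post.

```python
from collections import defaultdict
from itertools import combinations


def pair_values(rows, all_types):
    """Sum, per pair of types, the value of both types in co-occurring transactions.

    rows: list of (transaction_id, product_type, value) tuples.
    all_types: every type that should appear in the output, even with 0.
    """
    per_trans = defaultdict(lambda: defaultdict(float))
    for trans_id, ptype, value in rows:
        per_trans[trans_id][ptype] += value

    types = set(all_types) | {ptype for _, ptype, _ in rows}
    totals = {pair: 0.0 for pair in combinations(sorted(types), 2)}
    for items in per_trans.values():
        for a, b in combinations(sorted(items), 2):
            totals[(a, b)] += items[a] + items[b]
    return totals


rows = [("Trans1", "TypeA", 5), ("Trans1", "TypeC", 1), ("Trans2", "TypeA", 2),
        ("Trans3", "TypeA", 1), ("Trans3", "TypeC", 2), ("Trans3", "TypeT", 1)]
for pair, total in pair_values(rows, ["TypeA", "TypeB", "TypeC", "TypeT"]).items():
    print("_".join(pair), total)
```

If the intended definition is different (for example, counting transactions instead of summing values), only the line that updates totals needs to change.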

I don't know why, but when I ran a stream script (as simple as updating a Select node value and then executing a database export node), all CPU resources were taken and other users could not even log on. As a result I am banned from using script/SQL code, as a myth says these may harm performance. Any ideas? ...

Read More

: Data Mining    : hunterdong    :1 reply

Hi, I have a stream which joins two source tables (MERGE node), then processes the merged data with 3 different kinds of aggregation, producing 3 distinct results. I don't think a cache will speed it up, and the merged data is huge (10 million rows); this sometimes causes performance issues for other users. I did some experiments on a smaller sample, and the result is counter-intuitive: running the three branches one by one takes less time than starting all three ...

Read More

: Data Mining    : hunterdong    :2 replies

Urgent question, seeking help. How can I detect DUPLICATED records and output them? How can I find out whether a variable is entirely $null$? (It is recognized as typeless, so I can't use the Data Audit node.) ...

Read More

: Data Mining    : hunterdong    :3 replies

Hi, if I have a (purchasing transaction) table with more than 100 million rows and 50 columns, how am I supposed to explore the data? For now I am sampling, e.g. 500K rows from the beginning; otherwise I can't do anything like GRI or K-Means because I don't know how long that will take. Any ideas? Thanks, ...

Read More

: Data Mining    : hunterdong    :1 reply

Hi dear, if I have multiple fields like: CustID, Items_A_Bought, Items_B_Bought ... Items_Z_Bought; John, 1, 7, ...

Read More

: Data Mining    : hunterdong    :2 replies

Hi, I couldn't find related information in the help file. If my stream has a supernode S_Node (non-terminal), and a filter node Remove_Sum_Filter1 is contained in the supernode, how can I successfully run the stream script? I get a "can not find node" error. If the node is not in a supernode it runs smoothly. Many thanks. # the script # a loop to rename fields (shorten field names) # first make a temp variable to hold the field name var tim # start loop, ...

Read More

: Data Mining    : hunterdong    :2 replies

Does this mean some structure issue? Cheers, ...

Read More

: Data Mining    : hunterdong    :6 replies

Hi, I want to consolidate multiple fields and link related records together, i.e. if I know Person A is using Card1 and Card2 and LoyaltyCard1, and Person B is using LoyaltyCard1, then I want to identify all of these cards and loyalty cards as belonging to the same person (by creating a new ID field). So the source file will be (CustfromFileA, CustfromFileB, CustfromFileC): SSN1,Card_1,LoyalCard_1; SSN2,Card_2,LoyalCard_1; SSN1,Card_3,na; SSN2,na,na; na,Card_2,na; na,na,LoyalCard_1; na,Card_3,LoyalCard ...

Read More

: Data Mining    : hunterdong    :3 replies

Hi, I was wondering whether there is a way to construct a function that derives the last day of the month of a DATE value. For example, if a field is 2008-08-16 then the derived field will be 2008-08-31. SQL has this function; however, is it possible to do it in a Clem expression or script? A similar question: if I know which Saturday of 2008 it is (e.g. the first Saturday of 2008), how can I derive the date? Intuitively I can use a hard-coded lookup table for this, but seems ...

Read More

: Data Mining    : hunterdong    :1 reply
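
For the date question above, the arithmetic can be sketched in Python (the post asks about Clem expressions, so this only illustrates the logic, not CLEM syntax). Both function names are hypothetical. The last day of a month is just the month's length, and the n-th given weekday of a year is an offset from January 1.

```python
import calendar
from datetime import date, timedelta


def last_day_of_month(d: date) -> date:
    """Return the last calendar day of d's month (leap years included)."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


def nth_weekday_of_year(year: int, weekday: int, n: int) -> date:
    """Return the n-th occurrence (n >= 1) of a weekday in a year.

    weekday uses the calendar module convention: Monday=0 ... Sunday=6.
    """
    jan1 = date(year, 1, 1)
    offset = (weekday - jan1.weekday()) % 7  # days from Jan 1 to first match
    return jan1 + timedelta(days=offset + 7 * (n - 1))


print(last_day_of_month(date(2008, 8, 16)))
print(nth_weekday_of_year(2008, calendar.SATURDAY, 1))
```

No lookup table is needed; the same offset calculation works for any year and weekday.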

I received a request to vote for PASW on KDnuggets and noticed this... It is PASW Modelling instead of Clementine now. It may be related to a former executive failing to sell the trademark to the company for 20M$. When I read the story on the PASW website about the renaming I was laughing. ...

Read More

: Data Mining    : hunterdong    :3 replies

Hi, I created a custom data processing utility and am now seeking to integrate it into Clem. I think I need to create it as a Clementine node using CLEF. I tried to read the help but it is still not very clear. It seems I need to create a Specification File, and if that can not fulfil my needs, then call the API. So the first question is: where can I get a guide to creating this, or find all the grammar/keywords? I can see "Example of a Specification File" and clef_examples_e ...

Read More

: Data Mining    : hunterdong    :1 reply

Hi Dear All, I want to use Clementine to find association rules for data such as the retail market basket data from a retail store. However, it is in a variable-length file in which each line contains all items within a transaction, separated by spaces. http://www.kdkeys.net/how-to-transform-transaction-per-line-data-to-feed-into-apriori/#link-89 How can I use Clementine (or Excel) to prepare it for the Apriori node, which supports either one item+TransactionID per line ...

Read More

: Data Mining    : hunterdong    :6 replies
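
For the basket-file question above, a minimal preprocessing sketch in Python (done outside Clementine, before the file is read). It turns one-transaction-per-line data into one (TransactionID, item) row per item. The name to_transactional is hypothetical, and the sketch assumes items contain no spaces and that the line order can serve as the transaction ID, since the file itself carries none.

```python
def to_transactional(lines):
    """Convert space-separated basket lines to (transaction_id, item) rows.

    Blank lines are skipped; IDs are assigned in order, starting at 1.
    """
    rows = []
    trans_id = 0
    for line in lines:
        items = line.split()
        if not items:
            continue  # ignore empty lines
        trans_id += 1
        for item in items:
            rows.append((trans_id, item))
    return rows


baskets = ["1 2 3", "", "2 5"]  # made-up example lines
for trans_id, item in to_transactional(baskets):
    print(f"{trans_id},{item}")
```

The printed output is a two-column CSV that can be read back in as transactional data.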

Hi, will I be able to detect and output all the fields which contain a negative value (<0)? For example, with fields A, B, C, D, E and rows: 1, 1, txt, 7, -1; 7, -1, txt, 0, 5; 5, 2, txt, 5, 7. Can I use an expression to output that B and E are detected? Is that something involving @field+1 or @Next(field)? Where can I find documentation about this? ...

Read More

: Data Mining    : hunterdong    :5 replies
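
For the negative-value question above, the check itself is a scan over every numeric field, as in this Python sketch (illustrative only, not a Clem expression). The name fields_with_negatives is hypothetical, and rows are assumed to arrive as dictionaries keyed by field name.

```python
def fields_with_negatives(rows):
    """Return the sorted names of fields holding at least one value below 0.

    rows: list of dicts mapping field name to value; non-numeric values
    (such as strings) are ignored.
    """
    flagged = set()
    for row in rows:
        for name, value in row.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and value < 0:
                flagged.add(name)
    return sorted(flagged)


rows = [
    {"A": 1, "B": 1, "C": "txt", "D": 7, "E": -1},
    {"A": 7, "B": -1, "C": "txt", "D": 0, "E": 5},
    {"A": 5, "B": 2, "C": "txt", "D": 5, "E": 7},
]
print(fields_with_negatives(rows))
```

An equivalent without row-by-row logic is to compute the minimum of each numeric field and keep the fields whose minimum is below zero.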

Hi Dear All, I am trying to run a Sequence node over about 20,000,000 rows of transaction-format data. The ID field (Customer ID) is a unique string, so I had to specify it as "typeless" (which is also what the Type node decides after it reads the data). The time field and content field are all right and their values appear in the Type node. However, I receive this error: Unknown data value in symbolic field. Check the type for this field is correctly instantiated? If I reduce the amount by sampli ...

Read More

: Data Mining    : hunterdong    :3 replies

Hi, I wrote a Visual Basic CEMI node to determine the upper and lower 1% of a range variable. I want to add to it the ability to generate a DERIVE node directly on the stream, like you can with the BINNING node. Is this possible? Can I call the script language from Visual Basic? Thanks a lot, Adolfo ...

Read More

: Data Mining    : fito    :3 replies

I want to use a Supernode parameter (e.g. maxpar) in a Derive node; for example: if var > maxpar then 1 else 0 endif. How do I refer to maxpar? I tried $P-maxpar but it doesn't work. Is there any workaround? Thanks, Fito ...

Read More

: Data Mining    : fito    :2 replies

I need to print, for audit purposes, the stream and the content of all the nodes. Of course I can print the stream with the “print” option of Clementine, and write a script to “writeln” each of the properties of each node (in fact I especially need Select and Derive nodes), but the problem is that several nodes share the same name (e.g. the name of the derived variable)… again I can rename the nodes with the annotation option but I need a better, au ...

Read More

: Data Mining    : fito    :4 replies

for example, to execute generate boost from the result of a distribution graph ...

Read More

: Data Mining    : fito    :1 reply

Hi, does anyone know how to do matrix calculations in Clementine script? I want to try to build a 'Markov Chain' stream or script but can't find any clue in the Clementine manual. In SPSS Base with Advanced, I know how to do it. Taka ...

Read More

: Data Mining    : taka    :3 replies

Hi, Clementine 11 is now available at my office. After installation, I launched a first Clem11 instance. With the first one still running, I started a second Clem11. It did not launch. Clem10 can run multiple processes, but Clem 11 can run only a single process. Does anybody know how to launch multiple Clem11 instances on a single PC? I have five Clem client licenses. SPSS intends me to use one Clem license on a single PC. But I want to use multiple Clem11 processes on a single PC. Ta ...

Read More

: Data Mining    : taka    :10 replies

Fellows, I am new to SPSS!! I have a project using standalone SQL to find association rules, such as frequent item sets, in a data warehouse, but I need a tool to compare my results against and validate my work. I need an application that is able to find association rules, especially frequent itemsets. Recently I downloaded SPSS 1.1 to analyze my dataset, but I have no idea how to use SPSS, especially for association rules. Does anyone ...

Read More

: Data Mining    : ergin    :3 replies

Hello, I haven't had much exposure to SPSS or Clementine in the past. My company is currently developing a fraud prevention model to identify high risk accounts. We are currently trying to decide which software is best for this purpose, or if we should just outsource the project altogether. Here are some of the details. We are a short term lender; our customer database consists of typical demographic and transactional information including nam ...

Read More

: Data Mining    : jsun    :3 replies

The Model tab for K-Means models contains detailed information about the proximities of every cluster to all other clusters. Is it possible to get this information (proximities of a specific cluster to all other clusters) as output, so I can use it for further analyses in Clementine scripts? Your help is greatly appreciated. Thank you. ...

Read More

: Data Mining    : jofra    :2 replies

Hello all, I am just beginning to use Clementine for the first time and have already come unstuck. I'm trying to load in some sample data. I have it in a .csv file - every time I try to load it in the var node it says "can't read file column names" - and so no data gets loaded. I originally prepared it in an Excel spreadsheet in the usual table-looking format. I'm a computational linguist and not a data miner, so I'm having a few difficulties getting started here - I a ...

Read More

: Data Mining    : Opie    :4 replies

Hi all, I was writing a stream script to create a Type node, but I don't know how to set the data types of fields within the Type node using scripts, for example whether a field is a "Set" or a "Flag" etc. Can anyone help me with this? Thanks in advance, Rahul ...

Read More

: Data Mining    : Rahul Dev Sharma    :1 reply

Hi all, Can anybody tell me if it is possible to transpose a data file using an identifier field in Clementine? Is there a straightforward method without using the "Set to Flag" and Aggregate nodes? Thanks, Rahul ...

Read More

: Data Mining    : Rahul Dev Sharma    :7 replies

I have some web log files I want to derive sessions from. I have papers by Mobasher et al. on how to create sessions and think I can see how to do that in SQL: import the log files, query for a particular IP address, and then take the first and last times, and the first and first + n minutes. Can I do the same in Clementine? ...

Read More

: Data Mining    : werowe    :1 reply
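
The sessionizing idea in the post above (group hits by IP address, then cut a new session once a hit falls more than n minutes after the session's first hit) can be sketched outside Clementine as follows. This is only an illustrative Python sketch, not a Clementine stream; the function name `assign_sessions`, the `window_minutes` parameter, and the 30-minute default are assumptions, not something the post specifies.

```python
from datetime import datetime, timedelta

def assign_sessions(hits, window_minutes=30):
    """Label each (ip, timestamp) hit with a per-IP session id.

    A new session starts when a hit arrives more than `window_minutes`
    after the first hit of that IP's current session (hypothetical rule).
    """
    window = timedelta(minutes=window_minutes)
    session_start = {}  # ip -> timestamp of the first hit in the current session
    session_no = {}     # ip -> running session counter
    labelled = []
    # Sort by IP, then by time, so each IP's hits are processed in order.
    for ip, ts in sorted(hits, key=lambda h: (h[0], h[1])):
        start = session_start.get(ip)
        if start is None or ts - start > window:
            session_start[ip] = ts
            session_no[ip] = session_no.get(ip, 0) + 1
        labelled.append((ip, ts, f"{ip}#{session_no[ip]}"))
    return labelled
```

Inside Clementine the same logic would presumably be built from sort and derive steps, but that mapping is not shown in the post.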

Hi there, I am new to SPSS Clementine, and I am having a problem getting a C5.0 decision tree right. My dataset has about 5800 records with 85 input variables and 1 target variable (set data type). Using data partition, I split it into 67% training data and 33% validation data, and the seed is 15000. For the C5.0 node, the settings are as follows: output: decision tree; mode: expert, with minimum number of observations in a leaf of 1. After running the decision tree node and browsing t ...

Read More

: Data Mining    : tinybunny_8    :5 replies

Hi, I'm working on a web-mining project. I need to cluster two fields: IP address (212.12.13.14 for example) and URL (/campus/.....). I'm working on a data set with 100,000 records and 512 kb memory. When I browse the result I get only one cluster!! Pressing the VIEWER button ended with an empty screen?? Thanks, Dan Lavin ...

Read More

: Data Mining    : dan lavin    :4 replies

I would like to know if it is possible to convert an ID index such as an IP address to a string like visitor#1 instead? (For example: IPs 212.23.13.13, 32.12.23.13, 212.23.13.13 -> visitor1, visitor2, visitor1...) Thanks, Dan ...

Read More

: Data Mining    : dan lavin    :4 replies
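
The mapping asked for in the post above (the same IP address always gets the same visitor label, numbered in order of first appearance) can be sketched as below. This is an illustrative Python sketch rather than a Clementine stream, and the function name `label_visitors` is hypothetical.

```python
def label_visitors(ips):
    """Map each IP address to 'visitorN', numbering IPs by first appearance."""
    ids = {}     # ip -> visitor number
    labels = []
    for ip in ips:
        if ip not in ids:
            ids[ip] = len(ids) + 1  # next unused visitor number
        labels.append(f"visitor{ids[ip]}")
    return labels
```

Applied to the three addresses in the post, this yields visitor1, visitor2, visitor1.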

Hi Tem, First, thanks for the "visit i.p" stream, it is really helpful!! I'm working on a web mining project. As part of the pre-processing work, I need to identify visitors and sessions. The following is a sample of the log file: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status sc-bytes cs-bytes 2005-06-28 00:00:18 212.199.222. ...

Read More

: Data Mining    : dan lavin   

Hi everyone! I have a problem starting Clementine. When I load it, it does not respond for a few minutes and then I get the following message: Server exception: java.net.SocketTimeoutException: Accept timed out. I have also noticed that it cannot connect to the local server. Can anyone help me with this? ...

Read More

: Data Mining    : hector__21    :3 replies

Hi, I am new to Clementine. I went to Tools, Stream Properties, Script and then typed some script commands. Then I needed a script from a different file, so I selected Import from the File Options to get that script. Is there a way to include that file in the middle of the script using a command, instead of manually importing the script file and placing its contents? Please advise. Thank you. B. Gopalan ...

Read More

: Data Mining    : bgopalan    :1 reply

Hi - I am hunting around for documentation relating to Clementine's dbms integration, specifically on Clementine's "odbc-netezza-properties.cfg" and "odbc-netezza-operators.cfg" files and wondered if anyone can help. I understand these 2 files handle the mapping between the Clementine functions in the user's streams and the SQL that gets squirted to the backend dbms. Correct? I am looking to achieve full interoperability between Clementine and a Netezza Perform ...

Read More

: Data Mining    : jam    :2 replies

Thanks very much for that info - very handy. There are some datatypes in there that I can only guess at, and some I can't. Considering the following selection of functions from an odbc-operators.cfg file:
'*MAX', 1, N, N, "MAX(%1)"
'*MAX', 1, D, D, "MAX(%1)"
'*MAX', 1, T, T, "MAX(%1)"
'*MAX', 1, TS, TS, "MAX(%1)"
integer, 1, B, X, "(1 1)"
'*BETWEEN', 3, B, X, X, X, "(%1 BETWEEN %2 AND %3)"
log, 1, F, N, "{fn LOG(%1)}", LOG
Q1: Why are some function names enclosed in single quotes like ...

Read More

: Data Mining    : jam    :1 reply

I am trying to connect to the Clementine server from the client end. But each time I try, the server end gets a blue screen (Dr. Watson) with the following error message: Driver_irql_not_less_or_equal. Has anyone had similar encounters before? Server configuration: 1 GB RAM, 120 GB hard disk, P4 2.4a processor, W2K Server OS, Clementine Server 9.0. Rgds ...

Read More

: Data Mining    : amkdocom    :1 reply

hi, ...

Read More

: Data Mining    : michelle   

Hi, I'm trying to use SPSS components in Visual Basic but I can't. Does anyone know how to use and activate them? Thank you ...

Read More

: Data Mining    : migueleitor    :2 replies

Something that caught me out the other day. It is the correct and logical outcome, but can be unexpected. If you concatenate any number of strings together and one of the string values is null, then the result of the concatenation will also be null. Clementine behaves this way, as does SQL and some programming languages. If you want the result of a concatenation to equal the combination of any non-null values, then first use the Filler node to replace nulls with an ...

Read More

: Data Mining    : TimManns   
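
The SQL side of the behaviour described in the post above can be checked with a small sketch using Python's built-in `sqlite3` module. SQLite stands in here for "SQL" in general; the function name `concat_demo` is hypothetical, and replacing the null with an empty string via `COALESCE` is an assumption about the remedy, since the post is cut off before it says what the Filler node should substitute.

```python
import sqlite3

def concat_demo():
    """Return (raw, filled): a concatenation with a NULL, and the same with the NULL replaced."""
    conn = sqlite3.connect(":memory:")
    try:
        # Concatenating with NULL makes the whole result NULL (None in Python).
        raw = conn.execute("SELECT 'abc' || NULL || 'def'").fetchone()[0]
        # Assumed remedy: swap the NULL for an empty string before concatenating.
        filled = conn.execute(
            "SELECT 'abc' || COALESCE(NULL, '') || 'def'"
        ).fetchone()[0]
    finally:
        conn.close()
    return raw, filled
```

The first value comes back as `None` and the second as the combined non-null strings, which mirrors the Filler-node advice in the post.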

Hi all, This example requires a bit of knowledge of Clementine Solution Publisher, and since it is written in VB.NET, an understanding of this programming language would help you understand how the application interacts with Clementine Solution Publisher. The source code might be useful to people wanting a VB.NET example of how to shell an application that has no user interface. If you have purchased Clementine Solution Publisher then this provides a ...

Read More

: Data Mining    : TimManns   

Hi all,   To use as an additional Clementine Scripting reference, I've created a Clementine 9.0 stream file that contains a comprehensive script.  This script creates all nodes and edits all properties for every node found in Clementine 9.0.  The stream itself is not created to "do" anything, but the script is a useful resource if you ever need to use Clementine 9.0 scripting to automate stream generation or manipulation.  I hope it is useful.   Once ...

Read More

: Data Mining    : TimManns   

I am presenting at the Teradata Universe Beijing this year. I will be the first speaker for the telecommunications sessions (the other sessions being 'Banking' and 'National Accounts'). My talk will be a high level summary of the customer analytics we conduct, how we present to corporate executives, and the benefits our analysis has brought our customers and the company. It's a single day conference, and I'll be taking a few extra days f ...

Read More

: Data Mining    : TimManns    :6 replies

You are probably familiar with the Balance node. It performs the function of selectively and randomly sampling your data based upon the values of a field or number of fields. Also known as stratified sampling! If your data is managed by a data warehouse, then Clementine has this cool behaviour of automatically converting functions into SQL, so the data processing can be performed by the database and less data needs to be extracted and duplicated on another file system. Unfortun ...

Read More

: Data Mining    : TimManns   

I received a newbie question on how to aggregate data. The attached stream (version 10) provides two examples of common ways to aggregate / summarise data. Cheers, Tim http://timmanns.blogspot.com/ ...

Read More

: Data Mining    : TimManns    :1 reply

Attached is an example stream related to one of my personal blog posts. Also pasted the blog post below. Cheers, Tim - - - - - - - - - I got a ton of ideas whilst attending the Teradata Partners conference and also Predictive Analytics World. I think my presentations went down well (well, I got good feedback). There were also a few questions and issues that were posed to me. One issue raised by Dean Abbott was regarding building neural networks ...

Read More

: Data Mining    : TimManns    :10 replies

I've attached and pasted below a standalone script that can be imported into any Clementine stream and used to export all node annotations to an HTML file. The stream loops through all nodes in the stream, takes the node annotation and writes this to an HTML webpage file. The HTML file will be created in either the same directory you opened the stream in, or the Clementine installation directory (by default this is C:Program FilesClementine9.0). The HTML file that wi ...

Read More

: Data Mining    : TimManns   

Occasionally I hear the same query; how can a Clementine user obtain a number of predictions for a categorical target field.  Instead of the Neural Network providing a single prediction (and associated confidence of that prediction) for a set/categorical field, they want a score provided for each set value, and the sum of these should equal 100%. Such an output can be used for things such as product recommendation, whereby a ranked list of products (along with predi ...

Read More

: Data Mining    : TimManns   

Hi all, I've been toying around with this idea for ages but my programming skills aren't great so it took me a while :) . I've built a small application that can load in a Neural Network model in PMML format. The application parses the Neural Network into datagrids and also creates a graphical representation of the Neural Net. I'm not sure how useful this will be. It was just a long-term hobby I've been playing around with. I've attached a single zip file ...

Read More

: Data Mining    : TimManns    :3 replies

A minor irritation for me is that an Aggregate node will always add "_Sum" or "_Mean" to the end of a field name when the field is used as an aggregate field. Most of my Clementine analysis is automatically converted into SQL and processed by the database. Keeping field names shorter than 30 characters is a limitation of our database. The Clementine script below automatically renames all the fields within a specific Filter node. This ca ...

Read More

: Data Mining    : TimManns    :2 replies
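
The renaming problem in the post above can be illustrated with a short sketch. The post is cut off before it says what the fields are renamed to, so this assumes the goal is simply to strip the "_Sum" / "_Mean" suffixes; it is plain Python rather than the author's Clementine script, and the function name `strip_aggregate_suffixes` is hypothetical.

```python
def strip_aggregate_suffixes(names, suffixes=("_Sum", "_Mean")):
    """Return a mapping old_name -> new_name with a trailing aggregate suffix removed."""
    renamed = {}
    for name in names:
        new_name = name
        for suffix in suffixes:
            if name.endswith(suffix):
                new_name = name[: -len(suffix)]  # drop only the trailing suffix
                break
        renamed[name] = new_name
    return renamed
```

Note that stripping suffixes can produce duplicate names if the same field was aggregated in two ways, which a real script would need to guard against.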

There is a Marketing Analytics conference in which I am the opening speaker (god help me...). It is in Singapore during 19-20th November 2007 (http://www.analytics2007.com). I am also giving an associated three hour workshop on 21st November. I was requested to speak on a fairly generic topic (see below). I will try to cover the major data mining issues many large companies face. I will briefly mention Clementine and Teradata as the two m ...

Read More

: Data Mining    : TimManns   

I've provided a very simple example of how you might typically transform transactional level customer data into summarised customer data with one row per customer. I have attached an example Clementine stream and data file that illustrates this. There are details on my blog: http://timmanns.blogspot.com/2008/11/simple-data-transformation-example.html Cheers, Tim ...

Read More

: Data Mining    : TimManns   

Hi all, I've attached an example application that can be used as a custom node inside Clementine. Clementine's External Module Interface (CEMI) technology uses a simple text file to allow users to define the properties of their own custom nodes. Basically any application that can be run via the command line can easily be added as a custom node inside Clementine. To end users they appear and behave just like regular nodes. I've created a simple VB.NET ...

Read More

: Data Mining    : TimManns    :2 replies

FYI, I'll try to write about Clementine and data mining related activities in my blog, but other stuff will be in there too. The blog is just my ramblings. Not a replacement for my input to this forum! http://timmanns.blogspot.com/ Cheers, Tim Manns ...

Read More

: Data Mining    : TimManns    :3 replies

Hi guys, Although Clementine supports field/column names in upper and/or lower case, some database applications may be configured to accept only uppercase. Below is a short Clementine stream script that loops through all nodes in a Clementine stream and changes the field names listed in any Filter node to be uppercase. This was created using Clementine 9.0, but should work in earlier versions. Regards, Tim ------------ # script to loop through all nodes and convert # fi ...

Read More

: Data Mining    : TimManns    :2 replies

Hi Paul, Yes, this is not difficult. You need to apply a Set-to-Flag node for one of the categorical fields you want to examine, then apply a sum aggregation by the other categorical field you want to examine. I have attached a zipped stream file that uses the Drug1n demo data. Cheers, Tim. In reply to: Hi Tim, I want to ask you whether there is a known way to use a crosstabulation not as an end node but like a Derive node. Is that possible ...

Read More

: Data Mining    : TimManns    :1 reply
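
The two-step recipe in the reply above (turn one categorical field into 0/1 flags, then sum those flags grouped by the other categorical field) can be sketched as follows. This is an illustrative Python sketch of the idea, not the attached stream; the function name `crosstab` and its record-as-dict input format are assumptions.

```python
from collections import defaultdict

def crosstab(records, row_field, col_field):
    """Cross-tabulate two categorical fields via flag creation plus sum aggregation."""
    categories = sorted({r[col_field] for r in records})
    # One row of counters per value of row_field, one counter per category.
    table = defaultdict(lambda: {c: 0 for c in categories})
    for r in records:
        for c in categories:
            flag = 1 if r[col_field] == c else 0  # the "set-to-flag" step
            table[r[row_field]][c] += flag        # the "sum aggregation" step
    return dict(table)
```

Because the result is an ordinary table keyed by one field, it can feed further processing rather than ending the flow, which is what the quoted question asks for.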

Hi all, Attached is an example application and VB.NET source code that reads a TwoStep clustering model in PMML (an XML standard - Predictive Modelling Markup Language), uses an XSLT stylesheet to transform the cluster information and exports it as a .csv file. It should work successfully with TwoStep cluster PMML, but may also work with K-Means and other clustering models exported as PMML. It is an example of how in VB.NET you can use an XSLT stylesheet to transform ...

Read More

: Data Mining    : TimManns    :1 reply

Hi all, I’ve attached an example stream that I thought might be useful. It breaks any 5 word text string (separated by spaces) into 5 separate fields. - background info - As part of a Text Mining for Clementine project our customer wanted to break down a derived concept string into separate words. These words would then be compared to an existing taxonomy system. The concepts can consist of up to five words, separated by spaces. The str ...

Read More

: Data Mining    : TimManns   
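
The splitting step described in the post above (a concept string of up to five space-separated words broken into five fields) can be sketched like this. It is a Python illustration rather than the attached stream; the function name `split_concept`, padding short concepts with empty strings, and rejecting longer strings are all assumptions.

```python
def split_concept(text, n_fields=5):
    """Split a space-separated concept into exactly n_fields values, padding with ''."""
    words = text.split()  # splits on any run of whitespace
    if len(words) > n_fields:
        raise ValueError(f"expected at most {n_fields} words, got {len(words)}")
    return words + [""] * (n_fields - len(words))
```

A three-word concept therefore fills the first three fields and leaves the last two empty.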

Hi all,   As you may know, the Clementine Report node can be used to generate reports and export tabular data formatted in some custom way.  By including HTML code within the Clementine Report node you can also display rendered HTML on screen or export to file.   I've attached an advanced example that exports a small sample of data using Clementine's Report node. Inside the Report node edit window I have written HTML code that includes reference ...

Read More

: Data Mining    : TimManns    :2 replies

The new features added to Clementine version 9.0 (recently released) are described in the link below. Also pasted below. http://www.spss.com/clementine/whats_new.htm Check out the SPSS Clementine homepage: http://www.spss.com/clementine/ For add-on features including Text Mining and Web Mining: http://www.spss.com/text_mining_for_clementine/ http://www.spss.com/web_mining_for_clementine/ What’s New in Clementine 9.0? Clementine is widely regarded as the leading enterprise dat ...

Read More

: Data Mining    : TimManns   

Hi, Clementine users frequently want to automate the process of building a new model, testing that model and replacing an existing model if the new model is more accurate. The attached Clementine stream demonstrates how this can be done. The stream contains a script that automates the process of building a C5.0 model, testing and replacing an existing model. A basic method to measure the accuracy of a model is used to evaluate the models. Simple 'if the ...

Read More

: Data Mining    : TimManns    :2 replies

Hi, I am a civil engineering student doing demand forecasting as my final year project. I built the DSS system in SPSS, but I have to present it in VB, and I do not know how to link VB with SPSS. I have only 15 days to submit my thesis. Please help me and let me know how to link them together. Thank you ...

Read More

: Data Mining    : mck   

I know all of you are Visual Basic professionals, so I need some opinions from you because I'm new to Visual Basic. I need to build a Decision Support System (DSS) that can produce things like forecasts and analysis reports, and it needs to use SPSS or SAS to generate the charts or reports. I don't know which database to link with Visual Basic, either MS SQL Server or Oracle. Which one is the better back-end database to work with Visual Basic and SPSS? Thank y ...

Read More

: Data Mining    : skyloon    :1 reply

I have a question; please don't think it silly..... I took the Intro To PASW (Clementine 12.0), which didn't go into much detail about building models. In fact, by the third day of the class, we were a little more than halfway through the training materials. Is there a source for learning to build models using Clementine other than taking a course from SPSS? Does anyone have a model/stream that they've created that also includes a description of the pr ...

Read More

: Data Mining    : ydolemuq    :8 replies

I'm still learning Clementine, but does anyone have a stream that predicts whether a person or persons will respond to a survey in the future that they can share? I have a stream, but I'm not getting the results I thought I should get. I'm using the binary classifier/ensemble nodes, producing about seven different models, and four of them failed to complete. Maybe I don't know what I'm doing, but I just need something to ...

Read More

: Data Mining    : ydolemuq    :1 reply

Hi, I am working on the linear regression of a series of numbers, and during my research I came across your site and looked at your stored procedure. It looks good and the only question I have is about the calculations you are using.
' N : the number of elements in the sample, e.g. the count of the independent variable X
'
' A = (SUM(Y)*SUM(X^2)) - (SUM(X)*SUM(X*Y))/ (N*SUM(X^2)) - (SUM(X)^2)
'
' B = N*SUM(X*Y) - (SUM(X)*SUM(Y)) /N*SUM(X^2)-(SUM(Y)^2)
'
' ...

Read More

: Data Analysis and Statistics    : john    :13 replies
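
For comparison with the formulas quoted in the post above, here is a minimal sketch of the textbook closed-form least-squares fit of y = A + B*x. It is not the stored procedure the poster refers to; the function name `fit_line` is hypothetical. In the textbook form, both A and B share the denominator N*SUM(X^2) - SUM(X)^2, and each numerator is fully parenthesised before the division.

```python
def fit_line(xs, ys):
    """Return (intercept A, slope B) of the ordinary least-squares line y = A + B*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x ** 2  # shared denominator of A and B
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    intercept = (sum_y * sum_xx - sum_x * sum_xy) / denom
    slope = (n * sum_xy - sum_x * sum_y) / denom
    return intercept, slope
```

Checking a candidate formula against a small data set with a known line, such as points lying exactly on y = 1 + 2x, is a quick way to spot a misplaced parenthesis or a swapped X and Y.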

Hi, Can nonlinear regression (curve fitting) be performed in Clementine? If so, what is the procedure to follow? Thanks ...

Read More

: Data Analysis and Statistics    : manjumc    :4 replies
