24 August 2008
How can I read a large amount of Persian data / 'varchar' and 'text' (non-numerical) data?

Hi everyone, I want to read my Persian data with Clementine 11.1.

-          When I select the 'Excel Node' as my source node, it can't read more than about 250 rows.

-          When I select the 'Database Node' as my source node and connect to MySQL through ODBC, it can't read data of type 'varchar' or 'text' (and not only Persian data).

How can I get around these two problems?

Many thanks

21 August 2008
Clementine 12. client version TLA node error

Hello

 

Has anyone experienced TLA execution failures with large files?

Error:

Error: Text Link Analysis Error: Exception: SessionManager: Cannot remove temporary directory - Memory : 628899kb - Memory peak : 663449kb
Error: Text Link Analysis Execution Error
Information: Stream execution complete, Elapsed=1315.48 sec, CPU=1313.19 sec
Warning: Execution was interrupted

 

What I am seeing are massive "matching text" cells within records.

 

Does anyone know how to prevent this or, better yet, fix it?

 

Support punted to development, then crickets...

 Thanks

 

19 August 2008
Migrating to WEKA...

Hello, I may have to migrate from Clementine to WEKA one of these days. Has anyone had experience with this program? I'm downloading it now and will start testing it, but I have no idea what to expect.

Any comments/opinions/suggestions will be very useful, thank you.

A.

PS: I don't intend to open Clementine streams in WEKA ("migrate" can be misinterpreted); I intend to do the same things I am doing with Clementine, but in WEKA.

19 August 2008
Data mining intrusion detection

Hi,

 I am working on intrusion detection in computer networks.

I need some help with how to start analysing patterns in Clementine using the attack data I have.

It's in the format time, src ip, dst ip, src port, dst port, protocol, flags... I want to know how to start the data mining process.

 cheers

18 August 2008
balance node pushback as SQL (stratified sampling)


You are probably familiar with the Balance node. It performs the function of selectively and randomly sampling your data based upon the values of a field or number of fields. Also known as stratified sampling!

If your data is managed by a data warehouse, then Clementine has this cool behaviour of automatically converting functions into SQL, so the data processing can be performed by the database and less data needs to be extracted and duplicated on another file system.

Unfortunately the Balance node isn't one of the functions automatically converted into SQL. In order to perform stratified sampling you have to take a different approach and selectively pick the values of your target column/field and sample them individually.

I attached one Clementine version 12.0.2 stream (balance node.zip, rename to .str) as one example of how to do this. By using a select condition, followed by a random sample, followed by a union (append) it is possible to easily obtain a stratified sample from a huge dataset efficiently.

I have also pasted below an example of the type of simple SQL that gets processed;

SELECT *
FROM (
    SELECT *
    FROM (
        SELECT *
        FROM IPSHARE.TMANNS_DRUG4n
        WHERE (Drug = 'drugA')
        SAMPLE 0.5
    ) AS TimTemp1
    UNION ALL
    SELECT *
    FROM (
        SELECT *
        FROM IPSHARE.TMANNS_DRUG4n
        WHERE (Drug = 'drugX')
        SAMPLE 0.2
    ) AS TimTemp2
) AS TimTable
;
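
For anyone who wants to check the logic outside the database, the same select / sample / append idea can be sketched in a few lines of Python. This is only an illustration: the function name `stratified_sample`, the `fractions` dictionary and the toy rows are assumptions made up for the example, not anything Clementine generates.

```python
import random

def stratified_sample(rows, key, fractions, seed=0):
    """Select each stratum, sample it at its own rate, then append the results."""
    rng = random.Random(seed)
    sampled = []
    for value, fraction in fractions.items():
        # "Select" step: keep only the rows that belong to this stratum.
        stratum = [row for row in rows if row[key] == value]
        # "Sample" step: draw a fixed share of the stratum without replacement.
        n = round(len(stratum) * fraction)
        # "Append" step: add this stratum's sample to the combined output.
        sampled.extend(rng.sample(stratum, n))
    return sampled

# Toy data (hypothetical): 100 rows of drugA and 100 rows of drugX.
rows = [{"Drug": "drugA", "id": i} for i in range(100)] + \
       [{"Drug": "drugX", "id": i} for i in range(100, 200)]
sample = stratified_sample(rows, "Drug", {"drugA": 0.5, "drugX": 0.2})
counts = {d: sum(1 for r in sample if r["Drug"] == d) for d in ("drugA", "drugX")}
print(counts)
```

The per-value fractions play the same role as the SAMPLE 0.5 and SAMPLE 0.2 clauses in the SQL.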

Cheers

Tim

14 August 2008
Script matrix node

Hi,

 

Is it possible to script a single .csv export from the Matrix node that crosses each set or ordered set field with a field named "target" (a set field)? If that's not possible, could the results be exported to separate .csv files?

 Could you give me an example, or the solution? ;-)

 

Thank you very much

Hugo

13 August 2008
Using scripting in batch mode, how to ignore error interruptions until all actions are completed

Hi,

     When I run a stream that contains CHAID and Logistic nodes in batch mode, no model is created for CHAID, so execution is interrupted and the remaining actions are not run.
 
     ======== Clem_batch log file shows as follows: ==========
Information: Stream execution started
E3138: Stopping rules prevent any tree growth
Error: Model building completed but no model was created
Information: Stream execution complete, Elapsed=16.55 sec, CPU=14.14 sec
Warning: Execution was interrupted
Error: Script execution was interrupted on line 59 column 3
Failed to run all actions, error code: 1
 
   ==========================================================

   My question is:
        Using scripting in batch mode, how can I ignore the error interruption and automatically skip to the next node, so that the run does not stop until all actions are completed? And how can I get the error code from the script? (By the way, there is no problem with my data: it has no nulls and no blanks, and the data set is complete.)

     Thanks a lot!!

Jiang

10 August 2008
Need Help: Using scripts, how to select and generate model nugget(s) in Binary Classifier node output

Hi,

  Can you tell me:
      Using scripts, how do I select and generate one or more model nuggets from the Binary Classifier Results output produced by the automated Binary Classifier model in SPSS Clementine v11.1?
     Looking forward to hearing from you.
 
Thanks!

06 August 2008
I started a blog...

fyi

I'll try to write about Clementine and data mining related activities in my Blog, but other stuff will be in there too.

The blog is just my ramblings. Not a replacement for my input to this forum!

http://timmanns.blogspot.com/

Cheers

Tim Manns

 

29 July 2008
SOMs UI

Hi everyone, I just came across this site. Apologies if this is the wrong forum to post this question in.


I'm having a problem with self organising maps. I understand the theory and algorithms but I'm not sure how to go about implementing it in code (C#). I've seen a couple of examples on http://www.ai-junkie.com/ann/som/som1.html and http://davis.wpi.edu/~matt/courses/soms/#Java
but neither is in C# (which is the only language I'm familiar with). This isn't a problem algorithmically, as that's easy to convert, but I'm having great difficulty on the user interface side. What's the best way of testing my SOM code? I don't know where to start with the interface. There's no need for code (unless you have C# code), just the theory of what's actually happening on those sites and how I should be rendering the nodes etc.
Thanks to anyone who can shed any light on this. I've been trying to figure it out for weeks, and I just can't get started :(
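
One common way to test a SOM without much of an interface is to train it on RGB colours and then draw each grid node as a cell coloured by its weight vector; similar colours should end up next to each other. Below is a minimal sketch of that training loop in Python (not C#, but it converts line for line). The names `train_som` and `best_matching_unit`, the decay schedule and the grid size are assumptions chosen for illustration, not taken from either of the linked sites.

```python
import math
import random

def best_matching_unit(grid, sample):
    """Return (row, col) of the node whose weights are closest to the sample."""
    best, best_dist = (0, 0), float("inf")
    for y, row in enumerate(grid):
        for x, node in enumerate(row):
            dist = sum((w - s) ** 2 for w, s in zip(node, sample))
            if dist < best_dist:
                best, best_dist = (y, x), dist
    return best

def train_som(data, width, height, epochs=200, lr0=0.5, seed=0):
    """Train a width x height map; each node holds one weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised randomly in [0, 1).
    grid = [[[rng.random() for _ in range(dim)] for _ in range(width)]
            for _ in range(height)]
    radius0 = max(width, height) / 2.0
    for t in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time.
        decay = math.exp(-t / epochs)
        lr, radius = lr0 * decay, radius0 * decay
        sample = rng.choice(data)
        by, bx = best_matching_unit(grid, sample)
        for y in range(height):
            for x in range(width):
                d2 = (x - bx) ** 2 + (y - by) ** 2
                # Gaussian neighbourhood: nodes near the winner move the most.
                influence = math.exp(-d2 / (2 * radius ** 2))
                node = grid[y][x]
                for i in range(dim):
                    node[i] += lr * influence * (sample[i] - node[i])
    return grid

colours = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
grid = train_som(colours, width=6, height=6)
for colour in colours:
    print(colour, "->", best_matching_unit(grid, colour))
```

For the rendering side, each `grid[y][x]` is already an RGB triple in the range 0 to 1, so drawing the map is just filling one rectangle per node with that colour.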

29 July 2008
Internal Error: Stream optimisation failed and is switching off

Does this error mean there is a problem with the structure of the stream?

 

Cheers,

27 July 2008
how to deploy models?

hi,

Once we have created a model, how do we deploy it?

We have not purchased the SPSS Predictive Enterprise solution. What other ways are there to deploy the model in the data warehouse?
 

24 July 2008
HELP: Where to get CATs (Clementine Application Templates)

Evening!

I have a version of Clementine without CATs, and I want to try them.

So, where can I get CATs?

Thank you!

22 July 2008
Quick way to remove outliers?

Hi everyone, I have been using the "Action -> Discard" function included in the Data Audit node, but the problem is that this has to be done individually for each field... isn't there a way to do it automatically? I tried selecting many fields and setting the option, but it still changes only one. I looked in the manual for help but found nothing...
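
To make clear what "discard for every field at once" would mean, here is a minimal Python sketch of the logic outside Clementine. It assumes an outlier is any value more than `k` standard deviations from its field's mean; that definition, the function name `discard_outliers` and the toy records are all assumptions for illustration and may not match your Data Audit settings.

```python
from statistics import mean, pstdev

def discard_outliers(records, fields, k=3.0):
    """Drop every record with a value more than k standard deviations from its field's mean."""
    limits = {}
    for f in fields:
        values = [r[f] for r in records]
        m, s = mean(values), pstdev(values)
        # Acceptable range for this field, computed once over all records.
        limits[f] = (m - k * s, m + k * s)
    # Keep a record only if it is inside the range for every listed field.
    return [r for r in records
            if all(limits[f][0] <= r[f] <= limits[f][1] for f in fields)]

# Toy data (hypothetical): one extreme value in field "a".
records = [{"a": 10.0, "b": float(i % 5)} for i in range(20)]
records.append({"a": 1000.0, "b": 2.0})
kept = discard_outliers(records, ["a", "b"])
print(len(records), "->", len(kept))
```

Note that the limits are computed from the full data set before anything is dropped, so one pass handles all fields together.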

 

Clem 11.1 btw...

 

Thanks

22 July 2008
How to refer to a node in a stream script if the node is part of a supernode?

Hi,

 I couldn't find any related information in the help file. If my stream has a supernode S_Node (non-terminal), and a filter node Remove_Sum_Filter1 is contained in the supernode, how can I run the stream script successfully?

 I get a "can not find node" error. If the node is not in a supernode, the script runs smoothly.

 

Many Thanks,

# the script
# a loop to rename fields (shorten field names)
# first make a temp variable to hold the field name
var tim
# start loop, for each field in the filter node named "Remove_Sum_Filter1"
for tim in_fields_at Remove_Sum_Filter1:filternode
  # if this field ends in "_Sum" then do something
  if hasendstring(^tim, "_Sum") then
    # remove the last 4 letters
    set Remove_Sum_Filter1:filternode.new_name.^tim = allbutlast(4, ^tim)
  else
    # otherwise do nothing
  endif
# end the loop
endfor

22 July 2008
Calculating model's Gini coefficient in Clementine

Hi

Does anyone know of a quick way to calculate a model's Gini Index in Clementine (v10) to evaluate the discriminatory power of the generated model?
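
For reference, the Gini coefficient used to measure a scoring model's discriminatory power is usually taken as 2 x AUC - 1, where AUC is the area under the ROC curve. A minimal Python sketch of that calculation, useful for checking whatever Clementine produces, is below. The function name `gini_coefficient` and the toy scores are assumptions for illustration; the pairwise loop is simple but slow on large data.

```python
def gini_coefficient(labels, scores):
    """Gini = 2 * AUC - 1, with AUC computed by comparing every positive/negative pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative record")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    auc = wins / (len(positives) * len(negatives))
    return 2.0 * auc - 1.0

# Toy example (hypothetical): 1 = target, score = model confidence in the target.
print(gini_coefficient([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1]))
```

A Gini of 1 means the model ranks every target above every non-target, and 0 means it ranks no better than chance.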

Many Thanks,

Ronan

 

17 July 2008
Data mining survey
I’m doing a survey of data mining clustering methods, so I need some clustering algorithms. I have already found some in RapidMiner, but I need more. Can anyone help me with the source code of these algorithms (in Java if possible)?
Here is a list of some that I’m thinking of using:
 
CLARA
CLARANS
PAM
SNOB
BIRCH
SMTIN
STING
 
I appreciate any help, thanks in advance.

17 July 2008
How can I read values from output file(*.cou) in a script?

Hi,

I'm running a loop in a script. The number of iterations and the values used in the loop are determined by an output table.
For example,
####################################################
~~~~~
set TABLE.output_mode=Screen
execute TABLE

var count_c
set count_c=TABLE.output.row_count

for c from 1 to ^count_c

var value_c
set value_c=value TABLE.output at ^c 2

create selectnode at ^c*100 100
set selectnode.custom_name = ^value_c
~~~~~~~

endfor
####################################################

It works very well with the Clementine client, but the problem is that I have to run this script on Clementine Server in batch mode.
When I run the script from a shell file, the outputs are of course not shown on screen; they are just saved as a file (*.cou), and I cannot read the output values.

Is there any way to read values from output files?
 

13 July 2008
Filler to count the presence of values > 0 in multiple fields?

Hi dear,

 

If I have multiple fields like:

CustID  Items_A_Bought  Items_B_Bought  ...  Items_Z_Bought
John    1               7                    0
Jack    0               0                    1
Jess    2               2                    2

 And I want to calculate how many types of items they bought:

John  2 (A and B)

Jack 1 (C)

Jess 3 (A,B, C)

 The way I can think of is to use a Filler node to replace values where @FIELD > 0 with 1, then use sum_n(@FIELDS_BETWEEN(Items_A_Bought, Items_Z_Bought)). Is there any other way to do this?
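
To make the intended result concrete, here is the same calculation as a small Python sketch: count, per customer, how many item fields hold a value greater than zero. The function name `count_item_types` is hypothetical, and only the three fields shown in the table are used (the "..." columns are left out).

```python
def count_item_types(record, item_fields):
    """Count how many of the given fields hold a value greater than zero."""
    return sum(1 for field in item_fields if record[field] > 0)

item_fields = ["Items_A_Bought", "Items_B_Bought", "Items_Z_Bought"]
customers = [
    {"CustID": "John", "Items_A_Bought": 1, "Items_B_Bought": 7, "Items_Z_Bought": 0},
    {"CustID": "Jack", "Items_A_Bought": 0, "Items_B_Bought": 0, "Items_Z_Bought": 1},
    {"CustID": "Jess", "Items_A_Bought": 2, "Items_B_Bought": 2, "Items_Z_Bought": 2},
]
result = {c["CustID"]: count_item_types(c, item_fields) for c in customers}
print(result)
```

This does the flag-and-sum in one step, which is the same idea as the Filler followed by sum_n, just without overwriting the original counts.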

Thanks,

12 July 2008
Nearest-neighbor classification
Can you implement nearest-neighbor classification in Clementine?  Has anyone done it?
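
For anyone unsure what is being asked for, nearest-neighbor classification assigns a new record the majority class of the k training records closest to it. A minimal Python sketch follows; it says nothing about how to do this inside Clementine, and the function name `knn_classify`, the Euclidean distance and the toy points are assumptions for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Predict the majority label among the k training points closest to query."""
    # Sort training points by Euclidean distance to the query and keep the first k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set (hypothetical): two well-separated groups.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_classify(train, (0.5, 0.5)))
print(knn_classify(train, (5.5, 5.5)))
```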
12 July 2008
data mining in C#
Hello, I need your help. I have an assignment about prediction in data mining, and the problem is that I'm new to both C# and data mining. I tried an example from the SQL Server site, but when I put it into my project it didn't work: an SQL exception was thrown. Here is my code:
using System;
using Microsoft.AnalysisServices;
using System.Data;
using System.Data.Common;
using System.Data.OleDb;

namespace AMOSampleForDM
{
    class Program
    {
        // Creates an Analysis Services database with the given name.
        // Static so that it can be called directly from Main.
        public static Database CreateDatabase(Server srv, string databaseName)
        {
            Database dbNew = new Database(databaseName, Utils.GetSyntacticallyValidID(databaseName, typeof(Database)));

            // Remove any existing entry with the same name from the server's database collection.
            if (srv.Databases.ContainsName(databaseName))
                srv.Databases.RemoveAt(srv.Databases.IndexOfName(databaseName));

            srv.Databases.Add(dbNew);
            dbNew.Update(true);

            return dbNew;
        }

        [STAThread]
        static void Main(string[] args)
        {
            try
            {
                Server srv = new Server();

                // Connect to the Analysis Services instance on the local machine.
                srv.Connect("localhost");

                Database db = CreateDatabase(srv, "Test");
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }
}

The exception says to make sure that the "SQL Browser" service is running, and that a connection cannot be made because the target machine actively refused it. I'm really confused, because I have made sure that both the SQL Browser and the server are running.
Could anyone help me with this problem, or give me another example of data mining in C#? Please, I really need your help...

thanks a lot,

angela
11 July 2008
Nonlinear regression

Hi

Can nonlinear regression (curve fitting) be performed in Clementine? If so, what is the procedure to follow?

Thanks

10 July 2008
Unknown data value in symbolic field


Hi Dear All,

 I am trying to run a Sequence node over about 20,000,000 rows of transaction-format data. The ID field (Customer ID) is a unique string, so I had to specify it as "typeless" (which is also what the Type node decides after it reads the data). The time field and content field are fine, and their values appear in the Type node. However, I receive this error:

Unknown data value in symbolic field
Check the type for this field is correctly instantiated?


If I reduce the amount of data by sampling to, say, 100,000 rows, it works, possibly because the max set size is not exceeded and the field is therefore recognized as a set.

I had to increase Max Set Size to a crazy figure and set Customer ID as a set with <read+>, and now it seems to be working.

Why can the CARMA node (reading a typeless Transaction ID) work without this problem?


Many Thanks,

05 July 2008
Perl source code for document classification , Naive Bayes Classifier

Text Document Classification Using Naive Bayes Classifier with Perl. Source code is available.
03 July 2008
Why does Clementine 11 sometimes use both of my cores and sometimes not?

Hi everyone, I have an AMD X2 4200+ with the optimizer installed, and I have noticed Clementine sometimes uses both cores and sometimes it doesn't. It is quite annoying because when using both the processing speed literally doubles.

 Rebooting the PC or restarting Clementine sometimes fixes this, but why does it happen?

Thank you!
