Orange Forum • View topic - Problems with running big datasources

Postby Max » Tue Mar 20, 2007 11:21

I've been having problems with running analysis on large datasets.

Is Orange meant for this kind of analysis or not?

Postby Janez » Tue Mar 20, 2007 17:57

Your question is not particularly specific...

How many attributes do you have and of which kinds, how many examples, and which methods do you want to use?

Janez

Database

Postby Max » Fri Mar 23, 2007 9:19

My data sources can be really large (around 300,000-500,000 records).

The variables are categorical and numerical. I expect to build models: logistic regression, multiple regression, trees, C4.5, neural networks. I will probably also do segmentation, for which I will need classification models.

The problem came up because of the large databases, which made the program freeze several times.

Thanks for your help

English

Postby Max » Fri Mar 23, 2007 9:20

Sorry for my Slovene;-)

Attributes

Postby Max » Fri Mar 23, 2007 10:31

For some analyses I use about 10-20 variables, while others could include more than 100 attributes (mainly numeric).

Postby Janez » Fri Mar 23, 2007 13:44

Orange probably won't swallow 300-500 thousand examples. The largest I tried was 100K examples with 30 attributes, or one thousand examples with 12K attributes. Your matrix is just too large, I think.

Janez
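To put the sizes mentioned in this thread side by side, here is a rough back-of-the-envelope sketch in plain Python. It only counts matrix cells (examples times attributes). The 8 bytes per value used for the memory figure is an assumption made for illustration and says nothing about how Orange actually stores data; the function names are made up for this example.

```python
def cells(n_examples, n_attributes):
    """Number of values in an examples-by-attributes data matrix."""
    return n_examples * n_attributes


def approx_megabytes(n_examples, n_attributes, bytes_per_value=8):
    """Rough size of the raw matrix alone.

    bytes_per_value=8 is an assumed figure (one double per cell),
    not a statement about Orange's internal representation.
    """
    return cells(n_examples, n_attributes) * bytes_per_value / 1e6


if __name__ == "__main__":
    # Sizes taken from the posts above.
    cases = {
        "100K examples x 30 attributes": (100_000, 30),
        "1K examples x 12K attributes": (1_000, 12_000),
        "500K examples x 100 attributes": (500_000, 100),
    }
    for name, (n, m) in cases.items():
        print(name, cells(n, m), "cells,", approx_megabytes(n, m), "MB raw")
```

The raw cell count is only a lower bound on what a learner needs, since most methods keep additional working structures in memory while training.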

Postby Blaz » Tue Apr 03, 2007 19:29

I doubt that any standard package can handle millions of records. There are dedicated packages and methods (I have never used them, though) that do disk-based data mining. If your data sets are that big, with perhaps not too many attributes, learning should work equally well on carefully sampled (stratified) data.

If you do find a package that does something like neural network learning on millions of records in bearable time, please let us know.
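The stratified sampling suggested above could look roughly like the sketch below. It is plain Python using only the standard library, not Orange's own sampling API; the names `stratified_sample`, `label_of` and `fraction` are made up for this example, and it assumes the data fits in memory as a list of rows at least long enough to draw the sample.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_of, fraction, seed=0):
    """Draw about `fraction` of the rows from each class separately,
    so the sample keeps roughly the class proportions of the full data.

    rows     -- list of records (any objects)
    label_of -- function returning the class label of a record
    fraction -- share of each class to keep, between 0 and 1
    """
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # Sample each class on its own; keep at least one row per class.
    sample = []
    for members in by_class.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))

    rng.shuffle(sample)  # avoid leaving the rows grouped by class
    return sample


if __name__ == "__main__":
    # Toy data: 900 records of class "a", 100 of class "b".
    data = [(i, "a") for i in range(900)] + [(i, "b") for i in range(100)]
    subset = stratified_sample(data, label_of=lambda r: r[1], fraction=0.1)
    print(len(subset), sum(1 for r in subset if r[1] == "b"))
```

Sampling per class rather than over the whole table matters most when one class is rare, because a plain random sample of the same size could end up with very few or no examples of it.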

