# Could anyone recommend an interactive application for data analysis?

Discussion Starter: bestbird7788

0

I need to perform a large amount of data analysis on a database. Could anyone recommend an interactive application for data analysis?

The requirements are:

1. Able to cope with unexpected requirements rapidly.

2. Able to perform further computations on earlier results interactively.

3. Able to handle even a large number of complex computations easily.

What would you experts recommend?

Thanks in advance.

0

Matlab, Wolfram, and other math packages would be suitable for this. There are also SAS and SPSS (now owned by IBM, I think). These are all very expensive, but what you are asking for is not simple stuff. I deal with real-time data analytics using many engineering techniques, including Kalman filters, statistical sampling, Monte Carlo methods, etc., in order to extract meaningful information from masses of data (millions of data points per day). We are in the process of working up machine learning techniques (I'm starting to learn the R language for that), including neural networks, fuzzy logic, and simulated annealing methods, in order to better understand how our systems behave (supporting millions of concurrent users worldwide). So, what I am saying is that there is no quick path to this. Complexity != simple, or easy! Also, if you don't understand the underlying math, you will never be able to know whether you are getting valid information out of your data sets.

FWIW, I spent a few years writing real-time risk analysis software for the options trading industry in Chicago, and that (lots of 3rd order differential equations applied to massive data flows) was simple compared to what I am doing now!

*Edited 4 Years Ago by rubberman*

0

Wolfram

Thank you for your reply. You have given me a great deal of information :-), and I think you must be a professional in large-scale data analysis. I'm glad to meet an expert.

I know a little about SPSS and SAS. They may be too expensive for me and too difficult to understand.

BTW, they seem better suited to "tell me what happened" than to "let me find the reason". I mean they hide many details. (That is my guess; it may not be correct.)

Some of my computation goals are:

a. Select the 10 best-selling categories.

b. As a further computation based on the result of a., select the top 20 products from each of those categories.

c. As a further comparison with last year, also based on the result of a., find the categories that newly appeared on this year's list and those that disappeared from it.

Could you narrow down the choices based on the examples above, to something that is not too expensive?

BTW, I know Excel well and a little VBA, but Excel is not a good choice because my data is updated daily, while Excel is better suited to fixed data.
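Goals a. to c. can be worked through step by step, with each step reusing the result of an earlier one. The following is only a minimal sketch in plain Python, not a recommendation of a specific product. The sample rows, the `category` field, the years 2013 and 2014, and all function names are hypothetical, and small limits (top 2) stand in for the top 10 and top 20 of the real task.

```python
from collections import defaultdict

# Hypothetical sample rows: (category, product, year, sales value).
rows = [
    ("toys", "robot", 2013, 50), ("toys", "kite", 2013, 30),
    ("books", "novel", 2013, 60), ("games", "chess", 2013, 20),
    ("toys", "robot", 2014, 70), ("toys", "kite", 2014, 10),
    ("toys", "doll", 2014, 40), ("games", "chess", 2014, 90),
    ("games", "cards", 2014, 5), ("books", "novel", 2014, 30),
]


def category_totals(rows, year):
    """Sum the sales value per category for one year."""
    totals = defaultdict(float)
    for category, _product, row_year, value in rows:
        if row_year == year:
            totals[category] += value
    return dict(totals)


def top_keys(totals, n):
    """Return the n keys with the largest totals, best first."""
    return sorted(totals, key=totals.get, reverse=True)[:n]


def top_products(rows, year, categories, n):
    """For each of the given categories, return its n best-selling products."""
    per_category = defaultdict(lambda: defaultdict(float))
    for category, product, row_year, value in rows:
        if row_year == year and category in categories:
            per_category[category][product] += value
    return {cat: top_keys(prods, n) for cat, prods in per_category.items()}


# a. Best-selling categories this year (top 2 here, top 10 in the real task).
best_now = top_keys(category_totals(rows, 2014), 2)

# b. Reuse result a.: the top products inside each of those categories.
best_products = top_products(rows, 2014, set(best_now), 2)

# c. Reuse result a. again: compare with last year's list.
best_before = top_keys(category_totals(rows, 2013), 2)
new_categories = set(best_now) - set(best_before)
gone_categories = set(best_before) - set(best_now)
```

Because `best_now` is kept as an ordinary variable, steps b. and c. can be run, inspected, and changed without recomputing step a. Any interactive tool that lets you keep intermediate results in this way should serve the same purpose.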

0

BTW

All my goals are commercial and practical. There may be no statistical methods such as "linear regression" involved.

0

Hi, rubberman

I'm so grateful for your help.

I'm willing to pay for the right tool, including SPSS if it is useful.

I know it's hard to make a suggestion, or for me to make a choice, without a concrete example to discuss.

So I will give a concrete example below (prepared with my friend's help, as it is somewhat difficult). Please suggest how to use SPSS or any other solution to solve the same problem (the more detail, the better).

For example, I need to find the products whose annual sales are among the top 100 in every year.

MSSQL data structure (fields of the sales table): productID, time, value

SQL solution is as below:

```
-- Total sales per product per year
WITH sales1 AS (
    SELECT productID, YEAR(time) AS year, SUM(value) AS value1
    FROM sales
    GROUP BY productID, YEAR(time)
)
SELECT productID
FROM (
    -- Keep only the rows ranked in the top 100 of their year
    SELECT productID
    FROM (
        SELECT productID,
               RANK() OVER (PARTITION BY year ORDER BY value1 DESC) AS rankorder
        FROM sales1
    ) T1
    WHERE rankorder <= 100
) T2
GROUP BY productID
-- A product qualifies only if it made the top 100 in every year
HAVING COUNT(*) = (SELECT COUNT(DISTINCT year) FROM sales1)
```
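For comparison, the same logic can be split into named steps whose intermediate results stay available for further computation. This is a minimal sketch in plain Python, not the SPSS solution asked for. It assumes the rows have already been reduced to hypothetical `(productID, year, value)` tuples, and the function names are made up for illustration. Unlike `RANK()`, it cuts ties at exactly `n` products per year.

```python
from collections import defaultdict

# Hypothetical sample rows mirroring the sales table: (productID, year, value).
sales = [
    (1, 2012, 100), (2, 2012, 80), (3, 2012, 10),
    (1, 2013, 90), (3, 2013, 70), (2, 2013, 20), (1, 2013, 5),
]


def yearly_totals(sales):
    """Step 1: total sales per product per year (the sales1 CTE)."""
    totals = defaultdict(lambda: defaultdict(float))
    for product_id, year, value in sales:
        totals[year][product_id] += value
    return totals


def top_n_each_year(totals, n):
    """Step 2: map each year to the set of its n best-selling products."""
    return {
        year: set(sorted(by_product, key=by_product.get, reverse=True)[:n])
        for year, by_product in totals.items()
    }


def always_in_top(sales, n):
    """Step 3: products that appear in the top n of every year."""
    yearly_top = top_n_each_year(yearly_totals(sales), n)
    if not yearly_top:
        return set()
    return set.intersection(*yearly_top.values())


# Top 2 stands in for the top 100 of the real task.
steady_sellers = always_in_top(sales, 2)
```

With the sample rows, only product 1 is in the top 2 in both years. The per-year sets from step 2 can also be inspected on their own, for instance to see which products dropped out in a given year.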
