Wednesday, March 21, 2012

Linear regression with a nested explanatory variable

We are trying to create a linear regression model with a nested table. We used the CREATE MINING MODEL syntax as follows:

create mining model rate_plan3002_nested2

( CUST_cycle LONG KEY,

VOICE_CHARGES double CONTINUOUS predict,

DUR_PARTNER_GRP_1 double regressor CONTINUOUS ,

nested_taarif_time_3002 table

( CUST_cycle long CONTINUOUS,

TARIFF_TIME text key,

TARIFF_VOICE_DUR_ALL double regressor CONTINUOUS

)

) using microsoft_linear_regression

INSERT INTO MINING STRUCTURE [rate_plan3002_nested2_Structure]

(CUST_cycle ,

VOICE_CHARGES ,

DUR_PARTNER_GRP_1 ,

[nested_taarif_time_3002](SKIP,TARIFF_TIME ,TARIFF_VOICE_DUR_ALL)

)

SHAPE {

OPENQUERY([Cell],

'SELECT CUST_cycle ,

VOICE_CHARGES ,

DUR_PARTNER_GRP_1

FROM dbo.panel_anality_3002

order by CUST_cycle ')}

APPEND

({OPENQUERY([Cell],

'select CUST_cycle,

TARIFF_TIME,

CYCLE_DATE

from dbo.nested_taarif_time_3002

order by CUST_cycle,TARIFF_TIME')

}

relate CUST_cycle to CUST_cycle

) as nested_taarif_time_3002

The result we got is a model with an intercept only. If we don't use the nested variable (the red line), we get a correct model. (We had more variables ....)

Is there a way to do this regression correctly?

Thanks,

Dror

Hi Dror,

You could remove the "regressor" flag from the nested table column (in the CREATE MINING MODEL statement) if this column is not intended to be part of the regression equation.

Thanks,

Dana Cristofor

|||

Thanks Dana,

The problem is that I do want it to be part of the regression.

Otherwise I don't need the nested table.

Thanks

|||

I'm investigating an issue - let me work on this and get back to you.

Thanks

|||

I think I may have found the problem. I created a simple model, processed it, and got the expected regressions. I then created the same model using a nested table, processed it, and got a constant result back - very confusing. I then tried creating an additional model in the nested structure that would be identical to the first non-nested model I created, and again got no regressions - startling!

What I found out was that for some reason (possibly a bug) when I added a nested table, the wizard did not add the "regressor" flag to any of my continuous inputs. Once I manually added the regressor flag and reprocessed, I got the expected regressions in my output.

Please check the regressor flag on the model columns and let me know if this helps for you. To set the regressor flag, go to the Mining Models tab of the Data Mining designer, click the cell representing the input column under the mining model (not the mining structure) and view the properties. The regressor flag is a possible option for the mining model column.

Thanks

-Jamie

|||

Thanks Jamie,

We added the regressor flag in the Mining Models tab, and we got the same results:

When we add the regressor flag to the column in the nested table, we get an intercept-only model.

If we put the regressor flag on the nested table only, and not on the column of the table, we get the same regression as if we didn't use the nested table.

Is there a way to solve it?

We would also like to run the model from the DMX query window of Management Studio. Is there a way to get the script of the model from Visual Studio?

Thanks,

Dror

|||

I wonder if it's possible that there's too much "noise" with the nested table? I've tried with a degenerate case to prove that there's no obvious bug in the software preventing nested regressors from working. Can you also try the same model with the decision tree algorithm and see what happens?
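For reference, a minimal sketch of what that decision tree variant might look like in DMX. The column list is copied unchanged from the statement in the original post; only the model name (rate_plan3002_nested2_dt, a hypothetical name chosen for illustration) and the algorithm are different. It is an untested sketch, not a verified fix.

```
// Hypothetical model name; columns are copied as-is from the original post
create mining model rate_plan3002_nested2_dt
( CUST_cycle LONG KEY,
  VOICE_CHARGES double CONTINUOUS predict,
  DUR_PARTNER_GRP_1 double regressor CONTINUOUS,
  nested_taarif_time_3002 table
  ( CUST_cycle long CONTINUOUS,
    TARIFF_TIME text key,
    TARIFF_VOICE_DUR_ALL double regressor CONTINUOUS
  )
) using microsoft_decision_trees
```

Because CREATE MINING MODEL creates its own mining structure, this model would need to be trained separately, presumably with the same INSERT INTO ... SHAPE statement pointed at the new model or its structure.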

You can get the DMX form of the model by following the instructions in the Tips and Tricks article here: http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/3652.aspx
