Friday, March 23, 2012

Link analysis

Does Microsoft plan to extend the number of data mining algorithms in Analysis Services (AS) in future releases? The question is motivated by the task of so-called "link analysis", where one has to determine how data attributes are related to each other, or how and to what extent they influence each other in probabilistic terms. A good solution would be to build a Bayesian network, which gives insight into how data attributes are related by means of a directed acyclic graph. But this approach is not yet implemented in AS2005.

Existing algorithms such as association rules or decision trees could be used, but they are far from ideal for this task: association rules are designed to find frequent sets of boolean items of the form Attribute=Value, and decision trees work well for classification but by design perform poorly at revealing direct and indirect influence between attributes.
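To make the limitation concrete, here is a minimal Python sketch of pairwise association rules over attribute=value items. It is not how Analysis Services implements its association algorithm; the toy rows, the function name pairwise_rules and the min_support parameter are all hypothetical and exist only for illustration. It computes support, confidence and lift for each frequent pair of items.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each row maps an attribute name to its value.
rows = [
    {"Smoker": "yes", "Risk": "high"},
    {"Smoker": "yes", "Risk": "high"},
    {"Smoker": "no", "Risk": "low"},
    {"Smoker": "no", "Risk": "high"},
]


def pairwise_rules(rows, min_support=0.5):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs.

    An item is an (attribute, value) tuple. Only pairs of items are
    considered, which is a simplification of full itemset mining.
    """
    n = len(rows)
    item_counts = Counter()
    pair_counts = Counter()
    for row in rows:
        # Sorting gives every pair a stable (smaller, larger) key.
        items = sorted(row.items())
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Emit the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


for lhs, rhs, support, confidence, lift in pairwise_rules(rows):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

Note that lift comes out identical for both directions of a rule. The measure describes co-occurrence of specific values, not which attribute influences which, and it says nothing about indirect influence through a third attribute. That is the gap a Bayesian network is meant to fill.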

It would be interesting to know what algorithms and approaches Microsoft plans to develop in the future.

We cannot comment on algorithms that will appear in future versions other than what has already been announced. In SQL Server 2008 we are introducing ARIMA time series, with a default option that combines the ARTXP and ARIMA models to get the best of both approaches.

The July CTP of SQL Server 2008 has this algorithm available.

