| Employing Inductive Databases in Concrete Applications. LNCS, 2006. |
| In this paper we present the application of the inductive
database approach to two practical analytical case studies: Web usage
mining in Web logs and the analysis of financial data. For the Web domain,
we considered enriched XML Web logs, which we call conceptual
logs, produced by specific Web applications. These applications were
built using a conceptual model, namely WebML, and its accompanying
CASE tool, WebRatio. The Web conceptual logs integrate the usual
information about user requests with meta-data concerning the Web site
structure. For the analysis of financial data, we considered
the Dow Jones stock exchange index and studied its component
stocks from 1997 to 2002 using so-called technical analysis.
Technical analysis consists of identifying the relevant (graphical)
patterns that occur in the plot of a stock quote's evolution over
time, often at different time granularities. The plots highlight
correlations between distinctive variables of the stock quotes,
such as the quote trend, the percentage variation and the volume of
stocks exchanged. In particular, we adopted candle-sticks, a figurative
pattern that represents in a condensed diagram the evolution of the stock
quotes over a trading day. In technical analysis, candle-sticks have
been frequently used by practitioners to predict the trend of stock
quotes in the market.
We then apply a data mining language, namely MINE RULE, to these
data in order to identify different types of patterns. For Web data,
the most important patterns include recurrent navigation paths, the most
frequently visited page contents, and anomalies such as intrusion
attempts or harmful usage of resources. For
the financial domain, we searched for the sets of stocks that frequently
exhibited a positive daily exchange on the same days, so as to form
a collection of quotes for building customers’ portfolios; for
the candle-sticks frequently associated with certain stocks; and finally for the
most similar stocks, in the sense that they mostly presented the same
type of candle-stick on the same dates, that is, the same behaviour over
time.
The purpose of this paper is to show that exploiting the
nuggets of information embedded in the data, together with the specialised mining
constructs provided by the query languages, enables the rapid customization
of the mining procedures according to the users’ needs. Based on
our experience, we also claim that the use of queries in advanced languages,
as opposed to ad-hoc heuristics, eases the specification and the
discovery of a large spectrum of patterns. |
| R. Meo, P. L. Lanzi, M. Matera, D. Careggio, R. Esposito. Employing Inductive Databases in Concrete Applications. In: Constraint-Based Mining and Inductive Databases, selected papers from Inductive Databases and Constraint Based Mining 2004, LNCS 3848, pp. 295-327, Springer, 2006. |