This paper presents a systematic study
of the effectiveness of five different sources of contextual information for user
interest modeling. The five sources are social,
historic, task, collection, and user interaction. The study focuses on website
recommendations rather than search results. It evaluates the utility
of these five sources, and of overlaps between them, based on how effectively they
predict users’ future interests. The results demonstrate that the sources
perform differently depending on the duration of the time window used for
future prediction, and that context overlap outperforms any isolated source.
This research uses a systematic, log-based study of
numerous contextual sources for modeling user interests during web interaction.
The core task for any user modeling system is predicting future behavior, so
the informativeness of each source of contextual evidence is evaluated by how
well it predicts users’ future interests at different temporal durations. Assume that
the user has browsed to a web page and the task is to leverage context to
predict their future interests. The current page and five distinct
sources of context are evaluated:
(i) interaction: recent interaction behavior preceding the current page.
(ii) collection: pages with hyperlinks to the current page.
(iii) task: pages related to the current page by sharing the same search engine queries.
(iv) historic: the long-term interests of the current user.
(v) social: the combined interests of other users who also visit the current page.
This is the first study to systematically assess contextual variants for user
interest modeling. The research also studies the use of overlap between sources
as a stronger source of contextual signal. The findings show that the performance
of contextual variants depends on the time duration used to represent future
interests, and that overlap between contexts yields more effective interest
models than any single context alone. Understanding which sources and source combinations best predict future
user interests is critical for the development of effective website
recommendation systems.
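
The idea of using overlap between sources as a signal can be sketched roughly as follows. This is a minimal illustration, not the method used in the study: it assumes each context's interest model is just a set of topic labels, takes the overlap of two sources to be the labels they share, and scores a model by the fraction of its labels that appear in the user's future interests. The function names, the labels, and the toy data are all hypothetical.

```python
from itertools import combinations


def overlap_models(models):
    """For each pair of sources, keep only the labels both sources predict."""
    overlaps = {}
    for a, b in combinations(sorted(models), 2):
        shared = models[a] & models[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


def precision(predicted, future):
    """Fraction of predicted labels that occur in the future interests."""
    if not predicted:
        return 0.0
    return len(predicted & future) / len(predicted)


# Toy data, invented for illustration only (not results from the study).
models = {
    "interaction": {"sports", "news"},
    "task": {"sports", "travel"},
    "historic": {"news", "sports", "finance"},
}
future = {"sports"}

pairs = overlap_models(models)
for sources, labels in pairs.items():
    print(sources, sorted(labels), precision(labels, future))
```

In this toy case the overlap of the interaction and task models is narrower than either model alone, which is the intuition behind treating agreement between sources as a stronger signal.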
The primary source of data for this study was the anonymized logs of URLs
visited by users who opted in to provide data through a widely-distributed
browser toolbar. These log entries include a unique identifier for the user, a
time-stamp for each page view, a unique browser window identifier, and the URL
of the Web page visited. To remove variability caused by geographic and
linguistic variation in search behavior, only entries generated in the
English-speaking United States locale were included.
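
A log entry of this shape, and the locale filter, might be represented as in the sketch below. The four core fields follow the description above; the `locale` field, the `"en-US"` code, and the `filter_locale` helper are assumptions made for illustration, since the post does not say how locale was recorded in the logs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    user_id: str         # anonymized unique user identifier
    timestamp: datetime  # time of the page view
    window_id: str       # unique browser window identifier
    url: str             # URL of the page visited
    locale: str          # hypothetical field; not listed in the source


def filter_locale(entries, locale="en-US"):
    """Keep only the entries generated in the given locale."""
    return [e for e in entries if e.locale == locale]


# Made-up example entries.
entries = [
    LogEntry("u1", datetime(2009, 1, 1, 12, 0), "w1", "http://example.com/a", "en-US"),
    LogEntry("u2", datetime(2009, 1, 1, 12, 5), "w7", "http://example.org/b", "fr-FR"),
]
kept = filter_locale(entries)
print([e.user_id for e in kept])
```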
This research studied the effectiveness of different sources of contextual
evidence, and their overlap, for user interest modeling. The findings of the
study suggest that the best-performing contextual sources depend on the
duration of the prediction window. This has implications
for the systems that use contextual information to support post-query
navigation and general browsing behaviors. For example, these systems must not
treat all context sources equally. Weights should be assigned to each source
depending on whether the system is recommending web pages that are relevant to
the immediate situation, the current work task, or the user’s general
interests. The contexts as defined could be implemented using server-side
lookups (task, collection and social) or client-side code (interaction and historic).
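
One way such source weighting could look is sketched below, assuming each context source produces a score per topic label and the system blends them with a weight set chosen for the recommendation goal. The weight values are arbitrary placeholders, not findings of the study, and `WEIGHTS_BY_GOAL` and `combine` are hypothetical names.

```python
from collections import defaultdict

# Placeholder weights per recommendation goal; the numbers are arbitrary
# and only illustrate that different goals can use different weightings.
WEIGHTS_BY_GOAL = {
    "immediate": {"interaction": 0.5, "historic": 0.25},
    "general": {"interaction": 0.25, "historic": 0.5},
}


def combine(source_scores, weights):
    """Blend per-source label scores into one weighted score per label."""
    combined = defaultdict(float)
    for source, scores in source_scores.items():
        weight = weights.get(source, 0.0)  # unlisted sources contribute nothing
        for label, score in scores.items():
            combined[label] += weight * score
    return dict(combined)


# Made-up scores for two of the five sources.
source_scores = {
    "interaction": {"sports": 1.0},
    "historic": {"sports": 0.5, "finance": 1.0},
}
combined = combine(source_scores, WEIGHTS_BY_GOAL["immediate"])
print(combined)
```

Switching the goal key changes only the weights, so the same per-source scores can serve recommendations for the immediate situation, the current task, or general interests.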