Data
We share four example datasets so that users can try out the pretrained NNE. They are provided as MATLAB files in the ‘sample_data’ folder at this GitHub directory. All four datasets come from public sources, and the paper describes them in more detail.
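As a quick way to see what each file contains before using it, the sketch below lists the data files and the variables stored in each one. It assumes that the ‘sample_data’ folder has been downloaded into the current working directory and that the files use the `.mat` extension; neither detail is stated above, so adjust `data_dir` and the file pattern as needed.

```matlab
% Assumption: 'sample_data' sits in the current directory and holds .mat files.
data_dir = 'sample_data';
files = dir(fullfile(data_dir, '*.mat'));

for k = 1:numel(files)
    file_path = fullfile(data_dir, files(k).name);

    % whos with '-file' reports variable names, sizes, and classes
    % without loading the data into the workspace.
    info = whos('-file', file_path);

    fprintf('%s\n', files(k).name);
    for j = 1:numel(info)
        fprintf('  %-15s %-15s %s\n', ...
            info(j).name, mat2str(info(j).size), info(j).class);
    end
end
```

Inspecting the files this way avoids guessing variable names; once they are known, `load(file_path)` brings the variables into the workspace.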
Description of the datasets
Expedia - destination 1
This dataset comes from a Kaggle contest based on Expedia.com search and booking data. The data have been used in several papers to study consumers' online search behavior. The dataset here covers the search sessions for the largest travel destination in the contest. There are \(n = 1258\) sessions, 3 product attributes, 2 consumer attributes, and 1 advertising attribute.
Expedia - destination 2
This dataset includes the search sessions for the second-largest travel destination in the same Kaggle contest as above. There are \(n = 897\) sessions, slightly below the current minimum \(n\) required by the pretrained NNE (see here). Despite this, the pretrained NNE seems to work well on this dataset.
Trivago - desktop channel
This dataset comes from the ACM RecSys Challenge, which is based on user session data from Trivago.com. The dataset here includes the search sessions made on the desktop channel.
Trivago - mobile channel
This dataset includes the search sessions made on the mobile channel in the same RecSys Challenge as above.