- Perform the scalability study for finding frequent pairs of elements by dividing the dataset
Online Big Data analytics tutor - Toronto, Canada - TeacherOn
Description
I was given a dataset and need to find frequent itemsets by implementing two efficient algorithms: A-Priori and PCY. I already have the code, but I don't know how to execute and run it, so I need help modifying and running it.
Programming Language
You can choose your favorite programming language, preferably one of the following: C, C++, Java, C#, or Python.
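For orientation, here is a minimal Python sketch of the two algorithms named above, limited to frequent pairs. It is an illustration under assumptions, not the code mentioned in this posting: the function names apriori_pairs and pcy_pairs, the num_buckets default, and the use of Python's built-in hash as the bucket hash are hypothetical choices, and baskets is assumed to be an in-memory list of lists of integers that can be scanned twice.

```python
from collections import Counter
from itertools import combinations


def apriori_pairs(baskets, min_count):
    """Two-pass A-Priori restricted to pairs (illustrative sketch)."""
    # Pass 1: count how many baskets contain each single item.
    item_counts = Counter()
    for basket in baskets:
        item_counts.update(set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    # Pass 2: count only pairs whose two items are both frequent.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {p: c for p, c in pair_counts.items() if c >= min_count}


def pcy_pairs(baskets, min_count, num_buckets=1_000_003):
    """PCY restricted to pairs; num_buckets is an assumed, tunable size."""
    # Pass 1: count single items and hash every pair into a bucket counter.
    item_counts = Counter()
    buckets = [0] * num_buckets
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        for pair in combinations(items, 2):
            buckets[hash(pair) % num_buckets] += 1
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    # Between passes: reduce bucket counts to a frequent/infrequent bitmap.
    bitmap = [count >= min_count for count in buckets]

    # Pass 2: count a pair only if both items are frequent
    # and the pair hashes to a frequent bucket.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        for pair in combinations(kept, 2):
            if bitmap[hash(pair) % num_buckets]:
                pair_counts[pair] += 1
    return {p: c for p, c in pair_counts.items() if c >= min_count}
```

Both functions should return the same pairs for the same input; PCY only adds the bucket filter to reduce how many candidate pairs are counted in the second pass.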
Dataset
The retail dataset contains anonymized retail market basket data (88K baskets) from an anonymous
retail store. The preprocessing step to map text labels into integers has already been done. Use
Sublime Text, TextPad or Notepad++ or other software to open the file. Do not use Notepad.
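Since the labels are already integers, reading the file can be sketched as below. This assumes one basket per line with item IDs separated by whitespace, which the posting does not state; the function names parse_baskets and load_baskets and the file name in the usage note are hypothetical.

```python
def parse_baskets(lines):
    """Turn text lines into baskets (lists of integer item IDs).

    Assumes one basket per line, items separated by whitespace.
    Blank lines are skipped.
    """
    baskets = []
    for line in lines:
        tokens = line.split()
        if tokens:
            baskets.append([int(tok) for tok in tokens])
    return baskets


def load_baskets(path):
    """Read a basket file from disk using the same assumed format."""
    with open(path, encoding="utf-8") as handle:
        return parse_baskets(handle)
```

For example, load_baskets("retail.txt") would return a list of baskets, where "retail.txt" stands in for whatever the provided file is actually called.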
Dataset link:
I will provide it eventually.
Experiments
Perform the scalability study for finding frequent pairs of elements by dividing the dataset
into chunks, and report results for the following support thresholds: 1%, 5%, 10%. For example,
if your chunk is 10% of the dataset, you have around 8,800 baskets. Therefore, if your support
threshold is 5%, you should count the pairs that appear in at least 440 baskets. See three
samples below for three different support thresholds.
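The arithmetic in this example can be sketched in Python as follows. The helper names min_support_count and take_chunk are hypothetical, percentages are passed as whole numbers so the calculation stays in integer arithmetic, and taking the first baskets of the file as the chunk is an assumption, since the posting does not say how chunks are selected.

```python
def min_support_count(num_baskets, threshold_percent):
    """Smallest basket count that meets the support threshold.

    Uses integer ceiling division, e.g. 5% of 8,800 baskets -> 440.
    """
    return (num_baskets * threshold_percent + 99) // 100


def take_chunk(baskets, chunk_percent):
    """Return the first chunk_percent% of the baskets (assumed policy)."""
    size = len(baskets) * chunk_percent // 100
    return baskets[:size]


if __name__ == "__main__":
    chunk_size = 8800  # about 10% of the 88K baskets
    for percent in (1, 5, 10):
        print(percent, min_support_count(chunk_size, percent))
```

A pair is then frequent in a chunk when the number of baskets containing it is at least the value returned by min_support_count for that chunk's size.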
Level:
Bachelors/Undergraduate
Gender Preference:
None
Meeting options:
Available online, via Skype etc.