
Summary

This project builds a model that predicts whether a user will download an app after clicking a mobile app advertisement. The original training data is a large collection of users’ click records on app advertisements, with attributes such as ip, os, device, and channel. After analyzing and preprocessing the dataset, we selected six features (ip, app, device, os, channel, and click_time) as training inputs and used two fields, “is_attributed” and “attributed_time”, as output labels for both training and testing. We first tested Decision Tree (ID3 and C4.5) and obtained an accuracy of 90%. Next, we tested several algorithms in WEKA: K-Nearest Neighbor, AdaBoost, LogitBoost, Classification via Regression, and Random Forest. Of these, Classification via Regression and Random Forest gave better results on this prediction task than the other WEKA algorithms. Finally, we combined a MultiLayer Perceptron (MLP) classifier with an MLP regressor, and the combined model reached an accuracy of nearly 95%. In summary, the combination of the MLP classifier and regressor performed better than either model alone and better than the other algorithms we tried, and it meets the original goal of our task.
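The summary does not spell out how the MLP classifier and regressor outputs were combined, so the following is only a minimal sketch of one common approach, not the method actually used in this project. It assumes the classifier yields a download probability and the regressor yields a real-valued score for the same clicks; the function name `combine_scores`, the equal weighting, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def combine_scores(
    clf_proba: Sequence[float],
    reg_output: Sequence[float],
    weight: float = 0.5,      # assumed blend weight for the classifier
    threshold: float = 0.5,   # assumed decision threshold
) -> List[int]:
    """Blend classifier probabilities with regressor scores into 0/1 labels.

    This is an illustrative weighted average, not the project's actual rule.
    """
    if len(clf_proba) != len(reg_output):
        raise ValueError("both models must score the same number of clicks")

    labels = []
    for p, r in zip(clf_proba, reg_output):
        # A regressor can overshoot, so clip its score into [0, 1] first.
        r_clipped = min(1.0, max(0.0, r))
        blended = weight * p + (1.0 - weight) * r_clipped
        labels.append(1 if blended >= threshold else 0)
    return labels


if __name__ == "__main__":
    # Made-up scores for three clicks, purely to show the call.
    print(combine_scores([0.9, 0.2, 0.6], [1.3, -0.1, 0.3]))
```

With a blend like this, a confident score from one model can be tempered by a weak score from the other, which is one plausible reason a combination could outperform either model alone.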

Future Work

Currently, we use only 10% of the total data to train and test our model. Given more time, we would use more of the data for training, or try to extract additional useful features from the larger dataset. We would also test other algorithms, such as recurrent neural networks. In our opinion, however, the accuracy of the model would not increase to a great extent.
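As a rough illustration of working with a fraction of the click log, the sketch below streams rows and keeps each one with a fixed probability, so the full file never has to fit in memory. How the 10% subset was actually drawn in this project is not stated; the helper name `sample_rows`, the seed, and the tiny in-memory CSV are assumptions made for the example.

```python
import csv
import io
import random
from typing import Dict, Iterable, Iterator


def sample_rows(
    rows: Iterable[Dict[str, str]],
    fraction: float = 0.10,  # share of rows to keep (10% here)
    seed: int = 0,           # fixed seed so the subset is reproducible
) -> Iterator[Dict[str, str]]:
    """Yield each row independently with probability `fraction`."""
    rng = random.Random(seed)
    for row in rows:
        if rng.random() < fraction:
            yield row


if __name__ == "__main__":
    # Hypothetical stand-in for the real click log, with the same column names.
    header = "ip,app,device,os,channel,click_time,is_attributed\n"
    body = "".join(f"{i},1,1,13,100,t{i},0\n" for i in range(1000))
    reader = csv.DictReader(io.StringIO(header + body))
    subset = list(sample_rows(reader, fraction=0.10, seed=0))
    print(len(subset), "of 1000 rows kept")
```

Raising `fraction` is then the only change needed to train on more of the data, at the cost of longer training time.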

Work Division

Ji Lin (jle2795): Data preprocessing, testing of different algorithms in WEKA, website design.

Han Wang (hwm4792): Data visualization, testing of different algorithms in WEKA, website composition.

Qiping Zhang (qze7487): Data preprocessing, Decision Tree (ID3) testing, implementation and testing of the MultiLayer Perceptron (both classifier and regressor).
