How to Run Kdd2013AuthorPaperIdentification Benchmark
What is Kdd2013AuthorPaperIdentification?
- KDD Cup is the well-known data mining competition of the annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
- KDD Cup 2013 -> https://www.kaggle.com/c/kdd-cup-2013-author-paper-identification-challenge
How to Build Benchmark Environment?
- Add an exclude line to the [base] and [updates] sections of /etc/yum.repos.d/CentOS-Base.repo so that yum does not install the distribution's PostgreSQL packages:
    [base]
    ...
    exclude=postgresql*

    [updates]
    ...
    exclude=postgresql*
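The repo file edit can be checked with a short script. The following is only a minimal sketch: the function name `sections_missing_exclude` and the sample repo text are hypothetical and not part of the benchmark, and the parser handles only simple `key=value` lines (no continuation lines). It reports which of the given sections lack the `postgresql*` exclude pattern.

```python
# Hypothetical helper: report repo sections that lack the PostgreSQL exclude.
def sections_missing_exclude(repo_text, sections=("base", "updates"),
                             pattern="postgresql*"):
    excludes = {}   # section name -> list of exclude patterns found
    current = None  # name of the section being read
    for raw in repo_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            excludes.setdefault(current, [])
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "exclude":
                # Patterns may be separated by spaces or commas.
                excludes[current].extend(value.replace(",", " ").split())
    return [name for name in sections
            if pattern not in excludes.get(name, [])]


# Made-up sample text for illustration, not a real CentOS-Base.repo.
SAMPLE = """
[base]
name=sample base repo
exclude=postgresql*

[updates]
name=sample updates repo
"""

if __name__ == "__main__":
    print(sections_missing_exclude(SAMPLE))  # prints ['updates']
```

To check the real file, read /etc/yum.repos.d/CentOS-Base.repo into a string and pass it to the function. An empty list means both sections carry the exclude.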
- Install PostgreSQL 9.2 and others:
    # Only PostgreSQL 9.2 can store dataRev2.postgres
    rpm -Uvh http://yum.postgresql.org/9.2/redhat/rhel-6-x86_64/pgdg-centos92-9.2-6.noarch.rpm
    yum install postgresql92 postgresql92-server postgresql92-contrib postgresql92-libs postgresql92-devel

    # Python installation
    yum install python
    yum install python-devel

    # Numpy installation
    yum install numpy

    # Scipy installation
    yum install scipy

    # g++ installation
    yum install gcc-c++

    # pip installation
    yum install pip

    # psycopg2 installation (not available by yum)
    pip install psycopg2

    # scikit-learn installation (not available by yum)
    pip install -U scikit-learn
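After the installation steps, it may help to confirm that the Python packages can actually be imported. The following is a minimal sketch, not part of the benchmark itself: the function name `find_missing_modules` and the module list are assumptions based on the packages installed above (scikit-learn is imported as `sklearn`).

```python
# Hypothetical check: which of the required Python modules fail to import?
REQUIRED_MODULES = ("numpy", "scipy", "psycopg2", "sklearn")


def find_missing_modules(names=REQUIRED_MODULES):
    missing = []
    for name in names:
        try:
            __import__(name)  # built-in import hook, no extra imports needed
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules can be imported.")
```

If a module is reported missing, rerun the corresponding yum or pip command from the steps above.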