In the era of "big data", there is an emerging need to process massive data sets on large cluster systems. Without proper strategies for handling these data, it is very challenging to obtain good performance from the system. In this paper, a number of I/O and execution scheduling strategies for parallel data mining applications are investigated. The paper identifies strategies that balance the data processing load and make better use of a multi-core cluster system for data mining applications. Issues that impact performance are also addressed.