With the advent of Internet-of-Things (IoT) devices, including smart meters and sensors in the smart grid, there has been immense research interest in big data management, analytics, and parallel data processing. However, using big data analytics platforms efficiently requires careful configuration of complex hardware and software parameters and an in-depth understanding of the data processing design. In this work, we analyze the parallelization of load prediction using the Spark regression library for Python and assess its performance on configurations of up to 8 nodes. We consider multiple combinations of nodes and cores and analyze how each configuration affects the performance of Apache Spark. We find that efficient parallel computing requires a trade-off between the number of nodes and the number of cores. The results indicate that computation time depends not only on the data size but also on the number of compute nodes and the number of cores assigned to the job. The work also highlights the importance of high-performance computing capability for smart-meter big data management.