Abstract:
Agriculture is an important source of survival and one of the most important factors in a country's economic growth. Analysis in the agricultural field faces many issues, such as obtaining proper information about the current status of soil moisture, humidity, and temperature. Some devices have been developed to improve agricultural production, but they have been neither successful nor sufficient. In this paper, the proposed system processes agricultural data (Big Data) on the Hadoop platform to predict crop yield and to make suggestions on crop growth, thereby improving the quality of the yield. A novel prediction approach using K-nearest neighbor (NPKNN) is proposed to handle and process a large volume of agricultural data in parallel within the Map-Reduce framework. The proposed system has been implemented on only three nodes, but it can be extended to a larger number of nodes. A master is set up with two slave nodes in a Hadoop distributed environment. The input test and training data sets reside on the data nodes (slaves). The master runs the NPKNN algorithm in the Map-Reduce framework to read and analyze the data sets. The output file for each data node is written back to the Hadoop Distributed File System (HDFS).
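To make the idea of running a K-nearest-neighbor prediction as a Map-Reduce job concrete, the following is a minimal sketch in plain Python. It is not the NPKNN implementation described in this paper and does not use Hadoop; it only simulates the map and reduce phases in a single process. The function names (map_split, reduce_candidates, run_job), the feature layout (soil moisture, humidity, temperature), the choice of Euclidean distance, and the use of the mean yield of the k nearest records as the prediction are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_split(train_split, test_records, k):
    """Map phase for one training split (one hypothetical data node).

    For every test record, emit the k nearest training records found
    locally as (test_id, [(distance, yield), ...]).
    """
    for test_id, features in test_records:
        local = [(euclidean(features, tf), y) for tf, y in train_split]
        yield test_id, heapq.nsmallest(k, local)


def reduce_candidates(candidates, k):
    """Reduce phase: keep the k globally nearest candidates for one test
    record and predict the mean of their yields (an assumed rule)."""
    nearest = heapq.nsmallest(k, candidates)
    return sum(y for _, y in nearest) / len(nearest)


def run_job(train_splits, test_records, k):
    """Simulate the shuffle step by grouping mapper output by test_id."""
    grouped = defaultdict(list)
    for split in train_splits:
        for test_id, candidates in map_split(split, test_records, k):
            grouped[test_id].extend(candidates)
    return {tid: reduce_candidates(c, k) for tid, c in grouped.items()}


if __name__ == "__main__":
    # Two splits stand in for the two slave nodes; all values are made up.
    # Each record is ((soil_moisture, humidity, temperature), crop_yield).
    split_a = [((30.0, 60.0, 25.0), 2.0), ((80.0, 90.0, 35.0), 5.0)]
    split_b = [((32.0, 62.0, 26.0), 3.0), ((10.0, 20.0, 15.0), 1.0)]
    test = [("t1", (31.0, 61.0, 25.0))]
    print(run_job([split_a, split_b], test, k=2))
```

Because each mapper returns its own k nearest candidates, the reducer only has to merge a small candidate list per test record, and the merged result matches what a single-machine K-nearest-neighbor search over all splits would give. In a real Hadoop deployment the grouping done here by run_job would be handled by the framework's shuffle phase.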