Machine Learning–Based Predictive Farmland Optimization and Crop Monitoring System

E-agriculture is the integration of technology and digital mechanisms into agricultural processes for more efficient output. This study presents a machine learning–aided mobile system for farmland optimization that uses inputs such as location, crop type, soil type, soil pH, and spacing. The random forest algorithm and BigML were employed to analyze and classify datasets of crop features, generating subclasses based on randomly selected crop feature parameters. The subclasses were further grouped into three main classes to match the crops, using data on companion crops. The study concluded that the approach aided decision making and assisted in the design of a mobile application built with Appery.io. The application takes user input parameters and offers various optimization sets. It was also deduced that the system gave users optimized information when implemented on their farmlands.


Introduction
Agriculture is vital for the development of the world. We, humans, benefit from agriculture one way or the other, which has made agriculture a key area of study. Farmers will always need information to refer to, most especially when growing crops that are not common in their land or culture [1]. The average farmer has access to crude sources of information such as TV, radio, newspapers, fellow farmers, government agricultural agencies, farm supply, and traders. There is, therefore, a need for a system that allows farmers access to relevant information [1].
Machine learning is among the trending technologies; hence, there exist several technologies and systems that run on a machine learning framework [2]. In recent times, several machine learning systems in agriculture have been created and tested. Research into the effectiveness of several machine learning algorithms in agriculture [2] and other application domains has also been conducted, because machine learning is a very effective tool for efficient use of resources, prediction, and management, all of which are needed in agriculture. Machine learning is the ability of an electrical processing system to acquire knowledge and apply that knowledge [2]. The scope of this work is concerned with food crop agriculture and with using machine learning to help optimize land for maximal crop yield by efficiently utilizing land resources. Crop yield relies strongly on how effectively the basic land requirements can be utilized; land here refers to topography, soil type, soil nutrients, water content, sunlight, and all such factors related to crop growth on farmable areas.

Database/Crop Datasets.
The data that populates this database includes the plant growth parameters that were used to form the individual decision trees in the random forest. Such data include irrigation, spacing, nutrient requirements, location, temperature, and other related factors that originated from several trusted databases. This plant growth condition database is designed to help decision making with the machine learning algorithm.

Method.
In this work, machine learning applied what had been used to set parameters and embedded it into a dataset on a mobile application. The machine learning algorithm was designed to maximize land proportion. The dataset contains parameters of some inputs that are critical for plant growth. The machine learning algorithm defines the relationship between these input parameters and certain internally stored prediction parameters and provides a solution for the output. The values in the database have been converted to a range of 0 to 1. Conversion to a common range was needed because the data was derived from different sources and was therefore inconsistent.
Output Layer.
All inputs and their respective weighted values were converted to a range system of 0 to 1.
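
The conversion of inputs to a common 0 to 1 range can be sketched with min-max scaling. The text does not state which conversion was applied, so min-max scaling, the function name `min_max_scale`, and the sample values below are assumptions made purely for illustration.

```python
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: there is no spread to scale, so map to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical spacing values (cm); not taken from the study's database.
spacing_cm = [25.0, 50.0, 75.0, 100.0]
scaled = min_max_scale(spacing_cm)
print(scaled)
```

Scaling every feature this way would put values from differently scaled sources on one comparable footing before classification.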

Decision Layer(s).
This consists of layers of decision that help to classify input data into appropriate groups which also helped making decisions and setting parameters. The random forest algorithm was used to train the datasets, and the same datasets were applied to an MLR model as a benchmark for the random forest algorithm.
The work showed that the random forest algorithm is far more effective in crop yield prediction.

3. Liakos et al. [2], 2018. This work involved research into the use of machine learning in agricultural production systems. This work applied artificial neural networks. This work showed that machine learning models have been used in several agriculture-related areas, mainly in crop production and in aiding management decision-making processes.

4. Ming et al. [5], 2016. This work involved classification of land cover based on image and remote sensing. The random forest machine learning algorithm was used in the classification of image data. Random forest is an efficient classification algorithm and performs effectively without using specially selected features.

5. Nitze et al. [6], 2012. This work compared the effectiveness of several machine learning algorithms: support vector machine, artificial neural networks, and random forest. Several classifiers, Naïve Bayes for ML, random forest (RF), multilayer perceptron in case of ANN, and LibSVM for support vector machine, were used in this work for the classification of crops. Even though classification results depended strongly on the number of images used, the SVM classifiers performed much better than the RF and ANN in most of the cases.

6. Chen and Cournede [7], 2017. This work focused on finding the most efficient way to predict the yield of corn based on meteorological records. This work studied a new methodology named multiple scenarios parameter estimation and used the CORNFLO model. Random forest regression was shown to be the most efficient for crop yield prediction.

7. Mitra et al. [8], 2017. This work focused on simulating and predicting crop yield for effective crop management and adequate results. A three-layered artificial neural network (ANN) and R language were used in this work for prediction and simulation of crop yield. The artificial neural network was effective for simulation and prediction.

Output.
This is composed of results from classification.

Classification.
This entailed defining sets of groups to which a new observation would belong. The aim of data classification here was to divide the crops into classes based on their respective data; these classes are based on crops growing together most efficiently on a given piece of land. The actual classification was carried out using random forests to allow all inputs to be considered multiple times for better accuracy since the algorithm comprises multiple decision trees. Figure 1 displays the system data flow diagram. The method of classifying and analyzing the results of the classification is divided into phases and functions. The phases include the resource process, which includes the fetching of data from the CropInfo database, which was followed by the generation of machine learning subclasses; the random forest algorithm was used in this phase to create subclasses based on ten different crop feature sets. In the class generation phase, subclasses with similar generation patterns were grouped into three main classes, which are used in the mobile application phase to help optimize the mobile output.
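
The idea that a random forest considers the inputs several times through multiple decision trees can be illustrated with a majority vote. The sketch below does not train anything: the three rule functions are hand-written stand-ins for trained trees, and the thresholds, the labels "high-demand" and "low-demand", and the sample record are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

Crop = Dict[str, float]
Tree = Callable[[Crop], str]


# Hypothetical stand-ins for trained trees; each inspects a different feature.
def tree_by_water(crop: Crop) -> str:
    return "high-demand" if crop["water"] > 0.5 else "low-demand"


def tree_by_light(crop: Crop) -> str:
    return "high-demand" if crop["light"] > 0.6 else "low-demand"


def tree_by_space(crop: Crop) -> str:
    return "high-demand" if crop["space"] > 0.4 else "low-demand"


def forest_predict(trees: List[Tree], crop: Crop) -> str:
    """Return the label chosen by the largest number of trees (majority vote)."""
    votes = Counter(tree(crop) for tree in trees)
    return votes.most_common(1)[0][0]


forest = [tree_by_water, tree_by_light, tree_by_space]
sample = {"water": 0.7, "light": 0.5, "space": 0.8}  # made-up values scaled to 0..1
print(forest_predict(forest, sample))
```

In a real forest each tree would be learned from a random sample of the data and features; only the voting step is shown here.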

System Design.
The study also used activity diagrams to analyze the system's behavior and design. This section briefly discusses the interactions between the different activities in the application. It is broken up into three sections:
(i) The user login activity
(ii) The scheduler
(iii) Tips and tricks activity

The Login Activity Diagram.
The login operation, shown in Figure 2, involves a simple user verification process. Once user credentials have been submitted, they are tested to decide whether or not the account is valid. When user validity has been verified, the user gains access to the dashboard functions: the key operation, the tips and tricks, and the optimiser.

The Scheduler Activity.
The scheduler activity involves two significant events, as seen in Figure 3. The first one is the schedule event; this task allows users to schedule and display events created by the main task as well as user-generated ones. The second event is the reminder event; this activity allows the user to set reminders and to view active reminders created by the main activity and those created by the user.

The Main Activity.
This is the main component of the program, consisting of user input system, machine learning algorithm, feedback system, and database for crop knowledge, as shown in Figure 4.

Results
The proposed system includes an input collection system incorporating user input, which is processed using the optimiser algorithm. This research uses the random forest classifier to break down and classify the crop resource characteristics into ten subclasses; these subclasses are further categorized into three main classes. The crop, based on the dominant features of a variety, is tailored to its optimum level. In this work, the subclasses of crops were generated using two methods. The first approach involved four random forests generated in BigML; these models were created and analyzed, and the results were compared with the performance of the second model, which involved the use of weighted linear equations for decision making.
For this classification, these models were used, and each model or tree was used to process the final model. The variance in those models was generated by modifying the model's rules.
OCF represents an ideal match for the class. The weight of each crop feature shall be determined using a set {x1, x2, ..., x7}.
Those sets coincide with a set of values for each weight. Light requirement (Lt), water requirement (W), space requirement (S), location (L), pH requirement (P), soil type (St), and companion (C) are the characteristics to be considered.
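
A weighted linear score over the seven characteristics can be sketched as follows. The text does not give the exact form of the equation behind OCF, so the plain weighted sum, the function name `weighted_score`, the weight values, and the example feature values are all assumptions for illustration.

```python
from typing import Dict

# The seven characteristics named in the text.
FEATURES = ("Lt", "W", "S", "L", "P", "St", "C")


def weighted_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted linear combination: sum of weight * value over the seven features."""
    return sum(weights[name] * features[name] for name in FEATURES)


# Hypothetical weights x1..x7 and made-up feature values scaled to 0..1.
weights = {"Lt": 0.2, "W": 0.2, "S": 0.1, "L": 0.1, "P": 0.1, "St": 0.1, "C": 0.2}
example_crop = {"Lt": 1.0, "W": 0.5, "S": 0.5, "L": 1.0, "P": 0.5, "St": 1.0, "C": 0.0}
score = weighted_score(example_crop, weights)
print(round(score, 3))
```

With weights that sum to 1 and features scaled to the 0 to 1 range, such a score also stays between 0 and 1, which makes scores comparable across crops.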

Subclass Models: Method One.
This study presents four of the ten subclass models that were generated and analyzed, with two tree samples each; only four are shown because the outputs from each model generation were similar. The parameters for the generation of each of the models were selected randomly, thus varying the output. This was done to allow the fitting of features to certain subclasses if some parameters are absent. These models were used as datasets for designing the three major classes A, B, and C.
(1) Model One. The first model was created using S, Lt, W, and St. Figure 5 shows the subclassification of crops based on the set of random parameters mentioned above, where the location, companion, and pH requirement parameters are not included. This model allowed the classification of crops into their subclasses and provided one of the most effective subclass generations of the 10 models that were analyzed.
(2) Model Two. The second model was created using S, Lt, W, and L. Figure 6 displays the subclass generation solution provided where data are not included for crop pH requirement, soil type, and companions. This was also an efficient model, considering that most plants in the study area essentially need the same type of soil for growth.
(3) Model Three. The third model was created using S, L, and Lt. Figure 7 shows model three output analysis, and this model was considered to be the least efficient model of all.
(4) Model Four. The fourth model was created using S, L, and P. Figure 8 shows that it worked like the third model with a limited number of specified data categories, but it was significantly more efficient than the third model because it includes key parameters that enable the subclass generation to fit more precisely.
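
The four parameter subsets can be written down as data, which makes it easy to see what each model uses and what it leaves out. The feature abbreviations come from the text; the helper names `project` and `excluded` and the crop record are hypothetical.

```python
from typing import Dict, Tuple

# Feature subsets named in the text for the four subclass models.
MODEL_FEATURES: Dict[str, Tuple[str, ...]] = {
    "model_one": ("S", "Lt", "W", "St"),
    "model_two": ("S", "Lt", "W", "L"),
    "model_three": ("S", "L", "Lt"),
    "model_four": ("S", "L", "P"),
}

ALL_FEATURES = ("Lt", "W", "S", "L", "P", "St", "C")


def project(record: Dict[str, object], model: str) -> Dict[str, object]:
    """Keep only the features that the given model uses."""
    return {name: record[name] for name in MODEL_FEATURES[model]}


def excluded(model: str) -> Tuple[str, ...]:
    """Features left out of a model, in the order of ALL_FEATURES."""
    return tuple(f for f in ALL_FEATURES if f not in MODEL_FEATURES[model])


# Made-up crop record for illustration only.
record = {"Lt": "full sun", "W": "medium", "S": 0.75, "L": "highland",
          "P": 6.0, "St": "loam", "C": "beans"}
print(project(record, "model_four"))
print(excluded("model_one"))
```

Projecting each record onto a model's subset mirrors the situation described above, in which a subclass must still be assigned when some parameters are absent.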

Subclass Output: Method Two.
After model analysis of method one, it was discovered that, based on the nature of the data, all class generation rules produced similar results due to the uncertainty of the position values; the classification methods below were implemented to create a less ambiguous way to allow for more ideal crop class generations.

Ideal Class Model Distribution Output.
The crop classification efficiency is based on the three combined models. The performance was obtained from the subclass models: similar calculations were made on the output data of each subclass, together with an additional feature for each subclass that allows fitting to the three final ideal classes. To allow this fitting, companion crops were added to the model data.
This results in crops with features belonging to class A and class B not interacting, meaning that crops in those two groups should not be planted together. The features of class C elements interact with the features of both class A and class B; this means that crops with features of class C can be grown effectively with crops of either of the other two classes.
Based on the distribution and classification of the features shown in Figure 9, the ten crops considered in this work are assigned to their respective classes according to their characteristics as shown in Table 2.
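
The interaction rule between the three classes can be expressed as a small check. This is a sketch of the rule as stated in the text: class A and class B do not mix, and class C mixes with both. Treating two crops of the same class as compatible, and the function name `can_grow_together`, are assumptions.

```python
def can_grow_together(first_class: str, second_class: str) -> bool:
    """Class C interacts with A and B; A and B do not interact with each other."""
    valid = {"A", "B", "C"}
    if first_class not in valid or second_class not in valid:
        raise ValueError("class must be 'A', 'B', or 'C'")
    # Same-class crops are assumed compatible; otherwise one of them must be C.
    return first_class == second_class or "C" in (first_class, second_class)


print(can_grow_together("A", "B"))
print(can_grow_together("A", "C"))
```

A lookup of each crop's class, as listed in Table 2, followed by this check would be enough to decide whether two selected crops can share a plot.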

Discussion
The mobile application allows multiple farm accounts to be opened on the same computer. There are two choices on the start page, as shown in Figure 10: either create a new farm account, as shown in Figure 11, or open an existing farm account, as seen in Figure 12. After a successful login or sign-up, the user has access to the dashboard and its functions, as shown in Figure 13. The functions of the dashboard are the optimiser function, as seen in Figure 14, which is the main application operation; the scheduler function; and the tips and tricks function, which contains the knowledge repository. The optimiser consists of three fields of data: the farm size field, which takes numerical input in square meters; the area field, which takes the user's location; and the pH field, which takes the farm soil pH, as seen in Figure 15. Users choose the crops they want to grow on their farm, and the outputs are displayed in the optimiser output area based on the input, as in Figure 16.
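
The optimiser's inputs can be modelled as a small record with basic validation. The field names, the class name `OptimiserInput`, and the validation rules are assumptions, not the application's actual implementation; the 0 to 14 bound is simply the standard pH scale.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OptimiserInput:
    farm_size_m2: float  # farm size in square meters
    location: str        # user-entered location
    soil_ph: float       # farm soil pH
    crops: List[str]     # crops the user wants to grow

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the input is usable."""
        problems = []
        if self.farm_size_m2 <= 0:
            problems.append("farm size must be positive")
        if not self.location.strip():
            problems.append("location is required")
        if not 0.0 <= self.soil_ph <= 14.0:
            problems.append("pH must be between 0 and 14")
        if not self.crops:
            problems.append("select at least one crop")
        return problems


# Made-up example values.
good = OptimiserInput(farm_size_m2=500.0, location="Nairobi", soil_ph=6.5, crops=["maize"])
bad = OptimiserInput(farm_size_m2=-1.0, location=" ", soil_ph=15.0, crops=[])
print(good.validate())
print(bad.validate())
```

Collecting all problems in one pass, rather than stopping at the first, lets a form show every field that needs correction at once.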
The scheduler as seen in Figure 17 enables users to set the events or activities that they wish to perform. The user shall provide the task mark and pick the date of the work to be performed. The tips and tricks event allows users to get agricultural tools as shown in Figure 18. Such tips and tricks are divided into collapsible components; such components include tip tools in the database for each crop and some additional tips on the field. They also include guidelines for pH checking and soil improvement.

Conclusion
Most farmers do not have access to a central repository of relevant information that will help them make full use of and optimize their farmland. This work provided a mobile application interface that allows farmers to access their farmland information and guarantees them the services they need instantly.

Future Work
In future work, the machine learning models used to inform parameter setting in the mobile application could be developed using the machine learning algorithm embedded in the system and used to predict.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request. Ecological requirements are available at the following link: http://www.nafis.go.ke/agriculture/maize/ecological-requirements/.

Conflicts of Interest
The authors declare that they have no conflicts of interest.