Population Dynamics Of Soybeans Essay.



Study of Population Dynamics of Soybean Semilooper Gesonia gemma by using Rule Induction Model in Maharashtra

Gesonia gemma Swinhoe (1885), the grey semilooper, has emerged as a serious threat to the soybean crop. This defoliator causes heavy damage in the form of reduced grain weight and grain number. G. gemma population data from several districts of Maharashtra were used to study population dynamics in relation to abiotic features. A sequential covering algorithm (CN2 rule induction) is proposed to generate a list of classification rules that relate the target feature (G. gemma population) to the independent abiotic features. The resulting classification rules exhibited high accuracy and can be applied to the collected evidence for prediction.


KEYWORDS: CN2 rule induction, Gesonia gemma, population dynamics, semilooper, soybean


Soybean [Glycine max (L.) Merrill] belongs to the family Fabaceae, commonly known as the legume family, and is widely grown for its edible bean, which is a good source of calcium, protein and vitamins. In the early seventies soybean was considered relatively safe from insect pests. However, the scenario has changed: researchers from many parts of India have confirmed that seed yield and seed quality are adversely affected by major insect pests, viz. girdle beetle, tobacco caterpillar, grey semilooper, Helicoverpa armigera, jassids and whitefly, Netam et al. (2013).

Gesonia gemma Swinhoe (1885), the grey semilooper, has emerged as a serious threat to the soybean crop. This insect pest defoliates the crop, which significantly reduces pod number, pod weight, grain number and grain weight, resulting in a grain loss of 3.94 q/ha, Yadav et al. (2014). Studying the population dynamics and forecasting the G. gemma population from the collected evidence will support a successful Integrated Pest Management (IPM) module for soybean.

A rule induction model is proposed to understand the population dynamics, since IF-THEN rules are expressive and human readable. CN2 rule induction, one of the sequential covering algorithms, is used to build the model. The CN2 induction algorithm, Clark and Niblett (1989), builds on ideas from the AQ and ID3 algorithms. It was developed to induce simple, efficient rules that classify accurately even in the presence of noise. CN2 classification rules showed 93.73% accuracy in heart disease prediction in a comparative study by Ramaraj and Thanamani (2013). The proposed rule induction model extracts IF-THEN rules by classifying the pest population as high, medium or low based on the Economic Threshold Level (ETL) on soybean.
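The sequential covering idea can be sketched in a few lines of Python. The version below is a deliberately simplified, greedy variant and not the full CN2 algorithm of Clark and Niblett, which uses beam search and a significance test. The function names, the Laplace accuracy score, the bin labels and the six toy tuples are all assumptions made for illustration; none of them is taken from the dataset.

```python
from collections import Counter


def covers(conds, row):
    """True when the row satisfies every attribute = value condition."""
    return all(row[f] == v for f, v in conds.items())


def laplace_accuracy(rows, n_classes, label):
    """Laplace-corrected accuracy of the majority class among rows."""
    target, hits = Counter(r[label] for r in rows).most_common(1)[0]
    return (hits + 1) / (len(rows) + n_classes), target


def learn_one_rule(rows, features, n_classes, label):
    """Greedily add the single condition that most improves the score."""
    conds, covered = {}, rows
    score, target = laplace_accuracy(covered, n_classes, label)
    while True:
        candidates = []
        for f in features:
            if f in conds:
                continue
            for v in sorted({r[f] for r in covered}):
                subset = [r for r in covered if r[f] == v]
                s, t = laplace_accuracy(subset, n_classes, label)
                candidates.append((s, f, v, t, subset))
        if not candidates:
            break
        s, f, v, t, subset = max(candidates, key=lambda c: c[0])
        if s <= score:  # no condition improves the rule any further
            break
        conds[f] = v
        covered, score, target = subset, s, t
    return (conds, target) if conds else None


def sequential_covering(rows, features, label="PI"):
    """Learn a rule, remove the tuples it covers, repeat on the rest."""
    n_classes = len({r[label] for r in rows})
    rules, remaining = [], list(rows)
    while remaining:
        rule = learn_one_rule(remaining, features, n_classes, label)
        if rule is None:
            break
        rules.append(rule)
        remaining = [r for r in remaining if not covers(rule[0], r)]
    # Default class: majority of whatever no rule covers
    default = Counter(r[label] for r in (remaining or rows)).most_common(1)[0][0]
    return rules, default


def classify(rules, default, row):
    """Apply the ordered rule list; the first matching rule decides."""
    for conds, target in rules:
        if covers(conds, row):
            return target
    return default


# Hypothetical, already discretized tuples (not from the study's dataset)
toy_rows = [
    {"RH": "A5", "MaxT": "A2", "PI": "high"},
    {"RH": "A5", "MaxT": "A3", "PI": "high"},
    {"RH": "A5", "MaxT": "A2", "PI": "high"},
    {"RH": "A1", "MaxT": "A5", "PI": "low"},
    {"RH": "A1", "MaxT": "A4", "PI": "low"},
    {"RH": "A3", "MaxT": "A3", "PI": "medium"},
]
rules, default = sequential_covering(toy_rows, ["RH", "MaxT"])
for conds, target in rules:
    body = " AND ".join(f"{f} = {v}" for f, v in conds.items())
    print(f"IF {body} THEN PI = {target}")
print(f"ELSE PI = {default}")
```

On these toy tuples the learner produces one rule for the high class, one for the low class, and falls back to medium as the default. The ordered list is read top to bottom, so the first rule whose conditions match a tuple determines its class.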


The dataset comprises field pest scouting data from several districts of Maharashtra, viz. Akola, Amravati, Nagpur, Wardha and Yavatmal, along with the corresponding abiotic factors: maximum temperature (MaxT), minimum temperature (MinT), relative humidity (RH), moisture adequacy index (MAI), soil moisture index (SMI), total rainfall (RF) and number of rainfall days in a week (RFD), from 2009 to 2013. Sample data are given in Table 1.

Table 1. Sample tuples from the dataset

PI – pest incidence; SW – standard week; CS – crop stage; MaxT – maximum temperature; MinT – minimum temperature; RH – relative humidity; MAI – moisture adequacy index; SMI – soil moisture index; RF – rainfall in mm; RFD – number of rainfall days in a week.

The pest population count has been classified as low, medium and high based upon the Economic Threshold Level (ETL), and it has been defined using standard week and crop stage (pre-flowering and post-flowering).

Table 2. Classification of PI according to the proposed ETL

A data discretization method was used to convert the continuous features (MaxT, MinT, RH, MAI, SMI, RF) into categorical counterparts. The max-diff discretization method was chosen over the binning method, since it is highly non-redundant when binning the features. For each feature, this method sorts the values, computes the differences between adjacent values, and places the bin boundaries at the largest differences. The pseudocode for the max-diff discretization method is given below.

procedure maxdiff(data : values of one feature, bins : no. of bins, diff : differences between adjacent sorted values, pos : positions of the largest differences)
    // Ascending order (bubble sort)
    n = length(data)
    repeat
        newn = 0
        for i = 1 to n-1 inclusive do
            if data[i-1] > data[i] then
                swap(data[i-1], data[i])
                newn = i
            end if
        end for
        n = newn
    until n = 0
    // Differences between adjacent sorted values
    m = length(data)
    for j = 1 to m-1 inclusive do
        diff[j-1] = data[j] - data[j-1]
    end for
    // Positions of the (bins - 1) largest differences; bins bins need bins - 1 boundaries
    d = length(diff)
    for k = 0 to bins-2 inclusive do
        max = -1    // differences of sorted data are never negative
        for l = 0 to d-1 inclusive do
            if diff[l] > max and l is not already in pos[0..k-1] then
                max = diff[l]
                pos[k] = l
            end if
        end for
    end for
end procedure

Using this max-diff method, each continuous variable is discretized into 5 bins (A1, A2, A3, A4, A5); the number of rainfall days (RFD) is excluded.
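The max-diff procedure above can be sketched compactly in Python. This is an illustrative reading rather than the authors' implementation: the function names are hypothetical, placing each boundary at the midpoint of a gap is an assumption (the pseudocode only records the gap positions), and the temperature values are invented for the example.

```python
from bisect import bisect_right


def max_diff_cut_points(values, n_bins):
    """Return n_bins - 1 boundaries placed in the largest gaps between sorted values."""
    ordered = sorted(values)
    # (gap size, position) for every pair of adjacent sorted values
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    # Largest gaps first; ties are broken by the earlier position
    largest = sorted(gaps, key=lambda g: (-g[0], g[1]))[: n_bins - 1]
    # Assumption: each boundary sits at the midpoint of its gap
    return sorted((ordered[i] + ordered[i + 1]) / 2 for _, i in largest)


def assign_bin(value, cut_points):
    """Map a continuous value to a label A1, A2, ... using the boundaries."""
    return f"A{bisect_right(cut_points, value) + 1}"


# Hypothetical weekly maximum temperatures in degrees Celsius (not from the dataset)
max_t = [35.0, 28.0, 41.0, 31.5, 38.5, 29.0, 34.5, 28.5, 38.0, 32.0]
cuts = max_diff_cut_points(max_t, n_bins=5)
labels = [assign_bin(t, cuts) for t in max_t]
print(cuts)
print(labels)
```

With 5 bins the sketch selects the 4 largest gaps, so values that cluster together share a label and the boundaries fall where the data are sparse. Features with many repeated values may yield fewer distinct gaps than requested boundaries, a case this sketch does not handle.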

Rules (IF-THEN rules) are the most expressive and human-readable representations for learned hypotheses, Anderson and Moore (1998). Rule induction models allow the interrelationships among features to influence the classification output, Chu and Lin (2005). Hence they are said to be more sensitive than classifiers that consider each feature independently with respect to the target feature. The rule induction uses the following CN2 classifier procedure for extracting rules from the dataset:
