Appraisal of genetic algorithm and its application in 0-1 knapsack problem

Real-life problems involve many uncertainties and complexities. Traditional methods applied to such intricate problems have often failed to offer robust solutions, and in recent times researchers have looked beyond classical techniques. There is a paradigm shift from classical techniques to intelligent techniques modelled on biological systems and evolutionary biology. Genetic Algorithm (GA) has been recognized as a promising technique capable of handling uncertainty and providing optimized solutions in diverse areas, especially in homes, offices, stores and industrial operations. This research focuses on the appraisal of GA and its application to a real-life problem. The scenario considered is the application of GA to the 0-1 knapsack problem. From the solution of the GA model, it was observed that no combination of items gives exactly the 35 kg capacity of the bag; the closest combinations in the solution model weigh 34 kg and 36 kg. Since the capacity of the bag is 35 kg, the feasible, near-optimal load the bag can carry is 34 kg at a benefit of 16. Additional load beyond 34 kg could lead to warping of the bag.


INTRODUCTION
Genetic algorithm (GA) is a type of optimization algorithm that is often categorized as a global search heuristic. It minimizes or maximizes an objective function by mimicking the process of natural evolution. As an optimization technique, GA takes its root from nature, and the terminology associated with it is biological. The basic components of GA include: a fitness function for optimization; a population of chromosomes; selection of the chromosomes that reproduce; crossover to produce the next generation of chromosomes; and random mutation of chromosomes in the new generation. The fitness function is the function to be optimized by the solution the GA produces. It is one of the most important parts of the algorithm and must therefore be defined carefully. The selection operator chooses, on a probabilistic basis, the chromosomes to be used for reproduction: the fitter a chromosome, the more likely it is to be selected.
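The selection rule described above (fitter chromosomes are more likely to be chosen) is commonly realised as roulette-wheel, or fitness-proportionate, selection. The following is a minimal Python sketch of that scheme, not necessarily the selection operator used in this study. The function name `roulette_wheel_select` and the assumption of non-negative fitness values are illustrative.

```python
import random

def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one chromosome with probability proportional to its fitness.

    Assumes every fitness value is non-negative (an assumption of this sketch).
    """
    total = sum(fitnesses)
    if total <= 0:
        # No fitness information at all: fall back to a uniform pick.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return chromosome
    return population[-1]  # guard against floating-point round-off

# Usage: the chromosome with fitness 9 is drawn far more often than the one with fitness 1.
rng = random.Random(0)
draws = [roulette_wheel_select(["weak", "strong"], [1, 9], rng) for _ in range(2000)]
print(draws.count("strong") > draws.count("weak"))
```

A chromosome with zero fitness is never selected under this scheme, which is one reason practical implementations often rescale or rank the fitness values first.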

Cross-over Operation in GA
The crossover operation is used to create offspring. It swaps a subsequence between two of the selected chromosomes to create two offspring. Several strategies can be employed in the crossover process, including single-point crossover, as shown in Figure 1, and two-point or multipoint crossover, as shown in Figure 2.
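The two crossover strategies can be illustrated on bit-string chromosomes. This is a minimal Python sketch; the function names and the list-of-bits representation are assumptions made for illustration.

```python
import random

def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length parents from index `point` onward."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

def two_point_crossover(parent_a, parent_b, start, end):
    """Swap the middle segment [start, end) between two equal-length parents."""
    child_1 = parent_a[:start] + parent_b[start:end] + parent_a[end:]
    child_2 = parent_b[:start] + parent_a[start:end] + parent_b[end:]
    return child_1, child_2

a, b = [1, 1, 1, 1, 1], [0, 0, 0, 0, 0]
cut = random.randint(1, len(a) - 1)  # random cut point, never at either end
print(single_point_crossover(a, b, cut))
print(two_point_crossover(a, b, 1, 3))
```

In both cases the two offspring together contain exactly the genes of the two parents; only their arrangement changes.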
Mutation introduces new genetic material by randomly altering a chromosome, so that a new offspring is generated from a single parent, as shown in Figure 3. Genetic algorithm has wide application in engineering and the sciences. Areas of application include the knapsack problem, aircraft design, communication networks, gas pipelines, poker, encryption and decryption, and the travelling salesman problem. The application of genetic algorithm to the knapsack problem is considered in this research.
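For bit-string chromosomes, mutation from a single parent is often implemented as bit-flip mutation. The sketch below assumes that operator and a hypothetical `rate` parameter (the per-gene flip probability); the source does not state which mutation operator or rate was used.

```python
import random

def bit_flip_mutation(chromosome, rate, rng=random):
    """Return a copy in which each bit is flipped independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

parent = [0, 1, 1, 0, 1]
child = bit_flip_mutation(parent, rate=0.2, rng=random.Random(0))
print(parent, child)  # the parent list itself is left unchanged
```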

REVIEW OF LITERATURE
Genetic algorithms (GAs) are optimization algorithms, that is, methods for finding the optimal solution(s) of a computational problem by maximizing or minimizing a particular function. They are a branch of the field of evolutionary computation and mimic the biological processes of reproduction and natural selection to arrive at the 'fittest' solutions [27]. The genetic algorithm was introduced by John Holland in 1975 with the help of his colleagues and students. GA is often used as an optimizer of feature weight vectors in either a linear or a nonlinear fashion. The selection and quality of the pattern features influence the success of subsequent pattern classification, which requires that objects be described by a set of measurable features [21].
Figure 4 represents the basic structure of the genetic algorithm. Each iteration of the cycle produces a new generation of chromosomes, and the cycle is repeated until there is at least one highly fit chromosome in the population.
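The cycle described above (evaluate, select, cross over, mutate, repeat until a highly fit chromosome appears) can be sketched as a short loop. This is a minimal Python illustration under stated assumptions, not the implementation used in this study. The function name `run_ga`, all parameter values, the use of fitness-proportionate selection and the toy fitness function (the number of 1 bits) are assumptions.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, max_generations=50,
           crossover_rate=0.9, mutation_rate=0.02, target=None, seed=0):
    """Evolve bit-string chromosomes; `fitness` must return a non-negative number."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        if target is not None and fitness(best) >= target:
            break  # a sufficiently fit chromosome exists, so stop cycling
        # Small offset keeps the selection weights valid if every score is 0.
        weights = [fitness(c) + 1e-9 for c in population]
        next_generation = []
        while len(next_generation) < pop_size:
            parent_1, parent_2 = rng.choices(population, weights=weights, k=2)
            child_1, child_2 = parent_1[:], parent_2[:]
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_bits - 1)  # single-point crossover
                child_1 = parent_1[:point] + parent_2[point:]
                child_2 = parent_2[:point] + parent_1[point:]
            for child in (child_1, child_2):
                # Bit-flip mutation applied gene by gene.
                next_generation.append(
                    [1 - g if rng.random() < mutation_rate else g for g in child])
        population = next_generation[:pop_size]
        best = max(population + [best], key=fitness)
    return best

# Toy usage: maximise the number of 1 bits in a 10-bit chromosome.
best = run_ga(fitness=sum, n_bits=10, target=10)
print(best, sum(best))
```

Keeping track of `best` across generations means the returned chromosome is never worse than the fittest member of the initial population.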

Research on Genetic Algorithm
In the studies reported in [23][24][25], the effectiveness of three types of artificial neural network (ANN) for bearing fault detection was compared: probabilistic neural networks (PNN), the multilayer perceptron (MLP) and the radial basis function (RBF) network. Genetic algorithms were used to optimise parameters such as the width of the RBF and PNN and the number of nodes in the hidden layer of the MLP, along with the selection of input features, and the results gave the relative effectiveness of the three classifiers in determining rolling bearing condition. [5] applied the same method, with a different classifier, to gearbox diagnosis. [18,29] proposed a framework that combines various existing feature selection methods based on feature subset selection by GA. The advantages of this approach are that it searches for small feature subsets that perform well for an inductive learning algorithm and that it accommodates multiple feature selection criteria.
In [16], R.E. Marmelstein gave detailed information on how GA can be applied to data mining and reiterated its importance to machine learning algorithms. Interest in genetic algorithms (GAs) has grown over time because of their success in large-scale search and optimization problems. In [16], GAs were used to improve the performance of data mining clustering and classification algorithms and to examine strategies for improving these approaches. Even though a band-pass filter can be incorporated with envelope analysis for early identification of bearing defects, no consensus has been reached as to which passband is optimal.
[10] studied the ratio of passbands to defect frequency components, thereby evaluating the degree of defectiveness in bearings, as specified by the application of GA with reference to residual frequency components. In related work, [6] proposed a method that combined conventional and intelligent search techniques, which resulted in high performance in fault detection. They used the shock response spectrum (SRS) together with wavelet filtering, a real coded genetic algorithm (RCGA) and band-pass filtering for feature extraction.
One main problem with ANNs is the selection of the best inputs, which makes it difficult to create compact and accurate networks that require little pre-processing, as in the problem of face recognition. [7,8] tackled a similar problem with GA by selecting the most relevant input features from large sets of possible features in machine condition monitoring contexts. Feature selection is a vital field in machine learning, since good feature selection improves the reliability, accuracy and effectiveness of a fault diagnostics system. In later research, [9] advanced the generalization performance of the support vector machine (SVM) using GA and succeeded in improving SVM training ability where only limited training data are available. [23] presented work on gear fault detection based on two-class (normal or fault) recognition using ANN and SVMs, with GA used for the selection of input features. SVM was found to classify more accurately than ANN, both with and without GA. SVM is based on the idea of structural risk minimization. A method based on two nested real-valued genetic algorithms (NRGA) was also proposed. With it the parameters of the SVM were optimized efficiently, and parameter optimization was sped up by orders of magnitude compared with traditional methods that optimize all the parameters simultaneously [3,13].
A problem often associated with SVM is the choice of an optimal kernel and the optimization of its parameters during learning. Genetic algorithm has been proposed as an approach to this parameter optimization problem. The GA-based approach was compared with the grid algorithm, the traditional method for parameter setting, in experiments on several benchmark data sets [22].
[1] performed multi-stage feature selection by selecting the best possible condition parameters in the time, frequency and time-frequency domains for gearboxes. The features selected at each stage were augmented and fed as input into a neural network classifier for the subsequent step, while a new subset of feature candidates was considered by the selection process. Genetic algorithm is often used to find the best solutions of a selected problem. [12] suggested two ways in which GA can be used to design a multiple-classifier system. In the first, disjoint feature subsets were used for the individual classifiers, whereas in the second, possibly overlapping feature subsets and the types of the individual classifiers were selected. When the multiple-classifier systems designed by the two suggested GAs were compared with the best feature subset found by a GA for an individual classifier and with the sequential backward selection (SBS) method, the former produced the minimum training error rate in all the experiments performed. [31] used genetic algorithm to select feature subsets, achieving multi-criteria optimization in terms of generalization accuracy and the costs associated with the features.
Most techniques developed by researchers are intended for single bearing systems. However, Wulandhari [18] developed a hybrid of adaptive genetic algorithms (AGAs) with back propagation neural networks (BPNNs), called AGAs-BPNNs, and applied it to the diagnosis of multiple bearing systems. The technique obtained high classification accuracy and reduced the number of iterations compared with BPNNs.
Apart from the use of GA for feature selection and optimization, compact genetic algorithms (cGAs) contribute to several evolutionary methods by reducing memory requirements. A cGA processes a probability vector, emulating the evolution of a population through specific update rules. [17] overcame problems associated with cGAs by using a real-valued solution coding in a new variant.
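The idea of replacing a stored population with a probability vector can be illustrated with the commonly described binary cGA update rule. The sketch below is an illustration of that basic rule only; it is not the real-valued variant of [17]. The function name `compact_ga`, the `virtual_pop` step size and the toy fitness function are assumptions.

```python
import random

def compact_ga(fitness, n_bits, virtual_pop=50, max_iters=20000, seed=0):
    """Binary compact GA: evolve one probability per bit instead of a population."""
    rng = random.Random(seed)
    prob = [0.5] * n_bits      # prob[i] = chance that bit i is sampled as 1
    step = 1.0 / virtual_pop   # update size emulating a population of this size

    def sample():
        return [1 if rng.random() < p else 0 for p in prob]

    for _ in range(max_iters):
        a, b = sample(), sample()
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:
                # Shift the probability toward the winner's value for this bit.
                delta = step if winner[i] == 1 else -step
                prob[i] = min(1.0, max(0.0, prob[i] + delta))
        if all(p in (0.0, 1.0) for p in prob):
            break  # every bit has converged
    solution = [1 if p >= 0.5 else 0 for p in prob]
    return solution, prob

solution, prob = compact_ga(fitness=sum, n_bits=8)
print(solution)
```

Only `n_bits` probabilities are stored regardless of the emulated population size, which is the source of the memory saving mentioned above.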
In a study to overcome the design problem of the Fuzzy ARTMAP neural network, [14] used a genetic algorithm heuristic to find a solution to the associated multiobjective optimization problem. They used a hierarchical parallel architecture to achieve their goal, thereby reducing the convergence time of the GA used.
GA can be applied to the optimization of process controllers using natural operators. Well-established methodologies for realizing the workability and applicability of genetic algorithms in process control applications were discussed by Malhotra et al. In their work, GA was applied to the speed control of a DC servo motor, the speed control of a gas turbine and the direct torque control of an induction motor drive for the optimization of control parameters. Genetic algorithm has also been used for the control system design of a boiler-turbine plant. The ability of the GA to develop a state feedback controller for a non-linear multi-input/multi-output (MIMO) plant and a proportional-integral (PI) controller model was shown by [2]. In other work, [28, 20, 32] gave a method for designing finite impulse response (FIR) filters that is rapid and automatic and that gives a filter realisation of near-minimal computational complexity.
GA has been used to solve shortest path problems and tested on randomly generated problems of different sizes, with 6 to 70 nodes and 10 to 211 edges [4]. A method based on priority encoding was used to solve the shortest path variety of network optimisation problems by both exact and approximate methods. GA was also used effectively by [11] for the optimization of uncertain functions and their applications. Two formulations for optimization problems under uncertainty were discussed in their work: the memory-based fitness evaluation GA (MFEGA) and the GA using sub-populations (GASP).

Knapsack Problem
The knapsack problem concerns the selection of items to be carried in a bag of limited strength or capacity. It uses combinatorial optimization to search for the best combination of items. Two attributes of each item are usually considered when solving the knapsack problem: the value or benefit, which reflects the importance of the item, and the volume or weight, which determines whether the item fits into the bag.
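Mathematically, the two attributes can be represented as a model in which the goal is to maximize

$$\sum_{i=1}^{N} B_i X_i \qquad (1)$$

subject to the constraint

$$\sum_{i=1}^{N} V_i X_i \le V \qquad (2)$$

where $N$ = number of items that may potentially be placed in the knapsack, $V_i$ = positive integer volume of item $i$, $B_i$ = positive integer benefit of item $i$, $X_i \in \{0, 1\}$ indicates whether item $i$ is placed in the knapsack, and $V$ = capacity of the knapsack.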
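The objective and the capacity constraint translate directly into code. The following minimal Python sketch uses a hypothetical three-item instance; the function names, the item values and the capacity of 10 are illustrative assumptions and are not the data of this study.

```python
def total_benefit(x, benefits):
    """Objective: sum of B_i * X_i over all items."""
    return sum(b * xi for b, xi in zip(benefits, x))

def total_volume(x, volumes):
    """Left-hand side of the constraint: sum of V_i * X_i over all items."""
    return sum(v * xi for v, xi in zip(volumes, x))

def is_feasible(x, volumes, capacity):
    """True when the selected items fit within the knapsack capacity."""
    return total_volume(x, volumes) <= capacity

# Hypothetical instance (values chosen only for illustration).
benefits = [4, 2, 7]
volumes = [3, 5, 6]
capacity = 10
x = [1, 0, 1]  # take items 1 and 3, leave item 2

print(total_benefit(x, benefits))           # 4 + 7 = 11
print(total_volume(x, volumes))             # 3 + 6 = 9
print(is_feasible(x, volumes, capacity))    # 9 <= 10
```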

Fig. 5. Knapsack Problem representation
To solve the knapsack problem, an assumption is made about the capacity (weight) the knapsack can carry, as presented in Figure 5. It is assumed that the capacity of the bag is 35 kg, meaning that the maximum total weight of items the bag can contain is 35 kg. Consider an individual buying five items from a shopping mall, with the weight and benefit of each item assigned as shown in Figure 5 and Table 1. In Table 1, A represents tangerine, B grape, C mango, D apple and E orange. To fit the available items into the knapsack effectively, models are generated for the constraints.
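The problem being considered involves two quantities, weights and benefits. Expanding eq. 1 and 2 for N = 5, the goal is to maximize

$$\sum_{i=1}^{5} B_i X_i = B_1 X_1 + B_2 X_2 + B_3 X_3 + B_4 X_4 + B_5 X_5 \qquad (3)$$

subject to the constraints

$$\sum_{i=1}^{5} V_i X_i = V_1 X_1 + V_2 X_2 + V_3 X_3 + V_4 X_4 + V_5 X_5 \le 35 \qquad (4)$$

and $X_i \in \{0, 1\}$ for $i = 1, 2, \ldots, N$, since in GA chromosomes are usually represented in bits of 0 and 1. For this problem the number of possible combinations is $2^n$; with n = 5, the levels of iteration = 32.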
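The count of 2^5 = 32 candidate chromosomes can be checked by direct enumeration. In this minimal Python sketch the labelling of each chromosome by the decimal value of its bit string (C0 to C31) is an assumption made for illustration.

```python
from itertools import product

N_ITEMS = 5

# Every 5-bit string, in ascending binary order: "00000", "00001", ..., "11111".
chromosomes = ["".join(bits) for bits in product("01", repeat=N_ITEMS)]

print(len(chromosomes))  # 2**5 = 32
for index in (1, 15, 29):
    print(f"C{index} = {chromosomes[index]}")
```

Exhaustive enumeration is practical for five items, but the number of strings doubles with every additional item, which is the motivation for a heuristic search such as GA on larger instances.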

RESULTS AND DISCUSSION
The solution space is the chromosome encoding Ci. The 0-1 concept is applied: a gene value of 0 represents the absence of an item from the knapsack, while 1 signifies the presence of the item in the bag. Five bits are required to represent the chromosome encoding.
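The 5-bit encoding can be decoded back into a list of items. The sketch below assumes that the leftmost bit stands for item A and the rightmost for item E, and that chromosome Ck is the 5-bit binary form of k. Both conventions are inferred from the set spaces quoted in this section (for example, the first containing only E and the fifteenth containing B, C, D and E) rather than stated explicitly.

```python
ITEMS = ["A", "B", "C", "D", "E"]  # assumed bit order: leftmost bit is item A

def chromosome_from_index(k):
    """Return chromosome Ck as a 5-bit string, e.g. 15 -> '01111'."""
    return format(k, "05b")

def decode(chromosome):
    """List the items whose gene is 1 in a 5-bit chromosome string."""
    return [item for item, gene in zip(ITEMS, chromosome) if gene == "1"]

for k in (1, 3, 7, 15):
    bits = chromosome_from_index(k)
    print(f"C{k} = {bits} -> {decode(bits)}")
```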
An initial population is created, with chromosomes generated randomly. The fitness function is used to evaluate how good a solution is. From Table 2 and Figure 6, there are 32 set spaces. The first set space contains item E; the value or benefit of the knapsack is 3 and its weight is 8 kg. The second set space contains item D; the value of the knapsack is 6 and its weight is 4 kg. The third set space contains items D and E; the value is 9 and the weight is 12 kg. The fourth set space contains item C; the value is 5 and the weight is 10 kg. The seventh set space contains items C, D and E; the value is 14 and the weight is 22 kg, and so on. Since mutation introduces diversity into the next population so that the search algorithm does not get stuck at local maxima, it is important to pay attention to C15, C29 and C30, which tend to have the highest probability of being selected in the next generation. A close look at the fifteenth set space in Table 2 reveals the presence of items B, C, D and E; the value of the knapsack is 16 and its weight is 34 kg. From the solution of the model, it was observed that no combination gives exactly the weight the bag can carry; the closest are set spaces 15 and 29, where the weights of the items are 34 kg and 36 kg respectively. Hence the feasible solution is 34 kg, for which the benefit is 16. A load of 36 kg is beyond the capacity of the bag; if the items are forced into the bag, it is likely to be overstretched or torn.
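The values quoted above can be reproduced with a small evaluation routine. In this hedged Python sketch the benefits and weights of C, D and E are taken directly from the text; those of B and the weight of A are inferred from the quoted totals for set spaces 15 and 29, assuming Ck is the 5-bit binary form of k with the leftmost bit standing for A. The benefit of A cannot be recovered from the text, so the value 1 is a placeholder assumption that affects only chromosomes containing A.

```python
# (benefit, weight in kg) per item.
# C, D, E: quoted in the text. B and A's weight: inferred from C15 and C29.
# A's benefit: NOT recoverable from the text; 1 is a placeholder assumption.
ITEMS = {"A": (1, 6), "B": (2, 12), "C": (5, 10), "D": (6, 4), "E": (3, 8)}
CAPACITY_KG = 35

def evaluate(k):
    """Return (benefit, weight, feasible) for chromosome Ck."""
    bits = format(k, "05b")
    chosen = [name for name, bit in zip("ABCDE", bits) if bit == "1"]
    benefit = sum(ITEMS[name][0] for name in chosen)
    weight = sum(ITEMS[name][1] for name in chosen)
    return benefit, weight, weight <= CAPACITY_KG

for k in (1, 2, 3, 4, 7, 15, 29):
    benefit, weight, feasible = evaluate(k)
    print(f"C{k}: benefit={benefit}, weight={weight} kg, feasible={feasible}")
```

Under these assumptions C15 weighs 34 kg and is feasible, while C29 weighs 36 kg and violates the 35 kg limit. The benefit printed for C29 depends on the placeholder and should not be read as a result of this study.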

CONCLUSION
The present study shows that Genetic Algorithm (GA) is a stochastic technique and a good method for solving complex combinatorial problems. In the study, GA has been used to find a solution to the 0-1 Knapsack Problem (KP). The technique searches only part of the exponentially large solution space of the KP, which makes it possible to find approximately optimal solutions for an NP-hard problem, although optimality is not guaranteed. From the solution of the GA model, it was observed that no combination of items gives exactly the 35 kg capacity of the bag; the closest combinations in the solution model weigh 34 kg and 36 kg. Since the capacity of the bag is 35 kg, the feasible, near-optimal load the bag can carry is 34 kg at a benefit of 16. Additional load beyond 34 kg could lead to tearing of the bag. For future research, the authors recommend the application of genetic algorithm to process optimization in engineering fields, such as the optimization of gas pipelines and quality control.

GA - Genetic algorithm
KP - Knapsack problem
ANN - Artificial neural network
PNN - Probabilistic neural networks
MLP - Multilayer perceptron
RBF - Radial basis function
SBS - Sequential backward selection
SRS - Shock response spectrum
RCGA - Real coded genetic algorithm
SVM - Support vector machine
AGA - Adaptive genetic algorithms
BPNNs - Back propagation neural networks
MIMO - Multi input/multi-output
PI - Proportional-integral