The Prediction of the Man-Hour in Aircraft Assembly Based on Support Vector Machine Particle Swarm Optimization

ABSTRACT: Aircraft assembly, a representative manufacturing activity, lacks an effective method for forecasting man-hours, and the accuracy of existing methods is generally low. Based on an analysis of the characteristics of aircraft assembly, this study proposes a forecasting model based on a support vector machine (SVM) optimized by particle swarm optimization (PSO). The model gives a quantitative prediction of the man-hours of each process in aircraft assembly. First, the assembly work is decomposed using the concept of the work breakdown structure. Next, the process parameters related to man-hours are listed and a correlation analysis is carried out on the historical data. Parameters with a high contribution are then used as inputs to the forecasting model, which takes the process as the smallest unit of study. Its performance is compared with that of a back propagation neural network. The automatic drilling & riveting process is used as an example to present and validate the model. The experimental results show that the SVM has high forecasting precision and a good fit, and is therefore suitable for small-sample prediction. After optimization, the model predicts the man-hours of assembly work quickly while maintaining sufficient accuracy.


INTRODUCTION
Because of the large size, complex shape and numerous parts of an aircraft, assembly accounts for more than half of the total aircraft manufacturing work (Enming, 2005). Aircraft assembly is the process of joining a large number of parts in a prescribed order, gradually building them into components, forge pieces and sub-units. Finally, the components are joined into a complete aircraft. Aircraft assembly is therefore a very important part of the large passenger aircraft industry and needs to be carried out strictly in accordance with the production plan, which covers not only the allocation of production sites, equipment and resources, but also the arrangement of man-hours. The man-hour quota directly affects working time and equipment utilization, and it is also the basic unit for calculating cost (Bin and Zuhua, 2006). It is widely used as a tool for cost management in manufacturing enterprises. Man-hour prediction, as the core of man-hour quota management, is therefore directly related to economic accounting, production schedule control, resource optimization, shortening of production cycles, cost control and product quotation. Ultimately, it improves the labor productivity of enterprises and enhances their market competitiveness (Chao and Danchen, 2010).
Recognizing this, advanced international aircraft manufacturers have moved from arranging production plans passively to predicting man-hours actively. Through this analysis, they determine production planning and scheduling. Because the civil aircraft industry in China started late, there is a big gap in man-hour prediction between China and the international level. The overall utilization of workshop equipment is less than 40% (Changqing et al., 2011). At present, man-hours are mainly estimated from workers' experience, analogy and other methods based on the relevant technical files, and the production hours have to be calculated step by step according to the detailed production process. This approach is slow, inefficient, error-prone and highly dependent on personal experience, which causes uncertainty in production planning and scheduling (Yajie et al., 2013). As a result, it cannot meet the needs of modern, scientific and fine-grained production management, and it seriously affects the production schedule and production efficiency.
Industries including computing, shipbuilding, computer numerical control (CNC) machining and aerospace have devoted considerable effort to developing man-hour management software. Man-hour prediction has been treated as an important means of improving production efficiency, optimizing resource management and shortening the manufacturing cycle. The most representative programs are Timesheet, AceTeamwork, Appfarm, G TimeSheet and Replicon Web TimeSheet. However, most of these programs are integrated with enterprise resource planning (ERP) systems, manufacturing execution systems (MES) and computer aided process planning (CAPP). Their core function is to manage and share documents and information during production, so their man-hour management is deficient. So far, the time forecasting methods used in manufacturing mainly include the standard data method, simulation based on numerical control (NC) programs, and artificial intelligence methods. Boothroyd (1994) took one injection molding as the standard and estimated other injection moldings from their relative values. Wang et al. (2002) used a triangular fuzzy number to estimate ready time and solved the ready time scheduling problem. Siller (2006) predicted man-hours by simulating the NC program, focusing on the cycle time of high-speed milling. Coelho (2010) also presented a practical mechanistic method, based on process simulation, for estimating milling time when machining freeform geometries. Bustillo and Correa (2012) presented a predictive model that uses artificial intelligence to optimize deep drilling operations under high speed conditions for the manufacture of steel components. Çaydaş and Ekici (2012) used a feed-forward neural network and a support vector machine (SVM) to predict surface roughness. Two soft computing techniques, namely a neuro fuzzy logic technique and a support vector regression technique, were used to assess the remaining useful life (RUL) of cutting tools (Gokulachandran and Mohandas, 2013). Li et al. (2012) tested various regression methods with six dependent variables to enhance the production yield in the color filter (CF) manufacturing process. In their study, the Support Vector Machine for Regression (SVR) model was found to be the most appropriate for forecasting manufacturing performance. Among these methods, SVM has few adjustable parameters, fast computation, high accuracy, good robustness and good generalization ability (Cao and Tay, 2003). Even without much background information, it can achieve high prediction accuracy (Kotsiantis, 2007). It also deals effectively with non-linearity, high dimensionality, local minima and other issues.
Based on an extensive study of time prediction methods and of the present situation in manufacturing, aircraft assembly work is broken down according to the concept of the work breakdown structure (WBS). We then list the related process parameters. Using the process as the smallest granularity, a predictive model based on SVM is established. One important process, automatic drilling & riveting, is used as an example to verify the model. The predictive performance is also compared with that of a back propagation (BP) neural network. Finally, because grid search is time-consuming, we use the particle swarm optimization (PSO) algorithm and the genetic algorithm (GA) to optimize the predictive model.

PROBLEM AND ALGORITHM DESCRIPTION
Man-hour has different meanings in different situations. In this paper, from the perspective of work assignment, it refers to a combined unit of human labor and time required to carry out an individual task in the production process (Hur et al., 2013). For instance, suppose a task involves cutting a steel plate of 1,000 x 800 x 5 mm and needs two persons to complete it. If each person spends 10 minutes, the man-hour of the task is 20 (2 x 10) minutes. Because the process contains the basic man-hour data and is the foundation for time management and prediction, the minimum unit of our research is the process.
A complex project can be decomposed by WBS into a series of clearly detailed subprojects, which groups the project elements in a deliverable-oriented way. These elements organize and define the work scope of the whole project. The subprojects decomposed by WBS cover labor and non-labor resources, including man-hour information, which provides an important foundation for schedule planning, resource requirements, cost budgeting, purchasing plans etc. A top-down approach is applied in this paper to complete the WBS decomposition. Namely, aircraft assembly is decomposed in the order project, sortie, workstation, station, process. Then we can predict the man-hour of each process. The total time of a project is obtained by accumulating the man-hours of all its processes.
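As a minimal sketch of the accumulation step, assuming each WBS element is a tree node and each leaf process already has a predicted man-hour, the total can be rolled up recursively. The class name `WBSNode`, the element names and the numbers are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WBSNode:
    """One element of a work breakdown structure (hypothetical helper)."""
    name: str
    man_hour: float = 0.0  # only meaningful on leaf nodes (processes)
    children: List["WBSNode"] = field(default_factory=list)

    def total_man_hour(self) -> float:
        # A leaf is a process; its man-hour would come from the predictor.
        if not self.children:
            return self.man_hour
        # Every higher level simply accumulates its children.
        return sum(child.total_man_hour() for child in self.children)


# Illustrative numbers only, not data from the study.
station_a = WBSNode("station A", children=[
    WBSNode("automatic drilling & riveting", man_hour=3.5),
    WBSNode("sealing", man_hour=1.25),
])
station_b = WBSNode("station B", children=[WBSNode("inspection", man_hour=0.75)])
workstation = WBSNode("workstation 1", children=[station_a, station_b])
project = WBSNode("project", children=[WBSNode("sortie 1", children=[workstation])])

total = project.total_man_hour()
print(total)
```

Keeping the man-hour only on the leaves mirrors the choice of the process as the smallest unit: any higher-level figure is derived rather than stored.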

SUPPORT VECTOR MACHINES
SVM is built on statistical learning theory (Cristianini and Shawe-Taylor, 2000). It is a machine learning method that applies the Vapnik-Chervonenkis (VC) dimension and the principle of structural risk minimization. It largely overcomes the "curse of dimensionality" and "over-learning" (overfitting) problems of traditional machine learning methods such as neural networks (Peng and Wang, 2009; Choi, 2009). It has distinct advantages in solving small-sample, non-linear and high-dimensional recognition problems, and it is widely used in pattern recognition, regression analysis, function estimation, time series forecasting and other fields (Ge et al., 2004; Guo and Li, 2003).
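We establish a training set containing n training samples, $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$. Here $x_i \in R^d$ is the i-th input column vector of the training samples; in this study, its components are a series of variables associated with the man-hour. $y_i \in R$ is the corresponding output value, namely the process' man-hour, and d is the number of rows of $x_i$. Man-hour forecasting is a typical non-linear problem. The regression function in the high dimensional space, which represents the relationship between the process' man-hour and the input variables, can be expressed as:
$$f(x) = \omega \cdot \varphi(x) + b \qquad (1)$$
where $\varphi(x)$ is the non-linear mapping into the high dimensional feature space, and $\omega$ and $b$ are the parameters to be estimated. $\varepsilon$ limits the error of the regression function, and slack variables $\xi_i, \xi_i^*$ ($\xi_i \ge 0$, $\xi_i^* \ge 0$) are introduced. Namely,
$$\min_{\omega, b, \xi, \xi^*} \ \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - \omega \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \quad \omega \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0$$
where C is the penalty factor. To simplify the learning process of the SVM, the problem can be transformed as follows by using the Lagrange function:
$$\max_{\alpha, \alpha^*} \ -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^*) \qquad (2)$$
$$\text{s.t.} \quad \sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C \qquad (3)$$
where $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ (the inner product of $\varphi(x_i)$ and $\varphi(x_j)$) is the kernel function.
We applied the radial basis function (RBF) kernel $K(x, y) = \exp\left(-\|x - y\|^2 / (2\sigma^2)\right)$ in our study. Assuming that the optimal solution of the problem in Eqs. 2 and 3 is $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]$, $\alpha^* = [\alpha_1^*, \alpha_2^*, \ldots, \alpha_n^*]$, the regression function becomes $f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)K(x_i, x) + b$.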
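A minimal sketch of how a trained support vector regression model evaluates a new input, assuming the common RBF form K(x, y) = exp(-||x - y||^2 / (2 sigma^2)) and the standard decision function f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b. The function names and the numeric values are hypothetical; the dual coefficients would normally come from a solver, which is not shown here.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) * K(x_i, x) + b.

    dual_coefs[i] is assumed to hold the difference alpha_i - alpha_i*
    for the i-th support vector.
    """
    return sum(
        coef * rbf_kernel(sv, x, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + b


# Illustrative values only: two support vectors in a 2-D input space.
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [0.8, -0.3]
prediction = svr_predict([0.0, 0.0], svs, coefs, b=0.5, sigma=1.0)
print(prediction)
```

With this form of the kernel, a larger sigma makes distant support vectors contribute more, which gives a smoother regression function.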

PARTICLE SWARM OPTIMIZATION
Particle swarm optimization (PSO) was first proposed by Kennedy and Eberhart (1995) and is based on research into the group behavior of birds. PSO treats each individual in the n-dimensional search space as a particle without weight or volume, which flies through the search space at a certain speed. The flight speed is adjusted dynamically according to the particle's own flight experience and the flight experience of the group.
In an n-dimensional space, the position and speed of the i-th particle can be expressed respectively as:
$$X_i = (x_{i1}, x_{i2}, \ldots, x_{in}), \qquad V_i = (v_{i1}, v_{i2}, \ldots, v_{in})$$
The optimal position that each particle has experienced is $P_i = (P_{i1}, P_{i2}, \ldots, P_{in})$. The global optimal position is $P_{gbest} = (P_{g1}, P_{g2}, \ldots, P_{gn})$, with
$$P_g(t) \in \{P_0(t), \ldots, P_s(t)\} \ \big|\ f(P_g(t)) = \min\{f(P_0(t)), \ldots, f(P_s(t))\}$$
that is, $P_g(t)$ is the best location of a population containing s particles.
In the SVM regression function, $\varphi(x)$ is the non-linear mapping into the high dimensional feature space, and $\omega$ and $b$ are the parameters to be estimated.
The relationship among the factors that drive the man-hour is complex, which makes forecasting a typical non-linear problem. As input parameters of the predictive model, the man-hour driving factors directly affect its predictive accuracy, so we need to analyze their influence in terms of contribution and relevance. This step carries out a correlation analysis of the related factors using Statistical Product and Service Solutions (SPSS), an IBM software package widely used in statistical analysis, data mining, prediction analysis and decision support.

Sample data acquisition and processing
The range of variation of the data in the sample set has an important influence on model training. The sample set should reflect the characteristics of a process as fully as possible. The data are observed and collected, and invalid data are excluded. Because the data differ greatly in magnitude and unit, they should be normalized before the regression model is established.

Model selection and parameter determination
When building an SVM, the choice of kernel function is directly related to the performance of the model. We use the radial basis function (RBF), which is widely used owing to its good performance. The penalty factor C and the kernel parameter g directly influence the accuracy of the predictive model. C adjusts the range of the learning machine's confidence interval in a given data subspace. The kernel parameter g affects the complexity of the distribution of the sample data in the subspace and determines the minimum error (Keerthi and Lin, 2003). In this study, cross validation is first adopted to determine C and g. We then apply PSO for further optimization to obtain the best combination of C and g.

Sample training and model establishment
Generally, for a given time prediction problem, the sample data {x 1 , x 2 , x 3 , ..., x r } are divided into two parts. The first m data are used as the training sample to build the predictive model and the remaining r-m data are used for the prediction test. Model building is actually the process of solving the regression function (Eq. 2) by SVM.

Application and forecast
Arrange the input data to be forecast in the specified format. Then predict them and carry out error analysis with the predictive model obtained in the step "Sample training and model establishment".
The updating formulas of the i-th particle's velocity and position, where $x_{ij}$ and $v_{ij}$ are the j-th components of its position and velocity, are:
a) velocity updating formula:
$$v_{ij}(t+1) = w\,v_{ij}(t) + c_1 r_1 \left(P_{ij}(t) - x_{ij}(t)\right) + c_2 r_2 \left(P_{gj}(t) - x_{ij}(t)\right)$$
b) position updating formula:
$$x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1)$$
where w is the inertia coefficient, whose value lies in (0, 1); $c_1$ is the cognitive coefficient; $c_2$ is the social learning coefficient. The values of $c_1$ and $c_2$ are usually in (1, 2); $r_1$ and $r_2$ are random numbers in (0, 1).
The parameters C and g are treated as the position of a particle in the search space. Under this assumption, parameter optimization amounts to finding the global optimal position of all particles. The SVM model can then be optimized by exploiting the fast global search of PSO.
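The following is a minimal sketch of PSO applied to a two-parameter search, using the usual velocity and position updates with inertia w, cognitive coefficient c1 and social coefficient c2. It is not the implementation used in the study. The objective here is a hypothetical smooth stand-in for the cross-validated error of an SVM as a function of (log2 C, log2 g); in practice that function would train and validate the model. The names `pso_minimize`, `pop`, `maxgen` and `stand_in_error` are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop: int = 20,
    maxgen: int = 100,
    w: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    vs = [[0.0] * dim for _ in range(pop)]
    pbest = [x[:] for x in xs]
    pbest_val = [objective(x) for x in xs]
    g = min(range(pop), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(maxgen):
        for i in range(pop):
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vs[i][j] = (w * vs[i][j]
                            + c1 * r1 * (pbest[i][j] - xs[i][j])
                            + c2 * r2 * (gbest[j] - xs[i][j]))
                # Position update, clamped to the search box.
                xs[i][j] = min(hi, max(lo, xs[i][j] + vs[i][j]))
            val = objective(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val


def stand_in_error(p: Sequence[float]) -> float:
    # Hypothetical error surface with its minimum at log2 C = 3, log2 g = -2.
    return (p[0] - 3.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_val = pso_minimize(stand_in_error, bounds=[(-5.0, 10.0), (-10.0, 5.0)])
print(best_pos, best_val)
```

Searching in log2 space is a design choice of this sketch: C and g typically span several orders of magnitude, so a uniform box in the exponents covers the range more evenly than a box in the raw values.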

THE PREDICTIVE MODEL BASED ON SUPPORT VECTOR MACHINES-PARTICLE SWARM OPTIMIZATION
MODEL CONSTRUCTION
Historical data comprehensively reflect the internal mechanism of a system's changes (Bing, 2014), and the amount of historical data determines the extent to which this mechanism is revealed. The process' man-hour is an important basis for planning and arranging production reasonably. Man-hour data depend on the specific process, so the process is the basic unit of our study. We analyzed the technological characteristics of a specific process to extract the influence factors. The intrinsic relationships between these factors and the process' man-hour were then mined through machine learning, so as to predict the man-hour.
The steps for establishing the SVM predictive model are as follows:

Influence factors analysis
The total man-hour of aircraft assembly is affected by many factors, including product parameters, process parameters and other kinds of parameters.
For regression problems, the measures used to evaluate the performance of a model are the mean square error (MSE or E) and the determination coefficient (R 2 ).
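A minimal sketch of the two measures, assuming the mean square error is the average squared difference between real and predicted values and taking one common definition of the determination coefficient, R^2 = 1 - SS_res / SS_tot. Other definitions exist, for example the squared correlation coefficient, so this form is an assumption of the sketch. The sample values are hypothetical.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error between real and predicted values."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
real = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mse(real, predicted), r_squared(real, predicted))
```

A smaller MSE and an R^2 closer to 1 both indicate that the predictions track the real values more closely.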

MAN-HOUR PREDICTION OF AUTOMATIC DRILLING & RIVETING BY SVM-PSO
WBS decomposition shows that aircraft assembly involves multiple workstations and stations, and that assembly work ultimately consists of a series of assembly processes. A variety of connections are used in aircraft assembly, such as riveting, bolting, welding and gluing. According to statistics, 70% of the aircraft accidents caused by fatigue failure were attributed to joint connections, and 80% of the fatigue cracks occurred at connection holes (Liming and Chongneng, 2008). Connection quality therefore greatly affects the life of an aircraft. Riveting occupies a very important position among the different kinds of connections. It is estimated that assembly labor accounts for about half or even more of the entire aircraft manufacturing labor, and riveting accounts for 30%. Owing to its good fatigue resistance and high reliability, automatic drilling & riveting is widely used in the assembly of large aircraft and wing panels. It is an important process in aircraft assembly. Therefore, this paper takes the automatic drilling & riveting process as an example for the study of man-hour prediction.

ANALYSIS OF THE INFLUENCE FACTORS OF PROCESS' MAN-HOUR
According to statistics, headless riveting accounts for more than 80% in the wing manufacturing process of the ARJ21 aircraft. Riveting & drilling parameters are the key factors that affect the efficiency and quality of assembly (Lianxi et al., 2013). We analyze the driving factors of the automatic drilling & riveting process from four aspects: general features of assembly ($I_F$), processing technic ($I_P$), equipment parameters ($I_E$) and performance indexes ($I_I$). The functional relation between the man-hour $T$ and these factors can be expressed as:
$$T = f(I_F, I_P, I_E, I_I) \qquad (10)$$
In the evaluation measures MSE and R 2 , n is the number of test samples; $y_i$ (i = 1, 2, …, n) is the real value of the i-th sample; $y'_i$ (i = 1, 2, …, n) is the predicted value of the i-th sample. A smaller MSE means higher prediction accuracy. R 2 measures the degree of correlation, and a value closer to 1 indicates a better fit. The man-hour predictive model based on SVM-PSO is built as described previously (Fig. 1).
General features of assembly include the assembly parts ($F_p$), the material composition ($F_i$) and the thickness of the material ($F_t$). At present, airframes (including fuselage, wings, tail etc.) are mainly made of aluminum alloy, titanium alloy, steel and composite materials. However, aluminum alloy is used in more than 90% of the cases due to the restrictions of China's civil aircraft development, so material composition is removed. The general features of assembly can then be expressed as:
$$I_F = f_F(F_p, F_t)$$
The operations that can be carried out by automatic drilling & riveting mainly include drilling, countersinking, gluing, nail feeding and fastener installation, or one or more of these operations. A combination of these operations is always used, so it is unnecessary to consider the processing technic when analyzing the influence factors.
Equipment parameters associated with the man-hour are mainly drilling and riveting parameters, including spindle speed ($E_r$), clamping force ($E_c$), feed rate ($E_d$), riveting force ($E_f$), riveting residence time ($E_t$) and lubrication pressure ($E_p$). The relation between equipment parameters and man-hour can be expressed as:
$$I_E = f_E(E_r, E_c, E_d, E_f, E_t, E_p)$$
The main performance indexes are the scratch length of the hole ($I_l$), the surface roughness ($I_{Ra}$) and the burr height ($I_h$).
After these considerations, Eq. 10 can be expressed as:
$$T = f(F_p, F_t, E_r, E_c, E_d, E_f, E_t, E_p, I_l, I_{Ra}, I_h)$$
It is found that the major driving factors are closely related to other detailed parameters. Besides, these parameters may influence each other, and some have only a weak influence. In modeling, the aim is to improve the accuracy of the model with the fewest driving factors, so as to reduce the complexity of the model and its sensitivity to noise (Wang et al., 2012). This requires further analysis of the correlation and contribution of the factors. This study used the Statistical Package for the Social Sciences (SPSS) to analyze the correlation of 99 sample data extracted from the historical database. First, the sample data were saved as a .sav file, the format recognized by SPSS. Then the correlation between $F_p$ and the process' man-hour was analyzed, including Pearson's correlation and significance. Similarly, we obtained the correlation between each of the other influence factors and the process' man-hour. The results are shown in Table 1.
In Table 1, Pearson's correlation coefficient describes the relationship between variables and the degree of their statistical correlation. A Pearson's correlation coefficient greater than 0.4 indicates good correlation, greater than 0.5 strong correlation, and greater than 0.6 very strong correlation. Significance can be understood as a probability, which is used to judge whether the difference between observation and hypothesis is significant overall. A significance level below 0.05 is required, because an event with a probability of 5% can be regarded as a small-probability event. The driving factors that influence the man-hour of the automatic drilling & riveting process can be read directly from Table 1.
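A minimal sketch of the correlation step, assuming Pearson's coefficient is computed between one candidate factor and the process' man-hour and then graded with the 0.4 / 0.5 / 0.6 thresholds quoted in the text. The study used SPSS for this; the functions below are a hypothetical stand-in and do not compute the significance level. Applying the thresholds to the absolute value of r is an assumption of this sketch, and the sample values are invented for illustration.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_label(r: float) -> str:
    """Grade |r| with the thresholds quoted in the text."""
    a = abs(r)
    if a > 0.6:
        return "very strong"
    if a > 0.5:
        return "strong"
    if a > 0.4:
        return "good"
    return "weak"


# Illustrative values only: material thickness against man-hour.
thickness = [1.0, 2.0, 3.0, 4.0]
man_hour = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(thickness, man_hour)
print(r, correlation_label(r))
```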

SAMPLE DATA ACQUISITION AND PROCESSING
First, we screen the sample data extracted from the historical database to eliminate invalid data, i.e. unconventional data or data with a high degree of reuse. This yields 99 valid data, which are separated into a training set (80 data) and a test set (19 data) when building the model. The training set and the test set are selected randomly to ensure the accuracy of the model. Because F t , E r , ..., E t have different units and differ tremendously in magnitude, we normalize the samples with the following formula:
$$y = \frac{(y_{max} - y_{min})(x - x_{min})}{x_{max} - x_{min}} + y_{min}, \qquad (y_{max} = 1,\ y_{min} = -1)$$
where x is a raw value of a variable, $x_{min}$ and $x_{max}$ are the minimum and maximum of that variable, and y is the normalized value. The data are normalized to the range [-1, 1], and the 80 training data after normalization are shown in Table 2.
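A minimal sketch of the normalization step, assuming the usual min-max mapping of each variable onto [-1, 1], applied column by column. The function name `minmax_scale` and the sample values are hypothetical.

```python
from typing import List, Sequence


def minmax_scale(values: Sequence[float],
                 y_min: float = -1.0, y_max: float = 1.0) -> List[float]:
    """Map values linearly so that min(values) -> y_min and max(values) -> y_max."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant column carries no information and cannot be rescaled.
        raise ValueError("cannot normalize a constant column")
    scale = (y_max - y_min) / (x_max - x_min)
    return [scale * (x - x_min) + y_min for x in values]


# Illustrative values only: spindle speeds in rpm.
spindle_speed = [0.0, 5000.0, 10000.0]
normalized = minmax_scale(spindle_speed)
print(normalized)
```

When new data are predicted later, the minimum and maximum found on the training set would normally be reused, so that training and prediction inputs share the same scale.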

[Table 2: columns include sample number and process time]

REGRESSED PREDICTION BASED ON SUPPORT VECTOR MACHINES
Having completed these preparations, we train on the sample data and establish the predictive model in this part. First, cross validation (v = 5) is used to find the best parameters C and g. Second, the normalized sample data are fed into the predictive model for learning and prediction. The prediction results for the training set and the test set are shown in Figs. 2 and 3.
Both the training set and the test set were generated randomly in the experiment, which supports the generality and accuracy of the result. Nevertheless, it is necessary to carry out a comparative analysis with other methods. To further validate the performance of the model, we also ran the prediction experiment with a BP neural network. The forecasting result is shown in Fig. 4.
Table 3 compares the forecasting results of SVM and the BP neural network. The prediction result of SVM is clearly better: the MSE of the BP neural network is much higher than that of SVM, and the R 2 of SVM is closer to 1. This shows that SVM can be applied effectively to small-sample prediction problems while maintaining sufficient accuracy.
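A minimal sketch of a 5-fold cross-validated grid search over C and g. The callback `cv_error` is hypothetical: in practice it would train an SVM on the training indices with the given (C, g) and return the error on the held-out fold. Here it is replaced by a smooth stand-in so that the example runs on its own; the grids of powers of two are likewise an assumption of this sketch, not values taken from the study.

```python
import itertools
import math
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def grid_search_cv(cv_error, c_grid, g_grid, n_samples, k=5):
    """Return (mean error, C, g) of the grid point with the lowest mean error."""
    best = None
    for C, g in itertools.product(c_grid, g_grid):
        errors = [cv_error(C, g, train, test)
                  for train, test in k_fold_indices(n_samples, k)]
        mean_err = sum(errors) / len(errors)
        if best is None or mean_err < best[0]:
            best = (mean_err, C, g)
    return best


def stand_in_cv_error(C, g, train, test):
    # Hypothetical error surface with its minimum at C = 2**3, g = 2**-2.
    return (math.log2(C) - 3.0) ** 2 + (math.log2(g) + 2.0) ** 2


c_grid = [2.0 ** e for e in range(-5, 11)]
g_grid = [2.0 ** e for e in range(-10, 6)]
best = grid_search_cv(stand_in_cv_error, c_grid, g_grid, n_samples=80, k=5)
print(best)
```

Every grid point costs k model fits, which is why an exhaustive search over a wide or fine grid becomes expensive.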

OPTIMIZATION OF MAN-HOUR PREDICTIVE MODEL
The parameters of an SVM determine its learning performance and generalization ability. In practice, to reduce the dependence on the initial sample and improve the accuracy of the SVM model, it is necessary to optimize the parameters C and g; that is, SVM optimization amounts to quickly finding the optimal values of C and g. Grid search can find the parameter pair with the best cross-validation result, namely the global optimal solution on the grid. However, searching for the best C and g over a larger range is very time-consuming, and the search time grows rapidly as the grid becomes finer. A heuristic algorithm can search for the global optimal solution without traversing all the points in the grid. The heuristic algorithms currently in wide use for finding optimal parameters are the PSO algorithm, the genetic algorithm and the simulated annealing algorithm.
Because the principles of PSO and GA are simple and both are easy to implement efficiently, this study applies PSO and GA to optimize the SVM prediction results obtained above. Figures 5 to 7 show the results of optimization using PSO and the corresponding prediction results.
Figures 8 to 12 show the results of optimization using GA and the corresponding prediction results. In the optimization experiments, the best PSO prediction result is obtained with a maximum number of generations (maxgen) of 500 and a population size (pop) of 100 (MSE = 0.0019387; R 2 = 0.99212), and the best GA result is obtained with maxgen = 100 and pop = 20 (MSE = 0.0022812; R 2 = 0.98412). PSO therefore gives a lower MSE and a higher R 2 than GA.

CONCLUSION
Aircraft assembly is crucial to aircraft production. It requires effective man-hour prediction at the beginning of a project in order to arrange resources and make a reasonable production plan. So far, time prediction has relied mainly on work experience, which is slow, inefficient, error-prone and heavily dependent on personal experience. Because aircraft assembly work is complex and related to various process parameters, this research decomposed the assembly work using the concept of WBS. The parameters that affect a process were then extracted by analyzing the features of the process, and those that are highly correlated with the man-hour and contribute strongly to its prediction were selected as driving factors. The goodness of fit of the resulting model is close to 1, so the model reflects well the relation between the dependent variable (man-hour) and the independent variables (F t , E r , E c etc.).
The predictive model based on SVM-PSO can effectively forecast the man-hour of different processes according to their driving factors. It is an important approach for enterprises to realize scientific management of man-hours. In future work, we will carry out further studies and experiments to make the prediction more precise.
If it costs 10 minutes for each person, then the man-hour of the task is 20 (2 x 10) minutes. The man-hour of each task is related to comprehensive factors. Therefore, we have to predict the man-hour by taking into account various features of the task's properties, the environment and the requirements of the target.
The parameter σ controls the flexibility of the RBF kernel function and directly affects the generalization performance of the model. The penalty factor C balances the decision function against misclassified samples, which influences the generalization ability of the model.

Figure 1. The man-hour predictive model based on SVM-PSO.

Figure 2. The prediction result of training set.
Figure 3. The prediction result of test set.

Figure 4. Prediction result of the test set by BP neural network.
Finally, a man-hour predictive model based on SVM was established and compared with a BP neural network. Because the parameters C and g strongly determine the model's performance, this study used PSO and GA to optimize the predictive model. Experiments were run with different parameter settings to evaluate MSE and R 2 . The results indicate that PSO is more accurate and gives a better fit. The automatic drilling & riveting process was used as an example to validate the model, giving MSE = 0.0019387 and R 2 = 0.99212. That is, the MSE between the predicted and real values is very small and the coefficient of determination reaches 0.99212.

Table 1. Correlation comparison of different influence factors.

Table 2. Sample set after normalization.

Table 3. The comparison of the forecasting results.