1. Introduction
It is very common to use multiple unmanned aerial vehicles to perform various tasks [
1,
2], especially military tasks [
3]. UAVs cooperatively patrolling perimeters, monitoring areas of interest or escorting convoys have been key research topics [
4,
5,
6]. Power inspection, air rescue, large parties and so on are the most common application examples of Multi-UAV cooperation [
7,
8,
9]. Therefore, it is an important research aspect to seek an efficient task-assignment mechanism to solve the task assignment problem of multiple UAVs [
10]. In this context, the problem of multi-UAV cooperative task allocation arises at the historic moment.
Collaborative task assignment of multiple UAVs is to assign one or more tasks to UAVs under various constraints in the battlefield environment and determine the sequence of tasks to be executed by UAVs. Task assignment is an important part of collaborative task planning of multiple UAVs, and is also the prerequisite for realizing collaborative operation of UAV groups [
11]. In the allocation process, factors such as the type of tasks, timing constraints among tasks, types of tasks that can be performed by UAVs, combat capabilities and payloads should be fully considered, and combat tasks should be reasonably assigned to the UAV fleet with the goal of achieving optimal efficiency of all tasks [
12,
13]. Therefore, the collaborative task-assignment problem of multiple UAVs is a non-deterministic polynomial (NP) problem of combinatorial optimization under multiple constraints [
14,
15]. The key to solving this problem is to establish the task assignment model and use the assignment algorithm.
At present, many general mathematical models have been proposed for multi-UAV task allocation at home and abroad. The commonly used task-allocation models include the multiple traveling salesman problem (MTSP) [
16] model, mixed-integer linear programming (MILP) model [
17], the vehicle routing problem (VRP) [
18] model, the network flow optimization (DNFO) [
19] model, the cooperative multiple task assignment problem (CMTAP) [
20] model, and the capacitated vehicle routing problem (CVRP) [
21] model. Based on these general multi-UAV task-allocation models, many scholars have established targeted task-planning models.
However, the current research still has the following problems: (a) In the establishment of qualified task-allocation models, such as the MTSP model and the CVRP model, the requirements of the actual battlefield are ignored in order to meet the model standards. The established model simplifies the objectives of task allocation and battlefield constraints, which reduces the application effect of research results in actual combat. (b) UAV and task types in existing task-allocation models are mostly single, lacking consideration of heterogeneous UAV attributes and multi-task types. Existing UAVs in the formation are different from each other in terms of purpose, load, combat capability, etc. In actual combat, multiple types of UAVs are often needed to cooperate to complete a certain task. Therefore, the task-allocation model based on a single UAV does not support complex tasks.
In solving problem models, there are two main research results: optimization method and heuristic method. The optimization methods include linear programming, width/depth preference search, branch and bound, and so on. The optimization method has a high time and space complexity in solving problems. As the scale of a task-assignment problem increases, computing power is required and the time consumed is also increased. It is thus necessary to simplify the model to a large extent, which makes it difficult to apply to complex problems. The heuristic method of solving a model mainly focuses on the model’s objective function, which requires less complexity. This can result in a better feasible solution in a short time, and has practical feasibility for large-scale complex problems.
The harmony search (HS) algorithm not only has the advantages of a meta-heuristic algorithm, but also own its characteristics [
22,
23]: (a) HS integrates all the existing harmony vectors to create a new harmony vector, in contrast to GA, which uses two elderly vectors, while particle swarm optimization (PSO) only considers the individual optimal position and the global optimal position of the whole population; (b) HS independently adjusts each variable via improvisation, unlike PSO, which deals with a solution vector by single rule. Due to these characteristics, HS has been successfully applied to a wide variety of optimization problems, such as river ecosystems [
24], selective assembly problems [
25], mental health analysis [
26], reservoir engineering [
27], speech emotion [
28] and wireless network positioning [
29].
However, HS suffers from trapping in performing local searches, and a crucial factor of its performance is the key control parameters, such as harmony memory (HM), harmony memory size (HMS), harmony memory considering rate (HMCR), pitch adjusting rate (PAR), bandwidth (bw) and the number of improvisations (G). It is unfortunate that there are few studies on the mathematical theory of the underlying search mechanisms of HS in the open literature. S. Das et al. proposed a new algorithm called EHS, after discussing the relationship between the explorative power of HS and the control parameters through analyzing the evolution of the population variance of HS [
30]. This algorithm has good exploration, but poor exploitation. Thus, this paper devotes significantly more attention to the search mechanism of HS and its parameters.
Therefore, the contribution of this paper is the proposal of a new model and a new method, which aim to: (a) considering the complexity of the actual missions, divide the missions into reconnaissance, strike and evaluation tasks, and considering the benefits, costs and constraints of the missions, establish the target allocation model of heterogeneous multi-UAV missions with different loads; (b) analyze the mathematical theory of the underlying search mechanisms of HS to balance exploration and exploitation, establish the iteration equation of HS and propose an improved HS; (c) use the improved HS algorithm to simulate and verify a multi-UAV task-allocation combat example.
The remainder of this paper is organized as follows. In
Section 2, the modeling of heterogeneous UAV task-assignment planning is presented in detail.
Section 3 is the theoretical analysis of harmony search. The proposed algorithm is described in
Section 4. The experiments and their results of the proposed algorithm are presented in
Section 5. The task-allocation simulation experiment is designed and discussed in
Section 6.
Section 7 discusses related work.
Section 8 then makes concluding remarks and indicates future works.
2. Modeling Heterogeneous UAV Task Assignment Planning
Multi-UAV cooperative task assignment is a complex multi-objective optimization and decision problem. Most current research models assume that the mission attributes of UAVs are consistent. However, multi-UAV formation is not a linear addition of a certain number of individual UAVs. Heterogeneous formation disperses the functions of reconnaissance and surveillance, electronic interference, strike and evaluation into a large number of single UAVs with low cost and single function. Through the high level of cooperation and self-organization among a large number of heterogeneous individuals, the cluster function far exceeds the individual function that can be realized. In addition, as the multi-UAV formation combat involves various aspects of the combat target and the task complexity is different, some large tasks need a certain combat sequence, and it is necessary to deploy and combine UAVs with different functions and characteristics to complete the corresponding combat task. For the above reasons, this section proposes a heterogeneous multi-UAV task-allocation model to enhance the antagonism of combat subjects and achieve higher-formation combat effectiveness in view of future diversified combat requirements.
2.1. Objective Function of Multi-UAV Cooperative Target Assignment Model
If the UAV formation of our side performs the task , the target list of the task is , represents the task type, and are the number of UAVs in the formation and the number of targets in the task list respectively, represents reconnaissance, represents attack and represents evaluation. The essence of multi-UAV formation task assignment is to solve a target-assignment scheme to maximize target fitness (). Based on this, the objective function of a multi-UAV cooperative target-allocation model is given as follows
Definition 1. whereisthe fitnessof the UAV numberedperforming the task numbered,isthe rewardof the UAV numberedperforming the task numberedandisthe costof the UAV numberedperforming the task numbered.
It can be seen that task fitnessis directly proportional to the reward of performing tasks and inversely proportional to the cost of performing tasks. 2.2. Task Rewards Modeling
In this paper, the rewards of UAV numbered performing a certain task numbered are mainly determined by three factors:
(1) UAV’s capability to perform specific tasks. The capability of UAV to perform tasks is quantified by its payload, such as detection device and weapon system, combined with its own performance, and described in probability based on historical experience (whether it has performed relevant target tasks in the past).
(2) the worth of enemy targets. The worth of enemy targets is determined in advance based on the information obtained beforehand, combined with the tactical plan available for reference. For example, destroying the enemy’s command center can paralyze the enemy’s operation, so the worth of the command center is relatively high, while striking the enemy’s small warehouse has relatively low worth compared to the whole operation.
(3) the defense capability of enemy targets. Combined with the detected battlefield environment information and advance intelligence information, the enemy targets, such as surface-to-air missile positions, anti-aircraft gun positions, flight bases and ground radar are analyzed and quantified.
From the above analysis, the mission rewards of UAV can be defined as follows:
Definition 2. whereis the rewards of a certain task numberedperformed by a UAV numberedwith a specific load,is the completion capability of the UAV numberedto perform tasks the task numberedandis the defense capability of the target.
The reward of each UAV mission meets the additivity, and the decision variable of the mission is introduced as follows:
Then, the total rewards of multiple UAV formation’s task execution on the target list are:
2.3. Task Cost Modeling
In the course of reconnaissance, attack or evaluation, UAV formation will encounter confrontational threats and terrain threats. Confrontational threats are mainly radar detection and fire attack by enemy targets during the execution of missions, so the defense capability of our UAV and the attack capability of enemy targets should be considered in task allocation. Terrain threats mainly involve the air-range cost of a UAV during its mission, terrain height and altitude change rate. On some terrain that is difficult to leap over, the UAV flying too high will consume too much energy, and the UAV climbing too-steep terrain is likely to impact the terrain and encounter other threats, so it should try to avoid the distribution results of too-long air-range or steep terrain. Based on the above analysis, the task-cost modeling in this section includes terrain-cost modeling and loss-cost modeling. At the same time, terrain cost and loss cost of different dimensions are normalized.
(1) Terrain Cost
The air-range cost of a UAV reflects the duration of the mission, the length of the UAV and the energy consumption. The Euclidean distance between the UAV and the enemy target is used in this paper to estimate the air-range cost. This is because the formation cannot accurately obtain the relevant data before the actual execution of the mission.
where
is the initial position coordinate of the UAV numbered
,
is the position coordinate of the enemy target numbered
,
is the position coordinate of the enemy target numbered
. Equation (5) gives the Euclidean distance between the UAV numbered
and the task target numbered
, as well as the Euclidean distance between the task numbered
and the task numbered
. To facilitate calculation, the matrix
is established as the distance matrix.
According to the above equation,
is a matrix of
, which contains the Euclidean distance between UAV and enemy targets and the Euclidean distance between enemy targets. Thus, the total air-range of the UAV numbered
returning to its initial position after completing the corresponding task target list can be given as follows:
where
represents the total number of tasks performed by UAV numbered
. Further analysis of Equation (7): when
,
represents the distance between the initial position of the UAV numbered
and the mission target numbered 1. When
,
represents the Euclidean distance between the No.
mission target, performed by the UAV numbered
, and the No.
mission target. When
,
represents the distance of UAV numbered
returning to its starting point after completing the last mission target.
Considering the impact of terrain height and altitude-change rate on the UAV’s loss, terrain-steepness factor λ is introduced to the weighted UAV’s track. Assuming that the battlefield environment can be known in advance before the execution of the mission, the value of the terrain-steepness factor λ is determined by combining the result of task assignment with the terrain information known in advance. In this paper, λ ranges from [1.2, 2.5] according to the complexity of the terrain.
(2) Loss Cost
The loss cost of a UAV is mainly caused by the confrontational threats encountered by UAV in the execution of tasks. Therefore, the loss cost mainly considers the following factors: the worth of UAV with specific load, the defensive capability of UAV itself, and the attack capability of enemy targets. Combined with the significance of loss cost, it can be concluded from the above analysis:
Definition 3. whereis the loss cost of the UAV numberedwhen executing the mission targetnumbered,is the attack capability of the mission targetnumbered,is the worth of UAV numbered,is the defense capability of UAV numbered.
Thus, it can be concluded that the task cost of multi-UAV formation in task allocation is as followed:
where
is the maximal element with the total loss cost of performing all tasks,
is the terrain steepness factor reflecting the complexity of different UAV flight paths,
and
respectively represent the UAV’s track weight and loss weight and
is the task decision variables introduced in
Section 2.2.
2.4. Modeling Constraints
(1) Constraints on Task Completion
During the execution of the mission, UAVs involved in such tasks as strike and reconnaissance must perform all the missions, and there cannot be a situation where some tasks are not assigned, namely,
where,
and
represent the number of UAVs and the number of assigned tasks respectively.
(2) Constraints on Task Non-Redundancy
During task execution, it must be ensured that each task can only be performed by a certain UAV, namely,
(3) Constraints on UAV’s Capability
The number of task targets numbered
performed by the UAV numbered
should not exceed the maximum of its own task capability:
where
represents the maximum number of tasks that UAV numbered
can execute.
(4) Constraints on UAV’s Attack Range
When task target assignment is completed, the air-range of the total mission performed by the UAV numbered
cannot exceed its maximal flight distance, namely:
where
is the terrain steepness factor, and
is the maximal air-range of UAV numbered
.
In order to transform constrained optimization problems into unconstrained problems, this section introduces the penalty function and designs an outpoint method to punish the consumption cost of infeasible solutions when infeasible solutions appear in the population. The constrained optimization method improves the efficiency of the task-assignment model. Therefore, Equation (1) is further updated, and the final total task fitness is as follows:
where
is a positive number known as the penalty factor, and
is the penalty function.
2.5. Task-Assignment Coding
Constructing the mapping relationship between particle and feasible solution of cooperative-task assignment for multiple UAVs is the key step for the swarm intelligence algorithm to solve the optimization problem. In the collaborative-task assignment process for multiple UAVs, the response of the UAVs to enemy target reconnaissance, attack and evaluation tasks should be ensured, and the timing of tasks should also be ensured. That is, an enemy target must be reconnoitered before the attack task, and finally the effect evaluation, which embodies the double-layer coding meaning of particles on the task-assignment problem. Referring to reference [
31], this paper adopts real vector coding. Suppose that the dimension of the particle is the number of the task target list, the subscript of the particle corresponds to the number of tasks, the integer part of the particle position is rounded up to indicate the number of UAVs and the fractional part of the particle position corresponds to the task order of UAVs from small to large.
The mapping relationship between particle coding and task target assignment can be expressed by the following example. Suppose that three UAVs must perform nine tasks, and the task-assignment coding is shown in
Figure 1.
4. The Opposition-Based Learning Parameter-Adjusting Harmony Search (OLPDHS) Algorithm
Through the above analysis and discussion, on the one hand, , which is related to the original HM initialization, is the part of , so it is necessary to find a better initialization method to explore an effective offspring population for improving the algorithm’s exploration. On the other hand, adjusting key control parameters’ values is a good choice to obtain better fine-tuning properties and convergence speed.
The OLPDHS algorithm is proposed. On the one hand, the algorithm improves the initialization method. On the other hand, the fine-tuning properties and convergence speed of HS can be improved by adjusting parameter value. What the OLPDHS algorithm has in common with the classical HS algorithm is that both of them go through four processes: initializing the parameters involved in the algorithm, initializing the harmony memory, improvising a new harmony and updating the harmony memory. The difference between the two is the way of population initialization and the way of control parameter change.
4.1. The Opposite-Based Learning Strategy for Population Initialization
Opposite-based learning (OBL) was first proposed by Tizhooshin in the literature [
33]. The main idea of OBL is to obtain a better approximated solution than the current candidate solution, by considering both an estimate and another estimate that is opposite to this estimate, so OBL expands the solution space and reduces the search time [
34,
35].
Definition 4. Letbe a point in the dimensionalcoordinate system, where, then, the opposite pointofis completely determined according to.
In the original HS algorithm, the initial population is generated in a random way. In order to improve the quality of the initial harmonic memory warehouse, this paper proposes a variety of alternative opposite learning strategies, which include five methods based on opposite learning strategies [
36]. Method 1:
, Method 2:
, Method 3:
, Method 4:
, Method 5:
.
4.2. Parameter Adjustment Policy
In order to balance the exploration ability and development ability of the HS algorithm, it is necessary to adjust the key control parameter dynamically during the search process.
According to Theorem 2, the parameter
(
) is dynamically adjusted as follows. To better detail
, the more general expression for the average
of the group is given as Equation (40):
where
is the
th dimension of the problem,
is the maximum dimension of the problem and
represents the average value of the
th dimension variable. Then, get
, here
are randomly selected from HM and
, thus
can be obtained as Equation (41).
Thus the result of the parameter
is obtained.
It can be seen from Equation (42) that the dynamic adjustment of the parameter mainly depends on the square root of the mean value of the harmony vectors of the population after each update. In this way, it is easier to realize the actual operation of dynamic adjustment of the parameter . When the new solution is fine-tuned according to Equation (40) or Equation (41), the information of the current population is carried out by the parameter . The fine-tuning of the new solution has global guiding significance. However, the fine-tuning solution with the fixed value of the parameter does not consider the current population search situation. Additional calculation after each evolution is required, which increases the calculation amount of the algorithm.
7. Related Work
There are many researchers dedicated to the study of task allocation, and many methods have been put forward. This section mainly summarizes the methods of task allocation in various fields and the achievements achieved in UAV task allocation.
The threshold-based method is used to solve task-allocation problems. In 2005, extreme teams, large-scale agent teams operating in dynamic environments, were on the horizon. Low-communication approximate distributed constraint optimization (LADCOP), a distributed threshold-based algorithm, was proposed to solve task-allocation problems in the domain of simulated disaster rescue. The tasks are perceived by the agents in the environment [
44]. In 2007, swarm generalized assignment problem (Swarm-GAP) was used to experiment in the RoboCup rescue simulator [
45]. In 2010, considering the various actors in the RoboCup rescue, Swarm-GAP, LADCOP and a greedy method were used to solve distributed task allocation among teams of agents in a RoboCup rescue scenario [
46]. The results showed that the performance of Swarm-GAP and LADCOP are similar and that they outperform a greedy strategy. The threshold-based approach is applied to division of labor control for a robot group [
47], but the task ordering is performed by a central command unit, which generates a significant number of messages. Swarm-GAP was adopted to deal with this problem and a new method with three algorithm variants was proposed in 2017 [
48]. This method effectively prevented some agents from being overloaded with tasks while others remained idle. The aforementioned researchers aimed at the optimization of their resource usage applied in the context of static environments. In 2020, Amorim J C evaluated the performance of three swarm-GAP variants in dynamic contexts, and extended these algorithms to properly address more realistic dynamic scenarios [
49].
The market-based method is applied to task-allocation problems. Task specification trees (TSTs) as a highly expressive specification language for complex multiagent tasks were used. Meanwhile, a sound and complete distributed heuristic search algorithm for allocating the individual tasks in a TST to platforms was proposed in 2010 [
50]. A decentralized distributed solution approach based on multi-agent systems (MAS) to manage emergency vehicles was developed, and a multi-agent architecture to fit real emergency systems was proposed. A more refined and efficient auction mechanism based on implicit agents’ coordination was examined to coordinate agents to reach good quality solutions in a distributed manner [
51]. For the multi-robot dynamic task-allocation problem, multi-objective optimization (MOO) was used to estimate and subsequently make an offer for its assignment. That is, after task detection, an auction took place amongst robots capable of executing it. Robots calculated their bid using MOO [
52].
There are also other works that propose methods for allocating tasks. ZORLU [
22] considered the UAV load constraints and modeled the problem as a CVRP model with load constraints. Shima T [
53] proposed the CMTAP model for UAVs, which divided the task types into identification, attack and damage assessment. The time sequence between tasks, execution time of tasks and multi-UAVs cooperative constraints were added into the model. Min Yao [
54] designed a collaborative multi-task assignment model for UAV groups that is suitable for multiple UAVs, multi-task targets and multi-task types. Wen Gu [
55] proposed a solution to UAV target allocation problem based on the Hungarian algorithm, and Jin Zhang et al. [
56] extended the Hungarian algorithm to multi-target allocation. In addition, [
57] used the MILP method to solve multi-UAV target allocation, and other optimization methods include dynamic programming and graph theory.
This current paper considers the type of task, the UAV load constraints, multiple UAVs, multi-task targets and multi-task types, features that are similar to previous studies in the literature [
21,
53,
54]. Unlike these studies, this paper builds a model from two aspects of the rewards and costs of performing tasks.
8. Conclusions
In this paper, the complex task set was decomposed into sub-tasks suitable for a single UAV, and then the task-allocation problem was described and defined from the aspects of task rewards and task cost. The UAV’s own constraints and mission constraints were defined. The mathematical model of task assignment was established, being more suitable for actual battlefield environment. The OLPDHS algorithm was proposed through discussion of exploration performance and convergence performance and its parameter values (HMCR > 0.9, PAR is in the interval [0.3, 0.7], HMS is appropriate in [5, 40]) were given via a number of simulation experiments. In comparison with IHS, GHS, SAHS, SGHS, NGHS, NDHS, EHS and ITHS, the superior performance of OLPDHS was proven by a number of simulation experiments. A multi-UAV cooperative task-allocation example was designed to verify the superior performance of OLPDHS (Fitness is 1.5491, time cost is 0.0819). This is the first application of HS to the multi-UAV cooperative task-allocation problem. These performance qualities are ideal for helping decision-makers devise allocation schemes.
In the environment or in the system itself at runtime, real-world scenarios abound with the aforementioned dynamism. It is necessary for cooperative systems to deal with this dynamism to keep the execution and results level. Future work must study the dynamic redistribution problem, which is closer to real-world scenarios, focusing on how to establish the redistribution problem model in a dynamic environment, so that decision makers can make effective decisions in the battlefield’s changeable environment.