Least Squares in a Data Fusion Scenario via Aggregation Operators

by Gildson Queiroz de Jesus * and Eduardo Silva Palmeira
Postgraduate Program in Science and Technology Computational Modeling (PPGMC), Department of Exact Science and Technology (DCET), State University of Santa Cruz (UESC), Ilhéus 45662-900, BA, Brazil
* Author to whom correspondence should be addressed.
Axioms 2022, 11(12), 678; https://doi.org/10.3390/axioms11120678
Submission received: 11 October 2022 / Revised: 24 November 2022 / Accepted: 25 November 2022 / Published: 28 November 2022

Abstract

In this paper, appropriate least-squares methods were developed to operate in data fusion scenarios. These methods generate optimal estimates by combining measurements from a finite collection of samples. The aggregation operators of the average type, namely, ordered weighted averaging (OWA), Choquet integral, and mixture operators, were applied to formulate the optimization problem. Numerical examples about fitting curves to a given set of points are provided to show the effectiveness of the proposed algorithms.
MSC:
93E24; 03E72; 47S40; 94A16

1. Introduction

Several studies have been carried out on data science. Datasets play an important role in several areas of knowledge, since information can be extracted from them. This information can be used, for example, in decision making, product improvement, process automation, and trend forecasting [1,2,3].
A number of methods and algorithms have been developed in the literature to extract different information from datasets through mathematical and computational methods. In general, these algorithms were developed to model datasets collected from a single source. In this regard, few algorithms have been formulated to solve the problem in a data fusion scenario, that is, in a scenario where data comes from different sources [4].
The least-squares method (LSM) is a widely used technique for data modeling based on the minimization of a quadratic function [4,5,6,7,8,9]. LSM was initially conceived for modeling data from a single source. In [4], an LSM was developed considering a data fusion situation (LSM-DF), that is, a method considering data from different sources. LSM-DF was designed for weighted data fusion.
From a mathematical point of view, the LSM-DF is based on a weighted average of the length of residual vectors of the equations $b_k = A_k x + v_k$ with $k = 1, 2, \ldots, L$, expressed by
$$\sum_{k=1}^{L} \|v_k\|_{W_k}^2 = \sum_{k=1}^{L} v_k^T W_k v_k$$
where $W_k$ are the weights, that is, an aggregation of $L$ values with their corresponding weightings. Here, a very interesting question arises: is weighted averaging the best method for aggregating the data in all scenarios? Within this context, the study of different aggregation methods has recently gained prominence.
Aggregation operators constitute a subarea of fuzzy theory that has the characteristic of combining finite datasets of the same nature into a single dataset [1,2,6,7,10,11,12,13,14,15,16,17,18,19]. These operators are basically classified into three categories: mean, conjunctive, and disjunctive. Applications of these operators can be found in medical problems, image processing, decision making, and engineering problems.
The weights $W_k$ are directly related to the lengths $\|v_k\|^2$ of the residual vectors. However, in some situations, it would be interesting to allocate the weightings $W_k$ dynamically, putting more weight on the more important $\|v_k\|^2$ values. Thus, aggregation operators can be considered a viable alternative for changing the behavior of LSM-DF.
This study seeks to optimally combine the least-squares method and the aggregation operators of the average type, more specifically, the ordered weighted averaging (OWA) [3,20,21,22], Choquet integral [23,24], and mixture [25,26] operators. Furthermore, the aim of this study is to formulate and solve appropriate least-squares methods to model finite collections of datasets of the same nature. An important goal of these algorithms is to generate optimal estimates that aggregate data from different sources. This is necessary for situations that involve systems that can operate under different failure conditions. A numerical example is presented to show the effectiveness of the proposed algorithms.
This paper is organized as follows: in Section 2, preliminary results on admissible orders for matrices, aggregation operators, and LSM are presented. In Section 3, the LSM-DF via aggregation operators is deduced. In Section 4, a numerical example is shown.

2. Preliminaries

This section addresses topics that form the theoretical basis for the development of LSM-DF via aggregation operators. Initially, the admissible order for matrices is discussed, followed by the aggregation operators of the average type and the classical least-squares method.

2.1. Admissible Order for Matrices

In this section, we present the concept of admissible order for matrices based on [2,16,27]. This is a special way to consider total orders on the set of all matrices of order $m \times n$ with entries in $\mathbb{R}$ (the set of real numbers), denoted by $\mathbb{R}^{m \times n}$.
Let $A, B \in \mathbb{R}^{m \times n}$. It is clear that $A \le_M B$ given by
$$A \le_M B \ \text{ if and only if } \ a_{ij} \le b_{ij}, \ \forall i, j$$
is a partial order on $\mathbb{R}^{m \times n}$.
Considering a matrix $A \in \mathbb{R}^{m \times n}$ as a vector of columns, i.e., $A = [A_1, A_2, \ldots, A_n]$ where $A_i$ are the columns of $A$ ($i \in \{1, 2, \ldots, n\}$), then $\le_M$ can be defined as
$$A \le_M B \ \text{ if and only if } \ A_i \le_M B_i, \ \forall i \in \{1, 2, \ldots, n\}.$$
One can extend that partial order to a total order by considering the concept of admissible order as follows.
Definition 1. A total order $\preceq$ on $\mathbb{R}^{m \times n}$ is admissible if, for each $A, B \in \mathbb{R}^{m \times n}$, we have that $A \preceq B$ whenever $A \le_M B$.
Example 1. Let $A$ and $B$ be column matrices in $\mathbb{R}^{m \times 1}$ and $\pi_i(A) = a_{i1}$ the projection on the $i$-th row of $A$. Then,
$$A \preceq_c B \iff \exists\, k \in \{1, 2, \ldots, m\} \ \text{s.t.} \ \pi_k(A) < \pi_k(B) \ \text{and} \ \forall i,\ 1 \le i < k,\ \pi_i(A) = \pi_i(B)$$
is an admissible order.
Therefore, one can generalize an admissible order on $\mathbb{R}^{m \times n}$ by considering the following definition: Let $A, B \in \mathbb{R}^{m \times n}$ such that $A = [A_1, A_2, \ldots, A_n]$ and $B = [B_1, B_2, \ldots, B_n]$. Then
$$A \preceq_M B \iff \exists\, k \in \{1, 2, \ldots, n\} \ \text{s.t.} \ A_k \preceq_c B_k \ \text{and} \ \forall i,\ 1 \le i < k,\ A_i = B_i$$
is an admissible order on $\mathbb{R}^{m \times n}$.
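The column-wise lexicographic order described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, matrices are plain lists of rows, and equal matrices are treated as related (reflexivity), which is an assumption of the sketch rather than something stated in the text.

```python
def column_major(matrix):
    """Flatten a matrix (list of rows) column by column."""
    return [value for column in zip(*matrix) for value in column]


def componentwise_leq(a, b):
    """Partial order: every entry of a is <= the matching entry of b."""
    return all(x <= y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def lexicographic_leq(a, b):
    """Total order: columns left to right, rows top to bottom.

    Python compares lists lexicographically, so the first differing entry
    in column-major order decides the comparison.
    """
    return column_major(a) <= column_major(b)


A = [[1, 5],
     [2, 0]]
B = [[1, 4],
     [3, 9]]

# A and B are incomparable entry by entry, yet the total order ranks them.
print(componentwise_leq(A, B), componentwise_leq(B, A))
print(lexicographic_leq(A, B))

# Descending arrangement, as needed before an OWA-type aggregation.
ranked = sorted([A, B], key=column_major, reverse=True)
```

Whenever `componentwise_leq(a, b)` holds, `lexicographic_leq(a, b)` holds as well, which is the admissibility requirement of Definition 1.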

2.2. Aggregation Operators

Aggregation operators are numeric operators that combine multiple input values into a single output value. In this data fusion process, operators aggregate data from different sources to obtain a single unit of data from the conducted analysis. Next, the operators used in this study are presented: OWA, Choquet integral, and mixture operators.
Definition 2 ([12]). (OWA operator) Given an $n$-dimensional weight vector, that is, $W = (w_1, w_2, \ldots, w_n)$ with $\sum_{k=1}^{n} w_k = 1$, the function $OWA_W : [0,1]^n \to [0,1]$ defined by
$$OWA_W(x_1, x_2, \ldots, x_n) = \sum_{k=1}^{n} w_k x_{(k)}$$
where $(x_{(1)}, x_{(2)}, \ldots, x_{(n)})$ is the descending ordering of the vector $(x_1, x_2, \ldots, x_n)$, is named an ordered weighted averaging function.
Example 2. Define the vector of weights $w = (w_1, w_2, \ldots, w_n)$, where $w_k = 1$ for some fixed $k \in \{1, 2, \ldots, n\}$ and $w_i = 0$ for $i \ne k$. Then $OWA_w(x) = x_{(k)}$, the $k$-th largest input, is the so-called static OWA operator.
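A minimal Python sketch of the OWA operator follows. The function name `owa` and the sample inputs are illustrative assumptions; the only logic used is the descending sort followed by a weighted sum, as in the definition.

```python
def owa(weights, values):
    """Ordered weighted average: weights are applied to the sorted inputs."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)  # x_(1) >= x_(2) >= ... >= x_(n)
    return sum(w * x for w, x in zip(weights, ordered))


x = [0.2, 0.9, 0.4]
blended = owa([0.5, 0.3, 0.2], x)   # most weight on the largest input
largest = owa([1.0, 0.0, 0.0], x)   # all weight on position 1: the maximum
middle = owa([0.0, 1.0, 0.0], x)    # all weight on position 2: the median
```

Because the weights attach to positions in the sorted list rather than to particular inputs, permuting `x` leaves every result unchanged.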
Remark 1. As one can see in Definition 2, the sum of all the weights in the OWA aggregation is 1 ($\sum_{k=1}^{n} w_k = 1$). If the weights are matrices, the sum is given by $\sum_{k=1}^{L} \|W_k\|_1 = 1$, where $\|\cdot\|_1$ is the matrix norm given by
$$\|A\|_1 = \max_{1 \le j \le s} \sum_{i=1}^{r} |a_{ij}|, \quad \text{where } A \in \mathbb{R}^{r \times s}.$$
Remark 2. The entries in the OWA aggregation must be sorted; if the entries are matrices, an ordering relation must be used over the set $\mathbb{R}^{m \times n}$. So, we can consider an admissible order on $\mathbb{R}^{m \times n}$ as defined in Definition 1.
The next definition introduces the discrete fuzzy measure, which is needed to define the Choquet integral operator.
Definition 3 ([15]). A discrete fuzzy measure is a function $\mu : 2^N \to [0,1]$, where $N = \{1, 2, \ldots, n\}$ and $2^N$ is the power set of $N$, such that:
  • M1: $\mu(X) \le \mu(Y)$ when $X \subseteq Y$
  • M2: $\mu(\emptyset) = 0$ and $\mu(N) = 1$.
Definition 4 ([10]). (Choquet integral operator) Let $\mu : 2^N \to [0,1]$ be a discrete fuzzy measure. The discrete Choquet integral related to the measure $\mu$ is the function $C_\mu : [0,1]^n \to [0,1]$ defined by:
$$C_\mu(x_1, x_2, \ldots, x_n) = \sum_{k=1}^{n} x_{[k]} \left[ \mu(\{ j \in N : x_j \ge x_{[k]} \}) - \mu(\{ j \in N : x_j \ge x_{[k+1]} \}) \right]$$
where $(x_{[1]}, x_{[2]}, \ldots, x_{[n]}) = Sort(x_1, x_2, \ldots, x_n)$ is an ascending ordering of the vector $(x_1, x_2, \ldots, x_n)$ and $x_{[n+1]} = 2$ by convention.
The Choquet integral operator can also be calculated with the following simplified expression:
$$C_\mu(x_1, x_2, \ldots, x_n) = \sum_{k=1}^{n} \left( x_{[k]} - x_{[k-1]} \right) \mu(G_k)$$
where $x_{[0]} = 0$ and $G_k = \{[k], [k+1], \ldots, [n]\}$.
Example 3. Consider the discrete fuzzy measure
$$\mu(X) = \begin{cases} 1, & \text{if } X = N \\ 0, & \text{otherwise.} \end{cases}$$
Thus, the Choquet integral is given by:
$$C_\mu(x_1, x_2, \ldots, x_n) = [x_{[1]} - x_{[0]}]\, \mu(G_1) + \cdots + [x_{[n]} - x_{[n-1]}]\, \mu(G_n).$$
Since $\mu(G_1) = 1$ and $\mu(G_i) = 0$ for the other values of $i$, the result is $C_\mu(x_1, x_2, \ldots, x_n) = x_{[1]} = \min(x_1, x_2, \ldots, x_n)$.
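The simplified expression of the Choquet integral can be sketched as follows. The function names are hypothetical, a fuzzy measure is modeled as a callable on `frozenset`s of 0-based indices (an assumption of this sketch), and the additive measure is included only as a second, well-known test case in which the integral collapses to a weighted mean.

```python
def choquet(values, mu):
    """Discrete Choquet integral: sum of (x_[k] - x_[k-1]) * mu(G_k)."""
    order = sorted(range(len(values)), key=lambda j: values[j])  # ascending
    total, previous = 0.0, 0.0          # previous plays the role of x_[0] = 0
    for position, j in enumerate(order):
        g_k = frozenset(order[position:])   # indices of the inputs >= x_[k]
        total += (values[j] - previous) * mu(g_k)
        previous = values[j]
    return total


def min_measure(n):
    """Measure of Example 3: 1 on the full index set, 0 elsewhere."""
    return lambda subset: 1.0 if len(subset) == n else 0.0


def additive_measure(weights):
    """Additive measure: mu(X) is the sum of the weights indexed by X."""
    return lambda subset: sum(weights[j] for j in subset)


x = [0.2, 0.9, 0.4]
lowest = choquet(x, min_measure(3))                      # equals min(x)
mean = choquet(x, additive_measure([0.5, 0.3, 0.2]))     # weighted mean of x
```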
Definition 5 ([15]). (Mixture Operator) Let $w_1, w_2, \ldots, w_n : [0,1] \to [0, +\infty)$ be functions called weight functions. The function $MIX_{w_1, w_2, \ldots, w_n} : [0,1]^n \to [0,1]$ defined by:
$$MIX_{w_1, w_2, \ldots, w_n}(x_1, x_2, \ldots, x_n) = \frac{\sum_{k=1}^{n} w_k(x_k) \cdot x_k}{\sum_{k=1}^{n} w_k(x_k)}$$
is called the mixture function associated with the weight functions $w_1, w_2, \ldots, w_n$.
Example 4. Define
$$w_i(x_i) = \begin{cases} \frac{1}{n}, & \text{if } x_i = 0 \\ x_i, & \text{otherwise.} \end{cases}$$
For simplicity, consider $n = 3$. In this case,
$$MIX_{w_1, w_2, w_3}(x_1, x_2, x_3) = \begin{cases} 0, & \text{if } x_1 = x_2 = x_3 = 0 \\ \dfrac{x_1^2 + x_2^2 + x_3^2}{x_1 + x_2 + x_3}, & \text{otherwise} \end{cases}$$
is the mixture function determined by the weights $w_i$ defined above.
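The mixture operator differs from the OWA operator in that each weight is computed from the input it multiplies. A minimal sketch, with hypothetical function names and sample inputs chosen here for illustration:

```python
def mixture(weight_functions, values):
    """Mixture operator: sum w_k(x_k) * x_k divided by sum w_k(x_k)."""
    weights = [w(x) for w, x in zip(weight_functions, values)]
    denominator = sum(weights)
    if denominator == 0:
        raise ZeroDivisionError("the weight functions sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / denominator


def example_weight(n):
    """Weight function of Example 4: 1/n at zero, the input itself otherwise."""
    return lambda x: 1.0 / n if x == 0 else x


ws = [example_weight(3)] * 3
nonzero = mixture(ws, [0.2, 0.4, 0.4])   # (0.04 + 0.16 + 0.16) / 1.0
all_zero = mixture(ws, [0.0, 0.0, 0.0])  # numerator vanishes

# Constant weight functions give back the arithmetic mean.
plain_mean = mixture([lambda x: 1.0] * 3, [0.2, 0.4, 0.6])
```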

2.3. Least-Squares Method

LSM is a widely known and applied mathematical optimization method used to solve several problems, including parameter estimation. This method consists of finding an optimal solution to the problem by minimizing the squared norm of a residual vector.
Consider the equation
$$b = Ax + v$$
where $x \in \mathbb{R}^{n \times 1}$ is an unknown vector, $A \in \mathbb{R}^{N \times n}$ is a known parameter matrix, $b \in \mathbb{R}^{N \times 1}$ is a known vector, and $v \in \mathbb{R}^{N \times 1}$ is a vector named residual.
The least-squares problem is to find a solution $\hat{x}$ that minimizes the length of the residual vector, that is, satisfying the following property:
$$\|b - A\hat{x}\|^2 \le \|b - Ax\|^2$$
for all $x \in \mathbb{R}^{n \times 1}$. Here, $\|\cdot\|^2$ denotes the square of the Euclidean norm
$$\|v\|^2 = v^T v.$$
Therefore, the solution to the least-squares problem consists of solving the optimization problem
$$\min_x J(x) \tag{9}$$
where the cost functional $J(x)$ is given by
$$J(x) = \|b - Ax\|^2 = (b - Ax)^T (b - Ax). \tag{10}$$
Theorem 1 ([4]). (Least-Squares Method) If matrix $A$ has full rank, then there is a single optimal solution $\hat{x}$ for least-squares Problem (9) that is given by
$$\hat{x} = \left( A^T A \right)^{-1} A^T b. \tag{11}$$
Moreover, the resulting minimal value of the cost function can be written as
$$J(\hat{x}) = b^T b - b^T A \left( A^T A \right)^{-1} A^T b. \tag{12}$$
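The closed-form solution and minimal cost of the classical least-squares problem can be checked numerically. The sketch below assumes NumPy is available and uses small synthetic data invented for illustration (points near the line $y = 2t + 1$); the function name `least_squares` is hypothetical.

```python
import numpy as np


def least_squares(A, b):
    """Solve min ||b - A x||^2 through the normal equations (A^T A) x = A^T b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x_hat = np.linalg.solve(A.T @ A, A.T @ b)
    # Minimal cost: b^T b - b^T A (A^T A)^{-1} A^T b.
    min_cost = b @ b - b @ A @ x_hat
    return x_hat, min_cost


# Synthetic data: a straight-line model y = slope * t + intercept.
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.column_stack([t, np.ones_like(t)])
b = np.array([1.1, 2.9, 5.2, 6.8])

x_hat, cost = least_squares(A, b)
residual = b - A @ x_hat
```

At the optimum the residual is orthogonal to the columns of `A`, and `cost` matches `residual @ residual` up to rounding. For ill-conditioned problems a QR- or SVD-based solver such as `np.linalg.lstsq` is usually preferred over forming `A.T @ A` explicitly.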

3. LSM-DF via Aggregation Operators

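In this section, LSM-DF is developed via aggregation operators. LSM-DF via an OWA operator, LSM-DF via a Choquet integral operator, and LSM-DF via a mixture operator are presented. These LSM-DFs are an alternative for estimation problems involving several data sources.
The next result is needed for the proofs of the LSM-DF via aggregation operators.
Lemma 1. If the matrices $A_k$ have full rank and the matrices $W_k$ are symmetric positive-definite, with $k = 1, 2, \ldots, L$, then $\bar{A}^T \bar{W} \bar{A}$, where
$$\bar{A} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_L \end{bmatrix}, \quad \bar{W} = \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix} \tag{13}$$
is nonsingular.
Proof. Suppose that $\bar{A}^T \bar{W} \bar{A}$ is singular; then, there must exist a nonzero vector $\lambda$ such that $\bar{A}^T \bar{W} \bar{A} \lambda = 0$, which implies that $\lambda^T \bar{A}^T \bar{W} \bar{A} \lambda = 0$, i.e.,
$$\lambda^T \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_L \end{bmatrix}^T \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_L \end{bmatrix} \lambda = 0 \tag{14}$$
$$\lambda^T A_1^T W_1 A_1 \lambda + \lambda^T A_2^T W_2 A_2 \lambda + \cdots + \lambda^T A_L^T W_L A_L \lambda = 0 \tag{15}$$
(15) can be rewritten as
$$\|A_1 \lambda\|_{W_1}^2 + \|A_2 \lambda\|_{W_2}^2 + \cdots + \|A_L \lambda\|_{W_L}^2 = 0. \tag{16}$$
Here, $\|\cdot\|_W^2$ denotes the square of the weighted Euclidean norm
$$\|v\|_W^2 = v^T W v. \tag{17}$$
As the matrices $W_k$ are symmetric positive-definite, it follows from (16) that $\|A_k \lambda\|_{W_k}^2 = 0$, so that $A_k \lambda = 0$ with $k = 1, 2, \ldots, L$. This, in turn, means that the columns of $A_k$ are linearly dependent. Hence, $A_k$ is not full-rank, which contradicts the hypothesis.    □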

3.1. LSM-DF via OWA Operator

For the deduction of the LSM-DF via the OWA operator, the following equations should be considered
$$b_{(k)} = A_{(k)} x + v_{(k)}, \quad k = 1, 2, \ldots, L \tag{18}$$
where $x \in \mathbb{R}^{n \times 1}$ is an unknown vector, $A_{(k)} \in \mathbb{R}^{N \times n}$ are known parameter matrices, $b_{(k)} \in \mathbb{R}^{N \times 1}$ are known vectors, and $v_{(k)} \in \mathbb{R}^{N \times 1}$ are vectors named residuals.
A solution $\hat{x}$ to the least-squares problem via the OWA operator must minimize the length of the residual vector, that is, it must satisfy the following property:
$$\sum_{k=1}^{L} \|b_{(k)} - A_{(k)} \hat{x}\|_{W_k}^2 \le \sum_{k=1}^{L} \|b_{(k)} - A_{(k)} x\|_{W_k}^2 \tag{19}$$
for all $x \in \mathbb{R}^{n \times 1}$, where $W_k$ are positive-definite symmetric matrices.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
$$\min_x J_{OWA}(x). \tag{20}$$
The functional $J_{OWA}(x)$ can be defined as
$$J_{OWA}(x) := OWA_W(J_1(x), J_2(x), \ldots, J_L(x)) \tag{21}$$
where $W = (W_1, W_2, \ldots, W_L)$ are weight matrices and
$$J_k(x) := \|v_{(k)}\|^2 = \|b_{(k)} - A_{(k)} x\|^2, \quad k = 1, 2, \ldots, L. \tag{22}$$
Therefore, by the definition of the OWA operator, Function (21) can be rewritten as
$$J_{OWA}(x) := \sum_{k=1}^{L} \|b_{(k)} - A_{(k)} x\|_{W_k}^2 = \sum_{k=1}^{L} \left( b_{(k)} - A_{(k)} x \right)^T W_k \left( b_{(k)} - A_{(k)} x \right). \tag{23}$$
The next theorem brings the solution to the least-squares problem via the OWA operator in (20).
Theorem 2. (LSM-DF via OWA Operator) If the matrices $A_{(k)}$ with $k = 1, 2, \ldots, L$ have full rank and $W_k$ are symmetric positive-definite matrices, then there is a unique optimal solution $\hat{x}$ to the least-squares problem via OWA operator (LSM-DF via OWA operator) that is given by:
$$\hat{x} = \left( \sum_{k=1}^{L} A_{(k)}^T W_k A_{(k)} \right)^{-1} \sum_{k=1}^{L} A_{(k)}^T W_k b_{(k)}. \tag{24}$$
The corresponding minimal value of $J_{OWA}(x)$ is
$$J_{OWA}(\hat{x}) = \sum_{k=1}^{L} b_{(k)}^T W_k b_{(k)} - \left( \sum_{k=1}^{L} b_{(k)}^T W_k A_{(k)} \right) \left( \sum_{k=1}^{L} A_{(k)}^T W_k A_{(k)} \right)^{-1} \left( \sum_{k=1}^{L} A_{(k)}^T W_k b_{(k)} \right). \tag{25}$$
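Proof. Consider the cost function
$$J_{OWA}(x) = \sum_{k=1}^{L} \left( b_{(k)} - A_{(k)} x \right)^T W_k \left( b_{(k)} - A_{(k)} x \right) \tag{26}$$
$$J_{OWA}(x) = \left( b_{(1)} - A_{(1)} x \right)^T W_1 \left( b_{(1)} - A_{(1)} x \right) + \left( b_{(2)} - A_{(2)} x \right)^T W_2 \left( b_{(2)} - A_{(2)} x \right) + \cdots + \left( b_{(L)} - A_{(L)} x \right)^T W_L \left( b_{(L)} - A_{(L)} x \right) \tag{27}$$
$$J_{OWA}(x) = \begin{bmatrix} b_{(1)} - A_{(1)} x \\ b_{(2)} - A_{(2)} x \\ \vdots \\ b_{(L)} - A_{(L)} x \end{bmatrix}^T \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix} \begin{bmatrix} b_{(1)} - A_{(1)} x \\ b_{(2)} - A_{(2)} x \\ \vdots \\ b_{(L)} - A_{(L)} x \end{bmatrix} \tag{28}$$
$$J_{OWA}(x) = \left( \begin{bmatrix} b_{(1)} \\ b_{(2)} \\ \vdots \\ b_{(L)} \end{bmatrix} - \begin{bmatrix} A_{(1)} \\ A_{(2)} \\ \vdots \\ A_{(L)} \end{bmatrix} x \right)^T \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix} \left( \begin{bmatrix} b_{(1)} \\ b_{(2)} \\ \vdots \\ b_{(L)} \end{bmatrix} - \begin{bmatrix} A_{(1)} \\ A_{(2)} \\ \vdots \\ A_{(L)} \end{bmatrix} x \right) \tag{29}$$
that can be rewritten in matrix form as
$$J_{OWA}(x) = \left( \bar{b} - \bar{A} x \right)^T \bar{W} \left( \bar{b} - \bar{A} x \right) \tag{30}$$
where
$$\bar{A} = \begin{bmatrix} A_{(1)} \\ A_{(2)} \\ \vdots \\ A_{(L)} \end{bmatrix}, \quad \bar{b} = \begin{bmatrix} b_{(1)} \\ b_{(2)} \\ \vdots \\ b_{(L)} \end{bmatrix}, \quad \bar{W} = \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix}. \tag{31}$$
Entries $(A_{(1)}, A_{(2)}, \ldots, A_{(L)})$ and $(b_{(1)}, b_{(2)}, \ldots, b_{(L)})$ are descending orderings of $(A_1, A_2, \ldots, A_L)$ and $(b_1, b_2, \ldots, b_L)$, respectively. $\bar{W}$ is a block-diagonal positive-definite symmetric matrix with entries $W_k$.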
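To find the critical point in $x$, $J_{OWA}(x)$ must be differentiated and set equal to zero:
$$\frac{\partial}{\partial x} \left( x^T \bar{A}^T \bar{W} \bar{A} x - x^T \bar{A}^T \bar{W} \bar{b} - \bar{b}^T \bar{W} \bar{A} x + \bar{b}^T \bar{W} \bar{b} \right) = 0 \ \Rightarrow \ x^T \bar{A}^T \bar{W} \bar{A} - \bar{b}^T \bar{W} \bar{A} = 0. \tag{32}$$
Via Lemma 1, matrix $\bar{A}^T \bar{W} \bar{A}$ is invertible. Therefore,
$$\hat{x} = \left( \bar{A}^T \bar{W} \bar{A} \right)^{-1} \bar{A}^T \bar{W} \bar{b}. \tag{33}$$
Replacing (31) into (33), the solution can be rewritten as
$$\hat{x} = \left( \sum_{k=1}^{L} A_{(k)}^T W_k A_{(k)} \right)^{-1} \sum_{k=1}^{L} A_{(k)}^T W_k b_{(k)}. \tag{34}$$
In fact, since the Hessian matrix is positive-definite,
$$\frac{\partial^2 J_{OWA}(x)}{\partial x^T \partial x} = \bar{A}^T \bar{W} \bar{A} = \sum_{k=1}^{L} A_{(k)}^T W_k A_{(k)} > 0, \tag{35}$$
$J_{OWA}(x)$ in (30) is a strictly convex function; therefore, $\hat{x}$ is the unique global minimum.
The minimal cost $J_{OWA}(\hat{x})$ can be expressed as
$$J_{OWA}(\hat{x}) = \sum_{k=1}^{L} \|b_{(k)} - A_{(k)} \hat{x}\|_{W_k}^2 = \left( \bar{b} - \bar{A} \hat{x} \right)^T \bar{W} \left( \bar{b} - \bar{A} \hat{x} \right) = \bar{b}^T \bar{W} \bar{b} - \bar{b}^T \bar{W} \bar{A} \hat{x} - \hat{x}^T \bar{A}^T \bar{W} \bar{b} + \hat{x}^T \bar{A}^T \bar{W} \bar{A} \hat{x}. \tag{36}$$
Replacing (33) into (36) results in
$$J_{OWA}(\hat{x}) = \bar{b}^T \bar{W} \bar{b} - \bar{b}^T \bar{W} \bar{A} \left( \bar{A}^T \bar{W} \bar{A} \right)^{-1} \bar{A}^T \bar{W} \bar{b}. \tag{37}$$
Replacing (31) into (37), the optimal cost can be rewritten as
$$J_{OWA}(\hat{x}) = \sum_{k=1}^{L} b_{(k)}^T W_k b_{(k)} - \left( \sum_{k=1}^{L} b_{(k)}^T W_k A_{(k)} \right) \left( \sum_{k=1}^{L} A_{(k)}^T W_k A_{(k)} \right)^{-1} \left( \sum_{k=1}^{L} A_{(k)}^T W_k b_{(k)} \right). \tag{38}$$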
Remark 3. Taking $L = 1$ in Theorem 2, that is, a single data source, the LSM-DF via OWA operator reduces to the classical LSM in Theorem 1.
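The fused estimate can be computed either from the per-source sums or from the stacked block form used in the proof; both should agree. The sketch below assumes NumPy, assumes the sources have already been arranged in the required order before being paired with the weights, and uses synthetic data invented for illustration. The function names are hypothetical.

```python
import numpy as np


def lsm_df(A_list, b_list, W_list):
    """Fused estimate: (sum A_k^T W_k A_k)^{-1} (sum A_k^T W_k b_k)."""
    n = A_list[0].shape[1]
    gram, rhs = np.zeros((n, n)), np.zeros(n)
    for A, b, W in zip(A_list, b_list, W_list):
        gram += A.T @ W @ A
        rhs += A.T @ W @ b
    return np.linalg.solve(gram, rhs)


def lsm_df_stacked(A_list, b_list, W_list):
    """Same estimate from the stacked matrices and block-diagonal weight."""
    A_bar, b_bar = np.vstack(A_list), np.concatenate(b_list)
    W_bar = np.zeros((A_bar.shape[0], A_bar.shape[0]))
    offset = 0
    for W in W_list:
        m = W.shape[0]
        W_bar[offset:offset + m, offset:offset + m] = W  # k-th diagonal block
        offset += m
    return np.linalg.solve(A_bar.T @ W_bar @ A_bar, A_bar.T @ W_bar @ b_bar)


# Two synthetic sources observing the same straight-line model.
t = np.array([0.0, 1.0, 2.0, 3.0])
A1 = A2 = np.column_stack([t, np.ones_like(t)])
b1 = np.array([1.1, 2.9, 5.2, 6.8])
b2 = np.array([0.9, 3.2, 4.8, 7.1])
W1, W2 = 0.7 * np.eye(4), 0.3 * np.eye(4)   # more weight on the first source

x_sum = lsm_df([A1, A2], [b1, b2], [W1, W2])
x_stacked = lsm_df_stacked([A1, A2], [b1, b2], [W1, W2])
```

The summed form never builds the $LN \times LN$ weight matrix, so it is the cheaper of the two when many sources are fused.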

3.2. LSM-DF via Choquet Integral Operator

The deduction of the LSM-DF via the Choquet integral operator follows from the equations
$$b_{[k]} = A_{[k]} x + v_{[k]}, \quad k = 1, 2, \ldots, L \tag{39}$$
where $x \in \mathbb{R}^{n \times 1}$ is an unknown vector, $A_{[k]} \in \mathbb{R}^{N \times n}$ are known parameter matrices, $b_{[k]} \in \mathbb{R}^{N \times 1}$ are known vectors, and $v_{[k]} \in \mathbb{R}^{N \times 1}$ are vectors named residuals.
A solution $\hat{x}$ to the least-squares problem via the Choquet integral operator must minimize the length of the residual vector, that is, it must satisfy the following property:
$$\sum_{k=1}^{L} \|b_{[k]} - A_{[k]} \hat{x}\|_{I\mu(G_k)}^2 \le \sum_{k=1}^{L} \|b_{[k]} - A_{[k]} x\|_{I\mu(G_k)}^2 \tag{40}$$
for all $x \in \mathbb{R}^{n \times 1}$, where $I\mu(G_k)$ is the identity matrix multiplied by the discrete fuzzy measure $\mu(G_k)$.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
$$\min_x J_{C_\mu}(x) \tag{41}$$
The functional $J_{C_\mu}(x)$ can be defined as
$$J_{C_\mu}(x) := C_\mu(\bar{J}_1(x), \bar{J}_2(x), \ldots, \bar{J}_L(x)) \tag{42}$$
where
$$\bar{J}_k(x) := \|v_{[k]} - v_{[k-1]}\|^2 = \|(b_{[k]} - A_{[k]} x) - (b_{[k-1]} - A_{[k-1]} x)\|^2 = \|(b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x\|^2, \quad k = 1, 2, \ldots, L. \tag{43}$$
Therefore, by the definition of the Choquet integral operator, Function (42) can be rewritten as
$$J_{C_\mu}(x) := \sum_{k=1}^{L} \|(b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x\|_{I\mu(G_k)}^2 = \sum_{k=1}^{L} \left( (b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x \right)^T I\mu(G_k) \left( (b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x \right) \tag{44}$$
where $I\mu(G_k)$ is a positive-definite symmetric matrix.
The next theorem brings the solution to the least-squares problem via the Choquet integral operator in (41).
Theorem 3. (LSM-DF via Choquet Integral Operator) If the matrices $A_{[k]} - A_{[k-1]}$ with $k = 1, 2, \ldots, L$ have full rank and $I\mu(G_k)$ are symmetric positive-definite matrices, then there is a single optimal solution $\hat{x}$ for the least-squares problem via Choquet integral operator (LSM-DF via Choquet integral operator) that is given by:
$$\hat{x} = \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right)^{-1} \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}). \tag{45}$$
The corresponding minimal value of $J_{C_\mu}(x)$ is
$$J_{C_\mu}(\hat{x}) = \sum_{k=1}^{L} (b_{[k]} - b_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}) - \left( \sum_{k=1}^{L} (b_{[k]} - b_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right) \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right)^{-1} \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}) \right). \tag{46}$$
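Proof. Consider the cost functional
$$J_{C_\mu}(x) = \sum_{k=1}^{L} \left( (b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x \right)^T I\mu(G_k) \left( (b_{[k]} - b_{[k-1]}) - (A_{[k]} - A_{[k-1]}) x \right) \tag{47}$$
In matrix form, this can be rewritten as
$$J_{C_\mu}(x) = \left( \mathbf{b} - \mathcal{A} x \right)^T \mathcal{W} \left( \mathbf{b} - \mathcal{A} x \right) \tag{48}$$
where
$$\mathcal{A} = \begin{bmatrix} A_{[1]} - A_{[0]} \\ A_{[2]} - A_{[1]} \\ \vdots \\ A_{[L]} - A_{[L-1]} \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} b_{[1]} - b_{[0]} \\ b_{[2]} - b_{[1]} \\ \vdots \\ b_{[L]} - b_{[L-1]} \end{bmatrix}, \quad \mathcal{W} = \begin{bmatrix} I\mu(G_1) & 0 & \cdots & 0 \\ 0 & I\mu(G_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & I\mu(G_L) \end{bmatrix}. \tag{49}$$
Entries $(A_{[1]}, A_{[2]}, \ldots, A_{[L]})$ and $(b_{[1]}, b_{[2]}, \ldots, b_{[L]})$ are ascending orderings of $(A_1, A_2, \ldots, A_L)$ and $(b_1, b_2, \ldots, b_L)$, respectively. $\mathcal{W}$ is a block-diagonal symmetric positive-definite matrix with entries $I\mu(G_k)$.
On the basis of Function (48) and the solution of the LSM-DF via the OWA operator presented in Theorem 2, the solution to Optimization Problem (41) is given by
$$\hat{x} = \left( \mathcal{A}^T \mathcal{W} \mathcal{A} \right)^{-1} \mathcal{A}^T \mathcal{W} \mathbf{b} \tag{50}$$
which, through Matrices (49), can be rewritten as
$$\hat{x} = \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right)^{-1} \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}). \tag{51}$$
Similar to the procedure performed in Theorem 2, the minimal cost $J_{C_\mu}(\hat{x})$ can be expressed as
$$J_{C_\mu}(\hat{x}) = \mathbf{b}^T \mathcal{W} \mathbf{b} - \mathbf{b}^T \mathcal{W} \mathcal{A} \left( \mathcal{A}^T \mathcal{W} \mathcal{A} \right)^{-1} \mathcal{A}^T \mathcal{W} \mathbf{b}. \tag{52}$$
Replacing (49) into (52), the optimal cost can be rewritten as
$$J_{C_\mu}(\hat{x}) = \sum_{k=1}^{L} (b_{[k]} - b_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}) - \left( \sum_{k=1}^{L} (b_{[k]} - b_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right) \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (A_{[k]} - A_{[k-1]}) \right)^{-1} \left( \sum_{k=1}^{L} (A_{[k]} - A_{[k-1]})^T I\mu(G_k) (b_{[k]} - b_{[k-1]}) \right). \tag{53}$$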
Remark 4. $A_{[0]}$ is the null matrix and $b_{[0]}$ is the null vector by convention.
Remark 5. Taking $L = 1$ in Theorem 3, that is, a single data source, the LSM-DF via Choquet integral operator reduces to the classical LSM in Theorem 1.
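The Choquet-based estimate works on successive differences of the ordered sources, each weighted by the scalar $\mu(G_k)$. The sketch below assumes NumPy, assumes the sources are already given in ascending admissible order, and takes the measure values $\mu(G_k)$ as a plain list; the data, the value 0.4, and all function names are hypothetical.

```python
import numpy as np


def successive_differences(items):
    """Return items[k] - items[k-1], with a zero array before the first item."""
    previous = np.zeros_like(items[0])   # plays the role of A_[0] or b_[0]
    out = []
    for item in items:
        out.append(item - previous)
        previous = item
    return out


def lsm_df_choquet(A_sorted, b_sorted, mu_G):
    """Estimate built from differences, each weighted by mu(G_k) * I."""
    dA = successive_differences(A_sorted)
    db = successive_differences(b_sorted)
    gram = sum(mu * D.T @ D for mu, D in zip(mu_G, dA))
    rhs = sum(mu * D.T @ d for mu, D, d in zip(mu_G, dA, db))
    return np.linalg.solve(gram, rhs)


# Two synthetic sources, ordered so that A1 precedes A2 column-wise.
t1 = np.array([0.0, 1.0, 2.0, 3.0])
t2 = np.array([0.5, 1.5, 2.5, 4.0])
A1 = np.column_stack([t1, np.ones_like(t1)])
A2 = np.column_stack([t2, np.ones_like(t2)])
b1 = np.array([1.1, 2.9, 5.2, 6.8])
b2 = np.array([2.1, 3.9, 6.2, 9.1])
mu_G = [1.0, 0.4]   # mu(G_1) = mu(N) = 1; the second value is illustrative

x_hat = lsm_df_choquet([A1, A2], [b1, b2], mu_G)
```

In this data the difference `A2 - A1` has a zero second column, so it is rank-deficient on its own; the summed matrix is still invertible here because the first term uses `A1` with $\mu(G_1) = 1$.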

3.3. LSM-DF via Mixture Operator

For the deduction of the LSM-DF via the mixture operator, it is necessary to adapt the mixture operator presented in Definition 5.
The weight functions, which are dynamic in the mixture operator, are calculated beforehand and become constant (static) weights. Thus, the adapted mixture operator is calculated in two steps. In the first step, the weights are calculated and fixed. In the next step, aggregations are carried out. The next definition brings the adapted mixture operator.
Definition 6. (Adapted Mixture Operator) The adapted MIX function can be calculated using the following steps:
  • Step 1: weight functions $w_k(x_k)$ with $k = 1, 2, \ldots, n$ can be calculated and fixed as follows:
    $$w_1(x_1) = w_1, \ w_2(x_2) = w_2, \ \ldots, \ w_n(x_n) = w_n. \tag{54}$$
  • Step 2: with the fixed weight functions, the MIX function can be calculated as follows:
    $$MIX_{w_1, w_2, \ldots, w_n}(x_1, x_2, \ldots, x_n) = \frac{\sum_{k=1}^{n} w_k x_k}{\sum_{k=1}^{n} w_k}. \tag{55}$$
Now, the LSM-DF via the mixture operator must be deduced. The following equation must be considered:
$$b_k = A_k x + v_k, \quad k = 1, 2, \ldots, L \tag{56}$$
where $x \in \mathbb{R}^{n \times 1}$ is an unknown vector, $A_k \in \mathbb{R}^{N \times n}$ are known parameter matrices, $b_k \in \mathbb{R}^{N \times 1}$ are known vectors, and $v_k \in \mathbb{R}^{N \times 1}$ are vectors named residuals.
A solution to the least-squares problem via the mixture operator must minimize the length of the residual vector, that is, it must satisfy the following property:
$$\frac{\sum_{k=1}^{L} \|b_k - A_k \hat{x}\|_{W_k}^2}{\sum_{k=1}^{L} \|W_k\|^2} \le \frac{\sum_{k=1}^{L} \|b_k - A_k x\|_{W_k}^2}{\sum_{k=1}^{L} \|W_k\|^2} \tag{57}$$
for all $x \in \mathbb{R}^{n \times 1}$, where $W_k$ is a positive-definite symmetric matrix.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
$$\min_x J_{MIX}(x) \tag{58}$$
The functional $J_{MIX}(x)$ can be defined as
$$J_{MIX}(x) := MIX_{W_1, W_2, \ldots, W_L}\left( \underline{J}_1(x), \underline{J}_2(x), \ldots, \underline{J}_L(x) \right) \tag{59}$$
where
$$\underline{J}_k(x) := \|v_k\|^2 = \|b_k - A_k x\|^2, \quad k = 1, 2, \ldots, L. \tag{60}$$
By the definition of the mixture operator, Function (59) can be rewritten as
$$J_{MIX}(x) := \frac{\sum_{k=1}^{L} \|b_k - A_k x\|_{W_k}^2}{\sum_{k=1}^{L} \|W_k\|^2} = \frac{\sum_{k=1}^{L} \left( b_k - A_k x \right)^T W_k \left( b_k - A_k x \right)}{\sum_{k=1}^{L} \|W_k\|^2}. \tag{61}$$
The next theorem brings the solution to the least-squares problem via the mixture operator in (58).
Theorem 4. (LSM-DF via Mixture Operator) If the matrices $A_k$ with $k = 1, 2, \ldots, L$ have full rank and $W_k$ are symmetric positive-definite matrices, then there is a single optimal solution $\hat{x}$ to the least-squares problem via the mixture operator (LSM-DF via mixture operator) (58) that is given by:
$$\hat{x} = \left( \sum_{k=1}^{L} A_k^T W_k A_k \right)^{-1} \sum_{k=1}^{L} A_k^T W_k b_k. \tag{62}$$
The corresponding minimal value of $J_{MIX}(x)$ is
$$J_{MIX}(\hat{x}) = \sum_{k=1}^{L} b_k^T W_k b_k - \left( \sum_{k=1}^{L} b_k^T W_k A_k \right) \left( \sum_{k=1}^{L} A_k^T W_k A_k \right)^{-1} \left( \sum_{k=1}^{L} A_k^T W_k b_k \right). \tag{63}$$
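Proof. Consider the function
$$J_{MIX}(x) = \frac{\sum_{k=1}^{L} \left( b_k - A_k x \right)^T W_k \left( b_k - A_k x \right)}{\sum_{k=1}^{L} \|W_k\|^2} \tag{64}$$
that can be rewritten as
$$J_{MIX}(x) = \alpha \left( \beta - \mathcal{A} x \right)^T \mathcal{W} \left( \beta - \mathcal{A} x \right) \tag{65}$$
where
$$\mathcal{A} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_L \end{bmatrix}, \quad \beta = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_L \end{bmatrix}, \quad \mathcal{W} = \begin{bmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_L \end{bmatrix}, \quad \alpha = \frac{1}{\sum_{k=1}^{L} \|W_k\|^2} \tag{66}$$
and $\mathcal{W}$ is a block-diagonal positive-definite symmetric matrix with entries $W_k$.
To find the solution $\hat{x}$ of the optimization problem, $J_{MIX}(x)$ in (65) must be differentiated and set equal to zero:
$$\frac{\partial}{\partial x} \left[ \alpha \left( \beta - \mathcal{A} x \right)^T \mathcal{W} \left( \beta - \mathcal{A} x \right) \right] = 0 \ \Rightarrow \ \alpha \frac{\partial}{\partial x} \left[ \left( \beta - \mathcal{A} x \right)^T \mathcal{W} \left( \beta - \mathcal{A} x \right) \right] = 0. \tag{67}$$
On the basis of Theorem 2, the derivative is given by
$$\alpha \left( x^T \mathcal{A}^T \mathcal{W} \mathcal{A} - \beta^T \mathcal{W} \mathcal{A} \right) = 0. \tag{68}$$
Therefore,
$$\hat{x} = \left( \mathcal{A}^T \mathcal{W} \mathcal{A} \right)^{-1} \mathcal{A}^T \mathcal{W} \beta. \tag{69}$$
Through Matrices (66), the solution can be rewritten as
$$\hat{x} = \left( \sum_{k=1}^{L} A_k^T W_k A_k \right)^{-1} \sum_{k=1}^{L} A_k^T W_k b_k. \tag{70}$$
The minimal cost $J_{MIX}(\hat{x})$ can be expressed as
$$J_{MIX}(\hat{x}) = \beta^T \mathcal{W} \beta - \beta^T \mathcal{W} \mathcal{A} \hat{x} - \hat{x}^T \mathcal{A}^T \mathcal{W} \beta + \hat{x}^T \mathcal{A}^T \mathcal{W} \mathcal{A} \hat{x}. \tag{71}$$
Replacing (69) into (71), the result is
$$J_{MIX}(\hat{x}) = \beta^T \mathcal{W} \beta - \beta^T \mathcal{W} \mathcal{A} \left( \mathcal{A}^T \mathcal{W} \mathcal{A} \right)^{-1} \mathcal{A}^T \mathcal{W} \beta. \tag{72}$$
Replacing (66) into (72), the optimal cost can be rewritten as
$$J_{MIX}(\hat{x}) = \sum_{k=1}^{L} b_k^T W_k b_k - \left( \sum_{k=1}^{L} b_k^T W_k A_k \right) \left( \sum_{k=1}^{L} A_k^T W_k A_k \right)^{-1} \left( \sum_{k=1}^{L} A_k^T W_k b_k \right). \tag{73}$$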
Remark 6. 
The optimal solution of the LSM-DF via a mixture operator reduces to the LSM-DF in [4].
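Because the normalizing constant in the mixture cost is positive and does not depend on $x$, it cannot move the minimizer. The sketch below illustrates this numerically. It assumes NumPy, uses synthetic data invented for illustration, and takes the normalizer as the sum of squared spectral norms of the weights, which is an assumption of this sketch; any positive constant would give the same estimate.

```python
import numpy as np


def fused_estimate(A_list, b_list, W_list):
    """(sum A_k^T W_k A_k)^{-1} (sum A_k^T W_k b_k)."""
    gram = sum(A.T @ W @ A for A, W in zip(A_list, W_list))
    rhs = sum(A.T @ W @ b for A, b, W in zip(A_list, b_list, W_list))
    return np.linalg.solve(gram, rhs)


def j_mix(x, A_list, b_list, W_list):
    """Normalized cost: weighted residual sum divided by a constant in x."""
    # Assumption: the normalizer is the sum of squared spectral norms.
    alpha = 1.0 / sum(np.linalg.norm(W, 2) ** 2 for W in W_list)
    weighted = sum((b - A @ x) @ W @ (b - A @ x)
                   for A, b, W in zip(A_list, b_list, W_list))
    return alpha * weighted


t1 = np.array([0.0, 1.0, 2.0, 3.0])
t2 = np.array([0.5, 1.5, 2.5, 4.0])
A = [np.column_stack([t1, np.ones_like(t1)]),
     np.column_stack([t2, np.ones_like(t2)])]
b = [np.array([1.1, 2.9, 5.2, 6.8]), np.array([2.1, 3.9, 6.2, 9.1])]
W = [0.7 * np.eye(4), 0.3 * np.eye(4)]

x_hat = fused_estimate(A, b, W)
x_rescaled = fused_estimate(A, b, [5.0 * Wk for Wk in W])  # same minimizer
```

Moving away from `x_hat` in any direction increases `j_mix`, and rescaling every weight by the same factor leaves the estimate unchanged.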

4. Illustrative Example

In this section, we present artificially created (by authors) datasets in order to illustrate the behavior, effectiveness, and the relationship between the proposed methods for finding the best fitting curve to a given set of points from a mathematical point of view. Table 1 shows two simulated datasets about income and consumption.
First, the LSM was separately applied to the datasets, and the following results were found:
$$\hat{y}_1 = 0.49 x_1 + 52.69, \tag{74}$$
$$\hat{y}_2 = 0.49 x_2 + 53.65. \tag{75}$$
The MSEs between $\hat{y}_1$ and $y_1$ and between $\hat{y}_2$ and $y_2$ were 211.52 and 221.67, respectively. Model (74) was more accurate than Model (75).
Second, the LSM-DF via OWA, Choquet integral, and mixture operators were calculated in the two datasets, and the following weighting matrices were used in the simulation: $W_1 = 0.7\, \mathrm{diag}(10)$ and $W_2 = 0.3\, \mathrm{diag}(10)$; more weight was given to $W_1$ than to $W_2$. The following results were found:
$$\hat{y}_O = 0.49 x + 53.34, \tag{76}$$
$$\hat{y}_C = 0.49 x + 52.65, \tag{77}$$
$$\hat{y}_M = 0.49 x + 52.95. \tag{78}$$
The MSEs between $\hat{y}_O$, $\hat{y}_C$, and $\hat{y}_M$ and $y_1$ were 211.18, 211.57, and 211.28, respectively. The MSEs between $\hat{y}_O$, $\hat{y}_C$, and $\hat{y}_M$ and $y_2$ were 222.14, 223.87, and 223, respectively. Table 2 and Table 3 compare samples with regard to $x_1$ and $x_2$, respectively, of Equations (76)–(78). Table 4 compares the samples of $y_1$ to the samples generated by Equations (74), (76)–(78). Table 5 compares the samples of $y_2$ with the samples generated with Equations (76)–(78).
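The MSE values with respect to $y_1$ show that Models (76) and (78) were slightly more accurate than Model (74) (211.18 and 211.28 versus 211.52), whereas Model (77) was marginally less accurate (211.57). With respect to $y_2$, Model (75) retained the lowest MSE (221.67). Thus, in this example, the LSM-DF via OWA and mixture operators slightly outperformed the LSM on the first dataset.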

5. Conclusions

In this paper, the LSM-DF was studied through aggregation operators in order to explore different ways to aggregate data. More specifically, the LSM-DF via an OWA operator, the LSM-DF via a Choquet integral operator, and the LSM-DF via a mixture operator were defined. These operators were particularly chosen due to their efficiency when applied to other methods in different areas of knowledge [12,13,22,24,26]. These new methods provide a theoretical framework with variations of the classical least-squares method, which may be more suitable in certain applications. For instance, the LSM-DF via OWA operator could be chosen for situations where one wants to place greater weights on the first data entries.
The main objective of developing these methods is to estimate an optimal parameter for situations involving more than one dataset, and to show how it can be changed for different types of data. The methods were mathematically demonstrated by applying aggregation operators of the average type to the optimization problem. The illustrative example was set up to demonstrate the mathematical behavior of these procedures through fitting curves in comparison with an approach that does not incorporate the aggregation operators in its formulation.
In future studies, we want to explore some applications that can show the advantages and disadvantages of each method, and set up LSM for other aggregation operators such as a weighted OWA (WOWA) operator and a Sugeno integral operator. Furthermore, these methods will be extended to models subject to parametric uncertainties.

Author Contributions

Conceptualization, G.Q.d.J. and E.S.P.; Methodology, G.Q.d.J. and E.S.P.; Formal analysis, G.Q.d.J. and E.S.P.; Investigation, G.Q.d.J. and E.S.P.; Writing—original draft, G.Q.d.J.; Writing—review & editing, E.S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cheng, C.H.; Wang, J.W.; Wu, M.C. OWA-weighted based clustering method for classification problem. Expert Syst. Appl. 2009, 36, 4988–4995. [Google Scholar] [CrossRef]
  2. Milfont, T.; Mezzomo, I.; Bedregal, B.; Mansilla, E.; Bustince, H. Aggregation functions on n-dimensional ordered vectors equipped with an admissible order and an application in multi-criteria group decision-making. Int. J. Approx. Reason. 2021, 137, 34–50. [Google Scholar] [CrossRef]
  3. Flores-Sosa, M.; León-Castro, E.; Merigó, J.M.; Yager, R.R. Forecasting the exchange rate with multiple linear regression and heavy ordered weighted average operators. Eur. J. Oper. Res. 2022, 248, 108863. [Google Scholar]
  4. Sayed, A.H.; Al-Naffouri, T.Y.; Kailath, T. Robust Estimation for Uncertain Models in a Data Fusion Scenario. IFAC Proc. Vol. 2000, 33, 899–904. [Google Scholar] [CrossRef]
  5. Kailath, T.; Sayed, A.S.; Hassibi, B. Linear Estimation, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2000; 854p. [Google Scholar]
  6. Sayed, A.H.; Chandrasekaran, S. Parameter estimation with multiple sources and levels of uncertainties. IEEE Trans. Signal Process. 2000, 48, 680–692. [Google Scholar] [CrossRef]
  7. Lopes, C.G.; Sayed, A.H. Diffusion least-mean squares over adaptive networks: Formulation and performance analysis. IEEE Trans. Signal Process. 2008, 56, 3122–3136. [Google Scholar] [CrossRef]
  8. Cattivelli, F.; Lopes, C.G.; Sayed, A.H. Diffusion recursive least-squares for distributed estimation over adaptive networks. IEEE Trans. Signal Process. 2008, 56, 1865–1877. [Google Scholar] [CrossRef]
  9. Takahashi, N.; Yamada, I.; Sayed, A.H. Diffusion least-mean squares with adaptive combiners: Formulation and performance analysis. IEEE Trans. Signal Process. 2010, 58, 4795–4810. [Google Scholar] [CrossRef]
  10. Choquet, G. Theory of capacities. Ann. de lÍnstitut Fourier 1953, 5, 131–295. [Google Scholar] [CrossRef]
  11. Give’on, Y. Lattice matrices. Inf. Control 1964, 7, 477–484. [Google Scholar] [CrossRef]
  12. Yager, R.R. Ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Trans. Syst. Man Cybern. 1988, 18, 183–190. [Google Scholar] [CrossRef]
  13. Zhou, S.M.; Chiclana, F.; John, R.I.; Garibald, J.M. Type-1 OWA operators for aggregating uncertain information with uncertain weights induced by type-2 linguistic quantifiers. Fuzzy Sets Syst. 2008, 159, 3281–3296. [Google Scholar] [CrossRef]
  14. Paternain, D.; Fernandez, J.; Bustince, H.; Mesiar, R.; Beliakov, G. Construction of image reduction operators using averaging aggregation functions. Fuzzy Sets Syst. 2015, 261, 87–111.
  15. Beliakov, G.; Bustince, H.; Calvo, T. A Practical Guide to Averaging Functions (Studies in Fuzziness and Soft Computing); Springer: Berlin/Heidelberg, Germany, 2016; Volume 329.
  16. Bedregal, B.; Bustince, H.; Palmeira, E.; Dimuro, G.; Fernandez, J. Generalized Interval-valued OWA operators with interval weights derived from interval-valued overlap functions. Int. J. Approx. Reason. 2017, 90, 1–16.
  17. Joy, G. The Determinant and Rank of a Lattice Matrix. Glob. J. Pure Appl. Math. 2017, 13, 1745–1761.
  18. Dimuro, G.P.; Fernandez, J.; Bedregal, B.; Mesiar, R.; Sanz, J.A.; Lucca, G.; Bustince, H. The state-of-art of the generalization of the Choquet integral: From aggregation and pre-aggregation to ordered directionally monotone functions. Inf. Fusion 2020, 57, 27–43.
  19. Asmus, T.; Dimuro, G.; Bedregal, B.; Sanz, J.A.; Fernandez, J.; Rodriguez-Martinez, J.; Mesiar, R.; Bustince, H. A constructive framework to define fusion functions with floating domains in arbitrary closed real intervals. Inf. Sci. 2022, 601, 800–829.
  20. Flores-Sosa, M.; Avilés-Ochoa, E.; Merigó, J.M.; Yager, R.R. Volatility GARCH models with the ordered weighted average (OWA) operators. Inf. Sci. 2021, 565, 46–61.
  21. Medina, J.; Yager, R.R. OWA operators with functional weights. Fuzzy Sets Syst. 2021, 414, 38–56.
  22. Flores-Sosa, M.; Avilés-Ochoa, E.; Merigó, J.M.; Kacprzyk, J. The OWA operator in multiple linear regression. Appl. Soft Comput. 2022, 124, 108985.
  23. Llamazares, B. Constructing Choquet integral-based operators that generalize weighted means and OWA operators. Inf. Fusion 2015, 23, 131–138.
  24. Jia, X.; Wang, Y. Choquet integral-based intuitionistic fuzzy arithmetic aggregation operators in multi-criteria decision-making. Expert Syst. Appl. 2022, 191, 116242.
  25. Pereira, R.A.M.; Ribeiro, R.A. Aggregation with generalized mixture operators using weighting functions. Fuzzy Sets Syst. 2003, 137, 43–58.
  26. Ribeiro, R.A.; Pereira, R.A.M. Generalized mixture operators using weighting functions: A comparative study with WA and OWA. Eur. J. Oper. Res. 2003, 145, 329–342.
  27. Santana, F.; Bedregal, B.; Viana, P.; Bustince, H. On admissible orders over closed subintervals of [0, 1]. Fuzzy Sets Syst. 2021, 399, 44–54.

Table 1. Simulated datasets of income and consumption.

| Income ($x_1$) | Consumption ($y_1$) | Income ($x_2$) | Consumption ($y_2$) |
|---|---|---|---|
| 139 | 122 | 140 | 123 |
| 126 | 114 | 129 | 117 |
| 90 | 86 | 92 | 89 |
| 144 | 134 | 145 | 136 |
| 163 | 146 | 163 | 147 |
| 136 | 107 | 138 | 109 |
| 61 | 68 | 64 | 68 |
| 62 | 117 | 63 | 119 |
| 41 | 71 | 43 | 73 |
| 120 | 98 | 122 | 100 |

Table 2. Estimated consumption for the income sample $x_1$, computed with Equations (76)–(78).

| Income ($x_1$) | Consumption ($\hat{y}_O$) | Consumption ($\hat{y}_C$) | Consumption ($\hat{y}_M$) |
|---|---|---|---|
| 139 | 121.45 | 120.76 | 121.06 |
| 126 | 115.08 | 114.39 | 114.69 |
| 90 | 97.44 | 96.75 | 97.05 |
| 144 | 123.90 | 123.21 | 123.51 |
| 163 | 133.21 | 132.52 | 132.82 |
| 136 | 119.98 | 119.29 | 119.59 |
| 61 | 83.23 | 82.54 | 82.84 |
| 62 | 83.72 | 83.03 | 83.33 |
| 41 | 73.43 | 72.74 | 73.04 |
| 120 | 112.14 | 111.45 | 111.75 |

Table 3. Estimated consumption for the income sample $x_2$, computed with Equations (76)–(78).

| Income ($x_2$) | Consumption ($\hat{y}_O$) | Consumption ($\hat{y}_C$) | Consumption ($\hat{y}_M$) |
|---|---|---|---|
| 140 | 121.94 | 121.25 | 121.55 |
| 129 | 116.55 | 115.86 | 116.16 |
| 92 | 98.42 | 97.73 | 98.03 |
| 145 | 124.39 | 123.70 | 124.00 |
| 163 | 133.21 | 132.52 | 132.82 |
| 138 | 120.96 | 120.27 | 120.57 |
| 64 | 87.40 | 84.01 | 84.31 |
| 63 | 84.21 | 83.52 | 83.82 |
| 43 | 74.41 | 73.72 | 74.02 |
| 122 | 113.12 | 112.43 | 112.73 |

Table 4. Observed consumption $y_1$ and the estimates generated with Equations (74) and (76)–(78).

| $y_1$ | $\hat{y}_1$ | $\hat{y}_O$ | $\hat{y}_C$ | $\hat{y}_M$ |
|---|---|---|---|---|
| 122 | 120.80 | 121.45 | 120.76 | 121.06 |
| 114 | 114.43 | 115.08 | 114.39 | 114.69 |
| 86 | 96.79 | 97.44 | 96.75 | 97.05 |
| 134 | 123.25 | 123.90 | 123.21 | 123.51 |
| 146 | 132.56 | 133.21 | 132.52 | 132.82 |
| 107 | 119.33 | 119.98 | 119.29 | 119.59 |
| 68 | 82.58 | 83.23 | 82.54 | 82.84 |
| 117 | 83.07 | 83.72 | 83.03 | 83.33 |
| 71 | 72.78 | 73.43 | 72.74 | 73.04 |
| 98 | 111.49 | 112.14 | 111.45 | 111.75 |

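As a rough cross-check on the $\hat{y}_1$ column of Table 4, the sketch below fits a straight line to the first dataset of Table 1 by ordinary least squares in plain Python. It assumes that Equation (74), which is not reproduced in this part of the article, is a linear fit of $y_1$ on $x_1$; the helper name `fit_line` is hypothetical and introduced only for illustration.

```python
# Dataset 1 from Table 1: income x1 and observed consumption y1.
x1 = [139, 126, 90, 144, 163, 136, 61, 62, 41, 120]
y1 = [122, 114, 86, 134, 146, 107, 68, 117, 71, 98]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs.

    Hypothetical helper for illustration; it assumes a straight-line model,
    which may or may not be exactly what Equation (74) in the article uses.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(x1, y1)
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}")
```

With these data the fit gives an intercept of about 52.69 and a slope of about 0.495. The tabulated $\hat{y}_1$ values are reproduced by $52.69 + 0.49\,x_1$ (for example, $52.69 + 0.49 \cdot 139 = 120.80$), which suggests that the slope was rounded to two decimals before the column was computed.
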
Table 5. Sample of $y_2$ and the samples generated by Equations (76)–(78).

| $y_2$ | $\hat{y}_2$ | $\hat{y}_O$ | $\hat{y}_C$ | $\hat{y}_M$ |
|---|---|---|---|---|
| 123 | 122.25 | 121.94 | 121.25 | 121.55 |
| 117 | 116.86 | 116.55 | 115.86 | 116.16 |
| 89 | 98.73 | 98.42 | 97.73 | 98.03 |
| 136 | 124.70 | 124.39 | 123.70 | 124.00 |
| 147 | 133.52 | 133.21 | 132.52 | 132.82 |
| 109 | 121.27 | 120.96 | 120.27 | 120.57 |
| 68 | 85.01 | 87.40 | 84.01 | 84.31 |
| 119 | 84.52 | 84.21 | 83.52 | 83.82 |
| 73 | 74.72 | 74.41 | 73.72 | 74.02 |
| 100 | 113.43 | 113.12 | 112.43 | 112.73 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

de Jesus, G.Q.; Palmeira, E.S. Least Squares in a Data Fusion Scenario via Aggregation Operators. Axioms 2022, 11, 678. https://doi.org/10.3390/axioms11120678