Two Novel Implementations of the Remez Multiple Exchange Algorithm for Optimum FIR Filter Design

One of the main advantages of the linear-phase FIR filters over their IIR counterparts is the fact that there exist efficient algorithms for optimizing the arbitrary-magnitude FIR filters in the minimax sense. In case of IIR filters, the design of arbitrary-magnitude filters is usually time-consuming and the convergence to the optimum solution is not always guaranteed. The most efficient method for designing optimum magnitude linear-phase FIR filters with arbitrary-magnitude specifications is the Remez algorithm and the most frequent method to implement this algorithm is the one originally proposed by Parks and McClellan. Initially, they came upwith the design of conventional low-pass linear phase FIR filters, whose impulse response is symmetric and the order is even [5]. Later on, along with Rabiner they extended the original design technique in [7] such that the generalized algorithm is applicable to the design of all the four linear-phase FIR filter types with arbitrary specifications. Due to the initial work of Parks and McClellan for the extended algorithm, this algorithm is famously known as the Parks-McClellan (PM) algorithm.


Introduction
One of the main advantages of the linear-phase FIR filters over their IIR counterparts is the fact that there exist efficient algorithms for optimizing the arbitrary-magnitude FIR filters in the minimax sense. In case of IIR filters, the design of arbitrary-magnitude filters is usually time-consuming and the convergence to the optimum solution is not always guaranteed. The most efficient method for designing optimum magnitude linear-phase FIR filters with arbitrary-magnitude specifications is the Remez algorithm and the most frequent method to implement this algorithm is the one originally proposed by Parks and McClellan. Initially, they came up with the design of conventional low-pass linear phase FIR filters, whose impulse response is symmetric and the order is even [5]. Later on, along with Rabiner they extended the original design technique in [7] such that the generalized algorithm is applicable to the design of all the four linear-phase FIR filter types with arbitrary specifications. Due to the initial work of Parks and McClellan for the extended algorithm, this algorithm is famously known as the Parks-McClellan (PM) algorithm.
The PM algorithm was generated in the beginning of 1970 by using FORTRAN. During that era, the computer resources were quite limited. When people applied this algorithm in practice for high-order filters, they failed to achieve the optimum results. This gave rise to two main doubts about the PM algorithm. The first doubt was that the algorithm is not properly constructed and is valid for only low-order filters design. However, after noticing that the maximum number of iterations in the algorithm implementation is set to only 25 which is quite low to achieve the optimum solutions for high order filters, it became clear that the first doubt is superfluous.
The second doubt was concerned with the implementation of the PM algorithm's search strategy for the "real" extremal points of the weighted error function, which is formed based on the "trial" extremal points. While mimicking the search technique included in the PM algorithm in various Remez-type algorithms for designing recursive digital filters [15,16,18,19], it was observed that some of the them are quite sensitive to the selection of the initial set of the "trial" extremal points. Moreover, they suffered from certain issues which prevented them to converge at the optimum solution in some cases. For algorithms described in [15], and [18], the convergence issue was solved by increasing the maximum number of iterations in the above-mentioned manner. However, the algorithms described in [16] and [19] have still remained sensitive to the selection of the initial set of the "trial" extremal points. This sensitivity motivated the authors of this contribution to figure out how the search for the "true" extremal points is really carried out in the core discrete Remez algorithm part in the FORTRAN implementation of the PM algorithm. After a thorough study of the FORTRAN code, it was observed that this code utilizes almost twenty interlaced "go to" statements which are quite redundant for locating the "real" extremal points. Meticulous investigation of the code revealed that it is possible to decompose the one large chunk of the search technique into two compact search techniques referred to as Vicinity Search and Endpoint Search in such a manner that the same optimum solution can be achieved in an efficient manner as follows.
In Vicinity Search, the candidate "real" extremal point is located in the vicinity of each "trial" extremal point, which is bounded by the preceding and the following "trial" extremal points with the exception of the first (last) point for which the lower (upper) bound is the first (last) grid point in use. Endpoint Search, in turn, checks whether before the first (after the last) local extremum found by Vicinity Search there is an additional first (last) local extremum of opposite sign. If one or both of such extrema exist, then their locations are considered as candidate "real" extremal points of the overall search consisting of Vicinity Search and Endpoint Search.I n this case, there are one or two more candidate "real" extremal points as Vicinity Search already provides the desired number of "real" extremal points. In the PM algorithm, the desired final "real" extremal points are determined according to the following three options: Option 1: The additional last extremum exists such that its absolute value is larger than or equal to those of the first extremum of Vicinity Search and the possible additional first extremum.

Option 2:
The additional first extremum exists such that its absolute value is larger than or equal to that of the first extremum of the Vicinity Search and larger than that of the possible additional last extremum. For Option 3, the final "real" extremal points are the ones obtained directly from the Vicinity Search,w h e r e a sf o rOption 1 (Option 2), these points are obtained by omitting the first (last) point based on Option 3 and by replacing the last (first) point with the one found by Endpoint Search. Based on the above-mentioned facts, an extremely compact translation of the original FORTRAN implementation into a corresponding MATLAB implementation has been reporeted in [2].
The above study on how the PM algorithm performs the search for the "real" extremal points indicates that mimicking this search principle in the Remez-type algorithms proposed in [16], [19] does not give the flexibility to transfer two extremal points between the two consecutive bands, for example, from a passband to a stopband or vice versa, which is a necessary prerequisite for convergence to the optimum solution in certain cases. Most significantly, the search technique included in the PM algorithm does not follow the fundamental idea of the Remez multiple exchange (RME) algorithm when the approximation interval is a union of three or more disjoint intervals [8,11,20]. That is, if there are more candidate "real" extremal points than required, then the desired points should be selected in such a way that the ones corresponding to the largest absolute values of the weighted error functions are retained subject to the condition that the sign of the weighted error function alternates at the consecutive points. An efficient MATLAB based implementation following the above mentioned fundamental notion of the RME algorithm has been reported in [3] and provides significant improvements in designing the multiband FIR filters.
In the beginning, the main purpose of the authors of this contribution was to modify the core discrete Remez part of the PM algorithm in FORTRAN such that it follows the fundamental principle of the RME algorithm and can ultimately be incorporated in the algorithms proposed in [16] and [19]. However, during the course of modifications, it was observed that a modified MATLAB implementation mimicking the original FORTRAN implementation is quite effective and superior to already available MATLAB implementation of the algorithm in the function firpm [21]. Based on the above discussion, this chapter describes two novel MATLAB based implementations of the Remez algorithm within the PM algorithm. Implementation I is an extremely fast and compact translation of the Remez algorithm part of the original FORTRAN code to the corresponding MATLAB code and is valid for general purpose linear-phase FIR filters design [2]. It is worth noting that Implementation I imitates the implementation idea of the Remez algorithm presented in PM algorithm. Implementation II is based on the fundamental notion of the Remez algorithm as described in [20] and provides significant improvements in designing the multiband FIR filters [3]. It is important to note that this chapter emphasizes on the practical MATLAB based implementation aspects of the Remez algorithm. In order to get an overview of the theoretical aspects of the Remez algorithm, please refer to [10,17]. The organization of this chapter is as follows. Section (2) formally states the problem under consideration, Section (3) describes the Implementation I in detail, Section (4) discusses the Implementation II in detail, and finally, the concluding remarks are presented in Section (5).

Problem statement
After specifying the filter type, the filter order, and the filter specifications such that the problem is solvable using the RME algorithm, the essential problem in the PM algorithm is the following: Core Discrete Approximation Problem: Given nz − 1 1 , the number of unknowns a[n] for n = 0, 1, . . . , nz − 2, and the grid points grid(k) included in the vector grid of length ngrid,which contains values between 0 and 0.5 2 , along with the vectors des and wt of the same length ngrid, the entries of which carry the information of the desired and weight values, respectively, at the corresponding grid points of the vector grid, find the unknowns a[n] to minimize the following quantity: 1 In this contribution, nz − 1 is chosen to be the number of adjustable parameters in both the FORTRAN and the MATLAB implementations of the PM algorithm because in this case nz stands for the number of extrema at which Alternation Theorem should be satisfied in order to guarantee the optimality of the solution. 2 In the original PM algorithm, this range is the baseband for the so-called normalized frequencies from which the corresponding angular frequencies are obtained by multiplying these frequencies by 2π [17]. and According to Alternation (characterization) Theorem [4,12,13], G(k) of the form of (1c) is the best unique solution minimizing ǫ as given by (1a) if and only if there exists a vector ℓ opt that contains (at least) nz entries ℓ opt (1), ℓ opt (2),...,ℓ opt (nz) having the values of k within 1 ≤ k ≤ ngrid such that It is worth mentioning that the core discrete approximation problem is the same for both the implementations I and II as defined in the Introduction.

Implementation I
This section discusses Implementation I in detail as follows. First, the theoretical formulation of the algorithm is described so that the reader can grasp the very essence of the MATLAB code snippet provided later on in this section. Second, during this inclusion, it is emphasized that instead of using approximately 15 nested loops and around 300 lines of code, only 3 looping structures and approximately 100 lines of code are required by Implementation I. Third, it is shown, by means of four examples, that the overall CPU execution time required by the proposed implementation to arrive practically in the same manner at the optimum FIR filter designs is only around one third in comparison with the original implementation. Fourth, in the last two examples, there are unwanted peaks in the transition bands. In order to suppress these peaks to acceptable levels, two methods of including the transition bands in the original problem are introduced.

Theoretical formulation
The theoretical formulation of the proposed algorithm is roughly classified into the initialization phase and the iteration phase. The initialization phase performs the necessary initializations for the algorithm, whereas the iteration phase carries out the actual Remez exchange loop. In order to explain why Implementation I is a compact and efficient MATLAB based routine, the iteration phase is further decomposed into four well-defined primary segments. Each segment is constructed in such a way that before the start of the basic steps, there is a thorough explanation on the benefits of carrying out the segment under consideration with the aid of the proposed basic steps.
• Initialize the iteration counter as niter = 1 and set the maximum number of iterations as itrmax = 250.

Iteration phase
The Remez exchange loop iteratively locates the desired optimum vector ℓ trial having as its entries the nz values of k within 1 ≤ k ≤ ngrid,a tw h i c hAlternation Theorem is satisfied as follows. In the first loop, the vector ℓ trial found in the initialization phase is the first "trial" vector for being the desired optimum vector ℓ opt . As this is extremely unlikely to happen, The efficiency of Implementation I in comparison with the original MATLAB function firpm, in terms of significant reduction in the code compactness and a considerable reduction in the CPU execution time for obtaining practically in the same manner the best linear-phase FIR solutions are mostly based on the following two novel facts. First, the steps under Segment 1 are accomplished by employing the efficient MATLAB vectorization operations whenever possible and, most importantly, by avoiding the call for one subroutine by replacing this call with highly efficient matrix operations available in MATLAB. Second, as already mentioned in the introduction, the lengthy search technique involved in the function firpm for locating the true extremal points based on the weighted error function can be compressed into Vicinity Search and Endpoint Search.I nt h es e q u e l ,Segment 2 and Segment 3 will take care of Vicinity Search and Endpoint Search, respectively. More detail can be found in the actual implementations of Segments 1, 2, and 3.
Finally, Segment 4 checks whether ℓ real ≡ ℓ trial or not. If this equivalence is established, then the best solution has been found as in this case ℓ opt ≡ ℓ real ≡ ℓ trial . Otherwise, the whole process is repeated by using the "real" vector ℓ real of the present iteration as a "trial" vector for the next iteration. This exchange of the vectors is continued until ℓ real and ℓ trial coincide or the maximum allowable number of the iterations is exceeded, which is extremely unlikely to occur.

Segment 1:
After knowing the "trial" vector ℓ trial that contains the nz trial values of k in the ascending order for 1 ≤ k ≤ ngrid in the present iteration, this first segment guarantees that Alternation Theorem is satisfied when concentrating only on those nz values of k being involved in the vector ℓ trial . For this purpose, it generates the weighted error vector wei_err(k) for 1 ≤ k ≤ ngrid such that the following system of nz equations: is implicitly solved for the nz − 1 unknowns a[0], a [1],...,a[nz − 2] as well as for dev. 4 For this purpose, similar to the function firpm, for a given "trial" vector ℓ trial ,thevalueofwei_err(k) at k = ℓ trial (1), denoted by dev, the corresponding abscissa vector x, the ordinate vector y, and the coefficient vector ad, each of which are of length nz,aredetermined.Thesevectorsare required to express the zero-phase frequency response when using the Lagrange interpolation formula in the barycentric form at each value of k for 1 ≤ k ≤ ngrid, thereby making the implementation of the Remez loop very accurate and efficient.
In comparison with many scattered scalar operations in the original function firpm,t h e MATLAB code snippet, which is available in the following subsection, is computationally efficient and is highly compact due to the above-mentioned vectors. In addition to that, the time consuming subroutine of "remezdd" is replaced with simple and highly efficient matrix operations. Further improvements are obtained by using the vector grid, which contains the grid points under consideration, as well as des and wt, which carry information of the desired values and weights at these grid points, respectively. With the above-mentioned data, the weighted error function is generated only once during each iteration and is a single vector wei_err. This vector plays a pivotal role in the implementations of Vicinity Search in Segment 2 and Endpoint Search in Segment 3.
This segment is performed by using the following ten steps: Step 1: Determine the entries of the vectors x and ad as and respectively, as well as the corresponding deviation value as .

( 6 )
Step 2: Determine the entries of the vector y as Step 3: Generate the entries of the abscissa vector x_all covering all the entries in the vector grid as Step 4: Select the entries of the vectors err_num and err_den of length ngrid to be zero valued, set m = 1, and go to the next step.
Step 6: Set m = m + 1. If m > nz, then go to the next step. Otherwise, go to the previous step.

wei_err(k)=[err_num(k) err_den(k) − des(k)]wt(k).
The resulting wei_err contains undefined values at the entries of ℓ trial d u et ot h eu s eo f the Lagrange interpolation formula in the barycentric form. The undefined values can be conveniently filled based on the fact that at ℓ trial (m) with m odd (even), the desired value is dev (−dev),wheredev is given by (6). Hence, the vector wei_err can be completed by using the following three steps: Step 8: Set m = 1 and go to the next step.
Step 9: Update the vector wei_err as wei_err(ℓ trial (m)) = +dev for m odd −dev for m even.
Step 10: Set m = m + 1. If m < nz + 1, then go to the previous step. Otherwise, go to Step 1 under Segment 2.

Segment 2:
This segment explains how to perform Vicinity Search based on the values of the weighted error function wei_err(k) for 1 ≤ k ≤ ngrid, which has been generated at Segment 1, and the "trial" vector ℓ trial , which is under consideration in the present iteration. The key idea in Vicinity Search is to determine the mth entry of the "real" vector ℓ real , denoted by ℓ real (m) for m = 1, 2, . . . , nz,tobethevalueofk in the close vicinity of k = ℓ real (m),wherea local extremum of wei_err(k) with the same sign occurs. The location of these nz entries are simplified as follows.
In the first phase, the search of both local minima and maxima is reduced to that of local maxima by multiplying the values of wei_err(k) for 1 ≤ k ≤ ngrid by sign[wei_err(ℓ trial (m))] as in this case the values of the resulting signed weighted function sign[wei_err(ℓ trial (m))] × 43 Two Novel Implementations of the Remez Multiple Exchange Algorithm for Optimum FIR Filter Design wei_err(k) become positive at k = ℓ trial (m) as well as in its proximity. In the second phase, the proper location of each ℓ real (m) for m = 1, 2, . . . , nz can be obtained conveniently based on the following facts. First Segment 1 guarantees that for m > 1 [m < nz], the "signs" of wei_err(k) at both k = ℓ trial (m − 1) and k = ℓ trial (m + 1) is opposite to that of k = ℓ trial (m),or correspondingly, at k = ℓ real (m). Second, during the course of the present search, ℓ real (m − 1) is located before ℓ real (m) in such a way that the sign of wei_err(k) at k = ℓ real (m − 1) is opposite to that of the sign of wei_err(k) at k = ℓ real (m + 1). The above mentioned facts together with the reasoning that the lowest value of k for locating ℓ real (1) is 1 and the highest value of k for locating ℓ real (nz) is ngrid inherently lead to carrying out Vicinity Search by using the following three steps: Step 1: Set m = 1 and go to the next step.
Step 2: Find the mth element, denoted by ℓ real (m), at which the vector Step 3: Set m = m + 1. If m < nz + 1, then go to the previous step. Otherwise, go to Step 1 under Segment 3.

Segment 3:
This segment explains how to perform Endpoint Search.A f t e rVicinity Search,t h e role of Endpoint Search is to check whether the weighted error function wei_err(k) contains an additional local extremum before k = ℓ real (1) [after k = ℓ real (nz)] such that its sign is opposite to that of occurring at k = ℓ real (1)[k = ℓ real (nz)]. It is worth emphasizing that in order to take into account all the candidate extrema, Endpoint Search is necessary to be used after Vicinity Search as Vicinity Search totally omits the existence of these possible additional extrema.
The appearance of the additional first local extremum implies that 5 is larger than or equal to 1. If this fact holds true, then the largest entry in the sub-vector −sign[wei_err(ℓ real (1))] · wei_err(1:upp end ) should be positive. Similarly, the existence of the additional last local extremum implies that is smaller than or equal to ngrid. If this fact holds true, then the largest entry in the subvector −sign[wei_err(ℓ real (nz))] · wei_err(low end : ngrid) should be positive.
If no additional extremum exists, then the "final" ℓ real is the one found in Vicinity Search. Otherwise (that is, one or both the additional extrema exist), the final ℓ real is constructed according to the following alternatives: Alternative 1: The additional last extremum exists such that its absolute value is larger than or equaltothoseofthefirstextremumfoundbyVicinity Search and the possible additional first extremum.

Alternative 2:
The additional first extremum exists such that its absolute value is larger than or equal to that of the first extremum found by Vicinity Search and larger than that of the possible additional last extremum.
If Alternative 1 (Alternative 2) is valid, then the final ℓ real is formed such that the first (last) entry of ℓ real of Vicinity Search is disregarded and the last (first) entry is the value of k for 1 ≤ k ≤ ngrid, where the additional last (first) maximum of the signed weighted error function The above explanation is the key idea to perform Endpoint Search in the function firpm. However, the function firpm performs Endpoint Search in a lengthier manner and in order to exactly follow this strategy, it is carried out by using the following eight steps: Step 1:S e tendsearch = 0.
Step 4: Determine low end according to (14). If low end = ngrid + 1, then go to Step 6.Ot h erw ise, find the index, denoted by˜ end_real (nz), where the vector achieves its maximum entry value. Set ℓ end_real (nz)=l end_real (nz)+low end − 1a n ds t o r e the corresponding maximum entry value as err_end(nz).
45 Two Novel Implementations of the Remez Multiple Exchange Algorithm for Optimum FIR Filter Design Step 6:I fendsearch = 0, then go to Step 1 under Segment 4. Otherwise, go to the next step.

Segment 4:
This concluding segment check the convergence of the Remez exchange loop as follows. If the entries of the vectors ℓ trial and ℓ real are the same, then stop as in this case the ultimate goal ℓ opt ≡ ℓ real ≡ ℓ trial has been achieved. Otherwise, use the present "real" vector ℓ real as the "trial" vector for the subsequent iteration by using the substitution ℓ trial = ℓ real and go to Step 1 under Segment 1. Continue the Remez loop until ℓ trial and ℓ real coincide or the value of the iteration counter niter exceeds itrmax = 250, which is extremely unlikely to occur.
This segment requires only the following two basic steps: Step 1:I fℓ trial and ℓ real coincide, then stop. Otherwise, go to the next step.

MATLAB code snippet
The code pasted below has been tested for realizing Implementation I by using MATLAB version 7.11.0.584(R2010b). In order to embed this code snippet in the MATLAB function firpm, edit the function by taking away the code between lines 214 and 514, introduce the function call (the first line of the function of remez_imp1) and copy the function at the end of the file. Remember to take the backup copy of the original function. It is worth emphasizing that the implementation of Remez algorithm in the function firpm uses approximately 15 nested loops and 300 lines of code, whereas the code snippet provided below requires only 3 looping structures and approximately 100 lines of code to achieve the same optimum solution. 1 function [x,y,ad,dev] = remez_imp1(nz,iext,ngrid,grid,des,wt) 2 % remez_imp1 implements the Segments 1 -4 described in the preceding 3 % section, the function needs to be inserted within the MATLAB function 4 % firpm. The input argument values come directly from the function firpm 5 % and the output arguments are required to perform the Inverse Fourier 6 % transform in order to calculate the filter coefficients. In case of 7 % any issues send an e-mail to muhammad"dot"ahsan"at"tut "dot" fi.      (1)

Performance comparison
This section presents a performance comparison between the Implementation I and the implementation available in the MATLAB function firpm. The performance measurement criteria is the time taken by both the implementations to design a particular filter and is measured with the help of MATLAB built-in function profiler. This function indicates the CPU execution time taken by a function and provides the following information: • Calls − The number of times the function was called while profiling was on.
• Total Time − The total time spent in a function, including all child functions called, in seconds.
•S e l f T i m e − The total time taken by an individual function, not including the time for any child functions called, in seconds.
Time measurement was carried out on an IBM ThinkCentre machine equipped with Intel Core 2 Duo processor E6550 running at a speed of 2.33 GHz with a memory of 3 GB.
The following four filters were designed with both implementations. After the examples, the time taken by them will be tabulated in Table 1.

Example 1:
It is desired to design a lowpass filter meeting the following criteria: ω p = 0.05π, ω s = 0.1π, δ p = 0.01, and δ s = 0.001. The magnitude response of the resulting filter is shown in Fig. 1. The magnitude response of the resulting filter is shown in Fig. 2.

Example 3:
It is desired to synthesize a bandpass filter meeting the following criteria:   Figure 3 illustrates the magnitude response of the resulting filter. Although the amplitude response is optimal according to Alternation Theorem, it is worth noting that this particular filter has an extra peak in the second transition band region of approximately 16 dB. This is because of the fact that the approximation interval is a union of passband and stopband regions and transition bands are considered as "don't care" regions. This assumption works perfectly for filters having bands less than three. However, in case of three or more bands, there is no guarantee that the response is well-behaved in the transition bands, even though it is optimal according to the approximation theory. This fact is especially prominent if any one of the following holds true [9]: • Transition bandwidths are very large compared to the passband and/or stopband widths.
• Width of transition bands is different; the larger the difference, the greater the problem.
In order to conveniently avoid the appearance of the unwanted transition peaks, consider an original problem stated for linear-phase Type I and Type II filters as follows.
First, there are R interlaced passband and stopband regions as given by Such that these regions do not overlap. The lower and upper limits for the zero-phase frequency response in these bands are, respectively, specified as Here, the D κ 's alternatingly achieve the values of zero and unity such that the first value is zero (unity) if the first band is a stopband (passband). For this original problem, the overall approximation region is This region can be extended to cover the transition bands as follows: where In order to guarantee that Ω κ is still a closed subset of [0, π], α should be a small positive number. 6 There are two natural ways to state the transition band constraints, referred to as Type A and Type B transition band constraints. For both types, the upper and lower limits for the zero-phase frequency response in the κth transition band Ω T κ are specified as follows. For both Type A and Type B, the upper limit is 7 whereas the lower limits depend on the type as follows: The above limits for Type A are determined such that if the filter meets the overall criteria, then the maximum (minimum) value of the zero-phase frequency response in each transition band 6 The only condition for α is that it should be small enough to avoid the extra peaks between the adjacent passbands and the newly formed intervals in the transition band regions. 7 It is worth emphasizing that the use of max{D κ + δ κ , D κ+1 + δ κ+1 } implies that the maximum allowable value in the nearest passband is the upper limit. Similarly, in the following equation, min{D κ − δ κ , D κ+1 − δ κ+1 } means that the lower limit is the one in the nearest stopband. 51 Two Novel Implementations of the Remez Multiple Exchange Algorithm for Optimum FIR Filter Design is less than or equal to the stated upper limit in the nearest passband region (larger than or equal to the stated lower limit in the nearest stopband region). For Type B, in turn, the upper limit is the same, whereas the lower limit is obtained from the upper limit by changing its sign, thereby indicating that the magnitude response of the filter is less than or equal to the stated upper limit in the nearest passband region.
The desired value in the κth transition band Ω T κ for both types is the average of the corresponding lower and upper limits, whereas the admissible deviation is the difference between the upper limit and the above-mentioned desired value. Hence, in Ω T κ for κ = 1, 2, . . . , R − 1 the desired values, denoted by D κ , and the admissible deviations, denoted by δ κ , are as follows: and The following MATLAB function converts the original design specifications into those ones including either Type A or Type B transition band constraints as follows. The first three input parameters F_ori, Des_ori,a n dDev_ori contain the 2R edges of the R bands as a fraction of π as well as the desired values and the admissible deviations from these values in the R bands in the original specifications. . "alpha" corresponds directly to α which is used in (17e), whereas itype=1 (itype=2) means that Type A (Type B) transition band constraints are in use. The output of this function consists of vectors F, Des,a n dWt that are in the desired form when calling the MATLAB function firpm in its original form or its modifications referred to as Implementation I or II in this contribution.     for k=1:R 25 F(4 * (k-1)+1)=F_ori(2 * k-1); F(4 * (k-1)+2)=F_ori(2 * k);

Example 4:
It is required to synthesize a bandstop filter meeting the following criteria: The magnitude response of the resulting bandstop filter is shown in Fig. 5. This response is the best one according to Alternation Theorem, but contains two extra peaks of approximately 33 and 15 dB in the first transition band. By using the technique described above, the transition band peaks can be attenuated to an acceptable level. The relevant MATLAB commands are 1  As seen in Fig. 6, the overall response of the filters of the same order as the original one, that is, 102, stay within the desired limits for both Type A and Type B transition band constraints. Among these two constrained designs, Type A constrained design is preferred as for it the undershoot of the zero-phase frequency response is limited to be larger than or equal to −δ s = −0.001. Furthermore, the response in the first passband remains equiripple.  Table 1 indicates the outcomes obtained from the original implementation and the Implementation I of the Remez algorithm, both of which work practically in the same manner. It is evident that the time required by the Implementation I is almost one third of the time taken by the original implementation and illustrates the superiority of the proposed MATLAB implementation of the algorithm.

Implementation II
This section discusses the Implementation II in detail. First, a theoretical formulation of the algorithm is presented and, then, the corresponding MATLAB code is specified. After that, a detailed comparison shows how this implementation is superior to the original implementation of the Remez algorithm in the function firpm,i nt e r m so fs i g n i fi c a n t reductions in the number of iterations and the CPU execution time required to generate the same optimum solution, especially in multiband cases.

Theoretical formulation
As mentioned in the introduction, the key difference between Implementations I and II is the search strategies employed for the "real" extremal points of the weighted error function, which is formed based on the "trial" extremal points. Consequently, Segment 1 and Segment 4 are same for both the implementations. The remaining Segment 2 and Segment 3 along with the accompanying steps are as follows.

Segment 2:
Based on the values of wei_err(k) for 1 ≤ k ≤ ngrid generated at Segment 1,this main step generates the vector ℓ (start) real to include as many entries as possible in the ascending order subject to the following three conditions: Condition 1:A te a c he n t r yo fℓ (start) real , wei_err(k), when regarded as a function of k,achievesa local extremum whose absolute value is larger than or equal to |dev|,where|dev| is determined according to (6).

Condition 2:
In case of several consecutive local extrema of wei_err(k) with the same sign for k (low) ≤ k ≤ k (upp) , only one entry is included in ℓ (start) real and its value is the value of k for k (low) ≤ k ≤ k (upp) ,where|wei_err(k)| achieves its maximum value.

Performance comparison
This subsection shows how the proposed Implementation II, following the fundamental principle of the RME algorithms, outperforms the original implementation in terms of significant reductions in both the number of iterations and the CPU execution time required to arrive at the same optimum solution. For this purpose, four practical filter design examples are discussed. In all these examples, the problem is to design a filter having five interlaced passbands and stopbands. In order to achieve the accepted behavior in the transition band regions, the last two examples require the use of the Type A or Type B transition band constraints described in Subsection 3.3.

Example 5:
It is desired to design a five-band filter with two passbands and three stopbands meeting the following specifications:  The magnitude response of the resulting filter is shown in Fig. 7.
The minimum order to meet the criteria is 106 and the relevant MATLAB commands are  The magnitude response of the resulting filter is shown in Fig. 8.

Example 7:
It is desired to design a five-band filter with two passbands and three stopbands with following specifications: ω s1 = 0.15π, ω p1 = 0.2π, ω p2 = 0.45π, ω s2 = 0.55π, ω s3 = 0.7π, ω p3 = 0.8π, ω p4 = 0.85π, ω s4 = 0.93π, δ s1 = δ s2 = δ s3 = 0.001, and δ p1 = δ p2 = 0.01. The minimum filter order required to meet these specifications is 100. The magnitude response of the resulting filter designed without any constraints in the transition bands is shown by the solid blue line in Fig. 9. It is observed that in the second and third transition bands there are unwanted peaks of approximately 9 dB and 16 dB, respectively. These undesired peaks can be avoided by using Type A and Type B transition band constraints in the approximation problem according to the discussion of Subsection 3.3. When using α = 0.0005, the minimum order to meet the resulting criteria for both Type A and Type B constraints is 101. The responses of the resulting filters meeting these additional constraints are depicted in Fig.  9 by using a dashed red line and a dot-dashed black line, respectively.  Example 8: It is desired to design a five bands filter with three passbands and two stopbands with following specifications: The minimum filter order to meet these specifications is 102. The magnitude response of the resulting filter designed without any constraints in the transition bands is shown by the solid blue line in Fig. 10. It is observed that in the first, second, and third transition bands, there are undesired peaks of approximately 1 dB, 2 dB, and 1.7 dB, respectively. These undesired peaks can be avoided in a manner similar to that used in the previous example. In this example, for Type A and Type B constraints, the minimum filter orders are 104 and 102, respectively, whereas the responses of the resulting filters are depicted in Fig. 10   The number of iterations as well as the CPU execution times required by the original implementation and the proposed second implementation are summarized in Table 2 for synthesizing all the eight filters considered in this section. The hardware and MATLAB versions are the same as used earlier during the comparison of the original and the proposed first implementations. It is seen that the reduction in the numbers of iterations is 79 and 54 percent, respectively, when synthesizing the filters in Example 5 and 6. In case of examples 7 and 8 with the transition band constraints in effect, the reduction in the numbers of iterations is 45 percent (Type A), 38 percent (Type B)a n d1 8p e r c e n t( Type A), and 55 percent (Type B), respectively. The reduction percentage in the CPU execution time is between 71 and 86 percent for all the eight filter designs under consideration. Hence, the proposed second implementation is highly efficient in the design of multiband filters.

Conclusion
This chapter has introduced two novel MATLAB based Remez algorithms for the design of optimum arbitrary-magnitude linear-phase FIR filters. The first algorithm is a highly optimized and compact translation of the PM algorithm from its original FORTRAN code to its MATLAB counterpart in comparison with the existing MATLAB function firpm.T h e s e attractive properties have been achieved by first observing that the PM algorithm's very lengthy search strategy for the "real" extremal points of the weighted error function, which is formed based on the "trial" extremal points, can be compressed into two very compact basic search techniques. Second, the MATLAB vectorization techniques are employed whenever possible. As a result, the CPU execution time is roughly one third to synthesize linear-phase FIR filters practically in the same manner in comparison with the function firpm being more or less a direct translation from the FORTRAN code. Moreover, the code complexity is reduced to a considerable extent. The original implementation utilizes approximately 15 nested loops and around 300 lines of code whereas the first proposed implementation uses only 3 looping structures and approximately 100 lines of code. Thus, same efficient results are achieved with one fifth of the looping structures and one third of the code lines in the original implementation.
It is, however, important to note that the first technique does not follow the fundamental idea of Remez algorithm as suggested in [20] as it tries to find the new "trial" extremal points in the vicinity of previously found points as well as in the surroundings of the first and last grid points under consideration.
The second implementation obeys the fundamental principle of the Remez multiple exchange algorithm. This means that while searching for the "real" set of extrema, there is a possibility to obtain more than the required points in intermediate calculations. In this situation, the idea is to keep as many extremal points as possible subject to the condition that the corresponding error values are the maximum absolute ones and they obey the sign alternation property. Another prominent feature is that the weighted error function is calculated over the entire grid. This, not only makes sure that no potential extremal frequency point is skipped during a particular iteration, but also enables to transfer the two extremal points between the consecutive bands which, in some cases, is a necessary prerequisite for the algorithms in [16] and [19] to converge. Furthermore, the number of iterations as well as the CPU execution times required by the proposed second implementation to design the linear-phase FIR filters 64 MATLAB -A Fundamental Tool for Scientifi c Computing and Engineering Applications -Volume 2 in comparison with the existing MATLAB function firpm, especially in multi-band cases, are significantly lower. Examples have shown that in most five-band cases with some constraints in the transition bands, the reduction in the number of iteration is more than 50 percent, whereas the reduction in the CPU execution time is around 80 percent.
The quality of the filters designed with the proposed implementations is analogous to that of the PM algorithm with the added advantages of less number of iterations and CPU execution time.
The proposed two implementations have concentrated only on the core discrete Remez part of the PM algorithm. Future work is devoted to explore the possibilities of further improvements in the overall function firpm and reimplementing the other portions of this function efficiently.

Muhammad Ahsan and Tapio Saramäki
Tampere University of Technology, Tampere, Finland