Highlights
Multivariate polynomial regression can be used for reducing the number of experiments, making predictions, and analyzing values outside of the DoE.
A third-degree polynomial with an R2 of 0,8652 was considered to be the best-fitting method in this work.
Polynomial degrees greater than 3 overfit the model's curves, even if a better R2 is obtained.
The catalyst dose and the pH had a negative influence on the percentage of DOC degradation.
Model predictions allow inferring that, at low catalyst doses and medium and high pH levels, it is possible to find maximum degradations at low cumulative energies.
Introduction
Heterogeneous photocatalysis is an advanced oxidation process (AOP) that uses a semiconductor photocatalyst which, when subjected to a specific radiation, drives oxidation-reduction reactions, enabling the removal of pollutants and providing effluents with lower toxicity (Gout et al., 2022). Landfill leachates contain high amounts of heavy metals, solids (suspended and settleable), and organic matter (Vahabian et al., 2019). The recalcitrant nature of the individual pollutants and the highly variable generation rates are the leading factors that hinder their treatment (Chaturvedi and Kaushal, 2018; Müller et al., 2015). This is where AOPs are useful, as they improve the prospects for treating wastewater with high pollutant loads (Ruiz-Delgado et al., 2020).
In addition, the catalyst is the main factor that directly impacts the efficiency of photocatalytic treatments. During the last few years, several kinds of these semiconductors have appeared, such as metal oxides, nanomaterials, and organic catalysts, among others. According to Al-Mamun et al. (2021), TiO2 is considered one of the best photocatalytic materials due to its non-toxicity, high photocatalytic efficiency, and photostability. Additionally, TiO2 can be combined with other compounds such as Ag and MoO3 (Hasan Khan Neon and Islam, 2019). This material has been used over the years for treating complex contaminants present in pharmaceutical, textile, and industrial wastewater, achieving degradations of more than 90% for emerging pollutants (Akter et al., 2022).
The complete mineralization of contaminants in any AOP requires a large amount of energy, along with chemicals such as catalysts and oxidants, which increases the cost of treatments (Thanekar et al., 2018) and limits research due to the availability of resources and time. In wastewater treatment processes, accurate modeling and optimization of the operating conditions are required to obtain the highest possible efficiency (Azadi et al., 2018). Modeling thus arises as a way to describe the behavior of a phenomenon from actual experimental data, making it possible to explain that behavior through a model and to predict the response at points that were not measured experimentally. This provides a more complete picture of the phenomenon and, at the same time, helps determine whether there are unmeasured points or values at which the process could have performed better (Florea, 2019).
Photocatalytic modeling has been applied from different perspectives, the oldest of which involves reaction kinetics models (Marien et al., 2019; Xu et al., 2020). More recently, artificial neural networks (ANN) have been used to analyze the effects of operating conditions on photocatalyst performance (Ateia et al., 2020) and the efficiency of the photocatalytic process (Jing et al., 2017), as well as to predict temporal variations of leachate COD in photocatalytic treatment processes (Azadi et al., 2018). However, these works have trained ANNs with very small data sets, thus causing their predictions to have a high percentage of error; according to Ciresan et al. (2012), datasets below 50 samples per class show prediction errors of up to 30%. Another recent approach is the response surface methodology (RSM), one of the most common for these applications since it is oriented towards experimental design and the optimization of operating conditions (Ateia et al., 2020; Becerra et al., 2020; Colombo et al., 2013). Moreover, various regression models have been applied to simulate the treatment conditions, determining the significance of variables such as treatment time, catalyst doping percentage, and catalyst dose to the linear model (Nugraha and Fatimah, 2016). Other models have been applied to a lesser extent, as is the case of Support Vector Machines (SVM), which allow optimizing the pollutant removal percentage. This was shown by Ateia et al. (2020), who modeled the removal of a pollutant through the least-squares method and optimized it with the Cuckoo algorithm, achieving an R2 higher than 0,94 for the acid pH conditions and the medium catalyst concentration.
The objective of this research is not to describe the time-dependent relationship between the operating conditions of the system and the degradation rate of organic pollutants (Ateia et al., 2020), as is the case of reaction kinetics models, but to implement a model that predicts values of the response variable under conditions for which no experimental measurements were made. Therefore, this article presents a multiple polynomial regression model, a particular case of multiple linear regression in which the relationship between the independent and the dependent variables is an n-degree polynomial in x. To this effect, based on experimental data of a TiO2/UV photocatalytic process used for the decontamination of landfill leachates, our work proposes to construct the process model using this polynomial regression, with the catalyst concentration (mg.L-1 TiO2), the pH, and the UV radiation level (accumulated energy in kJ.L-1) as independent variables, and the degradation (mg.L-1 DOC) as the dependent variable. The resulting equation represents the process as a function of these influential variables and predicts the degradation at intermediate accumulated energy values that could not be measured experimentally. The aim is to determine whether, in each photocatalytic experiment (combination of TiO2 and pH), there was an optimal UV value different from those calculated by the RSM presented in Becerra et al. (2020).
Materials and methods
Our methodology consists of four stages: collection and characterization of the leachate sample, heterogeneous photocatalysis of that sample, multiple polynomial regression modeling, and analysis of the results (Figure 1).
The leachate sample was collected from the disposal of ordinary solid waste at a landfill in Norte de Santander (Colombia). The sample was taken directly from the main collection pipe of the different disposal cells. Subsequently, the sample was transported to the Environmental Quality Laboratory of Universidad Francisco de Paula Santander, where it was characterized in terms of dissolved organic carbon (DOC), chemical oxygen demand (COD), total suspended solids (TSS), volatile suspended solids (VSS), pH, and temperature under the guidelines of the Standard Methods for the Analysis of Water and Wastewater (Pawlowski, 1994). As a result of this process, the input concentration of the leachate to the heterogeneous photocatalysis process was obtained.
The heterogeneous photocatalysis, aided by TiO2 in the presence of UV radiation for the decontamination of leachate from a landfill, was carried out in a laboratory-scale Compound Parabolic Collector (CPC) with a surface area of 0,83 m2, borosilicate tubes, and aluminum reflective material. The photocatalyst used was commercial TiO2 P25 (Degussa-Evonik) powder with an anatase:rutile ratio of 70:30, a specific surface area of 50 m2/g, and an average particle diameter of 20 nm. The material properties of this catalyst were reported by Acosta-Herazo et al. (2019), Ohtani et al. (2010), and Satuf et al. (2005). Our work also employed a storage tank with a capacity of 20 L and a Humboldt pump that recirculates the mixture throughout the system (Figure 2). The operation consists of adding domestic wastewater and leachate (the latter at a concentration of 500 mg.L-1 COD) to the storage tank. The pump is turned on to recirculate the mixture for a few moments throughout the CPC. Afterwards, the amount of TiO2 required for the experiment is added, i.e., 100, 350, or 600 mg.L-1. The pH is immediately adjusted to the required level (3, 6, or 9) using solutions of HCl or NaOH (0,1 N) and monitored with a waterproof multiparameter meter. At this point, the time-zero samples are taken, and the procedure for calculating the accumulated energy begins. The experiment ends when 60 kJ.L-1 is reached, according to the analysis carried out by Borges et al. (2016).
The StatGraphics Centurion XV statistical tool was used for the experimental design, given its capabilities for randomized testing and data visualization. A 3^2 factorial design (two factors at three levels each) with replication at the center point was developed to model the results using a response surface. The summary of the experiments is shown in Table 1. The influential variables in the factorial design were the catalyst dose (100, 350, and 600 mg.L-1 TiO2) and the pH (3, 6, and 9). TiO2 was selected because, when used as a catalyst in the presence of UV radiation, it generates strong oxidant radicals such as the hydroxyl radical (•OH), and because it is non-toxic, anti-corrosive, and achieves high pollutant removal (Al-Mamun et al., 2019; Villamizar et al., 2022). Fluctuations in the pH level were considered in order to understand which wastewater conditions stimulate the production of hydroxyl radicals (Balarabe and Maity, 2022; Hassan et al., 2016; Yasmin et al., 2020).
Table 1 Photocatalysis experiments

Note: The experiments are listed here in a convenient order; StatGraphics Centurion XV randomized the order in which they were run
Source: Authors
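The structure of the design described above can be enumerated with a few lines of Python. This is an illustrative sketch only: the design in this work was generated with StatGraphics Centurion XV, not with this code. It assumes that the center point is the 350 mg.L-1 TiO2 / pH 6 combination and that it is replicated once, and the names `design_runs`, `CATALYST_DOSES`, and `PH_LEVELS` are hypothetical.

```python
from itertools import product
import random

CATALYST_DOSES = (100, 350, 600)  # mg.L-1 TiO2, levels stated in the text
PH_LEVELS = (3, 6, 9)             # pH levels stated in the text


def design_runs(center_replicates=1, seed=None):
    """Return the 3^2 factorial runs plus extra center-point replicates.

    center_replicates=1 is an assumption of this sketch: it yields
    9 factorial runs + 1 replicate = 10 runs in total.
    """
    runs = list(product(CATALYST_DOSES, PH_LEVELS))
    center = (CATALYST_DOSES[1], PH_LEVELS[1])
    runs.extend([center] * center_replicates)
    if seed is not None:
        random.Random(seed).shuffle(runs)  # randomized run order
    return runs


runs = design_runs()
print(len(runs), runs.count((350, 6)))  # prints: 10 2
```

Passing a `seed` shuffles the run order reproducibly, which mimics the randomization step without changing the set of combinations.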
Taking into account that photocatalysis is carried out by combining the catalyst and UV radiation to accelerate the chemical transformation of pollutants (Moura and Picão, 2022), the accumulated energy was regarded as a blocking variable at values of 20, 40, and 60 kJ.L-1 for each experiment involving a TiO2/pH combination. To measure it, a system was assembled which consisted of a multimeter (Uni-T UT71C) connected to a pyranometer (SP-110); the latter, when exposed to radiation, outputs a value in millivolts (mV). The precision of the data depends on the interval between readings, which, in this case, was 10 min (600 seconds). The readings were then converted from mV to W.m-2, where 1 mV represents 5 W.m-2. Finally, Equation (1) was used to calculate the energy:
Where:
Qn: total accumulated energy kJ.L-1
Qn-1: previous accumulated energy kJ.L-1
Δtn: irradiation time (600 seconds)
In: average irradiation (W.m-2 UV)
Af: irradiated reactor surface (0,83 m2)
VT: total treated volume (20 L)
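With the symbols defined above, the accumulated energy is commonly computed in CPC studies with the recurrence Qn = Qn-1 + Δtn · In · Af / VT. The Python sketch below assumes that form, together with a division by 1000 to convert J to kJ; both are assumptions of this example rather than a transcription of Equation (1). The function name `accumulated_energy` and the sample readings are hypothetical.

```python
MV_TO_W_PER_M2 = 5.0   # 1 mV = 5 W.m-2, as stated in the text
AREA_M2 = 0.83         # irradiated reactor surface, Af
VOLUME_L = 20.0        # total volume, VT
DT_S = 600.0           # time between readings, delta t (s)


def accumulated_energy(readings_mv, q0=0.0):
    """Return the running accumulated energy (kJ.L-1) after each reading.

    Each reading is treated as the average irradiance over its 600 s
    interval, which is an assumption of this sketch.
    """
    q = q0
    history = []
    for mv in readings_mv:
        irradiance = mv * MV_TO_W_PER_M2                      # W.m-2
        q += DT_S * irradiance * AREA_M2 / VOLUME_L / 1000.0  # J -> kJ
        history.append(q)
    return history


# Hypothetical readings: a constant 6 mV (30 W.m-2) over three intervals.
print(accumulated_energy([6.0, 6.0, 6.0]))
```

Under these assumptions, each 10 min interval at 30 W.m-2 adds roughly 0.75 kJ.L-1, so the time needed to reach 60 kJ.L-1 depends strongly on the irradiance of the day.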
To calculate the response variable, i.e., the DOC degradation reported as a percentage (%) for each TiO2/pH combination, a sample was taken at time zero of the treatment. The degradation is then calculated by comparing this value with the DOC measured at each of the other accumulated energies via Equation (2). The samples are collected in amber containers with a 60 mL capacity. Once collected, the samples pass through membrane filters and are prepared for DOC measurements in the Teledyne Tekmar Torch Total Organic Carbon Analyzer.
Where:
% Degradation DOC: percent degradation in each TiO2/pH combination for each accumulated energy at 20, 40 and 60 kJ.L-1
mg.L-1 DOC (0): measurement of the DOC value at time zero of treatment in each TiO2/pH combination
mg.L-1 DOC (kJ.L-1): measurement of the DOC value at each accumulated energy value of 20, 40 and 60 kJ.L-1
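Based on the definitions above, the percent degradation corresponds to the usual relative-removal expression. The LaTeX below is a reconstruction from those definitions, not a verbatim copy of Equation (2); the shorthand symbols DOC_0 (time-zero value) and DOC_Q (value at a given accumulated energy Q) are introduced here for compactness.

```latex
\%\,\text{Degradation}_{\text{DOC}}
  = \frac{\text{DOC}_{0} - \text{DOC}_{Q}}{\text{DOC}_{0}} \times 100
```

For instance, a hypothetical sample that starts at 200 mg.L-1 DOC and reaches 150 mg.L-1 DOC at a given accumulated energy would correspond to a degradation of 25%.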
The results obtained from these experiments were the input for the multiple polynomial regression model, in order to make predictions at intermediate accumulated energies for each TiO2/pH combination, i.e., to understand the behavior of the process in the 20-40 and 40-60 kJ.L-1 intervals. The accumulated energy was measured every 10 min, so each experiment provided a significant amount of data to calculate the response variable. Multiple polynomial regression was carried out using the MultiPolyRegress function (Cecen, 2021), which implements Equation (3), a matrix expression of the model. Parameter estimation was performed using the least-squares method.
Here,
Y is the vector of observations of the phenomenon - for this case, the percent degradation in terms of DOC.
X is the matrix of the independent variables: the catalyst concentration (mg.L-1 TiO2), the pH level, and the solar UV dose applied (kJ.L-1).
β is the vector of the parameters to be estimated.
ε is the vector of model errors.
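A model built from these four terms is conventionally written in the matrix form below, shown together with the ordinary least-squares estimator of β. This is the textbook expression, offered as a reconstruction consistent with the definitions above rather than a verbatim copy of Equation (3). In a polynomial regression, each row of X would hold the monomials of the three predictors evaluated for one observation.

```latex
Y = X\beta + \varepsilon,
\qquad
\hat{\beta} = \left(X^{\top} X\right)^{-1} X^{\top} Y
```

The estimator exists in this closed form only when X^T X is invertible, which requires at least as many observations as there are polynomial terms.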
It is worth remembering that models are often tuned to obtain very high R2 values, R2 being a measure of how well the model fits the data. However, this research did not intend for the model to memorize the data, but to capture the influence of the independent variables on the response variable. Therefore, the degree of the polynomial was selected based on the R2 value, which had to be higher than 0,7 for the model to be considered appropriate and pertinent, but not so close to 1 as to imply overfitting (Frost, 2020).
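The degree-selection reasoning above can be illustrated with a small Python sketch that fits full polynomials of increasing degree by least squares and reports the training R2. It is not the MATLAB MultiPolyRegress workflow used in this work: the data are synthetic, the predictors are assumed to be rescaled to [-1, 1], and the helper names (`monomial_exponents`, `design_matrix`, `fit_r2`) are hypothetical.

```python
from itertools import product

import numpy as np


def monomial_exponents(n_vars, degree):
    """All exponent tuples whose total degree is at most `degree`."""
    return [e for e in product(range(degree + 1), repeat=n_vars)
            if sum(e) <= degree]


def design_matrix(x, degree):
    """Columns are the monomials of the predictors up to `degree`."""
    exps = monomial_exponents(x.shape[1], degree)
    return np.column_stack([np.prod(x ** np.array(e), axis=1) for e in exps])


def fit_r2(x, y, degree):
    """Least-squares fit; returns (coefficients, R^2 on the training data)."""
    a = design_matrix(x, degree)
    beta, *_ = np.linalg.lstsq(a, y, rcond=None)
    residual = y - a @ beta
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot


# Synthetic data only: NOT the measurements of this study.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(60, 3))   # predictors scaled to [-1, 1]
y = 1.0 - 0.5 * x[:, 1] + 0.3 * x[:, 2] ** 2 + rng.normal(0.0, 0.1, 60)

for degree in (1, 2, 3, 4):
    _, r2 = fit_r2(x, y, degree)
    print(degree, round(r2, 4))
```

Because each higher-degree model contains the lower-degree one, the training R2 can only stay the same or increase with the degree. This is why a higher R2 alone does not show that a higher-degree polynomial is the better model.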
The subsequent analysis considered the results obtained by the RSM in comparison with the predictions calculated using the multivariate polynomial regression model. This allowed determining whether, among the values not measured experimentally, there were degradation percentage values higher than those calculated via the response surface. The polynomial regression model analyzes the correlation between the variables, the distribution of the response variable, and the model's fit.
Results and discussion
The leachate characterization is shown in Table 2.
The COD is used to calculate the inlet concentration of the photocatalytic reactor. From 7 920 mg.L-1 COD, an inlet concentration of 500 mg.L-1 COD was determined for the process.
The results obtained during the TiO2/UV heterogeneous photocatalysis are consolidated in Table 3. Each experiment is specified by its TiO2/pH combination, together with the degradation percentages (DOC) for each measured accumulated energy value. As proposed in Table 1, the accumulated energy was targeted at 20, 40, and 60 kJ.L-1. However, since readings were taken every 10 min, the recorded values are close to, but not exactly at, these targets. This is why, in Table 3, this variable appears in decimal form; it depends directly on the solar irradiance on the days of the experiment, some of which were cloudy and others very sunny.
According to these results, it can be inferred that the TiO2 and pH variables influence the percentage of degradation in each combination. This is shown by the fact that the highest degradations are associated with low and medium catalyst doses together with low and medium pH levels. Likewise, the process is favored at low pH - in this case, level 3. For the high catalyst dose (600 mg.L-1 TiO2), the results do not follow the pH trend seen in the other combinations, given that, in this case, the maximum degradation occurred at maximum pH. Thus, it can be concluded that high doses of this catalyst can unbalance the degradation process in the sense that they do not allow for an efficient absorption of solar radiation. The favorable effect of low pH is consistent with Yashni et al. (2021), who reported the best photocatalytic performance at a pH of 3 and concluded that this variable directly impacts the oxidative strength of the photogenerated holes, enhancing the compound's ionization and increasing the removal percentages. Moreover, the results for the different catalyst concentrations suggest that a low catalyst dosage fosters hydroxyl radical generation, improving the pollutant removal efficiency and allowing better light transmittance into the reactor (Rizzo et al., 2014). The relatively low efficiencies could be related to the fact that the aim of our study was to transform a toxic liquid waste into one that is more biodegradable and suitable for biological degradation, taking into account that, when AOPs are used as the only treatment, they are usually expensive due to energy and chemical consumption (Berberidou et al., 2017; Oller et al., 2011; Pazdzior et al., 2019).
As for the accumulated energies, the highest degradations in all cases occurred at the maximum measurement of 60 kJ.L-1. This can be seen as a limitation of the process since the accumulated energy is related to the duration of the treatment. Therefore, the higher the accumulated energy required, the greater the time needed for the process to reach the desired or required degradations of the evaluated pollutants. It is essential that this variable is regarded as influential in the design of experiments to determine whether there are degradation values in nonmaximum accumulated energy ranges, thus contributing not only to optimizing the photocatalytic process, but also to the efficient use of resources in wastewater treatment, an issue that has been controversial in recent years.
In this case, fluctuations were identified between the degradation percentages at each sampled accumulated energy, as in experiments 1, 3, 6, 9, and 10. This fluctuation is generated by the composition of the leachate since, when it is subjected to a chemical process such as photocatalysis, a partial transformation of its contaminant species (i.e., trace metals, recalcitrant compounds, and organic macro- and micro-pollutants) may occur, reaching maximum removal efficiency only when the appropriate UV dose is applied (60 kJ.L-1). This phenomenon was similarly reported by Amigh and Mokhtarani (2022), Çifçi and Meriç (2015), Hassan et al. (2017), Nomura et al. (2020), and Rocha et al. (2011).
Figure 3 shows the correlation between the independent variables and the response variable: degradation - pH = -0,47; degradation - UV = 0,34; and degradation - catalyst = -0,21. The strongest correlation is with pH; the higher the pH, the lower the degradation. The same happens with the catalyst, which, in higher quantities, impairs the degradation process. Finally, the UV dose is positively correlated with the degradation: the higher the accumulated energy, the higher the degradation. This agrees with Jia et al. (2013), who achieved the highest removal percentage at the longest UV-radiation exposure time. Thus, this analysis is consistent with the results obtained via the RSM.
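Pairwise coefficients of this kind are typically Pearson correlation coefficients; that the values in Figure 3 are Pearson coefficients is an assumption here. The Python sketch below shows how such a coefficient is computed. The function name `pearson_r` is hypothetical, and the sample values are invented for illustration and are not taken from Table 3.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, for illustration only (not taken from Table 3).
ph = [3, 3, 6, 6, 9, 9]
degradation = [30.0, 26.0, 22.0, 25.0, 12.0, 15.0]
print(round(pearson_r(ph, degradation), 2))
```

A negative result indicates that the response tends to fall as the predictor rises, which is the reading given above for pH and catalyst dose. The coefficient captures only the linear part of the relationship, so it complements the polynomial model rather than replacing it.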
With the dataset in Table 3, a multivariate polynomial regression model was developed in the MATLAB Online R2020b software. The degree of the polynomial was set at n = 3, obtaining an R2 = 0,8652. For lower degrees, the model did not fit correctly, and, for higher degrees, it showed overfitting. Some researchers state that an R2 = 0,8 clearly indicates a very good regression model performance and explains a significant amount of the variance in the data (Chicco et al., 2021). In our case, other models were tested with a fourth-degree polynomial, obtaining an R2 of around 0,9. However, we noticed that the model's curve was overfitted to the experimental data. Therefore, the reported R2 was considered to be more adequate in explaining the variance of our experimental data.
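One reason higher degrees tend to overfit is that the number of coefficients grows quickly with the degree. The Python sketch below counts the terms of a full polynomial in three predictors, i.e., one that includes all cross terms up to the chosen degree. That the fitted model uses every such term is an assumption of this example, and the function name `n_coefficients` is hypothetical.

```python
from math import comb


def n_coefficients(n_vars, degree):
    """Number of monomials of total degree <= degree in n_vars variables."""
    return comb(n_vars + degree, degree)


for degree in range(1, 6):
    print(degree, n_coefficients(3, degree))
# prints:
# 1 4
# 2 10
# 3 20
# 4 35
# 5 56
```

Under this assumption, moving from a third- to a fourth-degree polynomial raises the number of coefficients from 20 to 35, so the same dataset has to support almost twice as many parameters.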
Equation (4) models the heterogeneous photocatalysis in the decontamination of landfill leachate.
Figure 4 reports the accuracy and precision of the model and identifies the type of bias in it. Figure 5 shows that the residuals follow a random pattern, i.e., they exhibit no trend and have constant variance. Figure 6 shows that the residuals fit a normal distribution. Together, Figures 5 and 6 confirm that the ordinary least-squares assumptions are valid for the model.
Conclusions
This paper modeled the TiO2/UV heterogeneous photocatalysis process for the decontamination of landfill leachate using experimental data as the input. The fitting method implemented was a multivariate polynomial regression, resulting in a third-degree polynomial with an R2 of 0,8652. This model was validated using residual analysis, indicating that the errors are independent, follow a normal distribution, and have a constant variance. Therefore, the model and the conclusions derived from it are valid.
The catalyst dose and the pH have a negative influence on the percentage of degradation. This agrees with the RSM analysis, which showed that the maximum degradations occurred at low catalyst levels and acid pH values; that is, both variables are negatively correlated with the degradation, so if either of the two increases, the degradation decreases, which affects the treatment process. Based on a similar analysis, it was inferred that the accumulated energy has a positive influence on the percentage of degradation, which is consistent with the RSM results, as the maximum degradations occurred at the highest levels evaluated. This means that, as the UV levels increase, the degradation also increases, which is the purpose of all wastewater treatments - to obtain the highest possible pollutant removals.
Based on the predictions of the model at intermediate accumulated energy values that were not measured experimentally, it can be concluded that, in the case of low catalyst doses (100 mg.L-1), both the polynomial regression model and the RSM agreed that the highest degradation in terms of DOC was achieved at the maximum cumulative energy of 60 kJ.L-1 and at a pH = 3. However, at this same low dose, the model predicted higher degradation values at much lower accumulated energies for pH levels of 6 and 9. Specifically, for a range between 25 and 30 kJ.L-1, the model predicted degradations of 6% (pH = 9) and 28% (pH = 6), while similar degradation values were obtained in the RSM, but at the maximum accumulated energy. Therefore, it can be inferred that, at low catalyst doses and medium and high pH levels, it is possible to find maximum degradations at low cumulative energies. This phenomenon occurred in a similar way for the other pH and catalyst combinations.
It is important to remember that leachates naturally have a pH close to neutrality (7,29), so evaluating their degradation at a low catalyst dose and their natural pH can provide promising results regarding the inferences made via modeling. This would allow for a decrease in the use of reagents to acidify the pH, thus optimizing not only the influential variables, but also the use of resources.
This work is intended as a partial conclusion concerning leachate treatment via heterogeneous photocatalysis and as a contribution to the modeling of this phenomenon given the amount of available data; note that not all the possible combinations that could occur in the treatment were evaluated.