Christina Schenk‡*a, Jose Hobson‡ab, Maciej Haranczyk a and De-Yi Wang*a
aIMDEA Materials Institute, C/Eric Kandel 2 - Getafe, 28906, Madrid, Spain. E-mail: christina.schenk@imdea.org; deyi.wang@imdea.org
bUniversidad Carlos III de Madrid, Departamento de Ciencia e Ingeniería de Materiales e Ingeniería Química, IAAB, Avda. Universidad, 30, 28911 Leganés, Madrid, Spain
First published on 13th May 2025
This work introduces, for the first time, an innovative bio-based flame retardant (FR) system for biocomposites, integrating experimental insights and machine learning (ML) to optimize both composition and performance. By employing a computationally guided, cost-efficient experimentation strategy, we systematically combine design of experiments for space exploration, ML-driven property prediction, and optimization methods to rapidly identify high-performance formulations. Crucially, this approach demonstrates how data-driven techniques can be seamlessly incorporated into conventional experimental material design, ensuring proper sampling of the design space and leveraging the collected data to generate new predictions and optimize the properties of these sustainable materials. As a result, mechanical strength is significantly enhanced and fire safety improved, minimizing reliance on resource-intensive trial-and-error processes. The optimal formulation achieved an 18.4% increase in tensile strength (TS) and a 53.1% reduction in the peak heat release rate (pHRR) compared to the neat polymer. Bayesian optimization further validated individual optimal solutions, delivering up to a 22.3% improvement in TS and a 73.7% reduction in pHRR. Overall, this research establishes a digitally integrated workflow that accelerates the development of sustainable, high-performance biocomposites and bio-based flame retardants, providing eco-friendly alternatives to conventional fire-safe polymeric materials.
The drive to replace fossil-based polymers has brought biopolymers that combine low density with high mechanical performance, such as polyamide 56 (PA56), into the research spotlight. This bio-based polyamide, here produced from microorganisms via lysine decarboxylase through whole-cell biotransformation,3 shows potential for reducing pollutants due to its non-toxicity, biodegradability, and biocompatibility. PA56 has found applications in textiles, drug delivery, food packaging, and water treatment.4 However, most polyamides containing aliphatic segments are intrinsically flammable because they depolymerize into cyclic monomers that produce little to no char and are susceptible to degradation by additives. This makes them more challenging to flame-retard than polyolefins,5 thus opening new avenues for flame-retardant design. The use of phosphorus-based flame retardants, particularly those derived from sustainable biosources, is gaining traction as a safer alternative to organobromine chemicals, which face increasing regulatory scrutiny. Consequently, various bio-based flame retardant strategies, including phytic acid, nanoclays, boron compounds, and chitosan, have been extensively explored, raising awareness about the environmental concerns associated with traditional flame retardants and highlighting the need for safer, more sustainable solutions.6,7
Multicomponent or hybrid flame retardants are an essential class of non-halogen flame retardants known for their ability to form a compact carbonaceous layer on material surfaces. This layer acts as a thermal barrier, preventing heat transfer into the polymer, isolating oxygen sources, and effectively disrupting the path of combustible materials toward the flaming zone.8–10 These flame retardants typically consist of three components: an acid source, a carbon source, and a gas source. Common acid sources include phytic acid, phosphoric acid, ammonium polyphosphate, and phosphate esters. Carbon sources often used are cyclodextrin, tannic acid, lignin, vitamins, or cellulose, while gas sources typically include melamine, urea, casein, collagen, or amino acids.11,12
In addition, advanced carbonaceous materials such as graphene oxide, carbon nanotubes, and halloysite nanotubes have garnered significant scientific interest due to their exceptional mechanical properties and high surface area. Recent studies have shown that these materials enhance flame retardancy by isolating the carbon layer and suppressing the release of combustion gases, creating an effective physical barrier.13–15 For instance, Song et al. functionalized graphene oxide with piperazine and phytic acid to enhance epoxy resin composites. This modification reduced the peak heat release rate (pHRR) and total heat release (THR) by 42% and 22%, respectively. The mechanism was assigned to the synergistic combination of gas dilution from piperazine, the catalytic char formation from phytate, and the “zigzag path” barrier effect of graphene oxide.16
Similarly, Ayou Hao et al. investigated the critical mass ratio between a flame-retardant metallic phosphate and halloysite nanotubes for polyamide 11 composites, using twin-screw extrusion for composition preparation. The results indicated that the reinforcement provided by halloysite nanomaterials allowed for a balanced combination of flame retardancy, mechanical strength, stiffness, and toughness, with an increase in elongation at break and achieving a UL-94 V-0 rating. The study also demonstrated that halloysite nanotubes acted as nucleating agents, raising the crystallization temperature (as measured by DSC), while micro-combustion calorimetry (MCC) confirmed a reduction in combustion activity.17
While the examples discussed thus far largely rely on traditional approaches—rooted in human intuition, empirical screening, and iterative trial-and-error—there is significant untapped potential in accelerating materials development through data-driven techniques and machine learning (ML). These conventional methods, though foundational, are inherently time-consuming and often limit the speed at which new formulations can reach industrial and commercial applications.18 In contrast, ML offers a powerful alternative, capable of rapidly predicting material properties by leveraging existing literature data and integrating even small datasets from experiments or simulations.19 Embracing these tools presents a clear opportunity to shift from slow, manual discovery to more agile, informed, and scalable innovation in materials science.
ML-driven models have shown promise in enhancing the prediction and optimization of various material properties. For example, they have been used to optimize materials and devices in organic photovoltaics, enable rapid prediction of metal–organic framework (MOF) band gaps using unsupervised dimensionality reduction techniques, benchmark the performance of graph neural networks (GNNs) for predicting properties from crystal stability to electronic characteristics and modify liquid flammability ratings through unsupervised clustering.20–22
ML has recently made notable contributions to the field of flame retardancy, with various significant research studies published. For instance, a Bayesian regularized artificial neural network with a Gaussian prior (BRANNGP) and multiple linear regression (MLR) were employed to predict the heat release characteristics of flammable fiber-reinforced polymer laminates.23 Additionally, Chen Z. et al. utilized property descriptors and polynomial regression techniques to design organic, phosphorus-containing flame retardant composites.24 Recently, the prediction of the 3D printability and printing quality of bio- and thermoplastic-based nanocomposites from material properties, such as thermal and rheological characteristics, has been explored by comparing different ML models, specifically Random Forest, Neural Network and Support Vector Machine models.25 However, the application of ML to the analysis and development of bio-based flame retardants integrated with biocomposites remains unexplored. Additionally, current ML approaches can be complex and challenging to implement effectively.26 To address some of these complexities, our group has begun to incorporate laboratory automation to facilitate data collection for polymers and polymer nanocomposites.27
This contribution introduces an innovative bio-based flame retardant (FR) system for biocomposite materials developed by employing a data-driven experimentation and optimization framework. By systematically integrating ML models with experimental data, this research enables the rapid identification of optimal formulations, improving performance while minimizing resource-intensive trial-and-error processes.
Our discovery workflow, depicted in Fig. 1, consists of the following: (1) initial experimental material prototyping based on chemist intuition, (2) post-analysis of the initially explored composition space and a proper Design of Experiments (DoE) approach to find composition space regions missed in the initial experimentation; this step involved a new implementation of DoE designed for nanocomposites (i.e. ConstrAined Sequential laTin hypeRcube sampling methOd (CASTRO)), (3) further experimentation effort to prepare and characterize DoE-CASTRO-indicated compositions, (4) using the complete data points to build ML-based composition-property models and (5) use these ML models to further optimize the composition of the materials to maximize key mechanical and fire-safety properties. The following Sections 2 and 3 use this order to discuss the methodology and the obtained results, respectively. Finally, we summarize the key findings in Section 4.
3-Aminopropyldiethoxymethylsilane, purity >97%, mw: 191.35 g mol−1, d: 0.92 g mL−1 was purchased from TCI Europe N.V. and Aluminum phosphinate, Exolit OP-935 23.3–24.0% P (w/w), d: 1.35 g cm−3 and Exolit OP-1400 24.5–25.5% P (w/w), d: 1.45 g cm−3 were purchased from Clariant international plastics & coatings (Frankfurt, Germany). Hexagonal boron nitride nanopowder, avg. particle size: 70 nm was purchased from Lower friction MK Impex Corp. Polyamide 56, biobased polyamide resin Ecopent 1273, Injection grade: relative viscosity: 2.76 cP (25 °C), mp: 255 °C was kindly provided by Cathay Industrial Biotech.
The nine components under investigation are bio-based polyamide (PA56), phytic acid (PhA), four amino-based components (chitosan (CS), boron nitride (BN), tromethamine (THAM), and melamine (MEL)) and three metal-based components (calcium borate (CaBO), zinc borate (ZnBO), and halloysite nanotube (HNT)). We divide the problem into three subproblems: the primary problem (PA56 and PhA), one subproblem for the amino-based components, and another for the metal-based components. The weight fractions ck, k = 1, …, ncomp, of all components, with ncomp = 9 here, must sum to 1, i.e.
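$$\sum_{k=1}^{n_{\mathrm{comp}}} c_k = 1, \tag{1}$$
$$0.8 \le \mathrm{PA56} \le 1, \tag{2}$$
$$0 \le \mathrm{PhA} \le 0.05, \tag{3}$$
$$0 \le \text{amino-based component} \le 0.1, \tag{4}$$
$$0 \le \text{metal-containing component} \le 0.14.$$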
For the metal-based components, a strict constraint applies – no intra-combinations are permitted. Mathematically, this is expressed as
cki ∈ {0,1} ∀k,i, |
![]() | (5) |
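The bounds above and the exclusivity rule for the metal-based components lend themselves to a simple programmatic check. The following Python sketch is illustrative only: the function name, the dictionary layout and the numerical tolerance are hypothetical, the bounds are applied per component, and the exclusivity rule is read as "at most one metal-based component has a nonzero fraction". These are assumptions, not details taken from the CASTRO implementation.

```python
# Illustrative feasibility check for a formulation given as weight fractions.
# Names, tolerance and the per-component reading of the bounds are assumptions.

AMINO = ("CS", "BN", "THAM", "MEL")
METAL = ("CaBO", "ZnBO", "HNT")


def is_feasible(c, tol=1e-6):
    """Return True if the weight fractions in dict `c` satisfy the stated bounds."""
    # Weight fractions must sum to 1.
    if abs(sum(c.values()) - 1.0) > tol:
        return False
    # Bounds on the primary components.
    if not (0.8 - tol <= c.get("PA56", 0.0) <= 1.0 + tol):
        return False
    if not (0.0 <= c.get("PhA", 0.0) <= 0.05 + tol):
        return False
    # Bounds on each amino-based and each metal-based component.
    if any(not (0.0 <= c.get(n, 0.0) <= 0.10 + tol) for n in AMINO):
        return False
    if any(not (0.0 <= c.get(n, 0.0) <= 0.14 + tol) for n in METAL):
        return False
    # Assumed exclusivity: at most one metal-based component is present.
    return sum(c.get(n, 0.0) > tol for n in METAL) <= 1


# Sample 76 from Table 1, converted from wt% to weight fractions.
sample_76 = {"PA56": 0.851, "PhA": 0.009, "THAM": 0.059, "CaBO": 0.081}
# Hypothetical formulation that mixes two metal-based components.
mixed_metals = {"PA56": 0.85, "PhA": 0.01, "CaBO": 0.07, "ZnBO": 0.07}

print(is_feasible(sample_76))     # satisfies all checks
print(is_feasible(mixed_metals))  # violates the assumed exclusivity rule
```

A check of this kind is useful mainly as a guard before a suggested formulation is passed on to synthesis.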
We used the RandomForestRegressor class of the sklearn.ensemble package in Python for this purpose; for multi-task regression, we used the MultiOutputRegressor routine. For data pre-processing, we split the dataset to use 40% for training and 60% for testing. The RF model hyperparameters were the number of trees in the forest (n_estimators = 5000), the maximum depth of the tree (max_depth = 900), the minimum number of samples required to split an internal node (min_samples_split = 2), the minimum number of samples required to be at a leaf node (min_samples_leaf = 1), and the number of features to consider when looking for the best split (max_features = 1.0, i.e. max(1, int(max_features × n_features_in_)) features, with n_features_in_ denoting the number of input features). We tested different hyperparameter settings and used the configuration that showed the best performance. Additionally, we used cross-validation with five folds to ensure robust performance evaluation. To assess and validate the RF prediction models, the squared correlation coefficient (R2) and the mean absolute error (MAE) were calculated using the r2_score and mean_absolute_error functions from the sklearn.metrics package.
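The modelling setup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins (random descriptors and two made-up targets), the random seeds are arbitrary, and n_estimators is reduced from 5000 to 200 so that the example runs quickly. The remaining hyperparameters and the 40%/60% split follow the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in data (assumption): 11 descriptors and 2 targets.
rng = np.random.default_rng(0)
X = rng.random((90, 11))
Y = np.column_stack([
    70 + 20 * X[:, 0] - 5 * X[:, 1] + rng.normal(0, 1, 90),
    900 - 400 * X[:, 1] - 100 * X[:, 2] + rng.normal(0, 20, 90),
])

# 40% of the data for training and 60% for testing, as stated in the text.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, train_size=0.4, shuffle=True, random_state=0
)

# Hyperparameters as in the text, except n_estimators (5000 in the text).
base = RandomForestRegressor(
    n_estimators=200,
    max_depth=900,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features=1.0,
    random_state=0,
)

# Multi-task regression: one forest per target, wrapped in MultiOutputRegressor.
model = MultiOutputRegressor(base).fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Test-set metrics, averaged over both targets.
r2 = r2_score(Y_test, Y_pred)
mae = mean_absolute_error(Y_test, Y_pred)

# Five-fold cross-validated R2 on the training set.
cv_r2 = cross_val_score(
    MultiOutputRegressor(base), X_train, Y_train, cv=5, scoring="r2"
)
print(f"test R2 = {r2:.2f}, test MAE = {mae:.2f}, mean CV R2 = {cv_r2.mean():.2f}")
```

For a single-task model, the same RandomForestRegressor can be fitted directly on one target column without the MultiOutputRegressor wrapper.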
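$$0 \le \mathrm{HNT} \le 14, \tag{6}$$
$$0 \le \mathrm{PhA} \le 5, \tag{7}$$
$$0 \le \mathrm{CS} \le 10 \quad \text{and} \tag{8}$$
$$0 \le \mathrm{Mel} \le 10 \quad \text{or} \tag{9}$$
$$0 \le \mathrm{THAM} \le 10.$$
$$\mathrm{PA56} = 100 - (\mathrm{HNT} + \mathrm{PhA} + \mathrm{CS} + C_{mt}) \tag{10}$$
$$\mathrm{pHRR}_{\mathrm{val}}(\hat{C}) = \frac{\mathrm{pHRR}_{\mathrm{pred}}(\hat{C}) - \mathrm{pHRR}_{\mathrm{pure}}}{\mathrm{pHRR}_{\mathrm{pure}}} \tag{11}$$
$$\mathrm{TS}_{\mathrm{val}}(\hat{C}) = \frac{\mathrm{TS}_{\mathrm{pred}}(\hat{C}) - \mathrm{TS}_{\mathrm{pure}}}{\mathrm{TS}_{\mathrm{pure}}} \tag{12}$$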
![]() | (13) |
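The normalized quantities in eqn (11) and (12) are relative changes with respect to the neat polymer: a negative pHRR value indicates a reduction in peak heat release rate, and a positive TS value indicates a gain in tensile strength. A minimal Python sketch is given below; the function name and all numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_change(pred, pure):
    """Relative change of a predicted property with respect to the neat polymer."""
    return (pred - pure) / pure


# Hypothetical neat-polymer reference values (not taken from the text).
ts_pure = 72.0     # tensile strength in MPa
phrr_pure = 900.0  # peak heat release rate in kW m^-2

ts_val = relative_change(79.2, ts_pure)       # about +0.10, i.e. a 10% TS gain
phrr_val = relative_change(450.0, phrr_pure)  # -0.5, i.e. a 50% pHRR reduction
print(ts_val, phrr_val)
```

Because both quantities are dimensionless, they can be combined in a single weighted objective even though TS and pHRR are measured in different units.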
First, as in previous steps, we split the dataset into 40% training and 60% test/candidate data using the train_test_split routine from sklearn.model_selection with shuffling enabled. We then defined thresholds for both the TS and pHRR cases to determine which candidates should be included in the candidate set and removed from the training set. After shuffling both sets, we proceeded with Bayesian optimization for 20 iterations. For the surrogate model, we tested both an RBF Gaussian Process (GP) kernel from the gpytorch library and the Random Forest (RF) model introduced earlier. Since the RF model demonstrated superior performance in this case, we focused on it for subsequent analysis. Additionally, we selected an expected improvement (EI) acquisition function, leveraging the predicted mean and standard deviation from the surrogate model.
At initialization, we set the best observed value to the maximum of the training set, i.e. best_observed = y_train.max(). During each of the 20 iterations, we maximized the EI acquisition function, removed the optimal point from the candidate set, and updated the best observed value if a new, higher value was found. This implementation builds upon ref. 31.
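The pool-based loop described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the implementation used in this work: the surrogate is a toy ensemble of bootstrapped nearest-neighbour predictors standing in for the RF model, the design variable is one-dimensional, and `oracle` is a hypothetical stand-in for looking up the measured property of a candidate. The EI expression is the standard closed form for a Gaussian predictive distribution.

```python
import random
from statistics import NormalDist, mean, pstdev

_STD_NORMAL = NormalDist()


def expected_improvement(mu, sigma, best):
    """Expected improvement (maximization) from a predicted mean and std."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * _STD_NORMAL.cdf(z) + sigma * _STD_NORMAL.pdf(z)


def fit_ensemble(X, y, rng, n_members=30):
    """Toy surrogate (assumption): bootstrapped 1-nearest-neighbour predictors."""
    idx = range(len(X))
    return [[(X[i], y[i]) for i in rng.choices(idx, k=len(X))]
            for _ in range(n_members)]


def predict(ensemble, x):
    """Mean and standard deviation of the ensemble predictions at x."""
    preds = [min(member, key=lambda p: abs(p[0] - x))[1] for member in ensemble]
    return mean(preds), pstdev(preds)


def bayes_opt(X_train, y_train, candidates, oracle, n_iter=20, seed=0):
    """Pool-based Bayesian optimization; returns the best observed value."""
    X_train, y_train = list(X_train), list(y_train)
    candidates = list(candidates)
    rng = random.Random(seed)
    best_observed = max(y_train)
    for _ in range(min(n_iter, len(candidates))):
        ensemble = fit_ensemble(X_train, y_train, rng)
        scores = [expected_improvement(*predict(ensemble, x), best_observed)
                  for x in candidates]
        # Pick the EI maximizer and remove it from the candidate set.
        x_next = candidates.pop(scores.index(max(scores)))
        y_next = oracle(x_next)
        X_train.append(x_next)
        y_train.append(y_next)
        # Update the incumbent if a higher value was found.
        best_observed = max(best_observed, y_next)
    return best_observed


def oracle(x):
    """Hypothetical objective with its maximum at x = 0.6."""
    return -(x - 0.6) ** 2


X0 = [0.0, 0.2, 1.0]
y0 = [oracle(x) for x in X0]
pool = [i / 20 for i in range(21) if i / 20 not in X0]
print(bayes_opt(X0, y0, pool, oracle, n_iter=5))
```

With a forest surrogate, the mean and standard deviation would typically be taken over the predictions of the individual trees instead of the toy ensemble used here.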
The first 75 formulations used to generate the initial dataset were prepared specifically for the ML model construction, each representing a newly selected combination of polymer matrix and flame retardant fillers. Fig. 2 confirms several rational concepts for the design of polymer composite formulations.33 For example, the LOI (Fig. 2d) and char residue (Fig. 2a) values mostly increased after the introduction of the flame retardant additives, while the pHRR (Fig. 2e) mostly decreased, and notably so. The flame retardant composites therefore exhibited the expected higher fire performance in terms of heat generation compared to the unmodified polyamide 56 (represented by the dashed lines in Fig. 2a–f). Other metrics, such as the TSP (Fig. 2f), revealed mixed outcomes: some formulations yielded a smoke suppression effect, whereas others generated higher smoke production due to the gas dilution effects caused by the amino-based synergists used as blowing agents. In contrast, the mechanical properties (Fig. 2b and c) showed a more dispersed trend, indicating that a hybrid material designed by combining different additives usually acquires new features that depend on the physicochemical properties of the single components, their structure, and the interfaces between the systems.
Fig. 2 PA56 testing results summary (first 75 experiments): (a) char residue (in wt%) at 800 °C, collected from TGA test, (b) tensile strength, (c) Young's modulus, (d) LOI, (e) pHRR and (f) TSP.
While an in-depth understanding of the interaction between individual components and the polymer matrix could enable the design of composite materials with targeted improvements, the multiscale design of hybrid composites presents significant challenges. These challenges include the complexities of the used flame retardants, the diversity of involved reactants, and the variability of combustion circumstances in complicated arrangements, all of which create intricate interactions that are difficult for individuals to fully discern. Consequently, computational guidance through ML methods offers a more effective approach, providing superior predictive performance that facilitates the optimization of single or multiple desired properties, ultimately enhancing the efficiency of future material designs.34,35
More specifically, the new CASTROLHS suggestions prioritize less-explored combinations, such as those with lower PA56 contents while varying more additives simultaneously. By incorporating these 15 new suggestions alongside the previously collected data, the overall sample distribution expands into areas that were previously underexplored or entirely unexplored (cf. ESI†).
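To illustrate the general idea of space-filling sampling under composition constraints, the following Python sketch draws a Latin hypercube over four additive contents and discards points that leave less than 80 wt% PA56. It is a simplified stand-in and does not reproduce the CASTRO algorithm: the rejection step, the choice of four additives (with Mel as the single amino-based synergist), the rounding to 0.1 wt% and all names are assumptions. Note that rejection removes points after stratification, so the retained set is no longer a strict Latin hypercube.

```python
import random

# Upper bounds in wt% for the sampled additives; lower bounds are 0 (assumed set).
BOUNDS = {"HNT": 14.0, "PhA": 5.0, "CS": 10.0, "Mel": 10.0}
PA56_MIN = 80.0  # minimum matrix content in wt%


def latin_hypercube(n_samples, n_dims, rng):
    """Points in [0, 1)^n_dims with one point per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One random point inside each of the n_samples equal-width strata.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [list(row) for row in zip(*columns)]


def sample_formulations(n_samples, seed=0):
    """Scale the hypercube to the bounds and keep the feasible formulations."""
    rng = random.Random(seed)
    names = list(BOUNDS)
    feasible = []
    for u in latin_hypercube(n_samples, len(names), rng):
        comp = {n: round(ui * BOUNDS[n], 1) for n, ui in zip(names, u)}
        # PA56 makes up the remainder of the formulation.
        comp["PA56"] = round(100.0 - sum(comp.values()), 1)
        if comp["PA56"] >= PA56_MIN:
            feasible.append(comp)
    return feasible


formulations = sample_formulations(50)
print(len(formulations), formulations[:2])
```

Suggestions of this kind can then be compared against the already measured compositions so that new experiments are placed in sparsely covered regions.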
We conducted the first 14 experiments, listed in Table 1, while sample 15 could not be synthesized due to the excessive chitosan and melamine in the flame retardant formulation. This overloading led to a complexation yield below 10% and a reduced amount of solid precipitate after the reaction was completed.
Sample | PA56 | PhA | MEL | THAM | CS | BN | ZnBO | CaBO | HNT |
---|---|---|---|---|---|---|---|---|---|
76 | 85.1 | 0.9 | — | 5.9 | — | — | — | 8.1 | — |
77 | 82.4 | 1.6 | 6.5 | 2.8 | — | — | — | 6.7 | — |
78 | 81.9 | 3.2 | 4.0 | — | 1.6 | — | 9.3 | — | — |
79 | 80.3 | 4.2 | — | — | — | 6.6 | — | 8.9 | — |
80 | 81.4 | 1.7 | 3.0 | — | 1.5 | — | — | 12.4 | — |
81 | 81.6 | 1.9 | — | — | 3.8 | — | — | 12.7 | — |
82 | 83.6 | 0.4 | — | — | 3.2 | — | — | 12.8 | — |
83 | 80.1 | 4.1 | — | — | 8.5 | — | — | 7.3 | — |
84 | 84.1 | 4.8 | — | — | 9.4 | — | — | 1.7 | — |
85 | 80.1 | 2.2 | — | — | — | 6.8 | — | 10.9 | — |
86 | 84.1 | — | — | 6.0 | — | — | — | — | 9.9 |
87 | 82.8 | 2.5 | — | — | 6.4 | — | 8.3 | — | — |
88 | 81.6 | 3.0 | — | — | 7.6 | — | 7.8 | — | — |
89 | 80.4 | 1.9 | 6.6 | 3.2 | — | — | 7.9 | — | — |
90 | 80.2 | 0.1 | 4.6 | — | 3.4 | — | 11.7 | — | — |
Fig. 3 Flame retardant PhA@Mel@THAM: (a) FTIR spectra, (b) 13C NMR and (c) 31P NMR; flame retardant PhA@Mel@CS: (d) FTIR spectra, (e) 13C NMR and (f) 31P NMR.
The relevant NMR characteristic peaks were also detected for the PhA@Mel@THAM, in 13C NMR in D2O (Fig. 3b), the peaks at 59.3 and 61.3 ppm were assigned to the methylene groups in THAM,37,38 while the detected signals at 74.2 and 166.52 ppm were assigned to the hexacyclic carbons in phytic acid and the triazine group in melamine respectively.39 In 31P NMR spectrum in D2O (Fig. 3c), the typical phytic acid signals were observed at 2.28 and 0.06 ppm related to the external phosphate groups. The NMR scanning of the PhA@Mel@CS yielded considerably smaller signal levels due to the presence of more complex chitosan oligomeric structures. Nevertheless, in the 13C NMR in D2O (Fig. 3e) several peaks were visible around 60 ppm which could be assigned to the hydroxymethyl groups in chitosan,40 whereas the corresponding signals at 74.2 and 166.5 ppm were attributed to phytic acid and melamine as previously described. A similar result was obtained in the 31P NMR spectrum in D2O (Fig. 3f), the phytic acid peaks were observed at 2.28 and 0.06 ppm with lower signal strength.
Fig. 5 Char residues of the PA56 composites: (a–d) digital images, (e–h) SEM-EDS and (i–l) Raman spectra.
TGA-FTIR was employed to study the release of gaseous volatiles in the internal layers of the polymer matrix under an anaerobic atmosphere. Fig. 6 presents the IR spectra of samples 80, 83, 85, and 89 as a function of the temperature increase in TGA, in order to analyze the gas phase effects of the flame retardant polyamide products. Aliphatic polyamides degrade thermally by an initial chain scission mechanism yielding amide and alkene structures, followed by other reactions including the scission of the C–N bonds and the carbonyl groups, producing ammonia, cycloalkanes and carbon dioxide.44 In the FTIR spectra of the polyamide composites, several signals were detected for all samples: 929 cm−1, indexed as the bending vibration of NH3; 1760 cm−1, for the carbonyl stretching vibration of the CO group; and 2360 cm−1, for the asymmetrical stretching of CO2. Meanwhile, the peaks at 1440 cm−1, 690 cm−1 and 785 cm−1 correspond to the bending vibration of the phosphate and aromatic carbon group, the P–O–C stretching vibration, and the vibration of the benzene ring belonging to the pyrophosphoric groups generated during the phytic acid combustion process. For samples 80 and 89, i.e. Fig. 6a and d, the NH3 signal was also detected at 300 °C, indicating that non-combustible gases were also released at lower temperatures due to the presence of melamine, which was able to extend the gas phase activity of the flame retardant system.
In sample 83, i.e. Fig. 6b, fewer gaseous phase products were detected at low temperatures. After 400 °C, the signal intensity of the polymer combustion products reached the peak of thermal degradation, revealing more evident signals at 1760 cm−1, 2360 cm−1 and 2940 cm−1, while the signals at 1440 cm−1, 930 cm−1 and 755 cm−1 were also detected with low intensity. For sample 85, i.e. Fig. 6c, the peak at 2360 cm−1 was clearly detected in the 300 °C spectrum, alongside a slight signal in the alkyl substituents region at 3300 cm−1. At 400 °C, an additional peak was detected at 3340 cm−1, related to the N–H vibration of the primary aliphatic amide; it was also detected at lower intensity in the 500 °C spectrum, where the signals at 960 cm−1, 930 cm−1 and 760 cm−1 also appeared with low intensity. This indicates a lower level of gas phase activity, which nevertheless assisted the combustion hindrance to some extent by preventing the oxidation of CO through the flame dilution effect. Based on the results presented above, the combustion of flame-retardant polyamide products generates a complex variety of gas fragments, depending on the specific flame-retardant system. This contributes significantly to the overall flame retardancy mechanism, which involves promoting charring through the dehydration of PA56 into a viscous char. This char is primarily formed by carbon with a high graphitic degree, reinforced with elements such as phosphorus, calcium, zinc, and boron.
Whether modified flame retardants (mHNT and mPhA) are used strongly affects some of the properties. We therefore extended our descriptors by these two.
Fig. 8 illustrates the Spearman correlation heatmap for descriptors (PA56, HNT, PhA, CS, BN, THAM, CaBO, ZnBO, Mel, mHNT, mPhA) and properties (Residue, LOI, pHRR, THR, TS, YM). We also checked Pearson and Kendall correlations but our observations mostly aligned with what is explained in the following. Thus, the results can be found in the ESI.†
The heatmap in Fig. 8 highlights PA56 as the most influential descriptor for fire properties, showing strong negative correlations with Residue and LOI (−0.87, −0.83) and strong positive correlations with pHRR and THR (0.77, 0.63). This suggests that higher PA56 levels reduce Residue and LOI while increasing pHRR and THR. HNT shows a consistent correlation with YM (0.41), reinforcing its role in influencing Young's modulus. TS also correlates strongly with PA56 (0.63). The correlation between TS and fire properties (pHRR and THR) is moderate, suggesting some shared factors but less direct interaction.
Fire properties (Residue, LOI, pHRR, THR) show high cross-correlation, indicating they tend to vary together. Mechanical properties (TS and YM), however, show a weaker interrelationship (particularly in the Pearson analysis, with a correlation of 0.21; cf. ESI†), suggesting these properties are influenced by different factors.
Spearman's analysis, which captures monotonic relationships, shows stronger correlations between YM and HNT, indicating potential non-linear associations. The correlation between mPhA and the fire properties pHRR and THR is more negative in Spearman (−0.45, −0.35) compared to Pearson (ESI†), suggesting mPhA might influence these properties in a more complex, non-linear way. Overall, all analyses confirm PA56 as a primary factor for fire properties and TS, and HNT as key for YM. However, Spearman highlights potential non-linear interactions, providing deeper insight into the relationships among descriptors and properties.
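The difference between the two coefficients can be made concrete with a small example: Spearman's coefficient is the Pearson coefficient computed on ranks, so it reaches 1 for any strictly increasing relationship, whereas Pearson's coefficient falls below 1 when that relationship is non-linear. The Python sketch below uses made-up data and hypothetical function names; it is not the analysis code behind Fig. 8.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up monotonic but non-linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [v ** 3 for v in x]
print(f"Pearson: {pearson(x, y):.3f}, Spearman: {spearman(x, y):.3f}")
```

A gap between the two coefficients for the same descriptor-property pair is therefore a hint of a monotonic but non-linear dependence.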
In the following, we decided to focus on two properties, one fire property, the pHRR, and one mechanical property, the TS. We selected these two since the pHRR is among the most reliable measures, assumes real values, and is correlated with other fire properties. We chose the TS for mechanical representation as it is one of the most important performance metrics of polyamides and more generalizable than the YM.
Using the RF model described in Section 2.6, we built single-task and multi-task models to predict these properties. The RF-predicted TS values were compared with experimentally measured values (cf. ESI†). To evaluate the model's accuracy, we analyzed the squared correlation coefficient (R2) and the mean absolute error (MAE) for both the training and test datasets, also shown in Table 2. For the TS predictions shown in the ESI,† the test data yielded an R2 of 0.72 and an MAE of 4.83. In contrast, the training dataset achieved an R2 of 0.95 with an MAE of 1.81. The mean R2 from cross-validation is 0.57 for the training set and 0.73 for the test set.
Metric | Training set | Test set |
---|---|---|
Tensile strength (TS) | | |
R2 | 0.95 | 0.72 |
MAE | 1.81 | 4.83 |
Mean CV R2 | 0.57 | 0.73 |
Peak heat release rate (pHRR) | | |
R2 | 0.95 | 0.65 |
MAE | 24.37 | 68.50 |
Mean CV R2 | 0.57 | 0.49 |
Multi-task approach | | |
R2 | 0.95 | 0.68 |
MAE | 13.09 | 36.66 |
Mean CV R2 | 0.57 | 0.61 |
When predicting the pHRR (cf. observations vs. predictions figures in ESI†), the test data resulted in an R2 of 0.65 and an MAE of 68.50, while the training dataset again achieved an R2 of 0.95, with an MAE of 24.37. The mean cross-validation R2 scores were 0.70 for the training set and 0.38 for the test set. These results indicate that the model predicts TS better than pHRR, as evidenced by the higher R2 values and lower MAE across both datasets. The mean R2 scores from cross-validation differ between the training set (0.57) and the test set (0.73). This variability suggests that while the model can learn patterns in the training data, it may not consistently apply those patterns to new data. Note that this happened due to chance, as one of the scores is very poor, such that on average it performed worse on the test set than the training set. The remaining scores lay between 0.36 and 0.77 for the training set, leading to an average score of 0.72, and between 0.58 and 0.82 for the test set. The results indicate that the model has more difficulty accurately predicting pHRR values, which may suggest that the factors influencing pHRR are more complex.
The multi-task prediction approach (cf. observations vs. predictions figures in ESI†) demonstrates comparable effectiveness to the single-task method when predicting TS and peak heat release rate (pHRR). The multi-task model achieves an R2 of 0.68 and a mean absolute error (MAE) of 36.66 for the test set, alongside an R2 of 0.95 and an MAE of 13.09 for the training set. These results indicate that the multi-task model can effectively learn from both objectives simultaneously, providing a robust predictive framework.
In contrast, the single-task approach (cf. observations vs. predictions figures in ESI†) has shown distinct performance characteristics, particularly with higher R2 values and lower MAE for TS predictions, while exhibiting more variability in predicting pHRR. The mean cross-validation R2 scores of 0.61 for the test set and 0.57 for the training set further support the idea that the multi-task model may enhance generalizability by leveraging shared information between the objectives, though it does show a slight decrease in accuracy compared to the single-task model for TS.
When comparing feature importance between the single-task (Fig. 9) and multi-task approaches (Fig. 10), several noteworthy differences emerge, although PA56 consistently stands out as the most important feature across all models.
In the single-task pHRR RF model, the four most significant features are PA56, HNT, CS, and THAM/MEL. This suggests that these features play a critical role in accurately predicting pHRR when evaluated in isolation. Conversely, for the single-task TS model, the leading features shift slightly to PA56, BN, CS, and MEL. The variation in feature importance indicates that different factors may be more relevant for predicting TS than pHRR.
In the multi-task approach, the feature importance for TS includes PA56, PhA, CS, and HNT as the top contributors, while for pHRR, the most influential features are PA56, PhA, Mel, and HNT. This overlap highlights the shared relevance of PA56 and HNT in both tasks, suggesting that these features may capture underlying relationships that benefit both predictions.
Overall, the differences in feature importance underscore the potential advantages of the multi-task approach, which integrates insights from both objectives, allowing for a more comprehensive understanding of the factors influencing TS and pHRR.
For the second optimization campaign (green plus signs in Fig. 11), we adopted a similar strategy but adjusted our approach based on insights from the first campaign. Since the first campaign struggled to achieve the desired TS, we expanded the range of weight factors to place greater emphasis on tensile strength. Specifically, we selected ω ∈ {20, 50, 200} for THAM and ω ∈ {10, 20, 100} for MEL. From the optimal solutions, we identified the most promising candidates that enhanced TS without significantly increasing pHRR.
Fig. 11 provides a comprehensive view of all stages of our data-driven workflow: Initial design points (75 total, grey circles), CASTRO-DoE points for chemical space exploration (red stars, 14 total) and Optimization campaigns (Campaign 1: blue crosses, Campaign 2: green plus signs) derived via weighted point optimization using the multi-task RF model. The progression from DoE points to optimization campaigns 1 to 2 reveals a clear trend of improvement, refining both pHRR and TS values. However, it is worth noting that not all points from the second optimization campaign lie closer to the Pareto front. This is due to the high sensitivity of the problem: small changes in composition can lead to disproportionately large variations in the output properties. For instance, a formulation that significantly improves pHRR may simultaneously compromise TS, and vice versa. As a result, even within an optimization campaign with an increased emphasis on tensile strength, trade-offs between objectives can lead to points that are not Pareto-optimal.
Additionally, Fig. 11 highlights the Pareto front (purple dashed line), which connects non-dominated solutions that minimize pHRR while maximizing TS. The front demonstrates how the data-driven strategy (red stars → blue crosses → green plus signs) systematically enhanced performance. However, it is important to note that in the DoE approach, the material properties were not explicitly considered during the selection process.
A key highlight of Fig. 11 is the best compromise solution (red circled green plus sign), located near the top of the Pareto front. This optimized formulation achieved in the second optimization campaign corresponds to a TS of 85.4 MPa, marking an 18.4% improvement over the pure polymer and a pHRR of 416.6 kW m−2, indicating a 53.1% reduction from the pure polymer. This formulation represents a well-balanced trade-off, maximizing TS while significantly reducing pHRR, making it an optimal material candidate.
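The notion of a Pareto front used above, i.e. the set of non-dominated solutions when pHRR is minimized and TS is maximized, can be expressed in a few lines of Python. The sketch below is illustrative: the function name is hypothetical and the (pHRR, TS) pairs are made-up values, not measurements from this work.

```python
def pareto_front(points):
    """Non-dominated (pHRR, TS) pairs: lower pHRR and higher TS are better."""
    front = []
    for i, (phrr_i, ts_i) in enumerate(points):
        # A point is dominated if another point is at least as good in both
        # objectives and strictly better in at least one of them.
        dominated = any(
            phrr_j <= phrr_i and ts_j >= ts_i and (phrr_j < phrr_i or ts_j > ts_i)
            for j, (phrr_j, ts_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((phrr_i, ts_i))
    return sorted(front)


# Made-up (pHRR in kW m^-2, TS in MPa) pairs, for illustration only.
points = [
    (900.0, 72.0),
    (600.0, 60.0),
    (420.0, 85.0),
    (300.0, 70.0),
    (500.0, 80.0),
    (250.0, 55.0),
]
print(pareto_front(points))
```

Points off the front, such as (500.0, 80.0) in this example, are not necessarily poor formulations; they are simply matched or outperformed in both objectives by at least one other point.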
As shown in Fig. 13a and b, the optimization process guided the compositional design of the bio-based flame retardants and their subsequent incorporation into the polyamide matrix. Such ML-based recommendations reduced the time and the expensive experimental tuning of processing parameters and functional properties that normally delay the development of synthetic chemistry as well as polymer manufacturing and testing protocols.45 Moreover, it was possible to develop ML-designed polyamide composites with enhanced functional performance in one or multiple properties (such as samples 96 and 102, which simultaneously achieved a reduction in pHRR of more than 50% and an increase in tensile strength of more than 18%), showing that ML effectively improved the development of flame retardant formulations and, as a complementary advantage, enabled the fabrication of polymer composites with customized attributes.
The synthesis procedure of the flame retardants, material preparation, characterization, and measurement steps were described in detail. Next, the computationally guided design of experiments method, CASTRO, was introduced. This technique focused on space exploration for material composition and synthesis. Following this, predictive ML models using Random Forest regression were developed for both single-task and multi-task learning, aimed at forecasting mechanical and fire properties. The multi-task model was used to identify designs that optimized both mechanical and fire properties in a Pareto-efficient manner. Bayesian optimization was then introduced to validate the single-task results.
The preliminary results of flame-retardant polyamide composites were analyzed, followed by the computationally guided design of experiments optimization, which used the CASTRO method to enhance biocomposite development. The 15 experimental DoE-CASTRO suggestions were analyzed with respect to flame retardancy and mechanical properties. After sufficient design space exploration, the focus shifted to property selection, prediction, and optimization. The optimized samples were then thoroughly analyzed.
The best Pareto design discovered demonstrated a tensile strength (TS) of 85.4 MPa, an 18.4% improvement over the pure polymer, and a peak heat release rate (pHRR) of 416.6 kW m−2, a 53.1% reduction compared to the pure polymer. Further validation using Bayesian optimization on the entire dataset confirmed the optimal solutions for individual objectives. The most effective individual designs achieved a TS of 88.2 MPa, a 22.3% improvement, and a pHRR of 233.2 kW m−2, a 73.7% reduction relative to the pure polymer.
This research demonstrated how a computationally guided approach could significantly accelerate the development of sustainable, bio-based flame retardants and biocomposites, offering high-performance alternatives to traditional solutions. By leveraging new DoE techniques, predictive analytics and iterative optimization, the study identified material compositions and flame retardant (FR) systems with enhanced functional properties. Specifically, it optimized the peak heat release rate (pHRR) while demonstrating correlations to other fire properties, and it optimized mechanical performance through the tensile strength (TS). These designs provide more sustainable and environmentally friendly alternatives with superior functional properties compared to conventional solutions, paving the way for next-generation materials with improved mechanical and fire-resistant properties and contributing to a more efficient material design process.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5ta02511g
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025 |