Data-driven design and green preparation of bio-based flame retardant polyamide composites

Christina Schenk; Jose Hobson; Maciej Haranczyk; De-Yi Wang

doi:10.1039/D5TA02511G


DOI: 10.1039/D5TA02511G (Paper) J. Mater. Chem. A, 2025, 13, 26228-26243


Christina Schenk‡ *^a, Jose Hobson‡ ^ab, Maciej Haranczyk ^a and De-Yi Wang *^a
^aIMDEA Materials Institute, C/Eric Kandel 2 - Getafe, 28906, Madrid, Spain. E-mail: christina.schenk@imdea.org; deyi.wang@imdea.org
^bUniversidad Carlos III de Madrid, Departamento de Ciencia e Ingeniería de Materiales e Ingeniería Química, IAAB, Avda. Universidad, 30, 28911 Leganés, Madrid, Spain

Received 29th March 2025 , Accepted 12th May 2025

First published on 13th May 2025

Abstract

This work introduces, for the first time, an innovative bio-based flame retardant (FR) system for biocomposites, integrating experimental insights and machine learning (ML) to optimize both composition and performance. By employing a computationally guided, cost-efficient experimentation strategy, we systematically combine design of experiments for space exploration, ML-driven property prediction, and optimization methods to rapidly identify high-performance formulations. Crucially, this approach demonstrates how data-driven techniques can be seamlessly incorporated into conventional experimental material design, ensuring proper sampling of the design space and leveraging the collected data to generate new predictions and optimize the properties of these sustainable materials. As a result, mechanical strength is significantly enhanced and fire safety improved, minimizing reliance on resource-intensive trial-and-error processes. The optimal formulation achieved an 18.4% increase in tensile strength (TS) and a 53.1% reduction in the peak heat release rate (pHRR) compared to the neat polymer. Bayesian optimization further validated individual optimal solutions, delivering up to a 22.3% improvement in TS and a 73.7% reduction in pHRR. Overall, this research establishes a digitally integrated workflow that accelerates the development of sustainable, high-performance biocomposites and bio-based flame retardants, providing eco-friendly alternatives to conventional fire-safe polymeric materials.

1 Introduction

The development of high-performance bio-based polymer composites with reduced environmental impact is an important area of scientific and applied research, driven by the need to reduce the use of fossil-based materials and to mitigate the risks of climate change. As awareness of the need to promote the use of natural resources continues to grow, bio-based polymers are increasingly being integrated across sectors such as manufacturing, packaging, healthcare, electronics, transportation, and construction. In these sectors, polymeric materials must provide a specific combination of mechanical, chemical and physical properties, such as strength, stiffness, or flammability, to fulfill the needs of end-use applications. Therefore, functional additives that improve the flame retardant and mechanical properties of polymers while also reducing their carbon footprint are in high demand.^1,2

The drive to replace fossil-based polymers has brought biopolymers with low density and high mechanical performance, such as polyamide 56 (PA56), into the research spotlight. This bio-based polyamide, here produced from microorganisms via lysine decarboxylase through whole-cell biotransformation,³ shows potential for reducing pollutants due to its non-toxicity, biodegradability, and biocompatibility. PA56 has found applications in textiles, drug delivery, food packaging, and water treatment.⁴ However, most polyamides containing aliphatic segments are intrinsically flammable because they depolymerize into cyclic monomers that produce little to no char, and they are susceptible to degradation by additives. This makes them more challenging to flame-retard than polyolefins,⁵ thus opening new avenues for flame-retardant design. The use of phosphorus-based flame retardants, particularly those derived from sustainable biosources, is gaining traction as a safer alternative to organobromine chemicals, which face increasing regulatory scrutiny. Consequently, various bio-based flame retardant strategies, including phytic acid, nanoclays, boron compounds, and chitosan, have been extensively explored, raising awareness about the environmental concerns associated with traditional flame retardants and highlighting the need for safer, more sustainable solutions.^6,7

Multicomponent or hybrid flame retardants are an essential class of non-halogen flame retardants known for their ability to form a compact carbonaceous layer on material surfaces. This layer acts as a thermal barrier, preventing heat transfer into the polymer, isolating oxygen sources, and effectively disrupting the path of combustible materials toward the flaming zone.^8–10 These flame retardants typically consist of three components: an acid source, a carbon source, and a gas source. Common acid sources include phytic acid, phosphoric acid, ammonium polyphosphate, and phosphate esters. Carbon sources often used are cyclodextrin, tannic acid, lignin, vitamins, or cellulose, while gas sources typically include melamine, urea, casein, collagen, or amino acids.^11,12

In addition, advanced carbonaceous materials such as graphene oxide, carbon nanotubes, and halloysite nanotubes have garnered significant scientific interest due to their exceptional mechanical properties and high surface area. Recent studies have shown that these materials enhance flame retardancy by isolating the carbon layer and suppressing the release of combustion gases, creating an effective physical barrier.^13–15 For instance, Song et al. functionalized graphene oxide with piperazine and phytic acid to enhance epoxy resin composites. This modification reduced the peak heat release rate (pHRR) and total heat release (THR) by 42% and 22%, respectively. The mechanism was assigned to the synergistic combination of gas dilution from piperazine, the catalytic char formation from phytate, and the “zigzag path” barrier effect of graphene oxide.¹⁶

Similarly, Ayou Hao et al. investigated the critical mass ratio between a flame-retardant metallic phosphate and halloysite nanotubes for polyamide 11 composites, using twin-screw extrusion for composition preparation. The results indicated that the reinforcement provided by halloysite nanomaterials allowed for a balanced combination of flame retardancy, mechanical strength, stiffness, and toughness, with an increase in elongation at break and achieving a UL-94 V-0 rating. The study also demonstrated that halloysite nanotubes acted as nucleating agents, raising the crystallization temperature (as measured by DSC), while micro-combustion calorimetry (MCC) confirmed a reduction in combustion activity.¹⁷

While the examples discussed thus far largely rely on traditional approaches—rooted in human intuition, empirical screening, and iterative trial-and-error—there is significant untapped potential in accelerating materials development through data-driven techniques and machine learning (ML). These conventional methods, though foundational, are inherently time-consuming and often limit the speed at which new formulations can reach industrial and commercial applications.¹⁸ In contrast, ML offers a powerful alternative, capable of rapidly predicting material properties by leveraging existing literature data and integrating even small datasets from experiments or simulations.¹⁹ Embracing these tools presents a clear opportunity to shift from slow, manual discovery to more agile, informed, and scalable innovation in materials science.

ML-driven models have shown promise in enhancing the prediction and optimization of various material properties. For example, they have been used to optimize materials and devices in organic photovoltaics, enable rapid prediction of metal–organic framework (MOF) band gaps using unsupervised dimensionality reduction techniques, benchmark the performance of graph neural networks (GNNs) for predicting properties from crystal stability to electronic characteristics and modify liquid flammability ratings through unsupervised clustering.^20–22

ML has recently made notable contributions to the field of flame retardancy, with various significant research studies published. For instance, a Bayesian regularized artificial neural network with a Gaussian prior (BRANNGP) and multiple linear regression (MLR) were employed to predict the heat release characteristics of flammable fiber-reinforced polymer laminates.²³ Additionally, Chen Z. et al. utilized property descriptors and polynomial regression techniques to design organic, phosphorus-containing flame retardant composites.²⁴ Recently, the prediction of 3D printability and printing quality of bio- and thermoplastic-based nanocomposites based on material properties, such as thermal and rheological characteristics, has been explored by comparing different ML models, specifically Random Forest, Neural Network and Support Vector Machine models.²⁵ However, the application of ML to the analysis and development of bio-based flame retardants integrated with biocomposites remains unexplored. Additionally, current ML approaches can be complex and challenging to implement effectively.²⁶ To address some of these complexities, our group has begun to incorporate laboratory automation to facilitate data collection for polymers and polymer nanocomposites.²⁷

This contribution introduces an innovative bio-based flame retardant (FR) system for biocomposite materials developed by employing a data-driven experimentation and optimization framework. By systematically integrating ML models with experimental data, this research enables the rapid identification of optimal formulations, improving performance while minimizing resource-intensive trial-and-error processes.

Our discovery workflow, depicted in Fig. 1, consists of the following steps: (1) initial experimental material prototyping based on chemist intuition; (2) post-analysis of the initially explored composition space and a proper Design of Experiments (DoE) approach to find composition space regions missed in the initial experimentation, which involved a new implementation of DoE designed for nanocomposites, i.e. the ConstrAined Sequential laTin hypeRcube sampling methOd (CASTRO); (3) further experimentation to prepare and characterize the DoE-CASTRO-indicated compositions; (4) use of the complete set of data points to build ML-based composition-property models; and (5) use of these ML models to further optimize the composition of the materials to maximize key mechanical and fire-safety properties. The following Sections 2 and 3 use this order to discuss the methodology and the obtained results, respectively. Finally, we summarize the key findings in Section 4.


	Fig. 1 Data-driven design workflow.

2 Materials and methods

2.1 Materials

Halloysite nanoclay (kaolin aluminum silicate, hydrated), molecular weight (m_w): 294.19 g mol⁻¹, density (d): 2.53 g cm⁻³; phytic acid solution, 50 wt% in H₂O, m_w: 660.04 g mol⁻¹, d: 1.432 g mL⁻¹ (25 °C); chitosan, m_w: 1526.464 g mol⁻¹, viscosity: 20–300 cP, 1 wt% in 1% acetic acid (25 °C); acetic acid, reagent grade, purity ≥99%, m_w: 60.05 g mol⁻¹, melting point (mp): 16.2 °C, d: 1.049 g mL⁻¹ (25 °C); sodium hydroxide, reagent grade, 97%, d: 2.13 g cm⁻³, m_w: 40 g mol⁻¹; tris(hydroxymethyl)aminomethane, titration grade, purity >99%, m_w: 121.14 g mol⁻¹, mp: 167–172 °C, boiling point (bp): 219–220 °C, pH: 10.5–12; zinc borate, technical grade, ≥45% ZnO basis, ≥36% B₂O₃; calcium borate, technical grade, 31–37% CaO basis, 39–44% B₂O₃, m_w: 161.73 g mol⁻¹; and melamine, m_w: 126.12 g mol⁻¹, d: 1.57 g cm⁻³, were purchased from Sigma-Aldrich.

3-Aminopropyldiethoxymethylsilane, purity >97%, m_w: 191.35 g mol⁻¹, d: 0.92 g mL⁻¹, was purchased from TCI Europe N.V. Aluminum phosphinate, Exolit OP-935 (23.3–24.0% P (w/w), d: 1.35 g cm⁻³) and Exolit OP-1400 (24.5–25.5% P (w/w), d: 1.45 g cm⁻³), was purchased from Clariant international plastics & coatings (Frankfurt, Germany). Hexagonal boron nitride nanopowder, avg. particle size: 70 nm, was purchased from Lower friction MK Impex Corp. Polyamide 56, bio-based polyamide resin Ecopent 1273, injection grade, relative viscosity: 2.76 cP (25 °C), mp: 255 °C, was kindly provided by Cathay Industrial Biotech.

2.2 Synthesis of flame retardants

The experimental procedures for the preparation of the different flame retardant systems for polyamide 56 are explained in detail in the ESI.† In the initial experimental campaign (first 75 experiments), 7 bio-based hybrid flame retardants with various chemical assemblies (e.g., organophosphates, amino-based, aluminosilicates, layered nanosheets) were synthesized according to previously reported literature, with detailed references and information for each FR system provided in Table S1 in the ESI.† After the ML guidelines for the composition optimization were obtained (more information in Section 2.5), the same experimental routes were used, with the component concentrations modified according to the predictions.

2.3 Preparation of polyamide composites

All the PA56 composites were processed under the same conditions by hot-melt compounding of the flame retardant into the polyamide matrix at 260 °C using a twin-screw extruder (Brabender KETSE 20/40) to make the composite polymer granules. The samples for the fire and mechanical tests were obtained by injection molding (Arburg Allrounder injection machine). The formulations of the samples are given in Table S2 in the ESI.†

2.4 Characterization and measurement

Fourier transform infrared spectroscopy (FTIR) was performed on a Nicolet IS50 spectrometer over a wavenumber range of 4000 to 400 cm⁻¹. Thermogravimetric analysis (TGA) was performed with a TA instrument (Q50, New Castle, PA, USA) from 0 to 800 °C, with a heating rate of 10 °C min⁻¹ under a nitrogen atmosphere. The limiting oxygen index (LOI) was obtained using an oxygen index meter (FTT, East Grinstead, UK) according to the ASTM D2863-77 standard. NMR tests were performed using a Varian Infinity AS400 (Bruker Co, Germany); ¹³C NMR and ³¹P NMR spectra were obtained using D₂O as the solvent at room temperature. The extended combustion properties of the samples were determined on a cone calorimeter (FTT, East Grinstead, UK) according to the ISO 5660 standard, under a heat flux of 50 kW m⁻², using a sample size of 100 × 100 × 4 mm³. Tensile testing was performed on a universal electromechanical testing machine (INSTRON 3384, Norwood, MA, USA) according to the ISO 527-2 standard at a test speed of 50 mm min⁻¹ and with a load cell of 2000 N.

2.5 Computationally guided design of experiments for material composition and synthesis design

The underlying experiments require expensive testing procedures that involve significant time and resources. Thus, we could conduct only 15 more experiments for space exploration. To find these 15 experiments that fulfill the given mixture and synthesis requirements, we developed a methodology that can incorporate preliminary data while handling the underlying composition constraints. This method, called CASTRO (ConstrAined Sequential laTin hypeRcube sampling methOd), is described in detail in ref. 28. CASTRO is a DoE method that uniformly explores the input space while accommodating mixture and synthesis constraints. However, it focuses solely on the input space and does not consider property values. The method involves sampling sequentially with Latin Hypercube Sampling (with multidimensional uniformity), finding feasible permutations, permuting the order of the bounds, and a divide-and-conquer strategy. The latter means that, because of the curse of dimensionality, the problem is divided into subproblems; each subproblem is then analyzed, and the solutions for all subproblems are reassembled to obtain the solution of the full problem. To select the final 15 experiments, we leverage previously collected experimental data and assess the Euclidean distances of CASTRO-generated samples from existing data, selecting those that provide maximal coverage of unexplored regions. For the study in this paper, we used a preliminary version of CASTRO that was later improved, leading to CASTRO Version 1.0 and the latest version available on GitHub.²⁹
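The distance-based selection step described above can be illustrated with a minimal sketch. It is not the CASTRO implementation: the two arrays are random placeholders standing in for the 75 measured compositions and for a pool of feasible CASTRO samples, and the function name `farthest_candidates` is hypothetical. The sketch assumes that "farthest" means the largest nearest-neighbour Euclidean distance to the existing data.

```python
import numpy as np

def farthest_candidates(existing, candidates, n_select=15):
    """Rank candidates by their distance to the closest existing point
    and return the indices of the n_select most remote ones."""
    # Pairwise Euclidean distances, shape (n_candidates, n_existing)
    diff = candidates[:, None, :] - existing[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nearest = dist.min(axis=1)            # distance to the closest existing point
    order = np.argsort(nearest)[::-1]     # most remote candidates first
    return order[:n_select], nearest

rng = np.random.default_rng(0)
# Placeholder data only: random 9-component mixtures that sum to 1
existing = rng.dirichlet(np.ones(9), size=75)
candidates = rng.dirichlet(np.ones(9), size=90)

idx, nearest = farthest_candidates(existing, candidates)
print(idx)
```

Ranking by the nearest-neighbour distance favours candidates that sit in gaps of the existing dataset; other coverage criteria (for example a greedy maximin selection that also accounts for the distance between the new points themselves) would be equally plausible readings of the text.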

The nine components under investigation are bio-based polyamide (PA56), phytic acid (PhA), the four amino-based components (chitosan (CS), boron nitride (BN), tromethamine (THAM), and melamine (MEL)) and the three metallic-based components (calcium borate (CaBO), zinc borate (ZnBO), and halloysite nanotube (HNT)). We divide the problem into three subproblems: the primary problem (PA56 and PhA), one subproblem for the amino-based components, and another for the metallic-based components. The weight fractions of all components c_k, k = 1, …, n_comp, with n_comp = 9, must sum to 1, i.e.


∑_{k=1}^{n_comp} c_k = 1.	(1)

The bounds are given by


0.8 ≤ PA56 ≤ 1,	(2)


0 ≤ PhA ≤ 0.05,	(3)


0 ≤ amino-based component ≤ 0.1,	(4)

0 ≤ metal-containing component ≤ 0.14.
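As a small illustration, the sum-to-one constraint and the bounds above can be checked for a single composition as follows. This is a sketch under one explicit assumption: the amino-based and metal-containing bounds are interpreted as limits on the total fraction of each group, which the text does not state explicitly. The function name `satisfies_bounds` and the second test composition are hypothetical; the first composition is sample 76 from Table 1, converted to weight fractions.

```python
AMINO = ("MEL", "THAM", "CS", "BN")
METAL = ("ZnBO", "CaBO", "HNT")

def satisfies_bounds(c, tol=1e-9):
    """Check the sum-to-one constraint and the bounds for a dict of weight fractions.

    ASSUMPTION: the amino-based and metal-containing bounds apply to the
    total fraction of each group, not to each component separately.
    """
    amino = sum(c.get(k, 0.0) for k in AMINO)
    metal = sum(c.get(k, 0.0) for k in METAL)
    return (
        abs(sum(c.values()) - 1.0) <= 1e-6
        and 0.8 - tol <= c.get("PA56", 0.0) <= 1.0 + tol
        and -tol <= c.get("PhA", 0.0) <= 0.05 + tol
        and -tol <= amino <= 0.10 + tol
        and -tol <= metal <= 0.14 + tol
    )

# Sample 76 from Table 1, expressed as weight fractions
sample_76 = {"PA56": 0.851, "PhA": 0.009, "THAM": 0.059, "CaBO": 0.081}
# Hypothetical composition with too little PA56
too_much_filler = {"PA56": 0.75, "PhA": 0.05, "CS": 0.10, "HNT": 0.10}

print(satisfies_bounds(sample_76), satisfies_bounds(too_much_filler))
```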

In addition to the predefined constraints, we impose specific synthesis requirements on the amino-based components. Only certain combinations are allowed, i.e. MEL with CS, THAM with CS, and MEL with THAM. Additionally, MEL, THAM, CS, and BN can be used individually as single amino-based components. When enforcing these constraints on individual components, we technically need integer values. However, we initially treat these variables as continuous (real) during optimization and apply a post-processing rounding strategy to ensure that the results adhere to the integer constraints. Specifically, we choose combinations where the fraction c_kⁱ of component k in sample i, with k = 1, …, n_comp and i = 1, …, n_feas, is at least 0.5, i.e. c_kⁱ ≥ 0.5. Here, n_feas is the number of feasible CASTRO samples. If a CASTRO sample does not yield a valid combination with the second-largest value, we round the fraction for the highest value to 1. Otherwise, we select the valid combination with the second-largest value, guaranteeing that the fractions sum up to 1. Subsequently, we choose 90 random points from the feasible post-processed points.

For the metal-based components, a strict constraint applies – no intra-combinations are permitted. Mathematically, this is expressed as

c_kⁱ ∈ {0,1} ∀k,i,

where k is the index representing a specific metal-based component. Essentially, each metal-based component can either be present or absent, without mixing metal components. To enforce the integer constraint, we refine the selected CASTRO points by setting the component with the highest fraction, c_kⁱ to one while assigning zero to all others:


c_kⁱ = 1 if k = argmax_j c_jⁱ, and c_kⁱ = 0 otherwise,	(5)

∀i = 1, …, n_feas. As for the amino-based components, we select 90 random points from the feasible post-processed points.
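The rounding rule for the metal-based components and the combination rule for the amino-based components can be sketched as follows. The helper names `one_hot_largest` and `is_allowed_amino` and the example fractions are hypothetical and serve only to illustrate the post-processing described above; they are not taken from the CASTRO code.

```python
import numpy as np

def one_hot_largest(fractions):
    """Set the largest entry of each row to 1 and all other entries to 0."""
    fractions = np.asarray(fractions, dtype=float)
    rounded = np.zeros_like(fractions)
    rows = np.arange(fractions.shape[0])
    rounded[rows, fractions.argmax(axis=1)] = 1.0
    return rounded

# Allowed amino-based selections as listed in the text: four single
# components and three pairs
ALLOWED_AMINO = {frozenset(s) for s in (
    {"MEL"}, {"THAM"}, {"CS"}, {"BN"},
    {"MEL", "CS"}, {"THAM", "CS"}, {"MEL", "THAM"},
)}

def is_allowed_amino(present):
    """Return True if the set of amino-based components is a permitted one."""
    return frozenset(present) in ALLOWED_AMINO

# Hypothetical within-group fractions for (ZnBO, CaBO, HNT); one row per sample
metal = np.array([[0.2, 0.7, 0.1],
                  [0.5, 0.3, 0.2],
                  [0.1, 0.1, 0.8]])
rounded = one_hot_largest(metal)
print(rounded)
print(is_allowed_amino({"MEL", "CS"}), is_allowed_amino({"BN", "CS"}))
```

After rounding, each row contains exactly one metal-based component, so the within-group fractions still sum to 1.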

2.6 Prediction of material composition and synthesis

After exploring the design space within our constrained experimental budget, we added property prediction to find new material compositions with optimal mechanical and fire properties. For this, we built an ML model, specifically a Random Forest (RF) model, which is an ensemble of multiple decision trees.³⁰ Here, it is used (1) to predict the tensile strength and the peak heat release rate and (2) to determine the importance of the material components, such as the main polymer and the additives, for the associated mechanical and fire properties. The feature space spans 11 features: the 9 components from Section 2.5, i.e. the bio-based polyamide (PA56), phytic acid (PhA), the four amino-based components (chitosan (CS), boron nitride (BN), tromethamine (THAM), and melamine (MEL)) and the three metallic-based components (calcium borate (CaBO), zinc borate (ZnBO), and halloysite nanotube (HNT)), plus two additional features specifying whether a modified HNT flame retardant and PhA flame retardant are used, denoted by mHNT and mPhA, respectively.

We used the RandomForestRegressor class of the sklearn.ensemble package in Python for this purpose; for multi-task regression, we used the MultiOutputRegressor routine. For data pre-processing, we split the dataset to use 40% for training and 60% for testing. The RF model hyperparameters were set as follows: the number of trees in the forest (n_estimators = 5000), the maximum depth of each tree (max_depth = 900), the minimum number of samples required to split an internal node (min_samples_split = 2), the minimum number of samples required at a leaf node (min_samples_leaf = 1), and the number of features to consider when looking for the best split (max_features = 1.0, which corresponds to max(1, int(max_features × n_features_in_)) features, with n_features_in_ denoting the number of input features). We tested different hyperparameter settings and used the configuration that showed the best performance. Additionally, we used cross-validation with five folds to ensure robust performance evaluation. To assess and validate the RF prediction models, the coefficient of determination (R²) and the mean absolute error (MAE) were calculated using the r2_score and mean_absolute_error functions from the sklearn.metrics package.
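A minimal sketch of such a multi-output RF setup is given below. It runs on synthetic placeholder data (random features and made-up linear targets standing in for TS and pHRR), not on the experimental dataset, and it uses fewer trees than the reported n_estimators = 5000 so that it executes quickly; the remaining hyperparameters follow the values stated above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X = rng.random((90, 11))                      # placeholder: 11 features per sample
Y = np.column_stack([                         # placeholder targets: (TS, pHRR)
    60 + 20 * X[:, 0] - 10 * X[:, 1],
    900 - 400 * X[:, 2] + 50 * X[:, 3],
]) + rng.normal(scale=1.0, size=(90, 2))

# 40% training / 60% testing, as stated in the text
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, train_size=0.4, random_state=0
)

rf = RandomForestRegressor(
    n_estimators=100,          # reduced from the reported 5000 to keep the sketch fast
    max_depth=900,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features=1.0,
    random_state=0,
)
model = MultiOutputRegressor(rf).fit(X_train, Y_train)   # one forest per target
Y_pred = model.predict(X_test)

r2 = r2_score(Y_test, Y_pred, multioutput="raw_values")
mae = mean_absolute_error(Y_test, Y_pred, multioutput="raw_values")
cv_r2 = cross_val_score(MultiOutputRegressor(rf), X, Y, cv=5, scoring="r2")

# Impurity-based feature importances, one row per target
importances = np.array([est.feature_importances_ for est in model.estimators_])
print(r2, mae, cv_r2.mean())
```

MultiOutputRegressor fits an independent forest for each target, so the feature importances can be inspected separately for TS and pHRR.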

2.7 Optimization of material composition and synthesis

To navigate the complex constrained space during optimization, we varied only the most promising features that are allowed in combination, namely the amounts of HNT, PhA, CS, and either MEL or THAM (in %), within the following percentage bounds:


0 ≤ HNT ≤ 14,	(6)


0 ≤ PhA ≤ 5,	(7)


0 ≤ CS ≤ 10 and	(8)


0 ≤ MEL ≤ 10 or	(9)

0 ≤ THAM ≤ 10.

The PA56 content was calculated as


PA56 = 100 − (HNT + PhA + CS + C_mt)	(10)

with component C_mt being either MEL or THAM. The pHRR and TS results were normalized to similar scales, resulting in the functions


pHRR_val(Ĉ) = (pHRR_pred(Ĉ) − pHRR_pure)/pHRR_pure	(11)

and


TS_val(Ĉ) = (TS_pred(Ĉ) − TS_pure)/TS_pure	(12)

with Ĉ = (PA56,HNT,PhA,CS,C_mt)^T. The pHRR and TS values for the modified compositions were predicted based on the multi-task RF model from Section 2.6. For multi-task optimization of TS and pHRR, we define the following loss function

where ω denotes the weight parameter controlling the balance between mechanical and fire properties. We minimized the loss, denoted here by L(Ĉ; ω), with respect to Ĉ:


min_Ĉ L(Ĉ; ω).	(13)

For this point optimization, we used the minimize and differential_evolution routines from the scipy.optimize package, with a population size of popsize = 60. Initially, we ran the optimization with weight values ω ∈ {0.1, 1, 10, 20, 50} and then selected the six most promising candidates among the MEL and THAM modifications. Later, we updated the data with the testing results for these six candidates and reran the optimization with the updated multi-task RF model. To further improve TS, we expanded the set of selected weights to ω ∈ {0.1, 1, 10, 20, 50, 100, 200, 400}. Again, we selected the six most promising candidates among the MEL and THAM modifications.
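The weighted optimization can be sketched with differential_evolution as follows. Several parts of this sketch are assumptions and not taken from the paper: the explicit form of the loss is not reproduced in the text, so a weighted difference pHRR_val − ω·TS_val is assumed here (lower pHRR and higher TS both reduce the loss); the function `predict` is a made-up smooth stand-in for the multi-task RF model; and the reference values TS_PURE and PHRR_PURE are hypothetical.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical reference values for neat PA56 (not taken from the paper)
TS_PURE, PHRR_PURE = 60.0, 900.0

def predict(x):
    """Made-up smooth stand-in for the multi-task RF prediction of (TS, pHRR)."""
    hnt, pha, cs, cmt = x
    ts = TS_PURE + 1.2 * hnt - 0.08 * hnt**2 - 0.5 * cs
    phrr = PHRR_PURE - 25.0 * hnt - 30.0 * pha - 10.0 * cmt
    return ts, phrr

def loss(x, omega):
    """ASSUMED loss: normalized pHRR change minus omega times normalized TS change."""
    ts, phrr = predict(x)
    ts_val = (ts - TS_PURE) / TS_PURE            # eqn (12)
    phrr_val = (phrr - PHRR_PURE) / PHRR_PURE    # eqn (11)
    return phrr_val - omega * ts_val

# Percentage bounds for HNT, PhA, CS and C_mt (MEL or THAM), eqn (6)-(9)
bounds = [(0, 14), (0, 5), (0, 10), (0, 10)]

result = differential_evolution(loss, bounds, args=(1.0,), popsize=60, seed=0)
hnt, pha, cs, cmt = result.x
pa56 = 100.0 - (hnt + pha + cs + cmt)            # eqn (10)
print(result.x, pa56, result.fun)
```

Rerunning the call with different values in `args` reproduces the weight sweep described above; larger ω values shift the optimum toward compositions with higher predicted TS.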

2.8 Validation for single-task optimization

To validate that we identified the best points for each objective individually, we employed a Bayesian optimization approach. However, due to design space limitations, we restricted our selection to points from the full dataset, including both optimization batches.

First, as in previous steps, we split the dataset into 40% training and 60% test/candidate data using the train_test_split routine from sklearn.model_selection with shuffling enabled. We then defined thresholds for both the TS and pHRR cases to determine which candidates should be included in the candidate set and removed from the training set. After shuffling both sets, we proceeded with Bayesian optimization for 20 iterations. For the surrogate model, we tested both an RBF Gaussian Process (GP) kernel from the gpytorch library and the Random Forest (RF) model introduced earlier. Since the RF model demonstrated superior performance in this case, we focused on it for subsequent analysis. Additionally, we selected an expected improvement (EI) acquisition function, leveraging the predicted mean and standard deviation from the surrogate model.

At initialization, we set the best observed value to the maximum of the training set, i.e. best_observed = y_train.max(). During each of the 20 iterations, we maximized the EI acquisition function, removed the optimal point from the candidate set, and updated the best observed value if a new, higher value was found. This implementation was built upon ref. 31.
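The loop described above can be sketched as a Bayesian optimization over a finite candidate pool. The sketch makes several assumptions that are not confirmed by the text: the data are synthetic placeholders, the RF surrogate is refitted in every iteration after the selected point is added to the training set, the spread of the individual tree predictions is used as the surrogate's standard deviation, and the threshold-based construction of the candidate set is omitted.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def expected_improvement(mean, std, best):
    """Expected improvement for maximization, given predicted mean and std."""
    std = np.maximum(std, 1e-9)                  # avoid division by zero
    z = (mean - best) / std
    return (mean - best) * norm.cdf(z) + std * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.random((90, 5))                          # placeholder compositions
y = 60 + 15 * X[:, 0] - 8 * X[:, 1] + rng.normal(scale=0.5, size=90)

perm = rng.permutation(90)
train_idx = list(perm[:36])                      # 40% training
cand_idx = list(perm[36:])                       # 60% candidates

best_observed = y[train_idx].max()
initial_best = best_observed

for _ in range(20):
    # ASSUMPTION: the surrogate is refitted on the enlarged training set each iteration
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(X[train_idx], y[train_idx])
    # Per-tree predictions give a mean and a spread for every candidate
    per_tree = np.stack([tree.predict(X[cand_idx]) for tree in rf.estimators_])
    ei = expected_improvement(per_tree.mean(axis=0), per_tree.std(axis=0), best_observed)
    pick = cand_idx.pop(int(np.argmax(ei)))      # remove the EI maximizer from the pool
    train_idx.append(pick)
    best_observed = max(best_observed, y[pick])

print(initial_best, best_observed)
```

For a property that should be minimized, such as pHRR, the same loop can be applied to the negated target.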

3 Results

3.1 Preliminary results of flame retardant polyamide composites

Since a consistent dataset with reliable experimental data is essential for developing a high-quality ML model,³² we designed and systematically studied an initial set of 75 experiments to explore the functional properties of the flame retardant composites selected for this work. The initial exploration therefore focused on the relationship between the composition of the flame retardant polyamide composites and their functional performance indicators. Key features that are considered important for fire safety, such as the residue from the TGA test, the limiting oxygen index (LOI), the peak heat release rate (pHRR), and the total smoke production (TSP), were obtained in the first run along with the mechanical properties of the composites, specifically tensile strength (TS) and Young's modulus (YM).

The first 75 formulations used to generate the initial dataset were prepared specifically for the ML model construction, each representing a newly selected combination of polymer matrix and flame retardant fillers. Fig. 2 confirms several rational concepts for the design of polymer composite formulations.³³ For example, the LOI (Fig. 2d) and char residue (Fig. 2a) values mostly increased after the introduction of the flame retardant additives, while the pHRR (Fig. 2e) mostly and notably decreased. The flame retardant composites thus exhibited the expected higher fire performance in terms of heat generation when compared to the unmodified polyamide 56 (results represented by the dashed lines in Fig. 2a–f). Other metrics such as the TSP (Fig. 2f) revealed mixed outcomes: some formulations yielded a smoke suppression effect, whereas others generated higher smoke production due to the gas dilution effects caused by the incorporation of amino-based synergists used as blowing agents. In contrast, the mechanical properties (Fig. 2b and c) showed a more dispersed trend, indicating that when hybrid materials are designed by combining different additives, the resulting hybrid usually acquires new features that depend on the physicochemical properties of the single components, their structure, and the interfaces between the systems.


	Fig. 2 PA56 testing results summary (first 75 experiments) (a) char residue (in wt%) at 800 °C, collected from TGA test (b) tensile strength (c) Young's modulus (d) LOI (e) pHRR and (f) TSP.

While an in-depth understanding of the interaction between individual components and the polymer matrix could enable the design of composite materials with targeted improvements, the multiscale design of hybrid composites presents significant challenges. These challenges include the complexities of the used flame retardants, the diversity of involved reactants, and the variability of combustion circumstances in complicated arrangements, all of which create intricate interactions that are difficult for individuals to fully discern. Consequently, computational guidance through ML methods offers a more effective approach, providing superior predictive performance that facilitates the optimization of single or multiple desired properties, ultimately enhancing the efficiency of future material designs.^34,35

3.2 Computationally guided design of experiments for material composition and synthesis design

Reducing the sampled CASTRO suggestions to the 15 farthest points from the original data results in the distributions and points illustrated in the ESI,† which show the sampled components for both the previously collected experimental data and the new 15 suggestions. The experimental data exhibit bias, with clustering in certain regions. The CASTRO_LHS suggestions complement the preliminary dataset by filling gaps where fewer points exist, enhancing the exploration of the parameter space, and reducing bias toward specific regions.

More specifically, the new CASTRO_LHS suggestions prioritize less-explored combinations, such as those with lower PA56 contents while varying more additives simultaneously. By incorporating these 15 new suggestions alongside the previously collected data, the overall sample distribution expands into areas that were previously underexplored or entirely unexplored (cf. ESI†).

We conducted the first 14 experiments, listed in Table 1, while sample 15 could not be synthesized due to the excessive chitosan and melamine in the flame retardant formulation. This overloading led to a complexation yield below 10% and a reduced amount of solid precipitate after the reaction was completed.

Table 1 CASTRO suggestions in % for 9 components

Sample	PA56	PhA	MEL	THAM	CS	BN	ZnBO	CaBO	HNT
76	85.1	0.9	—	5.9	—	—	—	8.1	—
77	82.4	1.6	6.5	2.8	—	—	—	6.7	—
78	81.9	3.2	4.0	—	1.6	—	9.3	—	—
79	80.3	4.2	—	—	—	6.6	—	8.9	—
80	81.4	1.7	3.0	—	1.5	—	—	12.4	—
81	81.6	1.9	—	—	3.8	—	—	12.7	—
82	83.6	0.4	—	—	3.2	—	—	12.8	—
83	80.1	4.1	—	—	8.5	—	—	7.3	—
84	84.1	4.8	—	—	9.4	—	—	1.7	—
85	80.1	2.2	—	—	—	6.8	—	10.9	—
86	84.1	—	—	6.0	—	—	—	—	9.9
87	82.8	2.5	—	—	6.4	—	8.3	—	—
88	81.6	3.0	—	—	7.6	—	7.8	—	—
89	80.4	1.9	6.6	3.2	—	—	7.9	—	—
90	80.2	0.1	4.6	—	3.4	—	11.7	—	—

3.2.1 Synthesis and characterization of digitally designed flame retardants. After further exploring the design space using the DoE-CASTRO method, we proceeded to synthesize the flame retardants based on the identified compositions. The resulting materials were characterized using various analytical techniques to confirm the chemical structures and evaluate their thermal stability. A detailed description of this synthesis and characterization process is provided in the following.
3.2.1.1 Synthesis of melamine THAM phytate. A figure detailing the synthesis route of PhA@Mel@THAM can be found in the ESI.† First, the calculated amount of THAM was dissolved in ethanol in a three-neck flask equipped with a reflux condenser at 70 °C. Second, MEL was dissolved in deionized water and stirred for ten minutes. Then, the MEL solution was added dropwise to the flask and the mixture was stirred for 30 minutes. PhA was mixed with 19.8 mL of deionized water and then added dropwise to the flask. The reaction was maintained at 80 °C for 30 minutes. Finally, the suspension was filtered, washed 3–4 times with 60 °C deionized water to remove the excess reagents, and then dried at 90 °C for 24 h.
3.2.1.2 Synthesis of melamine chitosan phytate. The reaction scheme can be found in the ESI.† First, MEL was dissolved in 500 mL of deionized water under magnetic stirring at 90 °C, while the CS powder was dissolved in a 2 wt% aqueous acetic acid solution. The CS and MEL solutions were mixed and continuously stirred until uniform. Subsequently, a diluted aqueous solution of phytic acid, named PhA solution, was prepared in 300 mL of deionized water. The PhA solution (300 mL) was then slowly dropped into the CS/MEL solution (500 mL) under magnetic stirring at 90 °C. After the addition, the system was refluxed at 90 °C under constant magnetic stirring for 4 h. When the reaction stopped, the final products were cooled to room temperature. Then, the precipitates were collected by vacuum filtration, followed by repeated washing until the pH of the filtrate was neutral. Finally, the products were dried at 80 °C under vacuum for one day and labeled PhA@Mel@CS.
3.2.1.3 Characterization of flame retardants. The structures of the flame retardants were analyzed using various spectral and analytical methods. The FTIR spectra of the flame retardants are presented in Fig. 3a and d, with the peak assignments provided in the ESI (Table S3).† The sharp characteristic peaks of melamine, THAM, and chitosan in the 3000–3500 cm⁻¹ region disappeared after reaction with phytic acid. In the PhA@Mel@THAM spectrum, a broad peak was observed at 3345 cm⁻¹ (–OH stretching vibration),³⁶ which probably overlaps the sharp peaks of melamine and THAM in this region. The peaks at 2938 cm⁻¹ (–OH stretching) and 1748 cm⁻¹ (N–H deformation) were ascribed to THAM. Moreover, the peaks at 1436 cm⁻¹ (triazine ring vibration) and 772 cm⁻¹ (triazine ring deformation) were assigned to melamine, and the peak at 1631 cm⁻¹ (P–OH stretching vibration) to PhA. Meanwhile, in the PhA@Mel@CS spectrum, signals at 2891 and 2897 cm⁻¹, corresponding to the stretching vibrations of the aliphatic CH and CH₂ groups of chitosan, were detected. Several MEL peaks at around 1436, 1205, and 780 cm⁻¹ were assigned to the deformation and stretching vibrations of the triazine ring. The typical phosphate peak at 1631 cm⁻¹, assigned to the bending vibration of the O–P–O groups in phytic acid, was detected with significantly reduced intensity. This suggests that an ionic exchange reaction occurred between the anionic polyphosphate sites of phytic acid and the polycationic moieties in chitosan and melamine.


	Fig. 3 Flame retardant PhA@Mel@THAM: (a) FTIR spectra, (b) ¹³C NMR and (c) ³¹P NMR; flame retardant PhA@Mel@CS: (d) FTIR spectra, (e) ¹³C NMR and (f) ³¹P NMR.

The relevant NMR characteristic peaks were also detected for PhA@Mel@THAM. In the ¹³C NMR spectrum in D₂O (Fig. 3b), the peaks at 59.3 and 61.3 ppm were assigned to the methylene groups in THAM,^37,38 while the signals at 74.2 and 166.52 ppm were assigned to the hexacyclic carbons in phytic acid and the triazine group in melamine, respectively.³⁹ In the ³¹P NMR spectrum in D₂O (Fig. 3c), the typical phytic acid signals related to the external phosphate groups were observed at 2.28 and 0.06 ppm. The NMR spectra of PhA@Mel@CS showed considerably weaker signals due to the presence of more complex chitosan oligomeric structures. Nevertheless, in the ¹³C NMR spectrum in D₂O (Fig. 3e), several peaks were visible around 60 ppm, which could be assigned to the hydroxymethyl groups in chitosan,⁴⁰ whereas the signals at 74.2 and 166.5 ppm were attributed to phytic acid and melamine, as described above. A similar result was obtained in the ³¹P NMR spectrum in D₂O (Fig. 3f), where the phytic acid peaks were observed at 2.28 and 0.06 ppm with lower signal strength.

3.2.2 Flame retardancy of polyamide composites. In this section, LOI and pHRR were selected as performance indicators due to their relevance in assessing flame retardancy in polymer composites. These properties are commonly used to build ML models because they provide quantitative data and are critical in the analysis of flame-retardant systems.^24,32 The results of the second run, comprising 14 experiments (samples 76–89 in Table 1) suggested by DoE-CASTRO for complementary space exploration, are presented in Fig. 4a and b. The LOI of the polyamide composites is notably higher than that of the pure polyamide (26.8%); in particular, samples 80 and 89 reached LOI values above 40% after the introduction of the synthesized flame retardants PhA@Mel@CS and PhA@Mel@THAM. The pHRR of the composites was also substantially reduced, by more than 70% in samples 79, 80, 81, 83, 85, and 89, confirming the high level of flame retardant performance observed in the initial LOI characterization. The flame retardancy results achieved in this section, using a higher flame-retardant content of 15–20 wt%, are significant. The ML suggestions, initially aimed at space exploration, generated a more balanced set of results with lower deviations, especially for the pHRR values. This, combined with the substantial improvement in flame retardancy, remained in line with the scope of the initial modeling methodology.


	Fig. 4 Flame retardant properties of polyamide composites (a) LOI and (b) pHRR.

3.2.3 Analysis of flame retardant mechanisms. In the flame retardancy tests above, superior flame retardant performance was obtained in samples 80, 83, 85, and 89, with pHRR reductions of 73.7%, 72.8%, 71.2%, and 72.1%, respectively. The char residues obtained after the cone calorimeter test were studied to further understand the combustion mechanisms of the modified composites. The digital residue images in Fig. 5a–d show the composites' capacity to generate a stable and compact char barrier after 18–20 wt% of FR addition. The SEM-EDS and Raman results of sample 80 are presented in Fig. 5e and i. EDS detected 1.8% phosphorus in this sample, along with 4.2% calcium and 2.7% boron. In the Raman analysis, the D band related to amorphous carbon and the G band corresponding to the vibrations of ordered carbon were used to determine the graphitization degree (I_D/I_G).⁴¹ In this case, an I_D/I_G of 2.29 indicates the formation of highly stable carbon, which generated the reinforced char barrier in combination with degraded polyphosphoric products from phytic acid (which acted as a charring catalyst) and the metallic borate, which promoted the stability of the char at high temperatures.⁴² In the EDS of sample 83, elemental contents of P: 3.6%, Ca: 1.2%, and B: 1.5% were obtained. The I_D/I_G of 2.73 also confirmed the existence of a highly stable carbonaceous residue. Meanwhile, in sample 85 a higher carbon content of 49.3% was obtained in the EDS (Fig. 5g), along with P: 1.1%, Ca: 2.9%, and B: 7.6%, which confirmed the presence of phosphorus and metallic components such as calcium borate and functionalized boron nitride acting as charring agents through crosslinking reactions between the pyrolysis products and the flame retardant.⁴³ Sample 89 was also able to produce efficient condensed phase action, with elemental contents of C: 45.2%, P: 1.3%, Zn: 3.2%, and B: 2.9% in the EDS inspection (Fig. 5h). The Raman results (I_D/I_G: 2.43) shown in Fig. 5l were in line with the flame retardant levels and with the surface morphology observed in the SEM evaluation. The porosity of both internal and external layers was significantly decreased, hindering the access of heat and oxygen to the underlying flammable material during combustion.


	Fig. 5 Char residues of the PA56 composites: (a–d) digital images, (e–h) SEM-EDS and (i–l) Raman spectra.

TGA-FTIR was employed to study the release of gaseous volatiles in the internal layers of the polymer matrix under an anaerobic atmosphere. Fig. 6 presents the IR spectra of samples 80, 83, 85, and 89 as a function of the temperature increase in TGA, in order to analyze the gas phase effects of the flame retardant polyamide products. Aliphatic polyamides degrade thermally by an initial chain scission mechanism yielding amide and alkene structures and, after that, by other reactions including the scission of the C–N bonds and the carbonyl groups, producing ammonia, cycloalkanes and carbon dioxide.⁴⁴ In the FTIR spectra of the polyamide composites, several signals were detected for all samples: at 929 cm⁻¹, indexed as the bending vibration of NH₃; at 1760 cm⁻¹, for the stretching vibration of the C=O group; and at 2360 cm⁻¹, for the asymmetric stretching of CO₂. Meanwhile, the peaks at 1440 cm⁻¹, 690 cm⁻¹ and 785 cm⁻¹ correspond to the bending vibration of the phosphate and aromatic carbon group, the P–O–C stretching vibration, and the vibration of the benzene ring, belonging to the pyrophosphoric groups generated during the phytic acid combustion process. For samples 80 and 89 (Fig. 6a and d), the NH₃ signal was also detected at 300 °C, indicating that non-combustible gases were also released at lower temperatures due to the presence of melamine, which was able to extend the gas phase activity of the flame retardant system.


	Fig. 6 Evolved gas analysis of PA56 composites (a) S80 (b) S83 (c) S85 and (d) S89.

In sample 83 (Fig. 6b), fewer gaseous products were detected at low temperatures. After 400 °C, the signal intensity of the polymer combustion products reached the peak of thermal degradation, revealing more evident signals at 1760 cm⁻¹, 2360 cm⁻¹ and 2940 cm⁻¹, while the signals at 1440 cm⁻¹, 930 cm⁻¹ and 755 cm⁻¹ were also detected with low intensity. For sample 85 (Fig. 6c), the peak at 2360 cm⁻¹ was clearly detected in the 300 °C spectrum, alongside a slight signal in the alkyl substituents region at 3300 cm⁻¹. At 400 °C, an additional peak was detected at 3340 cm⁻¹, related to the N–H vibration of the primary aliphatic amide. This peak was also detected with lower intensity in the 500 °C spectrum, where the signals at 960 cm⁻¹, 930 cm⁻¹ and 760 cm⁻¹ were also detected with low intensity, indicating a lower level of gas phase activity, which also assisted the combustion hindrance to some extent by preventing oxidation of CO through the flame dilution effect. Based on the results presented above, the combustion of the flame-retardant polyamide products generates a complex variety of gas fragments, depending on the specific flame-retardant system. This contributes significantly to the overall flame retardancy mechanism, which involves promoting charring through the dehydration of PA56 into a viscous char. This char is primarily formed by carbon with a high graphitic degree, reinforced with elements such as phosphorus, calcium, zinc, and boron.

3.2.4 Mechanical properties of polyamide composites. The experimental data for the mechanical properties of the CASTRO-designed polyamide composites are presented in Fig. 7a and b. In this case, the tensile strength of the polyamide composites was lower than that of the pure polyamide (72.1 MPa), with results ranging between 50.2 and 61.1 MPa. The highest strength in this iteration (sample 86) represented a 15% decrease relative to the TS of the polyamide, suggesting that, although the polyamide was reinforced by the HNT fillers, the higher addition amounts also affected the interfacial bonding of the composites. Meanwhile, Young's modulus was within the range of 1200–1800 MPa; for some formulations (samples 78, 80, 81, 83, 86, 87, 88, 89), the YM decreased by up to 24%, while for others it was slightly higher, with an increase of 13%, e.g., in sample 85. Considering the multicomponent additive systems introduced into the polyamide, the lower mechanical performance of the DoE-CASTRO suggestions can be primarily attributed to the higher content of low molecular weight chemicals in the composites, which likely caused structural degradation of the polyamide. Additionally, the variety of flame retardant additives makes it challenging to assign a specific degradation mechanism, indicating that this area requires further investigation.


	Fig. 7 Mechanical properties of polyamide composites (a) stress–strain curves (b) tensile strength and Young's modulus.

3.3 Feature, property selection and prediction

After a detailed analysis of our previously obtained samples, we performed some correlation analysis for property selection and developed ML models to gain deeper insights into feature importance and to make predictions.

Whether the modified flame retardants (mHNT and mPhA) are used has a strong influence on some of the properties. We therefore added these two indicators to our descriptors.

Fig. 8 illustrates the Spearman correlation heatmap for descriptors (PA56, HNT, PhA, CS, BN, THAM, CaBO, ZnBO, Mel, mHNT, mPhA) and properties (Residue, LOI, pHRR, THR, TS, YM). We also checked Pearson and Kendall correlations; since these mostly aligned with the observations explained in the following, the corresponding results are provided in the ESI.†


	Fig. 8 Spearman correlation between the descriptors (PA56, HNT, PhA, CS, BN, THAM, CaBO, ZnBO, Mel, mHNT, mPhA) and properties (residue, LOI, pHRR, THR, TS, YM). PA56 is most influential for fire properties, while HNT is key for YM. Fire properties are highly interrelated, and mechanical properties are also cross-correlated.

The heatmap in Fig. 8 highlights PA56 as the most influential descriptor for fire properties, showing strong negative correlations with Residue and LOI (−0.87, −0.83) and strong positive correlations with pHRR and THR (0.77, 0.63). This suggests that higher PA56 levels reduce Residue and LOI while increasing pHRR and THR. HNT shows a consistent positive correlation with YM (0.41), reinforcing its role in influencing Young's modulus. TS also correlates strongly with PA56 (0.63). The correlation between TS and the fire properties pHRR and THR is moderate, suggesting some shared factors but less direct interaction.

Fire properties (Residue, LOI, pHRR, THR) show high cross-correlation, indicating they tend to vary together. Mechanical properties (TS and YM), however, show a weaker interrelationship (particularly in the Pearson analysis, with a correlation of 0.21; cf. ESI†), suggesting these properties are influenced by different factors.

Spearman's analysis, which captures monotonic relationships, shows stronger correlations between YM and HNT, indicating potential non-linear associations. The correlation between mPhA and the fire properties pHRR and THR is more negative in Spearman (−0.45, −0.35) compared to Pearson (ESI†), suggesting mPhA might influence these properties in a more complex, non-linear way. Overall, all analyses confirm PA56 as a primary factor for fire properties and TS, and HNT as key for YM. However, Spearman highlights potential non-linear interactions, providing deeper insight into the relationships among descriptors and properties.
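The distinction between the two coefficients can be illustrated with a minimal sketch. It uses made-up numbers (not data from this study) and plain-Python helper functions (`pearson`, `ranks`, `spearman`) that are named here for illustration only. Spearman's coefficient is simply the Pearson coefficient computed on ranks, so a monotonic but non-linear trend yields a higher Spearman than Pearson value.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: strength of a linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based average ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical, made-up data: a monotonic but strongly non-linear response.
filler = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
response = [v ** 4 for v in filler]

rho_p = pearson(filler, response)
rho_s = spearman(filler, response)
print(f"Pearson  = {rho_p:.3f}")
print(f"Spearman = {rho_s:.3f}")
```

In this toy case the ranks of both variables coincide, so the Spearman coefficient is 1 while the Pearson coefficient stays below 1; a gap of this kind is what the text interprets as a hint of non-linear association.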

In the following, we decided to focus on two properties, one fire property, the pHRR, and one mechanical property, the TS. We selected these two since the pHRR is among the most reliable measures, assumes real values, and is correlated with other fire properties. We chose the TS for mechanical representation as it is one of the most important performance metrics of polyamides and more generalizable than the YM.

We made predictions for these properties based on the RF model described in Section 2.6, building single-task and multi-task models. The RF-predicted TS values were compared with experimentally measured values (cf. ESI†). To evaluate the model's accuracy, we analyzed the squared correlation coefficient (R²) and the mean absolute error (MAE) for both the training and test datasets, as summarized in Table 2. For the TS predictions shown in the ESI,† the test data yielded an R² of 0.72 and an MAE of 4.83. In contrast, the training dataset achieved an R² of 0.95 with an MAE of 1.81. The mean cross-validation R² is 0.57 for the training set and 0.73 for the test set.

Table 2 Performance metrics for RF model predictions

Metric	Training set	Test set
Tensile strength (TS)
R²	0.95	0.72
MAE	1.81	4.83
Mean CV R²	0.57	0.73

Peak heat release rate (pHRR)
R²	0.95	0.65
MAE	24.37	68.50
Mean CV R²	0.57	0.49

Multi-task approach
R²	0.95	0.68
MAE	13.09	36.66
Mean CV R²	0.57	0.61
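For readers less familiar with the two metrics in Table 2, the following minimal sketch shows how they can be computed. It assumes that R² denotes the coefficient of determination (1 − SS_res/SS_tot), as is common in ML toolkits; the squared Pearson correlation would give a different value in general. The measured and predicted values are hypothetical and are not taken from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation, in the units of the target property."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical tensile-strength values (MPa), for illustration only.
measured = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 83.0, 87.0]

r2 = r2_score(measured, predicted)
mae = mean_absolute_error(measured, predicted)
print(f"R2  = {r2:.3f}")   # 0.948
print(f"MAE = {mae:.2f}")  # 2.50
```

Because the MAE carries the units of the target, the TS and pHRR rows of Table 2 are not directly comparable in absolute terms, whereas R² is dimensionless.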

When predicting the pHRR (cf. observations vs. predictions figures in ESI†), the test data resulted in an R² of 0.65 and an MAE of 68.50, while the training dataset again achieved an R² of 0.95, with an MAE of 24.37. The mean cross-validation R² scores were 0.70 for the training set and 0.38 for the test set. These results indicate that the model predicts TS better than pHRR, as evidenced by the higher R² values and lower MAE across both datasets. For TS, the mean cross-validation R² is lower for the training set (0.57) than for the test set (0.73). This variability suggests that, while the model can learn patterns in the training data, it may not apply those patterns consistently to new data. Note that this happened due to chance: one of the fold scores is very poor, such that the training-set average falls below the test-set average. The remaining scores lay between 0.36 and 0.77 for the training set, leading to an average score of 0.72, and between 0.58 and 0.82 for the test set. The results indicate that the model has more difficulty accurately predicting pHRR values. This lower performance may suggest that the factors influencing pHRR are more complex.

The multi-task prediction approach (cf. observations vs. predictions figures in ESI†) demonstrates comparable effectiveness to the single-task method when predicting TS and peak heat release rate (pHRR). The multi-task model achieves an R² of 0.68 and a mean absolute error (MAE) of 36.66 for the test set, alongside an R² of 0.95 and an MAE of 13.09 for the training set. These results indicate that the multi-task model can effectively learn from both objectives simultaneously, providing a robust predictive framework.

In contrast, the single-task approach (cf. observations vs. predictions figures in ESI†) has shown distinct performance characteristics, particularly with higher R² values and lower MAE for TS predictions, while exhibiting more variability in predicting pHRR. The mean cross-validation R² scores of 0.61 for the test set and 0.57 for the training set further support the idea that the multi-task model may enhance generalizability by leveraging shared information between the objectives, though it does show a slight decrease in accuracy compared to the single-task model for TS.
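One common way to realise a multi-task random forest is multi-output regression, where a single forest is fitted on both targets at once. The sketch below uses scikit-learn and synthetic stand-in data; whether this matches the implementation described in Section 2.6 is an assumption, and the feature construction, sample size, and hyperparameters are purely illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's data set): four composition-like
# features and two targets that depend on them, with added noise.
X = rng.uniform(0.0, 1.0, size=(120, 4))
ts = 60.0 + 25.0 * X[:, 0] - 10.0 * X[:, 1] + rng.normal(0.0, 1.0, 120)
phrr = 900.0 - 500.0 * X[:, 1] - 200.0 * X[:, 2] + rng.normal(0.0, 20.0, 120)
Y = np.column_stack([ts, phrr])  # column 0: TS, column 1: pHRR

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# One forest fitted on both targets simultaneously (multi-output regression).
multi = RandomForestRegressor(n_estimators=200, random_state=0)
multi.fit(X_train, Y_train)
Y_pred = multi.predict(X_test)  # shape: (n_test, 2)

# Per-target test metrics, evaluated column by column.
scores = {}
for k, name in enumerate(["TS", "pHRR"]):
    scores[name] = (
        r2_score(Y_test[:, k], Y_pred[:, k]),
        mean_absolute_error(Y_test[:, k], Y_pred[:, k]),
    )
    print(f"{name}: R2 = {scores[name][0]:.2f}, MAE = {scores[name][1]:.2f}")

# A single importance vector is shared by both targets in this setup.
print("feature importances:", np.round(multi.feature_importances_, 3))
```

In this setup the split criterion aggregates the error over both outputs, so a target with a much larger numerical range (here pHRR) may dominate the tree construction; standardizing the targets before fitting is one possible remedy.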

When comparing feature importance between the single-task (Fig. 9) and multi-task approaches (Fig. 10), several noteworthy differences emerge, although PA56 consistently stands out as the most important feature across all models.


	Fig. 9 Feature importance for the single-task RF models: (a) TS and (b) pHRR prediction.


	Fig. 10 Feature importance for the multi-task RF model: (a) TS and (b) pHRR prediction.

In the single-task pHRR RF model, the four most significant features are PA56, HNT, CS, and THAM/MEL. This suggests that these features play a critical role in accurately predicting pHRR when evaluated in isolation. Conversely, for the single-task TS model, the leading features shift slightly to PA56, BN, CS, and MEL. The variation in feature importance indicates that different factors may be more relevant for predicting TS than pHRR.

In the multi-task approach, the feature importance for TS includes PA56, PhA, CS, and HNT as the top contributors, while for pHRR, the most influential features are PA56, PhA, Mel, and HNT. This overlap highlights the shared relevance of PA56 and HNT in both tasks, suggesting that these features may capture underlying relationships that benefit both predictions.

Overall, the differences in feature importance underscore the potential advantages of the multi-task approach, which integrates insights from both objectives, allowing for a more comprehensive understanding of the factors influencing TS and pHRR.

3.4 Optimization of material composition and synthesis

3.4.1 Multi-task optimization for TS and pHRR. To use resources efficiently and to find materials that are optimal with respect to both objectives simultaneously, we continued with the multi-task RF model for further optimization. We analyzed the results and identified the most promising candidates based on the feature importance (Section 3.3) and the data collected to date, in combination with the synthesis limitations described in Section 2.5. This led to the amounts of HNT, PhA, CS, and MEL or THAM, and indirectly PA56, as our degrees of freedom for optimization. In the first optimization campaign (blue crosses in Fig. 11), we selected formulations using weight factors ω ∈ {1, 10, 20} to vary the amount of THAM and ω ∈ {10, 20, 50} for MEL. The two objectives naturally oppose each other: minimizing the peak heat release rate (pHRR) tends to lower tensile strength (TS), while maximizing TS often increases pHRR.


	Fig. 11 Evolution of points from data-driven workflow.

For the second optimization campaign (green plus signs in Fig. 11), we adopted a similar strategy but adjusted our approach based on insights from the first campaign. Since the first campaign struggled to achieve the desired TS, we expanded the range of weight factors to place greater emphasis on tensile strength. Specifically, we selected ω ∈ {20, 50, 200} for THAM and ω ∈ {10, 20, 100} for MEL. From the optimal solutions, we identified the most promising candidates that enhanced TS without significantly increasing pHRR.
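The effect of a weight factor on the selected formulation can be sketched with a generic weighted-sum scalarization. This is not the authors' objective function, which is defined in the methods section and not reproduced here: the scoring formula, the candidate predictions, and the ω values used below are all hypothetical. Only the reference values of the pure polymer are taken from the text.

```python
TS_PURE = 72.1     # MPa, tensile strength of the pure polymer (from the text)
PHRR_PURE = 888.0  # kW m^-2, pHRR of the pure polymer (from the text)

# Hypothetical model predictions (TS in MPa, pHRR in kW m^-2) for three
# candidate formulations; these numbers are made up for illustration.
candidates = {
    "A": (55.0, 250.0),  # very low pHRR, but weak
    "B": (75.0, 450.0),  # balanced
    "C": (86.0, 700.0),  # strong, but high pHRR
}

def score(ts, phrr, omega):
    """Assumed scalarization: reward normalized TS, penalize normalized pHRR."""
    return omega * ts / TS_PURE - phrr / PHRR_PURE

def best(omega):
    """Name of the candidate that maximizes the weighted score."""
    return max(candidates, key=lambda name: score(*candidates[name], omega))

for omega in (0.1, 1.0, 10.0):
    print(f"omega = {omega:>4}: best candidate = {best(omega)}")
```

In this toy setting, increasing ω moves the selection from the most fire-safe candidate (A) through the balanced one (B) to the strongest one (C), which mirrors the rationale of widening the weight range to emphasize tensile strength.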

Fig. 11 provides a comprehensive view of all stages of our data-driven workflow: initial design points (75 total, grey circles), DoE-CASTRO points for chemical space exploration (14 total, red stars), and optimization campaigns (campaign 1: blue crosses; campaign 2: green plus signs) derived via weighted point optimization using the multi-task RF model. The progression from DoE points through optimization campaigns 1 and 2 reveals a clear trend of improvement, refining both pHRR and TS values. However, it is worth noting that not all points from the second optimization campaign lie closer to the Pareto front. This is due to the high sensitivity of the problem: small changes in composition can lead to disproportionately large variations in the output properties. For instance, a formulation that significantly improves pHRR may simultaneously compromise TS, and vice versa. As a result, even within an optimization campaign with an increased emphasis on tensile strength, trade-offs between objectives can lead to points that are not Pareto-optimal.

Additionally, Fig. 11 highlights the Pareto front (purple dashed line), which connects non-dominated solutions that minimize pHRR while maximizing TS. The front demonstrates how the data-driven strategy (red stars → blue crosses → green plus signs) systematically enhanced performance. However, it is important to note that in the DoE approach, the material properties were not explicitly considered during the selection process.
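The notion of non-dominated solutions can be made concrete with a short sketch. A point is dominated if another point is at least as good in both objectives (lower or equal pHRR, higher or equal TS) and strictly better in at least one. The (pHRR, TS) pairs below are hypothetical and only loosely inspired by the value ranges in this work; the function name `pareto_front` is chosen here for illustration.

```python
def pareto_front(points):
    """Return the non-dominated (pHRR, TS) points: minimize pHRR, maximize TS."""
    front = []
    for i, (p_i, t_i) in enumerate(points):
        dominated = any(
            # j is no worse in both objectives and strictly better in one.
            (p_j <= p_i and t_j >= t_i) and (p_j < p_i or t_j > t_i)
            for j, (p_j, t_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((p_i, t_i))
    return sorted(front)  # ordered by increasing pHRR

# Hypothetical (pHRR in kW m^-2, TS in MPa) pairs, for illustration only.
points = [
    (233.0, 55.0),
    (300.0, 50.0),  # dominated by (233.0, 55.0)
    (417.0, 85.0),
    (500.0, 80.0),  # dominated by (417.0, 85.0)
    (700.0, 88.0),
    (888.0, 72.0),  # dominated by (417.0, 85.0)
]

front = pareto_front(points)
print(front)  # [(233.0, 55.0), (417.0, 85.0), (700.0, 88.0)]
```

The pairwise check is quadratic in the number of points, which is unproblematic for data sets of the size considered here (on the order of a hundred formulations).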

A key highlight of Fig. 11 is the best compromise solution (red circled green plus sign), located near the top of the Pareto front. This optimized formulation achieved in the second optimization campaign corresponds to a TS of 85.4 MPa, marking an 18.4% improvement over the pure polymer and a pHRR of 416.6 kW m⁻², indicating a 53.1% reduction from the pure polymer. This formulation represents a well-balanced trade-off, maximizing TS while significantly reducing pHRR, making it an optimal material candidate.
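The quoted percentages can be checked with a short calculation. The sketch assumes that the improvement measures are the usual relative changes with respect to the pure polymer (the corresponding equations are not reproduced in this section); the function names are illustrative.

```python
TS_PURE = 72.1     # MPa, pure polymer
PHRR_PURE = 888.0  # kW m^-2, pure polymer

def ts_increase_pct(ts):
    """Relative TS increase over the pure polymer, in percent (assumed form)."""
    return 100.0 * (ts - TS_PURE) / TS_PURE

def phrr_reduction_pct(phrr):
    """Relative pHRR reduction versus the pure polymer, in percent (assumed form)."""
    return 100.0 * (PHRR_PURE - phrr) / PHRR_PURE

# Best compromise solution reported in the text.
print(f"TS increase:    {ts_increase_pct(85.4):.1f}%")      # 18.4%
print(f"pHRR reduction: {phrr_reduction_pct(416.6):.1f}%")  # 53.1%
```

Under this assumption, 85.4 MPa and 416.6 kW m⁻² reproduce the reported 18.4% and 53.1%.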

3.4.2 Optimization-guided composites experimental results. A new group of polymer composites was prepared based on the optimization guidelines provided by the multi-task RF model, with the detailed compositions presented in the ESI (Tables S4 and S5).† The newly created formulations were divided into two campaigns (campaigns 1 and 2, as in Section 3.4.1) for the required experimental characterization, with pHRR_pure = 888 kW m⁻² in eqn (11) and TS_pure = 72.1 MPa in eqn (12). The ML predictions versus the observed pHRR and TS for the two campaigns are shown in Fig. 12. Notably, the predicted and measured values for pHRR differ considerably, although in some instances the predictions align closely with the observed data. In contrast, the TS predictions are generally more accurate, with discrepancies remaining within 5% for nearly all results. The larger discrepancies in pHRR may be attributed to the limited size of the dataset and to the different interactions between the multicomponent flame retardants, which could lead to more complex synergistic effects; this suggests that the ML model requires a larger and more diverse training set to achieve improved predictive accuracy. In the experimental investigation of the first campaign (Fig. 12a), the pHRR was reduced by more than 50% (in samples 93, 95, and 96) using a lower concentration of FR (3–7 wt%) than in the previous group. The TS was also higher than that of the pure polyamide in all cases within this batch. Among them, sample 96 achieved an 18% increase in TS, attributed to the higher polymer matrix content and the incorporation of the HNT reinforcement nanomaterial. This enabled the design of composites with targeted improvements in both mechanical and flame-retardant properties. Meanwhile, the second campaign (Fig. 12b) also demonstrated a high level of flame retardancy, further improving the pHRR reduction to 53% and the TS increase to 18.4%. This observed improvement reflects a more balanced formulation, as the strength of polymer composites typically depends on the reinforcement content and its ability to disperse effectively within the polymer matrix.


	Fig. 12 PA56 testing results summary (a) optimization campaign 1 and (b) optimization campaign 2.

As shown in Fig. 13a and b, the optimization process guided the compositional design of the bio-based flame retardants and their subsequent incorporation into the polyamide matrix. Such ML-based recommendations reduced the time and the expensive experimental tuning of processing parameters and functional properties that normally delay the development of both synthetic chemistry and polymer manufacturing and testing protocols.⁴⁵ Moreover, it was possible to develop ML-designed polyamide composites with enhanced performance in one or multiple properties (such as samples 96 and 102, which simultaneously achieved a pHRR reduction of more than 50% and a tensile strength increase of more than 18%), showing that ML effectively improved the development of flame retardant formulations and, as a complementary advantage, enabled the fabrication of polymer composites with customized attributes.


	Fig. 13 PA56 testing results summary including the tensile strength (a) and peak heat release rate results (b) obtained in the initial testing, design of experiments (DoE via CASTRO), and optimization campaigns (OPT1, OPT2).

3.4.3 Validation for single-task optimization. To ensure that optimal points were identified for each individual objective, we used Bayesian optimization (BO), selecting points from the full data set based on the maximization of an expected improvement (EI) acquisition function. The EI function prioritizes data points with the potential for significant improvement over the current best-known solution. Following the procedure outlined in Section 2.8, we observed convergence towards the best point in the data set (indicated by the dashed line in Fig. 14), with values of 88.2 MPa for the TS and 233.2 kW m⁻² for the pHRR. The optimal mechanical design comprises a high PA56 content (97%) reinforced with 1.5% HNT and 0.8% of PhA and CS. In contrast, the most effective fire-resistant design features a lower PA56 content (81.4%) and includes 1.7% PhA, 1.5% CS, 12.4% CaBO, and 3% MEL. Although the BO formulation is stated as a maximization problem, in practice we seek to minimize the pHRR. To accommodate this, we treat the pHRR with the opposite sign, effectively converting the minimization problem into a maximization problem for this objective. The TS reaches its maximum after 14 iterations, while the pHRR reaches its optimal value after 11 iterations.
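For illustration, the standard closed form of the EI acquisition function for a maximization problem is sketched below, assuming that a surrogate model supplies a Gaussian posterior (mean μ and standard deviation σ) for each candidate. The surrogate used in this work is described in the methods and is not reproduced here; the candidate pool, posterior values, and the exploration parameter `xi` are hypothetical.

```python
from math import erf, exp, pi, sqrt

def norm_pdf(z):
    """Standard normal probability density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def expected_improvement(mu, sigma, best, xi=0.0):
    """EI for maximization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)  # no uncertainty: plain improvement
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm_cdf(z) + sigma * norm_pdf(z)

# Hypothetical pool of untested candidates: (posterior mean, posterior std) of TS.
pool = {"x1": (86.0, 0.5), "x2": (84.0, 4.0), "x3": (80.0, 1.0)}
best_ts = 85.0  # hypothetical best TS observed so far (MPa)

ei = {name: expected_improvement(mu, sd, best_ts) for name, (mu, sd) in pool.items()}
next_point = max(ei, key=ei.get)
print(next_point)  # x2: lower mean than x1, but larger uncertainty

# pHRR is minimized, so it enters with the opposite sign:
# mean -> -mean, best (lowest) observed value -> -best.
ei_phrr = expected_improvement(mu=-240.0, sigma=10.0, best=-250.0)
print(round(ei_phrr, 2))
```

The example also shows why EI does not simply pick the highest predicted mean: candidate x2 is preferred over x1 because its larger uncertainty leaves more room for improvement.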


	Fig. 14 Validation with Bayesian optimization results for (a) the tensile strength (TS) and (b) the peak heat release rate (pHRR) single-task problem.

4 Conclusions

Given the rapid advancements in bio-based and natural materials, this study develops an innovative bio-based flame retardant (FR) system for biocomposite materials by integrating computationally guided ML techniques with experimental data to optimize material composition and performance. The research introduces a cost-efficient, data-driven experimentation framework, employing design of experiments to sample efficiently and ML models to predict and optimize key properties such as mechanical strength (TS) and flame retardancy (pHRR), streamlining the process and minimizing the reliance on resource-intensive trial-and-error methods.

The synthesis procedure of the flame retardants, material preparation, characterization, and measurement steps were described in detail. Next, the computationally guided design of experiments method, CASTRO, was introduced. This technique focused on space exploration for material composition and synthesis. Following this, predictive ML models using Random Forest regression were developed for both single-task and multi-task learning, aimed at forecasting mechanical and fire properties. The multi-task model was used to identify designs that optimized both mechanical and fire properties in a Pareto-efficient manner. Bayesian optimization was then introduced to validate the single-task results.

The preliminary results of flame-retardant polyamide composites were analyzed, followed by the computationally guided design of experiments, which used the CASTRO method to enhance biocomposite development. The 14 experimental DoE-CASTRO suggestions were analyzed with respect to flame retardancy and mechanical properties. After sufficient design space exploration, the focus shifted to property selection, prediction, and optimization. The optimized samples were then thoroughly analyzed.

The best Pareto design discovered demonstrated a tensile strength (TS) of 85.4 MPa, an 18.4% improvement over the pure polymer, and a peak heat release rate (pHRR) of 416.6 kW m⁻², a 53.1% reduction compared to the pure polymer. Further validation using Bayesian optimization on the entire dataset confirmed the optimal solutions for individual objectives. The most effective individual designs achieved a TS of 88.2 MPa, a 22.3% improvement, and a pHRR of 233.2 kW m⁻², a 73.7% reduction relative to the pure polymer.

This research demonstrated how a computationally guided approach could significantly accelerate the development of novel, sustainable, bio-based flame retardants and biocomposites, offering high-performance alternatives to traditional solutions. By leveraging novel DoE techniques, predictive analytics and iterative optimization, the study identified new material compositions and flame retardant (FR) systems with enhanced functional properties. Specifically, it optimized the peak heat release rate (pHRR) while demonstrating correlations to other fire properties, and optimized mechanical performance through the tensile strength (TS). These designs provide more sustainable and environmentally friendly alternatives with superior functional properties compared to conventional solutions, paving the way for next-generation materials with improved mechanical and fire-resistant properties and contributing to a more efficient material design process.

Data availability

All the data collected for this article are available on Zenodo at https://zenodo.org/records/15097536.⁴⁶

Author contributions

Christina Schenk: writing – original draft, review and editing, visualization, implementation, software, data curation, methodology, investigation, formal analysis, validation, conceptualization. Jose Hobson: writing – original draft, review and editing, investigation, data curation, formal analysis, methodology, visualization, validation, conceptualization. Maciej Haranczyk: writing – review and editing, supervision, methodology, funding acquisition, conceptualization. De-Yi Wang: writing – review and editing, supervision, methodology, funding acquisition, conceptualization.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

This article is part of the project TED2021-131409B-100, funded by MCIN/AEI/10.13039/501100011033 and by the European Union “NextGenerationEU”/PRTR.

Notes and references

X. Ao, R. Crouse, C. González and D.-Y. Wang, Constr. Build. Mater., 2024, 436, 136922.
X. L. Qi, D. D. Zhou, J. Zhang, S. Hu, M. Haranczyk and D. Y. Wang, ACS Appl. Mater. Interfaces, 2019, 11, 20325–20332.
Y. K. Leong, C.-H. Chen, S.-F. Huang, H.-Y. Lin, S.-F. Li, I.-S. Ng and J.-S. Chang, Biochem. Eng. J., 2020, 157, 107547.
T. Yang, Y. Gao, X. Wang, B. Ma and Y. He, Polymer, 2021, 237, 124356.
E. D. Weil and S. Levchik, J. Fire Sci., 2004, 22, 251–264.
Y. Daniel and B. Howell, Polym. Degrad. Stab., 2018, 156, 14–21.
M. Wang, G.-Z. Yin, Y. Yang, W. Fu, J. L. D. Palencia, J. Zhao, N. Wang, Y. Jiang and D.-Y. Wang, Adv. Ind. Eng. Polym. Res., 2023, 6, 132–155.
Y. Liu, A. Zhang, Y. Cheng, M. Li, Y. Cui and Z. Li, Polym. Test., 2023, 124, 108100.
W. He, P. Song, B. Yu, Z. Fang and H. Wang, Prog. Mater. Sci., 2020, 114, 100687.
S. Bourbigot and G. Fontaine, Polym. Chem., 2010, 1, 1413–1422.
Z. Zheng, C. Liao, Y. Xia, Y. Liu, B. Dai and A. Li, Polym. Test., 2020, 90, 106741.
L. Costes, F. Laoutid, S. Brohez and P. Dubois, Mater. Sci. Eng., R, 2017, 117, 1–25.
D. C. O. Marney, W. Yang, L. J. Russell, S. Z. Shen, T. Nguyen, Q. Yuan, R. Varley and S. Li, Polym. Adv. Technol., 2012, 23, 1564–1571.
X. Cheng, L. Shi, Z. Fan, Y. Yu and R. Liu, Polym. Degrad. Stab., 2022, 199, 109898.
S. Członka, A. Kairytė, K. Miedzińska and A. Strąkowska, Materials, 2021, 14, 3620.
F. Fang, S. Ran, Z. Fang, P. Song and H. Wang, Composites, Part B, 2019, 165, 406–416.
A. Hao, I. Wong, H. Wu, B. Lisco, B. Ong, A. Sallean, S. Butler, M. Londa and J. H. Koo, J. Mater. Sci., 2015, 50, 157–167.
P. Jafari, R. Zhang, S. Huo, Q. Wang, J. Yong, M. Hong, R. Deo, H. Wang and P. Song, Compos. Commun., 2024, 45, 101806.
Z. Zhang, Z. Jiao, R. Shen, P. Song and Q. Wang, ACS Appl. Eng. Mater., 2023, 1, 596–605.
B. Cao, L. A. Adutwum, A. O. Oliynyk, E. J. Luber, B. C. Olsen, A. Mar and J. M. Buriak, ACS Nano, 2018, 12, 7434–7444.
A. S. Rosen, S. M. Iyer, D. Ray, Z. Yao, A. Aspuru-Guzik, L. Gagliardi, J. M. Notestein and R. Q. Snurr, Matter, 2021, 4, 1578–1597.
V. Fung, J. Zhang, E. Juarez and B. G. Sumpter, npj Comput. Mater., 2021, 7, 84.
H. T. Nguyen, K. T. Nguyen, T. C. Le, L. Soufeiani and A. P. Mouritz, Compos. Sci. Technol., 2021, 215, 109007.
Z. Chen, B. Yang, N. Song, T. Chen, Q. Zhang, C. Li, J. Jiang, T. Chen, Y. Yu and L. X. Liu, Chem. Eng. J., 2023, 455, 140547.
B. Ozdemir, M. H. del Valle, M. Gaunt, C. Schenk, L. Echevarría-Pastrana, J. P. Fernández-Blázquez, D.-Y. Wang and M. Haranczyk, Addit. Manuf., 2024, 95, 104533.
C. Yan, X. Lin, X. Feng, H. Yang, P. Mensah and G. Li, Appl. Phys. Lett., 2023, 122(25), 251902.
M. H. del Valle, C. Schenk, L. Echevarría-Pastrana, B. Ozdemir, E. Dios-Lázaro, J. Ilarraza-Zuazo, D.-Y. Wang and M. Haranczyk, Digital Discovery, 2023, 2, 1969–1979.
C. Schenk and M. Haranczyk, Comput. Mater. Sci., 2025, 252, 113780.
C. Schenk, CASTRO - A ConstrAined Sequential laTin hypeRcube (with multidimensional uniformity) sampling methOd, 2023-2025, https://github.com/AMDatIMDEA/castro.
L. Breiman, Mach. Learn., 2001, 45, 5–32.
F. Wang, Q. Gallagher, A. Gupta and C. Schenk, ACBO Hackathon 2024 - BO for Drug Discovery: What is the role of molecular representation?, 2024, https://github.com/FrankWanger/ACBO-Feat.
F. Chen, L. Weng, J. Wang, P. Wu, D. Ma, F. Pan and P. Ding, Compos. Sci. Technol., 2023, 231, 109818.
J. Xiao, J. Hobson, A. Ghosh, M. Haranczyk and D.-Y. Wang, Compos. Commun., 2023, 40, 101593.
J. Xiao, J. Hobson, M. Haranczyk and D.-Y. Wang, Polym. Degrad. Stab., 2023, 218, 110563.
C. E. Okafor, S. Iweriolor, O. I. Ani, S. Ahmad, S. Mehfuz, G. O. Ekwueme, O. E. Chukwumuanya, S. E. Abonyi, I. E. Ekengwu and O. P. Chikelu, Hybrid Adv., 2023, 2, 100026.
R. Hong, L. Ting and W. Huijie, Resour.-Effic. Technol., 2017, 3, 226–231.
E. Champagne, M. Fisher and O. Hinojosa, J. Inorg. Biochem., 1990, 38, 199–215.
D.-F. Li, X. Zhao, Y.-W. Jia, X.-L. Wang and Y.-Z. Wang, Compos. Commun., 2018, 8, 52–57.
C. Zhao, T. Wang, F. Chen, Y. Sun and G. Chen, Anal. Bioanal. Chem., 2022, 414, 2453–2460.
M. R. Kasaai, Carbohydr. Polym., 2010, 79, 801–810.
F. Mori, M. Kubouchi and Y. Arao, J. Mater. Sci., 2018, 53, 12807–12815.
Y. Yang, Z. Li, G. Wu, W. Chen and G. Huang, Polym. Degrad. Stab., 2022, 196, 109841.
W. Cai, B. Wang, L. Liu, X. Zhou, F. Chu, J. Zhan, Y. Hu, Y. Kan and X. Wang, Composites, Part B, 2019, 178, 107462.
A. F. Holdsworth, A. R. Horrocks and B. K. Kandola, Polym. Degrad. Stab., 2020, 179, 109220.
S. H. M. Mehr, M. Craven, A. I. Leonov, G. Keenan and L. Cronin, Science, 2020, 370, 101–108.
C. Schenk, J. Hobson, M. Haranczyk and D.-Y. Wang, AI-Driven Design and Green Preparation of Bio-Based Fire-Safe Polymeric Materials, 2025, DOI:10.5281/zenodo.15097536.

Footnotes

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5ta02511g

‡ These authors contributed equally to this work.
