Abstract:
The aim of this thesis was to build a comprehensive database from published articles on photocatalytic water splitting, to extract knowledge from it, and to examine whether the results of an unperformed experiment could be estimated. A total of 6378 instances were gathered from 129 articles. To improve the effectiveness of the analysis, this large database was then divided into four major subsets: UV light over TiO2, visible light over TiO2, ABO3 perovskites, and ABS3 perovskites. The database was pre-processed to eliminate noisy instances, fill in missing values, and reduce the number of dimensions or the size of the data while preserving its content. The hydrogen production rate (μmol/g-cat/h) was selected as the output variable, and an attempt was made to estimate it from several input variables, such as the semiconducting material, preparation methods, catalytic properties, and operational variables. Linear regression, artificial neural networks, random forests, and principal component analysis were applied to each dataset using libraries and functions of “R”. The parameters of each algorithm were varied within certain intervals to find the optimum conditions for modeling. The models were evaluated using their standard error, root mean squared error (RMSE), and R-squared values. Model performance was further assessed with 10-fold cross-validation and standardized residual error analysis. The best results were achieved with the random forest algorithm for all subsets; the absolute error, RMSE, and R-squared values were in the ranges 0.08 - 0.42, 0.17 - 0.61, and 0.64 - 0.97, respectively. In general, the errors were scattered randomly, but some outliers were detected in the subsets. Input significance and sensitivity analyses were also applied to the subsets to extract more information. 
Catalytic variables (band gap, surface area, and particle size) were found to be the determining factors for titanium-based photocatalysts, whereas operational variables were more significant for hydrogen evolution over perovskite-based photocatalysts.
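
For reference, the error measures named above are conventionally defined as shown below. This is only a sketch under assumptions: the symbols are introduced here for illustration (n instances, observed hydrogen production rates y_i, model predictions \hat{y}_i, and observed mean \bar{y}), "absolute error" is taken to mean the mean absolute error, and the abstract does not state the exact formulas or any scaling applied to the output variable.

```latex
\begin{align}
  \mathrm{MAE}  &= \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \\
  \mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \\
  R^2           &= 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
                            {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
\end{align}
```

Under these definitions, lower MAE and RMSE and an R-squared closer to 1 indicate a better fit. In 10-fold cross-validation, such measures are typically computed on the held-out fold of each of the ten splits and then averaged.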