Possibilities of using an artificial neural network to control oxygen consumption in a converter shop

The converter steelmaking process uses high-performance processing plants and a high level of process automation. Complex multi-stage processes typically involve many factors that affect both individual stages and the process as a whole. This makes it necessary to select and/or develop control methods for the converter process in the absence of complete current data on process parameters and under the effects of random disturbances. Artificial neural networks are currently the most effective method for analyzing and predicting the parameters of various processes, including the converter process. This article describes an algorithm for predicting the required oxygen volume based on the chemical analysis of the steel and the process history over a certain period, using an artificial neural network and machine learning. The initial data were examined using correlation and regression analysis, the chosen artificial neural network architecture is justified as the most appropriate for the objective at hand, and the results are analyzed. The prediction accuracy achieved is satisfactory, which makes the resulting artificial neural network usable in an automation system for testing purposes. Network performance can be improved with the help of experts in the subject area, whose expertise can be used to correct the model.


Introduction
The converter steelmaking process uses high-performance processing plants and a high level of process automation. This makes it possible to produce a wide range of high-quality steel grades. However, complex multi-stage processes involve many factors that affect both individual stages and the progress of the entire process [1]. This makes it necessary to select and/or develop control methods for the converter process in the absence of complete current data on process parameters and under the effects of random disturbances.
Relationships between factors and output variables can be constructed analytically, based on a theoretical analysis of the processes, or with identification methods based on a real experiment (observing a real process and measuring the corresponding factor values and output parameters).
Given the current level of information technology [2], artificial neural networks (ANN) are the most effective method currently available for analyzing and predicting the parameters of various processes, including the converter process [3][4][5][6].

Problem description
The converter steelmaking process with top oxygen blowing is the most common steelmaking process used by steel producers today. Producing high-quality steel with a target chemical composition and properties is quite a challenge and is therefore impossible without an automated steelmaking process control system. As liquid steel parameters cannot be monitored continuously in the ladle, the automated system software must include a package of mathematical models to calculate the technological parameter values for the steelmaking process. A large variety of models is currently available for describing the converter process, most of which relate to the blowing period.
The amount of oxygen is normally determined empirically, which is not always accurate enough. Therefore, the possibility of using an ANN and machine learning to determine the amount of oxygen needed for blowing was considered.

Using ANN in the converter steelmaking process control system and stating the problem
As explained above, the influence of the human factor on the technological process has to be reduced by delegating the parameter analysis to the ANN. The first step is to analyze the statistical data collected on site over a certain period of operation, which provides a sufficient set of information for analysis with the ANN. We therefore considered a data set of more than eighty thousand heats produced over several years of the company's operation.
Before analyzing the data, it is necessary to understand how they are used in the production process (which parameters are known before, during, and after the heat) and what relationships may exist between them. Since we need to find out whether the amount of oxygen depends on other variables, we use regression analysis tools. Regression analysis is a set of statistical methods for investigating the influence of one or more independent variables on a dependent variable [7]. This set of methods is also used to detect hidden relationships in data, which is required for this task.
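As a minimal sketch of how such a hidden relationship can be detected, the following example (with entirely synthetic data; the linear dependence of oxygen volume on silicon content and the numeric coefficients are illustrative assumptions, not values from the study) computes a pairwise Pearson correlation with pandas:

```python
import numpy as np
import pandas as pd

# Synthetic heat records: oxygen volume depends linearly on the
# hot-metal Si content plus noise (illustrative numbers only).
rng = np.random.default_rng(0)
si = rng.uniform(0.3, 0.9, size=500)                      # hot-metal Si, %
oxygen = 5000 + 2000 * si + rng.normal(0, 50, size=500)   # oxygen, m^3

df = pd.DataFrame({"si": si, "oxygen": oxygen})

# A high pairwise correlation coefficient flags "si" as a candidate
# input variable for the regression model.
corr = df.corr().loc["si", "oxygen"]
print(f"corr(Si, O2) = {corr:.3f}")
```

In a real data set the same correlation matrix would be inspected for every candidate input against the oxygen volume.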
The initial data for the static heat model are: the chemical analysis and temperature of the hot metal in the converter, the amount and type of cooling agents added in the process, and the lining status measured by the number of heats. The model calculates the total volume of oxygen required for blowing during the current heat.
While reviewing the data, a large number of outliers and garbage values were found in each set of values [8].
Removing noise from the data is an important task for regression analysis, as it makes it possible to better define and refine the relationships between parameters. This operation is of primary importance for the output parameter. After the data had been cleaned, its initial volume decreased by about 30%. It was then necessary to choose the ANN architecture/machine learning method. For this purpose, several alternatives were selected, and each model was tried.
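The article does not specify which cleaning rule was applied; one common approach, sketched here purely for illustration (the column name, sample values, and the interquartile-range rule are assumptions), is IQR filtering:

```python
import pandas as pd

# Hypothetical sketch: interquartile-range (IQR) filtering, a common
# way to drop outliers and garbage values before regression analysis.
def drop_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return df[df[column].between(lo, hi)]

# Illustrative values: 50 000 m^3 and -10 m^3 are clearly garbage.
heats = pd.DataFrame({"oxygen": [5100, 5200, 5150, 50_000, 5180, -10]})
clean = drop_outliers(heats, "oxygen")
print(len(clean))  # the two garbage values are removed
```

Applied per column, a rule like this can easily explain a ~30% reduction in data volume on noisy plant records.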

Investigation of the applicability of artificial neural networks for oxygen prediction (results)
The investigation used real parameters of more than 40,000 heats produced at the same steel mill. The input variables are the values known before blowing (hot metal weight, scrap weight, lime weight, hot metal temperature, hot metal analysis (C, Mn, Si, P, S, Al, Cr, Ni, Cu, V, Ti), total volume of oxygen per heat, metal temperature, amount of nitrogen for stirring, heat ID number within the converter campaign) and the values of parameters after blowing: the temperature and chemical analysis of the final steel (C, V). The output parameter of the model is the amount of oxygen for the current heat. A correlation analysis was performed before developing the ANN-based model, in which the correlation coefficients between input and output variables were determined. After the correlation analysis, the following variables were chosen as input values:
• temperature;
• hot metal analysis;
• chemical analysis of the metal at tapping (C, V);
• type and amount of cooling agents;
• tap-to-tap time.
The Scikit-learn and TensorFlow (tf.keras) libraries were used to create and train the ANN/machine learning model [9]. The Scikit-learn library is the most common choice for classical machine learning tasks due to its simplicity. One of its main advantages is that it builds on several widely used mathematical libraries and integrates with them easily, including Pandas, SymPy, SciPy, NumPy, IPython, and Matplotlib [10].
TensorFlow is an open-source machine learning software library that supports a wide range of machine learning techniques and includes an integrated logging system and a log visualization system.
Since no exact relationship between the input and output variables is known, a preliminary search was first made for the most appropriate ANN architecture/type of machine learning. See Table 1 for the general test results for the models under review. The module import block is omitted in the listing. Since the code was written in the Spyder development environment using a cell structure, it is divided by '#%%' markers into blocks, which are described below in order from top to bottom.
In the first block, the data is read from an Excel sheet and split into a training set and a test set.
In the next block, the random regression forest algorithm from the Scikit-learn library is initialized with the number of trees equal to 'n_estimators'.
Then the training process starts using the specified algorithm and the previously defined training data set. At the end of training, a test is performed and the mean squared error is calculated.
The last block writes the results to an Excel sheet at the specified path.
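Since the listing itself is not reproduced here, the four blocks described above can be sketched as follows. This is a minimal reconstruction, not the study's actual code: the synthetic data replaces the original Excel input, and the column names, coefficients, and `n_estimators=100` are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the heat records (the original code reads
# them from an Excel sheet instead).
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "hm_temp": rng.uniform(1280, 1400, 2000),   # hot metal temperature, C
    "si": rng.uniform(0.3, 0.9, 2000),          # hot metal Si, %
    "scrap_t": rng.uniform(20, 40, 2000),       # scrap weight, t
})
data["oxygen"] = 4000 + 1800 * data["si"] + 10 * data["scrap_t"] \
    + rng.normal(0, 30, 2000)

# Block 1: split the records into a training set and a test set.
X = data.drop(columns="oxygen")
y = data["oxygen"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Block 2: initialize a random regression forest with n_estimators trees.
model = RandomForestRegressor(n_estimators=100, random_state=0)

# Block 3: train, then compute the mean squared error on the test set.
model.fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"test MSE: {mse:.1f}")

# Block 4: the original code writes the results to an Excel sheet, e.g.
# pd.DataFrame({"predicted": model.predict(X_test)}).to_excel("results.xlsx")
```

The cell boundaries marked by '#%%' in the Spyder version correspond to the commented blocks above.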
In practice, the accuracy of forecasting depends more on the quality of the data set submitted for training than on the choice of model type and its optimization. Given that the prediction error was quite large, it was decided to continue processing the data and testing the algorithm in order to increase its accuracy.
Let us look at ways to prepare the data for the task at hand:
• Removal of collinearity. If the relationship is linear, it will be found most quickly and accurately when strongly correlated input variables are removed. To remove collinearity, pairwise correlations are calculated and one variable of each highly correlated pair is eliminated.
• Gaussian distribution. Linear relationships can also be predicted more reliably if the input and output variables have a Gaussian distribution. To achieve this, transformations such as log or Box-Cox can be applied to the data.
• Scaling of inputs. Forecasting is more reliable if the input data is scaled using standardization or normalization.
• Removal of meaningless predictors. Some algorithms may be less effective with variables that have little or no significance. From a mathematical point of view, such variables have a fairly low variance.
After the algorithm had been rewritten and retrained, the error on the training data was 27 cubic meters. This accuracy was considered satisfactory by experts.
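The four preparation steps above can be sketched in a few lines of Scikit-learn and pandas. All data, column names, and thresholds here are illustrative assumptions, not the study's actual values:

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import StandardScaler

# Synthetic data with a deliberately collinear column and a
# near-constant column (both illustrative).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "hm_temp": rng.normal(1350, 20, 1000),  # hot metal temperature
    "si": rng.uniform(0.3, 0.9, 1000),      # hot metal Si, %
    "lining": np.full(1000, 1.0),           # constant: meaningless predictor
})
df["si_dup"] = df["si"] * 100               # collinear copy of "si"

# Removal of collinearity: drop one variable of each highly
# correlated pair (threshold 0.95 is an assumption).
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.95).any()]
df = df.drop(columns=to_drop)

# Gaussian distribution: a log transform pushes skewed positive
# inputs toward normality (Box-Cox is another option).
df["si"] = np.log(df["si"])

# Removal of meaningless predictors: drop near-zero-variance columns.
vt = VarianceThreshold(threshold=1e-6)
kept = df.columns[vt.fit(df).get_support()]
df = df[kept]

# Scaling of inputs: standardize to zero mean and unit variance.
X = StandardScaler().fit_transform(df)
print(df.columns.tolist(), X.shape)
```

In this sketch the collinear copy and the constant column are both removed, leaving only the informative inputs, standardized and ready for training.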

Conclusion
Given the satisfactory results, it makes sense to start using the resulting artificial neural network under real conditions for testing purposes. Such trial operation will also serve as a way to improve its effectiveness. To assess the adequacy of the results obtained, the help of specialists in this subject area will be needed, whose expertise can be used to correct the model, if necessary.