The Xfuzzy 3 development environment

The Time Series Prediction Tool - Xftsp

The tool xftsp generates fuzzy inference systems that implement autoregressive models for the short- and long-term prediction of time series. To do this, it applies a methodology based on the use of nonparametric noise or residual variance estimates (to select the optimal number of input variables) in combination with Xfuzzy supervised learning and identification tools (to determine the structure of the systems).

This methodology follows a direct prediction strategy, which builds a separate autoregressive model for each term of the desired prediction horizon. In each case, the optimal subset of inputs is selected a priori by means of a nonparametric noise estimate (for example, the Delta Test). The specification of the fuzzy system for each prediction horizon is then obtained through an iterative process that alternates identification and adjustment phases, increasing the number of linguistic labels of the inputs each time, until the system error falls within the previously estimated range.
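The direct strategy can be illustrated with a minimal Python sketch that builds one set of training patterns per horizon. The function name `build_patterns`, the toy series, and the reading of a selection such as 1-3-8 as "lags counted back from the most recent known value" are assumptions made for this illustration; they are not taken from the xftsp implementation.

```python
def build_patterns(series, lags, horizon):
    """Build (inputs, target) pairs for one horizon of a direct strategy.

    lags    -- 1-based offsets into the past; lag 1 is the most recent
               known value (an assumed convention, for illustration only).
    horizon -- number of steps ahead of the target (1 = next value).
    """
    max_lag = max(lags)
    patterns = []
    # t is the index of the most recent known value.
    for t in range(max_lag - 1, len(series) - horizon):
        inputs = [series[t - (lag - 1)] for lag in lags]
        target = series[t + horizon]
        patterns.append((inputs, target))
    return patterns


# Toy series and hypothetical input selections, one per horizon.
series = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
selected_lags = {1: [1, 3, 8], 2: [1, 2]}

# One independent pattern set (and hence one model) per horizon.
patterns_by_horizon = {
    horizon: build_patterns(series, lags, horizon)
    for horizon, lags in selected_lags.items()
}

for horizon, patterns in patterns_by_horizon.items():
    print(horizon, len(patterns), patterns[0])
```

Each horizon gets its own inputs and its own targets, so the models are independent and prediction errors are not fed back from one step into the next.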

xftsp can be executed in graphical mode, using the "Time Series Prediction" option of the Tuning menu or the corresponding icon in the main window of the environment, or from the command line using a configuration file.

The graphical user interface of xftsp collects the information needed to run the tool. This information includes the following items:

- Series name:

Name of the time series

- Training file:

Training patterns file

- Test file:

Test patterns file

- Save directory:

Directory where the output files are stored

- Identification algorithm:

Algorithm used in the identification phase (xfdm)

- Optimization algorithm:

Algorithm used in the optimization phase (xfsl)

- NRVE file:

Non-parametric residual variance estimation for each time horizon

- Selection file:

File of selection of input variables for each time horizon (*)

- Tolerance:

Sets the estimate used to determine the complexity of the fuzzy system, either as a fixed value or as one that increases with the prediction horizon

- Max exploration:

Maximum number of membership functions per input

- Generate optimization logs:

Keep the log files generated by the execution of xfsl in the optimization phase of all fuzzy systems

- Keep pattern files:

Keep in the directories 'xftsp-step-*' the training (and test) pattern files used in the identification and optimization phases

    (*) In Xfuzzy, errors are usually normalized against the squared range of the series, so the estimations should be normalized accordingly.
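As an illustration of the note above, the following Python sketch computes a Delta Test estimate (half the mean squared difference between each output and the output of its nearest neighbour in input space) and divides it by the squared range of the series. The function names and the toy data are hypothetical, and this is not the code used to produce the NRVE files shipped with Xfuzzy.

```python
def delta_test(inputs, targets):
    """Delta Test noise variance estimate (brute-force nearest neighbour)."""
    n = len(inputs)
    total = 0.0
    for i in range(n):
        nearest, nearest_dist = None, float("inf")
        for j in range(n):
            if j == i:
                continue
            # Squared Euclidean distance in input space.
            dist = sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j]))
            if dist < nearest_dist:
                nearest, nearest_dist = j, dist
        total += (targets[nearest] - targets[i]) ** 2
    return total / (2 * n)


def normalize_by_squared_range(value, series):
    """Scale an error or variance by the squared range of the series."""
    span = max(series) - min(series)
    return value / span ** 2


# Toy data: one input per pattern, target equal to the input.
toy_inputs = [[0.0], [1.0], [2.0], [3.0]]
toy_targets = [0.0, 1.0, 2.0, 3.0]

raw_estimate = delta_test(toy_inputs, toy_targets)
normalized_estimate = normalize_by_squared_range(raw_estimate, toy_targets)
print(raw_estimate, normalized_estimate)
```

The normalized value is the one that can be compared directly with errors normalized in the same way.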

The central area of the xftsp graphical user interface contains four buttons separated by a progress bar. The two upper buttons load (Load Configuration) or save (Save Configuration) a configuration file.

The syntax of the different directives that can appear in the configuration file is shown below:

 xftsp_series_name("name")
 xftsp_training_file("file_name")
 xftsp_test_file("file_name")
 xftsp_id_algorithm(algorithm_name, value,...)
 xftsp_opt_algorithm(algorithm_name, value,...)
 xftsp_nrve("file_name")
 xftsp_selection("file_name")
 xftsp_option(tolerance, increment)
 xftsp_option(max_exploration, max_num_MFs)    
 xftsp_option(generate_optimization_logs)
 xftsp_option(keep_pattern_files)  
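When several runs have to be prepared, the configuration file can also be generated by a script. The Python sketch below only assembles text from the directives listed above; the helper names `directive` and `quoted`, and the file and algorithm values, are illustrative assumptions and are not part of Xfuzzy.

```python
def directive(name, *args):
    """Render one directive as name(arg1, arg2, ...)."""
    return f"{name}({', '.join(str(arg) for arg in args)})"


def quoted(text):
    """Wrap a name or file path in double quotes."""
    return f'"{text}"'


# Hypothetical values, chosen only to show the layout of the file.
lines = [
    directive("xftsp_series_name", quoted("estsp07")),
    directive("xftsp_training_file", quoted("estsp07-training.txt")),
    directive("xftsp_test_file", quoted("estsp07-test.txt")),
    directive("xftsp_id_algorithm", "WangMendel"),
    directive("xftsp_opt_algorithm", "Rprop", 0.1, 1.5, 0.5),
    directive("xftsp_nrve", quoted("nrve_7")),
    directive("xftsp_selection", quoted("selection_7")),
    directive("xftsp_option", "tolerance", 0),
    directive("xftsp_option", "max_exploration", 15),
    directive("xftsp_option", "generate_optimization_logs"),
    directive("xftsp_option", "keep_pattern_files"),
]

config_text = "\n".join(lines) + "\n"
print(config_text)
```

The resulting text can be written to a file and passed to xftsp in the same way as a hand-written configuration.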

The number of rows in the NRVE file determines the time horizon to be predicted and, therefore, the number of fuzzy systems that will be created. The number of columns of the input selection file, in turn, sets the maximum size of the autoregressors, that is, the maximum number of input variables of the fuzzy systems.

Once the configuration is complete, the Generate models button launches the generation of the fuzzy systems that model the time series. Most of the messages generated during the execution of the tool are written to standard output, that is, to the command window from which Xfuzzy was launched or the xftsp command was executed. These messages are also written to a log file, called 'xftsp-run-results.log', which collects detailed comments on the different execution steps of the tool. When xftsp is executed from the graphical user interface, the messages related to loading and saving configuration files, as well as the end-of-execution notification, are shown in the lower area of the interface. The first lines of the log file resulting from an execution of xftsp have the following appearance:

    Date: Sat Mar 03 08:39:59 CET 2018
    Series name: estsp07
    Training series file: C:\workspace\Ejemplos\Tools\xftsp\estsp07-training.txt
    Test series file: C:\workspace\Ejemplos\Tools\xftsp\estsp07-training.txt
    NRVE file: C:\workspace\Ejemplos\Tools\xftsp\nrve_10 10
    Selection file: C:\workspace\Ejemplos\Tools\xftsp\selection_10 10 10
    -> Step/horizon 1
    Selected 3 variables: 1-3-8
    Training pattern file (after selection): C:\workspace\Ejemplos\Tools\xftsp\xftsp-step-1\estsp07-training.txt-3i1o-1step---1-3-8
    Test pattern file (after selection): C:\workspace\Ejemplos\Tools\xftsp\xftsp-step-1\estsp07-test.txt-3i1o-1step---1-3-8
    * Performing identification (with 3 inputs) using Wang & Mendel (Active rule extraction)
    Identification finished, identified 6 rules.
    * Performing optimization (with 3 inputs and 6 rules) using RProp
    Optimization finished
    Trn MSE: 1,4906565335E-03, Tst MSE: 1,6805603718E-03 | Threshold: 1,26220182E-03 (1.15 * 1,0975668E-03)
    * Performing identification (with 3 inputs) using Wang & Mendel (Active rule extraction)
    Identification finished, identified 15 rules.
    * Performing optimization (with 3 inputs and 15 rules) using RProp
    Optimization finished
    Trn MSE: 1,2759638533E-03, Tst MSE: 1,5397470334E-03 | Threshold: 1,26220182E-03 (1.15 * 1,0975668E-03)
    * Performing identification (with 3 inputs) using Wang & Mendel (Active rule extraction)
    Identification finished, identified 20 rules.
    * Performing optimization (with 3 inputs and 20 rules) using RProp
    Optimization finished
    Trn MSE: 1,2574012085E-03, Tst MSE: 1,5753594329E-03 | Threshold: 1,26220182E-03 (1.15 * 1,0975668E-03)
    
    * Results: 
    MF & rules &     Trn. MSE 	 &   Test MSE 	 &   Trn. MxAE 	 &    Test MxAE
    2  &   6	 & 1,4906565335E-03	 & 1,6805603718E-03 & 1,459269682E-01	 & 1,5739203918E-01
    3  &  15	 & 1,2759638533E-03 & 1,5397470334E-03 & 1,1709453877E-01	 & 1,3439983748E-01
    4  &  20	 & 1,2574012085E-03	 & 1,5753594329E-03 & 1,2528942456E-01	 & 1,4363158343E-01
    
    Prediction:  25.098186830201954
    ---------------------------------------------------------------------------------------
    
    -> Step/horizon 2
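The log excerpt above can be read as a stopping rule: the threshold is the residual variance estimate multiplied by a factor (1.15 in this run), and exploration stops at the first system whose error does not exceed it. The Python sketch below replays the values of step 1. The helper names are hypothetical, and comparing the training MSE against the threshold is an assumption that is consistent with this log, not a statement about the xftsp source code.

```python
def parse_decimal_comma(text):
    """Parse a number written with a decimal comma, as in the log."""
    return float(text.strip().replace(",", "."))


def first_system_within_threshold(results, threshold):
    """Return the number of MFs of the first system whose training MSE
    does not exceed the threshold, or None if no system qualifies."""
    for num_mfs, training_mse in results:
        if training_mse <= threshold:
            return num_mfs
    return None


# Values copied from step 1 of the log excerpt.
nrve = parse_decimal_comma("1,0975668E-03")
factor = 1.15
threshold = factor * nrve

# (membership functions per input, training MSE), in exploration order.
results = [
    (2, parse_decimal_comma("1,4906565335E-03")),
    (3, parse_decimal_comma("1,2759638533E-03")),
    (4, parse_decimal_comma("1,2574012085E-03")),
]

chosen = first_system_within_threshold(results, threshold)
print(threshold, chosen)
```

With these values the systems with 2 and 3 membership functions stay above the threshold and the one with 4 falls below it, which matches the point at which the log moves on to the results table.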
    

The execution of xftsp also generates a series of directories called 'xftsp-step-*' that contain the models (and auxiliary files) corresponding to each prediction horizon. Other files with information about the generated systems are also saved in these directories, as well as in the main directory.

Identification algorithms

In general, the identification algorithms supported by the tool xfdm can be used by xftsp. Some examples are:

 xftsp_id_algorithm(WangMendel)
 xftsp_id_algorithm(ICFA, 0, 20, 2.0, 0.01, 1) 
 xftsp_id_algorithm(CMeans, 0, 10, 2.0, 0.01, 0 ) 
 xftsp_id_algorithm(HardCMeans, 0, 10, 2.0, 0.01, 0 ) 
 xftsp_id_algorithm(GustafsonKessel, 0, 10, 2.0, 0.01, 0 ) 
 xftsp_id_algorithm(GathGeva, 0, 10, 2.0, 0.01, 0 ) 
 xftsp_id_algorithm(IncClustering, 2, 0.1) 

Optimization options

Getting a proper configuration of an optimization algorithm can be a slow and tedious task. Below are some configurations that tend to work well:

 xftsp_opt_algorithm(Scaled_conjugate_gradient)
 xftsp_opt_algorithm(Rprop, 0.1, 1.5, 0.5)
 xftsp_opt_algorithm(Marquardt, 0.1, 10.0, 0.2)
 xftsp_opt_algorithm(Quickprop, 0.25, 1.25)
 xftsp_opt_algorithm(Backprop_with_momentum, 1.2, 0.2)
 xftsp_opt_algorithm(Simulated_Annealing, 500, 0.5, 100)
 xftsp_opt_algorithm(Blind_search, 5.0)
 xftsp_opt_algorithm(Powell, 0.5, 100)
 xftsp_opt_algorithm(Simplex, 0.1, 1.5, 0.5)

Example

In the examples directory of the Xfuzzy distribution, you can find the configuration and data files needed to analyze a time series containing 875 weekly temperature samples corresponding to the "El Niño-Southern Oscillation" phenomenon, a weather pattern in which the meteorological parameters of the equatorial Pacific oscillate every few years. The data have been divided into two subsets: one of 475 samples, used as a training file, and another with the remaining 400 samples, used as a test file. A maximum regressor size of 10 and a prediction horizon of 50 have been considered, that is, the last 10 known values will be used to predict the next 50 values.

 xftsp_series_name(estsp07)
 xftsp_training_file("estsp07-training.txt")
 xftsp_test_file("estsp07-test.txt")
 xftsp_opt_algorithm(Rprop, 0.1, 1.5, 0.5)
 xftsp_selection("selection_7")
 xftsp_nrve("nrve_7")
 xftsp_option(tolerance,0)
 xftsp_option(max_exploration,15)
 xftsp_option(generate_optimization_logs)
 xftsp_option(keep_pattern_files)

To carry out the study, launch the tool from Xfuzzy and load the supplied configuration file, or execute the command:

    $ xftsp estsp07_xftsp.cfg 


------------------------------------------------------------------------
F. Montesino, A. Lendasse, A. Barriga
Autoregressive time series prediction by means of fuzzy inference systems 
using nonparametric residual variance estimation
Fuzzy Sets and Systems 2010
DOI: 10.1016/j.fss.2009.10.018

For comments, patches, bug reports, etc., contact us at:   xfuzzy-team@imse-cnm.csic.es

©IMSE-CNM 2018