
Tutorial 7: Logging the statistics

Experimenting with genetic algorithms often requires collecting data about how evolution proceeds across different runs. In this tutorial we will learn how to capture evolution statistics in .csv and MS Excel files.

Package:
jenes.tutorial.problem7

Files:
KnapsackLoggedProblem.java

Loggers

There are three ways to log statistics in Jenes. The first is do-it-yourself: saving data to a file or database. The other two rely on loggers, available since Jenes 1.3.0. The easiest way is to use a StatisticsLogger. This class automatically records objects that are LoggableStatistics, e.g. Population.Statistics and GeneticAlgorithm.Statistics objects. StatisticsLogger relies on a real logger to actually write the data to some medium. For example, it can use a CSVLogger to record to a .csv (i.e. comma-separated values) file

        csvlogger = new StatisticsLogger(
                    new CSVLogger(new String[]{"LegalHighestScore","LegalScoreAvg","LegalScoreDev"},
                        FOLDER+"knapsackproblem.csv" ) );

or an XLSLogger to record directly to an MS Excel .xls file

        xlslogge1 = new StatisticsLogger(
                    new XLSLogger(new String[]{"LegalHighestScore","LegalScoreAvg","LegalScoreDev"},
                        FOLDER+"knapsack1.log.xls" ) );

Sometimes it is useful to re-use a spreadsheet. In this case, we can specify which template to use in the XLSLogger constructor.

        xlslogge2 = new StatisticsLogger(
                    new XLSLogger(new String[]{"LegalHighestScore","LegalScoreAvg","IllegalScoreAvg"},
                        FOLDER+"knapsack2.log.xls", FOLDER+"knapsack.tpl.xls" ) );

The third method makes direct use of a real logger, for example

        xlslogge3 = new XLSLogger(new String[]{"LegalHighestScore","LegalScoreAvg","Run"},
                              FOLDER+"knapsack3.log.xls");

Schema

Loggers process tabular data with a given schema, i.e. a set of labelled fields. The data schema has to be given at construction time. For instance, in the first and second cases (csvlogger and xlslogge1) we are interested in recording LegalHighestScore, LegalScoreAvg and LegalScoreDev over the generations. In the third case (xlslogge2) we record LegalHighestScore, LegalScoreAvg and IllegalScoreAvg. Finally, in the fourth case (xlslogge3) we record LegalHighestScore, LegalScoreAvg and Run, the latter representing the experiment run number.

If we use StatisticsLogger, the fields have to be labelled the same way the statistics are annotated by Loggable( label ). If we use the real loggers directly, we can use any label we wish for the schema, as the mapping is then our responsibility, as described below.

Note that schema labels are case sensitive, so "someStatistic" is different from "SomeStatistic", and they are both different from "SOMESTATISTIC".
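To make the idea of a schema and its case-sensitive labels concrete, here is a minimal sketch in plain Java. SchemaRecord is a hypothetical class written for this illustration, not part of Jenes, and rejecting unknown labels with an exception is an assumption: the tutorial does not say how the Jenes loggers react to a label that is not in the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration, NOT a Jenes class: a record bound to a fixed schema.
final class SchemaRecord {

    // Field values keyed by label, kept in schema order.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    // The schema is fixed at construction time, as with the loggers above.
    SchemaRecord(String... labels) {
        for (String label : labels) {
            fields.put(label, null);
        }
    }

    // Labels are compared exactly, so "Run" and "run" are different fields.
    // Assumption: an unknown label is an error.
    void put(String label, Object value) {
        if (!fields.containsKey(label)) {
            throw new IllegalArgumentException("Unknown field: " + label);
        }
        fields.put(label, value);
    }

    // Renders the record as one comma-separated line; unset fields stay empty.
    String toCsvLine() {
        StringJoiner line = new StringJoiner(",");
        for (Object value : fields.values()) {
            line.add(value == null ? "" : value.toString());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        SchemaRecord record = new SchemaRecord("LegalHighestScore", "LegalScoreAvg", "Run");
        record.put("LegalHighestScore", 10.5);
        record.put("Run", 2);
        System.out.println(record.toCsvLine());
        try {
            record.put("legalhighestscore", 1.0); // wrong case: not in the schema
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch a label with the wrong case is simply a different string, which is why it is not found in the schema.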

Media

Recording on .csv files is straightforward. If the file exists, we can decide to overwrite data (default) or to append data.
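The difference between overwriting and appending can be sketched with the standard java.io classes alone. This is not how CSVLogger is implemented (the tutorial does not show that); CsvAppendDemo and its methods are hypothetical names used only for illustration.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, NOT part of Jenes: writes rows of numbers to a .csv file.
final class CsvAppendDemo {

    // Writes one comma-separated line per row.
    // append == false truncates an existing file, append == true adds to its end.
    static void writeRows(Path file, boolean append, double[][] rows) {
        try (PrintWriter out = new PrintWriter(new FileWriter(file.toFile(), append))) {
            for (double[] row : rows) {
                StringBuilder line = new StringBuilder();
                for (int i = 0; i < row.length; i++) {
                    if (i > 0) {
                        line.append(',');
                    }
                    line.append(row[i]);
                }
                out.println(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes two rows, then one more row in the given mode,
    // and returns how many lines the file holds afterwards.
    static int linesAfterSecondWrite(boolean append) {
        try {
            Path file = Files.createTempFile("knapsack-demo", ".csv");
            try {
                writeRows(file, false, new double[][] {{1.0, 2.0}, {3.0, 4.0}});
                writeRows(file, append, new double[][] {{5.0, 6.0}});
                return Files.readAllLines(file).size();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("overwrite: " + linesAfterSecondWrite(false) + " line(s)");
        System.out.println("append:    " + linesAfterSecondWrite(true) + " line(s)");
    }
}
```

When overwriting, only the rows of the last write survive; when appending, rows from earlier runs are kept, which is convenient for accumulating several experiments in one file.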

When we use .xls files we should take care of some details. When the log file does not exist, the logger creates an empty workbook with one sheet, where columns are allocated according to the schema and the first row holds the field labels.

We can decide to use a pre-made .xls file. In this case we need to reserve a column for each field in the schema. The column can be placed anywhere in the sheet, and it does not matter which sheet is used. The only constraint is that the cell at row 1 has to contain the field label. We can also decide to use an .xls file as a template. A template is nothing more than a pre-made file. In this case, the file is not overwritten, as the destination is different (see xlslogge2 for an example).

For example we can decide to use the template shown in Figure 1.


Figure 1 - Template (knapsack.tpl.xls) used in this tutorial.
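The rule that row 1 must contain the field labels suggests a simple lookup from label to column. The following sketch shows that idea on a plain array standing in for the first row of a sheet. It does not read a real .xls file and is not the XLSLogger implementation; HeaderLookup and locateColumns are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, NOT a Jenes class: maps schema labels to columns.
final class HeaderLookup {

    // Scans the first row for each schema label and returns label -> zero-based
    // column index. Labels that do not appear in the row are left out of the map.
    static Map<String, Integer> locateColumns(String[] firstRow, String[] schema) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        for (String label : schema) {
            for (int col = 0; col < firstRow.length; col++) {
                // Exact, case-sensitive match; empty (null) cells never match.
                if (label.equals(firstRow[col])) {
                    columns.put(label, col);
                    break;
                }
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Labels can sit in any column; other cells may hold notes or be empty.
        String[] firstRow = {"Generation", null, "LegalScoreAvg", "Notes", "LegalHighestScore"};
        String[] schema = {"LegalHighestScore", "LegalScoreAvg", "IllegalScoreAvg"};
        System.out.println(locateColumns(firstRow, schema));
    }
}
```

Because columns are found by label rather than by position, a template is free to arrange them in any order and to keep extra columns for charts or notes.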

Data Collection

Once the loggers are set up and the schema is defined, we can collect data. The best way to collect data is by means of listeners. For instance

        GenerationEventListener<BooleanChromosome> logger1 = new GenerationEventListener<BooleanChromosome>() {

            public void onGeneration(GeneticAlgorithm ga, long time) {
                Population.Statistics stats = ga.getCurrentPopulation().getStatistics();

                prb.csvlogger.record(stats);
                prb.xlslogge1.record(stats);
                prb.xlslogge2.record(stats);

            }

        };

In this case StatisticsLogger takes care of logging the data. We only need to provide a LoggableStatistics, and StatisticsLogger will do the job for us. The drawback of this solution is that we cannot add data that the statistics object does not provide. To gain finer control of the process, we can use a real logger directly. For instance

        GenerationEventListener<BooleanChromosome> logger2 = new GenerationEventListener<BooleanChromosome>() {
            public void onGeneration(GeneticAlgorithm ga, long time) {
                Population.Statistics stats = ga.getCurrentPopulation().getStatistics();

                prb.xlslogge3.put("LegalHighestScore", stats.getLegalHighestScore());
                prb.xlslogge3.put("LegalScoreAvg", stats.getLegalScoreAvg());
                prb.xlslogge3.put("Run", prb.exec);

                prb.xlslogge3.log();
            }

        };

In this case we build each log record ourselves: every call to put sets the value of one field, and log then writes the record.

Log closing

Finally we can make data persistent by closing the loggers.

        prb.csvlogger.close();
        prb.xlslogge1.close();
        prb.xlslogge2.close();

        prb.xlslogge3.close();
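Why closing matters can be illustrated with a standard buffered writer: data written to it may sit in memory until the writer is flushed or closed. The sketch below uses only java.nio and java.io. Whether the Jenes loggers buffer in the same way is an assumption; CloseDemo is a hypothetical name.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, NOT a Jenes class: buffered data reaches disk on close.
final class CloseDemo {

    // Writes one short line through a BufferedWriter and returns the file size
    // in bytes measured before and after the writer is closed.
    static long[] sizesAroundClose() {
        try {
            Path file = Files.createTempFile("close-demo", ".csv");
            try {
                long before;
                try (BufferedWriter out = Files.newBufferedWriter(file)) {
                    out.write("LegalHighestScore,LegalScoreAvg\n");
                    // The line is still in the writer's in-memory buffer here.
                    before = Files.size(file);
                } // try-with-resources closes the writer, flushing the buffer
                long after = Files.size(file);
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesAroundClose();
        System.out.println("bytes on disk before close: " + sizes[0]);
        System.out.println("bytes on disk after close:  " + sizes[1]);
    }
}
```

A try-with-resources block, as used here, is one way to make sure a writer is closed even when an exception interrupts the run.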


Output

The resulting output is shown in the following figures. See also the attachments below.



Figure 2 - The output provided by csvlogger

Figure 3 - The output provided by xlslogge1


Figure 4 - The output provided by xlslogge2

Figure 5 - The output provided by xlslogge3


Attachment: knapsackproblem.csv (2k), Luigi Troiano, Aug 16, 2009, 6:52 AM