M4 Competitor's Guide: Prizes and Rules

Contents

- The Prizes
  1. Three Major Prizes
  2. Student Prize
  3. Full Reproducibility Prize
  4. Prediction Intervals Prize
- Forecasting Horizons
- The dataset
- The Benchmarks
- Factors Affecting Forecasting Accuracy

The Prizes
There will be six Prizes awarded to the winners of the M4 Competition. The exact cash amounts to be granted (at present standing at €20,000, generously provided by the University of Nicosia) will depend on securing additional sponsors and will be announced later. Proportionally, the total amount granted will be distributed as follows:
| Prize | Description | Percentage (%) |
|---|---|---|
| 1st Prize | Best performing method according to OWA | 40 |
| 2nd Prize | Second-best performing method according to OWA | 20 |
| 3rd Prize | Third-best performing method according to OWA | 10 |
| Student Prize | Best performing method among student competitors according to OWA | 5 |
| Full Reproducibility Prize | Best fully reproducible method according to OWA | 5 |
| Prediction Intervals Prize | Best performing method according to MSIS | 20 |
There are no restrictions on collecting more than one prize.

1. Three Major Prizes
There will be three major Prizes for the First, Second, and Third winners of the competition, who will be selected based on the performance of the participating methods according to the Overall Weighted Average (OWA) of two accuracy measures: the Mean Absolute Scaled Error (MASE [1]) and the symmetric Mean Absolute Percentage Error (sMAPE [2]). The individual measures are calculated as follows:
$$\text{sMAPE} = \frac{1}{h}\sum_{t=1}^{h}\frac{2\,\lvert Y_t - \hat{Y}_t\rvert}{\lvert Y_t\rvert + \lvert \hat{Y}_t\rvert}$$

$$\text{MASE} = \frac{\dfrac{1}{h}\sum_{t=1}^{h}\lvert Y_t - \hat{Y}_t\rvert}{\dfrac{1}{n-m}\sum_{t=m+1}^{n}\lvert Y_t - Y_{t-m}\rvert}$$

where $Y_t$ is the post-sample value of the time series at point $t$, $\hat{Y}_t$ the estimated forecast, $h$ the forecasting horizon, and $m$ the frequency of the data (i.e., 12 for monthly series).
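To make the definitions concrete, here is a minimal Python sketch of the two measures, assuming NumPy. It is an illustration only: the function names `smape` and `mase` are ours, not the organizers' official code (which will be posted on GitHub).

```python
import numpy as np

def smape(y_true, y_pred):
    """sMAPE over the forecasting horizon h = len(y_true).
    Multiply by 100 to express it in percent, as in the table below."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(2.0 * np.abs(y_true - y_pred) /
                   (np.abs(y_true) + np.abs(y_pred)))

def mase(insample, y_true, y_pred, m):
    """MASE: out-of-sample MAE scaled by the in-sample MAE of the
    seasonal naive forecast (lag m, e.g. m = 12 for monthly data)."""
    insample = np.asarray(insample, dtype=float)
    # (1 / (n - m)) * sum_{t=m+1}^{n} |Y_t - Y_{t-m}|
    scale = np.mean(np.abs(insample[m:] - insample[:-m]))
    errors = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    return np.mean(errors) / scale
```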
An example of computing the OWA is presented below, using the MASE and sMAPE of the M3 Competition methods:

- Divide all errors by those of Naïve 2 to obtain the Relative MASE and the Relative sMAPE.
- Compute the OWA by averaging the Relative MASE and the Relative sMAPE, as shown in the table below.

[1] R. J. Hyndman and A. B. Koehler (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), 679-688.
[2] S. Makridakis and M. Hibon (2000). The M3-Competition: results, conclusions and implications. International Journal of Forecasting, 16(4), 451-476.


| Forecasting Method | MASE | Rank (MASE) | Relative MASE | sMAPE | Rank (sMAPE) | Relative sMAPE | OWA | Rank (OWA) |
|---|---|---|---|---|---|---|---|---|
| THETA | 1.395 | 1 | 0.827 | 12.762 | 1 | 0.840 | 0.834 | 1 |
| ForecastPro | 1.422 | 2 | 0.844 | 13.088 | 3 | 0.861 | 0.852 | 2 |
| ForcX | 1.441 | 3 | 0.855 | 13.130 | 4 | 0.864 | 0.859 | 3 |
| Comb S-H-D | 1.467 | 6 | 0.870 | 13.056 | 2 | 0.859 | 0.865 | 4 |
| DAMPEN | 1.466 | 5 | 0.870 | 13.279 | 5 | 0.874 | 0.872 | 5 |
| AutoBox2 | 1.484 | 7 | 0.881 | 13.284 | 6 | 0.874 | 0.877 | 6 |
| PP-Autocast | 1.523 | 10 | 0.904 | 13.600 | 7 | 0.895 | 0.899 | 7 |
| HOLT | 1.507 | 8 | 0.894 | 13.777 | 9 | 0.906 | 0.900 | 8 |
| B-J auto | 1.512 | 9 | 0.897 | 13.819 | 10 | 0.909 | 0.903 | 9 |
| WINTER | 1.544 | 15 | 0.916 | 13.719 | 8 | 0.903 | 0.909 | 10 |
| Auto-ANN | 1.530 | 11 | 0.908 | 13.921 | 12 | 0.916 | 0.912 | 11 |
| ARARMA | 1.531 | 12 | 0.909 | 13.981 | 14 | 0.920 | 0.914 | 12 |
| Flors-Pearc1 | 1.549 | 16 | 0.919 | 13.963 | 13 | 0.919 | 0.919 | 13 |
| ROBUSTTrend | 1.537 | 13 | 0.912 | 14.098 | 15 | 0.927 | 0.920 | 14 |
| SMARTFCS | 1.457 | 4 | 0.864 | 15.390 | 21 | 1.012 | 0.938 | 15 |
| AutoBox3 | 1.633 | 19 | 0.969 | 13.913 | 11 | 0.915 | 0.942 | 16 |
| THETAsm | 1.594 | 18 | 0.946 | 14.286 | 16 | 0.940 | 0.943 | 17 |
| AutoBox1 | 1.540 | 14 | 0.914 | 14.843 | 18 | 0.976 | 0.945 | 18 |
| RBF | 1.574 | 17 | 0.934 | 15.464 | 22 | 1.017 | 0.976 | 19 |
| Flors-Pearc2 | 1.665 | 21 | 0.988 | 14.742 | 17 | 0.970 | 0.979 | 20 |
| Single | 1.659 | 20 | 0.985 | 14.881 | 19 | 0.979 | 0.982 | 21 |
| Naïve 2 | 1.685 | 22 | 1.000 | 15.201 | 20 | 1.000 | 1.000 | 22 |
| Naïve 1 | 1.787 | 23 | 1.060 | 15.701 | 23 | 1.033 | 1.047 | 23 |

Note that MASE and sMAPE are first estimated per series, by averaging the errors estimated per forecasting horizon, and then averaged again across the 3003 time series to compute their value for the whole dataset. OWA, on the other hand, is computed only once, at the end, for the whole sample, as shown in the table above.
In the above example, the most accurate method, with the smallest OWA, is Theta, which would have won the first prize; the second most accurate is ForecastPro, which would have won the second prize; and the third most accurate is ForcX, which would have won the third prize.
The code for computing the OWA will be available on GitHub.
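Until that code is posted, a minimal sketch of the computation described above might look as follows. The function name `owa` and its signature are illustrative assumptions, not the official implementation:

```python
def owa(mase_method, smape_method, mase_naive2, smape_naive2):
    """Overall Weighted Average: the mean of MASE and sMAPE, each made
    relative to Naive 2, computed once over the whole sample."""
    rel_mase = mase_method / mase_naive2    # e.g. 1.395 / 1.685 = 0.827
    rel_smape = smape_method / smape_naive2  # e.g. 12.762 / 15.201 = 0.840
    return (rel_mase + rel_smape) / 2.0

# Reproducing the THETA row of the table above:
# owa(1.395, 12.762, 1.685, 15.201)  ->  (0.827 + 0.840) / 2 = 0.834
```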

2. Student Prize
A prize will be awarded to the student submitting the best performing method according to OWA.


3. Full Reproducibility Prize
The prerequisite for the Full Reproducibility Prize is that the code used for generating the forecasts (with the exception of companies providing forecasting services and those claiming proprietary software) be put on GitHub no later than 10 days after the end of the competition (i.e., by the 10th of June, 2018). In addition, there must be instructions on how to exactly reproduce the submitted M4 forecasts. In this regard, individuals and companies will be able to use the code and the instructions provided, crediting the person/group that developed them, to improve their organizational forecasts.
Companies providing forecasting services and those claiming proprietary software will instead have to provide the organizers with a detailed description of how their forecasts were made and a source or execution file for reproducing their forecasts for 100 randomly selected series. Given the critical importance of objectivity and replicability, such a description and file will be mandatory for participating in the Competition. An execution file can be submitted in case the source program needs to be kept confidential, or, alternatively, a source program with a termination date for running it.
The code for reproducing the results of the 4Theta method, submitted by the Forecasting & Strategy Unit,
will be put on GitHub before 27-12-2017. This method will not be considered for any of the Prizes.

4. Prediction Intervals Prize
The M4 Competition adopts a 95% Prediction Interval (PI) for estimating the uncertainty around the point forecasts. The performance of the generated PIs will be evaluated using the Mean Scaled Interval Score (MSIS [3]), computed as follows:
$$\text{MSIS} = \frac{\dfrac{1}{h}\sum_{t=1}^{h}\left[(U_t - L_t) + \dfrac{2}{a}(L_t - Y_t)\,\mathbf{1}\{Y_t < L_t\} + \dfrac{2}{a}(Y_t - U_t)\,\mathbf{1}\{Y_t > U_t\}\right]}{\dfrac{1}{n-m}\sum_{t=m+1}^{n}\lvert Y_t - Y_{t-m}\rvert}$$
where $L_t$ and $U_t$ are the lower and upper bounds of the prediction interval, $Y_t$ are the future observations of the series, $a$ is the significance level, and $\mathbf{1}$ is the indicator function (being 1 if $Y_t$ is outside the postulated interval and 0 otherwise). Given that forecasters will be asked to generate 95% prediction intervals, $a$ is set to 0.05.
An example of computing the MSIS is presented below, using the prediction intervals generated by two different methods for 18-step-ahead forecasts (a code sketch follows the table):

- A penalty is calculated for each method at the points where the future values fall outside the specified bounds.
- The width of the prediction interval is added to the penalty, if any, to obtain the Interval Score (IS).
- The IS values estimated at the individual points are averaged to obtain the MIS value.
- The MIS is scaled by dividing it by the mean absolute seasonal difference of the series (here 200).
- After estimating the MSIS for all the M4 Competition series, its average value is computed to evaluate the overall performance of the method.

[3] T. Gneiting and A. E. Raftery (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102(477), 359-378.


| Forecasting Horizon | L1 | U1 | L2 | U2 | Y | Penalty1 | Penalty2 | IS1 | IS2 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 289 | 938 | 297 | 865 | 654 | 0 | 0 | 649 | 568 |
| 2 | 266 | 923 | 304 | 873 | 492 | 0 | 0 | 657 | 569 |
| 3 | 313 | 992 | 312 | 880 | 171 | 5680 | 5640 | 6359 | 6208 |
| 4 | 238 | 949 | 319 | 888 | 342 | 0 | 0 | 711 | 569 |
| 5 | 224 | 1008 | 327 | 895 | 591 | 0 | 0 | 784 | 568 |
| 6 | 209 | 1014 | 334 | 903 | 672 | 0 | 0 | 805 | 569 |
| 7 | 206 | 1040 | 342 | 910 | 465 | 0 | 0 | 834 | 568 |
| 8 | 175 | 1041 | 349 | 918 | 255 | 0 | 3760 | 866 | 4329 |
| 9 | 164 | 1067 | 357 | 926 | 864 | 0 | 0 | 903 | 569 |
| 10 | 150 | 1078 | 364 | 933 | 768 | 0 | 0 | 928 | 569 |
| 11 | 138 | 1094 | 372 | 941 | 672 | 0 | 0 | 956 | 569 |
| 12 | 120 | 1104 | 379 | 948 | 519 | 0 | 0 | 984 | 569 |
| 13 | 109 | 1121 | 387 | 956 | 519 | 0 | 0 | 1012 | 569 |
| 14 | 96 | 1133 | 395 | 963 | 591 | 0 | 0 | 1037 | 568 |
| 15 | 83 | 1146 | 402 | 971 | 480 | 0 | 0 | 1063 | 569 |
| 16 | 70 | 1157 | 410 | 978 | 564 | 0 | 0 | 1087 | 568 |
| 17 | 58 | 1170 | 417 | 986 | 579 | 0 | 0 | 1112 | 569 |
| 18 | 46 | 1182 | 425 | 993 | 423 | 0 | 80 | 1136 | 648 |
| MIS | | | | | | | | 1216 | 1095 |
| MSIS | | | | | | | | 6.08 | 5.48 |
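As a companion to the table, here is a minimal Python sketch of the MSIS computation following the formula above, assuming NumPy. The function name and signature are illustrative, not the official code:

```python
import numpy as np

def msis(insample, y_true, lower, upper, m, a=0.05):
    """Mean Scaled Interval Score for a (1 - a) prediction interval."""
    y, L, U = (np.asarray(v, dtype=float) for v in (y_true, lower, upper))
    width = U - L
    # Penalty of 2/a per unit whenever the observation falls outside the PI
    penalty = (2.0 / a) * ((L - y) * (y < L) + (y - U) * (y > U))
    mis = np.mean(width + penalty)               # Mean Interval Score
    insample = np.asarray(insample, dtype=float)
    scale = np.mean(np.abs(insample[m:] - insample[:-m]))  # seasonal naive MAE
    return mis / scale
```

In the table above the mean absolute seasonal difference is 200, so Method 1's MIS of 1216 gives MSIS = 1216 / 200 = 6.08, and Method 2's MIS of 1095 gives 5.48.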

Forecasting Horizons
The number of forecasts required from each method is 6 for yearly data, 8 for quarterly, 18 for monthly, 13 for weekly, 14 for daily, and 48 for hourly. The two accuracy measures (MASE and sMAPE) are computed for each horizon separately and then combined, in a weighted fashion, to cover all horizons together.
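For implementers, the horizons can be summarized as a simple lookup; this mapping is a convenience for code, not part of the official rules:

```python
# Forecasting horizons per data frequency, as listed above.
HORIZONS = {"yearly": 6, "quarterly": 8, "monthly": 18,
            "weekly": 13, "daily": 14, "hourly": 48}
```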

The dataset
The 100,000 series will be selected randomly from a database of 900,000 series on December 28, 2017. Professor Makridakis will select the seed number for generating the random sample used to select the 100,000 series of the M4 Competition. Some pre-defined filters will be applied beforehand to achieve desired characteristics, such as the length of the series; the percentage of yearly, quarterly, monthly, weekly, daily, and hourly data; and their type (micro, macro, finance, industry, demographic, other).
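As an illustration only, a seeded, reproducible draw of this kind could look like the sketch below. The function and parameter names are hypothetical; the organizers' actual procedure, filters, and seed are their own:

```python
import random

def select_series(candidate_ids, seed, k=100_000):
    """Reproducibly draw k series IDs from the pre-filtered candidate pool."""
    rng = random.Random(seed)   # a fixed seed makes the draw repeatable
    return rng.sample(candidate_ids, k)

# e.g. select_series(list(range(900_000)), seed=42)  # seed value hypothetical
```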


The Benchmarks
There will be ten benchmark methods: eight used in the M3 Competition and two extra ones based on ML concepts. As these methods are well known, readily available, and straightforward to apply, the new methods proposed in the M4 Competition must provide superior accuracy in order to be adopted and used in practice (taking also into account the computational time that would be required to utilize a more accurate method versus the benchmarks, whose computational requirements are minimal).
1. Naïve 1: $F_{t+i} = Y_t$, $i = 1, 2, 3, \ldots, m$.
2. Seasonal Naïve: forecasts are equal to the last known observation of the same period.
3. Naïve 2: like Naïve 1, but the data are seasonally adjusted, if needed, by applying classical multiplicative decomposition (R stats package). A 90% autocorrelation test is performed, when using the R package, to decide whether the data are seasonal.
4. Simple Exponential Smoothing (S): ses() function from v8.2 of the forecast package for R. Seasonality is handled as in Naïve 2.
5. Holt's Exponential Smoothing (H): holt() function from v8.2 of the forecast package for R. Seasonality is handled as in Naïve 2.
6. Damped Exponential Smoothing (D): holt() function from v8.2 of the forecast package for R. Seasonality is handled as in Naïve 2.
7. Comb S-H-D: the arithmetic average of methods 4, 5, and 6 (a rough sketch of these recursions follows the note below).
8. Theta: as applied to the M3 Competition data (θ = 2, seasonal adjustments as in Naïve 2, and SES applied using the ses() function from v8.2 of the forecast package for R).
9. MLP: a perceptron of a very basic architecture and parameterization (developed in Python using the scikit-learn library v0.19.1; available on GitHub).
10. RNN: a recurrent network of a very basic architecture and parameterization (developed in Python using the Keras v2.0.9 and TensorFlow v1.4.0 libraries; available on GitHub).

The code for generating the forecasts of the benchmarks mentioned above will be available on GitHub. Note that the benchmarks are not eligible for a prize, meaning that the total amount of the prizes will be distributed among the competing participants even if some benchmarks perform better than the forecasts submitted by the participants.
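The guide points to R's forecast package for methods 4-6; purely as an illustration, here is a rough Python sketch of the underlying recursions with fixed smoothing parameters. The real benchmarks optimize these parameters and apply the Naïve 2 seasonal adjustment first, so this is a sketch under stated assumptions, not the organizers' code:

```python
def ses(y, h, alpha=0.3):
    """Simple exponential smoothing (S); flat h-step-ahead forecast."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return [level] * h

def holt(y, h, alpha=0.3, beta=0.1, phi=1.0):
    """Holt's linear trend (H); phi < 1 gives the damped (D) variant."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    # h-step forecasts accumulate phi + phi^2 + ... + phi^i on the trend
    return [level + sum(phi ** j for j in range(1, i + 1)) * trend
            for i in range(1, h + 1)]

def comb_shd(y, h):
    """Comb S-H-D: arithmetic average of the S, H, and D forecasts."""
    forecasts = zip(ses(y, h), holt(y, h), holt(y, h, phi=0.9))
    return [(s + ho + d) / 3.0 for s, ho, d in forecasts]
```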

Factors Affecting Forecasting Accuracy
The M4 will provide a unique opportunity to identify the factors affecting forecasting accuracy. Having 100,000 series, with an average of 12 forecasts for each, more than 100 forecasting methods, and 2 accuracy measures would result in about a quarter of a billion data points (100,000 × 12 × 100 × 2 = 240 million). Data analytics will be applied to discover patterns and relationships, exploiting the findings to enrich our understanding of forecasting accuracy and the factors that affect it.
