Monday, August 13, 2012

F#: use "Find and Replace" to speed up Monte Carlo Simulation

In the previous post, I implemented a Monte Carlo pricer using List in F#. If I would like to improve the performance of the pricer, what could I do? One may suggest the following solutions:

  • (Computer Science Solution 1) Given that nowadays almost every machine (or even a mobile phone) is a multicore and there is no dependency between simulation runs, let's run them in parallel. 
  • (Computer Science Solution 2)  Profile the code, identify the bottleneck, figure out asymptotic time complexity, and then redesign the steps in the algorithm accordingly.
  • (Stochastic Process Solution) Apply variance reduction techniques so that I can achieve the same level of simulation accuracy with less runs.
  • (Numerical Solution?) pre-compute as much as possible, like (r - 0.5*(vol**2.0))*T or 2.0*Math.PI...etc, so that I don't have to repeatedly compute the same constants.
  • (hmm... Finance Solution perhaps) Simply use money to resolve this issue, i.e., buy another machine with faster processors.
Well, all solutions above sound promising, but I actually took another even simpler approach. I just use the "Find and Replace" function in the Visual Studio code editor to replace all the "List" occurrences in my code with "Array" (and I tried "Seq" also). And then measure the performances of the 3 different versions: the original "List" version and the two new versions: "Array" and "Seq".

The results are as follows:
version
option price
CPU time (ms)
absolute time (ms)
List
10.118465
5,335
5,467
Array
10.118465
1,606
1,639
Sequence
10.118465
7,612
7,631
( Please note that optimization is turned on while compiling the 3 versions. The code of the Array version is attached at the bottome of this post.)

All the 3 option prices are the same because they use the same seed, 1, to generate random numbers. As you can see, the performance improvement is obviously worth the effort of "Find and Replace". Now the new "Array" version can finish the same computations within less than 1/3 of the time needed by the original "List" version.

The fact that Array is the fastest and Sequence is the slowest is not surprising to me. And I don't bother to find out why Sequence took so much time relatively at this moment, because the way I implemented the code doesn't really need its main strengths: lazy evaluation and ability to model infinite series. However, what I found intriguing is why Array is so fast relative to List.

What I learnt from some F# introductory books is that Array is efficient for random access, which is also where List tends to perform horribly given its singly-linked nature. Though, in our Monte Carlo computations, we walk through a list or an array in a sequential manner only. As such, I don't think it's the efficiency of random access to cause the performance difference I'm looking at.

After running a few simple tests on List and Array using F# Interactive as described below, it seems to me that in our case what really makes the difference is creation of a data structure rather than access to individual elements in the data structure. As you can see from below, it took more than 2 seconds to initialize a list of 10,000,000 floats, while the same action with an array took less than 0.1 seconds. Therefore, in terms of creation or allocation, arrays perform much faster than lists. In terms of iterating over the 10,000,000 floats, although the result below shows that the list was even slightly faster than the array, the difference is immaterial as far as our case is concerned.

Please note that by this post I am not saying that arrays are always superior than lists. At least we should remember that lists have execellent constant time complexity in stack-style operations, which I don't use in my Monte Carlo code.

Based on the finding that creation of a new list can be costly, I will be updating the code of the "List" version accordingly in next post.


//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//// Here are the simple tests on Array and List using F# Interactive
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Firstly, I turn on the timing function:
> #time;;
--> Timing now on

Secondly, I create a list of 10,000,000 floats and sum up using the fold function:
> let X = List.init 10000000 (fun i -> float i);;
Real: 00:00:02.118, CPU: 00:00:02.246, GC gen0: 38, gen1: 20, gen2: 2


val X : float list =
  [0.0; 1.0; 2.0; 3.0; 4.0; 5.0; 6.0; 7.0; 8.0; 9.0; 10.0; 11.0; 12.0; 13.0;
   14.0; 15.0; 16.0; 17.0; 18.0; 19.0; 20.0; 21.0; 22.0; 23.0; 24.0; 25.0;
   26.0; 27.0; 28.0; 29.0; 30.0; 31.0; 32.0; 33.0; 34.0; 35.0; 36.0; 37.0;
   38.0; 39.0; 40.0; 41.0; 42.0; 43.0; 44.0; 45.0; 46.0; 47.0; 48.0; 49.0;
   50.0; 51.0; 52.0; 53.0; 54.0; 55.0; 56.0; 57.0; 58.0; 59.0; 60.0; 61.0;
   62.0; 63.0; 64.0; 65.0; 66.0; 67.0; 68.0; 69.0; 70.0; 71.0; 72.0; 73.0;
   74.0; 75.0; 76.0; 77.0; 78.0; 79.0; 80.0; 81.0; 82.0; 83.0; 84.0; 85.0;
   86.0; 87.0; 88.0; 89.0; 90.0; 91.0; 92.0; 93.0; 94.0; 95.0; 96.0; 97.0;
   98.0; 99.0; ...]


> let sum_X = List.fold (fun s x -> s + x) 0.0 X;;
Real: 00:00:00.062, CPU: 00:00:00.062, GC gen0: 0, gen1: 0, gen2: 0


val sum_X : float = 4.9999995e+13

Finally, I do the same things with Array:
> let Y = Array.init 10000000 (fun i -> float i);;
Real: 00:00:00.076, CPU: 00:00:00.078, GC gen0: 0, gen1: 0, gen2: 0


val Y : float [] =
  [|0.0; 1.0; 2.0; 3.0; 4.0; 5.0; 6.0; 7.0; 8.0; 9.0; 10.0; 11.0; 12.0; 13.0;
    14.0; 15.0; 16.0; 17.0; 18.0; 19.0; 20.0; 21.0; 22.0; 23.0; 24.0; 25.0;
    26.0; 27.0; 28.0; 29.0; 30.0; 31.0; 32.0; 33.0; 34.0; 35.0; 36.0; 37.0;
    38.0; 39.0; 40.0; 41.0; 42.0; 43.0; 44.0; 45.0; 46.0; 47.0; 48.0; 49.0;
    50.0; 51.0; 52.0; 53.0; 54.0; 55.0; 56.0; 57.0; 58.0; 59.0; 60.0; 61.0;
    62.0; 63.0; 64.0; 65.0; 66.0; 67.0; 68.0; 69.0; 70.0; 71.0; 72.0; 73.0;
    74.0; 75.0; 76.0; 77.0; 78.0; 79.0; 80.0; 81.0; 82.0; 83.0; 84.0; 85.0;
    86.0; 87.0; 88.0; 89.0; 90.0; 91.0; 92.0; 93.0; 94.0; 95.0; 96.0; 97.0;
    98.0; 99.0; ...|]


> let sum_Y = Array.fold (fun s x -> s + x) 0.0 Y;;
Real: 00:00:00.107, CPU: 00:00:00.109, GC gen0: 0, gen1: 0, gen2: 0


val sum_Y : float = 4.9999995e+13



//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//// The Array version of Monte Calro Simulation
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
open System
open System.Diagnostics


let N = 10000000
let rnd = new Random(1)    
let randomPairs = Array.init N (fun i -> (rnd.NextDouble(), rnd.NextDouble()))
let box_muller_transform (u1,u2) = sqrt(-2.0*log(u1)) * cos(2.0*Math.PI*u2)
let randomNormals = randomPairs |> Array.map box_muller_transform


let call_payout K S = max (S-K) 0.0

let European_Option_Price (payout:float->float) S0 r vol T =
    let S_T x = S0 * exp((r - 0.5*(vol**2.0))*T + vol*sqrt(T)*x)
    randomNormals |> Array.map S_T |> Array.map payout |> Array.average
                  |> (*) (exp(-r*T))


let S0 = 50.0
let K = 40.0
let r = 0.01
let vol = 0.2
let T = 0.25
let price() =
    let price = European_Option_Price (call_payout K) S0 r vol T
    printfn "option_price = %f" price


let time f =
    let proc = Process.GetCurrentProcess()
    let cpu_time_stamp = proc.TotalProcessorTime
    let timer = new Stopwatch()
    timer.Start()
    try
        f()
    finally
        let cpu_time = (proc.TotalProcessorTime-cpu_time_stamp).TotalMilliseconds
        printfn "CPU time = %dms" (int64 cpu_time)
        printfn "Absolute time = %dms" timer.ElapsedMilliseconds


time price

2 comments:

  1. No mystery there! I've been systematically investigating F# data structure performance. See my latest article in the series

    ReplyDelete
    Replies
    1. Thanks for the info. I will definitely take a look.

      Delete