Friday, December 30, 2011

Matlab vs Python vs Ruby - which one is really faster?

For open source programmers who want to write highly readable and reusable programs, the first choice is Python. It is completely free, with a rich set of libraries and a tool-chain for almost everything you can imagine. However, we should find out how efficient Python is when used for writing computational programs.
In this type of program, we typically deal with giant "for" and "while" loops over repetitive numerical operations such as addition and multiplication. Therefore, the efficiency of the program is limited by the efficiency of the "for" loop and, in particular, by the programming language that implements the "for" loop for us.
Here we compare Python with Matlab and Ruby, which are also commonly used. To measure the built-in overhead of a "for" loop, we write the following program in Matlab:

tic
for i = 1:100000000
    
end
out = toc

A similar program in Python is written as


# File: time-example-5.py
import time
# measure elapsed (wall-clock) time; time.time() does not measure CPU/process time
t0 = time.time()
for i in range(0,100000000):
    pass
print time.time() - t0, "seconds process time"

and a similar program in Ruby is

now = Time.now
for i in (1..100000000)
end
print Time.now-now

When we run these programs, we obtain the following timings.

                  OUTPUT OF RUBY
--------------------------------------------------------------------
ruby 1.8.7 (2010-08-16 patchlevel 302) [i686-linux]
run1 = 5.970745 seconds
run2 = 6.174075 seconds
run3 = 6.117122 seconds
run4 = 6.028899 seconds
run5 = 6.195276 seconds
--------------------------------------------------------------------

                  OUTPUT OF PYTHON
--------------------------------------------------------------------
Python 2.7.1+ linux32
run1 = 6.72360992432 seconds process time
run2 = 6.64303207397 seconds process time
run3 = 6.67376494408 seconds process time
run4 = 6.68509507179 seconds process time
run5 = 6.83553600311 seconds process time
--------------------------------------------------------------------

                  OUTPUT OF MATLAB
--------------------------------------------------------------------
Matlab 2010 + linux32
run1 = 0.2635 seconds 
run2 = 0.2625 seconds 
run3 = 0.2881 seconds 
run4 = 0.2595 seconds 
run5 = 0.2790 seconds 
--------------------------------------------------------------------
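
As a quick check of the figures reported above, the following sketch averages the five runs of each language and computes the ratio of each mean to the Matlab mean. The numbers are copied from the listings; the variable names are only illustrative, and the snippet assumes Python 3.

```python
# Timings (seconds) copied from the run listings above.
runs = {
    "Ruby": [5.970745, 6.174075, 6.117122, 6.028899, 6.195276],
    "Python": [6.72360992432, 6.64303207397, 6.67376494408,
               6.68509507179, 6.83553600311],
    "Matlab": [0.2635, 0.2625, 0.2881, 0.2595, 0.2790],
}

# Arithmetic mean of the five runs for each language.
means = {name: sum(times) / len(times) for name, times in runs.items()}

# How many times slower each language is than Matlab on this loop.
ratios = {name: means[name] / means["Matlab"] for name in means}

for name in ("Ruby", "Python", "Matlab"):
    print("%-6s mean = %.4f s, ratio to Matlab = %.1f"
          % (name, means[name], ratios[name]))
```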

Interestingly, Matlab runs the loop in about 0.27 seconds on average, which is roughly 23 times faster than the Ruby version and 25 times faster than the Python version.
Even when we use xrange() in the Python script to improve performance, the code runs in about 2 seconds, which is still nearly one order of magnitude slower than the Matlab code.
We conclude that Matlab's performance in this "for" loop benchmark is much better than that of Python and Ruby.
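
For readers who want to repeat the Python measurement, here is a minimal sketch of a small timing harness. It assumes Python 3, where range() is already lazy like the xrange() of Python 2, and it uses a smaller iteration count than the post so that it finishes quickly. The function name time_empty_loop and its parameters are hypothetical, chosen only for this illustration.

```python
import time

def time_empty_loop(n, repeats=5):
    """Return the wall-clock times (seconds) of `repeats` empty loops of n iterations."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _i in range(n):
            pass  # empty body: only the loop overhead is measured
        timings.append(time.perf_counter() - start)
    return timings

# 10**6 iterations here; the post uses 10**8.
timings = time_empty_loop(1000000)
print("best %.4f s, mean %.4f s" % (min(timings), sum(timings) / len(timings)))
```

Reporting both the best and the mean time gives some feeling for how much the runs fluctuate on a given machine.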




Monday, December 19, 2011

How to read arbitrary mathematical formula from input text file into C/C++?

It would be extremely useful if we could evaluate an arbitrary formula directly in C/C++ code without hard-coding it. In other words, how can we read and evaluate a mathematical formula from an input file in a C/C++ program?
This is a classical problem in computational science/engineering, where a formula often has to be processed in a giant computational code (like the ones written to analyze flow over missiles) without modifying and recompiling the code from scratch. For example, if the values of the boundary conditions are time varying, then we need a time-varying formula for the boundaries, which might change from application to application. Therefore, we would like to have an input file for this kind of boundary which contains the required boundary condition for each specific application/problem.

There are several ways to do this. The first that comes to mind is to write an expression-processing program which reads the mathematical formula from the input file and evaluates it in the C/C++ code. However, as you might guess, this is the hardest way to do it.
To solve this old problem, we use a blend of C and Python. Here I make the file 'bn_file.bn':
   
1         x*x
2         math.sin(x**2)
The first column is just an index, which might be anything, and the second column is the formula which we want to evaluate in C code for different values of 'x'. Note that we are free to change x*x to anything without changing our C code.
Here I write a Python script. It reads 'bn_file.bn' and evaluates each formula for the value of 'x', which we will set in the C++ code that follows.

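#this is input_formula.py
#NOTE 'x' is set by the calling C++ code before this script is run
import math
bn_fomula_dic = dict()
#read all lines and close the file automatically
with open('bn_file.bn','r') as fid:
    lines = fid.readlines()
for l in lines:
    l = l.split()
    #keep only lines of the form "<index> <formula>" (formula written without spaces)
    if len(l) == 2:
        bn_fomula_dic[l[0]] = str(eval(l[1]))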
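
Since eval() executes whatever text is found in the input file, it may be worth limiting the names that a formula can see. The following is a minimal sketch of that idea, assuming Python 3; the function name evaluate_formulas is hypothetical and is not part of the script above. Emptying `__builtins__` only reduces the exposed names and should not be regarded as a real security sandbox.

```python
import math

def evaluate_formulas(lines, x):
    """Evaluate "<index> <formula>" lines for a given x; return {index: value}."""
    allowed = {"math": math, "x": x}  # the only names a formula may use
    results = {}
    for line in lines:
        # Split on the first run of whitespace only, so that a formula
        # containing spaces (e.g. "x * x") is still accepted.
        parts = line.split(None, 1)
        if len(parts) == 2:
            index, formula = parts
            results[index] = eval(formula.strip(), {"__builtins__": {}}, allowed)
    return results

# Same content as 'bn_file.bn', given here as a list of lines.
sample = ["1         x*x", "2         math.sin(x**2)"]
values = evaluate_formulas(sample, 0.5)
print(values["1"], values["2"])
```

Passing 'x' explicitly, instead of relying on a global variable set by the caller, also makes the function easy to test on its own.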

And here is the C++ code, which sets the value of 'x', calls the Python script to read/evaluate the formula from the input file, and prints the output.

//This is file py_cpp_formula.cpp
#include <Python.h>
#include <cstdio>
#include <iostream>
#include <string>

// Read arbitrary mathematical formula from input text file into your C/C++
// code and evaluate it directly in your C/C++ code! 
// By: Arash Ghasemi (ghasemi.arash@gmail.com)
// Thanks to KOICHI TAMURA'S BLOG
// http://koichitamura.blogspot.com/2008/06/this-is-small-python-capi-tutorial.html
// Copyright 2011 by the author

int main(int argc, char *argv[])
{
  int i;
  FILE *fid = NULL;
  //PyObject pointers required for interfacing
  PyObject *po_main = NULL, *po_dict = NULL, *po_value = NULL;
  const char *filename = "input_formula.py"; //python code filename
  char input_buff[32];
  double x = 0.0;

  Py_Initialize(); //starting Python environment

  for ( i = 0; i < 10; i++ ) //evaluate the given formula ten times
    {
      // setting the input to python code using a simple string command
      sprintf(input_buff,"x = %f" , x);
      PyRun_SimpleString(input_buff);
      // opening the file containing python code,
      // NOTE IT MUST BE OPENED AND CLOSED BEFORE/AFTER EACH PYTHON RUN QUERY
      fid = fopen(filename, "r");
      if (fid == NULL)
        {
          std::cerr << "cannot open " << filename << std::endl;
          break;
        }
      //running the Python code
      PyRun_SimpleFile(fid, filename);
      fclose(fid);
      //interfacing section, first get the main in python and vars associated with it
      po_main = PyImport_ImportModule("__main__"); //new reference
      //then attach "bn_fomula_dic" dictionary to "po_dict" (new reference)
      po_dict = (po_main != NULL) ? PyObject_GetAttrString(po_main, "bn_fomula_dic") : NULL;
      //then find the value corresponding to keyword '2'.
      //NOTE PyDict_GetItemString returns a BORROWED reference, so it must not be Py_DECREF'ed
      po_value = (po_dict != NULL && PyDict_Check(po_dict)) ? PyDict_GetItemString(po_dict, "2") : NULL;
      if (po_value != NULL && PyString_Check(po_value))
        {
          std::string valstr = PyString_AsString(po_value);
          std::cout << "for x = " << x << ", the value of formula is : " << valstr << std::endl;
        }
      else if (PyErr_Occurred())
        {
          PyErr_Print(); //report and clear any pending Python error
        }
      //release the references obtained in this iteration
      Py_XDECREF(po_dict);
      Py_XDECREF(po_main);
      //incrementing 'x'
      x += .1;
    } //next i

  Py_Finalize(); //finalize Python session
  return (0); //go back to OS
}
All three files, 'py_cpp_formula.cpp', 'input_formula.py' and 'bn_file.bn', should be in the same directory. To compile the C++ code I used icc; however, the same thing can be done with gcc.

icc -Wall -O3 py_cpp_formula.cpp -I/usr/include/python2.7/ -lpython2.7 -lstdc++

In the C++ code, we first initialize a Python environment using Py_Initialize(). Then we set the value of 'x' (which varies continuously inside the loop) and pass it to the Python environment using the following lines

      sprintf(input_buff,"x = %f" , x);
      PyRun_SimpleString(input_buff);
After this, the value of 'x' exists in the Python environment. Now we need to run the Python script to read the formula and evaluate it for the given 'x'. We first open the file containing the Python script, run it, and then close it. This sequence is done in the following code

      fid = fopen(filename, "r");
      //running the Python code
      PyRun_SimpleFile(fid, filename);
      fclose(fid);

Finally, the evaluated value of the second formula, i.e. y = sin(x^2), is looked up in the generated Python dictionary "bn_fomula_dic" and printed. The output of the program is shown here:
for x = 0, the value of formula is : 0.0
for x = 0.1, the value of formula is : 0.00999983333417
for x = 0.2, the value of formula is : 0.0399893341866
for x = 0.3, the value of formula is : 0.089878549198
for x = 0.4, the value of formula is : 0.159318206614
for x = 0.5, the value of formula is : 0.247403959255
for x = 0.6, the value of formula is : 0.352274233275
for x = 0.7, the value of formula is : 0.470625888171
for x = 0.8, the value of formula is : 0.597195441362
for x = 0.9, the value of formula is : 0.72428717437

which is pretty fast! 
So in this post we showed that it is possible to read an arbitrary formula from an input file into C and evaluate it without degrading the efficiency of the C code.
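
The output listed above can be cross-checked without compiling anything. The sketch below imitates the C++ loop in pure Python: 'x' is passed as text formatted with "%f", exactly as the sprintf() call does, and the second formula is then evaluated in the same namespace. It assumes Python 3, and the function name emulate_host_loop is hypothetical. Note that "%f" keeps only six decimals of 'x', so the formula is evaluated at the rounded value.

```python
import math

def emulate_host_loop(formula, steps=10, dx=0.1):
    """Mimic the C++ loop: pass x as text, evaluate the formula, collect results."""
    results = []
    x = 0.0
    for _ in range(steps):
        namespace = {"math": math}
        # Same text the C++ code builds with sprintf(input_buff, "x = %f", x)
        exec("x = %f" % x, namespace)
        results.append((x, eval(formula, namespace)))
        x += dx
    return results

results = emulate_host_loop("math.sin(x**2)")
for x, value in results:
    print("for x = %g, the value of formula is : %.12g" % (x, value))
```

The printed values should agree with the listing above to the displayed precision, although the formatting of individual numbers (for example zero) may differ from the Python 2 output.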