Fresh Approaches to Computation: December 2011

For open source programmers who want to write highly readable and reusable programs, Python is often the first choice. It is completely free and has a rich set of libraries and a tool-chain for almost everything you can imagine. However, we should find out how efficient Python is when used for writing computational programs.
In this type of program, we are typically dealing with giant "for" and "while" loops over repetitive numerical operations such as addition and multiplication. Therefore the efficiency of the program is limited by the efficiency of the "for" loop and, in particular, by the programming language that implements the "for" loop for us.
Here we compare Python with Matlab and Ruby, which are also commonly used. To measure the built-in overhead of a "for" loop, we write the following program in Matlab:

tic
for i = 1:100000000

end
out = toc

A similar program in Python is written as:

# File: time-example-5.py
import time
# measure elapsed wall-clock time (time.time() does not report process time)
t0 = time.time()
for i in range(0,100000000):
    pass
print time.time() - t0, "seconds process time"

And a similar program in Ruby is:

now = Time.now

for i in (1..100000000)

end

print Time.now-now

When we run these programs, we obtain the following timings.

OUTPUT OF RUBY

--------------------------------------------------------------------

ruby 1.8.7 (2010-08-16 patchlevel 302) [i686-linux]

run1 = 5.970745 seconds

run2 = 6.174075 seconds

run3 = 6.117122 seconds

run4 = 6.028899 seconds

run5 = 6.195276 seconds

--------------------------------------------------------------------

OUTPUT OF PYTHON

--------------------------------------------------------------------

Python 2.7.1+ linux32

run1 = 6.72360992432 seconds process time

run2 = 6.64303207397 seconds process time

run3 = 6.67376494408 seconds process time

run4 = 6.68509507179 seconds process time

run5 = 6.83553600311 seconds process time

--------------------------------------------------------------------

OUTPUT OF MATLAB

--------------------------------------------------------------------

Matlab 2010 + linux32

run1 = 0.2635 seconds

run2 = 0.2625 seconds

run3 = 0.2881 seconds

run4 = 0.2595 seconds

run5 = 0.2790 seconds

--------------------------------------------------------------------

Interestingly, Matlab runs the loop in about 0.27 seconds on average, which is roughly 23 times faster than the Ruby version (about 6.1 seconds) and roughly 25 times faster than the Python version (about 6.7 seconds).
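The averages and ratios can be checked directly from the five runs listed above. The snippet below is only a small arithmetic sketch; the list names and the `mean` helper are made up for illustration, and the numbers are copied from the listings.

```python
# Timings in seconds, copied from the output listings above.
ruby_runs = [5.970745, 6.174075, 6.117122, 6.028899, 6.195276]
python_runs = [6.72360992432, 6.64303207397, 6.67376494408,
               6.68509507179, 6.83553600311]
matlab_runs = [0.2635, 0.2625, 0.2881, 0.2595, 0.2790]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# How many times longer each interpreter takes compared with Matlab.
ruby_vs_matlab = mean(ruby_runs) / mean(matlab_runs)
python_vs_matlab = mean(python_runs) / mean(matlab_runs)

print("Matlab mean: %.4f s" % mean(matlab_runs))
print("Ruby / Matlab: %.1f" % ruby_vs_matlab)
print("Python / Matlab: %.1f" % python_vs_matlab)
```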

Even when we use xrange() in the Python script to improve performance, the code runs in about 2 seconds, which is still nearly an order of magnitude slower than the Matlab code.

We conclude that, for a bare "for" loop, Matlab performs much better than Python and Ruby.
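To repeat the Python measurement without launching the script five times by hand, something like the following could be used. This is a sketch under assumptions: it targets Python 3 (where `range` is lazy, like `xrange` in Python 2, and `time.perf_counter` is available), `time_empty_loop` is a made-up helper name, and the loop count is reduced so the example finishes quickly. Set `n` to 100000000 to match the experiment above.

```python
import time


def time_empty_loop(n, repeats=5):
    """Return the wall-clock time in seconds of each of `repeats` empty loops."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        for i in range(n):
            pass
        timings.append(time.perf_counter() - t0)
    return timings


# Reduced loop count (an assumption made to keep the example short-running).
timings = time_empty_loop(1000000)
print("runs:", ["%.4f" % t for t in timings])
print("mean: %.4f s, best: %.4f s" % (sum(timings) / len(timings), min(timings)))
```

Reporting the best run next to the mean helps separate the loop overhead itself from noise caused by other processes on the machine.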

It would be extremely useful if we could evaluate an arbitrary formula directly in C/C++ code without hard-coding it. In other words, how can we read and evaluate a mathematical formula from an input file in a C/C++ program?

This is a classical problem in computational science and engineering, where a formula often has to be processed inside a giant computational code (like the ones written to analyze flow over missiles) without modifying and recompiling the code from scratch. For example, if the boundary conditions are time varying, then we need a time-varying formula for the boundaries, and this formula may change from application to application. Therefore, we would like an input file for these kinds of boundaries which contains the required boundary condition for each specific application or problem.

There are several ways to do this. The most obvious one is to write an expression-processing program which reads the mathematical formula from the input file and evaluates it in the C/C++ code. However, as you might guess, this is the hardest way to do it.

To solve this old problem, we use a blend of C and Python. Here I make the file 'bn_file.bn':

1 x*x

2 math.sin(x**2)

The first column is just an index, which might be anything, and the second column is the formula which we want to evaluate in C code for different values of 'x'. Note that we are free to change x*x to anything without changing our C code.

Next I write a Python script. It reads 'bn_file.bn' and evaluates each formula for the value of 'x', which we will set in the C++ code that follows.
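One way to read this two-column format is sketched below. It is an illustration, not the script used in this post: `parse_formula_lines` is a made-up name, and the third sample line ("3 x + 1") is a hypothetical entry added only to show that splitting on the first run of whitespace keeps a formula containing spaces in one piece, whereas a plain split would break it into several tokens.

```python
def parse_formula_lines(lines):
    """Map each index (first column) to its formula text (rest of the line)."""
    formulas = {}
    for line in lines:
        # Split only once, so spaces inside the formula are preserved.
        parts = line.split(None, 1)
        if len(parts) == 2:
            formulas[parts[0]] = parts[1].strip()
    return formulas


# Lines as they would come from readlines(); the third entry is hypothetical.
sample_lines = ["1 x*x\n", "\n", "2 math.sin(x**2)\n", "3 x + 1\n"]
formulas = parse_formula_lines(sample_lines)
print(formulas)
```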

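#this is input_formula.py
# 'x' must already be defined in __main__ by the calling C++ program

import math

bn_fomula_dic = dict()

fid = open('bn_file.bn','r')
lines = fid.readlines()
fid.close()

for l in lines:
    l = l.split()
    if len(l) == 2:
        # evaluate the formula (second column) for the current value of x
        bn_fomula_dic[l[0]] = str(eval(l[1]))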
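The evaluation step can also be tried on its own, in pure Python, before involving any C++ code. The following sketch wraps it in a function with a made-up name, `evaluate_formula`, and passes `eval` an explicit namespace that exposes only `math` and `x`. This is an assumption about how one might tidy the step, and it is not a real security sandbox: the input file should still come from a trusted source.

```python
import math


def evaluate_formula(formula, x):
    """Evaluate a formula string for the given x, exposing only `math` and `x`."""
    allowed_names = {"math": math, "x": x}
    # An empty __builtins__ hides open(), __import__() and similar names,
    # but determined input can still get around this.
    return eval(formula, {"__builtins__": {}}, allowed_names)


print(evaluate_formula("x*x", 3.0))
print(evaluate_formula("math.sin(x**2)", 0.5))
```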

And here is the C++ code, which sets the value of 'x', calls the Python script to read and evaluate the formulas from the input file, and prints the output.

//This is file py_cpp_formula.cpp
#include <Python.h>
#include <cstdio>
#include <iostream>
#include <string>

// Read arbitrary mathematical formula from input text file into your C/C++
// code and evaluate it directly in your C/C++ code!
// By: Arash Ghasemi (ghasemi.arash@gmail.com)
// Thanks to KOICHI TAMURA'S BLOG
// http://koichitamura.blogspot.com/2008/06/this-is-small-python-capi-tutorial.html

int main(int argc, char *argv[])
{
    int i;
    FILE *fid = NULL;
    //PyObject pointers required for interfacing
    PyObject *po_main = NULL, *po_dict = NULL, *po_value = NULL;
    const char *filename = "input_formula.py"; //python code filename
    char input_buff[32];
    double x = 0.0;

    Py_Initialize(); //starting Python environment

    for ( i = 0; i < 10; i++ ) //evaluate the given formula ten times
    {
        // setting the input to python code using a simple string command
        sprintf(input_buff,"x = %f" , x);
        PyRun_SimpleString(input_buff);

        // opening the file containing python code,
        // NOTE IT MUST BE OPENED AND CLOSED BEFORE/AFTER EACH PYTHON RUN QUERY
        fid = fopen(filename, "r");
        if (fid == NULL)
        {
            std::cerr << "cannot open " << filename << std::endl;
            break;
        }
        //running the Python code
        PyRun_SimpleFile(fid, filename);
        fclose(fid);

        //interfacing section, first get the main in python and vars associated with it
        po_main = PyImport_ImportModule("__main__"); //new reference
        //then attach "bn_fomula_dic" dictionary to "po_dict"
        po_dict = po_main ? PyObject_GetAttrString(po_main, "bn_fomula_dic") : NULL; //new reference
        //then find the value corresponding to keyword '2'.
        //this is a borrowed reference, so it must NOT be passed to Py_DECREF
        po_value = po_dict ? PyDict_GetItemString(po_dict, "2") : NULL;

        if (po_value != NULL && PyString_Check(po_value))
        {
            std::string valstr = PyString_AsString(po_value);
            std::cout << "for x = " << x << ", the value of formula is : "<< valstr << std::endl;
        }

        //release the references obtained in this iteration
        Py_XDECREF(po_dict);
        Py_XDECREF(po_main);

        //incrementing 'x'
        x += .1;
    } //next i

    Py_Finalize(); //finalize Python session
    return (0); //go back to OS
}

All three files, 'py_cpp_formula.cpp', 'input_formula.py' and 'bn_file.bn', should be in the same directory. To compile the code I used icc; however, the same thing can be done with gcc.

icc -Wall -O3 py_cpp_formula.cpp -I/usr/include/python2.7/ -lpython2.7 -lstdc++

In the C++ code, we first initialize a Python environment using Py_Initialize(). Then we set the value of 'x' (which varies continuously inside the loop) and pass it to the Python environment using the following lines:

sprintf(input_buff,"x = %f" , x);

PyRun_SimpleString(input_buff);
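One detail worth keeping in mind here is that "%f" keeps only six digits after the decimal point, so Python sees a rounded 'x'. The sketch below shows the effect in pure Python, whose % operator follows the same printf conventions. The helper names are made up, and "%.17g" is only one possible alternative format (the C buffer would have to be large enough for it).

```python
def make_assignment(x, fmt):
    """Build the assignment string that the C++ code hands to Python."""
    return ("x = " + fmt) % x


def round_trip(x, fmt):
    """Run the assignment in a fresh namespace and return the x Python ends up with."""
    namespace = {}
    exec(make_assignment(x, fmt), namespace)
    return namespace["x"]


small = 1.0e-7
print(make_assignment(small, "%f"))         # the text passed with the original format
print(round_trip(small, "%f") == small)     # is the value preserved with %f?
print(round_trip(small, "%.17g") == small)  # is it preserved with 17 significant digits?
```

For the values used in this post (0.0, 0.1, ..., 0.9) six decimals are plenty, but for very small or very precise inputs the format is worth revisiting.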

After this, the value of 'x' exists in the Python environment. Now we need to run the Python script to read the formulas and evaluate them for the given 'x'. We first open the file containing the Python script, run it, and then close it. This sequence is done in the following code:

fid = fopen(filename, "r");

//running the Python code

PyRun_SimpleFile(fid, filename);

fclose(fid);
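The same sequence can be mimicked in pure Python to see what the embedded interpreter is doing: one shared namespace plays the role of the `__main__` module, the assignment string is executed first, then the script, and finally the result is read back from the dictionary. This is only an analogy under assumptions: `run_once` and `bn_lines` are made-up names, and the script reads its formulas from a list rather than from 'bn_file.bn' so that the example is self-contained.

```python
# Stand-in for input_formula.py; it takes its lines from `bn_lines`
# (a hypothetical variable) instead of opening bn_file.bn.
SCRIPT = """
import math
bn_fomula_dic = dict()
for l in bn_lines:
    l = l.split()
    if len(l) == 2:
        bn_fomula_dic[l[0]] = str(eval(l[1]))
"""


def run_once(x, bn_lines):
    namespace = {"bn_lines": bn_lines}      # plays the role of __main__
    exec("x = %f" % x, namespace)           # like PyRun_SimpleString
    exec(SCRIPT, namespace)                 # like PyRun_SimpleFile
    return namespace["bn_fomula_dic"]["2"]  # like the dictionary lookup


value = run_once(0.5, ["1 x*x", "2 math.sin(x**2)"])
print("for x = 0.5, the value of formula is :", value)
```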

Finally, the evaluated value is looked up in the generated Python dictionary "bn_fomula_dic" under the key '2', i.e. the second formula, y = sin(x^2), and printed. The output of the program is shown here:

for x = 0, the value of formula is : 0.0

for x = 0.1, the value of formula is : 0.00999983333417

for x = 0.2, the value of formula is : 0.0399893341866

for x = 0.3, the value of formula is : 0.089878549198

for x = 0.4, the value of formula is : 0.159318206614

for x = 0.5, the value of formula is : 0.247403959255

for x = 0.6, the value of formula is : 0.352274233275

for x = 0.7, the value of formula is : 0.470625888171

for x = 0.8, the value of formula is : 0.597195441362

for x = 0.9, the value of formula is : 0.72428717437

which is pretty fast!
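As a quick sanity check, the printed values can be compared with sin(x^2) computed directly in Python. The sketch below copies the ten values from the output above and repeats the way 'x' is accumulated and passed through "%f"; the variable names are made up for illustration.

```python
import math

# Values copied from the program output above, for x = 0.0, 0.1, ..., 0.9.
reported = ["0.0", "0.00999983333417", "0.0399893341866", "0.089878549198",
            "0.159318206614", "0.247403959255", "0.352274233275",
            "0.470625888171", "0.597195441362", "0.72428717437"]

max_error = 0.0
x = 0.0
for text in reported:
    x_seen = float("%f" % x)          # the value Python receives after "%f" formatting
    computed = math.sin(x_seen ** 2)
    max_error = max(max_error, abs(computed - float(text)))
    x += 0.1                          # same increment as in the C++ loop

print("largest deviation from sin(x^2): %.2e" % max_error)
```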

So in this post we showed that it is possible to read an arbitrary formula from an input file into C/C++ and evaluate it without degrading the efficiency of the C code.

Friday, December 30, 2011

Matlab vs Python vs Ruby - which one is really faster?

Monday, December 19, 2011

How to read arbitrary mathematical formula from input text file into C/C++?
