Saturday, 2 June 2012

C++ vs Java vs Python

Whatever your job, it is important to use the right tool so the job can be done quickly and you can go home on time. For a software engineer no tool is more important than the programming language.

The question of which programming language to use frequently descends into something akin to a religious war. I will try to steer clear of this mode of argument. A programming language is simply a tool.

To declare any bias upfront: I am primarily a C++ programmer and I frequently also write Python. I have dabbled in Java but have not done a great deal with it.


Syntax Complexity


In terms of syntax complexity, I hate to say this as a C++ programmer, but C++ is far more complex than Python or Java. The C++ grammar is much more context-sensitive than that of Python or Java. Part of the complexity is that C++ is really four programming languages in one: core C++, template metaprogramming, C and the preprocessor.

I find Java slightly less complex than C++. There is no preprocessor, generics are far more limited than C++ templates and obviously there is no C legacy.

In my experience Python has the simplest syntax. The amount of syntax I have to memorise and recall in Python is much smaller than in Java or C++.

This is all personal opinion and impressions. You may believe me because I am a C++ programmer, yet I am saying that Python is best and Java is good. Can we have a more objective measure though?

What about looking at each language's full grammar in EBNF-like format? Python has a short specification of about 116 lines. In comparison a Java specification (see section 18, via stackoverflow) is about 545 lines long. Finally C++ comes in at 987 lines.

Although it is difficult to measure the complexity of a language's syntax objectively, these line counts give some idea.

Libraries

Libraries can truly make or break a language. Libraries can paper over problems with the language itself and prevent reinventing the wheel.

Python has built-in object oriented support for dates, compression, retrieving web pages, sending and receiving email, manipulating paths, csv files, command line options, INI style configuration files, threading and more. A large number of third party libraries are available using easy_install.
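As a rough illustration of a few of the built-in modules listed above, the sketch below uses only the Python 3 standard library to parse dates, read CSV data and load an INI-style configuration. The sample data and the function names `parse_orders` and `load_settings` are made up for this example.

```python
import configparser
import csv
import io
from datetime import date, datetime

# Hypothetical sample data; in practice these would be read from files.
CSV_TEXT = "id,placed\n1,2012-06-02\n2,2012-05-28\n"
INI_TEXT = "[report]\ntitle = Orders\nmax_rows = 10\n"


def parse_orders(text):
    """Return a list of (id, date) tuples from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (int(row["id"]), datetime.strptime(row["placed"], "%Y-%m-%d").date())
        for row in reader
    ]


def load_settings(text):
    """Return the [report] section of an INI-style string as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["report"])


orders = parse_orders(CSV_TEXT)
settings = load_settings(INI_TEXT)
```

Note that `configparser` returns every value as a string, so a setting such as `max_rows` would still need converting with `int()` before use.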

Java does have built-in support for dates, threading and path manipulation. Apache Commons provides many of the features that are built in to Python: compression, a CSV reader, command line options and INI style configuration files. Apache HTTPClient provides a way to retrieve web pages. Although there is no central repository there are an enormous number of open source and commercial libraries available for Java.

C++ and Boost cover very few of these common use cases. Boost supports dates, path manipulation, command line options, INI style configuration files and threading. It is also possible to write a CSV reader in C++ using boost::tokenizer. Compression, retrieving web pages and sending and receiving email are not supported.

C++11 moves some of the features from Boost into the language itself and there are third party libraries which fill some of these voids. C++ also integrates easily with C libraries, such as libcurl for retrieving web pages.

The advantage of using C libraries is that there are thousands of C libraries. The disadvantage of using C libraries is that you are back using procedural programming for that specific library, unless you wrap the libraries in an object oriented layer yourself. In addition the C libraries have no knowledge of exceptions, so may not clean up correctly if there is an exception.

Databases

Databases are critical for almost any back-end business application.

Java has by far the best native support. JDBC is a standardised interface and is widely supported by database vendors. A JDBC to ODBC bridge is included with the JDK. Others can be easily installed by adding the jar to the classpath.

Python is a bit more mixed. Python has a standardised interface (the DB-API), however it generally relies on third parties to write Python wrappers around the C-based drivers. Most vendors have not provided Python drivers. Even an ODBC driver is not built in. A number of drivers are available, most through easy_install. Some drivers have several competing implementations and it is not always easy for a newcomer to determine which one is still being actively developed. For example there are six different ODBC drivers, two different MySQL drivers and nine different PostgreSQL drivers.
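To give a feel for the standardised interface, here is a minimal sketch using `sqlite3`, which ships with Python and follows the DB-API. The table, the data and the helper name `top_customers` are invented for the example. Third party drivers follow the same connect/cursor/execute pattern, although the placeholder style (`?` here) can differ between drivers.

```python
import sqlite3


def top_customers(conn, minimum):
    """Return customer names with spend >= minimum, highest spend first."""
    cur = conn.cursor()
    # Parameters are passed separately rather than formatted into the SQL.
    cur.execute(
        "SELECT name FROM customers WHERE spend >= ? ORDER BY spend DESC",
        (minimum,),
    )
    return [row[0] for row in cur.fetchall()]


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (name TEXT, spend INTEGER)")
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("alice", 120), ("bob", 40), ("carol", 300)],
)
conn.commit()

names = top_customers(conn, 100)
conn.close()
```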

C++ support for databases is also a bit mixed. There is no support for databases in C++ or Boost. OTL supports accessing Oracle, ODBC and DB2-CLI from C++. Databases other than Oracle or DB2-CLI are only supported via ODBC.

Functional Programming

Wikipedia states that "In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data."

Python supports various aspects of functional programming, starting with the map() and reduce() functions. The itertools library is also included in later versions. Iterators, list comprehensions and generator expressions are included in the language syntax and are faster than map() or reduce() in most cases.

Lambda functions are included in the language syntax. Lambda functions can be useful for "glue" logic: for example extracting the nth element from a list or a property from an object. However lambda functions are slow (like all Python functions), limited to a single expression, and often not as clear as a separate function.
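The two styles can be compared side by side. This is a small sketch written for Python 3, where reduce() has to be imported from functools; the variable names are only illustrative.

```python
import operator
from functools import reduce  # a builtin in Python 2, in functools in Python 3

numbers = [1, 2, 3, 4, 5]

# Functional style: map() applies a function to every element.
squares_map = list(map(lambda n: n * n, numbers))

# The same result as a list comprehension.
squares_comp = [n * n for n in numbers]

# reduce() folds the list into a single value.
total_reduce = reduce(operator.add, squares_map, 0)

# A generator expression fed to sum() does the same without building a list.
total_gen = sum(n * n for n in numbers)
```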
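The "glue" cases mentioned above might look like the following sketch. The `Employee` type and the sample data are invented for illustration. The standard library's `operator.itemgetter` and `operator.attrgetter` are shown alongside as an alternative to writing the lambda by hand.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# Hypothetical sample data.
Employee = namedtuple("Employee", ["name", "age"])
staff = [Employee("Ann", 41), Employee("Bob", 29), Employee("Cy", 35)]
rows = [("x", 3), ("y", 1), ("z", 2)]

# Extracting the nth element: a lambda and its itemgetter equivalent.
by_second_lambda = sorted(rows, key=lambda row: row[1])
by_second_getter = sorted(rows, key=itemgetter(1))

# Extracting a property from an object: a lambda and its attrgetter equivalent.
by_age_lambda = sorted(staff, key=lambda e: e.age)
by_age_getter = sorted(staff, key=attrgetter("age"))
```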

The C++ STL includes template functions similar to Python's map(), such as transform(). It is possible to pass an object with an operator(), so the function can be executed very efficiently as an inline function.

Boost supports lambda function programming using some complex template programming. When Boost.Lambda works it is very useful but when it stops working it can be difficult to debug.

C++11 now supports lambda functions in the language syntax.

I have not had any experience with functional programming in Java, but I should note that Hadoop, one of the first MapReduce frameworks, was written in Java.



Consistency

Python is really rather good with consistency. Iterators are widely used and all iterators are consistent. There is one preferred way to do most things.
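One way to see this consistency is that a single helper written against the iterator protocol works unchanged on very different types. The sketch below is only an illustration; `count_items` and `evens` are hypothetical names.

```python
import io


def count_items(iterable):
    """Count items using only the iterator protocol: iter() and next()."""
    iterator = iter(iterable)
    count = 0
    while True:
        try:
            next(iterator)
        except StopIteration:
            return count
        count += 1


def evens(limit):
    """Generator yielding the even numbers below limit."""
    for n in range(0, limit, 2):
        yield n


counts = [
    count_items([10, 20, 30]),               # list
    count_items({"a": 1, "b": 2}),           # dict (iterates over keys)
    count_items("hey"),                      # string (iterates over characters)
    count_items(io.StringIO("one\ntwo\n")),  # file-like object (iterates over lines)
    count_items(evens(7)),                   # generator
]
```

In everyday code a plain `for` loop does the same job; the explicit `next()` calls are here only to show the protocol every one of these types shares.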

C++ is fairly consistent but there are some oddities. A number of those oddities are due to C++'s C heritage, particularly with respect to syntax. Others are simply design deficiencies: for example in the standard library the fstream classes took C-style character strings rather than C++ std::strings (this has been fixed in C++11).

Java is quite consistent with syntax and the libraries are quite consistent. However iterators are somewhat inconsistent. For example the CharacterIterator interface is very different to the Iterator interface.

Memory and Speed

In my experience C++ is the most frugal with memory. Python is not too far behind considering it is a dynamic language. Java is probably the least frugal with memory but if the program is written properly there is usually no problem.

Java and C++ aren't too far apart in terms of speed, although C++ seems a bit faster for some tasks. In comparison Python can be painfully slow for various types of processing. In the example linked a highly optimised Python version took 4.55s but a naive Java version took 130ms. Projects such as PyPy may improve the situation.

All three languages have sufficient performance in terms of memory and speed that it would not be a top priority for most applications. Other issues such as library support, existing codebase, database support, familiarity and developer time are higher priorities.

Conclusion

I am happy to use all three languages. I use C++ and Python most frequently. I haven't seen as many of the dark corners of Python and I am most familiar with C++.

Each has its place: database access is really easy in Java; most of the code bases I work with are already in C++; and Python is great for hacking things together quickly while still being able to read the result later.
