Of Streams and Storms - A Direct Comparison of IBM InfoSphere Streams and Apache Storm in a Real World Use Case - Email Processing
This benchmark study finds that InfoSphere Streams outperforms Apache Storm by a factor of 2.6 to 12.3 in throughput while consuming 5.5 to 14.2 times less CPU time. It also shows that the throughput and CPU time gaps widen as data volume, degree of parallelism, or the number of processing nodes increases. InfoSphere Streams handles heavy loads much better and makes more effective use of available CPU capacity. The study therefore concludes that Apache Storm is not practical for most production applications, such as geospatial analytics, deep network inspection, and call data record analysis. The engineering of InfoSphere Streams allows it to scale linearly and handle high loads effectively while maintaining a low resource footprint. The use case behind the comparison is a real-time pipeline, designed by IBM, that calculates statistical features of streaming email content and acts as a pre-processing phase for a larger spam detection system. Download the paper to learn about this email processing application and explore the detailed technical results of the study.