Stream processing is central to handling large volumes of data generated at high rates. However, processing such quantities of data efficiently demands massively parallel hardware. The usual approach is to rely on clusters of multiprocessors, where network communication may become a bottleneck. Some work has also been done in the field of GPU computing. Yet the programming complexity of GPUs, together with the synchronization-related overheads that arise as the streaming graph scales, has hampered their integration into Big Data streaming frameworks. In this paper we explore the unique characteristics of the Intel Xeon Phi processor to develop a stream processing framework for hybrid CPU/Intel Xeon Phi systems. We build atop the Intel Threading Building Blocks library and the Marrow algorithmic skeleton framework to offer an easily programmable, high-performance system. Our experimental results show that offloading the computationally heavy nodes of a streaming graph to the Xeon Phi may yield considerable speed-ups. Furthermore, additional gains may be obtained by sharing the processing load between the CPU(s) and the Xeon Phi processor(s).