One of the most challenging tasks for network operators is implementing accurate per-packet monitoring, looking for signs of performance degradation, security threats, and so on. Upon critical event detection, corrective actions must be taken to keep the network running smoothly. Implementing this mechanism requires the analysis of packet streams in a real-time (or close to) fashion. In a softwarized network context, Stream Processing Systems (SPSs) can be adopted for this purpose. Recent solutions based on traditional SPSs, such as Storm and Flink, can support the definition of general complex queries, but they show poor performance at scale. To handle input data rates in the order of gigabits per seconds, programmable switch platforms are typically used, although they offer limited expressiveness. With the proposed approach, we intend to offer high performance and expressive power in a unified framework by solely relying on SPSs for multicores. Captured packets are translated into a proper tuple format, and network monitoring queries are applied to tuple streams. Packet analysis tasks are expressed as streaming pipelines, running on general-purpose programmable network devices, and a second stage of elaboration can process aggregated statistics from different devices. Experiments carried out with an example monitoring application show that the system is able to handle realistic traffic at a 10 Gb/s speed. The same application scales almost up to 20 Gb/s speed thanks to the simple optimizations of the underlying framework. Hence, the approach proves to be viable and calls for the investigation of more extensive optimizations to support more complex elaborations and higher data rates.