Packets Matter is an op-ed series advocating the use of PCAP as the primary source of intelligence in enterprise network security.
Network security monitoring and analytics are a quintessential “big data” problem. Cisco has projected that Internet traffic will surpass a zettabyte in 2016. That’s around a trillion gigabytes, or more than 300 GB for each of the three billion Internet users worldwide. The rate of growth is just astounding.
Meanwhile, reports of massive data breaches are giving businesses greater and greater incentive to invest in network security analytics. But the sheer speed and volume of network traffic make this a formidable undertaking, so many businesses resort to monitoring logs and alerts. Logs are small and easy to navigate, but they will not secure the enterprise against advanced persistent threats (APTs). What is needed is an approach to packet capture (PCAP) that doesn’t buckle under stress.
PCAP runs into scalability challenges at three stages: capture, storage, and analysis. I’ll discuss some of the architectural design decisions that our engineers at Novetta have made in developing an enterprise PCAP solution that scales at each stage.
Capture at scale
Capture with appropriate hardware and software. Enterprise packet capture demands more than running tcpdump on a host. You will want to copy packets off the wire using either a network tap or the port-mirroring functionality offered by your switch. I recommend the tap: a passive tap forwards every packet, while a mirror port can silently drop traffic when the switch is under load. You may need to load-balance across multiple taps at points of extreme bandwidth consumption, such as your Internet egress points.
Your choice of packet capture software matters, too. PF_RING was designed for high-throughput packet capture, and it’s a part of what we use in our solution.
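To make the contrast concrete, here is a minimal sketch of the naive alternative: a plain AF_PACKET socket on Linux, reading frames from a hypothetical monitoring interface fed by the tap. It works, but it will start dropping packets well below enterprise line rates, which is exactly why kernel-level capture frameworks like PF_RING exist. (PF_RING itself exposes a C API; this Python sketch illustrates only the shape of a capture loop, not our implementation.)

```python
# Minimal capture-loop sketch, assuming Linux, root privileges, and a hypothetical
# monitoring interface "mon0" fed by a tap or mirror port. A plain AF_PACKET socket
# like this will drop packets at high rates, which is the point being made above.
import socket

ETH_P_ALL = 0x0003  # capture every protocol, not just IP

def capture(interface: str = "mon0") -> None:
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(ETH_P_ALL))
    sock.bind((interface, 0))
    while True:
        frame, _ = sock.recvfrom(65535)  # one Ethernet frame per call
        handle(frame)                    # hand off to storage/indexing (not shown)

def handle(frame: bytes) -> None:
    pass  # placeholder: write to a rolling pcap file, update counters, etc.

if __name__ == "__main__":
    capture()
```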
Storage at scale
Leave the packets where they were captured. Bring only the metadata to the cluster, and let queries against that metadata guide you to the packets that deserve your attention. I think this was a great innovation by our engineers, and it gives a nod to the adage, “bring the process to the data; don’t bring the data to the process.”
Unless you’re at DEFCON, less than 0.01% of payload content deserves the attention of an incident response team. Those needles in the haystack are valuable, of course, because they may contain proof of malware exchange. But it would be a waste to inspect every payload that traverses the network. Analysis can focus almost entirely on the metadata of the packet headers – not the payloads. That metadata requires up to 200 times less cluster storage capacity than the full packets, and it offers greater fidelity than NetFlow.
So leave the full packets where you captured them, and fetch them only to validate suspicions raised in your analysis of the metadata. This will lead to greater capacity, faster ingest, and snappier analytics.
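Here is a rough sketch of that pattern, assuming the third-party dpkt library for pcap parsing; the field names and record layout are my own choices for illustration, not Novetta’s schema. The sensor parses its locally captured files, emits one small metadata record per packet, and keeps a pointer back to the local file so the full packets can be fetched on demand.

```python
# Sketch of "metadata to the cluster, packets stay at the edge".
# Assumes the dpkt library (pip install dpkt); the record layout is illustrative only.
import socket
import dpkt

def extract_metadata(pcap_path: str, sensor_id: str):
    """Yield one small metadata record per IP packet; payloads never leave the sensor."""
    with open(pcap_path, "rb") as f:
        for ts, buf in dpkt.pcap.Reader(f):
            eth = dpkt.ethernet.Ethernet(buf)
            if not isinstance(eth.data, dpkt.ip.IP):
                continue
            ip = eth.data
            yield {
                "ts": ts,
                "src": socket.inet_ntoa(ip.src),
                "dst": socket.inet_ntoa(ip.dst),
                "proto": ip.p,
                "bytes": ip.len,
                "sensor": sensor_id,     # pointer back to the full packets,
                "pcap_file": pcap_path,  # fetched only when analysis raises suspicion
            }
```

Only these records travel to the analytic cluster; when a query flags a suspicious flow, the sensor and file fields say exactly where to go for the full packets.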
Analysis at scale
Distribute your analytical processes. Novetta has preferred to partner with vendors of shared-nothing, massively parallel processing (MPP) database systems because of their maturity in the industry. Hadoop could become a viable platform for network security monitoring and analytics once some promising new projects mature. Contrary to Hadoop’s reputation as a “batch only” system, recent projects such as Kafka, Spark, Storm, Drill, and Tez are bringing new analytical capabilities into the Hadoop ecosystem that I, for one, am excited to explore for network security. But I predict we are two years away from seeing a stable, documented, proven integration of those systems.
Use column-oriented data for extremely fast analytical processing. Many analytical queries need to know only a few things about every record, such as, “show me the total number of bytes communicated by each IP address, and then sort the results.” These are called “aggregate queries” because they compute aggregate values of a few fields across many records rather than simply retrieving the records.
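For illustration, here is that exact query expressed in SQL over a toy metadata table, using Python’s built-in sqlite3 as a stand-in for whatever database actually holds the metadata:

```python
# The aggregate query from the text, run against a tiny invented dataset.
# sqlite3 is only a stand-in; an MPP column store would execute the same SQL shape.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (src TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO flows VALUES (?, ?)",
    [("10.0.0.1", 1400), ("10.0.0.2", 60), ("10.0.0.1", 9000)],
)

# Total bytes communicated by each IP address, sorted: an aggregate query that
# touches only two fields of every record and returns one row per IP.
for src, total in conn.execute(
    "SELECT src, SUM(bytes) AS total FROM flows GROUP BY src ORDER BY total DESC"
):
    print(src, total)
```

Notice that the query needs only two fields (src and bytes) of every record, which is exactly the access pattern a column-oriented layout serves well.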
Aggregate queries run drastically faster on data that has been organized into a column-oriented structure, which contrasts with the row-oriented structure of conventional database systems. A person familiar with row-oriented data might be unsure how to envision the layout of column-oriented data. If you feel that way, consider this:
Imagine you have a cursor that reads data from left to right, starting at any point in a file. With data organized in rows, your cursor must pass over every value of every field in order to collect all the values of the one field you care about. With data organized in columns, your cursor can read every value of that field without touching any irrelevant fields. You can see this in the figure below. Notice how the columnar structure lends itself to rapid aggregation over a single field. Simple – and effective!
The table on the left shows data organized in rows: each line contains one value for every field. The table on the right shows data organized in columns: each line contains every value for one field. The blue cells in both tables show where the cursor must traverse to read every value in the ip field.
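A toy contrast between the two layouts, with invented records, may also help. The columnar form is simply one list per field, and the aggregate query from above can be answered by scanning only the two lists it needs:

```python
# Row-oriented: each entry holds one value for every field.
rows = [
    {"ip": "10.0.0.1", "port": 443, "bytes": 1400},
    {"ip": "10.0.0.2", "port": 53,  "bytes": 60},
    {"ip": "10.0.0.1", "port": 443, "bytes": 9000},
]

# Column-oriented: each entry holds every value for one field.
columns = {
    "ip":    ["10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "port":  [443, 53, 443],
    "bytes": [1400, 60, 9000],
}

# Total bytes per IP from the columnar layout: the scan reads only the "ip" and
# "bytes" lists and never touches "port" at all.
totals = {}
for ip, nbytes in zip(columns["ip"], columns["bytes"]):
    totals[ip] = totals.get(ip, 0) + nbytes
print(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```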
Everything I’ve described above reflects the architectural design decisions behind Novetta Cyber Analytics, which has demonstrated robustness under the stress of petabytes of packets captured every week. You can learn more about its architecture in the product brochure found at the bottom of the page. You might also like to browse some of our favorite analytics made possible by this architecture in the “Top 10” doc. Hopefully I have given you some concepts and keywords to help guide your research on achieving scalable PCAP.