Cloudera and Platfora Leveraged to Address Hard Challenge: What do “they” know about my network?

Editor’s note: This guest post by Wayne Wheeles focuses on a topic I’ve struggled with for over 15 years and shows great promise in addressing challenges no one else has tackled. Wayne is a Network Forensics Analytic/Enrichment Developer at Six3 Systems. – bg

For a decade now, many Network Forensics Analysts, Network Security Engineers, and Cyber Security Professionals have pondered that most interesting of questions: What do “they” know about my network? From time to time over the years, discussions about how to determine what external entities may know about a network’s attack surface have started up and then fizzled out. Often, organizations collect and store a great deal of data to piece together a defensive view of a network but do not piece together what external entities know about, or have shown interest in, on that same network. Big Data offers the potential to evaluate this question in ways that were unimaginable just five years ago. New technologies and techniques enable organizations to ask: what is the known attack surface of my network? I addressed this question head-on using a variety of cyber security data sets, enrichment techniques, Cloudera CDH 4 (Hadoop distribution), and Platfora, a relative newcomer that is one of the most powerful tools I have worked with in some time.

In this day and age it is amazing how little is known about what activities are occurring on our networks. The “they” alluded to earlier in the blog describes external entities that engage in scanning and network mapping, seeking to learn more about all aspects of a target network: what devices reside on the network, what ports are open, and what potential avenues for exploitation exist. This scanning occurs at a scale that is almost unimaginable and often goes unnoticed. For those asking whether this network scanning is common: on the working data set used for this article, I determined that over 4000 large-scale scans of the target network occurred each year, originating from at least 95 countries worldwide.

As always, the real story is told through the data, in this case netflow data with port and geographic enrichment. In order to more effectively share the tale at scale, we worked with Platfora to explore and visualize the data. The screen shot below is of the Platfora Data Catalog, which makes it easy to look at all of the data sets available in the cluster. The data catalog provides the instrument for defining data sets and relationships between different classes of data within the cluster.

[Screen shot: Platfora Data Catalog]

Next, using Platfora, we loaded a series of derivative data sets that captured all of the major scans on the network during 2013 into the Platfora Data Catalog. From the Platfora Data Catalog, we generated a series of lenses, or views, of the data. When creating lenses, Platfora provides a wide range of functions, operators, and aggregates for working with data, which were really helpful in generating the visualizations in this blog.

Platfora provided a wide range of capabilities for preparing the data for analysis, which considerably reduced data preparation time. After the data was prepared, the emphasis shifted to exploring and understanding it using a variety of visualization techniques. In the Platfora VizBoard below, the point of interest was not the fact that high ports (x-axis) were scanned, but rather the number of times (indicated by the color of the bars) that they were scanned by the same source IP address (y-axis). Each of the source IP addresses in the set below scanned ports of the targeted network over 1000 times in a 90-day timeframe.

[Screen shot: Platfora VizBoard heat map of source IP addresses (y-axis) against destination ports (x-axis)]
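The aggregation behind a view like this can be sketched in a few lines of Python. This is only an illustration of the counting idea, not how Platfora builds its lenses: the record layout, the sample addresses (taken from documentation ranges), and the function names `heatmap_counts` and `busy_sources` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical flow records: (source_ip, destination_port).
# Real netflow records carry many more fields (bytes, packets, timestamps).
flows = [
    ("203.0.113.5", 61000),
    ("203.0.113.5", 61000),
    ("203.0.113.5", 61001),
    ("198.51.100.7", 61000),
]

def heatmap_counts(flows):
    """Count how many times each source IP touched each destination port."""
    return Counter((src, port) for src, port in flows)

def busy_sources(flows, min_flows):
    """Return the source IPs that appear in more than min_flows records."""
    totals = Counter(src for src, _ in flows)
    return {src for src, n in totals.items() if n > min_flows}

counts = heatmap_counts(flows)
heavy = busy_sources(flows, min_flows=2)
```

Each (source IP, port) count corresponds to one cell of the heat map, and a threshold such as the 1000 scans mentioned above would be passed as `min_flows`.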

The heat map above shows that the source IP addresses (y-axis) not only scanned large numbers of destination ports (x-axis) on the target network but in many instances returned between four and six times to the same port during the observation period. When building the data sets, we defined references that describe the relationships between the different types of data resident in the cluster. In the graphic above, when port 61000 is highlighted, the netflow information that served as the base data set is augmented with information from other data sets: known exploits for a given port, Intrusion Detection Signature information for a given port over time, and Intrusion Detection Signature information for a given IP address. Platfora was very useful for “following where the data will lead”, enabling the analyst to pivot in that direction with all details on a port or IP address, bytes, and packets, and to generate new derivative lenses with two clicks of a button.
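To make the idea of references between data sets concrete, here is a minimal Python sketch of enriching a flow record by port. It is a plain dictionary lookup, not Platfora's reference mechanism, and the field names, the placeholder exploit and signature labels, and the `enrich` function are hypothetical.

```python
# Hypothetical reference data keyed by destination port.
# The labels are placeholders, not real exploit or signature names.
known_exploits = {61000: ["example-exploit-a"]}
ids_signatures_by_port = {61000: ["example-signature-1", "example-signature-2"]}

def enrich(flow, exploits, signatures):
    """Return a copy of the flow with reference data attached by port."""
    port = flow["dst_port"]
    enriched = dict(flow)  # leave the base record untouched
    enriched["known_exploits"] = exploits.get(port, [])
    enriched["ids_signatures"] = signatures.get(port, [])
    return enriched

base_flow = {"src_ip": "203.0.113.5", "dst_port": 61000, "bytes": 120, "packets": 2}
enriched_flow = enrich(base_flow, known_exploits, ids_signatures_by_port)
```

Ports with no entry in the reference data simply get empty lists, so the base netflow record is never dropped by the enrichment step.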

In review, what do “they” know about my network? Based on the analysis of the aforementioned set of actors, the following observations were made: over 300 scans occurred each month; roughly 4000 large scans (sweeping scans covering a large number of ports) occurred each year; in all, over 22,500 ports were probed; and of those, no fewer than twelve ports were revisited up to ten times. Based on the analysis using Platfora, several areas were identified for additional investigation, and recommendations were made to improve the overall network security posture.
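Summary figures of this kind (distinct ports probed, ports revisited several times) can be derived from scan events along the following lines. This is a sketch under stated assumptions: the event layout, the sample data, and the choice to count a revisit as the same source touching the same port on a different day are mine, not necessarily the definitions used in the analysis above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical scan events: (source_ip, destination_port, day observed).
events = [
    ("203.0.113.5", 61000, date(2013, 1, 1)),
    ("203.0.113.5", 61000, date(2013, 1, 9)),
    ("203.0.113.5", 61000, date(2013, 2, 3)),
    ("203.0.113.5", 61001, date(2013, 1, 1)),
    ("198.51.100.7", 61000, date(2013, 1, 1)),
]

def visits_per_port(events):
    """Map (source_ip, port) to the number of distinct days it was seen."""
    days = defaultdict(set)
    for src, port, day in events:
        days[(src, port)].add(day)
    return {key: len(seen) for key, seen in days.items()}

def revisited_ports(events, min_visits):
    """Ports that some single source visited on at least min_visits days."""
    visits = visits_per_port(events)
    return sorted({port for (_, port), n in visits.items() if n >= min_visits})

ports_probed = {port for _, port, _ in events}
revisited = revisited_ports(events, min_visits=3)
```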

In order to put this article together, I used a four-node Hadoop cluster built using Cloudera CDH 4, IBM Pure Data for Analytics 2001, and Platfora’s exploratory BI tool for Hadoop.

Based on what I had read previously, my view of Platfora was that it was just a visualization package, but to my surprise it turned out to be a complete end-to-end data integration and visualization platform, fully integrated with Hadoop and Hive.

Finally, I would like to thank two contributors, Keith McClellan and Six3 Systems, for helping me pull this off, and Bob Gourley (CTO Vision) for posting my blog.

More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder of Crucial Point and publisher of CTOvision.com
