The session will cover what each of the Big Data tools Kafka, Spark, and Cassandra is, the value each provides, and how to architect and build real-time solutions with them. This talk is based on solutions previously implemented with clients. We will review the reasons for creating these solutions, the business value they bring, how to integrate with existing source systems, and integration best practices. We will also walk through a demo solution built with these tools, and compare the advantages of running each of Kafka, Spark, and Cassandra as open source versus the benefits of paying for a commercial distribution.
Attendees will learn how to extend their existing Oracle solutions with additional Big Data tools and capabilities. They will also learn about the business, statistical, and technical considerations that come into play when creating Big Data solutions.
We will review the solution architecture, including how to size the infrastructure in the cloud and on premises. The solution components fit together as follows: Kafka is the messaging system that transports data from the source systems. Spark Streaming pulls data out of Kafka, performs ETL and Spark machine learning analytics, and then writes the results into the Cassandra database for storage. Solr is also installed on one of the Cassandra nodes to enable fast text search. We will also review how to technically architect each piece of the solution in Kafka (how to build the data pipelines), Spark (how to build the ETL), and Cassandra (how to build your data models), and how to integrate these tools to work together.
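To make the data flow above concrete, here is a minimal Python sketch of the per-record ETL a Spark Streaming job might apply between Kafka and Cassandra. The topic's message shape, the field names, and the `metrics.readings` table are illustrative assumptions, not details of the demo solution; in a real deployment this logic would run inside Spark with the Spark-Cassandra connector rather than as plain functions.

```python
import json
from datetime import datetime, timezone

# Assumed message shape on the Kafka topic: source systems publish JSON
# events like {"sensor_id": "...", "ts": <epoch seconds>, "reading": <float>}.

def parse_event(raw: bytes) -> dict:
    """Deserialize one Kafka message value (JSON bytes) into a dict."""
    return json.loads(raw.decode("utf-8"))

def transform(event: dict) -> dict:
    """The ETL step applied to each record in a micro-batch: normalize the
    timestamp to UTC ISO-8601 and flag out-of-range readings."""
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    return {
        "sensor_id": event["sensor_id"],
        "event_time": ts.isoformat(),
        "reading": float(event["reading"]),
        "anomalous": not (0.0 <= event["reading"] <= 100.0),
    }

def to_cql(row: dict) -> tuple:
    """Bind values for a parameterized Cassandra INSERT (executed by a
    driver or the Spark-Cassandra connector in a real deployment)."""
    cql = ("INSERT INTO metrics.readings "
           "(sensor_id, event_time, reading, anomalous) "
           "VALUES (%s, %s, %s, %s)")
    return cql, (row["sensor_id"], row["event_time"],
                 row["reading"], row["anomalous"])
```

For example, `transform(parse_event(b'{"sensor_id": "s1", "ts": 0, "reading": 42.5}'))` yields a clean row ready to bind into the insert statement.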
6:30 – 7:00 PM: Networking and Pizza
7:00 – 7:10 PM: IoT State of the Union by Swathi
7:10 – 8:00 PM: Main Presentation by Jeff
1) Introduction to Big Data tool Solution Technologies
2) Understanding the business value of Big Data Tool Technologies
3) Understanding how to architect the infrastructure for the Big Data Tool Solutions
4) How to build the Big Data Tools
a. Understanding how to create data messaging streams in Kafka
b. Understanding how to create ETL using Spark
c. Understanding how to architect the database using Cassandra
5) Demo showing the Big Data Tool Solution
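As a preview of agenda item 4c, Cassandra data models are designed around the read query, and wide time-series partitions are commonly bucketed so no single partition grows without bound. The sketch below illustrates day bucketing in Python; the table, column names, and bucket granularity are assumptions for demonstration, not the demo solution's actual schema.

```python
from datetime import datetime, timezone

# Assumed Cassandra table, modeled query-first with a daily bucket in the
# partition key so one sensor's data cannot form an unbounded partition:
#
# CREATE TABLE metrics.readings (
#     sensor_id text,
#     day text,            -- partition bucket, e.g. "2017-06-21"
#     event_time timestamp,
#     reading double,
#     PRIMARY KEY ((sensor_id, day), event_time)
# ) WITH CLUSTERING ORDER BY (event_time DESC);

def day_bucket(epoch_seconds: float) -> str:
    """Derive the daily partition bucket from an event timestamp (UTC)."""
    return datetime.fromtimestamp(
        epoch_seconds, tz=timezone.utc).strftime("%Y-%m-%d")

def partition_key(sensor_id: str, epoch_seconds: float) -> tuple:
    """Composite partition key: all of one sensor's readings for one day
    land in the same partition, so a query for that sensor and day is a
    single-partition read."""
    return (sensor_id, day_bucket(epoch_seconds))
```

With this layout, "latest readings for sensor s1 today" touches exactly one partition, which is the access pattern Cassandra serves best.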
8:00 – 8:30 PM: Q&A / Discussion / Open Mic
Jeff Shauer: Jeff Shauer has been building complex automated analytical solutions for the past 13 years, working with a variety of Big Data, Business Analytics, Business Intelligence, and Visualization technologies to help clients. Currently, he is building real-time analytical solutions over petabytes of data using Kafka, Spark, and Cassandra. Jeff helps clients by understanding their business and technical problems and coming up with agile, automated solutions. He has experience developing, maintaining, owning, and managing big data and financial planning systems at small, medium, and Fortune 500 corporations, as well as for public-sector clients. Jeff holds a Master's degree in the Management of Information Technology and a Bachelor of Science degree in Commerce, both from the McIntire School of Commerce at the University of Virginia. Jeff is the owner of JS Business Intelligence, a Washington, DC-based consultancy that builds Big Data and Hyperion solutions.
Swathi Sambhani: Swathi Sambhani has over 19 years of experience working in various tech roles at a number of Fortune 500 enterprises. She has applied her experience of launching products in large enterprises to help startups take their ideas to market within 90 days. She runs the DC Emerging Tech group, which showcases successful enterprise use cases of technologies such as IoT, machine learning, and blockchain.