Texas Ballroom
Tuesday, October 4

10:00am CDT

Reactive Distributed Systems for Streaming Big Data, Analytics & ML

Being reactive in distributed systems is critical, but what does that really look like at scale, with terabytes or petabytes of data ingested per day, and what does it mean for applications and deployment architecture?

There is a need to simplify. How can we build resilient, self-healing systems that run at massive scale, don't lose data, and support rigorous requirements amid the chaos of big data, partial failures, split brain, and eventual consistency? How would you build awareness and intelligence into your systems if 'everything fails all the time' were the starting point?

This talk looks at these problems differently: it covers reactive strategies and technologies that work together, and how they help achieve more stable, self-aware systems.

Helena has been building large-scale, reactive, distributed cloud-based systems for many years and distributed big data systems for the last four, choosing Scala, Akka and Kafka as the core of them all. She will discuss simplification of big data architecture, data flows, and a collaborative set of supporting technologies.

Helena Edelson

CEO, The Axis Initiative
Helena is using AI and complex adaptive systems to study and help endangered species under climate change, biodiversity loss, human-wildlife conflict and illegal wildlife trade. Bridging academia and industry, she is a member of the Environmental Intelligence team of the Interagency...

Tuesday October 4, 2016 10:00am - 10:50am CDT
Texas Ballroom

11:20am CDT

Riding the Jet Streams
Java 8 introduced the Stream API as a modern, functional, and very powerful tool for processing collections of data. One of the main benefits of the Stream API is that it hides the details of iteration over the underlying data set, allowing for parallel processing within a single JVM, using a fork/join framework.
I will talk about a Stream API implementation that enables parallel processing across many machines and many JVMs.
You will learn how to use the API you already know from a single JVM to process massive data sets across large clusters.
With an explanation of the internals of the implementation, I will give an introduction to the general design behind stream processing using DAG (directed acyclic graph) engines, and show how an actor-based implementation can provide in-memory performance while still leveraging widely known industry frameworks such as the Java Streams API.
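
As a minimal sketch of the single-JVM behavior described above, the following uses only the standard `java.util.stream` API. The class and method names (`StreamSketch`, `sumOfEvenSquares`) are hypothetical and chosen purely for illustration; the sketch does not show the distributed implementation the talk covers. The point is that the pipeline never spells out how iteration happens, so the same code can run sequentially or in parallel.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical illustration class, not part of any library.
final class StreamSketch {

    // Sums the squares of the even numbers in the input.
    // The only difference between the two modes is how the stream is obtained.
    static long sumOfEvenSquares(List<Integer> numbers, boolean parallel) {
        Stream<Integer> stream = parallel ? numbers.parallelStream() : numbers.stream();
        return stream
                .filter(n -> n % 2 == 0)              // keep even numbers
                .mapToLong(n -> n.longValue() * n)    // square each one as a long
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1_000)
                .boxed()
                .collect(Collectors.toList());

        long sequential = sumOfEvenSquares(numbers, false);
        long parallel = sumOfEvenSquares(numbers, true);

        // Both modes compute the same sum; only the execution strategy differs.
        System.out.println("sequential=" + sequential + " parallel=" + parallel);
    }
}
```

Because the operations are stateless and the reduction is associative, switching between `stream()` and `parallelStream()` does not change the result, only how the work is scheduled.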

Viktor Gamov

Principal Developer Advocate, Kong 🦍
Viktor Gamov is a Senior Solution Architect at Hazelcast, the leading open-source in-memory data grid (IMDG). Viktor has comprehensive knowledge and expertise in enterprise application architecture leveraging open source technologies. He has helped leading organizations build low...

Tuesday October 4, 2016 11:20am - 12:10pm CDT
Texas Ballroom

2:10pm CDT

Scala and the JVM as a Big Data Platform - Lessons from Apache Spark

The success of Apache Spark is bringing developers to Scala.

For Big Data, the JVM uses memory inefficiently, causing significant GC challenges. Spark's project "Tungsten" is fixing these problems with custom data layouts and code generation.
In this talk, we'll see what we've learned from Spark, ongoing improvements, and what we should do to improve Scala and the JVM for Big Data.

Dean Wampler

VP of Rocket Surgery, Lightbend
Dean Wampler, Ph.D., is the VP of Fast Data Engineering at Lightbend. He leads the development of Lightbend Fast Data Platform, a distribution of scalable, distributed stream processing tools including Spark, Flink, Kafka, and Akka, with machine learning and management tools. Dean...

Tuesday October 4, 2016 2:10pm - 3:00pm CDT
Texas Ballroom

3:10pm CDT

Microservices: The danger of overhype and importance of checklists
It is hard to argue that microservices are not the new hotness in the world of software. Technical teams in seemingly all new startups are planning to develop their products using microservices, while architects in Fortune 500 companies are holding endless conversations about which microservices technology is best for their next-generation platform. The overhype of microservices is here, and with it comes the danger of losing the true meaning of the word, which might soon be labeled as overrated. Before that happens, let's talk about what microservices architecture is, what it can offer, what it gets us into, what its price tag is, and how to create a checklist for choosing the right tools that help solve the complexity instead of sugarcoating it.

Katrin Shechtman

Enterprise Architect, Lightbend Inc.
Software Engineer with years of experience developing large platforms in C, C++, Java and Scala utilizing many different frameworks. Currently works at Lightbend as Enterprise Architect helping big enterprises embrace a world of Reactive Systems and Big Data.

Tuesday October 4, 2016 3:10pm - 4:00pm CDT
Texas Ballroom

4:30pm CDT

Reactive Stream processing at Netflix
Netflix customers stream over two billion hours of content each month, accounting for over a third of downstream Internet traffic during peak hours. At this scale, Netflix's systems generate and collect millions of events every second, such as request traces, streaming client activities, and system metrics. It is essential for engineers to process such data streams efficiently and reliably to support real-time monitoring and alerting, outlier detection, application diagnostics, trend prediction, and many other operations.

This talk will discuss Netflix's stream processing system, Mantis, which supports a reactive programming model, allows auto scaling, and is capable of processing millions of messages per second with configurable message delivery guarantees.

Mantis is a stream processing framework built on reactive principles using RxJava, Netty and ReactiveSocket. It provides users with the capabilities to write scalable stream processing jobs without having to worry about hard problems such as managing continuous data flow in a distributed environment or ensuring fault tolerance.

Neeraj Joshi

Senior Software Engineer, Netflix
Neeraj is a Senior Software Engineer on the Netflix Edge Realtime Events team. He has over 12 years of experience in the industry building highly scalable and resilient systems. He designed and developed Scryer, Netflix’s predictive autoscaling engine, and is the key contributor to...

Nick Mahilani

Senior Software Engineer, Netflix
Nick is a Senior Software Engineer on the Netflix Edge Realtime Events team. He has over 10 years of software engineering experience building a range of software systems from embedded routing/switching software to highly scalable distributed microservices.

Tuesday October 4, 2016 4:30pm - 5:20pm CDT
Texas Ballroom

5:30pm CDT

Functional (and Reactive) Operations
If we were starting greenfield development of a service or web application today we would likely employ a number of practices and design choices that are known to optimise application responsiveness, resiliency, elasticity, and/or composability. Delivering our reactive applications on top of predictable infrastructure will set our project up for success.
Some of us don't have that luxury. We must provision, deploy, and operationally maintain legacy monolithic Rails web applications and HTTP APIs that are hard to refactor without introducing new bugs, perform poorly, and struggle to meet user load and peak demand. They were built during a prior era of the company when fast and loose practices were rewarded: startup cowboys delivered the first set of features promptly at the expense of subsequent velocity and long-term maintainability, and at the cost of high-risk deployments. In this reality, our infrastructure must be reliable, or our application will need constant babysitting, leading to on-call fatigue and high staff turnover.
The good news is there are core principles we can apply to produce more reproducible systems, failstop deployments, and consistent environment configurations to eliminate a large class of bugs inherent in legacy applications and minimize related business risks. This will be the focus of the session and applies to both greenfield and legacy cases.
Code examples are given using NixOS and Haskell, but the focus remains on the underlying principles.

Susan Potter

Distributed Systems Engineer, Referential Labs
Susan is a distributed systems engineer straddling technical operations and engineering helping make data and service infrastructure operationally manageable at scale. Over the last seventeen years she has worked on algorithmic trading systems, market data software, multi-tenant service...

Tuesday October 4, 2016 5:30pm - 6:20pm CDT
Texas Ballroom
Wednesday, October 5

10:00am CDT

Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka @PayPal
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action, especially use cases where the amount of work increases as you process, which make you really value the back-pressure.

This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.

In addition, we will provide some constructive “rants” about the architectural components and the maturity, or immaturity, you can expect, along with tidbits and open source goodies, such as memory-mapped stream buffers, that can be helpful in other Akka Streams and/or Kafka use cases.
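
The core idea of back-pressure can be sketched without Akka Streams or Kafka. The following is a plain-JDK analogy, not the speakers' implementation, and every name in it (`BackPressureSketch`, `run`) is hypothetical. It assumes a fast producer and a deliberately slow consumer that share a bounded `ArrayBlockingQueue`: when the consumer falls behind, `put` blocks the producer, so the buffer never grows past its capacity.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of Akka Streams or Kafka.
final class BackPressureSketch {

    // Runs a fast producer against a slow consumer and returns
    // {itemsConsumed, largestBufferSizeObservedByProducer}.
    static int[] run(int items, int capacity) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        AtomicInteger maxObserved = new AtomicInteger();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    buffer.put(i); // blocks while the buffer is full
                    maxObserved.accumulateAndGet(buffer.size(), Math::max);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int consumed = 0;
        try {
            while (consumed < items) {
                buffer.take();   // blocks while the buffer is empty
                Thread.sleep(1); // simulate slow downstream work
                consumed++;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return new int[] {consumed, maxObserved.get()};
    }

    public static void main(String[] args) {
        int[] result = run(50, 4);
        System.out.println("consumed=" + result[0] + " maxBuffered=" + result[1]);
    }
}
```

A blocking queue slows the producer by parking its thread, whereas Akka Streams signals demand asynchronously; the sketch only illustrates the bounded-buffer effect, not that mechanism.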

Anil Gursel

Software Engineer, PayPal
Anil Gursel has been working on the JVM since 2004. His current focus is to implement reactive applications using Scala and Akka. He is a Software Engineer at PayPal’s Infrastructure team where he helps teams build highly scalable, low latency applications. Anil is a big advocate...

Akara Sucharitakul

Principal MTS, Architect, PayPal
Akara Sucharitakul founded project squbs (pronounced s-cubes) for Internet scale Akka productionalization. He works in the PayPal infrastructure team, on both squbs and messaging. Akara is a 20 year veteran of the JVM from its very early days and a veteran of Sun for 15 years...

Wednesday October 5, 2016 10:00am - 10:50am CDT
Texas Ballroom

11:20am CDT

Stream Processing with Apache Flink in Zalando's World of Microservices
In this talk we present Zalando's microservices architecture, introduce Saiki, our next-generation data integration and distribution platform on AWS, and show how we employ stream processing for near-real-time business intelligence.

Zalando is one of the largest online fashion retailers in Europe. In order to secure our future growth and remain competitive in this dynamic market, we are transitioning from a monolithic to a microservices architecture and from a hierarchical to an agile organization.

We first look at how business intelligence processes have worked inside Zalando over the last few years and present our current approach, Saiki. It is a scalable, cloud-based data integration and distribution infrastructure that makes data from our many microservices readily available for analytical teams.

We no longer live in a world of static data sets, but are instead confronted with an endless stream of events that constantly inform us about relevant happenings from all over the enterprise. Processing these event streams enables us to do near-real-time business intelligence. In this context we have evaluated Apache Flink vs. Apache Spark in order to choose the right stream processing framework. Given our requirements, we decided to use Flink as part of our technology stack, alongside Kafka and Elasticsearch.

With these technologies we are currently working on two use cases: a near real-time business process monitoring solution and streaming ETL.

Monitoring our business processes enables us to check whether the Zalando platform works technically. It also helps us analyze data streams on the fly (e.g. order velocities and delivery velocities) and monitor service level agreements.

Streaming ETL, on the other hand, is used to take load off our relational data warehouse, which struggles with increasingly high loads. It also reduces latency and improves platform scalability.

Finally, we give an outlook on our future use cases, e.g. near-real-time sales and price monitoring. Another aspect to be addressed is lowering the entry barrier to stream processing for our colleagues coming from a relational database background.

Javier Lopez

Big Data Engineer, Zalando
Javier is a Colombian Engineer from the National University of Colombia. During his bachelor studies he focused on Software Engineering, Telecommunication technologies and Business Intelligence. After working more than 7 years as Software Engineer in different industries (Education...

Mihail Vieru

Big Data Engineer, Zalando
Mihail is passionate about designing and implementing highly scalable, performant and robust data processing solutions. He enjoys continuously learning and working with cutting edge technologies. Mihail earned a Master's degree from the Humboldt University of Berlin, Germany, where...

Wednesday October 5, 2016 11:20am - 12:10pm CDT
Texas Ballroom

2:10pm CDT

Monolith to reactive microservices
In this talk, we will review the experience of rearchitecting a system that appeared reactive and microservice-based, but was in fact a monolith with RPC calls, and migrating it to a truly reactive architecture.

The migration work had to be done without causing disruption to the current system, and without taking time to rewrite the system. The result is a biometric computer vision system with a distributed domain in Akka / Scala with storage in Apache Cassandra, with the computer vision components in OpenCV in C++, connected with RabbitMQ and with batch analytics code in Apache Spark.

This talk will show the architectural and code smells that resulted from a half-hearted reactive implementation and ways to address them, as well as the impact of the changes on the privacy and security of the stored biometric information.

Jan Machacek

Senior Principal Engineer, Disney Streaming Services
I am a passionate engineer & author with practical experience in architecting actor-based systems, well-engineered machine learning systems, and matching infrastructure. I have hands-on experience in the architecture & implementation of: Connected health systems (signal processing...

Wednesday October 5, 2016 2:10pm - 3:00pm CDT
Texas Ballroom

3:10pm CDT

Schema upgrades in a continuous delivery environment
You have a 24/7/365 environment up and running with dozens of services and dozens of instances of each. Everything is going well. Then a new feature is requested, and it requires a change to one of your datastores. Since you have a few dozen instances of that service running, you need to perform a rolling upgrade. In this talk, we will discuss some strategies that make such upgrades possible.

David Buschman

Technical Lead, Timeli, Inc.
Dave Buschman is the Technical Lead for the Colorado-based startup company Timeli (timeli.io). We operate in the meter and sensor data collection and analytics space. His experience in the Java Enterprise SaaS space led him to move to Scala a few years back, where he is now...

Wednesday October 5, 2016 3:10pm - 4:00pm CDT
Texas Ballroom

4:30pm CDT

Embracing Streams…Everywhere

“I don’t need stream processing because I don’t have streaming data.” - Anonymous
In general, people think of streams either as data models that are naturally streaming (infinite asynchronous messages) or as part of real-time big data processing. In this talk, Nitesh Kant will try to dispel this myth by emphasizing that streams exist everywhere, be it data read from sockets, protocols like HTTP, or microservice composition. He will explain how extending this ubiquitous interaction model into applications can result in simpler, more resilient, and more maintainable systems.
You will learn how to start thinking “streaming first” through concrete examples, and how adopting this mental model makes writing applications easier.

Nitesh Kant

Senior Software Engineer, Undisclosed
Nitesh Kant is a veteran in design and development of reliable distributed systems with over 15 years of experience in this space. He is a core contributor of ServiceTalk, an open source networking library for the JVM that supports multiple protocols including gRPC. Previously, he...

Wednesday October 5, 2016 4:30pm - 5:20pm CDT
Texas Ballroom

5:30pm CDT

Implementing an akka-streams materializer for big data
Akka Streams provides a tremendously flexible architecture for building reactive pipelines that can be imported, exported and otherwise composed as partial DAGs. Its current default implementation is ActorMaterializer, which provides reactive streams across actors within a single JVM.
Here we show how we implemented a GearpumpMaterializer, which distributes reactive streams across a set of remote workers on the Apache Gearpump platform. We discuss how this was implemented and a number of challenges we faced with specific GraphStages and their semantics. Additionally, we cover how different materializer implementations can interoperate to materialize different parts of the pipeline. We show that the changes we introduced internally within Akka Streams will enable other implementations of Akka Streams materializers, and we suggest a template based on our implementation. We will contribute the Gearpump materializer as open source to https://github.com/akka/akka-stream-contrib or make it available as part of an upcoming Apache Gearpump release.

Kam Kasravi

Senior Software Engineer, Intel Corp
Kam is working on deep learning systems and Kubeflow.

Wednesday October 5, 2016 5:30pm - 6:20pm CDT
Texas Ballroom