You’ve heard the stories: IoT, Sensors, Mobile-everything-everywhere in real-time is leading us to a volume/velocity/variability data Armageddon. Will anything save us? Rest assured there is a world-class group of data fanatics working on it. How do I know? Because I just returned from a data immersion of the highest order – Hadoop Summit 2016 in San Jose, CA.
If you can believe that someone actually makes sense trying to convince you that “The only thing that now exists is Data” or that the unreasonable effectiveness of Algorithms, Cloud, IoT, and Data is something you need to be keenly aware of, then this was your place. And I have to admit, I ducked into the rabbit hole and sought out my data nirvana.
This was no small feat. I haven’t seen as much code, math, and whiteboard-screenshots thrown up on a big screen in front of me since my freshman year in engineering school. These folks came to play, and it required a quick mind, a fast laptop, and battery power, lots of battery power. Oh, and bandwidth, tons of bandwidth, because this is the open-source community and you better believe this stuff works: they put it out there in Git repositories, with gobs of example datasets and 1-click YARN/Docker deployments, so you can download it right then and there, test it, and enjoy. Go ahead, I dare you. There are JIRA tickets waiting for you.
At these events, I like to have a topic, say a meme, or a thread that guides my interest. Last week, it was data streaming analytics. Why was that on my mind? Well, at SSG we’ve got this thing called the “TAC”, the Technical Advisory Committee. Sounds heavy, but it is more like a live, in-person, no-holds-barred Slack-like gathering where you get to hear the voices and engage in the conversation, which is way more fun than just reading the text and emoticons. It is an open forum. We talk, muse, and brainstorm about anything and everything. Some of it is pretty out-there (at least some of us think it might be) and much of it is really cool. We’ve come up with this idea about a foundational change, and it is all about streaming data, forever, infinitely persisted and immutable. Yep, no more data warehousing, no more ETL. MDM? Nope, don’t need it. It’s all in the stream. Get it when you need it. And you know what? #HS16SJ was full of this stuff! Apache Storm, Flink, Spark Streaming, Apex, and Kafka came up time after time. The Stream IS the Datastore. A change is coming, a big one. Get ready, because I am not making this up. Excuse me, I have some Git files to go download.
Mike Moses is an Apache NiFi fanatic and head sales guy at SSG. An engineer by training and a philosopher at heart, his passions span decades of leading-edge and bleeding-edge tech revolutions. Building on a background in engineering and solution design, he has focused on enterprise software and data management implementations for the past 20 years. Surrounded by one of the finest teams of developers on the planet, he has become an evangelist for revolutionary change in the way software is created and data is set free. Analytics, modeling, and machine learning run through his story of bringing novel and game-changing solutions to clients and to the world.