Am I the only one who has this? Let me know.
Phase 1: Discovery of New Product
Suddenly everybody talks about New Product. It’s said it changes everything. Articles about New Product appear on Hacker News for weeks. Then colleagues on LinkedIn even mention New Product (Warning! People you know, know New Product!). (Or they’re just linking to articles about New Product, so they look cool. Either way: they must know New Product!)
There are a lot of data-related Apache products out there and it’s hard to keep up with all of them. There are several products to stream or flow data (what’s the difference?), like Kafka, Storm, Flink and NiFi. Yes, all products have documentation, but to an outsider their descriptions all sound like “enterprise scalable streaming solutions”. What does that tell you?
I took a crash course on Apache NiFi at the DataWorks Summit in München last month and was quite impressed. At heart I’m a command line kind of guy, but this graphical interface is really slick, and it’s amazing what you can do with NiFi to find out where your data goes. I decided to organize a workshop for my colleagues at Open Circle Solutions.
“How did you get into Big Data?” is a question people have asked me a couple of times now. So let me give the answer in a blog post as well.
I’ve used eight sources of Big Data related knowledge and skills:
- Massive Open Online Courses (MOOCs)
- Meetups and summits
- Online documentation
- Hands-on experience
- Learning sites/“universities” of vendors
Posted in Learning Big Data
Tagged Apress, Big Data Expo, Big Data University, Cloudera, Coursera, Dataworks Summit, Drill to Detail podcast, EDX, Hortonworks, Learning Big Data, MapR, Massive Open Online Courses, MOOCs, O'Reilly, Packt Publishing, Roaring Elephant podcast, Udacity
Day two started with more keynotes. Ross Porter of Dell EMC talked about the ingredients of a successful analytics project. Carlo Vaiti of HP Enterprise gave an interesting talk about trends in big data, but I would advise him to let a professional presentation bureau go over his slides. They were perfect for a breakout session, but not so much for a keynote: the fonts were too small.
Just left the beer garden party at DataWorks Summit 2017 in München. Okay, let’s see how well I blog after three of these large beers. Luckily I took notes beforehand. Tell me when I start to become incoherent.
So actually, for me the summit started yesterday with the Partner day, but today the breakout sessions started. Yesterday I attended a cool hands-on session on Apache Ranger and Atlas. These are the new security and governance tools for Hadoop. And I think we can say that the days when you could call Hadoop largely insecure are almost over.
The first session I attended today was the keynote, of course. I was a little bit late, but it turned out that the second-row seats were largely available, so I had a good view of the speakers. A couple of them tried to convince us that (big) data was going to change everything. I already knew that.
After 20 years of working with Oracle products, I decided to take a new step: to become a data engineer. And that is just one term of the Big Data jargon I’m about to learn. I intend to use this blog to take you with me on this journey and to make sense of the new products and jargon I’m about to get familiar with.
So what is a data engineer? Is it just the DBA of the Big Data world where Hadoop has replaced the relational database? I’ll keep you informed.