A great time at PyCon Ireland 2024

I think it was last year when I announced that I wanted to go back to conferences again, preferably as a speaker. But which conference is the best for data engineers? I couldn’t quite figure it out. Then the call for papers for PyCon Ireland 2024 came by on my socials and I thought “why not Python?” I do lots of it, even though it’s not always work related. And I’ve never been to Ireland. So I submitted two sessions. One got selected right away. I booked my flight and hotel and off I went last Friday (November 15 2024).

Day 1

Let me first of all say that I found the quality of the presentations very good. They were interesting and I was able to follow the topics quite well.

Jaroslav Bezděk started off with a talk about pandas and DuckDB. I know pandas, but I wanted to know more about the second one. Jaroslav’s talk confirmed for me what I already suspected: it is not that hard to start with DuckDB. Nice to see some examples. Certainly a tool I want to try out. I found his slides on Github: https://github.com/jardabezdek/talk-zoology-101.
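To illustrate what I mean by a low barrier to entry: this is roughly what getting started looks like. A minimal sketch of my own, not taken from Jaroslav’s slides, and the data is made up.

```python
# Minimal DuckDB sketch: query a pandas DataFrame (or files) with plain SQL.
import duckdb
import pandas as pd

# DuckDB can run SQL directly against a pandas DataFrame that is in scope.
df = pd.DataFrame({"animal": ["duck", "panda", "duck"], "sightings": [3, 1, 5]})
result = duckdb.sql("SELECT animal, SUM(sightings) AS total FROM df GROUP BY animal")
print(result.df())

# It can also query CSV or Parquet files the same way, e.g.:
# duckdb.sql("SELECT * FROM 'zoo.parquet' LIMIT 10").show()
```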

Then it was my turn to talk about e-ink displays. I’ve grown fond of these devices, combined with Raspberry Pis. So I discussed four ways I’ve used them. The slides can be found on my Github page: https://github.com/Marcel-Jan/talk-eink-dashboards.

It was so much fun to be a speaker at a conference again and the audience was very welcoming. Many people talked to me after this and my other presentation and it really makes me want to do this more often.

Mihai Creveti talked about what AI agents can and cannot do. I’ve now got a clearer picture of what agents are good for. He also gave examples of tools that are currently used for this. There’s so much in the AI landscape nowadays. It’s good to know where the field is going.

The talk about testable pipelines by Florian Stefan was of great interest to me. He showed how he uses DBT (Data Build Tool) with dbt_expectations for testing pipelines. dbt_expectations is not limited to simple value checks: it can also check whether quantiles of column values fall within an expected range. So that goes further than just checking whether columns contain expected values.

Mark Smith from MongoDB demoed an AI agent that can send real-world text messages with excuses why he could not make it to the office. And by giving that agent a memory, you can make sure it won’t send the same excuse twice. A weird use case, but a clear demonstration of the idea.

Like me, Cosmin Marian Paduraru has used Python to solve a personal use case. He wanted to know if he could use visual intelligence to identify new items for his collection of bottle caps and avoid duplicates. He showed what technology he used and what obstacles he encountered. And he showed you don’t always need the latest and largest algorithm for this kind of work. Not bad at all for his first ever conference presentation.

James Shields from Bank of America talked about how to get a culture of innovation at your company. I know from experience that building a culture of innovation can be hard: it’s hard to find the time and to get everyone involved, including management. And even if everyone is willing to innovate, it doesn’t always happen. At Bank of America they use hackathons. And even that is not everyone’s cup of tea. But still they are making great progress.

The Github ecosystem doesn’t just support DevOps; it can also support DevSecOps. That’s what the talk by Eoin and Tom Halpin was about. With a down-to-earth example they showed how they use Github Actions and workflows not only for automated testing, but also for vulnerability scans. You can find their repo here: https://github.com/genai-musings/chatting-with-ChatGPT. I finally understand what those badges on a Github page are for. Certainly something I want to try out, by the way.

Next I went to Paul Minogue’s presentation about vector databases. He discussed what vector databases are good for and shared his research on the matter. There were some surprises for me, for example that OpenSearch can work well as a vector data store. He also shared the challenges he encountered. I already knew about embeddings, but I learned a lot about ways to search through embeddings when there are a lot of them and performance is no longer good enough.
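To show why performance becomes a concern: the naive approach compares the query against every single embedding. A quick sketch of my own (random vectors, not from Paul’s talk); vector databases avoid exactly this full scan by using approximate indexes such as HNSW.

```python
# Brute-force nearest-neighbour search over embeddings (the thing vector
# databases are built to avoid at scale). Vectors here are random stand-ins.
import numpy as np

rng = np.random.default_rng(42)
embeddings = rng.normal(size=(10_000, 384))            # pretend document embeddings
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

query = rng.normal(size=384)
query /= np.linalg.norm(query)

# On normalised vectors, cosine similarity is just a dot product.
scores = embeddings @ query
top5 = np.argsort(scores)[::-1][:5]
print(top5, scores[top5])
```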

Florenz Hollebrandse discussed the modular approach they use at JPMorganChase to make sure that when choosing solutions they don’t paint themselves into a corner. They decouple business logic from platform/deployment concerns. This way they are able to reuse more generic software. JPMorganChase open sourced a solution for this, which you can find on Github: https://github.com/jpmorganchase/inference-server.

Then it was my turn again. I was asked if I could prepare my other submission as a backup presentation. So I finished my presentation on how I use Python to prepare for my astronomy podcast the evening before, in my hotel room. I had a lot to share about how I use ChatGPT to categorise astronomy news articles, and how I use embeddings to find similar articles that don’t need categorising anymore (which reduces the bill). And flattening the embeddings to 2D or 3D allows you to make nice graphs. My slides can be found here: https://github.com/Marcel-Jan/talk-python-astropodcast
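Roughly the idea in code: a simplified sketch, not the actual podcast pipeline. The embeddings below are random stand-ins for the ones I get back from an embeddings API, and the 0.9 similarity threshold is only illustrative.

```python
# Find near-duplicate articles by cosine similarity and flatten embeddings
# to 2D for plotting. Embeddings here are random stand-ins.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
article_embeddings = rng.normal(size=(200, 1536))      # stand-in article embeddings

# Articles very similar to an already categorised one can skip the LLM call.
similarities = cosine_similarity(article_embeddings)
np.fill_diagonal(similarities, 0.0)
duplicate_pairs = np.argwhere(similarities > 0.9)

# Flatten to 2D so articles can be plotted and clusters become visible.
coords_2d = PCA(n_components=2).fit_transform(article_embeddings)
print(duplicate_pairs[:5], coords_2d[:3])
```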

And then there were the lightning talks, where people can quickly share a topic of interest. It doesn’t always have to be directly Python related. That’s how we got a talk about the Swedish vessel Vasa, which sank fairly quickly after barely leaving the harbour (“I’ve been on projects like this before”). There was also a daring live demo of pre-commit, a tool that won’t let you commit unless your code adheres to certain standards, and a talk about creating a Telegram bot with Firestore.

The day ended with pizza and fries. And good conversations with people I didn’t know before. This is such a nice community, with many people wanting to share their knowledge with everyone else. It’s heartwarming.

Day 2

On day 2 you could follow all kinds of workshops. And yes, a lot of them were RAG and AI agent related. A nice chance to try that out.

So I learned to use a vector database (MongoDB) and RAG to create an agent. An important concept I learned here was chunking, which breaks long text into smaller pieces so the retrieval step can embed them and find the relevant passages efficiently.
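In its simplest form, chunking looks something like this. My own sketch, not the workshop code, and the sizes are arbitrary; real projects often split on sentences or paragraphs rather than raw character counts.

```python
# Fixed-size character chunking with a small overlap between chunks,
# so context that straddles a boundary isn't lost entirely.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

document = "lorem ipsum " * 500    # placeholder for a long document
print(len(chunk_text(document)))
```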

Then I thought about following a workshop on scraping, but the room was already very full. So why not do more RAG? You can never do enough RAG. So I followed Shekhar Koirala’s and Shushanta Pudasaini’s workshop, which was about multi-modal RAG. In practice that means you feed your RAG pipeline a PDF and it extracts the text, images and tables from it separately, which you can then use in an agent.
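A sketch of what that extraction step can look like with PyMuPDF (fitz). This is not the workshop’s code and “report.pdf” is a placeholder; the table detection needs a reasonably recent PyMuPDF version.

```python
# Pull text, embedded images and detected tables out of a PDF, page by page.
import fitz  # PyMuPDF

doc = fitz.open("report.pdf")        # placeholder file name
texts, images, tables = [], [], []

for page in doc:
    texts.append(page.get_text())                    # plain text per page
    for xref, *_ in page.get_images(full=True):      # embedded images
        images.append(doc.extract_image(xref)["image"])
    for table in page.find_tables().tables:          # detected tables
        tables.append(table.extract())               # rows as lists of cells

print(len(texts), len(images), len(tables))
```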

And lastly I followed Cheuk Ting Ho’s workshop on Polars and Polars extensions. Polars is pandas’ faster sister; to achieve that, it’s written in Rust. So for the first time I’ve installed Rust on my laptop. I managed to follow the entire workshop. There’s still a lot I need to learn before I can use this at work, but I’m getting the hang of it.
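To give a quick taste of the Polars expression API: a small example of my own, not from the workshop material, and just plain Polars rather than the Rust-extension part.

```python
# The kind of pandas-like operation Polars speeds up: filter, group, aggregate.
import polars as pl

df = pl.DataFrame({
    "species": ["duck", "panda", "duck", "polar bear"],
    "weight": [3.2, 95.0, 2.8, 450.0],
})

result = (
    df.filter(pl.col("weight") > 3.0)
      .group_by("species")
      .agg(pl.col("weight").mean().alias("avg_weight"))
)
print(result)
```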

And that was the end of PyCon Ireland 2024. I must say I’ve enjoyed it very much.

I stepped out of the Radisson Blu and found myself looking for a new purpose for the rest of the day. I decided to explore the area around Trinity College a little bit, because tomorrow I want to visit the Old Library and a couple of other sights in the area.

I’ve also very much enjoyed dinner. I went to The Winding Stair next to the river Liffey. What an excellent restaurant. I’ll likely visit them again later this week.
