Tag Archives: data quality

Profiling data with ydata in PySpark

When you have a dataset to explore, there are several ways to do that in PySpark. You can run a describe or a summary. But if you want something a little more advanced, and if you want to get a … Continue reading

Posted in Data management, Python, Spark

My experiences with Azure Purview

At my last customer I worked extensively with Ataccama, a data management product. It has a data catalog that stores metadata on datasets, and it can run data quality checks. In Azure, Microsoft has a data management product too. … Continue reading

Posted in Azure, Data engineering, Data management