GeneBackground4270

Data Engineers: How do you promote your open-source tools?

News & discussion on Data Engineering topics, including but not limited to: data pipelines, databases, data formats, storage, data modeling, data governance, cleansing, NoSQL, distributed systems, streaming, batch, Big Data, and workflow engines.

GeneBackground4270

What's your experience growing an open-source project?

A subreddit for everything open source related (for this context, we go off the definition of open source here http://en.wikipedia.org/wiki/Open_source)

GeneBackground4270

How Do You Handle Data Quality in Spark?

News & discussion on Data Engineering topics, including but not limited to: data pipelines, databases, data formats, storage, data modeling, data governance, cleansing, NoSQL, distributed systems, streaming, batch, Big Data, and workflow engines.

GeneBackground4270

If you love Spark but hate PyDeequ – check out SparkDQ (early but promising)

Articles and discussion regarding anything to do with Apache Spark.

GeneBackground4270

PyDeequ frustrated me — so I built SparkDQ (feedback wanted!)

A subreddit for everything open source related (for this context, we go off the definition of open source here http://en.wikipedia.org/wiki/Open_source)

GeneBackground4270

I built a PySpark data validation framework to replace PyDeequ — feedback welcome

The official Python community for Reddit! Stay up to date with the latest news, packages, and meta information relating to the Python programming language. --- If you have questions or are new to Python use r/LearnPython

GeneBackground4270

Goodbye PyDeequ: A new take on data quality in Spark

News & discussion on Data Engineering topics, including but not limited to: data pipelines, databases, data formats, storage, data modeling, data governance, cleansing, NoSQL, distributed systems, streaming, batch, Big Data, and workflow engines.