Published in Python in Plain English. How to Easily Extract Salesforce Data Using PySpark in Python. Extracting Salesforce data with PySpark can be a daunting task, especially when you’re faced with outdated modules and tricky dependencies… (Mar 4)
Published in Data Engineer Things. Metrics for Data Engineering: Why, How, and Which Metrics to Track. Data engineering can feel like a thankless job. You’re maintaining pipelines, handling data storage, and fielding endless requests without… (Feb 23)
Published in Data Engineer Things. Enhancing Code Reusability with Python Packages in AWS Glue. Code reusability is key to building efficient, maintainable, and scalable ETL pipelines. If you’re working with AWS Glue, leveraging Python… (Feb 20)
Published in Art of Data Engineering. How to Configure the GlueJobOperator in Apache Airflow. Data engineering often requires setting up workflows that seamlessly connect multiple tools. One common challenge is integrating Apache… (Feb 9)
Published in CodeX. Why Apache Airflow Shouldn’t Be Used as a Data Processing Engine. When working with Apache Airflow, it can be tempting to use its tasks and Python operators to extract, transform, and load (ETL) data… (Jan 12)
Published in Python in Plain English. Easy to Implement Speech-To-Text Tool. Transcribing with high-quality free tool using python. (Sep 21, 2022)
Published in CodeX. Speech Recognition 101. Brief introduction to automatic speech recognition concepts and how to apply it. (Sep 1, 2022)
Published in CodeX. Extracting Value from Non-Structured Data. A guide through data concepts to understand some possible ways to extract value from non-structured data. (Jul 13, 2022)