Updated Oct 8, 2021 - Python
pyspark-sql
Here are 8 public repositories matching this topic...
Project based on an application of Azure Databricks.
Updated Mar 7, 2023 - Python
This code demonstrates how to integrate PySpark with datasets and perform simple data transformations. It loads a sample dataset using PySpark's built-in functionalities or reads data from external sources and converts it into a PySpark DataFrame for distributed processing and manipulation.
Updated Mar 31, 2025 - Python
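As a rough illustration of the kind of workflow that description refers to, the sketch below turns a small in-memory dataset into a PySpark DataFrame and applies a couple of simple transformations. The rows, column names, and function names are hypothetical and are not taken from the repository. PySpark is imported inside the functions so the module can be loaded without a Spark installation.

```python
# Hypothetical sample rows; a real project would read a file or an external source.
SAMPLE_ROWS = [
    ("Alice", "engineering", 5200.0),
    ("Bob", "engineering", 4800.0),
    ("Carla", "sales", 3900.0),
    ("Dmitri", "sales", 4100.0),
]
COLUMNS = ["name", "department", "salary"]


def average_salary_by_department(spark, rows=SAMPLE_ROWS):
    """Build a DataFrame from local rows and aggregate it per department."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql import functions as F

    df = spark.createDataFrame(rows, schema=COLUMNS)
    return (
        # Derive a new column: salary expressed in thousands, one decimal place.
        df.withColumn("salary_k", F.round(F.col("salary") / 1000, 1))
        .groupBy("department")
        .agg(F.avg("salary_k").alias("avg_salary_k"))
        .orderBy("department")
    )


def main():
    """Run the sketch on a local single-threaded Spark session."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("df-sketch").getOrCreate()
    )
    try:
        average_salary_by_department(spark).show()
    finally:
        spark.stop()
```

Calling `main()` on a machine with PySpark and a Java runtime should print one row per department.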
Generate a synthetic dataset with one million records of employee information from a fictional company, load it into a PostgreSQL database, create analytical reports using PySpark and large-scale data analysis techniques, and implement machine learning models to predict trends in hiring and layoffs on a monthly and yearly basis.
Updated Apr 22, 2025 - Python
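The synthetic-dataset step in that description could look something like the following sketch, which uses only the Python standard library. The field names, departments, value ranges, and attrition rate are assumptions made for illustration, not details from the repository, and the example generates far fewer than one million rows.

```python
import csv
import io
import random
from datetime import date, timedelta

# Hypothetical schema and categories, chosen only for this example.
FIELDNAMES = ["employee_id", "department", "salary", "hire_date", "termination_date"]
DEPARTMENTS = ["engineering", "sales", "support", "finance"]


def generate_employees(n, seed=42):
    """Return n reproducible synthetic employee records as dictionaries."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    start = date(2015, 1, 1)
    rows = []
    for emp_id in range(1, n + 1):
        hired = start + timedelta(days=rng.randrange(3650))
        left = None
        # Assumption: roughly one in five employees has left the company.
        if rng.random() < 0.2:
            left = hired + timedelta(days=rng.randrange(30, 1500))
        rows.append(
            {
                "employee_id": emp_id,
                "department": rng.choice(DEPARTMENTS),
                "salary": round(rng.uniform(30000, 120000), 2),
                "hire_date": hired.isoformat(),
                "termination_date": left.isoformat() if left else "",
            }
        )
    return rows


def to_csv(rows):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def hires_per_month(rows):
    """Count hires per 'YYYY-MM' month, a plain-Python stand-in for a report."""
    counts = {}
    for row in rows:
        month = row["hire_date"][:7]
        counts[month] = counts.get(month, 0) + 1
    return counts
```

A CSV produced this way could then be bulk-loaded into PostgreSQL, for example with its `COPY` command, before the analysis is done in PySpark.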
This script builds a linear regression model using PySpark to predict student admissions at Unicorn University.
Updated Apr 25, 2024 - Python
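For readers unfamiliar with linear regression in PySpark, a minimal sketch using `pyspark.ml` is shown below. The feature names, label name, and all data values are made up for illustration; the repository's actual dataset and columns are not known here.

```python
# Hypothetical column names and invented values, for illustration only.
FEATURE_COLUMNS = ["gre_score", "gpa"]
LABEL_COLUMN = "admit_chance"
SAMPLE_ROWS = [
    (300.0, 3.0, 0.45),
    (310.0, 3.2, 0.55),
    (315.0, 3.5, 0.65),
    (320.0, 3.6, 0.72),
    (328.0, 3.8, 0.83),
    (335.0, 3.9, 0.92),
]


def fit_admission_model(spark, rows=SAMPLE_ROWS):
    """Fit a linear regression of the label on the feature columns."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegression

    df = spark.createDataFrame(rows, schema=FEATURE_COLUMNS + [LABEL_COLUMN])
    # pyspark.ml estimators expect the features packed into one vector column.
    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    lr = LinearRegression(featuresCol="features", labelCol=LABEL_COLUMN)
    return lr.fit(assembler.transform(df))


def main():
    """Train on the toy rows and print the fitted parameters."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("lr-sketch").getOrCreate()
    )
    try:
        model = fit_admission_model(spark)
        print("coefficients:", model.coefficients)
        print("intercept:", model.intercept)
    finally:
        spark.stop()
```

With real data, one would normally hold out a test split (for example with `DataFrame.randomSplit`) and evaluate the model before trusting its predictions.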
Repository for following the Udemy course "Airflow2.0 De 0 a Héroe", by the academy "Datapath".
Updated Feb 9, 2023 - Python
Work on PySpark file streaming.
Updated Jun 11, 2023 - Python
Objective: Perform word count tasks and joins using Spark SQL within a Docker container.
Updated Mar 15, 2022 - Python
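A word count followed by a join, both expressed in Spark SQL, might be sketched as follows. The input lines, the `word_types` lookup table, and all view and function names are hypothetical. The Docker setup is not shown. A plain-Python `reference_word_counts` helper is included as a way to cross-check the Spark result on small inputs.

```python
from collections import Counter

# Hypothetical input text and lookup table, for illustration only.
LINES = [
    "spark makes joins simple",
    "spark sql counts words",
    "docker runs spark",
]
WORD_TYPES = [("spark", "tool"), ("docker", "tool"), ("sql", "language")]

# Split each line on spaces, emit one row per word, then count per word.
WORD_COUNT_SQL = """
    SELECT word, COUNT(*) AS n
    FROM (SELECT explode(split(lower(line), ' ')) AS word FROM lines) AS w
    WHERE word <> ''
    GROUP BY word
"""

# Inner join: only words present in the lookup table are kept.
JOIN_SQL = """
    SELECT c.word, c.n, t.word_type
    FROM word_counts AS c
    JOIN word_types AS t ON c.word = t.word
    ORDER BY c.word
"""


def reference_word_counts(lines):
    """Plain-Python word count, useful for checking the SQL on small inputs."""
    return Counter(
        word for line in lines for word in line.lower().split(" ") if word
    )


def run_queries(spark, lines=LINES, word_types=WORD_TYPES):
    """Register temporary views, then run the word count and the join."""
    spark.createDataFrame([(line,) for line in lines], schema=["line"]) \
        .createOrReplaceTempView("lines")
    spark.createDataFrame(word_types, schema=["word", "word_type"]) \
        .createOrReplaceTempView("word_types")
    spark.sql(WORD_COUNT_SQL).createOrReplaceTempView("word_counts")
    return spark.sql(JOIN_SQL).collect()


def main():
    """Run both queries on a local single-threaded Spark session."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("sql-sketch").getOrCreate()
    )
    try:
        for row in run_queries(spark):
            print(row["word"], row["n"], row["word_type"])
    finally:
        spark.stop()
```

Splitting on a single space keeps the SQL free of regex escaping; real text would likely need a more careful tokenizer.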