| Data quality checks. RFM datamart. | SQL, Common Table Expression, Window Functions, PostgreSQL, cloudbeaver |
| Modifying DWH. Migration to the new model. | SQL, Window Functions, PostgreSQL, cloudbeaver |
| Modifying ETL and datamarts. Implementing idempotency. | Airflow, SQL, PostgreSQL, cloudbeaver, bash, pandas, SQLAlchemy, PostgresOperator, BashOperator |
| Data quality checks in ETL | Airflow, SQL, PostgreSQL |
| Datamart in DWH based on multiple sources | Airflow, PostgreSQL, MongoDB Compass, pendulum, Jupyter Notebook, bash, SQLAlchemy, PostgresHook |
| Datamart based on the Vertica analytical database | Airflow, Yandex S3 Storage, Common Table Expression, SQL, Vertica, cloudbeaver, pandas |
| Working with PySpark in Hadoop. Working with HDFS. | Hadoop, Spark, PySpark, YARN, MapReduce, Window Functions, HDFS, Airflow, SparkSubmitOperator, Parquet |
| Processing streaming data with Spark | Kafka, PySpark, Airflow, kcat, Jupyter Notebook, SQL, PostgreSQL, Spark Streaming |
| Cloud services | Yandex Cloud Services, DataLens, Kubernetes, kubectl, Kafka, kcat, confluent_kafka, flask, Docker Compose, Helm, Redis |
| Combining data streams. Analytics datamart. | Yandex S3, DWH, Vertica, boto3, Airflow, TriggerDagRunOperator, Metabase |