Oki Dwi Saputro · Controlling file size in Spark to Hive Writing · Recently, I’ve been struggling with small files created by PySpark jobs when writing to Hive tables. I had a couple of attempts to solve the… · 4 min read · Dec 3, 2021
Oki Dwi Saputro in Nerd For Tech · Removing Transformation from ETL · What if we remove Transformation from ETL (or ELT)? · 2 min read · Apr 16, 2021
Oki Dwi Saputro · Physics and Data Engineering · Four years studying Physics, then drifting into another field. · 3 min read · Apr 9, 2021
Oki Dwi Saputro in Analytics Vidhya · How to predict churn customers using machine learning: Sparkify project. · Churn prediction using PySpark on a streaming service dataset. · 7 min read · May 18, 2020
Oki Dwi Saputro · How to check for a good Airbnb listing? · Based on Boston Airbnb data exploration · 3 min read · Mar 25, 2020
Oki Dwi Saputro in Kurio Toolbox · XGBoost for Text Classification · The base model we use is LogisticRegression, while the other models used are XGBoost and SVM with a linear kernel. · 3 min read · Aug 23, 2017