In Towards AI, by Louis-François Bouchard: "The Best RAG Stack to Date" (exploring every component). Sep 14, 2024
In TDS Archive, by Almog Baku: "The LLM Triangle Principles to Architect Reliable AI Apps". Software design principles for thoughtfully designing reliable, high-performing LLM applications. Jul 16, 2024
In get-defacto, by Dany Srage: "Explainability of the features? No! Of the hyperparameters." As data scientists, we enjoy training models, but the enthusiasm often fades when it’s time to fine-tune the model’s hyperparameters after… Mar 15, 2024
In TDS Archive, by Joos K: "Fourier Transform for Time Series". Learn what Fourier Transform is and how it can be used to decompose time series. With a worked Python example on CO2 time series data. Oct 31, 2021
In Learning SQL, by Marie Truong: "4 Techniques to Solve Hard SQL Problems". Write complex queries easily with these tips. Jul 18, 2023
Shi Heng Zhang: "LineageX: The Python library for your lineage needs". LineageX aims to generate the column-level lineage information by providing a lightweight, accessible, and flexible tool for developers and… Jun 8, 2023
Khairina: "Data Pipelines Pocket Reference — Key Learning Points". In this post, I will share my key learning points from the Data Pipelines Pocket Reference: Moving and Processing Data for Analytics book. Jul 2, 2023
Peter Flom: "ANOVA: Why analyze variances to compare means?" Some people who are new to statistics get confused about ANOVA (analysis of variance) and one common confusion is why we are analyzing… Dec 8, 2018
In TDS Archive, by Angel Das: "Introduction to Data Processing using Descriptive Statistics and Statistical Charts in Python". A complete hands-on guide to test Data Assumptions (MCAR, MAR, MNAR, Central Tendency, Skewness, and Outliers) in Python. Jan 20, 2022
In Analytics Vidhya, by Paul Bananzi: "Credit Risk Modelling in Python". Credit risk is the risk of a borrower not repaying a loan, credit card or any type of credit facility. Credit risk is an important topic… Nov 6, 2020
In TDS Archive, by Eduardo Blancas: "On writing clean Jupyter notebooks". 10 recommendations for writing readable and maintainable notebooks. Jul 23, 2021
Adnan Siddiqi: "Create your first ETL Pipeline in Apache Spark and Python". In this post, I am going to discuss Apache Spark and how you can create simple but robust ETL pipelines in it. You will learn how Spark… Jun 9, 2019
In Analytics Vidhya, by Ayşenur Özen: "E-Commerce Data: Exploratory Data Analysis (EDA)". In this article, I will use e-commerce data to take the first and most important step of data analysis with EDA, and thus we will see this… Jan 18, 2021
In Geek Culture, by Burak Doğrul: "Predicting Customer Life Time Value (CLTV) via Beta Geometric / Negative Binominal Distribution…" In this article, after explaining the history of the customer lifetime value and buy till you die model, I will examine the BG/NBD and… Jun 8, 2021
In TDS Archive, by Chuck Utterback: "Predict Customer Churn With Precision". Balancing precision and recall for actionable retention tactics. Apr 26, 2021
In Analytics Vidhya, by Paul Wanyanga: "Improving Customer Retention using Machine Learning". Business Perspective. Apr 13, 2021
In TDS Archive, by Cvetanka Eftimoska: "Big Data Analyses with Machine Learning and PySpark". Big data is a term that describes the large volume of data — both structured and unstructured. May 4, 2020
In TDS Archive, by Andrii Shchur: "Marketing automation — Customer segmentation". Do you know your customers? Customer analytics is becoming critical. These insights power businesses’ sales, marketing, and product… Jun 17, 2021
DataTurks: Data Annotations Made Super Easy: "Understanding SVMs’: For Image Classification". Hello friends! Welcome back… In this fourth tutorial we are going to understand Support Vector Machines. An algorithm that intuitively… Aug 10, 2018