In the first quarter of 2019, amidst privacy scandals and investigations, Facebook reported 2.39 billion active users. That was an increase of 55 million active users over the last quarter of 2018, making it the largest social media network despite growing competition from the likes of Twitter and Instagram.
Its active usage makes it an appealing gold mine for advertisers vying for the attention of potential customers. To maximise the brief window of attention available in the microwave environment we live in, agencies now need a pretty good idea of who they're speaking to, how they're speaking to them and what their everyday experiences look like.
The following notebook clusters the various viewers of an advertisement campaign run on Facebook for an anonymous company. Machine learning is then used to predict conversions: both total conversions (the number of people who inquired about the product after seeing the ad) and approved conversions (the number of people who bought the product after seeing the ad).
Whilst the words clustering and segmentation are often used interchangeably in predictive marketing, the two words mean different things. Segmentation is the grouping of customers by identified similarities in order to target the exact customer profile for a particular product or service. The groups or segments are predefined, and the users are filtered by matched similarities until the Reuleaux triangle, also known as the overlapping central sweet spot of a Venn diagram, is found.
On the other hand, clustering refers to finding similarities by which to group customers. Unlike with segmentation, the groups are not predefined and are instead created as more commonalities are found amongst the customers. Clustering makes use of machine learning algorithms to relate features and create new segments.
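The difference between the two can be sketched in a few lines of Python. This is only an illustration, not the notebook's own code: the viewer data is made up, the `under_30_segment` rule and the `kmeans` helper are hypothetical, and the helper is a deliberately tiny standard-library stand-in for a library implementation of KMeans.

```python
from math import dist

# Hypothetical ad viewers as (age, number of ad clicks). Values are made up.
viewers = [(22, 1), (25, 2), (24, 1), (41, 9), (45, 8), (43, 10)]

# Segmentation: the group is defined up front by a rule we chose ourselves.
under_30_segment = [v for v in viewers if v[0] < 30]


def kmeans(points, k, iterations=10):
    """Tiny k-means sketch: returns one cluster label per point."""
    centroids = list(points[:k])  # naive, deterministic initialisation
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels


# Clustering: no rule is supplied; the groups emerge from the data itself.
labels = kmeans(viewers, k=2)
```

Here the segment is whatever the `age < 30` rule says it is, whereas the cluster labels come purely from which viewers sit close together across both features. A real analysis would normally use a maintained implementation with a better initialisation strategy rather than this toy helper.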
One of the biggest takeaways from this project was that assumptions are often made about which demographics group people together. The KMeans clustering quickly disproved these assumptions and pointed to a previously ignored trait as the biggest influence on customer clusters.
Another key takeaway from this project was the need for "enough" data in order to train models and come to a conclusive decision about the fit of a model, or lack thereof. Another key practice is scaling the data and understanding the algorithms behind certain models in order to better interpret the results found.
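To see why scaling matters for a distance-based algorithm such as KMeans, consider the minimal sketch below. It assumes two made-up features on very different scales (age and ad impressions) and uses a hypothetical `standardise` helper that computes z-scores with the standard library; libraries such as scikit-learn offer `StandardScaler` for the same job.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical viewers as (age in years, ad impressions). Values are made up.
rows = [(20, 10_000), (60, 10_500), (21, 50_000)]


def standardise(rows):
    """Rescale each column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, stdevs))
        for row in rows
    ]


scaled = standardise(rows)

# Raw distances: impressions dominate, so the 40-year age gap barely registers.
raw_young_old = dist(rows[0], rows[1])
raw_young_young = dist(rows[0], rows[2])

# Scaled distances: the age gap and the impressions gap now carry similar weight.
scaled_young_old = dist(scaled[0], scaled[1])
scaled_young_young = dist(scaled[0], scaled[2])
```

In raw units the 20-year-old and the 60-year-old look like near neighbours simply because their impression counts are similar. After standardising, the two pairs end up roughly equally far apart, so neither feature decides the clusters just because of its units.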