Insta Identities

Objectives

As an avid social media user, I find the growing occupation of "Social Media Influencer" an interesting yet elusive matter. In an attempt to better understand the demographics of followers, I decided to investigate the followers of a few Insta-famous individuals.

Datasets & Tools

Instagram

This was my main source of information. Using a Social Media Influencer as a starting point, I would scrape information about their followers. I was looking for the following things:

  • Number of Posts

  • Number of Followers

  • Number of Followings

  • Bio

  • Links

These could serve as metrics for popularity, activity, and self-identity, which will allow me to provide better insight into the kinds of followers that specific types of Social Media Influencers attract.
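A minimal sketch of how one scraped follower record could be represented, written in Python for illustration rather than the R used in this project. The class name, field names, and the `has_bio` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FollowerProfile:
    """One scraped follower account (hypothetical field names)."""
    username: str
    posts: int        # number of posts
    followers: int    # number of followers
    following: int    # number of accounts followed
    bio: str = ""     # free-text bio, empty if none
    links: List[str] = field(default_factory=list)  # links found in the bio

    @property
    def has_bio(self) -> bool:
        # A bio made only of whitespace is treated as missing.
        return bool(self.bio.strip())


example = FollowerProfile("example_user", posts=120, followers=800, following=400)
print(example.has_bio)
```

Keeping the five fields together per account makes the later per-category summaries a matter of grouping these records.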

Social Media Influencers

I needed to choose a social media influencer to study. I wasn't very sure about the market, so I looked to none other than the internet. I came across a blog that wrote about influencers across a number of topics.

I randomly selected an Instagrammer out of 5 industries to look at the variation in followers. In the future I wish to increase the sample size of Instagrammers in each industry for a more consistent reading across industries, but for now I shall assume a representative sample.

Travel

Food

Photography

Lifestyle

Referred to in graphs as "Jannid"

RSelenium

Ever since Facebook's Cambridge Analytica scandal, the Instagram API has been restricted to only the user's own account information. This presented a problem, so I turned to scraping Instagram's web interface instead.

The trouble with scraping Instagram is that its pages are rendered dynamically with JavaScript, which means scraping them with rvest alone was not workable. Selenium, a tool usually used for web testing, was a great alternative. I set it up running in a Docker container.
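The idea of letting a real browser render the page can be sketched as follows, in Python rather than the R and RSelenium used in this project. This is only a sketch under assumptions: the hub URL assumes a Selenium standalone container listening on port 4444, `fetch_rendered_html` and `parse_count` are hypothetical helper names, and the abbreviated count formats ("12.5k", "1.2m") are assumed rather than taken from the project.

```python
def fetch_rendered_html(url, hub_url="http://localhost:4444/wd/hub"):
    """Return the page HTML after the browser has run its JavaScript."""
    # Imported lazily so parse_count below works without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Remote(command_executor=hub_url,
                              options=webdriver.ChromeOptions())
    try:
        driver.get(url)
        # Real pages may need explicit waits before the content is present.
        return driver.page_source
    finally:
        driver.quit()


def parse_count(text):
    """Turn a displayed count such as '1,234' or '12.5k' into an integer."""
    cleaned = text.strip().lower().replace(",", "")
    multipliers = {"k": 1_000, "m": 1_000_000}
    if cleaned and cleaned[-1] in multipliers:
        return int(round(float(cleaned[:-1]) * multipliers[cleaned[-1]]))
    return int(cleaned)
```

Separating the browser step from the text parsing keeps the parsing testable without a running container.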

Manipulation & Visualization

Using Selenium, I scrape the data into R. I then create document-term matrices (DTMs) and summaries that I output to Tableau, which I use to visualize the data.
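For readers unfamiliar with document-term matrices, here is a minimal pure-Python sketch of the idea (the project itself built its DTMs in R). The function name `build_dtm` and the simple letters-only tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter


def build_dtm(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    # Lowercase each document and keep runs of letters as terms.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    column = {term: j for j, term in enumerate(vocab)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(tokens).items():
            row[column[term]] = count
        matrix.append(row)
    return vocab, matrix


vocab, matrix = build_dtm(["Travel lover", "Food and travel travel"])
print(vocab)
print(matrix)
```

Each row is one bio and each column one term, so summing a column gives that term's total frequency within a group.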

Analysis

Cleaned Results

[1]

Here is a summary of the data I collected, with a sample size of about 2,000 followers per Instagrammer and a total of slightly over 8,000.

Getting A Gauge Of The Market

[2]

This graph immediately sharpened my hypothesis that there are generally two categories of followers: one that is more active in sharing but less concerned with gaining followers (that is, with creating content that would increase the probability of being followed), and one that is less active but has significantly more followers and is passionate and focused about what it shares.

A Closer Look At Follower Distribution

[3]

The categorization of these followers is further supported by this graph. Notice how the followers of The Points Guy (Travel) and Dolly (Food) are more bunched in the middle, with smaller standard deviations. Many fall within the median categories.

It's All About The Ratio

[4]

Consistent with this hypothesis, followers in the Photography and Lifestyle categories are generally more popular (with a high follower : following ratio) and also put more effort into displaying a bio. This FF ratio is markedly lower for the Travel and Food categories, and fewer of those followers bother to place a bio on their accounts.
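The FF ratio comparison can be sketched as below, in Python for illustration. The function names, the choice of the median as the per-category summary, and the decision to skip accounts that follow nobody are assumptions, and the numbers in the usage example are made up.

```python
from collections import defaultdict
from statistics import median


def ff_ratio(followers, following):
    """Follower : following ratio, or None when the account follows nobody."""
    if following == 0:
        return None
    return followers / following


def median_ratio_by_category(rows):
    """rows holds (category, followers, following) tuples."""
    grouped = defaultdict(list)
    for category, followers, following in rows:
        ratio = ff_ratio(followers, following)
        if ratio is not None:
            grouped[category].append(ratio)
    return {category: median(ratios) for category, ratios in grouped.items()}


# Made-up accounts, only to show the call.
sample = [("Photography", 1000, 500), ("Photography", 300, 100),
          ("Travel", 100, 200), ("Travel", 50, 0)]
print(median_ratio_by_category(sample))
```

A median is used here because a handful of very popular accounts would otherwise dominate a mean.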

Quality Content?

[5]

If we compare the number of bios (the number of people who have a bio on their account) against the median number of posts in each category, we see a negative relationship. This could indicate that people who post less are more selective about what they choose to say and more careful with their posts.

Self Description Lengths

[6]

After removing those without bios, there seems to be very little variation across the board. However, those following the lifestyle blog seem to have shorter-than-average bios, typically limited to a few emojis or a few words. Food followers seem to write the most, being very descriptive about what they enjoy, live and do, as we'll explore in the upcoming sections.
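A rough sketch of the bio-length summary, in Python for illustration. It assumes length is measured in characters (the project does not state the unit), and `bio_length_summary` is a hypothetical name.

```python
from statistics import mean, pstdev


def bio_length_summary(bios):
    """Mean and standard deviation of bio lengths, ignoring empty bios."""
    lengths = [len(bio.strip()) for bio in bios if bio.strip()]
    if not lengths:
        return None
    return {"n": len(lengths), "mean": mean(lengths), "sd": pstdev(lengths)}


# Made-up bios; the empty and whitespace-only ones are dropped.
print(bio_length_summary(["", "abcd", "ab", "   "]))
```

Note that an emoji can span more than one character in Python, so emoji-heavy bios may look slightly longer under this measure.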

What's In Their Bios: WordClouds

Lindsey Silverman (Food)

Murad Osmann (Photography)

The Points Guy (Travel)

Janni Olsson (Lifestyle)

[7]

As you can probably tell, "travel" is huge, literally and metaphorically. Both Photography and Lifestyle have a large base. Foodies have a consistency around good, with topics like nutrients, gluten and plants appearing in their bios.

The New Age of Emojis 

There is a huge number of emojis in use in Instagram bios. I figured out a way to build a document-term matrix on the emojis and evaluated their popularity by group. Note that some of the emojis might not render due to a lack of full Unicode support on the Tableau Public platform at the moment.
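One way to count emojis per group is sketched below in Python; the method actually used in this project (in R) is not described, so this is only an assumption about how it could be done. The code point ranges are a rough approximation that misses flags and treats multi-part emojis (skin tones, joined sequences) as separate symbols, and `emoji_counts_by_group` is a hypothetical name.

```python
import re
from collections import Counter

# Rough emoji ranges: pictographs and symbols, plus the older
# "miscellaneous symbols" and "dingbats" blocks. Not exhaustive.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def emoji_counts_by_group(bios_by_group):
    """Map each group name to a Counter of the emojis found in its bios."""
    return {
        group: Counter(ch for bio in bios for ch in EMOJI_PATTERN.findall(bio))
        for group, bios in bios_by_group.items()
    }


# Made-up bios: a camera emoji twice and an airplane once.
sample = {"Photography": ["shooting \U0001F4F7\U0001F4F7", "fly \u2708\ufe0f away"]}
print(emoji_counts_by_group(sample))
```

Dividing each count by the number of bios in the group would make groups of different sizes comparable.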

 

Move your mouse over the bars to get a preview of them!

[8]

I wanted to investigate further, so I decided to create an emoji-by-emoji comparison. Notice that the y-axis is independent for each emoji, as I want to investigate the differences between groups.

[9]

Scroll through and look at the comparisons emoji by emoji. Notice how Photography followers have a higher tendency to use the majority of the emojis!

Where To Next? Links On Insta Bios

[10]

The most popular link that people put up is a YouTube link. This is closely followed by Facebook and, oddly enough, VK, a Russian social media platform. We can also see a much higher incidence of VSCO links amongst lifestyle followers compared to the rest. Quite a high proportion of Food and Travel people use linktr (Linktree), a website that links to more of your own links (as Instagram only allows one link).
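Grouping bio links by site can be sketched as below, in Python for illustration. Reducing each link to the first label of its host name (so "linktr.ee" becomes "linktr") is an assumption based on the labels used above, and `site_label` is a hypothetical helper; mobile subdomains such as "m." would need extra handling.

```python
from collections import Counter
from urllib.parse import urlparse


def site_label(link):
    """Reduce a link to a short site label, e.g. 'youtube' or 'linktr'."""
    link = link.strip()
    if "://" not in link:
        # Bio links are often written without a scheme.
        link = "http://" + link
    host = urlparse(link).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


# Made-up links, only to show the call.
links = ["https://www.youtube.com/channel/abc", "linktr.ee/example",
         "youtube.com/user/example", "https://vk.com/example"]
print(Counter(site_label(link) for link in links).most_common())
```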

I was personally surprised by how much I was able to extract and learn from this process. Due to limited resources, I wasn't able to get a larger sample of followers in each category, so the insights I provide may generalize less well to the overarching category than to the individual Instagrammer. In the future, I hope to expand this study to a larger sample size and a broader audience. For now, thank you for reading!

Imran Idzqandar is a Sophomore studying Business Analytics and Behavioral Economics at the Wharton School, University of Pennsylvania.

 

This was a project for Prasanna Tambe's class, OIDD 245: Analytics & The Digital Economy. The focus of this project was to utilize digital exhaust to investigate matters that concern us.

Lindsey Silverman (Food) is referred to in graphs as "Dolly".

© 2020 Imran Idzqandar. All rights reserved.