Shein is mired in controversy over its lack of sustainability, its contribution to fast fashion, and poor conditions for its workers. However, these issues have clearly not stopped its tremendous growth and popularity over the last few years. The fast fashion giant is projected to overtake the combined sales of H&M and Inditex (the largest fashion retailer to date) by 2025.
As an avid fashion shopper, I was curious how past scandals and public opinion have really affected the retailer’s brand. I decided to look deeper into the comments on Shein’s Instagram and on related YouTube videos to analyze user sentiment. I thought it was important to examine its existing customers: what could be some reasons Shein’s customer base continues to grow?
Data Collection
I scraped data from two main sources: the @shein_us Instagram account and the Shein review videos that appeared on the first page of YouTube search results.
I've included screenshots and a map of the data scraping.
Data Exploration
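1. Instagram: Around 32.1% of the comments came from "influencers."
To identify these users, I calculated each commenter’s followers-to-following ratio and flagged accounts with a ratio above 2:
# Followers-to-following ratio, rounded to two decimals
ig$f4fratio <- round(ig$followers / ig$following, digits = 2)
# TRUE when an account has more than twice as many followers as accounts it follows
ig$influencer <- ig$f4fratio > 2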
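To show how a percentage like the one above falls out of the influencer flag, here is a minimal sketch on made-up numbers. The toy data frame and the name influencer_share are my own illustration, not the scraped data.

```r
# Toy data: hypothetical follower counts, not the scraped Shein comments.
ig <- data.frame(
  user      = c("a", "b", "c", "d"),
  followers = c(5000, 120, 900, 40),
  following = c(400, 300, 0, 80)
)

# Same ratio and threshold as in the snippet above.
ig$f4fratio   <- round(ig$followers / ig$following, digits = 2)
ig$influencer <- ig$f4fratio > 2

# Share of rows flagged as influencers, as a percentage.
influencer_share <- round(100 * mean(ig$influencer), 1)
print(influencer_share)
```

One thing this makes visible: an account that follows nobody gets a ratio of Inf in R, so it is flagged as an influencer regardless of its follower count. Whether that is desirable depends on the data.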
2. YouTube: Top Comments by Likes
3. Note that the four YouTube videos vary widely in popularity and view count.
Data Analysis
1. Document Term Matrix: What are the most common words?
I quickly realized that I would have to find a way to deal with all of the emojis in my sentiment analysis. I didn’t want to strip them completely because they often carry the sentiment of a comment. I used the libraries textclean and sentimentr.
I replaced each emoji with its text equivalent from the hash_emojis dictionary. Then, I converted the text into a Document Term Matrix (DTM).
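The emoji step might look roughly like the sketch below. It assumes the textclean and lexicon packages are installed. The sample comments and the variable names are hypothetical, and the exact call used in the original analysis may have differed.

```r
library(textclean)  # provides replace_emoji()
library(lexicon)    # provides the hash_emojis lookup table

# Hypothetical comments, for illustration only.
comments <- c("love this haul \U0001F60D", "arrived late \U0001F621")

# Swap each emoji for its text description instead of dropping it.
clean <- replace_emoji(comments, emoji_dt = lexicon::hash_emojis)
print(clean)
```

Because the emoji becomes ordinary words, it survives tokenization and can contribute to both the term counts and the sentiment scores.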
After this process, I had 117 unique terms from my Instagram comments. I pulled out the 10 most frequent terms.
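For readers who have not built one before, a document-term matrix is a table with one row per comment and one column per unique term. Below is a dependency-free sketch in base R on made-up comments. It is a stand-in for whatever package produced the real matrix, and the names tokens, vocab, and term_freq are my own.

```r
# Hypothetical comments, for illustration only.
comments <- c("so cute and cheap", "cheap but cute", "quality is bad")

# Lowercase, strip anything that is not a letter or space, split on whitespace.
tokens <- strsplit(gsub("[^a-z ]", "", tolower(comments)), "\\s+")

# One row per comment, one column per unique term; cells hold counts.
vocab <- sort(unique(unlist(tokens)))
dtm <- t(sapply(tokens, function(tk) table(factor(tk, levels = vocab))))

# Total count per term, most frequent first.
term_freq <- sort(colSums(dtm), decreasing = TRUE)
print(head(term_freq, 10))
```

The number of columns in dtm is the count of unique terms, which is where a figure like 117 comes from on the real data.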
The YouTube comments, by contrast, leaned more towards casual speech. This makes sense, since these comments were much longer and more conversational.
2. Linear Regression: Are the words in a comment associated with how well it is received?
I fit a linear model to estimate this. Likes was the dependent variable, and the predictors were the term columns of the DTM. I pulled out the significant coefficients for both Instagram and YouTube.
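A regression of this shape can be sketched as follows. The data here is simulated so that the example runs on its own. The term names, the effect size, and the 0.05 cutoff are assumptions for illustration, not values from the original analysis.

```r
set.seed(1)

# Hypothetical stand-in for a DTM: 0/1 indicators for three terms.
n <- 200
df <- data.frame(
  cheap   = rbinom(n, 1, 0.3),
  cute    = rbinom(n, 1, 0.4),
  quality = rbinom(n, 1, 0.2)
)

# Simulated likes; only "cheap" has a real effect in this toy example.
df$likes <- 2 + 5 * df$cheap + rnorm(n)

# Regress likes on every term column.
fit <- lm(likes ~ ., data = df)

# Keep coefficients with p < 0.05, excluding the intercept.
coefs <- summary(fit)$coefficients
keep  <- coefs[, "Pr(>|t|)"] < 0.05 & rownames(coefs) != "(Intercept)"
sig   <- coefs[keep, , drop = FALSE]
print(sig)
```

With over a hundred term columns and a modest number of comments, some terms will pass a 0.05 cutoff by chance alone, so the significant coefficients are better read as leads than as firm findings.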
For YouTube, the significant terms are shown here, sized by coefficient:
One likely reason the estimates are so extreme is the skewed distribution of likes in both datasets.
3. Analyzing NRC Sentiment: What emotions are in these comments?
I used the get_nrc_sentiment() function to get more in-depth emotional analysis beyond a simple positive/negative score.
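For context, get_nrc_sentiment() comes from the syuzhet package and returns one row per text with a count for each of eight emotions plus overall positive and negative. A minimal sketch on made-up comments, assuming syuzhet is installed:

```r
library(syuzhet)  # provides get_nrc_sentiment()

# Hypothetical comments, for illustration only.
comments <- c("I love this dress, so happy", "terrible quality, I am angry")

# One row per comment; columns are the NRC emotions plus positive/negative.
nrc <- get_nrc_sentiment(comments)

# Total count per emotion across all comments, largest first.
emotion_totals <- sort(colSums(nrc), decreasing = TRUE)
print(emotion_totals)
```

Summing the columns this way gives the per-emotion totals that a bar chart of emotions would plot.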
In both graphs, positive sentiment clearly outweighs negative sentiment, by about 1.5 to 2 times. Joy, anticipation, and trust are the top three emotions, with joy the highest on both platforms.
4. Bing Sentiment: How positive or negative is each comment overall?
While both distributions are clustered around 0, the mean sentiment is 0.33 on Instagram and 1.15 on YouTube.
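The post does not say which function produced the Bing scores. One common route, assumed here, is get_sentiment() from syuzhet with method = "bing", which nets the positive and negative lexicon words in each text into a single score. The comments below are made up.

```r
library(syuzhet)  # provides get_sentiment()

# Hypothetical comments, for illustration only.
comments <- c("love it, great price", "bad quality", "it arrived today")

# One net score per comment based on the Bing lexicon.
bing <- get_sentiment(comments, method = "bing")

# Average score across comments, comparable to the means reported above.
mean_sentiment <- mean(bing)
print(bing)
print(mean_sentiment)
```

A comment with no lexicon words scores 0, which helps explain why the distributions pile up around zero. Longer comments have more room to accumulate lexicon words, which may account for part of the gap between the two means.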
All in all, my main conclusion is that the outlook towards Shein is slightly positive. There are also clearly more positive scores than negative ones.
Concluding Ideas:
Hot Topics: The most popular or well-liked comments tended to talk about the affordability of Shein or the lack of sustainable practices.
Comment Length: Many of the commenters are happy or satisfied and will drop shorter comments to relay this. However, those unhappy with purchases will often write longer comments. (In further analysis, perhaps I could identify whether the length of a comment factors into its sentiment or popularity?)
One Reason for Success: Many of the positive comments emphasize “affordability” or “bang for buck” in the pros that they point out.
While Shein’s customers seem to be aware of its notorious issues, many of the negative comments target poor quality rather than social issues. On top of this, far more users celebrate Shein’s affordability, and their comments drown out the negative feedback by sheer volume. One main conclusion I would draw from this data set is that by consistently delivering on purposefully low expectations, the retail giant is able to satisfy a wide range of customers, even if that means letting a handful of them fall through the cracks.