OG Tags and Facebook Issues

Painful workarounds to accommodate Facebook's web crawler

While working on my personal project Bilolok, I began implementing the Open Graph Protocol so that content shared on Facebook or other social media sites would appear professional and garner interest for the app. This was especially important for Facebook, since nearly a third of the country’s population are active members and I expect most incoming visitors to come from there.

Originally, I just had static Open Graph tags in the project’s <head>, but later, when I added more user content, I wanted dynamic OG tags to match the content of each page.

First Issue

Bilolok is a SPA, so the index page’s <head> cannot update until the JavaScript has loaded and run. At least, that is true of the basic deployment pattern; others exist, so keep reading. Web crawlers and bots do not usually run JavaScript, so whatever OG tags I set in the website’s index.html would be the OG tags for all content shared on external platforms. I was using a package called vue-meta to update the OG tags and page title as the user navigates through the website, but again this requires JavaScript to run, which crawlers and bots do not do. So I needed a way to render the OG tags server side so that external social media sites would get the information I wanted.
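
To make the problem concrete, here is a minimal sketch of what a crawler that never runs JavaScript would get from a SPA's index page. The HTML, the script path and the tag values are made up for illustration; they are not Bilolok's actual markup.

```python
from html.parser import HTMLParser


class OGTagParser(HTMLParser):
    """Collect <meta property="og:..."> tags from raw HTML, without running any JavaScript."""

    def __init__(self):
        super().__init__()
        self.og_tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:"):
            self.og_tags[prop] = attributes.get("content") or ""


def extract_og_tags(html):
    """Return the OG tags visible in the markup as a dict."""
    parser = OGTagParser()
    parser.feed(html)
    return parser.og_tags


# Hypothetical static index.html of a SPA. In a browser the script could
# update the tags later, but a crawler that only reads the markup never runs it.
STATIC_INDEX = """<!DOCTYPE html>
<html>
  <head>
    <meta property="og:title" content="Bilolok">
    <meta property="og:description" content="Kava bars in Port Vila">
    <script src="/js/app.js"></script>
  </head>
  <body><div id="app"></div></body>
</html>"""

if __name__ == "__main__":
    print(extract_og_tags(STATIC_INDEX))
```

Whatever URL is shared, the parser only ever sees the two static tags, which is the behaviour I was running into.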

One solution would be to invest in a server-side rendered framework such as NuxtJS, which is Vue’s response to NextJS. But that seemed like a major rewrite of my app, since Nuxt is very opinionated and I would have to make sure my app fits its conventions.

Another solution would be an external service such as Prerender.io. I did not look too far into this one since I didn’t want to take on an external service for my small project. Plus, I knew my requirements were quite simple and that I could handle them myself.

So finally, I decided to build my own prerender of sorts. This solution uses a simple nginx if statement to catch known or popular crawler and bot user agent strings, then redirects them to a dynamically crafted index.html file for the resource they requested. I do not prebuild index files for every resource; instead, I added a subdirectory in the backend FastAPI server where I use Starlette to handle the requests. Starlette is the foundation on which FastAPI is built, so this doesn’t require any new packages to be installed except jinja2 for templating. Since the small app for OG tags is nested in the main backend app, I have access to all the same models and CRUD classes, so I can easily keep the main API and this app consistent. The returned files are very simple, containing little more than the customized OG tags in the head.
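
The two halves of this idea can be sketched in plain Python. This is not my actual code: the real crawler check lives in nginx and the real templates use jinja2 inside a Starlette app, so here a regular expression and the standard library's string.Template stand in for them. The user agent list, the function names and the template fields are all assumptions made for the example.

```python
import re
from html import escape
from string import Template

# Assumed list of crawler user-agent fragments; the real nginx check may differ.
CRAWLER_PATTERN = re.compile(
    r"facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot|whatsapp",
    re.IGNORECASE,
)


def is_crawler(user_agent):
    """Return True when the user agent looks like a known social media crawler."""
    return bool(CRAWLER_PATTERN.search(user_agent or ""))


# Stand-in for the jinja2 template: little more than OG tags in the head.
OG_TEMPLATE = Template("""<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>$title</title>
    <meta property="og:title" content="$title">
    <meta property="og:description" content="$description">
    <meta property="og:image" content="$image">
    <meta property="og:url" content="$url">
  </head>
  <body></body>
</html>
""")


def render_og_page(title, description, image, url):
    """Render the minimal page a crawler receives for one shared resource."""
    return OG_TEMPLATE.substitute(
        title=escape(title),
        description=escape(description),
        image=escape(image),
        url=escape(url),
    )


if __name__ == "__main__":
    agent = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
    if is_crawler(agent):
        print(render_og_page(
            "A kava bar",
            "Check in at this kava bar",
            "https://example.com/bar.jpg",
            "https://example.com/bars/1",
        ))
```

Escaping the values matters because titles and descriptions come from user content and end up inside HTML attributes.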

Possible Improvement

So now we have two backend apps: the primary API and an app for server-side rendered files containing customized OG tags. Don’t forget we also have to add if statements to the app’s nginx config to redirect known crawlers to this OG app, and that the default, static OG tags on the index page are still loaded by anyone who is not caught by the crawler check. So at this point there is a bit of coupling between the dynamic OG tags and the default static OG tags, which makes me feel I have two sources of truth, three if you count the tags being modified by vue-meta in the app too.

This sort of coupling has me thinking about how I could use the OG tags app to return an HTML file that also includes the scripts for the built frontend app. That is another sort of coupling that complicates deployment, but it removes the two or three sources of truth for OG tags. The trade-off may be worth it down the road, but for now this quick afternoon fix to my initial problem using Starlette suffices.
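
I have not built this, but a rough sketch of the idea could look like the following: one set of default tags, per-resource overrides merged on top, and a single page that carries both the tags and the frontend script. The default values, the script path and the function names are hypothetical.

```python
from html import escape

# Hypothetical defaults; these would replace the static tags in index.html.
DEFAULT_OG = {
    "og:title": "Bilolok",
    "og:description": "Kava bars in Port Vila, Vanuatu",
    "og:image": "https://example.com/default.jpg",
}


def og_tags_for(overrides=None):
    """Merge per-resource values over the defaults, ignoring empty values."""
    tags = dict(DEFAULT_OG)
    for key, value in (overrides or {}).items():
        if value:
            tags[key] = value
    return tags


def render_index(tags, script_src="/js/app.js"):
    """Build one index page for crawlers and browsers alike."""
    metas = "\n".join(
        f'    <meta property="{escape(key)}" content="{escape(value)}">'
        for key, value in tags.items()
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <head>\n"
        f"{metas}\n"
        "  </head>\n"
        "  <body>\n"
        '    <div id="app"></div>\n'
        f'    <script src="{escape(script_src)}"></script>\n'
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(render_index(og_tags_for({"og:title": "Some Kava Bar"})))
```

With something like this, crawlers and regular visitors would get the same file, so the crawler check in nginx would no longer decide which OG tags are served.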

Second Issue

Facebook could not find the page each time I tested it on their Sharing Debugger. It would show that it received a 307 response to upgrade the request from http to https. Apparently, the crawler does not take the time to follow redirects, so I had to add my crawler check to the nginx servers on port 80 and let crawlers through to the OG tags app without the usual redirect to https.
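
The routing decision this leaves nginx with can be modelled in a few lines of Python. This is only a sketch of the logic, not the nginx config itself; the action names and the shortened user agent list are made up for the example.

```python
import re

# Assumed crawler fragments; only Facebook's are listed to keep the sketch short.
CRAWLER_PATTERN = re.compile(r"facebookexternalhit|facebot", re.IGNORECASE)


def route_request(scheme, user_agent):
    """Decide what to do with a request, given its scheme and user agent."""
    if CRAWLER_PATTERN.search(user_agent or ""):
        # Crawlers get the OG tags app on both http and https, with no redirect.
        return "serve_og_app"
    if scheme == "http":
        # Everyone else on port 80 still gets the usual upgrade to https.
        return "redirect_to_https"
    return "serve_spa"


if __name__ == "__main__":
    facebook = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
    browser = "Mozilla/5.0 (X11; Linux x86_64)"
    for scheme in ("http", "https"):
        for agent in (facebook, browser):
            print(scheme, agent[:20], "->", route_request(scheme, agent))
```

The important part is the order of the checks: the crawler test has to come before the https redirect, otherwise the crawler never reaches the OG tags app.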

The OG tags app is working as intended, and it is very small and easy to modify if ever needed. But it still didn’t work on Facebook. Facebook would also report issues with images, saying the content type was invalid.

Bilolok has a subdomain for images since I found that easy to set up and I like keeping services distinct that way. I am using Thumbor to process and return images, so my first thought was that something was wrong with how Thumbor was responding. Thumbor is indeed configured to return image/webp when it is supported, which Facebook sharing does not support, so I thought that was the issue. But even forcing all users onto the jpeg format, by adding a header in the nginx config that forced Accept: image/jpeg, did not resolve it.

My next thought was to make sure Thumbor was not mangling the images in some way that I hadn’t noticed before but that the Facebook crawler might be sensitive to. But I noticed the same issue with a jpeg file at the root of the app, which was returned by the root domain and not the image subdomain, so Thumbor was not the issue.

Lastly, I dug further into Google to find this issue online. Surprisingly, I found a new result that said HTTPS was the issue. The first occurrence of the issue I could find was a StackOverflow post from 2013, but later posts complained that it was still ongoing as late as 2018, and I can confirm it even in 2022. As before, https was the root problem. I then had to modify the nginx config for the images subdomain to check whether a crawler was requesting the resource and, if so, allow access to it without redirecting to https.

Benefits

Overall, this feature adds a bit of complexity to the nginx config files and requires another systemd service to keep the Starlette app running. But for an app that leans on social elements so much, the reward of dynamic share titles and images is worth it. In fact, I once searched Facebook for a kava bar that I heard was having an event, and my app was the top result since the kava bar did not have an active page of its own.

Conclusion

Catering to Facebook adds complexity to my nginx configs and some mental overhead for future development if I add new shareable resources, but I guess it is a necessary evil: my user base is made up of active Facebook users, so Facebook sharing has to work.

  • python
  • starlette
FYI: This post is a part of the following project.
Bilolok

A Foursquare-inspired app for kava bars in Port Vila, Vanuatu.