Extracting Letterboxd Tokens with mitmproxy
Letterboxd has an API but it's available by request only.[1] This is why in the past my projects have either relied on their export functionality (like Letterboxd Gaps[2]) or simple webscraping. This works well when you only need your own activity data, but struggles when it comes to querying multiple users and intersections of their data. But the iOS app does it just fine. Enter mitmproxy.
mitmproxy is a set of tools that allow you to perform "Man In The Middle" (MITM) attacks. Whenever you open the Letterboxd app and load the home page, the app sends off a request to their servers to fetch that data. A MITM attack puts us between the app and the server, and allows us to see what's being requested and what's being returned.[3] We can use this to emulate the app for our own API calls.
Starting mitmproxy
First we need to install mitmproxy. You can do this by following the instructions on mitmproxy.org, but given that I have Homebrew[4] installed I just ran brew install mitmproxy. I want to keep a permanent certificate trusted,[5] so I chose an abnormal port to avoid any conflicts if I forget to turn off my VPN (in this case 48640).[6] Because the web interface is easier for copying, you can start and open the web interface with mitmweb --listen-port 48640, which automatically opens a tab showing traffic as it arrives.
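The port-picking trick from the footnote is plain arithmetic, so here is a minimal sketch of it. The function names are mine, and treating a "valid excerpt" as a five-digit window between 1024 and 65535 with no leading zero is an assumption about what counts as usable.

```python
from math import prod


def a1z26_product(word: str) -> int:
    """Multiply the A1Z26 values (a=1 ... z=26) of the letters in a word."""
    return prod(ord(ch) - ord("a") + 1 for ch in word.lower() if "a" <= ch <= "z")


def candidate_ports(number: int, width: int = 5) -> list[int]:
    """List every `width`-digit window of `number` that could serve as a port.

    Assumption: usable means no leading zero and a value in 1024..65535.
    """
    digits = str(number)
    windows = (digits[i:i + width] for i in range(len(digits) - width + 1))
    return [int(w) for w in windows if w[0] != "0" and 1024 <= int(w) <= 65535]


product = a1z26_product("mitmproxy")
print(product)                   # 78848640000
print(candidate_ports(product))  # [48640, 64000, 40000]
```

The first candidate is the 48640 used throughout this post.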
We just need one more piece of information from the Mac, and that's the IP address. Make sure your phone and Mac are on the same network, then open a Terminal window and run ipconfig getifaddr en0, which prints the Mac's local IP address (in this case 192.168.1.21).
Pointing your iPhone to it
In the WiFi settings on your iPhone click the little ⓘ next to your current WiFi network. Scroll down to "Configure Proxy" and choose "Manual", entering in the IP address and port from the previous steps.[7]
[Screenshots: Configure Proxy; Manual port information]
Then from your iPhone go to mitm.it, choose the iOS option, and follow the instructions to enable it: Settings > General > VPN & Device Management and click "Install".
[Screenshots: https://mitm.it for downloading the certificate; VPN & Device Management; Click 'Install']
Depending on your iOS version, you may also need to go to Settings > General > About > "Certificate Trust Settings" to enable the mitmproxy root certificate. Now open the Letterboxd app!
Extracting tokens
The tokens that Letterboxd uses are only good for an hour, so if it's been an hour since you last opened the app you should see a request to https://api.letterboxd.com/api/v0/auth/token. Clicking that shows the collection of cookies for the app, a refresh_token, client_secret, and client_id.
Believe it or not, this is all the information that you need.[8] That POST request returns a JSON dictionary with an access_token, which can be used as a Bearer token in the Authorization header, and an expires_in value (which is always 3600 seconds, the aforementioned hour). Once it expires you'll need to hit /auth/token again, but the refresh_token remains constant, so you won't have to go through this process again to update it. We can now use this information to interface with the API.
# Values copied from the captured /auth/token request (redacted here).
COOKIES = {
    'com.xk72.webparts.csrf': '811**',
    '_ga': 'GA1.2.**',
    '_ga_D3**': 'GS1.2.**',
    'letterboxd.signed.in.as': '**'
}
REFRESH_TOKEN = '7163f**'
CLIENT_SECRET = '7d035**'
CLIENT_ID = '4f203**'
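With those values in hand, the refresh call can be sketched using only the standard library. Treat this as a sketch under assumptions: the form field names (grant_type and friends) follow the usual OAuth 2.0 refresh shape and are my guess, so copy the real body from your own captured request if it differs. The opener parameter is a hypothetical hook that lets you try the function without touching the network.

```python
import json
import time
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.letterboxd.com/api/v0/auth/token"


def build_token_request(refresh_token, client_id, client_secret):
    """Build (but do not send) the POST that trades a refresh token for an access token."""
    # Assumption: a standard OAuth 2.0 refresh form; verify against your capture.
    form = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    return urllib.request.Request(
        TOKEN_URL,
        data=urllib.parse.urlencode(form).encode("ascii"),
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_access_token(refresh_token, client_id, client_secret,
                       opener=urllib.request.urlopen):
    """Return (access_token, expires_at), where expires_at is a Unix timestamp."""
    request = build_token_request(refresh_token, client_id, client_secret)
    with opener(request) as response:
        payload = json.load(response)
    # expires_in is in seconds (3600 in the responses described above).
    return payload["access_token"], time.time() + payload["expires_in"]
```

Calling fetch_access_token(REFRESH_TOKEN, CLIENT_ID, CLIENT_SECRET) with the constants above should hand back the token and its expiry time, provided the assumed field names match what your capture shows.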
Using it with the API
The technical details are fun and all, but how do we use this? I previously had a couple of scripts that used the export functionality, and I took this as an opportunity to clean up the repo and move them into a unified repository that leverages these tokens, along with writing a few new ones. Here are some examples of one-off scripts that I've written out of idle curiosity over the last few years.
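Every script needs the same two pieces first: a token that is still fresh, and a request that carries it. Here is a minimal sketch of both. TokenCache, authorized_request, the 60-second safety margin, and the refresh callable are all hypothetical names and choices of mine, and the path in the usage comment is a placeholder rather than a real endpoint.

```python
import time
import urllib.request

API_BASE = "https://api.letterboxd.com/api/v0"


class TokenCache:
    """Hold an access token and fetch a new one shortly before it expires."""

    def __init__(self, refresh, margin_seconds=60, clock=time.time):
        self._refresh = refresh        # callable returning (access_token, expires_in)
        self._margin = margin_seconds  # refresh this many seconds early
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, refreshing only when the current one is stale."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, expires_in = self._refresh()
            self._expires_at = self._clock() + expires_in
        return self._token


def authorized_request(path, token):
    """Build a GET request that carries the token as a Bearer credential."""
    return urllib.request.Request(
        API_BASE + path,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


# Usage (placeholder path, not a documented endpoint):
# cache = TokenCache(refresh=my_refresh_function)
# request = authorized_request("/example/endpoint", cache.get())
```

Caching matters here because the token lasts an hour: a script that runs for a few minutes should hit /auth/token once, not once per call.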
Review Curve Animation
I noticed at some point that my Letterboxd review curve was approaching a bell curve.[9] I wanted to know how that looked over time, so I wrote code that back-calculated the distribution after each review and turned it into a GIF.
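The back-calculation step is a running histogram: replay the reviews in order and record the counts after each one. This is a minimal sketch rather than the actual script; the sample ratings are made up, and rendering each frame into a GIF is left out.

```python
from collections import Counter

# Letterboxd ratings run from half a star to five stars in half-star steps.
RATINGS = [half_stars / 2 for half_stars in range(1, 11)]


def distribution_frames(ratings_in_order):
    """Yield the rating histogram as it stood after each successive review."""
    counts = Counter()
    for rating in ratings_in_order:
        counts[rating] += 1
        # One frame per review: counts for 0.5, 1.0, ..., 5.0 in that order.
        yield [counts[r] for r in RATINGS]


frames = list(distribution_frames([3.5, 4.0, 3.5, 2.0]))
print(len(frames))  # 4
print(frames[-1])   # [0, 0, 0, 1, 0, 0, 2, 1, 0, 0]
```

Each frame is one bar chart, so drawing the frames in sequence gives the animation.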
Monthly Snapshot
I used to write a bunch of my movie reviews into a Slack group with some friends from college. That group has since moved to iMessage and I've also since moved most reviews to Letterboxd,[10] but I still give a brief overview of what I watched that month and what I'd recommend. To help with screenshotting I wrote a script that scrapes that month's diary into a local HTML file, and automatically resizes to match the window.
Timeline Graph
I got a stationary bike back in July 2023 and started watching movies while biking. I was curious if this trend was visible in the data, and so created a script to build an interactive HTML page with a graph of movies over time split by all sorts of characteristics (rating, tag, watchlist, etc). Can you tell when the bike arrived??[11]
You can also see the evolution of my ratings over time, with the number of movies I've rated 2.5 recently surpassing 4.5, and 2 surpassing 5.
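The data behind a graph like this is a running total per group. Below is a minimal sketch, assuming each diary entry is a dict with a "date" and whatever field you want to split on; that entry shape is my own invention, not the API's response format.

```python
from collections import defaultdict
from datetime import date


def cumulative_by_group(entries, key):
    """Map each group to a list of (date, running_total) points, oldest first."""
    dates_per_group = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e["date"]):
        dates_per_group[key(entry)].append(entry["date"])
    return {
        group: [(day, total) for total, day in enumerate(dates, start=1)]
        for group, dates in dates_per_group.items()
    }


# Made-up sample entries for illustration.
diary = [
    {"date": date(2023, 7, 3), "rating": 4.0},
    {"date": date(2023, 7, 1), "rating": 2.5},
    {"date": date(2023, 7, 2), "rating": 4.0},
]
lines = cumulative_by_group(diary, key=lambda entry: entry["rating"])
print(lines[4.0])  # [(datetime.date(2023, 7, 2), 1), (datetime.date(2023, 7, 3), 2)]
```

Plotting one line per group makes a steeper slope (a new bike, say) or a plateau (a vacation) easy to spot.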
Account Similarity
I was picking a movie to watch with a group and wanted to see the intersection of our watchlists. You can do this with two people in the UI, but not with more. Luckily with a script it is possible! I also had the script find movies where we substantially disagreed on the rating, or movies where we agreed with each other but disagreed with the average score.[12]
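Once each member's data is fetched, both comparisons reduce to set and dictionary operations. Here is a minimal sketch covering the shared watchlist and the rating disagreements. The input shapes, the member and film names, and the two-star threshold for "substantially" are all assumptions for illustration.

```python
def shared_watchlist(watchlists):
    """Return the films that appear on every member's watchlist."""
    sets = [set(films) for films in watchlists.values()]
    return set.intersection(*sets) if sets else set()


def disagreements(ratings, threshold=2.0):
    """Return films whose highest and lowest ratings differ by at least `threshold`."""
    votes_by_film = {}
    for member, films in ratings.items():
        for film, rating in films.items():
            votes_by_film.setdefault(film, {})[member] = rating
    return {
        film: votes
        for film, votes in votes_by_film.items()
        if len(votes) > 1 and max(votes.values()) - min(votes.values()) >= threshold
    }


# Made-up sample data.
watchlists = {
    "ann": ["Film A", "Film B", "Film C"],
    "bo": ["Film B", "Film C"],
    "cy": ["Film C", "Film D"],
}
ratings = {
    "ann": {"Film X": 4.5, "Film Y": 3.0},
    "bo": {"Film X": 2.0, "Film Y": 3.5},
}
print(shared_watchlist(watchlists))  # {'Film C'}
print(disagreements(ratings))        # {'Film X': {'ann': 4.5, 'bo': 2.0}}
```

The intersection works for any number of members, which is exactly what the two-person UI can't do.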
Next Steps
I have a number of other scripts that are WIP: recommendations for users to follow or reviews to like, automatic color sorting for lists, poster syncing with custom media libraries, etc. Whenever they're ready they'll end up in this GitHub repo. Until then, if you're using this please be careful and don't put too much load on the servers. And let me know if your refresh token/client secrets are the same as mine!
1. Requests can take months if they even get a reply.
2. Check out the blog writeup!
3. I've used this in the past with Fiddler on my Windows computer and with an iOS app for getting Roborock vacuums working with iOS Shortcuts.
4. The industry owes a lot to Max, even if it hasn't always shown it.
5. You really shouldn't do this, but NSO keeps finding zero-click RCE for iPhones anyways so 🤷‍♂️.
6. I always choose ports in the same way as a holdover from working on some projects in college that needed 10+ ports. First I take a relevant word (in this case mitmproxy), then convert it to A1Z26, multiply the resulting numbers (78848640000), and then take a valid excerpt (the 48640 in 78848640000).
7. Ravenclaw Common Room is a holdover from when I briefly had four WiFi networks: 2.4GHz, 5GHz, the default Xfinity one and my secondary router. You can blame the HP reference on me not watching Seinfeld and having only a passing interest in the Beatles, culling the pool of famous quartets.
8. It even works without the cookies, though I keep them in for Sid.
9. Or at least, it was until around 420 reviews in, before I adjusted my rating approach. Now it's skewed.
10. This is why a lot of my older reviews have more plot and character information and usually a direct "watch" or "don't watch" kicker.
11. The intermittent plateaus tend to be vacations, TV shows, and whatever campaign Brennan Lee Mulligan is whipping up these days.
12. I stand by most of my reviews that conflict with my friends'...except for Promising Young Woman. I blame being swept up in the violin cover of Toxic.