Description
If we rely only on SSO-based API integrations, we won't be able to provide very strong guarantees of identity. Unfortunately, most of these APIs do not expose information that the provider has verified in any meaningful way.
However, thanks to GDPR and other legislation, almost all user data has to be accessible via at least a web UI in order for applications to comply with the restrictions placed on them.
In the long run, it would be great to have a more direct way to access this information, but in the short run, we can build tools for web-scraping that will help us get much more meaningful identity verification.
There are a number of architectural decisions to make, so let's discuss potential approaches in this issue. I've reached out to a friend who recently built a fairly sophisticated client-side web scraping system and who will hopefully be able to provide further insight.
Architecture proposal
For web-based users:
We could build a Chrome/Firefox extension that connects to our backend servers via websocket. This would allow us to "drive" the user through the necessary steps to grab information from third-party services. We'd definitely want to think through fraud prevention there, as blindly trusting client-side data is rarely a good idea. Our best bet would be to have the extension "sniff" the API requests that supply the information to the page, rather than reading the rendered page itself (I'm not sure exactly how this is done, but I have it on good authority that it is possible, at least with Google Chrome extensions).
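To make the "server drives, extension reports" idea concrete, here is a minimal sketch of what the messages over the websocket might look like and how the extension could answer them. Everything in it is an assumption for discussion: the message shapes, the `handleCommand` function, and the idea of matching captured responses by URL pattern are all hypothetical, and the actual websocket and request-interception plumbing is left out.

```typescript
// Hypothetical commands the backend could send to the extension.
type ServerCommand =
  | { kind: "navigate"; url: string }
  | { kind: "capture"; urlPattern: string; requestId: string }
  | { kind: "done" };

// A response the extension has observed while the user browses
// (how it is observed is out of scope for this sketch).
interface ObservedResponse {
  url: string;
  status: number;
  body: string;
}

// Hypothetical replies the extension could send back.
type ClientReply =
  | { kind: "navigated"; url: string }
  | { kind: "captured"; requestId: string; url: string; body: string }
  | { kind: "missing"; requestId: string }
  | { kind: "finished" };

// Pure handler: given a command and what has been observed so far,
// decide what to report back to the server.
function handleCommand(
  cmd: ServerCommand,
  observed: ObservedResponse[]
): ClientReply {
  switch (cmd.kind) {
    case "navigate":
      // A real extension would also point the tab at cmd.url here.
      return { kind: "navigated", url: cmd.url };
    case "capture": {
      // Return the first successful response whose URL matches the pattern.
      const pattern = new RegExp(cmd.urlPattern);
      const hit = observed.find(
        (r) => r.status === 200 && pattern.test(r.url)
      );
      return hit
        ? { kind: "captured", requestId: cmd.requestId, url: hit.url, body: hit.body }
        : { kind: "missing", requestId: cmd.requestId };
    }
    case "done":
      return { kind: "finished" };
  }
}
```

Keeping the handler free of browser APIs means the protocol can be unit-tested on its own. The server should still treat anything in a `captured` reply as untrusted input, per the fraud-prevention concern above.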
For mobile users:
This will be a poorer UX for sure, but for accessibility purposes it's badly needed. We will need a react-native application that renders a webview and accomplishes the same goals as the extension above via more "usual" web-scraping techniques (reading the HTML). The good news is that within a signed mobile application, it's much harder to fake the data we're retrieving, so there's less need for extensive fraud prevention measures. In addition, it might also be possible to sniff web requests within an iOS/Android webview, which would make our scrapers much more resilient. Either way, the app would likewise need to be driven by a real-time connection to our servers.
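For the HTML-reading fallback, one possible shape is to have the server send extraction rules and have the app apply them to whatever page HTML the webview hands back. The sketch below assumes this design. The `ExtractionRule` type and `extractFields` function are hypothetical names, the rules are plain regular expressions with one capture group, and getting the HTML out of the webview is not shown.

```typescript
// Hypothetical rule sent by the server: which field to fill, and a
// regular expression whose first capture group holds the value.
interface ExtractionRule {
  field: string;
  pattern: string;
}

// Apply each rule to the page HTML. Fields whose pattern does not
// match are reported as null so the server can tell "absent" from "empty".
function extractFields(
  html: string,
  rules: ExtractionRule[]
): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const rule of rules) {
    const match = new RegExp(rule.pattern).exec(html);
    result[rule.field] =
      match !== null && match[1] !== undefined ? match[1].trim() : null;
  }
  return result;
}

// Example usage with made-up markup.
const sampleHtml = '<div><span id="email"> a@example.com </span></div>';
const fields = extractFields(sampleHtml, [
  { field: "email", pattern: '<span id="email">([^<]*)</span>' },
  { field: "phone", pattern: '<span id="phone">([^<]*)</span>' },
]);
console.log(fields);
```

Regex-over-HTML is brittle, which is part of why sniffing the underlying requests would be more resilient. Keeping the rules on the server at least lets us update them without shipping a new app build.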