Here[1] is your anonymization tool: kloak - Keystroke-level online anonymization kernel.
Authenticating users via typing isn't new. In fact it was already available more than 12 years ago. Unfortunately the original source isn't available anymore, but Bruce Schneier covered it back then: https://www.schneier.com/blog/archives/2005/11/authenticatin...
I also found a paper in German from the company that tried to commercialize it in the following years: http://www.horst-goertz.de/hgs-wordpress/wp-content/uploads/...
This is interesting. I think the weird underscore between words (which indicates space) is throwing me off though. I'd recommend removing that.
I’ve seen something similar for a mobile app:
https://www.onenigma.com/case-study/typingID
A friend built it a few years ago. It’s integrated into at least one app, and has some serious potential. I suspect a regular keyboard would be the same.
Part of the streaming anomaly detection benchmark dataset NAB has keyboard typing data (amongst other interesting sources) [1]. Comparing the algorithms would be quite interesting; the algorithm in [2] is unsupervised and doesn't need training data. The benchmark dataset and algorithms are open-source.
[1] https://github.com/numenta/NAB/tree/master/data#real-data
[2] Ahmad & Lavin, Neurocomputing 2016, https://www.sciencedirect.com/science/article/pii/S092523121...
Cool stuff! My name kept showing up along with a few others, but I wonder if that is due to insufficient users in my typing speed range. I'll try again later.
Please make commas more visible. The upturned underscore obscures it.
Interesting concept. I did 62 sentences and got "⭐️⭐️⭐️⭐️⭐️" as a rating but none of my last 5 are matched up. I figured it would be pretty easy to pick me out as I use http://mkweb.bcgsc.ca/carpalx/?full_optimization as my main layout and thought that would make my key distances very unique.
On a usability front I echo that the "␣" is definitely confusing, especially next to a ",".
While typing, a GitHub username that contains my surname showed up. That isn't all that unlikely -- my surname is among the top 50 in the US -- but it makes me wonder to what extent genetics determine similarity in typing patterns. Maybe I can use this to find some long-lost relatives. ;)
I made something like this (without the neural networks) that just records the typing cadence of users entering their password. If the cadence matches within a tolerance, it can be used as a crude indicator of identity, along with other factors like IP, geolocation, browser fingerprint, etc.
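A minimal sketch of that kind of cadence check, assuming key-down timestamps in milliseconds are already captured. The function names, the 40 ms tolerance, and the sample timings are all made up for illustration; this is not the commenter's actual implementation.

```python
from statistics import mean

def intervals(timestamps_ms):
    """Gaps between consecutive key-down times, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

def cadence_matches(enrolled_ms, attempt_ms, tolerance_ms=40.0):
    """True if the mean absolute gap difference is within the tolerance.

    tolerance_ms is a hypothetical threshold; a real system would tune it.
    """
    ref, new = intervals(enrolled_ms), intervals(attempt_ms)
    if len(ref) != len(new) or not ref:
        return False  # different password length, or too few keystrokes
    deviation = mean(abs(r - n) for r, n in zip(ref, new))
    return deviation <= tolerance_ms

# Illustrative timings only (not measured data).
enrolled = [0, 120, 260, 390, 600]
same_user = [0, 130, 255, 400, 590]
other_user = [0, 60, 300, 340, 800]

print(cadence_matches(enrolled, same_user))
print(cadence_matches(enrolled, other_user))
```

On its own this is a weak signal, which is why it would be combined with the other factors mentioned above.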
Funnily enough, I had a classmate in my grad neural networks class who did exactly this. Results on his end seemed to be pretty similar to yours, though I don't think he publicly hosted it anywhere.
Regardless, this is an interesting application of NNs.
I have often wondered if it would be possible to make a simple game of typing 3- or 4-digit sequences (similar to the dactylo typing games like "type the words before they fall to the bottom of the screen"), and then find out whether PIN code subsequences or digit transitions have a distinct timing pattern. If so, it would be very creepy: it would imply that browser-based games, a keylogger, or your employer could extract your PIN code from enough typing material (say, in some spreadsheet)...
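As a rough sketch of the first step of such an experiment, one could group the latency between consecutive digit presses by digit pair and average it. The event format, function name, and sample data are assumptions for illustration, and nothing here shows that a PIN could actually be recovered this way.

```python
from collections import defaultdict
from statistics import mean

def transition_latencies(events):
    """events: list of (digit, key_down_ms) in typing order.

    Returns {(from_digit, to_digit): mean latency in ms}.
    """
    buckets = defaultdict(list)
    for (a, t_a), (b, t_b) in zip(events, events[1:]):
        buckets[(a, b)].append(t_b - t_a)
    return {pair: mean(gaps) for pair, gaps in buckets.items()}

# Illustrative key presses only (not measured data).
events = [("1", 0), ("2", 150), ("1", 400), ("2", 530), ("3", 700)]
profile = transition_latencies(events)

for pair, latency in sorted(profile.items()):
    print(pair, latency)
```

Whether frequently typed transitions (such as those inside a PIN) stand out in such a profile is exactly the open question raised above.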
One interesting issue that I encountered is with the way the US International layout works. Normally when I type ', it acts as a dead key for accenting characters; however, ' followed by a character that can't be accented with it (say t or s) gives me 't or 's, whereas here I have to explicitly type '␣ to get '. I tried it on my own with the KeyEvent API and got the same problem. I'm not sure what the fix is, but it sure messes my writing up.
I'm incredibly impressed that you implemented the whole LSTM model in JavaScript [1] as well as Python. This indirectly gave me a lot of implementation insight, so thanks!
[1]: https://github.com/indutny/gradtype/blob/master/src/model.js
This seems worse than browser fingerprinting since it identifies people, not browsers.
Does the Tor browser contain a way to combat this?
After entering the 20 sentences, getting "sorry the server is down trying later" is the worst user experience ever. If the server is down, please do something before someone enters 20 sentences... or have a "try again" button to submit again...
I remember someone did this using a smart phone sitting on the same desk as the keyboard as the sensor (and maybe the processor, I forget), but the idea was to recreate what they were typing rather than biometric identification.
The _ and the , are a deal breaker for me, especially when following each other. That's not readable. Ain't copying 30 sentences like that.
Broken on apostrophe.
`don't` got stuck on the `'` char.
My college had some computer science folks that gave out $10 cash per student to type a page of text for obtaining training data.
got about 60 sentences in, and my name popped up here and there. my name's first occurrence was at sentence ~40.
Pretty neat.
Could someone please explain how this works? I’m familiar with RNNs but not sure how this is set up.
This is quite interesting. I’m curious whether it can be used for “forgotten password”... maybe not
I wonder how it will react to random input, I mean me just typing nonsense.
one more demo of this I saw - https://vikasdesai.github.io/keystroke-dynamics/
Are you planning to open source the dataset generated?
I heard the NSA has been doing this for nearly a decade.
This doesn't work and can never work. But it's a nice toy.
Creepy. Why on Earth would you build something like this?
Amazing how people in tech are so flippant about building tools that are almost exclusively useful for tyrants.
Sometimes it helps to know the jargon around this topic. If you look for "keystroke dynamics" several articles and github repos turn up.
Experienced Morse code operators used to be able to recognize who was at the other end that day by their typical intervals and mistakes.