Digital Fingerprints and Privacy: The Paranoia of Online Activity

A growing concern these days is privacy in our ubiquitous online activity. Burgeoning development and research in the Internet of Things [1], Big Data, social networks, and data science, among others, have greatly expanded data collection, inference, pattern recognition, and an endless list of other techniques that could tell a beautiful story about you. Remember those robotic fortune-telling machines from the ’90s, which played pre-recorded messages at random? Then came a few movies about Artificial Intelligence, which predicted that robots would become more intelligent than humans and take over. Well, the present scenario is far more compelling than that fiction. Given your credit card data from Mint [2], driving data from the small plug-in [3] that insurance companies enticed you to install in exchange for discounts, your browser cache, and social network data from Facebook, Gmail, Twitter, and the other apps you use, it is possible to tell every detail about you, including some that even you don’t know.

These digital traces can reveal more information than your actual fingerprint. In comparison, fingerprints are nearly useless. One might lock your iPhone, but who needs physical access to your iPhone data when far more can be learned from the data passively collected from your activity?

What can data tell about you?

Though correlation versus causation [4] is hotly debated, inference can provide astonishing insights. Credit card transactions, which have largely replaced cash, leave extensive digital traces: the place of purchase, the item purchased, and patterns in purchasing. These traces help reveal your socio-economic status, needs, health conditions, and behavior. You might not realize it, but regular rather than occasional purchases can confirm a dependence on caffeine or alcohol. It can also be inferred that you like a particular kind of food, certain genres of books or music, certain travel destinations, or even rather secretive sexual encounters. Driving data reveals the routes you usually take, your home and work locations, or simply every detail of your whereabouts. Following you at a distance in a vehicle is old school; tracking is much closer and easier now.

Social network data is even more obvious. Shopping data, likes, tweets, and stars show what you like among the things you are fed online, and that says a lot about your character. What type of news are you interested in? Who are your close friends? How strong are your relationships? Do you have a partner? Do you share common interests? Is your partner honest with you? Are you happy with your life? How do you perceive someone’s happy and sad updates? Do they have any impact on you? What are your real strengths and weaknesses? How offensive or defensive are you? Does showing you similar objects on various platforms at different times subliminally persuade you to buy them? For example, a political campaign that frequently shows a leader and a few of his advantages over his opponent across various platforms can affect your voting decision. The list of questions that data can answer is endless.

Unless you are among the law-abiding majority, your unusual activity is rather easy to track. Search extensively for a monitored topic online, buy a knife or gun with your credit card, then track the person or entity you are targeting in the usual hide-and-seek fashion, and you will find a cop waiting for you instead of your plan working out. In fact, the Chicago police are already using such algorithms and data to predict criminal activity [5]. Though there is no reliable reference for it, the NSA is rumored to flag possible criminal or terrorist activity when a person purchases related items, such as a pressure cooker, cycle rims, and electrical parts, that could be used to make a bomb. One or two instances may not be enough to confirm such intent, and correlation does not always imply causation, but careful inference across many such instances can predict the outcome very accurately.

The positive side of data collection and analysis:

But there are some great advantages to sharing your data. Ever wondered how your Facebook timeline keeps getting better day after day? Or how your Google search results align so closely with your interests? Every search you make, status you like, post you update, or link you open tells more about you to the Artificial Intelligence algorithms that train themselves to serve you better. Targeted advertising is another instance. Amazon’s recommendations of related products might be old school, but how about Google showing results for lawyers in your area while you are searching for information about traffic tickets? In this case, Google knows what you want and where you live, but you may trust Google to store your information securely in exchange for better search results. And have you tried Fitbit’s sleep clock yet? Isn’t it amazing that your clock wakes you up only after you have had enough sleep? Other instances include farmers using sensors to find better watering conditions [6], Google’s live traffic updates, and so on. Check out more such applications here: [7][8][9][10][11], and follow their references for in-depth study.

How to be cautious in this rather open world:

Removing your social network accounts is not going to help. They are actually the least of your worries. If you are really concerned about your privacy, you would probably have to abandon the Internet of Things altogether. You would have to use cash for all payments and give up your mobile phone, email, social network accounts, searches on Google or any other search engine, internet connectivity, and computers. The list is so long that you might be better off living on an uninhabited island.

Data sharing is not bad; it might actually help you, and poor researchers too. But it is really important to satisfy yourself on at least these questions:

  1. Who are you sharing data with?
  2. What kind of data are they collecting? A mobile phone can capture all your sensor data, such as location, gyroscope, accelerometer, physical activity, and logs, among others. A browser cache can be worse. Searches are fine to some extent, as long as your information is anonymized.
  3. How are they collecting your data: through your mobile phone, browser cache, or some other device?
  4. How will they use your data? What is their policy on sharing, selling or analyzing your data?
  5. What kind of anonymization procedures are they following? Can any of your information be reconstructed by combining various patterns?
  6. What happens if your data is accidentally leaked? How much worse could it be than the Ashley Madison leak?
  7. What legal action can you take if your data is mishandled?

Most importantly, by now you should have understood that collective data analysis can provide great insights. But you should also know that such insights usually cannot be drawn from any single source of data. This means you are probably safe as long as you are sure that your data is stored in separate channels and there is no way for someone to combine them. Remember, you can safely allow individual entities to collect data in order to serve you better, but you should not allow all such entities to combine their data and get to know more about you. Be better informed, and exercise your discretion while sharing your data.
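
To make the point about combining channels concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the two datasets, the field names, and the `link` helper are assumptions, not any real service's data or API. Neither dataset alone ties a name to a purchase, but joining them on shared attributes (zip code and birth year) does.

```python
# Hypothetical records held by two separate services (illustrative only).
# The purchase log has no names; the profile list has no purchases.
purchases = [
    {"zip": "60614", "birth_year": 1985, "item": "coffee beans"},
    {"zip": "60614", "birth_year": 1990, "item": "running shoes"},
    {"zip": "60657", "birth_year": 1985, "item": "sleep aid"},
]
profiles = [
    {"zip": "60614", "birth_year": 1985, "name": "Alice"},
    {"zip": "60657", "birth_year": 1985, "name": "Bob"},
]


def link(purchases, profiles):
    """Join the two datasets on the (zip, birth_year) pair they share."""
    names_by_key = {}
    for profile in profiles:
        key = (profile["zip"], profile["birth_year"])
        names_by_key.setdefault(key, []).append(profile["name"])

    linked = []
    for record in purchases:
        names = names_by_key.get((record["zip"], record["birth_year"]), [])
        if len(names) == 1:  # exactly one candidate: the record is re-identified
            linked.append((names[0], record["item"]))
    return linked


print(link(purchases, profiles))
# prints [('Alice', 'coffee beans'), ('Bob', 'sleep aid')]
```

The second purchase stays anonymous only because no profile matches it. The more attributes two channels share, the fewer people fit each combination, and the easier this kind of linking becomes.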

What are the biggest myths software engineers believe?

Answer by Kishore Kumar:

The software engineer's dilemma: problems or practices one simply doesn't believe in, or follow, until one faces the consequences oneself.

Opinion :- Result / Response:

  1. An extra 'for' loop doesn't make much difference here :- The service ends up consuming O(N^2) resources or worse. That is little better than setting the servers on fire if the service is called millions of times a day.
  2. Design is a waste of time, I have done much bigger projects before :- Lost in a maze in the middle of the project and has to redo the whole thing.
  3. Comments and naming conventions? Dude, you are a developer! Go figure it out yourself :- (The next hour, working on someone else's code) What the hell is he doing here? Why is this named XY rather than XX, anyway?
  4. Subversion update? I am probably the only one working on this file :- Ends up in a mess of conflicts.
  5. I will commit all my changes at once, when I am done :- Something goes wrong somewhere, or the previous version was better than this one, and there is no way to go back other than rewriting from scratch.
  6. Unit tests? F**k, my code is *** proof :- (One of the external services changes; the code might still be *** proof.) Where the hell is it breaking?
  7. Similarly, logs? Boring! :- What the hell is happening? How do I figure out where it is breaking?
  8. AES or some other security algorithm is too rigid, I could probably write or customize one myself :- Well, actually, AES or some X is better, and I might be better off following established practices sometimes.
  9. There is so much management that I waste more time in meetings than writing code :- (Looking at some new code) Who made these changes? Were they supposed to be here?
  10. Why does my company spend/waste so much money on testers? :- On seeing a bug reported by a tester: f**k, how did I miss that?
  11. Those first-time set-up instructions are absurd, particularly for that server X. Who does it like that, anyway? I did better in my previous job :- OK, I might have to redo it from the beginning, following each step carefully.
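
As a small illustration of the first item, here is a hedged Python sketch. The function names and the task (finding IDs common to two lists) are hypothetical, chosen only to show how one extra inner loop turns a roughly linear job into a quadratic one.

```python
def common_ids_quadratic(a, b):
    """Extra inner loop: up to len(a) * len(b) comparisons."""
    result = []
    for x in a:
        for y in b:
            if x == y:
                result.append(x)
                break  # stop scanning b once x is found
    return result


def common_ids_linear(a, b):
    """Same result with a set lookup: about len(a) + len(b) steps."""
    seen = set(b)  # built once; average-case constant-time membership tests
    return [x for x in a if x in seen]


a = [3, 1, 4, 1, 5]
b = [1, 5, 9]
print(common_ids_quadratic(a, b))
print(common_ids_linear(a, b))
```

With a handful of items the difference is invisible. With two lists of a million IDs each, the first version can make on the order of a trillion comparisons, while the second does a couple of million steps.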

This list is far from exhaustive. Suggestions and additions are welcome.
