My name is Marcus, and I'm a data addict #resetthenet #snowden

I'm going to rehab.

I'm going to start going to Data Addicts Anonymous, because as a programmer who builds online services, "Collect it all just in case" is a hard habit to break.

We all need to get into the habit of collecting just enough data, and storing it just long enough, to perform a specific function. Keeping it longer is so tempting, and storage is so cheap, that you find yourself thinking "ahh well, I'll store it, it might be useful later". Chances are it never will be.

Case in point: in a system I'm working on, we extract EXIF data from uploaded images (and strip it from the source, so that our customers' privacy is preserved). The only thing the EXIF data is currently used for is working out the orientation of thumbnail images, yet my instinct was to store it in the database anyway.

Why? Chances are I'll never use this information, and collecting it just means it can be NSLed (demanded under a National Security Letter) in the future. If we go so far as to strip it from public images, why store it in the database?
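
As a rough sketch of what "collect only what you need" could look like in code: an allowlist that keeps the one field the thumbnailer uses and drops everything else before anything touches the database. The names here (`NEEDED_EXIF_FIELDS`, `minimise_exif`) and the plain-dict representation of the EXIF data are made up for illustration; this is not the actual system's code.

```python
# Hypothetical allowlist: the only EXIF field the thumbnailer actually needs.
NEEDED_EXIF_FIELDS = frozenset({"Orientation"})


def minimise_exif(exif: dict) -> dict:
    """Return a copy of the metadata containing only allowlisted fields.

    Assumes the EXIF data has already been parsed into a plain
    {tag_name: value} dict by whatever image library is in use.
    """
    return {
        name: value
        for name, value in exif.items()
        if name in NEEDED_EXIF_FIELDS
    }


if __name__ == "__main__":
    uploaded = {"Orientation": 6, "GPSLatitude": 51.5, "Make": "ExampleCam"}
    needed = minimise_exif(uploaded)
    # Only the orientation survives; location and device details are never kept.
    print(needed)
```

An allowlist rather than a blocklist is the point: any field nobody thought about gets dropped by default instead of stored by default.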

My pledge for #resetthenet: I promise, in the systems I build, to collect only the minimum amount of information needed to perform a specific task, and to store it only as long as absolutely necessary to perform it.