By Bradley J. Bartram, Vice President of Information Technology & CTO
I recently worked on an engagement dealing with a company defending against an action brought by an ex-employee. The case focused on alleged changes to a corporate database that, if true, would constitute a possible fraud.
This case made me stop and think about the nature of the data we put into our devices and what it knows about us. By now, we are aware of the normal pieces of data stored on digital devices and how easy it is to forensically recover them. However, what I’m going to discuss over the course of this article is the implied data that can be gleaned from a digital data repository.
Generally, when people think of their information, they think of the obvious pieces they store: the content of a Word document, the values and calculations placed into Excel, the names and numbers in the address book, and so on. These are the normal pieces of data that we look for and examine, and they provide the plot in our examination’s narrative. But any student of literature will recognize there’s more to a good story than just the plot. There are characters, events, subtle details, twists, turns, and pacing. Taken as a whole, it all comes together to give the reader a cohesive story. Miss one or more critical elements and your great novel becomes pulp. A digital examination is the same way. We need more than just content to determine the full context of the investigation.
So how does any of this relate to the topic at hand?
Well, data written to storage over the course of normal use is a very interesting thing. It’s very, very hard to fake. Sure, elements can be created that look like they belong, and they may be quite convincing when taken out of context as individual fragments. But when all of the data is taken together, things fit together in a logical and expected fashion, just like a well-crafted story. Taken as a whole, a planted piece of data or an attempt to modify something will stick out to the trained observer.
What does our data know about us and why is it so quick to drop the dime?
Moving back to the database engagement I led off with, the stored data has several dimensions. First is the obvious content of the tables and the fields. This is what the database is designed to hold, usually in support of another application. Next are the various pieces of data that designers usually put into the system to make life more convenient or easier for the user. These are bits of data like time stamps that may tell when a data record was created, modified, or otherwise accessed. Finally, there are often parts of the information programmed in specifically for the benefit of the software designer. Software has bugs. Software developers need some method of knowing what software does and when it does it so they can track down the bugs and fix them. This works to our investigative advantage.
When taken as a whole, and not as individual components, this extra data comes together to paint a picture of system use. As users interact with a database in their normal routines, patterns of use emerge. If a small company has a 9 to 5 staff, usage is expected to begin at 9 or just after, slow down near traditional lunch times, pick up again afterward, and end at 5 or just before. Data is not expected to be modified at odd times unless a system function is scripted to do that, in which case the routine is stored and probably predictable. Given enough sample data, each user will show distinct usage patterns that can be statistically compared to other samples to determine variations. Variances outside the norm will stick out like a beacon. Inconsistencies in what is stored on the media can point to wrongdoing, and consistent data can just as readily be used to exonerate the accused. Ultimately, the data tells the story.
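To make the idea of comparing activity against a user's normal pattern concrete, here is a minimal sketch in Python. It is not the method used in the engagement described above. It assumes the audit data has already been reduced to (user, timestamp) pairs, for example from a hypothetical "modified_at" column. The function name flag_unusual_events, the min_share threshold of 5 percent, and the sample records are all invented for illustration. It builds an hour-of-day profile for each user and flags events that fall in hours that user rarely works.

```python
from collections import Counter, defaultdict
from datetime import datetime


def flag_unusual_events(events, min_share=0.05):
    """Return events whose hour-of-day is rare for that user.

    events: list of (user, datetime) pairs, e.g. taken from a
    hypothetical 'modified_at' audit column.
    min_share: assumed cutoff; an hour holding less than this share
    of a user's activity is treated as unusual for that user.
    """
    # Build a per-user histogram of activity by hour of day.
    hours_by_user = defaultdict(Counter)
    for user, ts in events:
        hours_by_user[user][ts.hour] += 1

    # Flag every event that falls in a rarely used hour for its user.
    flagged = []
    for user, ts in events:
        counts = hours_by_user[user]
        share = counts[ts.hour] / sum(counts.values())
        if share < min_share:
            flagged.append((user, ts))
    return flagged


# Fabricated sample: one user active 9:00 to 16:59 on five days,
# plus a single modification at 02:40.
sample = [("alice", datetime(2024, 3, day, hour, 15))
          for day in range(4, 9) for hour in range(9, 17)]
sample.append(("alice", datetime(2024, 3, 6, 2, 40)))

for user, ts in flag_unusual_events(sample):
    print(user, ts.isoformat())
```

With this sample, only the 02:40 record is flagged. A real examination would need far more data and a more careful statistical treatment, including known scheduled jobs, weekends, and time zones. A flagged record is a lead to investigate, not proof of wrongdoing.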
Although this example has centered on a corporate database, the same can be said of a laptop hard drive. People are creatures of habit. It takes much more discipline and effort to randomize behavior than it does to fall into a pattern of actions. We don’t even realize it, but our devices are storing all of our secrets and will reveal them to whoever knows how to ask the right questions.