Monthly Archives: May 2012

Photo metadata analysis

Overview

Digital SLRs record a range of information (metadata) about each shot alongside the actual image. This includes, for example, the aperture, the shutter speed and the focal length used for the shot. The information is stored in an Exchangeable image file format (EXIF) data structure that we can read and process. If you have a large collection of photos, it can tell you things about your habits that you would not otherwise know. For example, I analysed which focal lengths I use across my collection before deciding which prime lens to buy.

Attempt 1 – nodejs with exif module

If you are not into nodejs and coffeescript, you may skip this section entirely.

Prerequisites:

  • nodejs
  • coffeescript
  • exif module (npm install exif)
  • fs module (built into nodejs, no installation needed)
The script below takes a variable number of arguments of the form Directory1 Directory2 …
It runs through each directory and prints comma-separated values (date, exposure time, focal length and aperture) for every image. Redirect the output to a file, which can then be imported into Excel or used in Gnuplot.

ExifImage = require('exif').ExifImage
fs = require 'fs'
path = require 'path'

# Print one CSV row for an image: date, exposure time, focal length, aperture.
process_image = (img_filename) ->
  # Progress and errors go to stderr so that stdout contains only CSV rows.
  console.error 'Processing file: ' + img_filename
  new ExifImage { image : img_filename }, (err, img) ->
    if err
      console.error 'Error: ' + err.message
    else
      process.stdout.write tag.value for tag in img.exif when tag.tagName is 'DateTimeOriginal'
      process.stdout.write ', ' + tag.value for tag in img.exif when tag.tagName is 'ExposureTime'
      process.stdout.write ', ' + tag.value for tag in img.exif when tag.tagName is 'FocalLength'
      process.stdout.write ', ' + tag.value for tag in img.exif when tag.tagName is 'FNumber'
      process.stdout.write '\n'

# Process every .jpg file (case-insensitive) directly inside dir_name.
process_directory = (dir_name) ->
  fs.readdirSync(dir_name).forEach (file) ->
    extension = file.split('.').pop()
    if extension.toLowerCase() == 'jpg'
      # path.join adds the separator if dir_name has no trailing slash.
      process_image path.join(dir_name, file)

# Every command-line argument after the script name is a directory.
a = process.argv[2..]
a.forEach (val, index, array) ->
  console.error 'Processing directory: ' + val
  process_directory val

Unfortunately, this does not work even for a directory containing as few as 20 files: the script eats all available memory and crashes. It does work for a directory with only a few files, though. Try it.

TODO: check how the existing exif module is implemented and see whether it can be made to read only the EXIF record instead of loading the entire image into memory.

Attempt 2 – bash and awk

After failing with the cool coffeescript-based attempt, I decided to quickly hack together a command-line bash script that, together with awk and the command-line exif program, achieves what I want: a histogram of all focal lengths used in a photo collection. This method is fast and reliable.

Prerequisites:

To do this for your collection, here’s what you’ll need:

  • bash
  • awk
  • exif (or any other command line executable tool you fancy, that can read EXIF data)
  • (optional) gnuplot (command line plotting tool); you can use Excel or similar program, too.

If you do not have exif installed and you are on Mac OS X, you can get it with: brew install exif (if you do not use homebrew, you should definitely give it a try; you’ll love it).

If you are on Linux, use your package manager to install exif. bash and awk come by default on both Mac OS X and Linux.

First, we need to collect the focal lengths from all the images in a given location. The following command goes into your specified DIRECTORY and searches for all files at the top level and in all subdirectories. The command assumes that you store only images there; if you want to limit it to .JPG or .jpg files only, check the manual for the find command:

find DIRECTORY -type f -exec exif -m -t 'Focal Length' {} \; > mydata.txt
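
If your directories also contain non-image files, one way to restrict the search is find’s -iname test, which matches names case-insensitively (it is available in both GNU and BSD find, but is not part of POSIX). The sketch below wraps this in a helper; the name for_each_jpeg is made up for illustration and is not part of any tool.

```shell
# Run a command once per JPEG file found under a directory.
# for_each_jpeg is a hypothetical helper name, not an existing tool.
# Usage: for_each_jpeg DIRECTORY COMMAND [ARGS...]
for_each_jpeg() {
  dir="$1"
  shift
  # -iname matches case-insensitively, so .jpg, .JPG and .jpeg all qualify.
  # "$@" is the command to run; find appends each file name in place of {}.
  find "$dir" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -exec "$@" {} \;
}

# The same collection step, restricted to JPEG files:
# for_each_jpeg DIRECTORY exif -m -t 'Focal Length' > mydata.txt
```

Passing the command as arguments keeps the file filter in one place, so you can swap exif for another EXIF reader without touching the find expression.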

After that, have a look at mydata.txt; it should contain something along these lines:

25.0 mm
20.0 mm
26.0 mm
16.0 mm
19.0 mm
39.0 mm
34.0 mm

Great. Now, we need to convert it to a histogram-like structure, so that we can visualize it on a graph. To do that, we’ll use cat and awk:

cat mydata.txt | awk '{count[$1]++} END {for (j in count) print j, count[j]}' > mydata

The above line counts each occurrence of a given focal length and combines the counts into a table. If you look inside the mydata file, you should see something like this:

29.0 22
20.0 44
150.0 1
187.0 1
105.0 13
31.0 15
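
The rows come out in no particular order because awk does not guarantee any iteration order for its arrays. That is fine for plotting, but if you want to read the table yourself, a small variation (a sketch only; focal_histogram is a made-up name) pipes the result through sort -n so that the focal lengths are listed in ascending order:

```shell
# Count occurrences of each focal length (first column of the input)
# and list them in ascending numeric order.
# focal_histogram is a hypothetical helper name; it reads the files
# given as arguments, or standard input if there are none.
focal_histogram() {
  awk '{ count[$1]++ } END { for (f in count) print f, count[f] }' "$@" | sort -n
}

# Usage: focal_histogram mydata.txt > mydata
```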

Now, with the mydata file, we’re ready to use gnuplot (or you can load the file into Excel). Simply fire up gnuplot, and do:

plot "mydata" w impulses lw 3 lc rgb "#00AA00"
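
If you would rather not start an interactive session, the same plot can be scripted. The sketch below prints a short gnuplot script that you can pipe into gnuplot. It assumes that your gnuplot build includes the png terminal; the function name gnuplot_script, the axis labels and the image size are my own choices for illustration.

```shell
# Print a gnuplot script that renders the histogram to a PNG file.
# gnuplot_script is a hypothetical helper name.
# Usage: gnuplot_script DATA_FILE OUTPUT_PNG | gnuplot
gnuplot_script() {
  data_file="$1"
  out_png="$2"
  # The here-document expands the two shell variables into the script.
  cat <<EOF
set terminal png size 800,600
set output "$out_png"
set xlabel "Focal length (mm)"
set ylabel "Number of photos"
plot "$data_file" using 1:2 with impulses lw 3 lc rgb "#00AA00" notitle
EOF
}

# Example: gnuplot_script mydata histogram.png | gnuplot
```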

Done. I hope you’ll enjoy finding out which focal lengths you love to shoot at.

Check examples of focal length histograms of my photos.

 

What defines you

Direct vs. Indirect control

I often spend a lot of time planning, envisioning, and wishing for a particular outcome to happen. It occurs in different areas of work – in software projects, in design, in management. This is normal. This “wishful thinking” gives us direction and focus. It helps to identify elements and activities that shift odds in our favour. This “wishing” helps to create a plan. It does help, indirectly, to achieve our desired outcome. And so on. This is how we all get some things done. We “wish” for something to happen, we do “our thing”, and then the outcome happens. As we wished for, or not.

Often, the outcome decides whether the undertaking is considered a success or a failure. This is a mental jump that we make, a form of “simplified thinking”. The trap is that our “wishful thinking” and the “outcome” blur the value of our own work with all the value that is created regardless of us. Our mental processes blur the distinction between the elements that are in our direct control and the elements that are not.

Craftsmanship

One realisation that occurred to me is that things that are in my direct control are far more important than things that are not. And, more importantly, things that are in my direct control define me, whereas things that are not in my direct control do not. What follows from this is that the outcome alone does not define me. This is somewhat counter-intuitive and a bit entangled, so let me explain. I work on a project. I control a number of elements, e.g. the quality of the code, the usability of the interface, the architecture of the system, the engagement of end-users, and so on. Those things are in my direct control. These factors shift the odds of the project being adopted and used. But I do not directly control the end-user uptake. I can only influence it indirectly, and I can shift the odds by doing high-quality work, but there are a number of elements that will always be beyond my control. The trap is that I often disregard those elements and treat them as if they were non-existent. This is a mistake for two reasons: first, I lose focus on the things that are in my control, and second, I tend to take credit (or blame) for things that were completely outside my control or influence (things that just happened, regardless of my own efforts).

The result we wish for is often a combination of elements in our direct control and elements that we do not control directly. Focusing only on the outcome might distract you from the quality of your work. Instead, you should direct your focus to all the elements that are in your direct control.

Instead of the outcome, you should let the quality of your craftsmanship define you. 

The interesting paradox is that the outcome of a project is strongly influenced by the quality of work and all the elements in our direct control. Hence, the elements in our direct control and the actual outcome are entangled. Focusing on the mastery and the craftsmanship will ultimately help to achieve the desired outcome and it will prevent taking blame (or credit) for things that are beyond our control.