How to parse a date/time string that may be invalid only because of DST?



rlk
11th July 2020, 22:51
I'm a developer on KPhotoAlbum, a volunteer open source image management application for KDE. We recently received a bug report from a user because some of the person's photos were not getting the date extracted correctly from the EXIF data embedded in the photograph.

The date/time in a photo is, per the EXIF specification, encoded as

YYYY:MM:DD hh:mm:ss

The problem was one of the photos had a date string of

2000:04:02 02:05:37

This fell within the spring-ahead hour that year. As a result, the attempt to parse the string as a date yielded an invalid date, as documented. Even much more recent cameras don't automatically correct for DST (doing so would be problematic, not least because it would require a firmware release whenever the timezone rules changed anywhere). The result is that for photos taken during just the wrong hour of the year, we aren't able to extract a timestamp and have to fall back on the file modification time.

Obviously, this is a rather rare situation. I've personally never hit it in the roughly 300,000 photos I've shot, since I've never been out shooting at 2 AM on that particular Saturday night in the spring. Still, if there's a robust way of working around it, I'd like to be able to do something. One thought that came to mind: if the parser fails, append a 'Z' to the timestamp string (to indicate UTC) and try again; if that succeeds, it's a clear indication of a DST problem. We could then either accept the UTC time as the timestamp or, if we can somehow work out the timezone, correct for that. Fiddling with a timestamp string is not the most robust approach, however.

We'll probably simply document this as a corner case and recommend that people manually adjust the timestamp of their image files. That's not going to be to everyone's liking; one of the big draws of KPhotoAlbum is that it does not modify your image files in any way. It won't be acceptable, for example, to people using cameras that sign image files for forensic purposes, since the EXIF data is included in the signature.

Thoughts?

ChristianEhrlicher
12th July 2020, 11:24
Parse date and time separately and add the time to your datetime later on.
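
A minimal sketch of this suggestion, written in Python rather than against the date/time API the application actually uses (an assumption made to keep the example self-contained). The function name parse_exif_datetime is hypothetical. The date and time parts are parsed independently and then combined into a value that carries no time zone, so no DST rule is consulted and the spring-ahead hour cannot be rejected.

```python
from datetime import datetime


def parse_exif_datetime(raw):
    """Parse an EXIF 'YYYY:MM:DD hh:mm:ss' string into a naive datetime."""
    date_part, time_part = raw.strip().split(" ")
    # Parse each half on its own; neither step involves a time zone.
    day = datetime.strptime(date_part, "%Y:%m:%d").date()
    clock = datetime.strptime(time_part, "%H:%M:%S").time()
    # Combine into a single value that still has no time zone attached.
    return datetime.combine(day, clock)


stamp = parse_exif_datetime("2000:04:02 02:05:37")
print(stamp.isoformat())  # 2000-04-02T02:05:37
```

The result is only a wall-clock reading. Deciding which time zone it belongs to is a separate step.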

ChrisW67
12th July 2020, 23:34
Originally posted by rlk:
"The problem was one of the photos had a date string of 2000:04:02 02:05:37. This fell within the spring-ahead that year."

Unfortunately the EXIF spec does not specify the time zone that the time stamp represents.

If the camera was recording local time then it either has some idea of a timezone (e.g. derived from GPS, with or without DST) or is a naïve clock with no concept of timezone or DST at all. Since the camera recorded 02:05:37, its idea of the time did not include DST, and in my view you should parse it using a time zone without DST (e.g. UTC+10:00 instead of Australia/Sydney). You need to know the timezone at the location and at the time the photo was taken (not the viewer's current time zone). If you also have the GPSTimeStamp and GPSDateStamp metadata then you could derive a (non-DST) time offset to work with. I don't know how you would reliably determine the correct time zone if the camera has not also recorded it.
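
As a rough illustration of the fixed-offset idea, here is a Python sketch (not the application's own API). It assumes the GPS date arrives as a 'YYYY:MM:DD' string and the GPS time as three plain numbers for hours, minutes and seconds, both in UTC. It also assumes the camera clock and the GPS clock agree to within a few minutes, so the difference can be rounded to a quarter hour. The function names and the sample GPS values are made up for the example.

```python
from datetime import datetime, timedelta, timezone

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def offset_from_gps(local_raw, gps_date_raw, gps_hms):
    """Estimate a fixed UTC offset from the camera time and the GPS (UTC) stamp."""
    local = datetime.strptime(local_raw, EXIF_FORMAT)
    gps_day = datetime.strptime(gps_date_raw, "%Y:%m:%d")
    hours, minutes, seconds = gps_hms
    gps_utc = gps_day + timedelta(hours=hours, minutes=minutes, seconds=seconds)
    # Round the difference to the nearest 15 minutes (assumption: small clock drift).
    quarter = timedelta(minutes=15)
    steps = round((local - gps_utc) / quarter)
    return timezone(steps * quarter)


def parse_with_fixed_offset(local_raw, offset):
    """Attach a fixed, DST-free offset to the camera's wall-clock time."""
    return datetime.strptime(local_raw, EXIF_FORMAT).replace(tzinfo=offset)


# Hypothetical GPS stamp about ten hours behind the camera clock.
offset = offset_from_gps("2000:04:02 02:05:37", "2000:04:01", (16, 5, 30))
stamp = parse_with_fixed_offset("2000:04:02 02:05:37", offset)
print(stamp.isoformat())  # 2000-04-02T02:05:37+10:00
```

Because the offset is fixed, there is no spring-ahead gap for 02:05:37 to fall into.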

If the camera is recording UTC (or GPS time) then there is no DST, and 02:05:37 is perfectly valid.

The forensic argument is fine, but a lawyer would have a field day with this gloriously ambiguous EXIF timestamp (digitally signed or not) if the GPS data is not also present. I guess you could display both the original file string and the date/time as you have interpreted it with an explicit time zone shown.
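
One possible shape for showing both values side by side, sketched in Python. The function name describe_timestamp and the output wording are invented for illustration, and the UTC offset is assumed to have been chosen elsewhere (by the user or from GPS data).

```python
from datetime import datetime, timedelta, timezone


def describe_timestamp(raw, utc_offset_hours):
    """Return the untouched EXIF string next to its explicit interpretation."""
    parsed = datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")
    # The offset is an assumption supplied by the caller, not read from the file.
    zone = timezone(timedelta(hours=utc_offset_hours))
    interpreted = parsed.replace(tzinfo=zone)
    return f"{raw} (interpreted as {interpreted.isoformat()})"


print(describe_timestamp("2000:04:02 02:05:37", 10))
# 2000:04:02 02:05:37 (interpreted as 2000-04-02T02:05:37+10:00)
```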