So far I’ve found geolocations in XML metadata that my actioncam stores on disk as separate .XML files, and I’ve found them in JPG files. When I showed the cool maps I made to my father, he asked if I could create maps from his holiday videos, so that he can show them in his video compilations.
Where do locations get stored in video files?
My father has a Sony PJ650VE video camera that makes videos in AVCHD format. Even the camera itself can show you a map of a video location. So I knew it should store geolocations somewhere. But looking on disk I saw no handy metadata files for me to read. So where did the locations go?
I learned that video formats like MP4, Quicktime (.mov) and AVCHD have EXIF metadata stored in them, just like JPG files. Luckily I had all the videos my father had made of our trip to the east coast of the USA in 2013. So I had lots of examples of AVCHD files to work with.
And a familiar tool called exiftool by Phil Harvey can read that metadata not just for photos, but also for videos. The geolocation data for AVCHD and MP4 files, however, is stored as “embedded” metadata. You can get that embedded metadata by running exiftool with the -ee option.
What you see here is the exiftool command to extract embedded geolocation data from an AVCHD file (.MTS extension). For this I had to use a gpx.fmt file to format the geo data. You can find that file here: https://github.com/exiftool/exiftool/blob/master/fmt_files/gpx.fmt.
exiftool -p "C:\Program Files\Exiftool\fmt_files\gpx.fmt" -ee .\00000.MTS > 00000.gpx
This will produce a .gpx file with not just one geolocation, but several geolocations measured while the video was recorded. As far as I can see it has two geolocation measurements per second. But that might be a setting per camera. Here is an example:
[..]
<trkpt lat="26.4226047222222" lon="-81.9083294444444">
  <ele>8.806</ele>
  <time>2013-09-20T20:54:22Z</time>
</trkpt>
<trkpt lat="26.4226047222222" lon="-81.9083294444444">
  <ele>8.806</ele>
  <time>2013-09-20T20:54:22Z</time>
</trkpt>
<trkpt lat="26.422605" lon="-81.9083297222222">
  <ele>8.81</ele>
  <time>2013-09-20T20:54:23Z</time>
</trkpt>
<trkpt lat="26.422605" lon="-81.9083297222222">
  <ele>8.81</ele>
  <time>2013-09-20T20:54:23Z</time>
</trkpt>
As you can see it has latitude and longitude, as well as elevation (which is the same as altitude?) and a timestamp. This works with both AVCHD files and the MP4 files my Sony actioncam shoots.
Exiftool in Python
That geo data was what I wanted, but how do I get it into Python? I didn’t want to first write lots of .gpx files all over the place. It should be possible to read the geo data into a variable.
Looking for solutions I found lots of EXIF Python libraries. A lot of them hadn’t seen updates in years. But there is a PyExifTool library and it is up to date. The only problem: I could not find out how to get embedded data out of it. It should be possible with the class ExifToolHelper, but I never found out how.
Running an os command from Python
I decided in the end to go another route: running exiftool.exe from Python. For this I used subprocess.run from Python’s standard library. It can run an OS command for you and send the output to a file, but also to a pipe. And we can direct that pipe to a variable.
First I prepare my exiftool.exe command, just like the one I ran by hand. Except now of course the filename is a variable, which makes automation possible.
exiftool_command = ["exiftool", "-ee", "-m", "-p", "C:\\Program Files\\Exiftool\\fmt_files\\gpx.fmt", self.mp4file_location_disk]
Then I run this command with subprocess.run. The output that would have been printed on screen is now directed to a pipe (subprocess.PIPE). The result of the run, including that output, goes into my variable called exif_metadata.
exif_metadata = subprocess.run(exiftool_command, stdout=subprocess.PIPE)
The output in exif_metadata.stdout is a bytes object. When you print it, it starts with b' and ends with another single quote, and it has lots of carriage return and line feed characters in it (\r\n). To clean that up, we can decode it to a string with this command:
exif_metadata_decoded = exif_metadata.stdout.decode('utf-8')
The exif_metadata_decoded variable will now hold that same XML metadata that exiftool wrote to that .gpx file.
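Put together, the steps above could look something like the sketch below. The function names build_exiftool_command and run_and_decode are my own invention for illustration, and check=True is an extra I added so that a failing command raises an error instead of silently returning empty output. The example at the bottom runs the Python interpreter as a stand-in command, so the sketch works even on a machine without exiftool.

```python
import subprocess
import sys


def build_exiftool_command(video_path, fmt_path):
    """Return the argument list for the exiftool call described above.

    Both parameters are hypothetical names: video_path is the video file,
    fmt_path is the location of gpx.fmt on your own disk.
    """
    return ["exiftool", "-ee", "-m", "-p", fmt_path, video_path]


def run_and_decode(command):
    """Run a command and return what it wrote to standard output as a str."""
    # stdout=subprocess.PIPE captures the output instead of printing it.
    # check=True (an assumption, not in the original code) raises
    # CalledProcessError when the command exits with a non-zero status.
    completed = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    # completed.stdout is bytes; decode it to get a normal string.
    return completed.stdout.decode("utf-8")


if __name__ == "__main__":
    # Stand-in command so this runs without exiftool installed.
    print(run_and_decode([sys.executable, "-c", "print('hello')"]))
```

With exiftool installed you would call run_and_decode(build_exiftool_command(video_path, fmt_path)) and get the GPX text back as one string.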
Digging into the XML
A short version of the XML data looks something like this:
<?xml version="1.0" encoding="utf-8"?>
<gpx version="1.0" creator="ExifTool 12.41" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.topografix.com/GPX/1/0" xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd">
  <trk>
    <number>1</number>
    <trkseg>
      <trkpt lat="26.4226047222222" lon="-81.9083294444444">
        <ele>8.806</ele>
        <time>2013-09-20T20:54:22Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
What we want to do is get into that <trkpt> stuff and pick out the lat, lon, ele and time data. Like we saw before, you can have multiple trkpt elements per video file. I’ve done some experiments where I plotted all these trkpt locations for all video files of one holiday. That’s two locations per second of video. It resulted in a big, big html file that my Firefox was suffering under. So let’s just pick the first location per video file.
So how do we tell Python to traverse to that <trk>, <trkseg> and pick up the first <trkpt> element it sees? Well, I’m still learning to read XML with Python, but here’s what I’ve done. I’ve used ElementTree from the xml.etree package:
from xml.etree import ElementTree as ET
I loaded the decoded exif metadata that I got from exiftool and created an XML root from that:
exif_metadata_root = ET.fromstring(exif_metadata_decoded)
Now I could iterate through the elements in that XML tree. Among them I saw my trkpt tags and from there I was able to pick out the lat and lon attributes.
for exif_element in exif_metadata_root.iter():
    if "trkpt" in exif_element.tag:
        lat = exif_element.attrib['lat']
        lon = exif_element.attrib['lon']
You’d think that ele and time would be regarded as attributes of trkpt also, but they are child elements of trkpt, so this iteration visits them as separate elements. To read these, I just retrieved the text value of them:
    if "ele" in exif_element.tag:
        elevation = exif_element.text
The latitudes and longitudes were already in decimal degrees, so I did not need to convert them.
To pick out only the first location I checked if my lat, lon and ele had data in them. If so, I break out of the for loop. Suggestions to create cleaner solutions are of course welcome.
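One possibly cleaner way to get just the first location is to let ElementTree search for the first trkpt directly, using the GPX namespace from the xmlns attribute in the XML above. This is only a sketch and not the code I actually ran: the function name first_trackpoint, the dictionary it returns and the shortened SAMPLE_GPX text are made up for this example.

```python
from xml.etree import ElementTree as ET

# Namespace taken from the xmlns attribute of the <gpx> element.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/0"}

# Shortened sample, based on the XML shown earlier in this post.
SAMPLE_GPX = """<gpx version="1.0" xmlns="http://www.topografix.com/GPX/1/0">
  <trk><trkseg>
    <trkpt lat="26.4226047222222" lon="-81.9083294444444">
      <ele>8.806</ele>
      <time>2013-09-20T20:54:22Z</time>
    </trkpt>
    <trkpt lat="26.422605" lon="-81.9083297222222">
      <ele>8.81</ele>
      <time>2013-09-20T20:54:23Z</time>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def first_trackpoint(gpx_text):
    """Return the first trkpt as a dict, or None if there is no trkpt."""
    root = ET.fromstring(gpx_text)
    # find() returns only the first match, so no loop or break is needed.
    trkpt = root.find(".//gpx:trkpt", GPX_NS)
    if trkpt is None:
        return None
    return {
        "lat": float(trkpt.attrib["lat"]),
        "lon": float(trkpt.attrib["lon"]),
        # ele and time are child elements, so we read their text.
        "ele": trkpt.findtext("gpx:ele", default=None, namespaces=GPX_NS),
        "time": trkpt.findtext("gpx:time", default=None, namespaces=GPX_NS),
    }


if __name__ == "__main__":
    print(first_trackpoint(SAMPLE_GPX))
```

Returning None for videos without a trkpt makes it easy to skip the files that have no geolocation in them.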
I then could send the location data straight to a Pandas dataframe to plot them with Folium.
I said that I would only pick out one geolocation per video file. But initially I didn’t and this is what it looked like. This is a track of videos my father and I made when we came in by ferry from Ocracoke (one of the Outer Banks islands of North Carolina) and we went on over Cedar Island, on to Morehead.
After that I changed my code so that it picks just one geolocation per video. Even generating that map takes a lot of time if you want to show the locations of 2000+ AVCHD videos (1178 of which had actual geolocations in them). Reading the videos themselves takes way more time than reading the little XML files my Sony actioncam produces. I haven’t timed the complete run, but it took more than 1 hour. And here is the result (the southern half of the journey in 2013 anyway):
You can find the Python code I’ve used to create this here:
So I hope my father will be happy that I can create maps from his videos. Of course this isn’t the kind of solution my father can run by himself, because he isn’t a developer. For now I will have to install Python on his PC and run it myself. Maybe some day I can create a GUI that he can run himself, one that points to a directory of videos and returns a jpg. But I never had a lot of success with Python GUIs, so don’t hold your breath for this one.
Other blogposts I wrote about geo data in Python: