Extracting Metadata Without Unpacking RAW Data

Is there a way to have libraw extract the metadata from a RAW file without unpacking all the data?

I'm using the LibRaw plug-in that is part of OpenImageIO. In that plug-in, unpack() is called before any of the image properties are read, which makes property reading very slow, because unpack() is slow relative to just reading the TIFF header found in most RAW files.

https://github.com/OpenImageIO/oiio/blob/master/src/raw.imageio/rawinput...

Can unpack() be skipped and the headers still be read correctly? What about the other values like image dimensions?

Indeed, you need only open_file() (or open_datastream() if you use it) to extract metadata.

unpack() only reads and unpacks the raw pixels, so you do not need to call it if you only need metadata.

Also, if you need the thumbnail too, you can use just the open_file() + unpack_thumb() calls.

-- Alex Tutubalin @LibRaw LLC

Followup:

If you need to free the file handle without releasing the metadata, there is the recycle_datastream() call. It closes the file handle and does nothing more.

-- Alex Tutubalin @LibRaw LLC

Followup2:

For non-square-pixel files (very old Kodaks and Nikons, and several other cameras) you need to call adjust_sizes_info_only() after open_file().

For most current cameras this call can be skipped.

-- Alex Tutubalin @LibRaw LLC

Alex,

Thanks for the quick reply. If you take a look at the implementation in OpenImageIO you'll see that they call unpack() immediately after calling open_file():

https://github.com/OpenImageIO/oiio/blob/master/src/raw.imageio/rawinput...

From your comments above, it sounds like I could start reading the metadata immediately after open_file() and not call unpack() at all. Lower down, on line 285, there's a comment that indicates where the metadata reading code begins:

https://github.com/OpenImageIO/oiio/blob/master/src/raw.imageio/rawinput...

Is all of that code "safe" to call after only calling open_file()? Will valid values, if present, be correctly read? The last call to read_tiff_metadata() looks like it's just using standard OpenImageIO routines to read a TIFF header, I assume that would be OK as well?

Thanks again!

OpenImageIO::RawInput::open() does everything at once: open, unpack, then process.

It looks like you do not need this if you only need metadata.

LibRaw::open_file() will read all metadata, including image sizes (as well as EXIF, makernotes, everything).
It is really fast (a fraction of a second or less on SSDs).
But:
image sizes may not be completely correct if the pixel aspect ratio is not 1.0.
To correct this, call adjust_sizes_info_only().

The downside is that adjust_sizes_info_only() may prevent unpack()/dcraw_process() from working correctly (because the sizes would be adjusted a second time).
So, if you plan to call unpack() later, you'll need to call open_file() again.

-- Alex Tutubalin @LibRaw LLC

Yeah, the OpenImageIO implementation doesn't offer too much flexibility, but that's understandable. Maybe for RAW files it makes sense for me to just call libraw directly.

This code path is being used while importing a large number of images into a database, so it will only ever need the metadata. Later on, when the image needs to be decoded and presented to the user, an entirely different code path is taken, so the issues with adjust_sizes_info_only() shouldn't really be a problem. I'll make that call after open_file() when loading metadata, but I'll make it after unpack() when preparing the image for presentation.

This is fantastic, thank you.

You do not need adjust_sizes....() if you plan to call (or already call) unpack()/dcraw_process().
These calls (generally, dcraw_process()) will adjust the image sizes correctly.

The adjust_sizes_info_only() call is made specifically for metadata-parsing applications (take a look at the LibRaw/samples/raw-identify.cpp sample, which prints A LOT of metadata when called with the -v switch).

-- Alex Tutubalin @LibRaw LLC

Understood.

Thanks Alex. Looking forward to playing around with this as I'm working on a project that needed support for filtering images by the usual metadata fields. Much appreciated.