Failing on some images

Thanks for the effort on this! I was doing basically the same thing previously, but perhaps not as expansively.

I am having one problem that I do not understand. I am invoking the following methods:

open_buffer(data,length)
unpack()
dcraw_process()
dcraw_make_mem_image(&ret)

I am then taking the image data and with copy_mem_image trying to display it.

This is working with many, but not all raw test files that I have.

I got the prebuilt binary for the simple_dcraw sample, and it works fine on the failing images. I took simple_dcraw and linked it with my copy of libraw... again, this works fine.

In looking at what the function dcraw_ppm_tiff_writer is doing, it is clearly doing a lot more than simply writing out a PPM image and it doesn't appear that there is any function or set of functions exposed by the LibRaw library that does this. Is this what I am missing?

Forums: 

Could you please explain what

Could you please explain what is 'not working':
- error codes (on which call exactly, what code)
- error codes are OK, but resulting image is broken?

Also, dcraw_emu has -mem command-line switch to test memory I/O instead of File I/O. It it working for your case, or not?

-- Alex Tutubalin @LibRaw LLC

More specifics on problems

Well, that is the problem. No error codes. It is difficult to describe the contents of the image, but what I am getting is some fairly random noise.

Here is a link to everything: https://dl.dropboxusercontent.com/u/43288013/Photos.zip
There are three files: 600_0001.CRW (original), 600_0001.CRW.ppm (what it is supposed to show as) and 600_0001.CRW.jpg (what I am seeing).

It seems most of the failures are .CRW (Canon) images, at least so far. Some of the .CRW are working and .NEF is working fine for all of them.

I will see what I can do with the prebuilt binary for dcraw_emu and experiment with the -mem switch. My use of the library has the image loaded into memory already which is why I am using open_buffer. Basically where I am in this is that I took an older dcraw-conversion and ripped it out replacing it with libraw. It was mostly working before (and did work with .CRW files) but it was a dcraw from around 2009. I wanted to pick up more current camera support. Two weeks into the project... here I am.

Paul Crowley

OK, Interesting.

dcraw_emu (prebuilt) says "unknown file: data corrupted at 127149" with the -mem switch. Without it, works fine.

Not sure what this means, but it is different.

Paul Crowley

More likely, it means that

More likely, it means that some of internal tags pointed outside of data buffer.

This is not an error for file case (seek out of file will not read anything, so tag value is ignored), but will raise error in buffer case.

BTW, it means you *have* error code at unpack()

Thank you for file sample, to be investigated in depth.

-- Alex Tutubalin @LibRaw LLC

OK, the -mem switch is instructive

I was trying to understand how an error could be being returned and not being caught. Turns out, I have callbacks turned off (as in LIBRAW_OPIONS_NO_MEMERR_CALLBACK | LIBRAW_OPIONS_NO_DATAERR_CALLBACK) and this error is being reported with the dcraw_emu program via a callback!

So, it would seem that this error is not being reported or returned when the callback is turned off. That is an important note as to the importance of the callbacks which I did not understand originally.

Turns out for the image in question "derror()" is being called from canon_load_raw(). Not exactly clear what is going on with this, but it seems like it is a problem with pixel values. What isn't at all clear is how the image is being processed correctly when this is not asserted.

I am not so sure I am missing something in my use of libraw any longer, other than it should probably pick up data error callbacks and treat the image as unsupported if this occurs.

Paul Crowley

callbacks are optional.

callbacks are optional.

You'll receive error code for call and, also, bits in imgdata.process_warnings for non-fatal errors (data error is not fatal).

BTW, the image is indeed broken, but raw data in place. Still investigating.

-- Alex Tutubalin @LibRaw LLC

Things become stranger:

Things become stranger: recompiled LibRaw 0,17 and the problem gone. Now I cannot reproduce the problem.

Anyway, the file is broken (metadata incomplete).

I still do not know why it not unpacked by 0.17-WIn32-binary from this site and no way to know it because the problem looks un-reproducible.

BTW, derror() count call is returned by LibRaw::error_count() (after unpack() ), so it is better to check it in production code.

-- Alex Tutubalin @LibRaw LLC

That does make things tough

That does make things tough to figure out. I will pull the error count to check for things like this - I can see the counter at 255110 after unpack() with what I have. I think any non-zero value is likely to be fatal.

What is failing is the line:

if ((pixel[(block << 6) + i] = base[i & 1] += diffbuf[i]) >> 10)
derror();

I am not at all sure I understand this line of code. If one separates out things in parentheses, we have an assignment to pixel[block << 6> + I] which makes sense, but the condition being tested is, at best, a little odd. This does not appear in older versions of dcraw, so I have nothing to compare against. But why would anyone form a conditional expression which comes down to (x >> 10)???

I suppose this could be a rather strange way of saying (x > 1023), but why make it so cryptic? I know, that is the dcraw/dave coffin way of things. But I can complain, can't I? It has been a long time since I actually found obfuscated C to be all that funny.

Does this pixel-content test actually make sense in this context?

One thing occurs to me with this - I am thinking that it is a typical debug/release sort of problem. I am working with DEBUG builds right now and building the library as a static library in DEBUG mode. I am wondering if the problem might disappear with a release build - perhaps because of initializing memory with zeros instead of CDCDCDCD, or just because of different stack layout.

I will do some more digging since I now have a much better handle on why it is failing. Maybe it should fail, but there is an image buried in there that can be rendered, so it would be nice to be able to get it out.

Paul Crowley

Not trying to be annoying

Not trying to be annoying here, but ...

OK we get into canon_load_raw with "maximum" set to 4095 (fff), discover that "canon_has_lowbits" returns true so maximum is not changed to 1023. I find this curious since the pixel value evaluation that is going on later is checking against a fixed value of 1023 rather than comparing against "maximum".

So, should the derror check actually be if the pixel value exceeds maximum? I am wondering if in release mode this right-shift operation is actually just being optimized away because there result really isn't being used. One of the things with serious C obfuscation is that today's optimizing compilers tend to interpret fancy side-effects as being irrelevant and just throw them away. This differs compiler-to-compiler, so if you are using GCC (some version?) and I am using VC 2013 we might be seeing vastly different results in optimized code.

Especially when someone is using a right-shift as a value test.

Paul Crowley

This sample is from PowerShot

This sample is from PowerShot 600 camera, so decoder is canon_600_load_raw()

-- Alex Tutubalin @LibRaw LLC

Mystery solved: 64-bit LibRaw

Mystery solved: 64-bit LibRaw is OK, while 32-bit builds gives this error.

To be continued tomorrow morning....

-- Alex Tutubalin @LibRaw LLC

fseek problem in 32-bit build

Yes, I tracked it down to that as well.

There is the odd construct (odd? For dcraw? Whatever!) that is making some interesting assumptions:

fread (make, 64, 1, ifp);
fseek (ifp, strlen(make) - 63, SEEK_CUR);
fread (model, 64, 1, ifp);

With VC 2013 (and probably many others) the return value from strlen() is size_t, which is unsigned. What we want here is to back up the data pointer with a signed offset. The fseek memory buffer implementation wants a signed 64-bit value, but the result of (10u - 63) promoted as an unsigned value to 64 bits is 0x00000000ffffffcb which doesn't work at all.

I suspect there are a few other places where a signed offset for fseek is used as well and anywhere it occurs with an unsigned 32-bit value being promoted to 64 bits is going to have problems.

One way to get this working is:

fread (make, 64, 1, ifp);
fseek (ifp, (INT64)strlen(make) - 63, SEEK_CUR);
fread (model, 64, 1, ifp);

By casting the size_t to an int the promotion of the value from 32 to 64 bits works properly. Unfortunately, this is in the base dcraw code. I suspect there might be a way to fix this in the LibRaw code. I do not believe that the define:

#define strlen(x) (INT64)strlen(x)

is the right way to go, but it would at least get the cast out of the dcraw code. Alex, I'll let you figure out a way to accomplish this with the least amount of work.

Paul Crowley

Yes, this data type

Yes, this data type conversion problem looks like cause:
- model reads incorrectly
- decoder do not set right (canon_load_raw() remains) because it set by make/model name
- and, so, data do not decoded right.

-- Alex Tutubalin @LibRaw LLC

OK, magic is completely

OK, magic is completely resolved:
- on 64-bit compile everything is OK because of 64-bit size_t
- on unix 32 bit compile all is OK because fseek's offset is long, so 32 bit
- on 32-bit LibRaw with file interface, LibRaw_file_datastream is used by default. It is based on 32-bit std::streambuf, so out-of-range offset is converted back to 32-bit '-53' (for the sample).

So, only bigfile_datastream (not used for small files by default) and buffer_datastream (read from memory) are affected.

Fortunately, this is the only place where strlen is used as fseek() parameter.

BTW, best way to fix looks like (in the beginning of dcraw_common.cpp):

int my_strlen(const char *str)
{
	return (int)strnlen(str,0x7fffffff);
}
#define strlen(a) my_strlen((a))

-- Alex Tutubalin @LibRaw LLC

BTW, the sample you sent is

BTW, the sample you sent is very rare PowerShot 600 file. Unfortunately, this is not complete file, but cutted (it is easily visible by exiftool: metadata structure is not full, jpeg preview is also missing)

Could you please send us all files seems problematic? Hope, there is complete file from Canon PS 600 (this is 1996 model) i need to check PS600 route in dcraw/libraw (this is very different from other canon files)

-- Alex Tutubalin @LibRaw LLC

EXIFtool

I find it interesting that the latest EXIFTool incorrectly reports the camera manufacturer as "Canon" and the model as "Inc.". Any reference to the actual camera model (which appears to be properly encoded in the data) is missing. I looked (briefly) at the EXIFTool source and it is far, far beyond what I want to get into.

If you have an EXIFTool which reports the name of the camera model correctly, can you please share? (if nothing else, you can upload it to ftp.infinadyne.com - it takes anonymous uploads in the root.

I could run EXIFTool against all the raw files I have from various sources, but if they do not properly reflect the camera model, it isn't really all that useful.

Paul Crowley

Most likely, this is some

Most likely, this is some 'approximation' if 1st word is Canon, than second is model name.
This camera was produces ~20years ago, current canon models do not use 'Inc' in maker/model

-- Alex Tutubalin @LibRaw LLC