For the files you provided, with OpenMP enabled, the difference is negligible:

0.22

$ time ./bin/unprocessed_raw ~/tt/GFX/*RAF
Processing file /home/lexa/tt/GFX/DSCF0029.RAF
[skip]
Processing file /home/lexa/tt/GFX/DSCF0307.RAF
[skip]
Processing file /home/lexa/tt/GFX/DSCF0614.RAF
[skip]

real    0m4,604s
user    0m32,124s
sys     0m0,811s

0.21

$ time ./bin/unprocessed_raw ~/tt/GFX/*RAF
Processing file /home/lexa/tt/GFX/DSCF0029.RAF
[skip]
Processing file /home/lexa/tt/GFX/DSCF0307.RAF
[skip]
Processing file /home/lexa/tt/GFX/DSCF0614.RAF
[skip]

real    0m4,687s
user    0m32,431s
sys     0m0,758s

Also, the diff in src/decoders/fuji_compressed.cpp is very small (it only changes error handling in the OpenMP case):

diff --git a/src/decoders/fuji_compressed.cpp b/src/decoders/fuji_compressed.cpp
index acea0825..40d92d78 100644
--- a/src/decoders/fuji_compressed.cpp
+++ b/src/decoders/fuji_compressed.cpp
@@ -229,9 +229,9 @@ static inline void fuji_fill_buffer(fuji_compressed_block *info)
 {
   if (info->cur_pos >= info->cur_buf_size)
   {
+    bool needthrow = false;
     info->cur_pos = 0;
     info->cur_buf_offset += info->cur_buf_size;
-    bool needthrow = false;
 #ifdef LIBRAW_USE_OPENMP
 #pragma omp critical
 #endif
@@ -1155,14 +1155,16 @@ void LibRaw::fuji_decode_loop(fuji_compressed_params *common_info, int count, IN
   const int lineStep = (libraw_internal_data.unpacker_data.fuji_total_lines + 0xF) & ~0xF;
 #ifdef LIBRAW_USE_OPENMP
   unsigned errcnt = 0;
-#pragma omp parallel for private(cur_block)
+#pragma omp parallel for private(cur_block) shared(errcnt)
 #endif
   for (cur_block = 0; cur_block < count; cur_block++)
   {
-    try{
+    try
+    {
       fuji_decode_strip(common_info, cur_block, raw_block_offsets[cur_block], block_sizes[cur_block],
                         q_bases ? q_bases + cur_block * lineStep : 0);
-    }  catch (...)
+    }
+    catch (...)
     {
 #ifdef LIBRAW_USE_OPENMP
 #pragma omp atomic

In fact, the only functional change is that the errcnt variable is now explicitly declared OpenMP-shared. It is updated atomically, and only when an error is caught, so the difference should be negligible.
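
To illustrate why this pattern is cheap, here is a minimal standalone sketch of the same idea: a parallel loop that catches errors per block and bumps a shared counter atomically. This is not the LibRaw code. decode_block and count_failed_blocks are hypothetical names invented for the example, the failure condition is simulated, and the sketch tests the standard _OPENMP macro rather than LIBRAW_USE_OPENMP.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a per-block decoder (not a LibRaw function):
// it simulates a decode failure on every third block.
static void decode_block(int block)
{
  if (block % 3 == 0)
    throw std::runtime_error("simulated decode failure");
}

// Decodes `count` blocks, in parallel when built with OpenMP, and returns
// how many of them threw. Exceptions are caught inside the loop body,
// because they must not escape an OpenMP parallel region.
unsigned count_failed_blocks(int count)
{
  unsigned errcnt = 0;
  int cur_block;
#ifdef _OPENMP
#pragma omp parallel for private(cur_block) shared(errcnt)
#endif
  for (cur_block = 0; cur_block < count; cur_block++)
  {
    try
    {
      decode_block(cur_block);
    }
    catch (...)
    {
      // The atomic update is executed only on the error path,
      // so it costs nothing when every block decodes cleanly.
#ifdef _OPENMP
#pragma omp atomic
#endif
      errcnt++;
    }
  }
  return errcnt;
}
```

With this sketch, count_failed_blocks(9) reports 3 failures (blocks 0, 3 and 6) whether or not it is built with -fopenmp. When no block throws, the counter is never touched, so the shared clause and the atomic update add no work to the normal path.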

I can only recommend profiling both versions in detail and comparing exactly where the performance degradation occurs on your side, down to the level of individual statements.

Since I don't see any performance difference on our end, there is nothing for us to investigate here.

-- Alex Tutubalin @LibRaw LLC