Using RecordReplay to investigate intermittent oranges, bug #2

This is a follow up from my previous blog entry. I wanted to see if I could debug faster with the previous lessons learned. I managed to reproduced the #2 intermittent reftest (#1 which was recently fixed).

This bug was nicer because I had a single replay that had a passing and a failling run in the same trace. This means that I could jump around my replay and compare an expected execution vs. a failing a execution simply by running ‘run 343663’ and ‘run 344535’ (from the reftest TEST-START event log).

Step 1. Compare the frame tree

$ (rr) run 343651

$ (rr) goto-snapshot

$ (rr) next-display-list

Image p=0x20187201a1a0 f=0x7f1261051d38(ImageFrame(img)(0)

(rr) print *((nsIFrame*)0x7f1261051d38)->mImage.mRawPtr

$26 = (mozilla::image::OrientedImage) {
….
rotation = mozilla::image::Angle::D90

vs.

$ (rr) run 344535

$ (rr) goto-snapshot

$ (rr) next-display-list
Image p=0x210a130a89a0 f=0x30f01d0acd38(ImageFrame(img)(0) class:shrinkToFit)

(rr) print *((nsIFrame*)0x30f01d0acd38)->mImage.mRawPtr
$24 = (mozilla::image::OrientedImage) {

rotation = mozilla::image::Angle::D0,

step 2. Trace the mismatched value

Alright so this means that when the intermittent failure occurs the rotation field is incorrectly set incorrectly in the ImageFrame. It’s hard to find out why a field isn’t being set so let’s focus on finding out what is setting D90 in the correct case.

$ (rr) watch -l ((nsIFrame*)0x7f1261051d38)->mImage.mRawPtr->mOrientation.rotation

$ (rr) reverse-continue

(…) We see the rotation is provided from StyleVisibility()->mImageOrientation

mImage = nsLayoutUtils::OrientImage(image, StyleVisibility()->mImageOrientation);

Another reverse continue step and we see that it’s being from mozilla::image::RasterImage::SetMetadata. This is called in both the good and bad test with D90 and D0 respectively.

Another step backwards to ‘mozilla::image::Decoder::PostSize(int, int, mozilla::image::Orientation)’. Again the mismatch is seen.

Digging further brought me to the code in ‘mozilla::image::EXIFParser::ParseOrientation’. Both are trying to read at position 24 of the Exif buffer of the same length.

(rr) x/30x mStart
0x1b807e4201e0:    0x66697845    0x4d4d0000    0x00002a00    0x07000800
0x1b807e4201f0:    0x03001201    0x01000000    0x00000600

vs

0x49394230e1e0:    0x66697845    0x4d4d0000    0x00002a00    0x07000800
0x49394230e1f0:    0x03001201    0x01000000    0x00000100

My original theory was a race in getting the data from the decoder and displaying the frame but at this point it looks like something entirely different. We can see here that the Exif data is the same except for the single rotation byte that we want to read.

Data is still wrong in save_marker at tree/media/libjpeg/jdmarker.c:811. reverse-continue from here leads to me:

#0  __memcpy_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:238
#1  0x00007b605468563a in mozilla::image::SourceBuffer::Append (this=0x344b09866e20, aData=0x21b203e9e000 “\377\330\377”, <incomplete sequence \340>, aLength=32768) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:311
#2  0x00007b60546859d3 in mozilla::image::AppendToSourceBuffer (aClosure=0x344b09866e20, aFromRawSegment=0x21b203e9e000 “\377\330\377”, <incomplete sequence \340>, aCount=32768, aWriteCount=0x7ffcab77bba8) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:368
#3  0x00007b6052fe40d1 in nsPipeInputStream::ReadSegments (this=0x21b203e8d980, aWriter=0x7b6054685996 <mozilla::image::AppendToSourceBuffer(nsIInputStream*, void*, char const*, uint32_t, uint32_t, uint32_t*)>, aClosure=0x344b09866e20, aCount=90700, aReadCount=0x7ffcab77bc20) at /home/bgirard/mozilla-central/tree/xpcom/io/nsPipe3.cpp:1283
#4  0x00007b6054685a3f in mozilla::image::SourceBuffer::AppendFromInputStream (this=0x344b09866e20, aInputStream=0x21b203e8d980, aCount=90700) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:385
#5  0x00007b605463d893 in mozilla::image::RasterImage::OnImageDataAvailable (this=0x21b203e9abc0, aInputStream=0x21b203e8d980, aCount=90700) at /home/bgirard/mozilla-central/tree/image/RasterImage.cpp:1072
#6  0x00007b605465fc72 in imgRequest::OnDataAvailable (this=0x57a03547860, aRequest=0x4f7d247b7950, aContext=0x0, aInStr=0x21b203e8d980, aOffset=0, aCount=90700) at /home/bgirard/mozilla-central/tree/image/imgRequest.cpp:1106

and for the base case:

#0  __memcpy_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:238
#1  0x00007b6054684c66 in mozilla::image::SourceBuffer::Compact (this=0xe95558cee70) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:153
#2  0x00007b6054685f7d in mozilla::image::SourceBuffer::OnIteratorRelease (this=0xe95558cee70) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:478
#3  0x00007b6054684414 in mozilla::image::SourceBufferIterator::~SourceBufferIterator (this=0x6c1e7a56a0b8, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/SourceBuffer.cpp:29
#4  0x00007b60546803b4 in mozilla::Maybe<mozilla::image::SourceBufferIterator>::reset (this=0x6c1e7a56a0b0) at /home/bgirard/mozilla-central/tree/obj-x86_64-unknown-linux-gnu/dist/include/mozilla/Maybe.h:373
#5  0x00007b605467e8e4 in mozilla::Maybe<mozilla::image::SourceBufferIterator>::~Maybe (this=0x6c1e7a56a0b0, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/obj-x86_64-unknown-linux-gnu/dist/include/mozilla/Maybe.h:92
#6  0x00007b605466bc74 in mozilla::image::Decoder::~Decoder (this=0x6c1e7a56a000, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/Decoder.cpp:47
#7  0x00007b60546ad304 in mozilla::image::nsJPEGDecoder::~nsJPEGDecoder (this=0x6c1e7a56a000, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/decoders/nsJPEGDecoder.cpp:98
#8  0x00007b60546ad33c in mozilla::image::nsJPEGDecoder::~nsJPEGDecoder (this=0x6c1e7a56a000, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/decoders/nsJPEGDecoder.cpp:115
#9  0x00007b605467b357 in mozilla::image::Decoder::Release (this=0x6c1e7a56a000) at /home/bgirard/mozilla-central/tree/image/Decoder.h:91
#10 0x00007b6054644e55 in RefPtr<mozilla::image::Decoder>::AddRefTraitsReleaseHelper (aPtr=0x6c1e7a56a000) at /home/bgirard/mozilla-central/tree/obj-x86_64-unknown-linux-gnu/dist/include/mozilla/RefPtr.h:362
#11 0x00007b605464467c in RefPtr<mozilla::image::Decoder>::AddRefTraits<mozilla::image::Decoder>::Release (aPtr=0x6c1e7a56a000) at /home/bgirard/mozilla-central/tree/obj-x86_64-unknown-linux-gnu/dist/include/mozilla/RefPtr.h:372
#12 0x00007b605464322f in RefPtr<mozilla::image::Decoder>::~RefPtr (this=0x344b09865598, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/obj-x86_64-unknown-linux-gnu/dist/include/mozilla/RefPtr.h:56
#13 0x00007b60546816e2 in mozilla::image::NotifyDecodeCompleteWorker::~NotifyDecodeCompleteWorker (this=0x344b09865580, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/DecodePool.cpp:82
#14 0x00007b6054681726 in mozilla::image::NotifyDecodeCompleteWorker::~NotifyDecodeCompleteWorker (this=0x344b09865580, __in_chrg=<optimized out>) at /home/bgirard/mozilla-central/tree/image/DecodePool.cpp:82

Assuming there’s no mistake above it means were getting the wrong data from the input stream. However it would be odd that the single byte for orientation would get corrupted. Something is fishy here! I’ll continue this debugging session and blog post next week!

 

Advertisements

One thought on “Using RecordReplay to investigate intermittent oranges, bug #2

  1. Pingback: Using RecordReplay to investigate intermittent oranges, bug #2 part 2 | Benoit Girard's Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s