# help
b
I'm currently on my second attempt to migrate my service's integration tests from MockServer over to WireMock. On my first attempt I got stuck because tests that hit WireMock would intermittently hang (and not consistently the same test). I had to move on to other tasks so I put it on the back burner, and now I'm making another go at it, starting with a fresh branch off of master. I'm running into the same issue.
From what I can tell, WireMock seems to be randomly taking longer to respond to some requests than others. At first I thought it was related to using a result transformer (for a few of my tests I need to echo an S3 resource path from the WireMock request back in the WireMock response, which then ends up on the HTTP response my test asserts against), but even if I disable those tests and only run the rest I still get random timeouts.
Google hasn't been much help in finding people in a similar boat, but I did find this Slack linked from the official WireMock webpage, so I thought I'd reach out and see if anyone has tips on what I might be missing in my setup. I'm currently using the JUnit 5 extension, but I've hit the same issue when spinning up the WireMock server directly and even when running it in Docker.
t
Are you running the latest version of WireMock (3.9.2)?
b
Yes, I pulled down the latest yesterday when I realized I'd forgotten to check if there were any updated versions between my first and second attempts
t
When I see problems like this I usually take a thread dump while it's hanging. Can you do that and share the result?
b
how would I do that? I run the tests with Gradle from the terminal, but can also run them via IntelliJ
t
If you find the Java process that's running WireMock and run kill -3 on it, that should dump its threads to stdout
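From a terminal that's roughly this (the PID is a placeholder, and jstack assumes the JDK tools are on your PATH):
```
jps -l                      # list running Java processes with their PIDs and main classes
kill -3 <pid>               # SIGQUIT: the JVM prints a thread dump to its own stdout
jstack <pid> > threads.txt  # alternative: capture the dump to a file instead
```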
b
ok so I have the tests running in IntelliJ. The built-in IntelliJ profiler shows three processes: one is the service jar, one is the main IntelliJ process, and the last is an "unknown" one, which I take to be the test process. The PID is listed in the table there, but if I run kill -3 in the terminal on that PID, nothing happens.
I can get a thread dump directly through IntelliJ, but it seems to only let me export one thread's dump at a time
[attached: image.png, a screenshot of the thread list from the dump]
not sure if any one of those thread names means anything to you? I can pull data from each one
t
The ones starting with qtp- and that don't contain acceptor are the Jetty worker threads, so it'll be one of those. Usually it's really obvious in a testing context because the stack will be way larger for the one that's actually doing some work, while the rest will be parked in some fashion and have a short stack.
b
ok. This is the longest one
trying to create a snippet with the full output, one sec
for some reason the create snippet button isn't working, going to create a text file
lmk if that's not the right thread and I can take another look
t
I'm not sure whether it is or not. There are two threads (that one and the test thread) that seem to be (dead?)locked around the same logger, so that could be the problem.
WireMock doesn’t use Logback so this might be some code from your app/test framework that’s causing this.
b
hmm. Odd that the logger would get in the way at all. It doesn't struggle with mockserver. I do have some debug log statements stuffed in my result transformer though to try to figure out exactly where things are freezing up. I need to cut out for the evening but I'll try ripping the logger out of the transformer and see if it goes back to freezing at a different spot. I'll ping you tomorrow when I see more
t
Could just be that timings have changed slightly and revealed a race condition that otherwise didn’t happen
b
vaguely possible but the code I'm looking at right now is all strictly synchronous
t
Even if the code driving it is synchronous, the deadlock around the logger is occurring between different threads so is inherently async.
b
so what I'm seeing, as I gradually try to make sense of this, is that two of the qtp-prefixed threads are blocked on each other
qtp231767016-29 is waiting on the lock owned by qtp231767016-24
meanwhile qtp231767016-24 is blocked on the sifting appender's lock, which is owned by qtp231767016-29
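so basically the classic two-lock shape, something like this (purely illustrative, not the actual logback/appender code):
```java
// Illustrative sketch of the pattern in the dump: each thread grabs one lock,
// then blocks forever waiting for the lock the other thread already holds.
public class DeadlockSketch {
    private static final Object loggerLock = new Object();   // stand-in for the logger's monitor
    private static final Object appenderLock = new Object(); // stand-in for the sifting appender's monitor

    public static void main(String[] args) {
        Thread worker29 = new Thread(() -> {
            synchronized (appenderLock) {
                pause();
                synchronized (loggerLock) { /* never reached */ }
            }
        }, "qtp-29");

        Thread worker24 = new Thread(() -> {
            synchronized (loggerLock) {
                pause();
                synchronized (appenderLock) { /* never reached */ }
            }
        }, "qtp-24");

        worker29.start();
        worker24.start(); // both threads end up BLOCKED on each other, just like in the dump
    }

    private static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```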
t
One thing that might be worth trying is temporarily removing logback as your logger and putting the no-op logger in there instead. That should tell you if this is a red herring at least.
b
Is there a simple way to do that across all my code? What with each class having a call to create its own class-specific logger, I mean
t
If you’re using slf4j as the logging interface it’s usually as simple as changing the dependency in your build file
so you'd delete/comment out the logback dependency and replace it with org.slf4j:slf4j-nop:<version>
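in a Gradle build that would look roughly like this (coordinates are the standard ones, versions just illustrative):
```groovy
dependencies {
    // temporarily comment out the logback binding...
    // testRuntimeOnly 'ch.qos.logback:logback-classic:1.5.6'

    // ...and drop in the no-op binding so every slf4j call goes nowhere
    testRuntimeOnly 'org.slf4j:slf4j-nop:2.0.13'
}
```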
b
ok. I'll give that a try
that's playing havoc with my build. Wouldn't be surprised if there are some hidden hard dependencies on the actual logback jars somewhere in my company's common libs
t
Disabling the loggers and appenders in the logback config might have the same effect
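e.g. something as blunt as this in the test logback config (just a sketch):
```xml
<!-- logback-test.xml with no appenders and the root logger switched off, so nothing is written -->
<configuration>
  <root level="OFF"/>
</configuration>
```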
b
I'll try that
ok so that's interesting. My test logback config is the only one that contains references to the sifting appender that is apparently blocking the threads
I still have no idea why switching mock HTTP server libraries would trigger this, particularly if, as you say, WireMock doesn't use logback directly. But maybe if I change the test log config to use the same async appender as my production configuration, that might make a difference
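i.e. roughly replace the sifting appender with something like this (file name and pattern are just placeholders, not my real config):
```xml
<configuration>
  <!-- plain file appender wrapped in logback's AsyncAppender, mirroring the production setup -->
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>build/test.log</file>
    <encoder>
      <pattern>%d %-5level [%thread] %logger - %msg%n</pattern>
    </encoder>
  </appender>

  <appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <appender-ref ref="FILE"/>
  </appender>

  <root level="INFO">
    <appender-ref ref="ASYNC"/>
  </root>
</configuration>
```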
t
So removing the sifting appender fixes the issue?
b
Seems to. I'm still getting odd test failures but I'm not running into thread deadlock.
t
It’s strange, but I’ve seen this happen enough times in the past. Sometimes just a change to the order of class loading or timings at runtime can trigger latent deadlocks, even if the change you made is apparently unrelated.
WireMock does use slf4j, which means it will be writing to logback if you’ve configured it to
So the messages written via the logger interface will be different vs. when you were using mock server
b
ahh ok. That kinda makes sense. Some weird interaction with this sifting appender then
I'm going to look into why this sifting appender is even being used. See if the guy who added it remembers why. Because I'm kinda scratching my head at his comments.
t
Sounds like a good place to start
b
Anyway thanks for the help. I'll post back again if I run into any more problems
> See if the guy who added it remembers why
Odds are slim as that config has been there for like a decade...
t
Good luck in that case!
b
hah thanks
Looks like this issue is now resolved. I have another one but I'll open a new thread for that since it's not related to this one