# help
m
Hi, in our project we were using version 2.33.2 and as part of a tech refresh we upgraded to 3.3.1. We have a WireMock pod with 4 containers in it. After the upgrade the behavior has changed: while running our performance tests it starts throttling and causing 499 errors. I worked around it by increasing the number of containers from 4 to 10, but that is only a workaround. I tried a couple of parameters:
Previous setup: --async-response-enabled=true --async-response-threads=100 --container-threads=500 --disable-request-logging --no-request-journal
Params added after the upgrade: --async-response-enabled=true --async-response-threads=100
Any suggestions that would help me resolve this throttling issue?
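For reference, a minimal sketch of the standalone invocation with those flags (jar name/version illustrative; in our deployment the same flags are passed as container args):
  java -jar wiremock-standalone-3.3.1.jar \
    --async-response-enabled=true \
    --async-response-threads=100 \
    --container-threads=500 \
    --disable-request-logging \
    --no-request-journal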
t
Did you remove
--container-threads=500
in your latest deployment? If so that might be the problem.
m
No, this is the present setup. Do I need to remove it?
t
No, you should keep it
m
Presently I have all these --async-response-enabled=true --async-response-threads=100 --container-threads=500 --disable-request-logging --no-request-journal
I am trying to understand what is causing throttling and need to resolve that issue.
t
Shouldn’t matter but it looks like you’ve got the async-* params twice
m
sorry, that was a copy-paste mistake when I wrote it here
t
Also probably not an issue, but 500 container threads is probably too high. How many CPU cores are you allocating per container?
m
resources:
  requests:
    cpu: 250m
    memory: 1024Mi
  limits:
    cpu: 1000m
    memory: 1024Mi
this is wiremock setup
But we were already using 500 threads with the older version.
I am running Locust performance tests, sending 20K, 40K, and 80K requests per minute.
t
Hard to know what’s going on without more data.
m
What can I provide as more data?
t
I guess the CPU usage profile has shifted between versions, assuming nothing else has changed on your side
But I’d need some profiler data for before/after
e.g. from JFR
m
When I move back to the older version, everything works fine.
t
Is the WireMock version the only variable?
m
yes, the same setup with the older version works fine, and when I change to the newer version this happens
Hi @Tom I did a binary search on versions to see where the issue starts. beta-10 has no issue when running the Locust performance tests, but in beta-11 the CPU throttling issue starts, so it seems some change in beta-11 is causing this. https://github.com/wiremock/wiremock/releases/tag/3.0.0-beta-11 I looked over the code but nothing I see stands out. You know it better, though, and it might ring a bell for you. I tested many versions and beta-11 is where the issue starts. Would you mind checking beta-11 to see what might cause this? thanks
t
I can't think of anything off the top of my head that would obviously have changed the performance characteristics between those releases, so if you could grab and share some profiler output, ideally before and after the version upgrade, that would help get to the root cause.
m
I will check those details and try to provide you more data, but which JFR options do you suggest we enable that would be helpful for you? With hundreds of thousands of requests, I'm worried the JFR data could be humongous.
t
I don’t think we need tons of data, probably only a few 10s of requests for each version
If each JFR recording shows a significantly different profile when you look at them then they’re probably useful
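Something like this should be enough to capture a short recording (duration/filename are just placeholders, added alongside your usual flags):
  java -XX:StartFlightRecording=duration=120s,settings=profile,filename=wiremock.jfr \
       -jar wiremock-standalone.jar <your usual flags>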
m
Ok, I will work on it and will get back to you Tom. thanks
Hi @Tom Here is the JFR for the two versions: version 2.33.2, which works perfectly, and version 3.3.1, the latest. I tried to include as much info as possible for you to analyze the issue. In the zip file you will find:
1- Dynatrace CPU details
2- Locust perf test results
3- JFR files
4- A text file with some details like start/end times and number of requests
5- The configuration file that we use for WireMock
t
Thank you for taking the time to share this. I’m maxed out trying to get end-of-year stuff sorted at the moment but I’ll try to take a look over the holidays or early next year.
m
@Tom Do you want me to create a GitHub issue and put those details over there?
t
Yes please, that would be really helpful
m
Sure, Thanks
I created the issue in GitHub and here it is: https://github.com/wiremock/wiremock/issues/2541
@Tom Hi Tom, just following up, any near-future plans to look at this issue?
t
Yes, I haven't forgotten. Just spread a bit thin at the moment as our head of community recently left the company.
m
Thanks, Tom
@Tom Hi Tom, quick question. When you guys start working on issues, do we get any notification that the issue is under review or something like that, or do I need to check with you to see if it is in progress? We would like to upgrade our WireMock version but are keeping it in the backlog due to the performance issue. Maybe you can do a Valentine’s Day surprise and move it to in progress 🙂
t
If there's any progress worth communicating then we'll comment on the GH issue so you can subscribe to that. But it's open source I'm afraid so it's best effort prioritisation. If this is urgent for you, finding the problem and creating a fix PR would move things forward more quickly.
@Mehmet Gul I’ve had a quick scan of this and have 2 hypotheses - 1) large numbers of sub-events being added on each request (fixed in 3.4.1), 2) redundant matching happening on stubs e.g. trying to match body despite the method/URL not being a match. One question - are your tests producing any unmatched requests i.e. where you get the diff report in the response?
m
Hi @Tom Thanks for follow up. For #2, we aren’t sending any requests to wiremock that are not matched by URI. We only call /mock/auth/api1 through /mock/auth/api200 which each have a 1000ms delay.
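For illustration, each stub looks roughly like this when registered via the admin API (HTTP method and port assumed here; the real mappings come from our configuration files):
  curl -X POST http://localhost:8080/__admin/mappings \
    -H 'Content-Type: application/json' \
    -d '{
      "request": { "method": "GET", "urlPath": "/mock/auth/api1" },
      "response": { "status": 200, "fixedDelayMilliseconds": 1000 }
    }'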
t
I’ve actually confirmed 2) now and am about to push a fix. Would be really helpful if you could run your tests against my local build. If I sent you a standalone JAR, would you be able to do that?
m
sure, I can try.
t
OK, one sec…running a couple more tests then I’ll send the JAR your way
m
sure, thanks
t
m
Normally we are running wiremock in containers, I will check with our developers and test it. I will let you know once tested.
t
OK, let me know if it’s an issue
m
sure
Hi Tom, I discussed with our developer team and they said we cannot do that. I can try version 3.4.1 and update you on it. Also, the 2nd option is not applicable for us, as we are not sending any requests to WireMock that are not matched by URI.
t
I’ve just released 3.4.2 anyway so the Docker image will be available shortly. I wouldn’t bother trying 3.4.1 as it doesn’t have my fix in it.
m
I can try 3.4.2 directly once the docker image is available.
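(Presumably something like this once the tag shows up on Docker Hub; exact tag name assumed:)
  docker pull wiremock/wiremock:3.4.2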
t
That’s published now
m
Hi @Tom I tested the 3.4.2-1 version. Here are my observations: there is an improvement compared to the previous tests. 20K req/min works fine. 40K req/min has issues, but only after 15 min of the 30 min test. 80K req/min was a total failure. I am attaching the performance test results screenshot and CPU throttling details.
t
I’m not sure what to make of these results. Do you have any JFR traces you can share? Also, what would really help is if you can share the stubs you’re using to test (happy to receive privately via email if you don’t want to send them via Slack).
m
Hi @Tom I DM'd you the details you asked for. I hope that helps.