Marc Posted November 7, 2023: Thank you for reporting. I will pass this on to our server guys.
David N. (Author) Posted November 9, 2023: This JUST happened again! Albeit briefly this time. Again, both this site and my site at the same time.
Jim M Posted November 9, 2023: Sorry to hear you hit that again. I have reported this to our server guys to track down what is happening.
Marc Posted November 14, 2023: 1 hour ago, David N. said: "And again just a few minutes ago." Thank you for reporting.
David N. (Author) Posted November 15, 2023: There were more issues this morning, when I could not log in to my site for several minutes. Just now I had some slow responses: submitting an answer took over 15 seconds. I checked with downforeveryone and my site was down.
Jim M Posted November 15, 2023: Thank you for reporting. I also just responded to your ticket 😉
David N. (Author) Posted December 23, 2023: Both my site and invisioncommunity.com were down again just now:
Jim M Posted December 23, 2023: 2 hours ago, David N. said: "Both my site and invisioncommunity.com were down again just now:" Sorry for the inconvenience. There was a short issue, which our Cloud team resolved immediately. Do not worry, our Cloud team is always around on holidays to ensure our and our clients' communities are up.
David N. (Author) Posted December 26, 2023 (edited): More downtime again this morning, for my site and this site, and for longer periods. For more than 40 minutes, from 9:42 am to 10:23 am CEST, my site was mostly down, with brief periods of excruciatingly slow uptime throwing all kinds of errors. The issue is still ongoing at this time.
David N. (Author) Posted December 26, 2023: The issue has been on and off all morning. There was another half hour of downtime from around 11:36 am to 12:05 pm CEST.
Matt (Management) Posted December 26, 2023: We are experiencing a significant DDoS attack (4.5 million requests) which we are taking steps to mitigate. We're all here and working on it, and you should see an improvement already. Any significant changes will be added here: https://status.invisioncommunity.com
David N. (Author) Posted December 26, 2023 (edited): 1 hour ago, Matt said: "We are experiencing a significant DDoS attack (4.5 million requests) which we are taking steps to mitigate." Thanks Matt. Doesn't AWS protect against DDoS attacks? https://aws.amazon.com/shield/ 1 hour ago, Matt said: "Any significant changes will be added here: https://status.invisioncommunity.com" I don't understand why, as with the downtime on December 23rd, the "History & Incidents" section tracks the downtime for invisioncommunity.com but not the downtime for the U.S. Cloud service (which I'm using), so it reports 100% uptime when the real figure is far from it.
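To put a rough number on that gap, here is a minimal sketch that turns the two outage windows reported above for December 26 (9:42 to 10:23, and 11:36 to 12:05, reading the second as ending just after noon) into an uptime percentage. The function name and the choice of reporting periods are assumptions for illustration, not how status.invisioncommunity.com computes its figures.

```python
from datetime import datetime, timedelta


def uptime_percent(outages, period):
    """Return uptime as a percentage of `period`, given (start, end) outage windows."""
    down = sum((end - start for start, end in outages), timedelta())
    return 100.0 * (1 - down / period)


# Outage windows as reported in the thread for December 26, 2023 (local time).
outages = [
    (datetime(2023, 12, 26, 9, 42), datetime(2023, 12, 26, 10, 23)),
    (datetime(2023, 12, 26, 11, 36), datetime(2023, 12, 26, 12, 5)),
]

day_uptime = uptime_percent(outages, timedelta(days=1))
month_uptime = uptime_percent(outages, timedelta(days=31))
print(f"day: {day_uptime:.2f}%  month: {month_uptime:.3f}%")
```

The two windows add up to 70 minutes, which is roughly 95.1% uptime for that day and about 99.84% over a 31-day month. Neither rounds to 100%.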
Randy Calvert Posted December 26, 2023: Remember there are multiple servers and components involved. You're thinking in terms of single servers, where it's binary: up or down, with nothing in between. In enterprise architectures, an outage might affect only a percentage of users, anywhere between 0 and 100. Only users in one geographic location might have problems, or only users who happened to connect to one portion of the network might be impacted. Most monitoring services (if they're worth anything) check from multiple locations and report a failure when enough reporting stations agree there is a problem. So it might not catch isolated or regional issues that do not have widespread impact.
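The "enough reporting stations agree" rule can be sketched as a simple quorum check. This is only an illustration of the idea: the function name, the location labels, and the 50% threshold are assumptions, not the behaviour of any particular monitoring service.

```python
def confirmed_outage(probe_ok, quorum=0.5):
    """Report an outage only when more than `quorum` of the probes failed.

    probe_ok maps a probe location to True (check succeeded) or False (check failed).
    """
    if not probe_ok:
        return False  # no data, so nothing to confirm
    failures = sum(1 for ok in probe_ok.values() if not ok)
    return failures / len(probe_ok) > quorum


# One failing region out of three stays below the threshold...
regional = {"us-east": False, "eu-west": True, "ap-south": True}
# ...while two out of three crosses it.
widespread = {"us-east": False, "eu-west": False, "ap-south": True}

print(confirmed_outage(regional), confirmed_outage(widespread))
```

With a rule like this, a problem limited to one region or one part of the network never reaches the status page, which is the effect described above.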
David N. (Author) Posted December 26, 2023: 15 minutes ago, Randy Calvert said: "So it might not catch isolated or regional issues that do not have widespread impact." This was not an isolated or regional issue. Here's what uptrends.com measured during the downtime:
InfinityRazz Posted December 27, 2023: So has this problem since been fixed? While I noticed no "slowing" of our cloud community, users have suddenly started getting random 403 "Request could not be satisfied" errors while using our site's REST API.
Jim M Posted December 27, 2023: 2 hours ago, InfinityRazz said: "So has this problem since been fixed? While I noticed no "slowing" of our cloud community, users have suddenly started getting random 403 "Request could not be satisfied" errors while using our site's REST API." A 403 is different from an outage error; it typically means you do not have permission. You will want to check how you're using the API and ensure that you're not sending requests excessively, that you always include a user agent, and that you follow other typical best practices.
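As a rough illustration of those two points (pace the requests, always send a user agent), here is a minimal client-side sketch using only the Python standard library. The `Throttle` class, the `ExampleLicenseClient/1.0` string, the example.com URL, and the half-second interval are all hypothetical; the thread does not state what request rate the platform actually allows.

```python
import time
import urllib.request

USER_AGENT = "ExampleLicenseClient/1.0"  # hypothetical identifier for your own client


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_request(url):
    """Build a request that always carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


throttle = Throttle(min_interval=0.5)  # assumed pacing, not a documented limit
throttle.wait()
request = build_request("https://example.com/api/core/hello")  # placeholder host
```

Calling `throttle.wait()` before each `urllib.request.urlopen(request)` spaces out bursts such as a login sequence instead of sending every call back to back.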
InfinityRazz Posted December 27, 2023: 8 hours ago, Jim M said: "A 403 is different from an outage error; it typically means you do not have permission. You will want to check how you're using the API and ensure that you're not sending requests excessively, that you always include a user agent, and that you follow other typical best practices." I'll admit one use case was my fault: I was trying to use 'ExecuteAsync' instead of 'PostAsync' to generate an OAuth token (whoops). However (and we have tested with multiple users), ever since December 25/26, once a user connects to our site on their 3rd to 5th account, ALL of their connected tokens start throwing 403 exceptions and wind up crashing their active sessions. Upon login we should only be calling oAuth/token for the token -> /core/me for the user ID/email -> nexus/purchases for the user's purchases, then querying every active license for its custom fields (1-10 licenses) plus its corresponding records DB entry, as well as a periodic heartbeat to /core/hello to verify token validity/site connection. That doesn't seem like an "excessive" amount of REST API calls to me, keeping in mind that we haven't touched or changed the client side of the API calls in over a year.
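For a sense of scale, the login sequence described above can be counted with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes one request for each license's custom fields and one for its records entry; the actual client may batch or split these differently, and the heartbeat is left out.

```python
def login_burst(licenses, calls_per_license=2, fixed_calls=3):
    """Estimate requests sent at login.

    fixed_calls covers oAuth/token, /core/me and nexus/purchases.
    calls_per_license assumes one custom-fields query plus one records lookup.
    """
    return fixed_calls + licenses * calls_per_license


smallest = login_burst(1)    # a user with a single active license
largest = login_burst(10)    # a user with ten active licenses
five_accounts = 5 * largest  # the same person logging in on five accounts
print(smallest, largest, five_accounts)
```

Under these assumptions a single login sends between 5 and 23 requests, and five accounts from one address could send up to 115 in quick succession. Whether that crosses any limit is not stated anywhere in the thread.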
David N. (Author) Posted December 28, 2023: It's happened again just now. Again, both on my site and this site.
David N. (Author) Posted December 31, 2023: More server errors again today (just a few minutes ago). Again, geographically widespread, and both on my site and this site. The downtime is tracked by uptrends.com and downforeveryoneorjustme.com but not by status.invisioncommunity.com.
InfinityRazz Posted December 31, 2023: Also getting this issue: tons of 4XX errors on my REST API calls in the last hour or so. I couldn't connect to here via PC but had no issue via mobile, however 🤷‍♂️ Some of my users have been having issues since December 24/25 at this point, and at least one of the sites mentioned above by @David has shown multiple servers offline the whole week, while the Invision status tracker shows no problem.
Stuart Silvester Posted December 31, 2023: We're still actively mitigating the DDoS attack on some sections of our network. This isn't a network-wide issue and does not affect all customers. 4xx errors are likely to be WAF related, such as making too many requests in a short time.
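If a client is tripping a rate-based WAF rule, one common client-side response is to retry with exponential backoff and jitter rather than repeating the call immediately. The sketch below is generic and not specific to Invision's WAF: the set of retryable status codes, the delay values, and the `send` callable are assumptions. A 403 is deliberately left out of the retry set, since repeating a blocked request quickly tends to make rate limiting worse.

```python
import random
import time

RETRYABLE = {429, 502, 503, 504}  # assumed set; adjust to what the server actually returns


def backoff_delays(max_retries, base=1.0, cap=60.0, rng=random.random):
    """Yield exponentially growing delays (seconds), scaled by random jitter."""
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling


def call_with_retry(send, max_retries=4, sleep=time.sleep, rng=random.random):
    """Call send() (which returns an HTTP status code) and retry on retryable codes."""
    status = send()
    for delay in backoff_delays(max_retries, rng=rng):
        if status not in RETRYABLE:
            break
        sleep(delay)
        status = send()
    return status
```

Here `send` stands in for whatever function performs the real HTTP call and returns its status code. Injecting `sleep` and `rng` keeps the helper easy to test without real waiting.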
DawPi Posted December 31, 2023: I have a 504 error here today, Stuart. Is that any different?
InfinityRazz Posted December 31, 2023: 2 hours ago, Stuart Silvester said: "We're still actively mitigating the DDoS attack on some sections of our network. This isn't a network-wide issue and does not affect all customers. 4xx errors are likely to be WAF related, such as making too many requests in a short time." So again, we haven't changed how we handle REST API requests in a long while. The library performing said requests hasn't been recompiled in nearly a year, as it had been stable until 7 days ago. This seems to be the most common error our users get (again, we have not changed the type or frequency of calls). Sure, we've had a small influx of new users this month, but they are just replacing users who have already left our community. I would like some clarification on what counts as "too many requests" and "a short time", please. We hardly ever have 65 unique users online on the website or making requests via OAuth at any given moment, nor do I see any immediate indication of our site hitting our user cap in the admin panel. [Screenshots: admin panel "Active user" graph; bandwidth usage] I don't see anything in the system or error logs to indicate that my users are having connection issues, nor are REST API requests made with an OAuth token actually logged (they're supposed to be when that option is checked, no?), so I don't get to see what the users are actually doing or experiencing. Could it be that they're hitting a node that's under attack and therefore can't connect to the site properly? (They do report slow connection speeds via browser as well.) The majority of users are fine, but it's the ones affected who complain the loudest 🤷‍♀️
David N. (Author) Posted January 16: Getting more 504 errors or an excruciatingly slow experience right now. Both my site and this site.