Ron_ Posted May 26, 2021

As titled, after updating to v4.5.4.2, Downloads served by IPS are no longer completing successfully. The downloads end prematurely and result in a corrupted archive. I updated from v4.2.x, and Downloads were functioning normally before the update. We delayed updates for unrelated reasons.

IPS Support isn't being helpful or even bothering to look at the situation closely (why tell me anything about Sucuri when it is obvious my site is using Cloudflare? ... I digress), so I am hoping someone in the community has had this problem and can actually read what I am saying and advise beyond generic responses.

The test file in question is ~200 MB and has had over 5,000 downloads over the past 4 years. Members started reporting download problems a couple of days after the update was applied. I am able to reproduce the problem when downloading via my admin account, so no IPS-level permissions are affecting this. We have also confirmed this affects all file downloads, not just this one test file. Attachments work normally.

I opened a ticket and was advised to check for Cloudflare, Sucuri, and mod_security issues.

Cloudflare has been active on the site since its inception, and the 5,000 original downloads were served over Cloudflare. I am not convinced Cloudflare is related to this problem. In any case, nothing at Cloudflare has changed.

I cannot use Sucuri if I am using Cloudflare, so Sucuri is irrelevant. (Checking my domain's DNS would show Cloudflare, but I guess that would take too much time for IPS support to do.)

I do not use mod_security on this particular website. I use it on other prod environments and am aware of its caveats.

No server settings changed before the update. During the update, the updater did fail to create the admin session due to a permission issue on its detected PHP tmp directory. This wasn't a problem on any previous update, so I am not sure why it suddenly was now, but I adjusted it anyway, and the update completed fine.
Everything else is working normally.

The server is running an nginx/php-fpm stack. The server-side configs have not changed since site launch. I use static pools with 20 workers and 1000 max requests per worker. I did try switching to dynamic workers as well as disabling the per-worker max request limit, but neither changed the issue.

If I extract the file on the server side, it extracts normally. If I serve the file outside of the IPS software, on the same nginx server over the same Cloudflare routing, the file extracts fine once the download completes. If I transfer the file to another server, I can also download and extract it from there with no problems. The only time the download results in a corrupted archive is when IPS is serving the file.

Everything points to IPS's handling of the file download. It is the only variable in my tests that I haven't been able to rule out, and support's response was not what I would expect after paying for this software for so long, so I am not at all convinced that IPS is not the problem.

Perhaps there is some issue caused by the version jump from 4.2.x to 4.5.x? Not being timely with updates is the only other abnormal activity on this website; perhaps IPS did not account for something small between these versions?

For what it is worth, I've already tried the self-help tools in the Admin CP, cleared the cache, and followed all other suggestions, including performing a manual update and ensuring all files are in sync with the latest IPS archive in my client area. Nothing affected the issue.

Has anyone else seen such a problem before, or can you think of any other aspects to check? I am happy to run any and all suggested tests. IPS didn't seem interested in any real debugging, so I am out of options unless someone at IPS other than the original responder sees this, or someone else in the community can assist.

Thank you in advance for helping me pull less of my hair out.
Ron_ (Author) Posted May 26, 2021 (edited)

IPS responded again, and they are claiming Cloudflare must be disabled to debug this further. They are refusing to help unless I disable Cloudflare. If you use Cloudflare and you are aware of the site history, you can understand why this would be a concern. In any case, they can simply map the server IP in their local hosts file to bypass Cloudflare entirely. I am not sure why they aren't even offering this as an option and are instead telling me to decrease public site security as a debug measure.

I've followed up with them, but I am not hopeful at this point, as IPS' apparent inexperience and dismissive behavior with debugging sites on Cloudflare has me concerned. Again, I've already verified I can download files from the same webserver over Cloudflare as long as the IPS software is not handling the request. My IP address is in my Cloudflare firewall as a global allow to bypass all rules as an additional debug measure. Nothing on the CF side is affecting the connection: nginx + CF + raw file access is okay, while nginx + CF + IPS file access leads to an incomplete, corrupted file.

I understand IPS wants to see this occur themselves while working around potential problem causes, but there are ways to do this while actually looking at the problem in question, and ways to work around security measures without creating an additional security risk. Our site is a common DDoS target and we rely on Cloudflare to help prevent constant outages. Disabling it publicly is not an option, and I would expect them to at least consider the site history (the fact that downloads were working over CF prior to the IPS upgrade) before outright blaming a third party, rather than completely dismissing me unless I do what they say.

Any help is still sincerely appreciated.

Edited May 26, 2021 by Ron_ (Clarification)
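The raw-versus-IPS comparison described above can be made more specific than "the archive is corrupted". What follows is a minimal sketch, assuming both copies have already been downloaded and read into memory; `compare_downloads`, `zip_is_readable`, and the report keys are names made up for illustration and are not part of IPS, nginx, or Cloudflare. It reports sizes, SHA-256 hashes, and whether the bad copy is simply a truncated prefix of the good one, which would point at a length problem rather than at mangled data.

```python
import hashlib
import io
import zipfile


def zip_is_readable(data: bytes) -> bool:
    """Return True if the bytes open as a zip and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False


def compare_downloads(reference: bytes, suspect: bytes) -> dict:
    """Compare a known-good copy (e.g. fetched via a raw link) with a suspect copy.

    Both arguments are the complete bytes received by the client.
    The keys of the returned dict are illustrative, not a standard format.
    """
    return {
        "reference_size": len(reference),
        "suspect_size": len(suspect),
        "reference_sha256": hashlib.sha256(reference).hexdigest(),
        "suspect_sha256": hashlib.sha256(suspect).hexdigest(),
        "identical": reference == suspect,
        # True when the suspect copy is the start of the reference copy and nothing else.
        "suspect_is_truncated_prefix": (
            len(suspect) < len(reference) and reference.startswith(suspect)
        ),
        "suspect_zip_readable": zip_is_readable(suspect),
    }
```

For files in the hundreds of megabytes it would be kinder to memory to hash and compare in chunks; the in-memory version is only meant to show which facts are worth collecting before opening a ticket.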
Ron_ (Author) Posted May 26, 2021

Last update for today as I am out of extra time.

I have been advised by IPS that using a hosts file to work around Cloudflare and similar services is "against company policy", and that they cannot assist me unless I disable Cloudflare. Considering my site is under intermittent DDoS, the latest one about an hour ago, I simply cannot afford to disable Cloudflare, especially when there is such an easy workaround available to test the problem.

My impression is that I am on my own because of my situation and IPS company policy. It is what it is, I guess.

I will continue to debug the problem as I get time, and once I find the fix I will share it here. Fortunately I work with PHP full time ... I would have gotten more out of my time at this point by reading the core and debugging it myself.

Good luck to anyone else using Cloudflare who ever has to interact with IPS support.
CoffeeCake Posted May 27, 2021

Have you tried clearing the cache for the download itself via Cloudflare's clear-file-from-cache mechanism?
Ron_ (Author) Posted July 8, 2021 (Solution)

@CoffeeCake Once again, Cloudflare is not the problem. Downloads are served just fine over Cloudflare from the same webserver as long as IPS is not serving the download. Additionally, Cloudflare does not cache file downloads like this. They cache small static resources, not large zip archives. The file in question is in excess of 200 MB and another is in excess of 1.2 GB. Either way, I had already confirmed Cloudflare was serving the file just fine outside of IPS, if you read my original messages. This is one of the first things I verified, and it should be the first thing anyone verifies when using Cloudflare or any similar CDN service.

I ended up solving the problem myself, with IPS being completely unhelpful. I received a follow-up to my ticket that I decided to ignore. They effectively told me: go ahead and provide us access to the site, and then we can escalate your issue for a chance that our higher-tier support can investigate it. They couldn't even tell me that they absolutely would work with me on the issue, just that there was a chance they might, and only after I provided all of my site details, because the higher tier "needs to decide how to proceed". This was after I pressed them to support me and their software without requiring me to disable Cloudflare.

Here is the last response I received from IPS regarding this issue:

<Screenshot removed by Matt M: Let's not identify and single out single staff members for company policy, thanks>

Yes, IPS still absolutely refused to help unless I fully disabled Cloudflare. Fat chance, as my site was getting DDoSed multiple times per day during this period. It has thankfully slowed since. These are just the ones left in my trash:

Sorry, but I shouldn't have to beg for proper support. I am not wasting my time exposing my site for a chance of being helped. IPS should support their product without giving customers the run-around, but I digress.
I ignored the response and eventually solved it myself. IPS support is probably fine for basic questions, but good luck with anything truly technical.

Solution

It is isolated to the IPS software, as I expected. The cause of the problem is my own doing, but it is something IPS may want to address in future versions, or at least clearly document, to avoid others wasting their time.

Prior to v4.5.4.2, or possibly a few versions before, IPS did not care about the file size as shown in the download system. IPS would serve the raw file from disk and the download would complete as expected.

Sometime after v4.2.x, IPS started serving download size headers according to the file size stored within IPS. The software no longer reads the size from disk and instead serves the last known size saved in the database, stored when the file is uploaded or updated via the IPS interface. This results in a premature end of download if the size in IPS is smaller than the size of the file on disk. This explained why we were seeing an unexpected end of file without corrupted headers or data, while the browser reported a successful download: it had received the full size as reported by the headers.

This was noticed because I am using dev pipeline automation to automatically update specific downloads we offer through our website. This allows us as developers to deploy changes to a repository, then automatically deploy our updates to the production download link once we deem an update ready, without having to mess with IPS and reupload the file each time. Pipelines handles all of this for us on a merge to the master branch. Of course, over time as updates are added, the file size will change, usually increasing. This caused a file size skew, but we ignored it because we had tested and verified that downloads were working okay regardless of the size in IPS versus the size on disk.
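The failure mode described above can be reproduced without IPS at all. The following is a rough sketch and not IPS code: `client_receives` is a hypothetical helper that models an HTTP client which stops reading once it has the number of bytes announced in `Content-Length`, and `build_archive` just produces a small in-memory zip standing in for the real file. When the announced size is a stale value rather than the current size on disk, the client ends up with a clean prefix of the file and treats the download as complete. A zip archive keeps its central directory at the end of the file, so losing the tail is enough to make it unreadable even though every byte received is correct.

```python
import io
import zipfile


def build_archive(payload: bytes) -> bytes:
    """Build a small in-memory zip standing in for the file on disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("release.bin", payload)
    return buffer.getvalue()


def client_receives(body_on_disk: bytes, announced_length: int) -> bytes:
    """Model a client that reads exactly Content-Length bytes and then stops."""
    return body_on_disk[:announced_length]


def opens_as_zip(data: bytes) -> bool:
    """Return True if the bytes can be opened as a zip archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)):
            return True
    except zipfile.BadZipFile:
        return False


old_release = build_archive(b"v1" * 500)    # size recorded when the file was uploaded
new_release = build_archive(b"v2" * 5000)   # larger file later written to disk by a pipeline

stale_length = len(old_release)             # the size the database still holds
download = client_receives(new_release, stale_length)

# The client received exactly what the header promised, so it reports success...
print("bytes match header:", len(download) == stale_length)
# ...but the end of the archive never arrived.
print("truncated copy opens:", opens_as_zip(download))
print("full copy opens:", opens_as_zip(new_release))
```

The sizes and payloads are arbitrary; only the relationship between them matters (announced length smaller than the body actually on disk).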
Sometime between v4.2.x and v4.5.4.2, IPS internal logic was updated to force downloads to the file size stored in IPS. This triggered our sudden, unexplained post-update download problems.

I only noticed the issue after seeing that the resulting download size matched what is reported in IPS. I manually adjusted the file size stored in IPS, and the exact same file, on the exact same webserver, over the exact same Cloudflare configuration, now served by IPS instead of via a raw link, completed the next download successfully with no other changes, no disabling of Cloudflare, and no clearing of any cache either within IPS or at Cloudflare.

Naturally I have still ditched the download system and am just serving direct links for these files now, as IPS gives no mechanism to easily update the size of a file. I would need to create a small API to update the size of the file in the database so that IPS knows the correct header size to send. I've adjusted our thought process on how we provide downloads, and I no longer need the internal download system.

This will affect anyone using modern DevOps to deploy updated file downloads powered by IPS. Hopefully IPS can either revert this behavior or provide a mechanism for developers in our situation to more easily update the size of a file, such as a webhook we can call or something similar that we can trigger from the command line. In the absolute worst case, at least give a visible warning somewhere that modifying files on disk can cause download problems and should be avoided in general.

This thread can be marked as resolved. I've also followed up on my ticket to provide IPS with the solution, in hopes they will see it and address it in some way in future versions. I definitely do not look forward to my next interaction with them, and I do not have my hopes up, considering we are still waiting for that updated public documentation they promised years ago.
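For pipelines that replace files on disk as described above, one cheap safeguard is to check for size skew at deploy time. This is a minimal sketch under a loud assumption: the pipeline can obtain the size the application has on record by some means, and passes it in as a plain integer. How that number is fetched is deliberately left out, and `SizeCheck` and `check_size` are hypothetical names, not anything provided by IPS.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class SizeCheck:
    path: str
    size_on_disk: int
    recorded_size: int  # assumption: supplied by the caller, e.g. read from the application

    @property
    def skew(self) -> int:
        """Bytes on disk minus bytes on record; zero means the two agree."""
        return self.size_on_disk - self.recorded_size

    @property
    def would_truncate(self) -> bool:
        """True when the file on disk is larger than the size on record."""
        return self.skew > 0


def check_size(path: str, recorded_size: int) -> SizeCheck:
    """Compare the real size of a deployed file with the size on record."""
    return SizeCheck(path, os.path.getsize(path), recorded_size)


# Demo with a throwaway file standing in for a deployed download.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "release.zip")
    with open(demo_path, "wb") as handle:
        handle.write(b"x" * 2048)
    result = check_size(demo_path, recorded_size=1024)

print(result.skew, result.would_truncate)
```

A pipeline step could fail the deploy, or at least log a warning, whenever `skew` is non-zero, so the mismatch is caught before members report broken downloads.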
TL;DR: IPS fixed what was, sensibly, a bug, and that fix broke my non-standard way of working with their download system. The problem is that I could not find this change documented in the update notes, and the IPS team would not help without demanding that I expose my site, nor could they give me a solid answer on whether they could assist me without my doing so. They passed the blame to CDN services without understanding the actual problem.
Ron_ (Author) Posted July 8, 2021

As a final slap in the face, I couldn't even reply to the original ticket. The response box was there, but clicking the submit button did nothing. I am assuming they closed the ticket, but their wack support system is unable to indicate this to me anywhere. I had to create a duplicate ticket to follow up with them.

I love IPS and have been using the software since InvisionFree forums were a thing, but the current Support team and the flow to get support are such a joke for the price we pay for this software.

Rant finally over.
CoffeeCake Posted July 8, 2021

5 hours ago, Ron_ said:
This will affect anyone using modern DevOps to deploy updated file downloads powered by IPS.

Not so sure I'd characterize it so broadly. It sounds like your DevOps process was faulty and did not integrate with IPS via its API. Instead, you were writing changes directly to the filesystem without updating internal metadata.

The particulars about the release notes being less than wonderful are valid and remain an opportunity for improvement. The method you were using to change things within your install is some pretty important information about the particulars of your environment when it comes to troubleshooting the issue, and it's understandable that IPS support would be looking at other areas first.

For anyone else wanting to update downloads powered by IPS, they should work on that integration via the API, documented here: https://invisioncommunity.com/developers/rest-api?endpoint=downloads/files/POSTindex