sibomots Posted June 19, 2022 (edited)

TL;DR: Is this method for supporting legacy BBCode (custom codes and vanilla codes) a viable solution for a 4.x site? (Link at bottom of page also.) If so, I'm looking for hints on where in the ACP of a 4.x site such a custom BBCode handling plugin/extension is installed. It's not clear in the ACP where to do this. I've scanned the ACP and cannot find the Developer Center; do I need to enable it?

----

Details:

We've upgraded a 22-year-old Invision site to the latest bits from Invision. We were fortunate that the schema of the old legacy DB was understood well enough to transfer and convert the bulk of the media content (posts, members, gallery, etc.) into the new schema (v 4.6.12.1).

However, over the last 22 years, the legacy site developed and used a non-trivial amount of BBCode. Not just the vanilla BBCode, but [CustomStuff]. Excessive amounts of it.

I have two options (three really).

Option 1 is: Leave it. If the members who authored the content want to fix their content manually, they can do so. To allow that, I'd just have to enable the equivalent formatting capabilities in the built-in Editor widget. This is the least desired solution. There are far too many instances, and with 50,000+ users this would be an ordeal for the members to deal with. Plus, the time it would take to eventually fold all the legacy content into the new scheme (via the Editor) would turn off a lot of members.

Option 2 is: Roll up sleeves and write a bunch of (automated) SQL to scrub the necessary tables, revising the records in the database themselves: identify the legacy BBCode and, based on the logical HTML/CSS replacement, update the records directly in the database. I've done this for a few already, the simpler ones. But the list of BBCodes that are not vanilla is not trivial. Doable, yes. Best solution, maybe. But time-consuming, definitely. It's on the table.
Option 3 is: Find out whether the latest bits (v 4.6.x or whatever) allow a plugin/extension that extends the new site to detect content with non-vanilla BBCode and, using the heuristics from the legacy DB/site, automatically re-render that legacy BBCode into the intended HTML/CSS representation. I.e., leave the DB records alone and push the fix onto real-time processing through the plugin/extension.

From what I've read (linked article below), it appears that Option 3 might be a viable solution. I'm still trying to navigate through the ACP to find the hooks needed to set up a minimal viable test plugin/extension to test the idea (on the TESTURL site, not the production site).

So here's the question(s).

Given: I understand and accept the direction that Invision is taking with respect to sunsetting the support for legacy custom BBCode in 4.x. I get it. Fine. Pass the beer nuts.

1. But does the article I'm mentioning (linked below) suggest that there is a path for Option 3?

2. Is there a hook in the baseline system such that, if I were to develop the plugin/extension, I could actually make the rendering of the content backward-compatible? I.e., DB records containing legacy BBCode are understood by the plugin/extension during rendering, so the user sees the intended content, at the cost of a bit more real-time processing on the site when those pages/topics/etc. are viewed.

3. Or is this article (below) out of date? Is it really a non-starter, and I'm left with Option 2 (revise the DB directly for the myriad syntax that accumulated over 22 years)?

Option 2 is not ruled out, but it's a pain in the neck. The flaw with Option 2, as far as I can tell, is that the attributes on HTML tags are generated by the 4.x software based on the content.
I could also try to generate the values for those attributes automagically in the script/tools that do the DB scrub, but I don't really want to.

For example: the venerable [img] URL [/img] code. In legacy, it was as simple as that: nothing other than the URL to the image and voilà, it worked.

(Technically, the browser doesn't care as long as the content delivered to the client makes sense. The Invision SW doesn't really care (or shouldn't care) if some of the attributes custom to Invision SW are absent. data-ratio and width are mostly there to assist the client in laying out the geometry of the page on rendering; the server doesn't care whether they're there or not. Frankly, even if converted from legacy BBCode to HTML/CSS, the client (browser) doesn't really benefit from class="ipsImage".)

In the new 4.x system, I don't see an "insert image" button in the Editor, but rather a feature of the Editor to "paste" an image, after which the 4.x site generates this content in the DB record:

<img alt="bughunt2022.png" class="ipsImage ipsImage_thumbnailed" data-fileid="9" data-ratio="100" width="150" src="<fileStore.core_Attachment>/monthly_2022_06/bughunt2022.png.3ea51f158b54ef81b2150bcd21bb76c2.png" />

The attributes data-fileid and data-ratio are unique to the Invision software (at least they add intelligence for what to do). Those attributes seem specific to the media element already stored (FileStore). The class attribute is aligned with the style that should be applied for this kind of media. It could be simply "ipsImage" as a last resort. Finally, the src attribute is just the URI to the image itself.

For the task at hand, all of the images in legacy [img] URL [/img] usages are generic simple URLs to media files that are simply stored in the filesystem.
When dealing with Option 2, the SQL/script iterates through the records and encounters the [img] tag in this format:

[img]http://example.com/basename.png[/img]

which could be converted via SQL/script into:

<img alt="basename.png" class="ipsImage" data-ratio="100" src="http://example.com/basename.png" />

Thus: I write some code (done), then parse and revise the records (done). Rinse, repeat, for all of the legacy BBCodes used in all of the records in all of the forums_posts.post fields, core_members.signature fields, etc.

Option 2 becomes an exercise. I can prove the concept for a few codes, but the list of custom codes is not small. Option 3 frees up more weekends, since the translation via plugin/extension takes care of it for the most part.

Caveat: I have no apprehension, fear, or concern about the complexity of the Option 2 solution. It's just code. SQL is not that big a deal, nor is scripting the execution. I'm just lazy and don't want to roll new code for this if there's a more elegant solution in Option 3.

Edited June 19, 2022 by sibomots
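The [img] conversion described for Option 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the poster's actual script: the function names are hypothetical, the stored URL is assumed to be raw (not already HTML-escaped) text, and data-ratio="100" is simply copied from the example in the post rather than computed from the real image.

```python
import html
import re
from posixpath import basename
from urllib.parse import urlsplit

# Matches [img]URL[/img], case-insensitively, allowing whitespace around the URL.
IMG_RE = re.compile(r"\[img\]\s*(\S+?)\s*\[/img\]", re.IGNORECASE)


def img_to_html(match: re.Match) -> str:
    """Build the replacement <img> element for one [img] match (hypothetical helper)."""
    url = match.group(1)
    # alt text is the file-name part of the URL path, as in the post's example.
    alt = basename(urlsplit(url).path)
    # data-ratio="100" is a placeholder taken from the post, not a measured value.
    return '<img alt="{}" class="ipsImage" data-ratio="100" src="{}" />'.format(
        html.escape(alt, quote=True),
        html.escape(url, quote=True),
    )


def convert_img_tags(text: str) -> str:
    """Replace every legacy [img]...[/img] tag in text with an <img> element."""
    return IMG_RE.sub(img_to_html, text)
```

A script along these lines would read each forums_posts.post or core_members.signature value, pass it through convert_img_tags, and write it back only if the text changed. Trying it against a copy of the database first seems prudent.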
sibomots (Author) Posted June 20, 2022

I think Option 2 is my only option. I need some help, then, understanding the conversion of

[topic=ID_NUMBER] nice topic... [/topic]

If I enter that in the editor (in a new topic, not a record that came from legacy content), the SW engine produces this:

<a href="<___base_url___>/topic/ID_NUMBER-title-text-from-the-topic" rel=""> nice topic... </a>

I can convert this in SQL fine. But I cannot find the table.field that stores the textual "title-text-from-the-topic" suffix of the HREF. Does the SW synthesize that from the mt_title field in core_message_topics? If not, where does the SW derive the text suffix after the topic ID number when it synthesizes the URL to the topic?
sibomots (Author) Posted June 20, 2022

An example: the embedded SQL gathers information about a target TOPIC_ID, and the script gathers the text parsed between [topic=TOPIC_ID] FLAVOR_TEXT [/topic]. The step-wise action is then based on this:

select distinct concat('<a href="<___base_url___>/topic/', T.tid, '-', T.title_seo, '" rel=""> {inserted flavor text} </a>')
from ibf_forums_posts P, ibf_forums_topics T
where P.topic_id = TOPIC_ID and T.tid = P.topic_id;

So the result is this: if the title_seo of the topic is "foo", the embedded SQL replaces

[topic=TOPIC_ID] stuff [/topic]

with this:

<a href="<___base_url___>/topic/TOPIC_ID-foo" rel=""> stuff </a>

Since so many of the bazillion posts use this tag in the legacy records, this will resolve that BBCode tag.
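The [topic] replacement above can also be sketched on the script side. This is a minimal Python illustration, assuming the tid-to-title_seo pairs have already been read from ibf_forums_topics into a dictionary. The function name and the seo_titles mapping are hypothetical, and tags that point at an unknown topic ID are deliberately left untouched.

```python
import re
from typing import Mapping

# Matches [topic=ID]flavor text[/topic]; the flavor text may span several lines.
TOPIC_RE = re.compile(r"\[topic=(\d+)\](.*?)\[/topic\]", re.IGNORECASE | re.DOTALL)


def convert_topic_tags(text: str, seo_titles: Mapping[int, str]) -> str:
    """Replace legacy [topic=ID]...[/topic] tags with 4.x-style anchors.

    seo_titles is a hypothetical mapping of topic ID to title_seo,
    assumed to be loaded beforehand from ibf_forums_topics.
    """

    def repl(match: re.Match) -> str:
        tid = int(match.group(1))
        slug = seo_titles.get(tid)
        if slug is None:
            # Unknown topic ID: keep the original BBCode for manual review.
            return match.group(0)
        # The flavor text is passed through unchanged, whitespace included.
        return '<a href="<___base_url___>/topic/{}-{}" rel="">{}</a>'.format(
            tid, slug, match.group(2)
        )

    return TOPIC_RE.sub(repl, text)
```

For a topic 42 whose title_seo is "foo", calling convert_topic_tags("[topic=42] stuff [/topic]", {42: "foo"}) yields the same anchor shape as the SQL above.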
Stuart Silvester Posted June 20, 2022

You may want to give your test upgrade time to run all of the background tasks (see list in admin CP). They'll address most of this.
sibomots (Author) Posted June 20, 2022

14 minutes ago, Stuart Silvester said:
You may want to give your test upgrade time to run all of the background tasks (see list in admin CP). They'll address most of this.

Interesting. Can you elaborate? Is it a feature of the ACP that I trigger manually, a specific "convert codes now"? Or is the conversion part of the omnibus of tasks rolled into "task.php" that grinds on the site over time?
sibomots (Author) Posted June 20, 2022

2 hours ago, sibomots said:
Interesting. Can you elaborate? Is it a feature of the ACP that I trigger manually, a specific "convert codes now"? Or is the conversion part of the omnibus of tasks rolled into "task.php" that grinds on the site over time?

This is actually a critical distinction. Can you confirm what you meant?

A) The `task.php` that runs in the background will find *existing* codes in records, parse them, and revise them into the corresponding HTML. Like a "Roomba robot that glides around the hallways" for BBCode that already exists in records.

or

B) Only new posts/content created via the built-in editor are processed (converting BBCode into the corresponding HTML).

I think you mean (B), correct?