3.4->4.x upgrade: "Cleaning up tags", necessary? Takes hours


TSP

Posted

Hi, 

This one task takes hours for me. Of the total time the upgrader needs, it wants to spend 2-3 hours on this step alone:

Quote

Cleaning up tags (Converted so far: 787000 out of 837576)

 

This task seems to work exclusively on a cache table, judging by its name, core_tags_cache, in the upgrade file. Will it break anything to just empty this table? I presume the content of these cache tables is rebuilt if it doesn't exist? Or is the table actually necessary for something (and not really a cache table)?

This is what the task actually does, from applications/core/setup/upg_101024/upgrade.php:

	/**
	 * We never cleaned up core_tags_cache with previous upgrades
	 *
	 * @return	array	If returns TRUE, upgrader will proceed to next step. If it returns any other value, it will set this as the value of the 'extra' GET parameter and rerun this step (useful for loops)
	 */
	public function step1()
	{
		/* Some init */
		$did		= 0;
		$limit		= 0;
		
		if( isset( \IPS\Request::i()->extra ) )
		{
			$limit	= \IPS\Request::i()->extra;
		}

		/* Try to prevent timeouts to the extent possible */
		$cutOff			= \IPS\core\Setup\Upgrade::determineCutoff();
		
		foreach( \IPS\Db::i()->select( '*', 'core_tags_cache', null, 'tag_cache_date ASC', array( $limit, 500 ) ) as $cache )
		{
			if( $cutOff !== null AND time() >= $cutOff )
			{
				return ( $limit + $did );
			}

			$did++;

			/* The data may be serialized, so check that */
			$results = @unserialize( $cache['tag_cache_text'] );
			$update  = null;

			if( is_array( $results ) AND count( $results ) )
			{
				$update = $results;
			}
			else
			{
				/* It may be json_encoded...which is normally fine, but a previous bug may have resulted in the 'tags' array being two levels deep */
				$results = @json_decode( $cache['tag_cache_text'], true );

				if( is_array( $results ) AND count( $results ) )
				{
					if( isset( $results['tags'] ) AND is_array( $results['tags'] ) )
					{
						if( isset( $results['tags'][0] ) AND is_array( $results['tags'][0] ) )
						{
							$update = array( 'tags' => $results['tags'][0], 'prefix' => $results['prefix'] );
						}
						else
						{
							$update = $results;
						}
					}
				}
			}

			if( $update !== null )
			{
				\IPS\Db::i()->update( 'core_tags_cache', array( 'tag_cache_text' => json_encode( $update ) ), array( 'tag_cache_key=?', $cache['tag_cache_key'] ) );
			}
		}

		/* (The snippet was truncated in the original post. Per the docblock, the
		   method returns the next offset to continue looping, or TRUE when done.) */
		if( $did )
		{
			return ( $limit + $did );
		}

		return TRUE;
	}

 
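To make the "two levels deep" bug concrete, here is a standalone illustration (not the upgrader itself; the row values are made up) of the normalization the step applies to a buggy JSON row:

```php
<?php
// A buggy row stores the tags array nested one level too deep.
// (The tag values here are hypothetical examples.)
$broken = json_encode( array( 'tags' => array( array( 'ipb', 'upgrade' ) ), 'prefix' => null ) );

$results = json_decode( $broken, true );

if( isset( $results['tags'][0] ) AND is_array( $results['tags'][0] ) )
{
	/* Un-nest: array( 'tags' => array( array( ... ) ) ) becomes array( 'tags' => array( ... ) ) */
	$results = array( 'tags' => $results['tags'][0], 'prefix' => $results['prefix'] );
}

echo json_encode( $results ), "\n"; // {"tags":["ipb","upgrade"],"prefix":null}
```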

One issue is that the task pages through the table using offsets, so the query for each new batch of rows takes longer and longer as the offset grows:

 Execute |    3 | Creating sort index | /*IPS\core\setup\upg_101024\_Upgrade::step1:520*/ SELECT * FROM `ippbe_core_tags_cache` AS `core_tags_cache` ORDER BY tag_cache_date ASC LIMIT 835000,500 |         0 |             0 |
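A keyset ("seek") approach would avoid this: instead of an OFFSET, each batch continues from the last key seen, so the database never has to scan past the already-processed rows. A minimal sketch, simulated here over an in-memory array rather than the real table, and assuming tag_cache_key is a unique indexed column (the UPDATE in the code above keys on it):

```php
<?php
/* Equivalent SQL per batch:
 *   SELECT * FROM core_tags_cache
 *   WHERE tag_cache_key > ? ORDER BY tag_cache_key ASC LIMIT 500
 */
function fetchBatch( array $rows, $lastKey, $limit )
{
	$batch = array();
	foreach( $rows as $row )
	{
		if( $row['tag_cache_key'] > $lastKey )
		{
			$batch[] = $row;
			if( count( $batch ) === $limit )
			{
				break;
			}
		}
	}
	return $batch;
}

/* Some fake rows, already sorted by key (an index would provide this order) */
$rows = array();
foreach( range( 1, 10 ) as $i )
{
	$rows[] = array( 'tag_cache_key' => sprintf( 'key%03d', $i ) );
}

$lastKey = '';
$seen    = 0;
while( $batch = fetchBatch( $rows, $lastKey, 3 ) )
{
	$seen   += count( $batch );
	$last    = end( $batch );
	$lastKey = $last['tag_cache_key']; /* cursor for the next batch */
}

echo $seen, "\n"; // 10
```

The cursor replaces the `extra` GET parameter that currently carries the offset; each rerun of the step would pass the last processed key instead.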

 

The other issue is possibly the task itself: is it really necessary? See the question at the start of the topic.

 

Posted

Yes, it's necessary, and no, it's not a true "cache" ... it's more like a consolidation table for certain data.

I can look at switching away from offsets to speed up the later batches, however.

Archived

This topic is now archived and is closed to further replies.
