Migrating Drupal 6 to Drupal 8: Auditing content for migrations

Posted by Gareth Goodwin

In the first part of our D6 to D8 migration blog posts, we talked about our initial approach to the project. In this post, we’ll talk about how we gave the client control over which content needed to be migrated into the new Drupal 8 site. It's worth noting that the technique presented here applies equally to migrating content from Drupal 7 into Drupal 8.

With over 3,700 nodes across three websites built in Drupal 6, our client needed to take the opportunity to spring clean their content before any migration. With a project of this scale, we also needed time to get everything prepared on the Drupal 8 side, whilst having a way to run the content migration as a script just before going live. The Drupal 6 sites kept growing in content throughout the development process, so whatever audit system we gave the client had to work over months rather than days.

In this case, the content clean-up needed to cover three possible editorial options for each piece of content, though the system we came up with could easily support more:

  1. Migrate the content into Drupal 8
  2. Delete the content (do not migrate the content)
  3. Manually recreate the content in Drupal 8 after migration in order to provide the opportunity for rewriting or refactoring

Auditing content with flags

To create a system that would enable the client to get busy with the audit whilst we were busy on the development, we decided to integrate the Flag module into the Drupal 6 site. This provided content editors with the option to 'flag' content with one of three options at the node edit and view level.

Since a great deal of the content could be audited by just seeing some basic information about each node, we extended this approach by adding a Views table display of all content, with filters, that content editors could use to flag content more quickly. Adding a bulk operation action to flag multiple pieces of content at once made this even quicker.

The result was an interface that our client could use with ease and a set of flags that we could use to drive our scripted migrations.
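To illustrate how a set of flags can drive a scripted migration, here is a minimal sketch that maps a flag name to one of the three editorial options. Only the 'auto_migrate' flag name comes from our project code; the other two flag names, the action labels and the function name are hypothetical, chosen for illustration.

```php
<?php

/**
 * Maps an audit flag name to the action a migration script should take.
 *
 * Only 'auto_migrate' appears in the project code. The other flag names
 * and all of the action labels are hypothetical.
 */
function audit_action_for_flag(?string $flag_name): string {
  $actions = [
    'auto_migrate' => 'migrate',
    'do_not_migrate' => 'skip',
    'manual_migrate' => 'recreate_manually',
  ];
  // Content with no flag, or an unknown flag, has not been audited yet.
  if ($flag_name === NULL || !isset($actions[$flag_name])) {
    return 'unaudited';
  }
  return $actions[$flag_name];
}
```

Supporting a further editorial option would then be a matter of adding one flag and one entry to the map.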

Not migrating everything

Drupal 8 core gives you the opportunity to carry out a basic migration that is most easily described as migrating everything, so long as the Migrate module understands what it is. This is great when it suits your project requirements, but we needed something much more controlled, and we needed to perform a few operations on the nodes during the import. A good example is processing the body field to update file paths that need to change or to replace old in-site linking methods.

This meant that for each node migration, we also needed to check how the content was flagged and whether it needed bringing in.
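The body field processing mentioned above could look something like the following. This is a minimal sketch rather than our actual project code: the function name and both path prefixes are hypothetical placeholders, and a real site would need its own rules.

```php
<?php

/**
 * Rewrites legacy file paths in a body field's HTML.
 *
 * Both default prefixes are hypothetical examples, not real project paths.
 */
function rewrite_legacy_file_paths(string $html, string $old_prefix = '/files/', string $new_prefix = '/sites/default/files/'): string {
  // Match src="..." or href="..." attributes whose value starts with the
  // old prefix. Requiring the prefix directly after the quote means paths
  // that have already been rewritten are left alone.
  $pattern = '#\b(src|href)=(["\'])' . preg_quote($old_prefix, '#') . '#i';
  return preg_replace_callback($pattern, function (array $matches) use ($new_prefix) {
    // Keep the attribute name and the original quote character.
    return $matches[1] . '=' . $matches[2] . $new_prefix;
  }, $html);
}
```

Inside a source plugin's prepareRow() method, a helper like this could be applied by reading the body with $row->getSourceProperty('body') and writing the result back with $row->setSourceProperty('body', ...), assuming the body is exposed under that property name.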

Here’s an example of the migration source, which checks if a node has been flagged with the ‘auto_migrate’ flag in the Drupal 6 database. Note: This was used in Drupal 8.0. There were changes in how migrate works between Drupal 8.0 & 8.1, so the below snippet may now be outdated.

<?php
/**
 * @file
 * Contains \Drupal\custom_migration\Plugin\migrate\source\CustomNodeMigration.
 */

namespace Drupal\custom_migration\Plugin\migrate\source;

use Drupal\migrate\Row;
use Drupal\node\Plugin\migrate\source\d6\Node;

/**
 * Drupal 6 node source from database.
 *
 * @MigrateSource(
 *   id = "custom_node_migration"
 * )
 */
class CustomNodeMigration extends Node {

  /**
   * {@inheritdoc}
   */
  public function prepareRow(Row $row) {
    // Check that the content was flagged as needing migration.
    if (!$this->needsMigration($row)) {
      return FALSE;
    }
    return parent::prepareRow($row);
  }
  
  /**
   * Check if this node needs migration.
   */
  public function needsMigration($row) {
    // If the node is not flagged as 'automatic' migration, skip the row.
    if (!$this->checkAutoMigrateFlag($row)) {
      return FALSE;
    }
    return TRUE;
  }

  /**
   * Check if a node has the auto_migrate flag.
   */
  private function checkAutoMigrateFlag($row) {
    // Look up this node's auto_migrate entry in the Flag module's table.
    $query = $this->select('flag_content', 'fc')
      ->fields('fc', array(
        'fcid',
      ));
    $query->condition('fc.content_id', $row->getSourceIdValues()['nid']);
    $query->condition('fc.fid', $this->getAutoMigrateFlagId());
    return $query->execute()->fetchAssoc();
  }
  
  /**
   * Get the ID for the auto_migrate flag, which is needed
   * for database queries.
   */
  private function getAutoMigrateFlagId() {
    static $fid;
    if (!$fid) {
      $query = $this->select('flags', 'f')
        ->fields('f', array(
          'fid',
        ));
      $query->condition('f.name', 'auto_migrate');
      $flag = $query->execute()->fetchAssoc();
      if (!$flag) {
        // The leading backslash is needed because this file is namespaced.
        throw new \Exception('Failed to get the ID for the auto_migrate flag.');
      }
      $fid = $flag['fid'];
    }
    return $fid;
  }

}

This code checks whether the node being migrated has been flagged with the ‘auto_migrate’ flag in the D6 database. If not, prepareRow() returns FALSE, which means that Drupal will skip importing this row.
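To make the lookup itself easier to follow outside of Drupal, here is a standalone sketch of the same check as a single joined SQL query, run against an in-memory SQLite database standing in for the Drupal 6 database. It assumes the pdo_sqlite extension is available. The table and column names are the ones used in the plugin above; the function name, the sample node IDs and the 'manual_migrate' flag are made up for illustration.

```php
<?php

// In-memory SQLite stand-in for the Drupal 6 database (needs pdo_sqlite).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Only the columns used by the source plugin are modelled here.
$db->exec('CREATE TABLE flags (fid INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE flag_content (fcid INTEGER PRIMARY KEY, fid INTEGER, content_id INTEGER)');

// Sample data: the node IDs and the 'manual_migrate' flag are illustrative.
$db->exec("INSERT INTO flags (fid, name) VALUES (1, 'auto_migrate'), (2, 'manual_migrate')");
$db->exec('INSERT INTO flag_content (fcid, fid, content_id) VALUES (1, 1, 10), (2, 2, 11)');

/**
 * Returns TRUE if the given node carries the auto_migrate flag.
 */
function node_is_flagged_for_migration(PDO $db, int $nid): bool {
  $statement = $db->prepare(
    'SELECT fc.fcid
       FROM flag_content fc
       INNER JOIN flags f ON f.fid = fc.fid
      WHERE f.name = :name AND fc.content_id = :nid'
  );
  $statement->execute([':name' => 'auto_migrate', ':nid' => $nid]);
  return $statement->fetch(PDO::FETCH_ASSOC) !== FALSE;
}
```

Here node 10 is flagged for automatic migration, node 11 carries a different flag and any other node has no flag at all, so only node 10 would be imported.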

In summary, this proved to be a simple way to audit content and to use the results of that audit to drive which content is migrated.