Drupal 7 to Drupal 9 in an hour using AM:A

Goal

Learn how to migrate a Drupal 7 site to Drupal 9 using Acquia Migrate: Accelerate, and efficiently investigate migration messages.

Overview

Let me take you on a tour of actually using AM:A.

(And if you've never done a migration from Drupal 7 to Drupal 8/9/10 before … you'll get to see what a whirlwind it is 🌪️😅)

Requirements

  1. Generate a Drupal 9 site

    Follow the steps outlined in the "Want it now?" section of "Acquia Migrate: Accelerate — now open source!" 🧑‍💻

    You should see something like this:

    Now let's actually use it! 🥳

  2. The FAQ says we should run the generated AM:A project using PHP 7.4, primarily because most migrations were written against PHP 7, and PHP 8 changed how non-strict (==) comparisons behave. Migrations use those a lot!

    Unambiguous signs that the PHP version is the problem: error messages starting with Deprecated function: or PHP Warning:.

    I didn't do that here. Let's find out how far I get.
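
    To see the comparison change for yourself, here is a minimal shell sketch that asks the PHP binary on your PATH to loosely compare the number 0 with the string "foo". The function name loose_comparison_demo is made up for this illustration; the difference in behavior is PHP 8's documented change to comparisons between numbers and non-numeric strings.

```shell
# Prints the PHP version, followed by "equal" when 0 == "foo" is true (PHP 7.x)
# or "not equal" when it is false (PHP 8.x).
loose_comparison_demo() {
  php -r 'echo "PHP ", PHP_VERSION, ": ", (0 == "foo") ? "equal" : "not equal", "\n";'
}

# Only run the demo when a php binary is available on the PATH.
if command -v php >/dev/null 2>&1; then
  loose_comparison_demo
fi
```

    Migration code that relies on the PHP 7 result of such a comparison silently takes a different branch on PHP 8, which is one way the same migration can behave differently on the two versions.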

  3. Start migrating: 4 clicks to >50% 😮

    As you may have noticed in the terminal output, this is my personal site. I do not want to think about which data to migrate. I just want to see how far AM:A would get me. I also don't want to bother with switching from PHP 8.1 to PHP 7.4.

    Seems to have worked fine? 🤷‍♂️ Because this is what I see:

    So, I just went ahead and did 2 clicks:

    Not bad, those two clicks triggered migrations that took half a minute and got me to 645/3466 rows imported, or 53.86%:

    The number of errors crept up from 3 to 23 … but that seems like a low enough number to continue … so let's just continue. I'm too curious to see how far we can get, and besides, we can easily reset to the original state. I've spent less than 10 minutes on this so far anyway! 😁

    So: let's repeat those 2 clicks from earlier! The result (after another half a minute):

    The error count skyrocketed from 23 to 484, even though the amount of data imported only went up by less than 300. 🤔

    More frustratingly, AM:A doesn't even allow me to import the content I really care about: "Article" and "Blog post". That's happening because "Dependencies need review".

    Still … let's see what the already migrated content looks like:

    😬 That definitely looks like a PHP problem!

    I can choose to investigate the error messages (255 of which are for "Document media items") and investigate these PHP deprecation errors. But that seems much more time consuming than switching to PHP 7.4.

    I guess I better switch from PHP 8.1 to 7.4 after all.

  4. 🤨 Some of you spotted that 645/3466 is a ratio much lower than 54%. And 37/3466 is definitely not 12% either. So why does the AM:A UI report these progress counts?

    The number of rows that have been migrated is an incomplete assessment. The number of migrations that have already been fully completed is also important to take into account.

    For each migration, typically a few patterns emerge across the failing rows. Once you figure out the pattern, it's relatively simple to get all rows in that migration to work!

    The AM:A UI's progress indicator conveys relative total progress.

    PHP warnings/errors aside, even just seeing "⚠️ Dependencies need review" is sufficient to warrant an investigation.

    AM:A shows this when <70% of the rows in the dependencies of this migration were imported. There's little point in forcefully executing this migration: that would just lead to an avalanche of more migration errors.
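
    The arithmetic behind that question, plus the dependency threshold, as a minimal shell sketch. The function names row_percentage and dependencies_need_review are hypothetical, and AM:A's real implementation is not shown in this tutorial, so treat the 70% check as an illustration of the rule described above rather than as its actual code.

```shell
# Percentage of rows imported, rounded to two decimals.
row_percentage() {
  awk -v imported="$1" -v total="$2" 'BEGIN { printf "%.2f\n", imported / total * 100 }'
}

# Succeeds (exit status 0) when fewer than 70% of the rows in a migration's
# dependencies were imported. Arguments: imported rows, total rows.
dependencies_need_review() {
  [ $(( $1 * 100 )) -lt $(( $2 * 70 )) ]
}

row_percentage 645 3466   # prints 18.61
row_percentage 37 3466    # prints 1.07

if dependencies_need_review 69 100; then
  echo "Dependencies need review"
fi
```

    So the raw row ratios really are far below the 54% and 12% the UI shows, which is why the progress indicator has to be weighing in more than rows alone.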

  5. Switch to PHP 7.4 ⇒ composer install + Drupal reinstall

    My preferred development environment is the one with the fewest layers of abstraction: an AMP stack in macOS directly, as described in this guide. That makes switching PHP versions trivial for me:

    sphp 7.4

    Hopefully it's equally simple for you!

    Packages must be reinstalled now that the environment has changed. First remove your composer.lock, to ensure composer chooses the right packages to install. Then run composer install manually (in step 1 above, acli app:new:from:drupal7 did that for us):

    rm composer.lock
    composer install

    This yields mostly the same packages, but some did change — hopefully these will help us be more successful in the migration:

      - Downgrading symfony/finder (v6.3.3 => v5.4.27): Extracting archive
      - Downgrading symfony/filesystem (v6.3.1 => v5.4.25): Extracting archive
      - Downgrading league/uri (6.8.0 => 6.7.2): Extracting archive
      - Downgrading symfony/string (v6.3.2 => v5.4.26): Extracting archive

    Finally: execute the commands from step 7 of the "Want it now?" instructions again:

    vendor/bin/drush site-install✂️
    jq -r '.installModules✂️
    vendor/bin/drush state:set✂️

    This will automatically wipe the Drupal 9 database too, and will set us up for a fresh new attempt. If you're used to switching PHP versions, this probably took you less than a minute. (If not, well, congrats, you've just figured out something that'll be useful in the future anyway!)

  6. Repeat 4 clicks … same result

    After arriving at the freshly installed D9-on-7.4 site's AM:A dashboard, I see exactly the same numbers in the top right.

    Repeating the first 4 clicks … still exactly the same numbers.

    Despite my site being so simple, I am unfortunately running into a hard blocker 😔 (Quite possibly this is happening because it was originally a Drupal 5 site.)

    The only way forward is to dive right in!

  7. Special care is necessary for Drupal 7 sites whose database is actually an upgraded Drupal 5/6 site.

    In those days, Drupal provided automatic upgrade paths: from 5 to 6 and 6 to 7. But not from 7 to 8 — that's why AM:A even exists. Those upgrade paths worked … kinda 🧟 There are plenty of examples to be found about incomplete/inconsistent data in the upgraded databases.

    That's exactly why since Drupal 8, we made Drupal upgrades easy forever, and we achieved that by requiring that every update has test coverage.

    Unfortunately, for sites originally built on Drupal <7 … some database archeology may be necessary!

  8. Investigate 🕵️

    Where to begin investigating? In the migration that has the highest absolute or relative failure rate. By either metric, that is definitely Document media items for me:


    Which migration plugins exist in this migration, which one seems likely to be the cause?

    I first click the "Document media items" link, and not the "255" link, to go to its Details tab:

    Apparently AM:A has clustered 8 migration plugins into this single migration (the ones with a colon in them are derived migration plugins). The first 5 do not have a (xx of yy) indicator at the end — that's because these provide the supporting configuration (these all refer to field type/formatter/widget-esque things.) 
    The last 3 migration plugins do have those numbers, because they're doing the actual data migration.

    There are 255 errors and 254 imported rows for the d7_file_plain::public migration. That's suspiciously similar. 

    Also … the :: looks out of place — the two other data migrations have a string between the colons.

    Do previews work at all?

    Now that we have a suspect already, let's go back to the Preview tab to see if this migration works at all: does the preview succeed in generating a meaningful representation? If so, then not all hope is lost, and this may turn out to be simple after all!

    After clicking "Preview next row" a bunch of times, it … all seems to work? 🤷‍♂️


    Time to investigate the migration messages!

    Now that we have some basic facts checked (which took <30 seconds), let's click the "Total errors: 255" in the top right. Let's scroll through them to find the scariest message, the most explicit/precise message. This one looks rather scary:

    Let's follow the breadcrumbs we've collected.

    1. Searching the Drupal 9 project for "d7_file_plain" turns up:
      • docroot/core/modules/media_migration/src/Plugin/migrate/source/d7/FilePlain.php (a migration source plugin)
      • modules/media_migration/migrations/d7_file_plain.yml (a migration definition, that uses a source plugin + process pipeline + destination plugin)
    2. Those paths both reveal it's a migration plugin provided by the d.o/project/media_migration module. This must be what AM:A uses to migrate files to Media items.
    3. Searching this project's issue queue for "integrity constraint violation" among both open and closed issues yields 6 results. Promising: looks like we're not alone! 🤞
      But most of those issues are closed, and it's supposedly been fixed a long time ago. I could read through all issues in detail, and hopefully find the answer or at least clues.
      The second clue I have is "d7_file_plain::public"; searching for that yields 0 results. That means I'll need to continue investigating.
    4. My most suspicious migration plugin ID is not d7_file_plain, it is d7_file_plain::public. Where does this one come from then, and the two others?
      d7_file_plain.yml contains a deriver: Drupal\media_migration\Plugin\migrate\D7FileDeriver line. This is a "plugin deriver", which creates multiple variations of the same source plugin.
    5. I open D7FileDeriver and search for "ID", because somewhere in here, it must be generating this plugin ID. That should help me understand the structure of the migration plugin. Sure enough, I find this:
    $derivative_id = implode(PluginBase::DERIVATIVE_SEPARATOR, [
      $mime,
      $scheme,
    ]);
    6. I just learned the derivative ID consists of two parts: MIME (as in MIME type) and scheme (as in public vs private vs …). This checks out for the two other migration plugins: …:application:public and …:text:public. But for the problematic one, it's …::public, in other words: MIME is missing!
    7. Searching D7FilePlain for "query" leads me to MediaMigrationDatabaseTrait::getFilePlainBaseQuery(), which contains $query = $db->select('file_managed', 'fm', $options); (great, now I finally know which Drupal 7 database table to look at!)
    8. Looking at the contents of that table, something immediately jumps out:
      3 rows near the top do not have a value for the filemime column! 😮 This cannot be a coincidence, can it?!

      I verify that no other rows have an empty filemime column.

      Ironically, these are patch files from the Drupal 5 and 6 days. I can also see that file 34 is also a patch, and it has the MIME type text/x-diff. 🤔

      Could it … really … be this simple? 🥹
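
    The chain of reasoning above can be replayed on a tiny sample. This is a shell sketch under stated assumptions: the rows are hypothetical stand-ins for file_managed (only fid 34 with text/x-diff comes from the table described above), and I assume the MIME part of the derivative ID is whatever precedes the slash in filemime. That assumption is consistent with the application and text IDs, but it is not copied from D7FileDeriver.

```shell
# Hypothetical sample rows (fid, filemime, uri), tab-separated, in the shape of
# the Drupal 7 file_managed table. Row 12 has an empty filemime.
sample_rows() {
  printf '%s\t%s\t%s\n' \
    12 ''                 'public://old-fix.patch' \
    34 'text/x-diff'      'public://newer-fix.patch' \
    56 'application/pdf'  'public://slides.pdf'
}

# Build one derivative ID per row: base ID, MIME part and scheme, joined by ":"
# (mirroring the implode() call with DERIVATIVE_SEPARATOR shown above).
derivative_ids() {
  awk -F '\t' '{
    split($2, mime, "/")     # "text/x-diff" gives mime[1] = "text"
    split($3, uri, "://")    # "public://x" gives uri[1] = "public"
    print "d7_file_plain:" mime[1] ":" uri[1]
  }' | sort -u
}

# List the fid of every row whose filemime column is empty.
rows_without_mime() {
  awk -F '\t' '$2 == "" { print $1 }'
}

sample_rows | derivative_ids
sample_rows | rows_without_mime
```

    The empty filemime of row 12 is what produces d7_file_plain::public, next to the well-formed d7_file_plain:application:public and d7_file_plain:text:public.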
       
  9. Testing the hunch

    My hunch is that:

    1. rolling back the 3 "media items" migrations
    2. setting filemime="text/x-diff" on the rows without a value in the file_managed table in the Drupal 7 database
    3. vendor/bin/drush cr (to wipe all caches in the migration system — and forcing the plugin deriver to run again)
    4. re-importing those three migrations

    might be enough to unblock me!
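
    Step 2 of that hunch boils down to a single SQL statement. A minimal sketch: the function name fix_missing_mime_sql is made up, and I'm assuming that "without a value" means an empty string or NULL in the filemime column.

```shell
# Prints the SQL for step 2 of the hunch: give every file_managed row that has
# no MIME type the same value that file 34 (also a patch) already has.
fix_missing_mime_sql() {
  cat <<'SQL'
UPDATE file_managed
SET filemime = 'text/x-diff'
WHERE filemime = '' OR filemime IS NULL;
SQL
}

fix_missing_mime_sql
```

    Pipe the output into whichever client you use for the Drupal 7 database, for example fix_missing_mime_sql | mysql drupal7, where drupal7 is a placeholder database name.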

    So let's roll back these migrations, which is easy in the AM:A UI:

    A second or so after I ran drush cr, the Details tab that I already had open automatically updated and showed this:

    Success! 🥳 From 299 rows to migrate down to 48. It looks like this really was just a hitherto undiscovered edge case! (And that makes sense: the Drupal 7 site's file_managed table only contains 271 rows, which was another sign.)

    210 messages are still there though. That's a good reminder that we don't really know what the indirect damage was of this unhandled edge case. I can't wait to see if this results in a successful migration, but I really should reset my environment again (see what we did to switch to PHP 7.4, but this time there's no need to reinstall packages), to avoid such side effects biting me later.

    Still, a quick sanity check of re-importing those 3 "media items" migrations results in this, 10 seconds later:

    Again/still 904 imported rows, but 251 fewer total rows (these were wrongly calculated by the Media Migration), so now at 65%! Better yet: I can actually import my articles now!

    So, another 2 clicks (the checkbox at the top left of the table + the button to import) and one minute later, everything except comments has been imported! Repeating those 2 clicks to import the comments results in this encouraging screen, 50 seconds later:

    So this statement:

    "Unfortunately, for sites originally built on Drupal <7 … some database archeology may be necessary!"

    really turned out to be very true 😅

  10. When rolling back migrations, immediately after the rollback finishes, you'll see a progress bar at the top of the dashboard that says "Re-importing supporting configuration (N) after a rollback…".

    That means the supporting configuration (such as field types, formatters, widgets, node types, translation settings …) for the migrations that were rolled back is automatically being re-imported.

    That supporting configuration is necessary for previews to work. It's also virtually always very fast: there's typically far less configuration than actual content/data to migrate.

  11. Fixed D7 DB ⇒ Drupal reinstall

    I know for a fact that most of those 431 messages are now obsolete. And it's more work to figure out if the >100 new messages are legitimate or not.

    So: time to restart the migration environment again: reinstall Drupal 9, to have a clean destination slate.

    Execute the commands from step 7 of the "Want it now?" instructions again:

    vendor/bin/drush site-install✂️
    jq -r '.installModules✂️
    vendor/bin/drush state:set✂️

    … just like we did when switching to PHP 7.4.

    I personally like to keep things simple while reducing the probability of mistakes. So I like to chain repetitive commands with &&, which lets me quickly find the whole chain in my terminal's history and re-execute it:

    vendor/bin/drush site-install✂️ && jq -r '.installModules✂️ && vendor/bin/drush state:set✂️
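
    A side effect of && worth knowing: each command only runs if the previous one succeeded, so a failed site install is never followed by the later commands. A tiny shell sketch, where run_step is a made-up stand-in for the real (truncated) commands:

```shell
# Hypothetical stand-in for one of the three commands: it prints its name and
# fails (non-zero exit status) when that name is "fails".
run_step() {
  echo "running: $1"
  [ "$1" != "fails" ]
}

# The third step is never reached, because && stops at the first failure.
run_step first && run_step fails && run_step third || echo "chain stopped early"
```

    This prints "running: first", "running: fails" and "chain stopped early", but never "running: third".
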
  12. drush ama:import --all

    We can now expect the migration to reach ~98% again with just 10 or so clicks (due to dependencies between migrations, we'd have to repeat 8 clicks spread across ~5 minutes).

    What if we could avoid those clicks?

    What if we could ask AM:A to just keep importing whatever the next set of migrations is whose dependencies are met?

    And while that is happening, I can follow along in my browser:

    (On a bigger site, this would take hours instead of minutes. In that case I could investigate the messages of the current migration, live. Or investigate the messages of an already finished migration.)
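
    Conceptually, "keep importing whatever the next set of migrations is whose dependencies are met" is a loop like the one below. This is only a sketch of the idea in shell and awk, not how drush ama:import --all is implemented: the function name import_order, the input format (one migration per line, followed by a comma-separated list of its dependencies) and the four migration names are all made up.

```shell
# Reads "name dep1,dep2" lines and prints the names in an order in which every
# migration comes after all of its dependencies.
import_order() {
  awk '
    { deps[$1] = $2; order[NR] = $1 }
    END {
      remaining = NR
      while (remaining > 0) {
        progressed = 0
        for (i = 1; i <= NR; i++) {
          name = order[i]
          if (done[name]) continue
          # A migration is ready once all of its dependencies are done.
          ready = 1
          n = split(deps[name], d, ",")
          for (j = 1; j <= n; j++) if (!done[d[j]]) ready = 0
          if (ready) { print name; done[name] = 1; remaining--; progressed = 1 }
        }
        # Nothing became ready in this pass: unmet or circular dependencies.
        if (!progressed) break
      }
    }'
}

printf '%s\n' \
  'comments articles,users' \
  'articles files,users' \
  'files users' \
  'users' | import_order
```

    For this sample the order is users, files, articles, comments: the same "import, wait, import the next batch" rhythm as the clicks in the UI, minus the clicking.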

  13. 98%! 🥳

    Yay:

    The last ETA update:

    🏁 < 1 sec (elapsed: 3 mins for 3215 rows)

    3 minutes to re-migrate an entire site from Drupal 7 to 9. Not bad! 🚀

  14. Time check ⏱️

    This is the result, after:

    • 5 minutes to generate a Drupal 9 AM:A project from my Drupal 7 site
    • 2 minutes to get to ~54%
    • 1 minute to switch to PHP 7.4
    • 2 minutes to get to ~54% … again 
    • 40 minutes to investigate a problem
    • 2 minutes to test the hunch
    • 3 minutes to re-run the entire migration

    That leaves me with a few minutes in my 1 hour to do what that screen is telling me! 🤓

  15. Review, step #1: migration messages

    There are 47 migration messages, of which 28 are entity validation errors (which I can fix manually, worst case), and 19 actual migration errors. Let's focus on those first.

    2 are for filters, and they have suggested solutions ("PHP code" filter 🙈). 1 is for a field formatter that no longer exists, and it also has a suggested solution. 15 are for menu links to views that do not exist (Views has no migration path, unfortunately — you'll have to recreate views manually!), 1 is for a vocabulary I do not want to migrate anyway.

    So: tweak the text formats by hand, I don't care about the formatter, I'll have to create the views anyway and I'll just delete the vocabulary ✅

  16. Review, step #2: missing rows

    Looking at the migrations that did not reach 100% of rows imported, for me those are:

    1. User accounts: 1 row in d7_google_analytics_user_settings fails to migrate, but I do not want to continue using Google Analytics anyway.
    2. Terms: 1 row in d7_taxonomy_term:vocabulary_2, but this was a demo vocabulary anyway, now obsolete (interestingly, the term name was null! 😮)
    3. Block placements in my custom theme: 3 rows in d7_block:wimleers_v2:simple — of course Drupal cannot know which regions to map to! I'll fix this manually.
    4. Site configuration: action_settings, d7_system_authorize and d7_action each had a row that did not import. None of these are actually important: I will uninstall the Actions module.

    The cruft you find! 🕸️ Also, delightfully, this means:

    1. skipping migrations of data I no longer need in the initial "Data to migrate" screen
    2. using the module auditor (see next)

    … should be enough to reach 100%! ✅

  17. Review, step #3: Module Auditor

    AM:A helps you do a 1:1 migration. It incorporates best practices (such as switching to Media + Media Library), but does not know which functionality you may no longer need.

    As long as they're pure functionality modules (rather than modules storing/managing/touching data), it's safe to just uninstall them: the number of total rows to migrate will not change. But some modules do affect the migrated data, and those are better uninstalled before going to the "Select data to migrate" screen!

    This is what the Module Auditor screen looks like for me:

    This makes it simple for me to see which contrib modules AM:A automatically installed for me (AM:A will also "composer install" but not "Drupal install" modules without a vetted migration path — uncheck that checkbox at the top to see those too).

    In my case:

    • I don't need the "Insert" module anymore, even though an upgrade path is provided.
    • I do not want the Google Analytics module anymore, because I'd rather not expose readers of my site to Google's tracking empire.
    • And … to complicate matters: I adopted Markdown over a decade ago but now want to switch to CKEditor 5, so I'll have to do some extra work to convert that. 😅 That is definitely not a 1:1 migration anymore, so it falls out of the scope of this tutorial.
  18. Contribute your findings!

    I figured out the root cause for a problem. This means I should do two things:

    1. create an issue against the module providing the migration: #3390454 — I do not have time right now to fix it, but just sharing my findings is low-effort. It may help another person!
    2. create an issue against the AM:A project on Drupal.org to suggest the solution! The AM:A module includes a file called messages-solutions.yaml, which makes it easy to add more suggested solutions. If I suggest a solution for this specific UUID, the before (top) vs after (bottom) becomes clearly visible: 
       
      That change means that we go from an obscure message to something actionable!

    🙏 Even if you don't have time/energy to create issues with root cause analyses, please gather all your findings and submit an issue with a merge request. That alone can be enough to save fellow Drupal users many hours!

    I just did so here: #3390697.

Conclusion

I did not want to release AM:A into the world without a real-world example of it. It is not a silver bullet.

I hope I've shown you that with a minimum of terminal commands, you can restart the migration process very quickly. Even without deep programming expertise, if you have tenacity and a willingness to dig deeper, you can figure out many problems.

When you figure out solutions for those problems, please report them as issues, and feel free to suggest more solutions for migration messages. I will make sure they are merged and released swiftly.

Finally, for sites with so much data that they need hours or even days to migrate: the "Refresh" tab at the top right exists to migrate only new and changed rows.

P.S.: I applied the conclusions from my review above (pushing me slightly past the hour). Only the 3 irrelevant "Site configuration" rows are left:

🚀