x.com engaged Acquia in a services agreement for development and migration work that lasted approximately six months and then transitioned into a support phase under Acquia’s Remote Administration offering. Acquia partner VML was engaged to perform site-building and theming development, and Cyrve (acquired by Acquia midway through the project) was contracted to provide migration services. Acquia’s Adam Boysen served as Project Manager.
Drupal 6 or Drupal 7?
The first major decision on the project was which version of Drupal to use: Drupal 6 (specifically Commons) or the recently released Drupal 7. The only significant reason to consider Drupal 6 was that some contributed modules had not yet been ported to Drupal 7. The long-term advantages of Drupal 7 were judged to outweigh the short-term risk (i.e., the time spent contributing to the porting of modules).
A Skype chat room including all the project participants, particularly x.com folks with understanding of the legacy system, was an essential tool for the project - it was invaluable to be able to quickly get answers to questions about legacy data, discuss design decisions, etc.
Critical to a rapid launch of the migration project was getting access to real data as early as possible - if migration is on the critical path, then data access is on the critical path.
Most of the early work was in establishing the mappings between legacy and Drupal data. The workflow for this process was as follows:
- When some Drupal object to be created via migration was identified, we sketched out an initial migration implementation, making any obvious mappings and entering questions for other fields, with group assignments to either x.com or VML.
- We then exported the mappings as CSV (e.g. drush migrate-mappings --csv --full JiveUser >/tmp/user.csv) and loaded them into a shared Google spreadsheet.
- x.com and VML reviewed the questions in their domains, adding comments in the last column, chatting in IRC or Skype if the questions weren’t clear or the situation was more complex.
- Over a couple of weeks, several project-wide teleconferences reviewed and refined the mappings.
- We incorporated the feedback into the migration implementation/documentation – marking fields as DNM (Do Not Migrate), making field mappings, or perhaps reassigning (i.e., if we initially asked x.com if a field needed to be migrated and they said yes, next it went to VML to identify where we needed to migrate it to).
- We then returned to the second step (exporting the mappings) and refined further.
A couple of rounds like this take care of most fields – anything left over (the more complex cases), or anything that arises once the migration work is in full swing, can be entered as tickets (note that the field mappings in the code can be linked directly to related tickets).
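To make the review loop above concrete, here is a minimal Python sketch of one way to pull the still-unanswered questions out of an exported mappings file, grouped by who needs to answer them. It is an illustration only, not part of Migrate or drush: the column names (destination, source, group, description, comment), the "Done" group label, and the sample rows are all assumptions made for the example, and the real CSV export may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column layout and sample rows; the real export from
# drush migrate-mappings may use different headers.
SAMPLE = """destination,source,group,description,comment
name,username,Done,Direct mapping,
mail,email,Done,Direct mapping,
field_bio,biography,x.com,Does this need to be migrated?,Yes
,legacy_flags,x.com,Is this still used?,
field_company,,VML,Which source field feeds this?,
,old_theme,DNM,Not used in Drupal,
"""

# Groups that need no further review. DNM = Do Not Migrate.
CLOSED_GROUPS = {"Done", "DNM"}


def open_questions(csv_text):
    """Return {group: [rows]} for mappings that still await an answer."""
    pending = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["group"] in CLOSED_GROUPS:
            continue
        if row["comment"].strip():
            # A reviewer has already answered; the row is ready to be
            # folded back into the migration code.
            continue
        pending[row["group"]].append(row)
    return dict(pending)


if __name__ == "__main__":
    for group, rows in open_questions(SAMPLE).items():
        for row in rows:
            field = row["destination"] or row["source"]
            print(f"{group}: {field} - {row['description']}")
```

A report like this could be used to set the agenda for each review teleconference, since it shows at a glance which questions each party still owes an answer on.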
Migrate module enhancements
The x.com content was stored in an Oracle database. At the time of The Economist migration (also Oracle-based), Cyrve was using Migrate V1, which was designed to accept data through Views; the Oracle data was therefore transferred to MySQL, and views were developed from there to feed Migrate. That experience was a major factor motivating the complete refactoring of Migrate for V2, which has a plugin API for both sources and destinations. In subsequent projects with Oracle data, the client chose to present the data through XML feeds, so x.com was our first opportunity to import directly from an Oracle database.
Our first thought was to use the contrib Oracle driver based on the Drupal 7 Database API. This would enable us to use Migrate’s existing MigrateSourceSQL plugin, which works with any database supported by a driver for DBTNG (as it’s commonly known). It turned out, however, that the driver only supported installing Drupal itself into an Oracle database - it did not support using an Oracle database as a secondary connection in a MySQL-based Drupal environment. We looked into making contributions to remove this restriction, but found this to be a non-trivial problem. Before investing much time down this path, we found this statement from Oracle:
The PHP community has let the PDO project languish and Oracle recommends using OCI8 instead whenever possible because of its better feature set, performance and reliability. Only a few minor changes have been made to PDO_OCI in PHP releases since its introduction. The version of PDO_OCI on PECL has not been updated with these fixes and is still at version 1.0.
As a result, we decided to focus our energy on developing a custom source plugin for Oracle. This turned out to be fairly simple: cloning the existing Microsoft SQL Server plugin and substituting the appropriate Oracle API calls got us most of the way there, with some tweaking necessary for Oracle quirks (like special handling for CLOBs, which Oracle uses instead of TEXT fields for large strings).
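The CLOB quirk is worth a short illustration. Oracle client libraries commonly hand back large-object columns as locator objects rather than plain strings, so a source has to read them explicitly before the row can be mapped. The sketch below shows that idea in Python; it is not the plugin's actual code (which is PHP), the function name materialize_row is hypothetical, and io.StringIO merely stands in for a driver's LOB object.

```python
import io


def materialize_row(row):
    """Replace LOB-like values in a fetched row with plain Python values.

    Any value exposing a callable read() is treated as a LOB locator and
    read in full; every other value passes through unchanged.
    """
    clean = {}
    for column, value in row.items():
        reader = getattr(value, "read", None)
        clean[column] = reader() if callable(reader) else value
    return clean


if __name__ == "__main__":
    # io.StringIO stands in for a CLOB locator returned by a database driver.
    row = {"ID": 42, "TITLE": "Hello", "BODY": io.StringIO("A very long body...")}
    print(materialize_row(row))
```

Reading every LOB in full is the simplest approach; for very large values a real implementation might prefer to stream in chunks.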
Jive stores file data (such as images and PDFs) as BLOB fields in the database - to support this, we added the capability of creating an actual file from database-sourced data and attaching it to a file field.
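As a rough illustration of the BLOB-to-file step, the following Python sketch writes database-sourced bytes to disk and returns a small record that a later step could attach to a file field. It is not the Migrate implementation: the function name save_blob and the shape of the returned record are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def save_blob(blob, filename, directory):
    """Write raw bytes to directory/filename and return a file record."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a legacy filename containing
    # directory parts cannot write outside the target directory.
    target = directory / Path(filename).name
    target.write_bytes(blob)
    return {
        "path": str(target),
        "size": len(blob),
        "sha1": hashlib.sha1(blob).hexdigest(),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        record = save_blob(b"%PDF-1.4 sample", "report.pdf", tmp)
        print(record)
```

The checksum is included only as a convenience for spotting duplicate or corrupted files after a migration run.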
Other contributions included:
- --csv and --full flags added to drush migrate-mappings command, to facilitate communal review of mappings.
- http://drupal.org/node/1190694 - Deletion bug in privatemsg
- http://drupal.org/node/625062 - devel debug mail class
- http://drupal.org/node/1254398 - migrate debug mail class
- Added profile2 and votingapi support to migrate_extras
- http://drupal.org/node/1199150 - fid-based file field migration
- http://drupal.org/node/1202926 - user_relationship documentation
- http://drupal.org/node/1202932 - user_relationship, specify type by name
- http://drupal.org/node/724492 - flag support in migrate_extras
- http://drupal.org/node/1205278 - migrate, prevent file deletion on rollback
- http://drupal.org/node/984050 - privatemsg support in migrate_extras
- http://drupal.org/node/1195802 - fix to mssql plugin
- http://drupal.org/node/1227130 - Fix that was suppressing callback messages
- http://drupal.org/node/1192538 - quicktabs errors