Creating Plone content with Transmogrifier on Python 3
TL;DR; This blog post ends with minimal example of creating Plone 5.2 content with Python 3 compatible Transmogrifier pipeline with command line execution.
Years ago, I forked the famous Plone content migration tool Transmogrifier into a Plone independent and Python 3 compatible version, but never released the fork to avoid maintenance burden. Unfortunately, I was informed that my old examples of using my transmogrifier fork with Plone no longer worked, so I had to review the situation.
The resolution: I found that I had changed some of the built-in reusable blueprints after the post, I updated the old post, fixed a compatibility issue related to updates in Zope Component Architecture dependencies, and tested the results with the latest Plone 5.2 on Python 3.
Transmogrifying RSS into Plone
So, here goes a minimal example for creating Plone 5.2 content with Python 3 Transmogrifier pipeline using my fork:
At first ./buildout.cfg
for the Plone instance:
[buildout]
extends = http://dist.plone.org/release/5-latest/versions.cfg
parts = instance plonesite
versions = versions
extensions = mr.developer
sources = sources
auto-checkout = *
[sources]
transmogrifier = git https://github.com/collective/transmogrifier
[instance]
recipe = plone.recipe.zope2instance
eggs =
Plone
transmogrifier
user = admin:admin
[plonesite]
recipe = collective.recipe.plonesite
site-id = Plone
instance = instance
Then buildout
must be run to create the instance with a Plone site:
$ buildout
Next the transmogrifier ./pipeline.cfg
must be created to define the pipeline:
[transmogrifier]
pipeline =
from_rss
prepare
create
patch
commit
[from_rss]
blueprint = transmogrifier.from
modules = feedparser
expression = python:modules['feedparser'].parse(options['url']).get('entries', [])
url = http://rss.slashdot.org/Slashdot/slashdot
[prepare]
blueprint = transmogrifier.set
portal_type = string:Document
id = python:None
text = path:item/summary
_container = python:context.get('slashdot') or modules['plone.api'].content.create(container=context, type='Folder', id='slashdot')
[create]
blueprint = transmogrifier.set
modules = plone.api
object = python:modules['plone.api'].content.create(container=item.pop('_container'), type='Document', **item)
[patch]
blueprint = transmogrifier.transform
modules = plone.app.textfield
patch = python:setattr(item['object'], 'text', modules['plone.app.textfield'].value.RichTextValue(item['object'].text, 'text/html', 'text/x-html-safe'))
[commit]
blueprint = transmogrifier.finally
modules = transaction
commit = modules['transaction'].commit()
Finally, the execution of transmogrifier with Plone site as its context (remember that this version of transmogrifier also works outside Plone ecosystem, but for a convenience transmogrify
-script also supports calling with instance run
):
$ bin/instance -OPlone run bin/transmogrify pipeline.cfg --context=zope.component.hooks.getSite
This example should result with the latest Slashdot posts in a Plone site. And, because this example is not perfect, running this again would create duplicates.
Transmogrifying JSON files into Plone
There’s never enough simple tutorials on how to build your own Transmogrifier pipelines from scratch. Especially now, when many old pipeline packages have not been ported to Python 3 yet.
In this example we configure a buildout with local custom Transmogrifier blueprints in python and use them to do minimal import from a JSON export generated using collective.jsonify, which is a one of many legacy ways to generate intermediate export. (That said, it might be good to know, that nowadays trivial migrations could be done with just Plone REST API and a little shell scripting.)
At first, we will define a ./buildout.cfg
that expects a local directory ./local
to contain a Python module ./local/custom
and include ZCML configuration from ./local/custom/configure.zcml
:
[buildout]
extends = http://dist.plone.org/release/5-latest/versions.cfg
parts = instance plonesite
versions = versions
extensions = mr.developer
sources = sources
auto-checkout = *
[sources]
transmogrifier = git https://github.com/collective/transmogrifier
[instance]
recipe = plone.recipe.zope2instance
eggs =
Plone
transmogrifier
plone.restapi
user = admin:admin
extra-paths = local
zcml = custom
[plonesite]
recipe = collective.recipe.plonesite
site-id = Plone
instance = instance
Before running buildout
we ensure a proper local Python module structure with:
$ mkdir -p local/custom
$ touch local/custom/__init__.py
$ echo '<configure xmlns="http://namespaces.zope.org/zope" />' > local/custom/__init__.py
Only then we run buildout as usually:
$ buildout
Now, let’s populate our custom module with a Python module ./local/custom/blueprints.py
defining a couple of custom blueprints:
# -*- coding: utf-8 -*-
from transmogrifier.blueprints import Blueprint
import json
import pathlib
class Glob(Blueprint):
"""Produce JSON items from files matching globbing from option `glob`."""
def __iter__(self):
for item in self.previous:
yield item
for p in pathlib.Path(".").glob(self.options["glob"]):
with open(p, encoding="utf-8") as fp:
yield json.load(fp)
class Folders(Blueprint):
"""Minimal Folder item producer to ensure that items have containers."""
def __iter__(self):
context = self.transmogrifier.context
for item in self.previous:
parts = (item.get('_path') or '').strip('/').split('/')[:-1]
path = ''
for part in parts:
path += '/' + part
try:
context.restrictedTraverse(path)
except KeyError:
yield {
"_path": path,
"_type": "Folder",
"id": part
}
yield item
And complete ZCML configuration at ./local/custom/configure.zcml
with matching blueprint registrations:
<configure
xmlns="http://namespaces.zope.org/zope"
xmlns:transmogrifier="http://namespaces.plone.org/transmogrifier">
<include package="transmogrifier" file="meta.zcml" />
<transmogrifier:blueprint
component=".blueprints.Glob"
name="custom.glob"
/>
<transmogrifier:blueprint
component=".blueprints.Folders"
name="custom.folders"
/>
</configure>
Now, by using these two new blueprints and minimal content creating pipeline parts based on built-in expression blueprints, it is possible to:
- generate new pipeline items from exported JSON files
- inject folder items into pipeline to ensure that containers are created before items (because we cannot quarentee any order from the export)
- create minimal Folder and Document objects with plone.api.
[transmogrifier]
pipeline =
generate_from_json
generate_containers
set_container
create_folder
create_document
commit
[generate_from_json]
blueprint = custom.glob
glob = data/**/*.json
[generate_containers]
blueprint = custom.folders
[set_container]
blueprint = transmogrifier.set
_container = python:context.restrictedTraverse(item["_path"].rsplit("/", 1)[0])
[create_folder]
blueprint = transmogrifier.set
condition = python:item.get("_type") == "Folder"
modules = plone.api
_object = python:modules["plone.api"].content.get(item["_path"]) or modules["plone.api"].content.create(container=item["_container"], type="Folder", id=item["id"])
[create_document]
blueprint = transmogrifier.set
condition = python:item.get("_type") == "Document"
modules =
plone.api
plone.app.textfield
_object = python:modules["plone.api"].content.get(item["_path"]) or modules["plone.api"].content.create(container=item["_container"], type="Document", id=item["id"], title=item["title"], text=modules['plone.app.textfield'].value.RichTextValue(item["text"], 'text/html', 'text/x-html-safe'))
[commit]
blueprint = transmogrifier.finally
modules = transaction
commit = modules['transaction'].commit()
Finally, the pipeline can be run and content imported with:
$ bin/instance -OPlone run bin/transmogrify pipeline.cfg --context=zope.component.hooks.getSite
Obviously, in a real migration, the pipeline parts [create_folder]
and [create_document]
should be implemented in Python to properly populate all metadata fields, handle possible exceptions, etc, but consider that as homework.
If this post raised more questions than gave answers, please, feel free to ask more at: https://github.com/collective/transmogrifier/issues.