I’m writing AMP versions of our news pages, and one of the problems is that the editors embed a lot of content, which just doesn’t work on AMP pages.
Rather than write a whole new chunk of CMS, I thought I'd see if we could just hot-swap the embeds with regex when rendering the AMP pages, so the editors' workflow doesn't have to change.
This only works for embedded items that have an AMP equivalent, so let's take SoundCloud as an example.
Here’s the normal HTML iframe:

So if we use the following regex:
\<iframe\s+.+?api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]+).+?\<\/iframe\>
Then this captures the track ID (the value straight after /tracks/ in the iframe's src) as group 1. Regex101 is a great resource for testing patterns like this!
So now in code we can simply swap out the iframe with our regex and output the AMP SoundCloud component with the ID in it (the exact syntax depends on your language):
regEx.Replace(StoryBody, "<amp-soundcloud height=""166"" layout=""fixed-height"" data-trackid=""$1"" data-visual=""true""></amp-soundcloud>")
With some more work we could get the height out of the original HTML, but I don’t need to do that.
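For anyone not on .NET, here is a rough Python sketch of the same swap, including one way the height could be pulled out of the original iframe. The function name, the default height fallback and the sample iframe are my own assumptions for illustration, and the pattern is a slightly stricter variant of the one above (it uses [^>]* so a match cannot run past the opening tag).

```python
import re

# Variant of the SoundCloud pattern: dots escaped, and [^>]* keeps the
# match inside the opening <iframe ...> tag.
SOUNDCLOUD_RE = re.compile(
    r'<iframe\s+[^>]*?api\.soundcloud\.com/tracks/([a-zA-Z0-9]+)[^>]*>.*?</iframe>',
    re.IGNORECASE | re.DOTALL,
)
# Picks up a numeric height attribute such as height="300".
HEIGHT_RE = re.compile(r'\bheight="(\d+)"')


def soundcloud_to_amp(story_body, default_height=166):
    """Replace SoundCloud iframes with <amp-soundcloud> components (hypothetical helper)."""

    def swap(match):
        # Reuse the iframe's own height when it has one, else fall back.
        height = HEIGHT_RE.search(match.group(0))
        h = height.group(1) if height else default_height
        return (
            f'<amp-soundcloud height="{h}" layout="fixed-height" '
            f'data-trackid="{match.group(1)}" data-visual="true"></amp-soundcloud>'
        )

    return SOUNDCLOUD_RE.sub(swap, story_body)


# Made-up sample embed; the track ID is not a real one.
sample = (
    '<p>Listen:</p><iframe width="100%" height="300" scrolling="no" frameborder="no" '
    'src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/123456789'
    '&amp;auto_play=false"></iframe>'
)
converted = soundcloud_to_amp(sample)
```

Using a replacement function rather than a plain replacement string is what makes the per-iframe height lookup possible.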
For YouTube we do the same. Here's the regex (note that YouTube video IDs can contain - and _ as well as letters and digits):
\<iframe\s+.+?www\.youtube\.com\/embed\/([a-zA-Z0-9_-]+).+?\<\/iframe\>
And the component:
regEx.Replace(StoryBody, "<amp-youtube data-videoid=""$1"" layout=""responsive"" width=""600"" height=""340""></amp-youtube><br>")
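As a hedged illustration, the YouTube swap might look like this in Python. The function name and the sample iframe are assumptions of mine, and this version of the pattern allows - and _ in the ID, since YouTube video IDs can contain them.

```python
import re

# YouTube embed pattern: dots escaped, ID class includes "-" and "_".
YOUTUBE_RE = re.compile(
    r'<iframe\s+[^>]*?www\.youtube\.com/embed/([a-zA-Z0-9_-]+)[^>]*>.*?</iframe>',
    re.IGNORECASE | re.DOTALL,
)
# \1 is Python's equivalent of $1 in the .NET replacement string.
AMP_YOUTUBE = (
    r'<amp-youtube data-videoid="\1" layout="responsive" '
    r'width="600" height="340"></amp-youtube><br>'
)


def youtube_to_amp(story_body):
    """Replace YouTube iframes with <amp-youtube> components (hypothetical helper)."""
    return YOUTUBE_RE.sub(AMP_YOUTUBE, story_body)


# Made-up sample embed; the video ID is not a real one.
sample = (
    '<iframe width="560" height="315" '
    'src="https://www.youtube.com/embed/a1B-c_D2e3F?rel=0" '
    'frameborder="0" allowfullscreen></iframe>'
)
converted = youtube_to_amp(sample)
```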
And we have nice AMP YouTube video components! Don’t forget to include the header declarations for each type.
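Each AMP component needs its own script tag in the page head. One way to avoid forgetting them is to work out which ones are needed from the converted body. This is only a sketch: the helper name and the idea of scanning the body for component tags are my assumptions, not something the CMS does for you.

```python
# Script URLs for the AMP components used in this post.
AMP_SCRIPTS = {
    "amp-soundcloud": "https://cdn.ampproject.org/v0/amp-soundcloud-0.1.js",
    "amp-youtube": "https://cdn.ampproject.org/v0/amp-youtube-0.1.js",
    "amp-twitter": "https://cdn.ampproject.org/v0/amp-twitter-0.1.js",
}


def amp_header_scripts(amp_body):
    """Return the <script> tags needed for the components found in amp_body."""
    tags = []
    for component, src in AMP_SCRIPTS.items():
        # Only declare a component if it actually appears in the story.
        if f"<{component}" in amp_body:
            tags.append(
                f'<script async custom-element="{component}" src="{src}"></script>'
            )
    return "\n".join(tags)
```

The returned string can then be written into the head of the AMP template alongside the main AMP runtime script.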
And as a bonus, here’s the regex to get the status ID of a Twitter Tweet:
\<blockquote\s+class=\"twitter-tweet\"[\S\s]+\/status\/([0-9]+).+(\n|\r)a*\<script\s+.+\<\/script\>
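To finish the job for tweets, the captured status ID can go into an amp-twitter component. Below is a minimal Python sketch under a few assumptions: the function name, the sample embed and the 375 by 472 dimensions are mine, and the pattern is a reworked variant of the one above that stays inside a single blockquote and takes the last /status/ link in it.

```python
import re

# Stay inside one blockquote, capture the last /status/<id> in it, and
# also swallow the widgets.js <script> that usually follows the embed.
TWEET_RE = re.compile(
    r'<blockquote\s+class="twitter-tweet"[^>]*>'
    r'(?:(?!</blockquote>).)*/status/([0-9]+)'
    r'(?:(?!</blockquote>).)*</blockquote>'
    r'\s*(?:<script\b[^>]*>\s*</script>)?',
    re.IGNORECASE | re.DOTALL,
)
# Width and height here are placeholder values, not taken from the embed.
AMP_TWEET = (
    r'<amp-twitter width="375" height="472" layout="responsive" '
    r'data-tweetid="\1"></amp-twitter>'
)


def tweets_to_amp(story_body):
    """Replace embedded tweets with <amp-twitter> components (hypothetical helper)."""
    return TWEET_RE.sub(AMP_TWEET, story_body)


# Made-up sample embed; the account and status ID are not real.
sample = (
    '<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Hello</p>'
    '&mdash; Example (@example) '
    '<a href="https://twitter.com/example/status/1234567890?ref_src=twsrc%5Etfw">'
    'January 1, 2020</a></blockquote>\n'
    '<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>'
)
converted = tweets_to_amp(sample)
```

Removing the script tag along with the blockquote matters here, because AMP pages do not allow the third-party widgets script.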