I have a bunch of markdown documents with a mix of relative and absolute image destinations. e.g.
This is some text
![optional caption](/sub/folder/image.png)
And more text
![](https://example.com/cool_image.png)
I want to prepend a URL to each of the relative images, e.g. to change the above to
This is some text
![optional caption](https://some-image-host/image-host-subpath/sub/folder/image.png)
And more text
![](https://example.com/cool_image.png)
but preferably without hard-coding /sub/folder/ into the replace script (which is how I currently do it).
Is there a clever way to do this with awk or sed, or is that a bad idea due to markdown having more edge cases than one expects?
I made some progress with https://pypi.org/project/marko/, e.g.
import marko

with open("myfile.md") as f:
    s = f.read()

doc = marko.inline.parser.parse_inline(s)
for i, e in enumerate(doc):
    # only touch image elements whose destination is not already an absolute URL
    if type(e) == marko.inline.Image:
        if not e.dest.startswith("http"):
            doc[i].dest = "https://some-image-host/image-host-subpath/" + doc[i].dest
This finds all the images and prepends the URL to the destination of each relative image, but I'm not quite sure how to render this list of inline elements back into a markdown string. I figured I would post here first, before re-inventing the wheel, in case there is a much simpler way of doing this.
TIA for any help.
This command will do it without altering the original file in-place:
sed 's_\(^!\[.*\](\)_\1https://some-image-host/image-host-subpath_' <input_file
Once you've confirmed it's what you want, you just need to add -i after the sed and before the 's_... and also remove the < before input_file:
sed -i 's_\(^!\[.*\](\)_\1https://some-image-host/image-host-subpath_' input_file
The way the command works is as follows:
I use _ as the pattern delimiter instead of the more common /, because it means I don't have to escape every / in the path name.
^!\[.*\]( matches up to where you want to add the path.
I wrap that in \( and \) to remember it for later.
I replace the match with \1, followed by the path.
A simpler way would have been to simply replace the ]( part of the line with ](your_url_here:
sed 's_](_](https://some-image-host/image-host-subpath_' <input_file
but it's possible that the ]( combination might be found on other lines of your files, so I opted for the stronger test ^!\[.*\]( which only matches lines that begin with ![ and contain ]( somewhere after it.
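One thing to watch for: both sed patterns above match every image line, so they would also prepend the host to destinations that are already absolute, such as the https://example.com/cool_image.png line in the question. As a rough sketch of the same capture-group idea that leaves absolute destinations alone, something like the following Python might work. The names prepend_base, IMAGE_RE and ABSOLUTE_RE are made up for this illustration, the base URL is the one from the question, and the regex only handles simple inline images (no reference-style images, no ] inside captions, no awareness of code blocks), so treat it as a starting point rather than a complete markdown solution.

```python
import re

# Base URL from the question; adjust as needed.
BASE_URL = "https://some-image-host/image-host-subpath"

# Group 1: everything up to and including "](" of an inline image.
# Group 2: the destination, up to the first ")" or whitespace.
IMAGE_RE = re.compile(r'(!\[[^\]]*\]\()([^)\s]+)')

# A destination starting with a URL scheme (http:, https:, data:, ...)
# or with "//" is treated as absolute and left untouched.
ABSOLUTE_RE = re.compile(r'^(?:[a-zA-Z][a-zA-Z0-9+.-]*:|//)')


def prepend_base(markdown, base_url=BASE_URL):
    """Prepend base_url to the destination of each relative inline image."""
    def fix(match):
        prefix, dest = match.group(1), match.group(2)
        if ABSOLUTE_RE.match(dest):
            return match.group(0)  # already absolute: keep as-is
        # join with exactly one "/" whether or not dest starts with one
        return prefix + base_url.rstrip("/") + "/" + dest.lstrip("/")
    return IMAGE_RE.sub(fix, markdown)


if __name__ == "__main__":
    sample = (
        "This is some text\n"
        "![optional caption](/sub/folder/image.png)\n"
        "And more text\n"
        "![](https://example.com/cool_image.png)\n"
    )
    print(prepend_base(sample), end="")
```

Run on the sample from the question, this should rewrite only the /sub/folder/image.png line and leave the example.com image unchanged. Unlike the sed version, it is not anchored to the start of the line, so it also picks up images that appear mid-line.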