-
Notifications
You must be signed in to change notification settings - Fork 38
Content pipeline #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Content pipeline #132
Conversation
Making Perun more file-based than metadata-based makes composing arbitrary rendering pipelines easier, by leveraging Boot's primary abstraction, the fileset. This opens up more possibilities of using third party tasks that don't know anything about Perun, and makes Perun's own tasks more useful. Summary of changes: - Tasks no longer store content in the :content metadata key, but instead use the actual file's content - `markdown` now sets `:parsed` instead of `:content` - In addition to setting `:parsed`, `markdown` also writes an html file that contains the parsed content, for processing downstream. - Rather than setting `:content` on the file that `markdown` writes, a new flag, `:has-content` is set to indicate to downstream tasks that this file is available for further processing - Default filterers are set to `:has-content` - Most of `markdown`'s functionality is wrapped by `content-pre-wrap`, which can be reused by future input parsing tasks - `collection` can now be used as a kind of input-parsing task that produces a page fragment in the same manner that `markdown` does. This fragment can then be picked up by subsequent `render` tasks and used in the same way as the output from `markdown`. The original usage of `collection` to produce a complete page is also still supported, with no changes. - `collection` no longer sets `:date-build` or `:canonical-url`, because you can now use the `build-date` or `canonical-url` tasks to set these keys, just as you would with any other content. - You can now chain multiple `render`s together in the same task pipeline. This allows you to split your rendering needs into as many logical parts as necessary.
…s`, and make it optional
Rather than only using the original parsed input, if that isn't available, fall back to input content, parsed current file, or current file content, in that order.
Adding this allows users to more precisely control what ends up in their feeds, and also avoids some tedious work in `content-pre-wrap`
|
I should also add:
|
This change prevents perun's `:path` from getting out of sync with where the file actually is. For instance, if a 3rd party task moves a file without perun's knowledge, this change allows perun to still get the correct path.
|
Code looks good to me. Will need to test it locally first. Thanks for the detailed description |
|
Correction: #105 still isn't really possible with this PR. However, I did just complete a branch for a future PR that does fulfill that goal. All in good time :). See https://github.com/bhagany/perun/wiki for details if you're curious. |
src/io/perun.clj
Outdated
| input-meta (meta-by-ext input-fs file-exts) | ||
| output-meta (doall | ||
| (for [{:keys [path parsed filename] :as entry*} input-meta] | ||
| (let [page-filepath (string/replace path #"(?i).[a-z]+$" ".html") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extensions can contain nearly any character including '-' and '_'. Perhaps this should be constructed by the file-exts parameter?
src/io/perun.clj
Outdated
| :original-path path | ||
| :path page-filepath | ||
| :filename (string/replace filename | ||
| #"(?i).[a-z]+$" ".html")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
src/io/perun.clj
Outdated
| :filename (string/replace filename | ||
| #"(?i).[a-z]+$" ".html")) | ||
| (dissoc :parsed :extension :file-type :full-path | ||
| :mime-type :original))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to update the :mime-type, :extension, and :file-type keys rather than removing them. They seems like something that it would be useful to filter on.
src/io/perun.clj
Outdated
| This task will look for files ending with `md` or `markdown` | ||
| and add a `:content` key to their metadata containing the | ||
| and add a `:parsed` key to their metadata containing the | ||
| HTML resulting from processing markdown file's content" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this also writes out HTML files to the location related, is that correct?
| new-metadata (render-to-paths render-paths (:renderer options) tmp tracer) | ||
| rm-files (keep #(boot/tmp-get fileset (-> % :entry :path)) (vals render-paths))] | ||
| (-> fileset | ||
| (boot/rm rm-files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might makes sense to log the names of removed files in debug mode.
src/io/perun.clj
Outdated
| new-path (make-path (:out-dir options) permalink path) | ||
| meta-entry (merge meta entry) | ||
| content-entry (assoc meta-entry :content content)] | ||
| (perun/report-debug "render" "rendered page for path" path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logging message is potentially confusing, as the rendering isn't actually occurring here. When someone's render task is crashing they might think that it had worked given that this printed out for all files. Perhaps it should be moved to render-to-paths.
|
Excellent feedback, @MartyGentillon. I believe I've addressed all of your concerns. |
MartyGentillon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
|
(also, yes, the tests pass here) |
podviaznikov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me
|
@MartyGentillon thank you for your help with discussion and reviews |
Credit to @MartyGentillon for the ideas that led to this PR (see the conversation on #109). This is the big change most of my past PR's have been leading up to - it meets the following goals:
:contentkeyrender-pre-wrap)The central conceptual change here is completing the transition from a more metadata-based approach to a more file-based approach. The first major step here was to store file metadata on boot's TmpFiles, rather than in fileset metadata. This allows us to move or delete files without having to maintain their metadata separately. The second (and last) step, is to do away with the
:contentmetadata key in favor of writing and manipulating TmpFiles, and using the content of the files, rather than maintaining that information in metadata. This allows any task to manipulate perun's final output, without that task having to be actually aware of perun and its metadata conventions.For example, I know some here are fans of Tachyons (http://tachyons.io/). With this change, it would be possible to write a general
tachyonstask that manipulates file content and can be used with or without perun. To explain with code,Before this change, the
tachyonstask would need to know about and manipulate perun's:contentkey in order to get what you wanted inperun/render, but now it can just change the file itself and ignore perun.In addition to being more composable with 3rd party tasks, the change from metadata content to file content also allows perun's task to be composed with each other in ways that weren't possible before. Here's a real-life example from the
build.bootfor my blog*:Some interesting things about this:
collectioncan be called beforerender, and the result can be treated the same as files generated bymarkdownrendertasks can be used to progressively build up pagesrenderstep, andcollectioncould render a finished page in one pass, if that works best for you.I'm very interested in any feedback you have, and in particular, if there are any use cases that are currently supported in master, but unsupported here.
Lastly, I would like you all to know that every time I think about this branch, my brain sings "It's the content pipeline" to the tune of "It's the final countdown". You're welcome.
* There's no content on the blog yet. Working on it.