- Dependency version bump for pyshacl and rdflib (thanks to @AutumnIsilme)
- Dropped Python 3.8 support
- Fixed a problem with missing authors/mails due to parsing errors (#53), thanks also to @willynilly
- CI fixes
Minor update after 3.0.1 was mistagged; adds support for detecting "archived" repos at GitHub and derives repostatus from that (unsupported or abandoned).
Minor update, adds support for detecting "archived" repos at GitHub and derives repostatus from that (unsupported or abandoned).
This major release updates the codemeta library for use with codemeta v3. It will henceforth output codemeta v3 data only. Codemeta v2 can still be read as input and will be automatically converted. See https://github.com/codemeta/codemeta/releases/tag/3.0 for the codemeta v3 release notes, which illustrate the changes. Some notable ones:
- We now use https://w3id.org/codemeta/3.0 as JSON-LD context, instead of https://doi.org/10.5063/schema/codemeta-2.0
- The RDF namespace has not changed between versions and remains https://codemeta.github.io/terms/
- Codemeta 3 introduces an `isSourceCodeOf` property. We now use this instead of `schema:targetProduct` to link between source code and applications/services. See also: codemeta/codemeta#271.
- Remove distutils and use setuptools (needed for Python 3.12 compatibility) #50
- Fixed `setup.py codemeta` command #50
- Fix incorrect parsing of version from dependencies (closes #48)
- Java/Maven: translate organization from pom.xml to schema:Organization (type was missing)
- Check e-mail validity when inferring a maintainer
- Metadata update
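To make the context change concrete, here is a minimal sketch of what moving a codemeta v2 document to v3 involves at the JSON level. It is not codemetapy's conversion code: the function name `upgrade_to_v3` is hypothetical, and it only covers the two changes listed above (the context URL and the rename of `targetProduct` to `isSourceCodeOf`).

```python
import copy

V2_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"
V3_CONTEXT = "https://w3id.org/codemeta/3.0"


def upgrade_to_v3(doc: dict) -> dict:
    """Return a copy of `doc` using the v3 context (illustrative only)."""
    out = copy.deepcopy(doc)
    context = out.get("@context")
    if isinstance(context, str):
        context = [context]
    elif context is None:
        context = []
    # Swap the v2 context URL for the v3 one, keeping any other entries.
    out["@context"] = [V3_CONTEXT if c == V2_CONTEXT else c for c in context]
    if V3_CONTEXT not in out["@context"]:
        out["@context"].insert(0, V3_CONTEXT)
    # Rename the property that links source code to applications/services.
    if "targetProduct" in out:
        out["isSourceCodeOf"] = out.pop("targetProduct")
    return out
```

The input document is left untouched; the function works on a deep copy.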
Features:
- python: support "homepage" and "documentation" fields from pyproject.toml, limited support for "readme" too
- python: parse maintainers from pyproject.toml #44
- npm: support 'maintainers' field in package.json #44
- updated to latest codemeta crosswalks (proper v3 compatibility is coming in the next codemetapy release)
Bugfixes:
- python: fix incorrect parsing of versions from dependencies if e.g. extras are stated to be installed #42
- npm: ensure url is retained for all contributors #45
- ensure lists are always sorted in some way so output is deterministic #39
- we eagerly turn literals into resources when they exist in our graph; exempt certain properties like 'url' from this behaviour #46
- Split off all HTML generation code to a separate project codemeta2html: https://github.com/proycon/codemeta2html
- prevent unnecessary URI remapping
- set pyshacl version fixed to 0.20.0; version 0.22.0 breaks things, to be re-evaluated later
- better detection of json-ld for --addcontextgraph
- more robust URI generation
- added load function for API usage
- web: extract title from h1 if no title found in head
Bugfix release:
- Remove stub targetProducts (i.e. without url) for web applications/services if we have better ones (i.e. with url)
- Minor fix in verbose log output
- Removed some items from deviant context, no longer needed
- Nodejs: fix for contributors parsing
- html visualisation: fix for screenshot display
- html visualisation: also allow screenshots and references on targetproduct pages
- nodejs: fixing parsing of contributors
- nodejs/npm: remove the scope from the name in conversion to codemeta
- Implemented support for converting Rust's Cargo.toml #10
- added --addcontextgraph parameter to add information to the context graph but not to the JSON-LD context
- expand implicit id nodes also when there is a known namespace prefix (CLARIAH/tool-discovery#33, CLARIAH/tool-discovery#34)
- fix recursion problem in item embedding, and skip embedding for certain acyclic properties
- implemented direct parsing of pyproject.toml #28 CLARIAH/tool-discovery#35
- if labels have a language, always choose english (for now)
- minor style fixes for frontend
- allow merging heterogeneous developmentStatus
- java: resolve ${project.groupId} and ${project.artifactId} variables
- improved own codemeta metadata
Bugfix release:
- collide blank nodes that have exactly the same content (assume same URI), should solve issue #36
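One way to picture this fix: two blank nodes (nodes without an `@id`) whose content is identical are treated as one node. The sketch below illustrates the idea on plain dictionaries. `merge_identical_blank_nodes` is a hypothetical helper, not the function codemetapy uses.

```python
import json


def merge_identical_blank_nodes(nodes: list) -> list:
    """Drop blank nodes whose content duplicates an earlier blank node."""
    seen = set()
    result = []
    for node in nodes:
        if "@id" in node:
            result.append(node)  # named nodes are left alone
            continue
        # Canonical serialization, so key order does not affect equality.
        key = json.dumps(node, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(node)
    return result
```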
New feature in html visualisation: added support for aggregation of tools in groups/suites.
Bugfixes:
- Fixed namespaces in HTML output
- Fixed template error in table view
New:
- Added richer meta tags in HTML output
This is a pretty big release with a lot of refactoring, bugfixes and various new features:
- Major refactoring and numerous bugfixes
- schema:author and schema:contributor are now always interpreted as ordered lists (even if the context doesn't make this explicit), this has repercussions for querying (e.g. SPARQL) #22
- Reimplemented JSON-LD object framing
- Graphs output (--graph) now also does object framing per SoftwareSourceCode entry and uses expanded form (= some duplication/redundancy)
- When assigning URIs for SoftwareSourceCode and SoftwareApplication, add a version component. So each version has its own URI (requires --baseuri to be set)
- Added support for DOIs in schema:identifier, shown also in html output #33
- Use schema.org and codemeta context as officially published #32
- Set TMPDIR in a more platform independent way #31
- Assume input to be installed python packages when no explicit type is provided nor can be detected #27
- Reference publications didn't visualize properly yet in html output #18
- HTML output now shows a citation example for the software itself (incl DOI if set)
- Improved license mapping to SPDX vocabulary
- Do some simple license conflict detection and resolution in case multiple licenses are specified
- For the --enrich option: Consider first author as the maintainer if none was specified
- Implemented support for Technology Readiness Levels (use --trl parameters to opt-in)
- Added an --includecontext option that includes further context information in the codemeta JSON-LD output (like from the repostatus ontology, from SPDX, etc, adds redundancy but makes the information more complete)
- Added an --addcontext option to customise extra JSON-LD context to load and add (affects --includecontext)
- Python parsing: Improved parsing of Python Project-URL labels
- Renamed parameter --toolstore to --codemetaserver, set for use with codemeta-server
- Upgrade to v14 of schema.org
- Added --interpreter option to dump the user in an interactive python environment, helps with debugging
- jsonld serialization: serialize lists alphabetically by schema:name/@id/schema:identifier if schema:position is not used (#26)
- fix: properly deal with ~= and != operators in python dependencies
- fix: strip leading/trailing whitespace in author names/mails/etc
- new feature: improved python Project-Url parsing
- Re-implemented `--no-extras` parameter to skip extra (python) dependencies (closes #24)
- Allow egg_info directories in subdirectories
- Many fixes and improvements
- Added unit/integration tests #20
- Added support for gitlab API (#19, thanks to @xmichele)
- Added support for private git repos (#19, thanks to @xmichele)
- Implementing support for the software-iodata profile: https://github.com/SoftwareUnderstanding/software-iodata
- Implemented ability to validate metadata against a SHACL schema (#21)
- Generates automatic validation reports and adds those to the metadata
- Visualised as a 0 to 5-star ranking in the html output
- Major updates to the html visualisation
- Added an additional service-oriented index (showing only web applications)
- Added opt-in automatic enrichment of codemeta (based on some inferences we can make)
- Detect redirects by single-sign-on middleware that prevent us from further metadata harvesting
- Allow constructing codemeta.json from scratch without any input, merely passing command line parameters (use /dev/null as input)
- Use repostatus ontology (jantman/repostatus.org#48)
- Implemented support for handling projects with pyproject.toml #17
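The deterministic list ordering from #26 above can be sketched as a sort key. The fallback order used here (`name`, then `@id`, then `identifier`, compared case-insensitively) is an assumption read off the changelog entry, and `sort_items` is a hypothetical helper rather than codemetapy's serializer.

```python
def sort_key(item: dict) -> str:
    """Pick the first available label to sort on (assumed fallback order)."""
    for prop in ("name", "@id", "identifier"):
        value = item.get(prop)
        if isinstance(value, str) and value:
            return value.lower()  # case-insensitive ordering is an assumption
    return ""


def sort_items(items: list) -> list:
    """Sort alphabetically unless an explicit position is in use."""
    if any("position" in item for item in items):
        # An explicit position defines the order; leave the list as given.
        return list(items)
    return sorted(items, key=sort_key)
```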
Bugfix release: don't trip on packages without dependencies (#16)
This is a major new release of codemetapy. It does introduce some backward-incompatible changes.
- Major overhaul of the entire codebase:
- Implements codemeta 2.0 with some extensions (see the README)
- map developmentStatus to repostatus.org vocabulary #7
- map licenses to SPDX vocabulary #8
- The old 'entrypoints' extension to codemeta (as described in https://github.com/codemeta/codemeta#183 ) is now deprecated in favour of the newer software types extension (proposed in https://github.com/codemeta/codemeta#271 and worked out in https://github.com/SoftwareUnderstanding/software_types ).
  - Supports `schema:targetProduct` to link software source code to instances of the software
  - Supports extended software types, on top of the ones already available in schema.org.
  - See the README for more info
- Implemented support for parsing and converting Java/Maven `pom.xml` to codemeta #9
- Implemented support for parsing and converting NodeJS/npm `package.json` to codemeta #11
- Implemented support for parsing and converting remote webservices (via `targetProduct`) (https://github.com/CLARIAH/clariah-plus#92)
  - Can extract `<script>` blocks with `application/json+ld` from HTML
  - Parses and converts metadata in HTML `<head>` (including RDFa and microdata)
- Improved support for parsing and converting Python/setuptools/distutils to codemeta
  - use `runtimePlatform` instead of `programmingLanguage` when converting pip's 'programmingLanguage' classes
  - No longer requires software to be actually installed prior to parsing
- Implemented support for parsing and converting from the GitHub API to codemeta
  - Set environment variable `GITHUB_TOKEN` to your personal access token if you run into rate limitations.
- Improvements in merging/reconciling metadata that describes the same source, but from multiple perspectives
- Improvements in joining multiple sources together in one graph (`--graph` parameter, replaces the old `--registry` parameter)
- Improvements in author parsing
  - Implemented support for ingesting simple textual lists of authors as is customary in files like `AUTHORS`, `CONTRIBUTORS`, `MAINTAINERS`.
- Rich HTML visualisation (with RDFa!), is used primarily by codemeta-server (https://github.com/CLARIAH/clariah-plus#99)
- Added a `--strict` option to disable codemeta extensions (the inverse of the old `--all` parameter that is now removed)
- Dropped support for Python 3.5 and below
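For the `GITHUB_TOKEN` note above, a token read from the environment is typically passed to the GitHub API as an `Authorization` header. The sketch below only builds the request and sends nothing. `build_github_request` and the `owner/name` repository path are made up for illustration, and this is not necessarily how codemetapy issues its requests.

```python
import os
import urllib.request


def build_github_request(repo: str, environ=os.environ) -> urllib.request.Request:
    """Build (but do not send) a GitHub API request for `repo` ("owner/name")."""
    headers = {"Accept": "application/vnd.github+json"}
    token = environ.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests are subject to a higher rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"https://api.github.com/repos/{repo}", headers=headers
    )
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real `os.environ`.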
This release also comes with two related projects that rely on codemetapy, together they form a powerful ensemble:
- codemeta-server - Server for codemeta, in memory triple store, SPARQL endpoint and simple web-based visualisation for end-users
- codemeta-harvester - Harvest and aggregate codemeta from source repositories and service endpoints, automatically converting known metadata schemes in the process. Wraps around codemetapy and other codemeta software.
Added the ability to detect multiple authors #5
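As an illustration of what detecting multiple authors involves, the sketch below splits a single author string into separate people and pulls out an optional `<email>`. The separators it accepts (commas, semicolons, newlines, and the word "and") are assumptions, and `split_authors` is a hypothetical helper, not codemetapy's parser.

```python
import re

# A name, optionally followed by an e-mail address in angle brackets.
AUTHOR_RE = re.compile(r"^(?P<name>[^<]*?)\s*(?:<(?P<email>[^>]+)>)?$")


def split_authors(raw: str) -> list:
    """Split an author string into a list of {"name", "email"} dicts."""
    authors = []
    for part in re.split(r"[,;\n]|\band\b", raw):
        part = part.strip()
        if not part:
            continue
        match = AUTHOR_RE.match(part)
        if not match:
            continue  # skip fragments that do not look like an author
        author = {"name": match.group("name").strip()}
        if match.group("email"):
            author["email"] = match.group("email").strip()
        authors.append(author)
    return authors
```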
The previous release was a bit premature: a remaining bug related to #4 has now been fixed.
- parse dependency versions and store them explicitly; don't stumble over extras (they will be processed as any other dependency; the 'extra' information bit does not get converted). #4
- added a `--no-extras` parameter that disregards all the extras. #4
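The dependency handling described above amounts to separating the package name, the optional extras, and the version constraint. Below is a minimal sketch using a regular expression. It covers only simple requirement strings (no environment markers or URLs), and `parse_requirement` is a hypothetical name rather than a codemetapy function.

```python
import re

REQ_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"  # package name
    r"\s*(?:\[(?P<extras>[^\]]*)\])?"            # optional [extra1,extra2]
    r"\s*(?P<spec>.*?)\s*$"                      # remaining version constraint
)


def parse_requirement(req: str) -> dict:
    """Split a simple requirement string into name, extras and version."""
    match = REQ_RE.match(req)
    if match is None:
        raise ValueError(f"cannot parse requirement: {req!r}")
    extras = match.group("extras") or ""
    return {
        "name": match.group("name"),
        "extras": [e.strip() for e in extras.split(",") if e.strip()],
        # Some metadata wraps the constraint in parentheses; drop them.
        "version": match.group("spec").strip("() "),
    }
```

Keeping the extras in their own field means they can be ignored or dropped without affecting the name or version.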
Minor bugfix release: do not add duplicate entrypoints
Minor bugfix release: do not reset entrypoints when chaining
This release makes some changes to the way codemetapy works:
- Instead of parsing pip output, the tool now uses importlib.metadata to query for metadata. As metadata is read after installation, this works regardless of how the metadata was initially specified (setup.py, setup.cfg or pyproject.toml)
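Querying installed-package metadata through `importlib.metadata` looks roughly like the sketch below. The helper `read_installed_metadata` and the selection of fields are illustrative choices, not codemetapy's own code.

```python
from importlib import metadata


def read_installed_metadata(package: str):
    """Return selected metadata of an installed package, or None if absent."""
    try:
        meta = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        return None
    return {
        "name": meta.get("Name"),
        "version": meta.get("Version"),
        "author": meta.get("Author"),
        "license": meta.get("License"),
        # One Requires-Dist entry per declared dependency.
        "requires": meta.get_all("Requires-Dist") or [],
    }
```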
New features:
- Added an output file parameter (`-O`)
- Added an integration hook for setuptools, allowing users to add a codemeta command to setup.py
Fixes:
- Prevent duplicates in authors and other fields
Minor update to entrypoint extension: attempt to automatically read the docstring for each entrypoint and use it as a description for the entrypoint metadata
(Minor rerelease without changes just to trigger a DOI on Zenodo)
- better failure and exit code if identifier was not found in registry
- Added some simple support for converting debian package metadata from apt show to codemeta (#1)
Minor bugfix release
Minor update release:
- Added `--with-orcid` parameter to generate placeholders for ORCIDs in author details (#2)
- Bugfix release
- Added a `resolve()` function that resolves nodes that only have an `@id` when such a node was previously introduced (not used internally yet)
- making registry jsonld compliant
- added schema:audience property
- Work on entrypoints, defining extra context for entrypoints (codemeta/codemeta#183)
- lowercase all identifiers
- Minor fix: omit empty fields, use lower case identifiers in registry
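The idea behind resolving `@id`-only nodes can be sketched on plain dictionaries: remember every fully described node by its `@id`, then replace later stubs with that description. `resolve_stubs` below is a hypothetical stand-in for illustration and does not reproduce the signature or behaviour of the `resolve()` function mentioned above.

```python
def resolve_stubs(data, index=None):
    """Replace {"@id": x} stubs with a previously seen full node for x."""
    if index is None:
        index = {}
    if isinstance(data, list):
        return [resolve_stubs(item, index) for item in data]
    if isinstance(data, dict):
        node_id = data.get("@id")
        if node_id is not None and len(data) == 1:
            # A stub: substitute the full node if one was introduced earlier.
            return index.get(node_id, data)
        resolved = {key: resolve_stubs(value, index) for key, value in data.items()}
        if node_id is not None:
            index[node_id] = resolved  # remember the full description
        return resolved
    return data
```

Stubs that refer to a node never described elsewhere are returned unchanged.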