What JATS4R can achieve, with a little help from its friends

Beck J, Harrison M, Laverick S, et al.

Introduction

JATS4R (JATS for Reuse) is an inclusive working group whose mandate is to optimise the reuse and exchange of scholarly content throughout our shared scholarly publishing infrastructure. JATS4R works principally by developing best-practice recommendations for tagging content in JATS XML, ultimately aiming not only to improve content reusability and exchange, but to enable publishers and other content handlers to find efficiencies and facilitate innovation through shared tagging best practices.

Most scholarly publishers today use NISO JATS XML to tag their scholarly content, and JATS4R is often asked what its relationship to the JATS Standing Committee is, and why JATS4R’s work is needed when most organizations already use the same XML standard. The answer is that the two groups complement one another. The JATS Standing Committee develops and maintains the NISO JATS XML standard, and does so in a purposefully “loose” way to encourage its wide adoption. JATS4R’s job is to take this goal of value-through-commonality one step further, by developing prescriptive tagging practices not only for existing article objects, but for objects that are either emerging as a result of the digital landscape of scholarly publishing (such as data availability statements), or else are becoming interesting candidates for more precise markup, to serve the larger goals of openness and transparency (such as conflict-of-interest statements).

To carry out its work, JATS4R requires input from many kinds of people and organisations in scholarly publishing, efficient processes, good communication, and effective organisation. This is no small feat considering that it is run entirely by volunteers, and maintaining momentum is a challenge. Since its launch in 2013, JATS4R has gone through many iterations and procedural experiments to become as efficient as possible while remaining flexible enough to include diverse participation from the scholarly publishing community. In 2017, JATS4R developed an entirely new process that has led to greater engagement from the community, as well as the generation of several new sets of recommendations. As a result of its work, in 2018 NISO approached JATS4R to become a NISO working group. And as of December 2018, JATS4R recommendations became a suite of NISO Recommended Practices.

In this paper, we will outline JATS4R’s processes and workflow, including the Steering Committee members’ commitments and what the association with NISO means. We will also present a few case studies of recommendation development efforts, because they highlight the specific challenges that JATS4R faces in its work. Finally, because JATS4R is an inclusive group whose work is driven by the scholarly publishing community, the conference forum will also be used as an opportunity to engage the audience to join the effort and contribute to generating more recommendations, as well as steer the group to its next set of priorities.

JATS4R Processes and Workflows

The Steering Committee

The seven-member Steering Committee, the persistent core of JATS4R, meets remotely each week to discuss administrative as well as technical- and subject-related questions. Members also convene and chair any subgroups. The subgroups, which are short-term open-invitation working groups, deal with specific article objects from the JATS4R recommendations roadmap. Steering Committee members also represent the organisation through outreach to the community and raise awareness across the industry about the challenges to content reuse and exchange posed by inconsistent XML tagging. The Steering Committee maintains and prioritises the roadmap in response to feedback from the community on objects for which defined best-practices in tagging would solve problems or otherwise be helpful. Occasionally, independent groups outside of JATS4R form to address recommendations for certain article objects; in these cases, JATS4R may decide to adopt the resulting outputs to align with these external consensus groups. Finally, the Steering Committee maintains a validator tool which content providers can use to check their XML against JATS4R recommendations, and regularly adds new rules to the tool following the publication of new recommendations.

The Subgroups

Once the Steering Committee identifies the next article object on the roadmap [https://github.com/orgs/JATS4R/projects/1] to be addressed, a Call for Participation is sent out through various channels to reach as wide an audience as possible across the publishing industry. Ideally, subgroups should consist of at least five members. Once formed, the subgroup is tasked with gathering samples from various sources, although this is not always possible for emerging article objects for which usage is not yet widespread and few if any tagged examples are available. The subgroup convenes regularly (usually bi-weekly or monthly) until a recommendation set has been drafted. Once the draft is finalised, it is reviewed by the Steering Committee to ensure alignment with the different prerogatives of JATS4R and then posted online for a public commenting period that usually lasts one month.

After the commenting period is closed and comments are compiled, the subgroup responds to the comments and reviews the draft recommendation. Depending on the extent of the changes needed to satisfy the public’s feedback, the subgroup may need to reconvene for further discussions, and a second draft version may be submitted for public commenting again. If the comments are minor and have been addressed by the subgroup, the recommendation may be finalised and published online after final review by the Steering Committee and the NISO Topic Committee.

In some cases, the subgroup may decide it is necessary to ask for a change to the JATS DTD and will submit a comment to the JATS Standing Committee for consideration. In such cases, the recommendation development process will be paused until the JATS Standing Committee has made a decision and further action can be taken.

Updating Recommendations

Recommendations are updated and adapted as new article objects come to light, the JATS standard is revised, or new standards emerge. Under our new process, all recommendations are automatically reviewed by the Steering Committee after a 2-year period to assess whether any updates are necessary. Recommendations may also be updated following a shorter period if specific gaps or new issues are identified. Recommendations requiring update are added to the roadmap to be covered when a new subgroup can be formed.

The Validator Tool

JATS4R’s plan has always been to maintain a tool that enables users to validate their article XML against JATS4R’s recommendations and check their compliance. JATS4R built and released a validator tool in 2014 (Maloney, Chris, Alf Eaton and Jeff Beck. A client-side JATS4R validator using Saxon-CE. Presented at Balisage: The Markup Conference 2015, Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015). https://doi.org/10.4242/BalisageVol15.Beck01), but it was complex and difficult to maintain. A new JATS4R validator is under development and should be available for testing by the conference. Eventually it will be publicly available through the JATS4R website.

The new validator tool reports only errors and warnings. To streamline the user’s experience with the tool, we no longer supply tagging hints as information messages. The validator tests are written in Schematron. The validator will be distributed in three ways:

  1. As an API that can be run against an article, with a report returned in JSON. This report will include all error and warning messages.
  2. As an online web tool that will take an article XML upload and report errors and warnings.
  3. As a set of downloadable Schematron files that can be built into production workflows.
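
As an illustration of how a production workflow might consume the JSON report from the API, the following is a minimal sketch only. The structure of the report has not been specified here, so the `messages`, `severity`, and `text` field names, the sample report, and both helper functions are assumptions made for the example, not the validator's actual output format.

```python
import json

# Hypothetical report shape. The text above states only that the API returns
# a JSON report containing error and warning messages; the field names
# "messages", "severity" and "text" are assumptions for this sketch.
SAMPLE_REPORT = """
{
  "messages": [
    {"severity": "error", "text": "example error message"},
    {"severity": "warning", "text": "example warning message"},
    {"severity": "warning", "text": "another example warning"}
  ]
}
"""


def split_by_severity(report_text):
    """Group the messages of a (hypothetical) report into errors and warnings."""
    report = json.loads(report_text)
    grouped = {"error": [], "warning": []}
    for message in report.get("messages", []):
        severity = message.get("severity")
        if severity in grouped:
            grouped[severity].append(message.get("text", ""))
    return grouped


def passes(report_text, fail_on_warnings=False):
    """Return True if the report has no errors (and, optionally, no warnings)."""
    grouped = split_by_severity(report_text)
    if grouped["error"]:
        return False
    return not (fail_on_warnings and grouped["warning"])


if __name__ == "__main__":
    grouped = split_by_severity(SAMPLE_REPORT)
    print(len(grouped["error"]), "error(s),", len(grouped["warning"]), "warning(s)")
```

Keeping errors and warnings separate lets a workflow block publication on errors while only logging warnings, which mirrors the two message levels the validator reports.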

The Schematron tests are organised into files based on the recommendations. For example, there are files such as:

  • permissions-errors.sch
  • permissions-warnings.sch
  • math-errors.sch
  • math-warnings.sch

The downloadable Schematron files enable a user to selectively apply only the tests that they are ready to use.
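
Selective application could be sketched as follows. The file names follow the `<topic>-<severity>.sch` pattern of the examples listed above; the helper function and its parameters are hypothetical and are not part of the validator distribution.

```python
# File names follow the "<topic>-<severity>.sch" pattern of the example files
# named in the text. The helper itself is a hypothetical illustration.
ALLOWED_SEVERITIES = ("errors", "warnings")


def select_schematron_files(topics, severities=ALLOWED_SEVERITIES):
    """Return the Schematron file names for the chosen topics and severities."""
    for severity in severities:
        if severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
    return [
        f"{topic}-{severity}.sch"
        for topic in topics
        for severity in severities
    ]


if __name__ == "__main__":
    # A publisher ready to enforce only the error-level tests for two topics.
    for name in select_schematron_files(["permissions", "math"], ["errors"]):
        print(name)
```

A publisher could start with the error-level files for one or two recommendations and add the warning-level files, or further topics, as their workflow matures.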

JATS4R and NISO

In December 2018, JATS4R joined forces with NISO to become a NISO working group, and its recommendations are now NISO Recommended Practices. The JATS4R Steering Committee and NISO representatives had a number of calls and email exchanges to discuss what this relationship would mean to both organisations and what impact it would have on JATS4R in particular. The main challenge was to ensure that JATS4R remains as agile and inclusive as possible, while adhering to the added rigor of process that being part of NISO would provide.

Concerns were allayed once we knew that being part of NISO would pose no limits on subgroup participation: no person (or the organisation they represent) who joins our subgroups needs to be a member of NISO. On the positive side, there are many benefits for JATS4R from becoming affiliated with NISO:

  • Each recommendation set will have another vetting stage in the process, which will provide it with another set of eyes before publication.
  • NISO has a Zoom meeting account, so all JATS4R subgroup calls going forward can be recorded, and we will make the recording and transcript available on the JATS4R Google Drive. Publishing is a global industry, and with our aim to be as inclusive as possible, we sometimes struggle to hold calls at a time that suits all those who would like to join. Having recordings will enable more people to be involved, and also speaks to our values of transparency and inclusivity.
  • New recommendations will be co-published on the JATS4R and NISO websites. The version on the NISO site will have a persistent identifier in the form of a DOI associated with it.
  • The marketing and outreach arm of NISO will provide added publicity for JATS4R’s activities, such as notice of upcoming subgroups being formed and recommendations released for public review or final publication.

Case Studies

Here we discuss a few recommendation sets which highlight some of the challenges we face when developing JATS4R recommendations. One overall limitation is that, despite the size of the Steering Committee, not all of the committee members have the capacity to run subgroups, and for those who do, chairing a subgroup is a significant commitment that can sometimes span several months. We cannot forget that every contributor working on a subgroup is a volunteer, including the members of the Steering Committee. Currently, only Steering Committee members chair the subgroups, and this may ultimately prove to be a constraint on recommendation development for which JATS4R will need to find a solution.

Versioning and History Recommendation

This recommendation started its life at an earlier time in JATS4R’s history, when it did not have a Steering Committee and recommendations were made sequentially by a fairly large but inconsistent group of volunteers. At that time, participants of JATS4R joined biweekly calls and the agenda covered anything JATS4R related, but mainly the topic in hand. As a result, participants tended to get fatigued and attendees varied from call to call, which meant that the work that had been achieved during one call was not necessarily well understood or easily taken up by the people in the next call. In retrospect, we can look back and say this was a lesson in how not to run a recommendation!

The versioning and history recommendation was essentially two topics merged together. The term ‘versioning’ itself can mean different things to different users, so each new attendee would potentially have a different spin on what should be achieved. Versioning is also a relatively new idea in publishing, despite the publication of the JAV guidelines [https://www.niso.org/publications/niso-rp-8-2008-jav] over 10 years previously. As a result, ideas evolved, divided, reversed, reformed, and so on. It took over a year to come to some consensus, which involved submitting a request to the JATS Standing Committee for a change to the JATS schema.

The comment submitted to the JATS Standing Committee was considered, and the response was a variation that could work. This work now needs to be taken up again by a dedicated subgroup, in which progress should be much quicker and more efficient!

Clinical Trials

This relatively small group consisted of publishers who publish clinical papers. As happens occasionally, Crossref had already done some groundwork via their Linked Clinical Trials initiative (https://www.crossref.org/blog/linked-clinical-trials-are-here/). Where we felt we needed additional input, we invited the chair of the Crossref working group to join some of our calls. By extending the invitation for a few calls to someone less interested in the JATS perspective but expert in the field, we felt our recommendation was enriched. This group also encountered specifications that are more relevant to a publishing world focused on print and PDF, but knowing these limitations helped us modify the recommendation to fit. Although clinical trials have been around for some time, content providers have only recently begun to mark them up in JATS XML, so publishers tend to be flexible in their ability to mark up a new object. The clinical trials recommendation set is there for publishers who are thinking about the machine readability and reuse of clinical trial data and want to begin tagging them in JATS XML. We hope the clinical trial recommendations will shepherd JATS XML down one tagging route, rather than allowing multiple tagging options to proliferate and pose a potential barrier to reuse (see our next case study!).

Authors and Affiliations

The Authors and Affiliations recommendations were probably one of our more challenging sets of recommendations to develop, taking almost a year from inception to final publication. One reason for this is that authors and affiliations (and how these are associated with one another) are mature forms of metadata, having been around for as long as the journal article itself. As a result, many publishers are quite invested in the styles, processes, and tools they use to handle authors and affiliations, and are leery of potentially time-consuming and expensive changes to their XML markup. Equally challenging is that authors and affiliations are not only quite complex structures, but also essential for an article’s reuse and interoperability. So the challenges in the subgroup quickly became (i) how do we develop recommendations that are flexible enough for publishers to adopt and yet strict enough to provide sufficiently predictable inputs for the machine systems that must ultimately handle them, and (ii) how do we define a workable scope for the subgroup; that is, how granular should we get in our recommendations, especially when we don’t necessarily know exactly which aspects of author and affiliation XML currently pose barriers to reuse and exchange. So we took our cue from as many known issues as possible, had many challenging discussions, and listened carefully to public comment. Ultimately, we found a path forward that all subgroup members could live with and see themselves and other publishers adopting. That said, there is more work to be done in this area.

Recent Recommendations

With the increase in output of recommendations, we’ve redesigned the website so that recommendations now have their own landing page [https://jats4r.org/recommendations-list] rather than being a drop-down list from the tab at the top of the home page. Recently published recommendations are:

Active Subgroups

Current active subgroups [https://jats4r.org/subgroups] are

Sum Up: How Everyone can Participate in JATS4R

An organisation is only as good as the people who carry out its mission. As an organisation run entirely by volunteers, JATS4R has no staff, and very few resources. What it does have is a dedicated Steering Committee and input from many members of our community who have contributed significantly to our current slate of recommendations. Together, we have already achieved so much on a shoestring. But our vision of a world in which scholarly research travels seamlessly from authors to those who need it requires us not only to continue to make recommendations, but to improve their uptake and develop a validator tool that becomes a normal and time-saving part of the production process for most publishers. And all of this means that JATS4R needs more help. And not just from publishers! Vendors, typesetters and compositors, archives, libraries, ID-assigning authorities, and independent XML aficionados of all stripes are all welcome to help.

Here’s how:

If you have some time, consider participating in a JATS4R subgroup. The subgroups meet on a short-term basis and focus on a specific problem, which provides a defined commitment period and ensures that participants have a say in shaping the recommendations that will affect the article objects that matter most to them.

If you have only a little time, you can read and comment on draft recommendations when they are released for public review. You don’t have to be a subgroup member to do this, and you can participate this way no matter your time zone.

Finally, even if you have no time, suggestions for the roadmap are always welcome and are important in helping the Steering Committee to prioritize and decide which recommendations to tackle next. So if you have a suggestion, please email us at info@jats4r.org.

Acknowledgments

This work was carried out in part by staff of the U.S. National Library of Medicine (NLM), National Institutes of Health, with support from NLM.