NISO STS Project Overview and Update

Wheeler R, Rosenblum B, West L.

In August 2015, NISO announced the approval of a work item and a call for participation in the development of NISO STS, a standard XML tag set for the publication of standards. This work is based on the ISO Standard Tag Set (ISOSTS), which is based on JATS and is already used by ISO and many national standards bodies (NSBs). The NISO STS project will broaden ISOSTS for use by Standards Development Organizations (SDOs) and other organizations in the standards ecosystem. This paper will give an overview history of the ISO and NISO STS projects and a status update on the work of the Policy/Steering and Technical Workgroups. Discussion will include issues to be considered in keeping STS aligned with JATS and BITS, and the logistics for referral of appropriate issues between the three working groups.

The Problem

A researcher can peruse journal articles from five different scholarly publishers and find that the document structure is similar. The scholar is unlikely even to think about document structure and can focus on the content and ideas in the articles. By contrast, a civil engineer repairing a bridge who works with five standards documents from five different SDOs will find significant structural differences among them. While gathering the pertinent information, this engineer must read each document a different way, which is inherently inefficient.

That describes a manual review. What about programmatic absorption of standards?

A discovery services group or librarian will load standards content into a repository or aggregating service application with a goal of finding and serving pertinent content to a searching engineer or scholar. The system or repository is challenged in that content headers and metadata can vary slightly to significantly, and the responsibility for harmonizing may be passed around and never fully resolved. Content may not be effectively delivered or discovered.

Interoperability/interchange

To reiterate, standards are less similar in make-up than scholarly articles. SDOs are frequently composed of committees that draft standards, and these committees have been assembling standards a certain way throughout the organization’s history; they have no immediate plans to change structure. In fairness, they are often working with a document that may have originated 100 years ago and has simply gone through modification after modification under a cadence of expert review. Why would they want to change the elements?

The answer is that they needn’t change their existing front-end make-up at all, so long as the content can be described consistently via a harmonized XML/HTML structure that systems and applications can use to improve content ingestion.

The enlightened SDO can encourage a heightened level of content interoperability among its members or manage it for them on the back end. The enlightened SDO knows that the users of its standards are also using other SDOs’ standards and can surmise that SDOs could accelerate time to knowledge by building in some level of similarity. SDOs are also becoming aware that more automated tools are in development that can identify table content, equations, and basic elements like scope consistently, provided the content is built to an interoperable structure. There are plenty of other examples, reviewed in this paper, where interoperability could increase efficiency for users and for others in the business of delivering standards documents.

Bringing standardization to standards

Many have gleefully pointed out the irony and even humor that those in the business of standards development have not found much standardization among themselves to date, and in fairness it may have had less recognizable benefit but for recent developments from the last decade. Some advantages have existed as long as SDOs have worked together to co-publish standards, where a document must be generated through two unique publishing systems, but the attempts to standardize have been concessions more than growth oriented.

So what is it about this time, this here and now, that has grown the interest in interoperability among SDOs? SDOs have organically gathered since 2012 to discuss standardizing while developing more modern workflows. Service bureaus and technology companies have concurrently approached SDOs with new tools that can be used by industry for standards content. Technology, production efficiencies, discovery, and an increased and productive dialog help interoperability continue to gain momentum. Often, a few high-profile organizations like IEEE and SAE can offer leadership in this regard for other SDOs to follow, and this was certainly the case with ISO, which saw value in utilizing the JATS structure and made its work public to encourage other developers.

ISO Case Study

Old ISO workflow

Like many standards publishers, ISO's workflow prior to 2010 was based on MS Word files authored according to a standardized template [1]. At ISO, the files were edited and often reformatted due to errors in application of the template. When content was final, a PDF was made from Word and published. Full-text XML was not part of the publication process.

This publication process was not sustainable. Increased security and compatibility issues in Windows and Word made template distribution and installation difficult. Even when installed, template use varied widely by committee, often necessitating costly reformatting of documents by the ISO production team. The production team found Word useful for word-processing but often frustrating to use for "typesetting". Publication times were very slow, often measured in months [2].

ISO XML workflow

In 2010, ISO began a new initiative to publish using full-text XML. The new workflow was based on Customized Off-the-Shelf Software applications to convert Word files to XML (eXtyles) and then typeset those files using InDesign (Typefi). The move to automated XML-driven InDesign composition eliminated much of the time and frustration in using Word for page layout.

This new workflow allowed editors to continue to work in Word, regenerate XML at any time, and automatically create updated PDF proofs. The workflow also addressed a crucial element for ISO: the Word file always contained the final edited content of the standard, so the Word document could be returned to the committee for the next round of revisions [2].

ISO DTD selection

ISO needed a DTD to create a full-text XML workflow. The ISO team decided to derive their DTD from an existing model rather than build a new DTD from scratch. With the help of an outside consultant, they reviewed Text Encoding Initiative (TEI) [3], which ISO had used for some earlier projects; DocBook (an OASIS standard) [4]; Darwin Information Typing Architecture (DITA, an OASIS standard) [5]; and the Journal Article Tag Suite (JATS, ANSI/NISO Z39.96) [6]. All of these DTDs are freely available, have active community support, can be custom-modified to suit specific requirements, and have commercial support from tool vendors.

The JATS Connection

ISO selected JATS version 0.4 (the current version in 2011) for the DTD foundation. And while standards look somewhat structurally similar to journal articles (sections, lists, tables, figures, equations, etc.), there are some significant differences that required custom modifications to accommodate standards publication. The DTD was dubbed ISOSTS, and ISO made it freely available for use by other standard publishers along with documentation and Schematron QA scripts [7].

ISOSTS—key differences

A number of modifications were made for ISOSTS so that it was applicable for standards rather than journal articles. The key changes were:

  • The top-level element was changed from <article> to <standard>.
  • The <journal-meta> and <article-meta> elements were replaced with <iso-meta>, <reg-meta>, and <nat-meta> elements. These new elements contain standards-specific metadata fields. The latter two elements were designed for regional (e.g. European Union) and national metadata in anticipation that the XML model would be adopted by National Standards Bodies (NSBs) for XML publication.
  • The TermBase exchange (TBX) model of ISO 30042:2008 [8] was incorporated in a name space for markup of terms and definitions sections of standards.
  • The <std> element was extended with additional attributes and addition of the <std-ref> element to tag references to other standards.
  • Elements were added to tag non-normative notes and examples.
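
As a rough illustration of the first two changes, the sketch below parses a skeletal document and reports its top-level element and which standards-level metadata blocks it carries. The element names are those listed above; the nesting under `<front>`, the `<body>` content, and all sample values are assumptions made for illustration, and the fragment has not been validated against the ISOSTS DTD.

```python
import xml.etree.ElementTree as ET

# Illustrative skeleton only: element names follow the list above, but the
# nesting and sample values are assumptions, not DTD-valid ISOSTS.
SAMPLE = """
<standard>
  <front>
    <iso-meta>
      <std-ident>
        <version>1</version>
      </std-ident>
    </iso-meta>
    <nat-meta/>
  </front>
  <body>
    <sec><title>Scope</title><p>Placeholder text.</p></sec>
  </body>
</standard>
"""

def metadata_blocks(xml_text):
    """Return the root tag and the standards-level metadata blocks found."""
    root = ET.fromstring(xml_text)
    wanted = ("iso-meta", "reg-meta", "nat-meta")
    # Walk the whole tree in document order and keep only the metadata blocks.
    found = [el.tag for el in root.iter() if el.tag in wanted]
    return root.tag, found

root_tag, blocks = metadata_blocks(SAMPLE)
print(root_tag, blocks)
```

A journal article tagged in JATS would instead report `article` as its root and carry `<journal-meta>` and `<article-meta>`.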

Current Adopters

Table 1 shows about fifteen standards organizations that currently have adopted or are adopting ISOSTS or some JATS-based model.

Table 1. Organizations that have Adopted (1), are Adopting (2) or are Considering (3) ISO STS or another JATS-based DTD for standards publication.

With the initiation of the NISO STS project, this number has increased, especially among US-based SDOs.

ISOSTS to NISO STS

There are currently several DTDs used for tagging standard-type information based on JATS, and a number of others that have been independently developed. This variety of DTDs makes interoperability between organizations difficult, at best, and leaves organizations reinventing the wheel instead of building on shared experience. In this regard, XML models for standards are in a place not dissimilar to where journal article XML was prior to JATS.

In the past few years, several SDOs and standards distributors in the United States were looking to upgrade their publishing systems. Some were familiar with JATS and looked at ISOSTS. However, there was reluctance to adopt ISOSTS, as it was not an official standard. There was also concern that if JATS was updated, its updates might not filter into ISOSTS.

There seemed an obvious advantage to making ISOSTS a formal standard and coupling it with JATS. This action would result in greater interoperability of standards, set a foundation for development around and for standards, which in turn would benefit standards users, and improve the overall future of standards publishing.

The NISO STS will provide a stable standard for standards publishers, a standard and guidance for tool and conversion vendors, a common format for sharing metadata, data, and full text, a common XML model across publication types (standards, books, journals), and a lower barrier to entry and ongoing cost savings for XML standards production and publishing.

ASME Case Study

In May 2013, ASME published its most important Code/Standard, the Boiler and Pressure Vessel Code (BPVC), in XML for the first time. The publication ran to 17,000 pages and 31 products, was produced at near-astronomical expense in an incomplete production environment, and was the product of a project that had begun more than six years earlier.

As soon as the 2013 publication was over, ASME looked at ways to make its standards publishing more economical and thus more sustainable. There was much discussion regarding technology options, what to keep, and what aspects, if changed, would have the greatest impact. On May 29, 2014, ASME had an in-person meeting with some of its development partners; following that all-day meeting, they decided to use DITA as an XML format and an intermediary format (driven by XSL) for composition of the ASME standards. They promptly walked across the street to the SSP Annual Meeting and heard that ISO was considering proposing work on ISOSTS to couple it with JATS under NISO and make ISOSTS a formal standard. And they thought, well, if there is going to be a standard, we want to be a part of that!

ISO, NISO, and ASME began discussion, and in 2015 ASTM joined these discussions. In May 2015 ASTM and ASME submitted a NISO work item proposal and sponsorship for the NISO STS project.

Project Origins: Proposal work items

The project proposal submitted by ASTM and ASME included these items that were necessary for these two SDOs (and many others) to use an updated ISOSTS:

  • Expand for SDO and other use beyond NSBs
  • Align with JATS 1.1 standard, and future versions
  • Add a CALS table model option
  • Add the index model from BITS (Book Interchange Tag Suite)
  • Add MathML3 or MathML2 option (which would require 2 different model sets)
  • Add less ISO-specific metadata model (e.g., sdo-meta in conjunction with current iso-meta), which includes support for CrossRef data
  • Add Xinclude, so standards documents can be more easily managed
  • Provision of the vocabulary in DTD, XSD, and RNG formats
  • Make the documentation and the Tag Library more standards-specific

Forming the working group

Following the initial NISO acceptance of the work item, NISO pointed out that a chair or co-chairs should be proposed. Robert Wheeler of ASME was proposed for his experience in publishing, his role within an SDO, and his part in helping to get the NISO STS project started. Bruce Rosenblum of Inera was proposed for his connection to and intimate knowledge of JATS, ISO, and NISO, his XML expertise, and his role as a service vendor to SDOs and other publishers. Another important strategic choice was to engage Mulberry Technologies, who serve as the project secretariat for JATS and BITS, and who originally developed ISOSTS from JATS for ISO, to serve in the same role for NISO STS.

The call for participation went out in August, the vacation month, and still NISO received a surprising amount of interest from organizations and individuals looking to be involved. (A few organizations joined NISO specifically for this project, though membership was not required.) It quickly became apparent that a single workgroup would not be ideal for allowing broad representation while keeping the project moving. A two-tier approach was decided on: a Policy/Steering Workgroup and a Technical Workgroup.

Committees

The workgroups include about 50 participants representing 30 organizations.

Steering Group

It was thought that the Steering Group would meet a lot at the beginning and toward the end of the process, and periodically throughout, to drive general direction, goals, and make high-level decisions associated with the work. This may still be true, but an early decision to define scope, which in the end will allow the workgroups to quickly gauge something as in or out of scope, has taken time to work through. The Steering Committee continues to meet monthly, which also helps support the Technical Group, as questions arise.

Technical Group

The Technical Group performs more detailed analysis and determines specific requirements/solutions. The Technical Group has begun work on a number of items that have been approved, e.g., bringing ISOSTS, which is based on JATS 0.4, up to JATS 1.1, and the subcommittee work mentioned below. The Technical Group meets monthly, and the subcommittees meet more frequently, as needed.

Subcommittees

Subcommittees were formed to work in smaller groups on more challenging items that have been deemed in scope.

Translations

This subgroup is investigating metadata requirements concerning provenance and translation, such as what specific metadata is added to the translation of a standard and how any changes to content or metadata are identified. It is also determining the requirements when a translation has portions that either omit content from, or add (non-mandatory) content to, the original document.

The Steering Group determined that complete multi-language documents are out of scope, as the current use cases treat these as separate XML instances that may be combined in composition/presentation.

Metadata

The standard-level metadata in the current ISOSTS (a combination of: <iso-meta>, <reg-meta>, and <nat-meta> elements) is ISO-specific, and potentially different from metadata requirements found in standards produced by many American SDOs and other standards-producing organizations. The Subcommittee is examining these differences to articulate requirements, i.e., what may be missing, then categorizing metadata that may be document- or organization-specific, including a review of <iso-meta>, <reg-meta>, and <nat-meta> models.

They are also reviewing these models in light of recent work done at CrossRef to support the unique needs of standards, for example, dated and undated designation references: an undated reference should always point to the current edition, while a dated reference may have legal requirements to reference a specific edition.
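
The dated/undated distinction can be sketched in a few lines. The example below assumes an ISO-style designation in which a trailing colon and four-digit year marks a dated reference; that pattern, the function name, and the returned fields are assumptions for illustration only, and other SDOs write their designations differently.

```python
import re

# Assumed convention: "<designation>:<4-digit year>" marks a dated reference.
DATED = re.compile(r"^(?P<designation>.+?):(?P<year>\d{4})$")

def classify_reference(ref):
    """Split a reference string into designation and optional edition year."""
    text = ref.strip()
    match = DATED.match(text)
    if match:
        # Dated: should resolve to this specific edition.
        return {
            "designation": match.group("designation"),
            "year": int(match.group("year")),
            "dated": True,
        }
    # Undated: should resolve to whatever edition is current.
    return {"designation": text, "year": None, "dated": False}

print(classify_reference("ISO 30042:2008"))
print(classify_reference("ISO 30042"))
```

A linking service built on such a split could send undated references to the latest edition and pin dated references to the edition named.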

Citing standards

This subcommittee is examining the variety of common practices for tagging standards as related objects and as citations, including, at least, citations within: normative reference lists, non-normative reference lists, references to standards in ordinary bibliographies, and normative references within the narrative text. The group is determining commonality and differences across different standards publishers, and also comparing with JATS markup of standards in bibliographies. This subcommittee will also review these use cases and requirements in light of recent work done at CrossRef.

Terms & definitions

ISOSTS defines the terms and definitions section of a standard using TBX (TermBase eXchange) within a Terminology Section (<term-sec>) element. Changes to TBX are explicitly out of scope for this project. However, many standards organizations do not use TBX to organize their terminology sections. The Subcommittee is investigating current terms and definitions sections in non-ISO standards and will recommend an alternate model for organizations that want to use a simpler model than TBX.

Project plan (timeline)

The project goal is to have a Committee draft XML model(s) by summer 2016 and to present a NISO STS version 1.0 for vote no later than April 2017. The April 2017 deadline is because NISO rules require a working group to present work for vote no more than 18 months after the formation of the working group.

To meet these goals, there are:

  • Monthly steering group calls
  • Monthly technical group calls
  • More frequent Subcommittee meetings, as needed

Future phases after version 1.0 is a NISO standard may address items that have been explicitly deemed out-of-scope for phase 1.

NISO STS Project Status

The Steering and Technical Committees have been meeting monthly by teleconference since October 2015. Decisions on agenda items have been reached by member consensus. While most major issues have been resolved, there are still some issues to be addressed in upcoming calls. Once the issues are addressed, the NISO STS model will be created by Mulberry Technologies, who serve as the project secretariat for NISO STS in addition to serving that role for JATS and BITS.

NISO STS Major Decisions

Major decisions to date by the Steering and Technical Committees include:

  • The working group will make all reasonable efforts to make NISO STS 1.0 backwards compatible with ISOSTS. Although the new NISO STS model may recommend changes from ISOSTS in some of the ways that documents and metadata are modeled, the NISO STS DTD will remain backwards compatible with ISOSTS.
  • Phase I of this tag set will be designed (as much as possible) for the standards publishing cases that are most similar to ISO standards (the base for the NISO STS Tag Set). This does not mean that it is explicitly excluding any particular standards documents: outliers and corner cases that fit the models may be tagged using the Tag Set, but such documents will not be the focus of the analysis and design.
  • Phase I of this tag set will consider only normative documents. Use of NISO STS for non-normative materials is not precluded, but the working group will not make additions during this project phase to support non-normative documents.
  • Phase I of this tag set will model current content, that is, standards and specifications that are under active creation or update for publication, not those that have been withdrawn or superseded. (Some “current” standards may have begun life many years ago, but they are current content.) The working group does not mean to exclude any particular back-file content, and many back-file standards may fit nicely into the NISO STS models, but the group will not analyze back files for metadata and structure.
  • The XML model will be focused on the requirements of published documents. It will not focus on the needs of XML authoring, or production requirements should they differ from final publication requirements, e.g. some organizations use line numbers during the drafting phase that do not appear in the final publication.
  • Phase I design will not consider formatting, branding, and publisher-specific look-and-feel. The design will focus on preserving sufficient structural and semantic information to provide a readable interchange format.
  • The XML model will be available in DTD, XSD, and RelaxNG format, as has been the case for JATS and BITS.
  • NISO STS will be based on the JATS 1.1 Publishing (Blue) model rather than the looser JATS 1.1 Archiving and Interchange (Green) model. The Steering Committee prefers a more constrained XML model to facilitate smooth interchange.
  • Separate models will be available with MathML2 and MathML3 because MathML3 is not fully backwards compatible with MathML2. Additionally, some working group members do not feel they can automatically convert thousands of equations in their current XML archive from MathML2 to MathML3 without reproofing each equation, and that is a burden they do not want to take on; other working group members want to start XML directly with MathML3.
  • Separate models will be available with only the XHTML table model (the only table model in ISOSTS), and with both the XHTML and CALS table models.
    The XHTML-only model has provisionally been dubbed the “Interchange” model, with a goal that when standards organizations interchange XML (for adoptions, co-publication, or distribution), there will be minimal differences from one organization’s XML to another’s.
    The XHTML+CALS model has provisionally been dubbed the “Production” model, and is meant for in-house production by those organizations that prefer the CALS table model. While two organizations are not precluded from agreeing to interchange STS XML tagged according to this second model, the first model is intended to be the primary and default interchange model.
  • The result of supporting two math models and two table models will be a matrix of four available models for NISO STS, one for each combination of table model and math model.
  • The BITS models for Indexes and Tables of Contents will be added to NISO STS.
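
The two-by-two matrix of models follows directly from the math and table decisions above and can be enumerated as below. The names "Interchange" and "Production" are the provisional labels given in the text; the data structures and function name are hypothetical, and no official model names or file names are implied.

```python
from itertools import product

# Provisional table-model labels from the decisions above.
TABLE_MODELS = {
    "Interchange": "XHTML tables only",
    "Production": "XHTML and CALS tables",
}
MATH_MODELS = ("MathML2", "MathML3")

def model_matrix():
    """Return every (table model, math model) combination."""
    return list(product(TABLE_MODELS, MATH_MODELS))

for table_model, math_model in model_matrix():
    print(f"{table_model} ({TABLE_MODELS[table_model]}) with {math_model}")
```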

The following extensions are currently under discussion by the working group:

  • A new metadata element to hold SDO-specific metadata may be added
  • A second model, in addition to TBX, may be added to tag terms and definitions. This addition is being made because TBX is quite complex and some organizations prefer to have a simpler model available

Issues Encountered

JATS and BITS have been coordinated projects since the BITS working group was formed in 2012. This coordination has occurred because of a strong overlap in group members, and a conscious decision to keep the two models structurally aligned to facilitate use by publishers who want to use both models.

ISOSTS, while based on the JATS 0.4 available in 2011, was not developed with this planned coordination, and early work to incorporate the JATS 1.1 core for NISO STS and add the BITS models for Indexes and Tables of Contents has encountered some issues.

<version>

ISOSTS added a version element in 2011 in iso-meta/std-ident for the “version number for this document” (inward-looking). JATS 1.1 added a version element in 2014 “for data or software that is cited or described” in citations (outward-looking). Because version is only available in an inward-looking context in ISOSTS and an outward-looking context in JATS, the NISO STS Technical Working group is recommending documentation updates for both STS and JATS to make this more explicit. In this regard, the <version> element will be similar to <volume> or <issue> in JATS, i.e. the focus will be based on the context in which the element appears.
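
A minimal sketch of this context-based reading follows: the same element name is classified by its parent. The sample document, its nesting, and the labels "inward" and "outward" are illustrative assumptions; only the `std-ident` and citation contexts come from the description above.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; nesting and values are assumptions, not DTD-valid markup.
SAMPLE = """
<standard>
  <front>
    <iso-meta>
      <std-ident><version>2</version></std-ident>
    </iso-meta>
  </front>
  <back>
    <ref-list>
      <ref><element-citation><version>3.1</version></element-citation></ref>
    </ref-list>
  </back>
</standard>
"""

def classify_versions(xml_text):
    """Label each <version> by the context (parent element) it appears in."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    results = []
    for el in root.iter("version"):
        parent = parent_of[el].tag
        if parent == "std-ident":
            role = "inward"   # version of this document
        elif parent in ("element-citation", "mixed-citation"):
            role = "outward"  # version of a cited item
        else:
            role = "unknown"
        results.append((el.text, role))
    return results

print(classify_versions(SAMPLE))
```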

<title-group>

BITS used title-group in some lower-level structural elements: <index>, <index-div>, and <index-group> and <toc>, <toc-div>, and <toc-group>. Use of this JATS document-level metadata grouping for structural elements created an uncomfortable dependency on the definition of <title-group> in the structural Index or Table of Contents models.

Fortunately this issue was discovered shortly before BITS 2.0 was to be posted, and because BITS 2.0 is not backwards compatible with BITS 1.0 (the Q&A model changed), it was possible to update BITS 2.0 with two new elements, <index-title-group> and <toc-title-group>, and these two new elements will appear in the index and TOC models, respectively, in NISO STS.

<std>

JATS and ISOSTS have a <std> element, but the use in citation elements such as <element-citation> and <mixed-citation> in JATS is somewhat different than the use of the <std> element within <ref> in ISOSTS. As of February 2016, a subcommittee of the NISO STS Technical Working group is working on recommendations.

Longer-Term Plan

Participants hope for an approved standard by fall 2016 but the official timeline requirement is to start a NISO vote no later than April 2017. Although a very positive move for the SDO community, the NISO STS work can be viewed as somewhat ‘late to the table,’ as seen by SDOs who developed their own JATS-based DTDs ahead of this work. Several SDOs have indicated they are ready to begin XML conversion and are scheduled to start prior to the completion of the first approved NISO STS standard. A positive indication is that many of the service groups assisting these SDOs have overlap with work for ISO, IEEE, and SAE; these groups may also be Steering and Technical group members. The commonalities among JATS, ISOSTS, and NISO STS will ease normalizing to the NISO STS at a later time.

As mentioned above, the committees have tabled some development for the next iteration of NISO STS, and through the principle of harnessing NISO STS to JATS, the DTD will continue to grow and improve. Another driver we can look to is exciting advancements by the vendor community as more SDOs come on board to improve the experience of engineering and scholarly users.

Benefits for the Standards Community

The authors of this paper have discussed NISO STS benefits internally, with each other, with vendors, and with other SDOs. The following summarizes the top wins for the SDO community and its wide base of users from academia, government, and industry, globally.

Workflows

Here’s an easy win for SDOs: they will garner all the expected benefits of any XML workflow. SDOs will be able to streamline production workflows across committees and other ad hoc publication projects while retaining the option of a comfortable and familiar workflow for members/contributors/editors.

Using available technologies, SDOs can infuse more automation into their workflows and composition. For example, one SDO has indicated strong interest in automating all section, figure, and table numbering programmatically, freeing members and editors to focus more on technical content quality and leaving this routine task to the system.
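
As one hedged illustration of such automation, the sketch below assigns hierarchical numbers to nested sections in an XML tree. It uses the JATS-style `<sec>` and `<label>` elements; the function name, the numbering format, and the sample document are assumptions, and figure and table numbering would need similar but separate handling.

```python
import xml.etree.ElementTree as ET

# Illustrative body with two top-level sections, the second with two subsections.
SAMPLE = """
<body>
  <sec><title>Scope</title></sec>
  <sec><title>Requirements</title>
    <sec><title>General</title></sec>
    <sec><title>Materials</title></sec>
  </sec>
</body>
"""

def number_sections(parent, prefix=""):
    """Insert a <label> with a hierarchical number into each nested <sec>."""
    count = 0
    for child in parent:
        if child.tag != "sec":
            continue
        count += 1
        number = f"{prefix}{count}"
        label = ET.Element("label")
        label.text = number
        child.insert(0, label)  # put the label ahead of the title
        number_sections(child, prefix=number + ".")

body = ET.fromstring(SAMPLE)
number_sections(body)
labels = [label.text for label in body.iter("label")]
print(labels)
```

Because the numbers are generated from the tree, moving or deleting a section and rerunning the step renumbers everything without manual edits.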

Verification is a beautiful thing. The well-crafted workflow will set alarms for whatever it is prompted to check, increasing accuracy which can be extremely critical in the world of engineering, and reducing the need for errata.
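
One simple check of this kind is confirming that every cross-reference points at something that exists. The sketch below assumes JATS-style `<xref rid="...">` references resolving to `id` attributes, with a single target per reference; the sample markup and function name are hypothetical, and a production workflow would more likely express such rules in Schematron.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: one reference resolves, one does not.
SAMPLE = """
<body>
  <sec id="sec_1">
    <p>See <xref rid="tab_1">Table 1</xref> and <xref rid="fig_9">Figure 9</xref>.</p>
    <table-wrap id="tab_1"/>
  </sec>
</body>
"""

def unresolved_references(xml_text):
    """Return the rid values of cross-references with no matching id."""
    root = ET.fromstring(xml_text)
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}
    return [
        xref.get("rid")
        for xref in root.iter("xref")
        if xref.get("rid") not in known_ids
    ]

missing = unresolved_references(SAMPLE)
print(missing)
```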

Standards writers in industry often claim a need for speed. Some standards may take years to be developed and agreed on, or they can be hurried due to need, technological innovation, or more critical societal issues. Any element that speeds publication is a benefit.

Already, innovative vendors are looking at ways to deliver standards content to industry programmatically, reducing the time staff engineers spend on oversight. These tools will continue to come to market as the number of standards developers using NISO STS expands. By reducing the need for customization SDO by SDO, vendors can focus more on universal improvements to their own services, and this can potentially lower costs to SDOs.

Products

The scholarly publishing community might be surprised to learn of the limited availability of standards in HTML, EPUB, and other e-formats. Easier production of HTML, EPUB, and mobile document formats through an XML base will increase the choices SDOs can offer users, especially younger born-digital users of standards. These formats can better offer change management/change recognition. In the standards world, documents go through many iterations and a critical value is to quickly inform users what changes have been made and this can be made available as a premium.

It will be easier for well-developed platforms to serve content in innovative ways, e.g. ISO Online Browsing Platform [9] and ASTM Compass® [10]. The concept of co-published standards was discussed above. In this arrangement, two or more SDOs work together to co-brand a standard to meet member/committee/industry needs, and two publishing systems working to the same DTD will ease this process. Also, SDOs frequently work with distributor partners, and a future result could include better aggregation of many SDOs’ standards and more user features.

While DOIs have long been a fixture in scholarly publishing, they have come to standards only very recently. In addition to supporting DOIs, the NISO STS workflow will allow groups to easily turn inter-standard linking on and off for smooth transition between referenced standards.

Adoption

The SDO community recognizes that in addition to SDO-derived standards, industry and government create their own internal standards that reference SDO standards. Therefore, industry has been encouraged to participate in the dialog around NISO STS development. The right tools could create smooth internal integration between internal industry documents and SDO standards. Theoretically, this harmony could encourage greater adoption of standards by leading industry members.

Discovery

At the Special Libraries Association meeting in Boston, June 2015, SDOs informally discussed concerns that it is challenging for their subscribers to find standards using leased discovery services (and index engines like Google Scholar). Other SDOs are joining IEEE in looking at the NISO Open Discovery Initiative (ODI) and its Open Discovery Initiative: Promoting Transparency in Discovery document [11]. Although ODI does not directly reference the JATS metadata structure, there has already been discussion that in future ODI may incorporate JATS metadata as part of its best practice, and standards publishers will benefit via the NISO STS. Concurrently, site-crawling indexers should also benefit from metadata consistency.

This discussion would be incomplete without mentioning the improvement in information delivery to librarians. Consistent metadata will also aid library discovery for internally managed library systems and repositories. This was addressed at a presentation at the Charleston Conference, 2015, by Jennifer Abbot and Tami Sandberg. [12]

What’s Next

JATS and BITS have grown smoothly and thrived due to the informal coordination between the two working groups. NISO STS faced a few early challenges because its predecessor, ISOSTS, lived for five years in isolation from those two groups. Overlap between committee members is a great asset towards coordination. There are six people who are members of both the JATS standing committee and the NISO STS working groups. A future task is to define a sustainable structure to allow the JATS, BITS, and STS groups to coordinate developments in a manner that does not hinder the development of any one of these models.

To help the public follow developments, NISO has set up a few online spaces for public view [13].